T3D Batch Runs of Parallel GENESIS


Introduction

Once a parallel script has been developed and debugged on the PSC Supercluster or a local workstation, it can be run on the Cray T3D. Parallel GENESIS uses the PVM libraries for communication between the parallel elements (or nodes) running during a simulation. PVM runs slightly differently on the T3D than on a network of workstations such as the Supercluster. Because of this, there are a few idiosyncrasies and special rules, documented below, that one should be aware of when running parallel GENESIS simulations on the T3D.

Submitting a T3D job

The T3D runs in a batch environment, and as such interactive use is discouraged. In order to run a GENESIS simulation on the T3D, one needs to submit the job to a queue by means of a submission script. Submissions for the T3D are made from PSC's CRAY C90 or from any of PSC's front-end machines as described here.

Here is a sample job which runs the parameter searching example.

PGENESIS: T3D versus Supercluster/workstations

PGENESIS operates slightly differently on the T3D as compared with its operation on workstation platforms such as the Supercluster. There are four major differences:

Xodus not supported

Because the T3D version of Parallel GENESIS is intended to be used in batch mode, the Xodus tools have not been included in this release.

Execution parameters and paron statement

The invocation of parallel GENESIS and the spawning of GENESIS nodes differ between the workstation version of parallel GENESIS and the T3D version. In the workstation version, GENESIS is invoked on a single node, the master node, which in turn spawns a number of child nodes as defined by the paron statement in a script or by the value of the environment variable NNODES. When spawning these child nodes, the master takes into account all parameters listed in the paron statement.

On the T3D, however, no such spawning is done. Instead, the GENESIS executable is invoked simultaneously on all processing elements (PEs) requested. (The number of PEs to be used is defined by the environment variable MPP_NPES; see the section on the number of GENESIS nodes below.) There is logically still a master node (the first node that happens to get initialized) and worker nodes, but the master node does not spawn the worker nodes. Because it does not spawn them, it has no direct control over their execution parameters.

Because of this difference, many of the parameters of the paron statement are ignored in the T3D version of parallel GENESIS. Instead, equivalent parameters must be specified on the GENESIS command line used to invoke GENESIS on all PEs.

These parameters are ignored by the T3D version of parallel GENESIS, since they are already determined by the time the paron statement is interpreted:

-executable filename
-startup script

These paron parameters should be replaced with the following command line equivalents on the T3D:

paron statement         T3D command line equivalent

-simrc filename         -altsimrc filename
-silent level           -silent level
-nice level             -nice level
-execdir directory      -execdir directory

This parameter is read from the paron statement and interpreted by each node:

-debug level

These parameters are not ignored, and are discussed further in the following sections:

-output filename
-nnodes nnodes
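
The parameter handling listed above can be summarized in a short Python sketch. This is only an illustration and not part of parallel GENESIS: the function split_paron_options and the three table names are hypothetical, and only the option names themselves come from the lists above.

```python
# Hypothetical helper; only the option names are taken from the documentation.

# paron options that must be given on the T3D command line instead
T3D_EQUIVALENT = {
    "-simrc": "-altsimrc",
    "-silent": "-silent",
    "-nice": "-nice",
    "-execdir": "-execdir",
}
# paron options that the T3D version ignores outright
T3D_IGNORED = {"-executable", "-startup"}
# paron options still read from the paron statement on the T3D
# (-output is listed here, but see the Output section for its current status)
T3D_KEPT = {"-debug", "-output", "-nnodes"}


def split_paron_options(options):
    """Split (flag, value) pairs into T3D command-line arguments,
    options that stay on the paron statement, and ignored flags."""
    command_line, kept, ignored = [], [], []
    for flag, value in options:
        if flag in T3D_EQUIVALENT:
            command_line += [T3D_EQUIVALENT[flag], value]
        elif flag in T3D_KEPT:
            kept += [flag, value]
        elif flag in T3D_IGNORED:
            ignored.append(flag)
        else:
            raise ValueError(f"unknown paron option: {flag}")
    return command_line, kept, ignored
```

For example, passing the pairs ("-simrc", "mysimrc") and ("-nnodes", "8") yields the command-line arguments -altsimrc mysimrc, while -nnodes 8 stays on the paron statement.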

Number of GENESIS nodes

The number of GENESIS nodes to run is generally specified by the nnodes parameter of the paron statement. On the workstation version, this value can also be set using the environment variable NNODES.

When running parallel GENESIS on the T3D, the user must first define how many PEs are to be used simultaneously to run the simulation. This is done by setting the environment variable MPP_NPES. This variable must be set before invoking parallel GENESIS, and it determines the maximum number of GENESIS nodes that can be used during any given simulation. (It must currently also be a power of 2.) Because of this, the value of MPP_NPES should be greater than or equal to the number of nodes specified on the paron statement in the GENESIS script being run.

When the value of MPP_NPES is greater than the number of nodes requested by a paron statement, the excess PEs do invoke parallel GENESIS. However, when the paron statement is interpreted, they will immediately exit and a warning message will be issued. For maximum utilization of the T3D, it is recommended that the number of nodes requested on the paron statement be the same as the number of PEs requested for the run (and as such be a power of 2). If the nnodes parameter is not specified on the paron statement, the number of nodes on the T3D will default to the value of MPP_NPES. On the workstation version, the number of nodes will default to either the value of the environment variable NNODES (if it is set) or the number of available PVM hosts (if NNODES is not set). Note that the T3D version of parallel GENESIS ignores the NNODES variable.

If the value of MPP_NPES is less than the number of nodes requested on the paron statement, an error message will be generated and parallel GENESIS will abort.
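
The node-count rules in this section can be restated as a small Python sketch. It is only a model of the behavior described above, not code from parallel GENESIS; the function resolve_node_count and its argument names are hypothetical.

```python
def resolve_node_count(mpp_npes, paron_nnodes=None):
    """Return (active_nodes, excess_pes) for a T3D run.

    mpp_npes     -- value of the MPP_NPES environment variable
    paron_nnodes -- nnodes value from the paron statement, or None if omitted
    """
    # MPP_NPES must currently be a power of 2
    if mpp_npes < 1 or mpp_npes & (mpp_npes - 1) != 0:
        raise ValueError(f"MPP_NPES must be a power of 2, got {mpp_npes}")

    # No nnodes on the paron statement: default to MPP_NPES
    if paron_nnodes is None:
        return mpp_npes, 0

    # More nodes requested than PEs available: error and abort
    if paron_nnodes > mpp_npes:
        raise RuntimeError(
            f"paron requests {paron_nnodes} nodes but MPP_NPES is {mpp_npes}"
        )

    # Excess PEs start GENESIS, then exit with a warning at the paron statement
    return paron_nnodes, mpp_npes - paron_nnodes
```

With MPP_NPES set to 8 and a paron statement requesting 6 nodes, the sketch reports 6 active nodes and 2 excess PEs, which is why matching the two values is recommended.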

Output

The paron -output parameter redirects the output of worker nodes into a file. Ultimately, this option, like the -debug option, will not be ignored by the T3D version and will work exactly as it does in the workstation version of parallel GENESIS.

Due to a bug in PVM for the T3D, however, this option is not currently supported on the T3D version of parallel GENESIS. Instead, all output from both the master and worker nodes is directed to stdout. Once a fix is in place for the PVM bug, this option will be enabled in T3D parallel GENESIS.