Running PGENESIS on the T3E


Introduction

Once a parallel script has been developed and debugged on the PSC Supercluster or a local workstation, it can be run on the Cray T3E. PGENESIS uses the PVM libraries for communication between the parallel elements (or nodes) running during a simulation. PVM runs slightly differently on the T3E than on a network of workstations such as the Supercluster. Because of this, there are a few idiosyncrasies and special rules, documented below, that one should be aware of when running PGENESIS simulations on the T3E.

Running Interactively

It is possible to run short (< 10 min and <= 16 nodes) T3E jobs interactively, which is useful for verifying that your scripts run as expected in the T3E environment. The pgenesis shell script is the easiest way to invoke PGENESIS once you have logged onto the T3E. By using the -nodes flag to the pgenesis script you can specify the number of T3E nodes to use for PGENESIS.
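
As a rough illustration of the limits above, the following Python sketch checks a planned run against the interactive limits and assembles a pgenesis command line. The helper name, the script name search.g, and the placement of the script name after the -nodes flag are assumptions made for this example; only the pgenesis script and its -nodes flag come from the text.

```python
# Hypothetical helper, not part of PGENESIS.
INTERACTIVE_MAX_NODES = 16     # from the text: <= 16 nodes
INTERACTIVE_MAX_MINUTES = 10   # from the text: < 10 min


def interactive_command(script, nodes, est_minutes):
    """Return an argv list for an interactive run, or raise ValueError."""
    if nodes < 1 or nodes > INTERACTIVE_MAX_NODES:
        raise ValueError("interactive runs are limited to 1..16 nodes")
    if est_minutes >= INTERACTIVE_MAX_MINUTES:
        raise ValueError("interactive runs must take less than 10 minutes")
    # Assumption: the script name follows the -nodes flag.
    return ["pgenesis", "-nodes", str(nodes), script]


# search.g is a placeholder script name.
print(" ".join(interactive_command("search.g", 4, 5)))
# prints: pgenesis -nodes 4 search.g
```

A run that fails either check would be a candidate for the batch environment described in the next section.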

Submitting a T3E job

For longer runs the T3E provides a batch environment. In order to run a PGENESIS simulation on the T3E, one should submit the job to a queue by means of a submission script. Submissions for the T3E are made from the T3E itself as described here.

Here is a sample job which runs the parameter searching example.

PGENESIS: T3E versus Supercluster/workstations

PGENESIS operates slightly differently on the T3E as compared with its operation on workstation platforms such as the Supercluster. There are four major differences:

Xodus not supported

Because the T3E version of PGENESIS is intended to be primarily used in batch mode, the Xodus tools have not been included in this release.

Execution parameters and paron statement

The invocation of PGENESIS and the spawning of PGENESIS nodes differ between the workstation version of PGENESIS and the T3E version. In the workstation version, PGENESIS is invoked on a single node, the master node, which in turn spawns a number of child nodes as defined by the paron statement in a script or by the value of the environment variable NNODES. When spawning these child nodes, the master takes into account all parameters listed in the paron statement.

On the T3E, however, no such spawning is done. Instead, the PGENESIS executable is invoked simultaneously on all processing elements (PEs) requested. (The number of PEs to be used is defined by the environment variable MPP_NPES; see the section on the number of nodes below.) Logically there is still a master node (the first node that happens to get initialized) and worker nodes, but this master node does not spawn the worker nodes. Since the master node does not spawn them, it has no direct control over the execution parameters of the worker nodes.

Because of this difference, many of the parameters of the paron statement are ignored in the T3E version of PGENESIS. Instead, equivalent parameters must be specified on the PGENESIS command line used to invoke PGENESIS on all PEs. (If these parameters are given to the pgenesis script, they will be passed on to the PGENESIS executables.)

These parameters are ignored by the T3E PGENESIS since they are already determined by the time the paron statement is interpreted:

-executable filename
-startup script

These paron parameters should be replaced with the following command line equivalents on the T3E:

paron statement         T3E command line equivalent

-simrc filename         -altsimrc filename
-silent level           -silent level
-nice level             -nice level
-execdir directory      -execdir directory

This parameter is read from the paron statement and interpreted by each node:

-debug level

These parameters are not ignored, and are discussed further in the following sections:

-output filename
-nnodes nnodes
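
The classification above can be summarized in a short Python sketch that sorts paron options into those that move to the T3E command line, those that stay in the paron statement, and those that are dropped. The function and variable names are hypothetical, and leaving unlisted options in the paron statement is an assumption; the option names and their mapping are taken from the lists above.

```python
# Sketch only: helper names are hypothetical, the mapping follows the lists above.
IGNORED_ON_T3E = {"-executable", "-startup"}
COMMAND_LINE_EQUIVALENT = {
    "-simrc": "-altsimrc",
    "-silent": "-silent",
    "-nice": "-nice",
    "-execdir": "-execdir",
}


def split_paron_options(options):
    """Split (flag, value) pairs into (command_line, paron, dropped)."""
    command_line, paron, dropped = [], [], []
    for flag, value in options:
        if flag in IGNORED_ON_T3E:
            # Already determined before paron is interpreted.
            dropped.append(flag)
        elif flag in COMMAND_LINE_EQUIVALENT:
            # Must be given on the PGENESIS command line instead.
            command_line += [COMMAND_LINE_EQUIVALENT[flag], str(value)]
        else:
            # -debug, -output and -nnodes stay in paron; leaving any other
            # option there unchanged is an assumption of this sketch.
            paron += [flag, str(value)]
    return command_line, paron, dropped


# The values below are placeholders.
opts = [("-nnodes", 8), ("-simrc", "mysimrc"), ("-silent", 1), ("-startup", "init.g")]
cmd, paron, dropped = split_paron_options(opts)
print(cmd)      # ['-altsimrc', 'mysimrc', '-silent', '1']
print(paron)    # ['-nnodes', '8']
print(dropped)  # ['-startup']
```
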

Number of PGENESIS nodes

The number of GENESIS nodes to run is generally set by the nnodes parameter of the paron statement. On the workstation version, this value can also be set using the environment variable NNODES.

When running PGENESIS on the T3E, the user must first define how many PEs are to be used simultaneously to run the simulation. This is done by setting the value of the environment variable MPP_NPES. This variable must be set before invoking parallel GENESIS, and it determines the maximum number of GENESIS nodes that can be used during any given simulation. (It must currently also be a power of 2.) Because of this, the value of MPP_NPES should be equal to or greater than the number of nodes specified on the paron statement in the PGENESIS script being run.

When the value of MPP_NPES is greater than the number of nodes requested by a paron statement, the excess PEs do invoke parallel GENESIS. However, when the paron statement is interpreted, they will immediately exit and a warning message will be issued. For maximum utilization of the T3E, it is recommended that the number of nodes requested on the paron statement be the same as the number of PEs requested for the run (and as such be a power of 2). If the nnodes parameter is not specified on the paron statement, the number of nodes on the T3E will default to the value of MPP_NPES. On the workstation version, the number of nodes will default to either the value of the environment variable NNODES (if it is set) or the number of available PVM hosts (if NNODES is not set). Note that the T3E version of parallel GENESIS ignores the NNODES variable.

If the value of MPP_NPES is less than the number of nodes requested on the paron statement, an error message will be generated and parallel GENESIS will abort.
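
The rules in this section can be modeled with a small Python sketch. It is an illustration of the behavior described above, not code taken from PGENESIS: the function names are hypothetical, and raising ValueError when MPP_NPES is not a power of 2 is an assumption of this sketch, since the text does not say how that case is reported.

```python
def is_power_of_two(n):
    """True for 1, 2, 4, 8, ..."""
    return n > 0 and (n & (n - 1)) == 0


def smallest_valid_npes(nnodes):
    """Smallest power of two that is >= nnodes."""
    if nnodes < 1:
        raise ValueError("nnodes must be at least 1")
    npes = 1
    while npes < nnodes:
        npes *= 2
    return npes


def resolve_nodes(mpp_npes, paron_nnodes=None):
    """Return (active_nodes, excess_pes) for a T3E run."""
    if not is_power_of_two(mpp_npes):
        # Assumption: this sketch treats a bad MPP_NPES as an error.
        raise ValueError("MPP_NPES must be a power of 2")
    if paron_nnodes is None:
        # No nnodes on the paron statement: default to MPP_NPES.
        return mpp_npes, 0
    if mpp_npes < paron_nnodes:
        # Corresponds to the error-and-abort case.
        raise ValueError("MPP_NPES is less than the nodes requested by paron")
    # Excess PEs start, then exit with a warning when paron is interpreted.
    return paron_nnodes, mpp_npes - paron_nnodes


print(smallest_valid_npes(6))  # 8
print(resolve_nodes(8, 6))     # (6, 2)
print(resolve_nodes(8))        # (8, 0)
```

In this model, a script requesting 6 nodes needs MPP_NPES set to 8, which leaves 2 PEs that exit early; requesting 8 nodes instead would use every PE.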

Output

The paron -output parameter redirects the output of worker nodes into a file. Ultimately, this option, like the -debug option, will be honored by the T3E version and will work exactly as it does in the workstation version of PGENESIS.

Due to a bug in PVM for the T3E, however, this option is not currently supported on the T3E version of PGENESIS. Instead, all output from both the master and worker nodes is directed to stdout. Once a fix is in place for the PVM bug, this option will be enabled in T3E PGENESIS.