Skeleton Parameter Search


To demonstrate the use of pGenesis for doing parameter searching, we have constructed an example that performs a simple genetic search.

In performing the genetic search, we keep a fixed-sized population of individuals "alive". We randomly pick an individual from this population and mutate its representation with a certain probability. We then evaluate the new individual and, if it is better than some other, we replace the worst-evaluated individual in the population with this new individual. To allow for parallelism, we evaluate multiple new individuals simultaneously, and only remove existing individuals in the population as new individuals with superior evaluations are found.

In this example we represent each parameter (out of a total of 4 parameters per individual) as a 16-bit string. The floating point value which this bit-string represents is determined by linearly mapping the 16 bit integer with range [0,65535] into the range [-32.768, 32.767]. When constructing a new individual from an old one, we mutate each of the bits with probability 0.02. We evaluate each individual by computing a trivial function over the parameters. In a real-world parameter search, this step would be the most time-consuming since it would involve running a neural simulation. Here, we have set our evaluation function to 1.0 / sqrt(abs(a + b + c + d - 5.3467)). This directs the search toward values for parameters a, b, c, and d that sum to 5.3467. Note that there is no unique solution to this problem, so the final result will depend on the random number generator (and the seed it was started with).

The main pGenesis script follows. Note that because the individuals can run independently of each other, here pGenesis is started in "farm" mode i.e., with each node running in its own zone. Also note that synchronizing zones is accomplished with the barrierall rather than barrier command.

// every node will run these script commands
paron -farm -silent 0 -nodes {n_nodes} -output o.out \
       -altsimrc ../.simrc -executable nxpgenesis
echo I am node {mytotalnode}

// first wait for every node to start up
barrierall

if ({mytotalnode} == 0)
       search          // the master (on node 0) runs the parameter search
end
barrierall 7 1000000   // workers will sit here waiting for task
                       // assignments from the master
quit

The major functions in the scripts are included in abbreviated form here:

// this is the function that directs the search
function search
    int   i, chosen, bs_a, bs_b, bs_c, bs_d
    float a, b, c, d

    init_search
    init_farm

     for (i = 0; i < individuals; i = i + 1)
       // pick random individual from population
       chosen = {rand 0 population}
       if (chosen >= actual_population)
           // pick a random bit string
           bs_a = {rand 0 65536}
           bs_b = {rand 0 65536}
           bs_c = {rand 0 65536}
           bs_d = {rand 0 65536}
       else
           // mutate an existing bit string
           bs_a = {mutate {getfield /population[{chosen}] a_value}}
           bs_b = {mutate {getfield /population[{chosen}] b_value}}
           bs_c = {mutate {getfield /population[{chosen}] c_value}}
           bs_d = {mutate {getfield /population[{chosen}] d_value}}
       end
       // assign this task to a worker
       //    (delegate_task is a routine in farm.g)
       delegate_task {i} {bs_a} {bs_b} {bs_c} {bs_d}
    end 

    finish        // wait for all workers to finish
end

// worker_task performs the task that the master has doled out to this node
function worker_task (index, bs_a, bs_b, bs_c, bs_d)
       int index, bs_a, bs_b, bs_c, bs_d;
       float a, b, c, d;
       float fit

       // compute the real-valued parameters from the bit-strings (the
       //    "genes")
       a = (bs_a - 32768.0) / 1000.0;
       b = (bs_b - 32768.0) / 1000.0;
       c = (bs_c - 32768.0) / 1000.0;
       d = (bs_d - 32768.0) / 1000.0;

       // determine that fitness value for this individual
       fit = {evaluate {a} {b} {c} {d}}

       // return result to the master (node 0 in zone 0)
       return_result@0.0 {mytotalnode} {index} {bs_a} {bs_b} {bs_c} {bs_d} {fit}
end

// the workers call return_result on the master to provide the results
//     of their evaluation
function return_result (node, index, bs_a, bs_b, bs_c, bs_d, fit)
       int node, index, bs_a, bs_b, bs_c, bs_d
       float fit

       if (actual_population < population)
           // if we are underpopulated, each new individual stays "alive"
           least_fit = actual_population
           min_fitness = -1e+10
           actual_population = actual_population + 1
       end
       if (fit > min_fitness)
           // replace the least fit individual with the current individual
           setfield /population[{least_fit}] fitness {fit}
           setfield /population[{least_fit}] a_value {bs_a}
           setfield /population[{least_fit}] b_value {bs_b}
           setfield /population[{least_fit}] c_value {bs_c}
           setfield /population[{least_fit}] d_value {bs_d}
           recompute_fitness_extremes
       end

       // that node is now free, and can be assigned a new task
       setfield /free[{node}] value 1

       // print out the fitness value of the individual we just evaluated
       ...
end

To run this example on the PSC Supercluster:

  1. "cd" to /afs/psc/biomed/prod/genesis/parallel/Scripts/param.
  2. Run this example by typing "pgenesis -nox demo.g". The "-nox" indicates that we will not be using the Xodus interface. This will print out the fitness value of each new individual upon completion of its evaluation. Note how the values improve over time.

To run this example interactively on the PSC T3E:

  1. "cd" to /ptmp/neural/pgenesis/Scripts/param.
  2. Run this example by typing "pgenesis -nox -nodes 4 demo.g". The "-nox" indicates that we will not be using the Xodus interface and the "-nodes 4" indicates that we want a 4-processor partition on the T3E.. This will print out the fitness value of each new individual upon completion of its evaluation. Note how the values improve over time. If you want to try this example with more or less than 4 processors, you must edit the demo.g file and change the n_nodes value, as well as invoke pgenesis with an -nodes value greater than or equal to n_nodes.

To run this example on your own workstation:

  1. Ensure that your pGenesis environment is set up.
  2. "cd" to the Scripts/param subdirectory of the directory in which pGenesis is installed on your machine.
  3. Start up the PVM daemon by typing"pvm". At the prompt, type "quit".
  4. Run this example by typing "nxpgenesis demo.g". This will print out the fitness value of each new individual upon completion of its evaluation. Note how the values improve over time.
  5. (Optional) If you don't want to leave the pvm daemon running, type "echo halt | pvm".