1. Introduction

The BG/Q resource manager, Cobalt, provides a mechanism to run multiple small jobs (sub-block jobs) repeatedly over a larger outer block. To run an application in this mode, the user must manually determine the optimal geometry (i.e., select a subset of the nodes based on their interconnection, allowing for the best internode communication) and related low-level system parameters. The subjob technique addresses this challenge and enables many-task computing (MTC) applications on the BG/Q: it lets users submit multiple independent, repeated jobs within a single larger Cobalt block.

The Swift-subjob package provides tools, scripts, and example use-cases to run Swift applications in subjob mode on the ALCF BG/Q resources: Cetus (for small-scale testing) and Mira. The framework is flexible: the same configuration can be used in subjob or non-subjob mode, depending on the scale and size of a run. Users can run multiple waves of jobs asynchronously and in parallel.

2. Quickstart

Download the subjob demo package as follows:

wget http://mcs.anl.gov/~ketan/subjobs.tgz

followed by:

tar zxf subjobs.tgz
cd subjobs/simanalyze/part05

To run the example application:

./runcetus.sh #on cetus
#or
./runmira.sh #on mira

Another example can be found in subjobs/simanalyze/part06.

For details on how this example works, see the Swift tutorial.

3. Background

This section briefly discusses Swift parallel scripting, subjobs, and their integration.

3.1. Swift

The Swift parallel scripting framework eases the expression and execution of workflow and MTC applications such as ensembles and parameter sweeps.

3.2. Subjobs

Subjobs offer a way to run ensembles and parameter-sweep-like computations over BG/Q resources.

4. Diving deep

4.1. Convert any Swift script to subjob

To configure a Swift application to run in subjob mode, the following changes are required:

First, add bg.sh as the application invoker in place of sh or any other invoker. For example, if the app definition is as follows:

sh @exe @i @o arg("s","1") stdout=@sout stderr=@serr;

Replace the invocation with a bg.sh invocation, like so:

bg.sh @exe @i @o arg("s","1") stdout=@sout stderr=@serr;

Second, export the SUBBLOCK_SIZE environment variable. For example:

export SUBBLOCK_SIZE=16
Note
The value of the SUBBLOCK_SIZE variable must be a power of 2 and no greater than 512.
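
Putting the two changes together, a typical run looks like the following (a minimal sketch: script.swift and swift.conf are hypothetical file names, and the -config flag assumes a Swift release that reads the HOCON-style configuration shown in the next section):

export SUBBLOCK_SIZE=16                 # must be a power of 2
swift -config swift.conf script.swift   # script.swift invokes apps via bg.sh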

4.2. Swift Configuration

A complete example config file for a sub-block job run on ALCF is shown below:

sites : cluster
site.cluster {
    execution {
        type: "coaster"
        URL: "localhost"
        jobManager: "local:cobalt"
        options {
            maxNodesPerJob: 32
            maxJobs: 1
            tasksPerNode: 2
            #workerLoggingLevel = "DEBUG"
            nodeGranularity: 32
            maxJobTime: "00:60:00"
        }
    }
    filesystem {
        type: "local"
        URL: "localhost"
    }
    staging: direct
    workDirectory: "/home/"${env.USER}"/swift.work"
    maxParallelTasks: 30
    initialParallelTasks: 29
    app.bgsh {
        executable: "/home/ketan/SwiftApps/subjobs/bg.sh"
        maxWallTime: "00:04:00"
        env.SUBBLOCK_SIZE: "16"
    }
}

executionRetries: 0
keepSiteDir: true
providerStagingPinSwiftFiles: false
alwaysTransferWrapperLog: true

Of note is the SUBBLOCK_SIZE property, which must be present in the site definition. It defines the size of the subblock on which each task runs. In this particular example, the outer block size is 256 nodes and the subblock size is 16 nodes. This yields a total of 256/16 = 16 subblocks, and hence a jobsPerNode value of 16.
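
The arithmetic is worth keeping in mind when sizing runs: the number of subblocks is the outer block size divided by SUBBLOCK_SIZE. A quick sanity check (the variable names below are illustrative and not set by the framework):

OUTER_BLOCK_SIZE=256                          # nodes in the outer Cobalt block
SUBBLOCK_SIZE=16                              # nodes per subblock
echo $(( OUTER_BLOCK_SIZE / SUBBLOCK_SIZE ))  # prints 16, the subblock count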

Note
A Swift installation supporting sub-block jobs on the Vesta and Mira machines can be found at /home/ketan/swift-k/dist/swift-svn/bin/swift
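
To use that installation directly, prepend its directory to your PATH (path as given in the note above):

export PATH=/home/ketan/swift-k/dist/swift-svn/bin:$PATH
swift -version   # confirm which build is picked up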

5. Use-Case Applications

This section discusses some of the real-world use-cases that are set up as demo applications in this package. These applications have been tested with both subblock and non-subblock runs on BG/Q systems.

5.1. NAMD

NAMD is a molecular dynamics simulation code developed at the University of Illinois at Urbana-Champaign (UIUC). The Swift source and configuration files, along with application inputs, can be found in the namd directory of the subjobs package. To run the NAMD example:

cd namd #change to the package's namd directory
./runvesta.sh #run on vesta
./runmira.sh  #run on mira

To run NAMD with a different input dataset, change the input files in the input_files directory and reflect the changes in the namd.swift source code, especially the data definitions:

file psf <"input_files/h0_solvion.psf">;
file pdb <"input_files/h0_solvion.pdb">;
file coord_restart <"input_files/h0_eq.0.restart.coor">;
file velocity_restart <"input_files/h0_eq.0.restart.vel">;
file system_restart <"input_files/h0_eq.0.restart.xsc">;
file namd_config <"input_files/h0_eq.conf">;
file charmm_parameters <"input_files/par_all22_prot.inp">;

Similarly, to change the scale and size of runs, adjust the parameters in the sites file as described in section 4 above.

5.2. Rosetta

Rosetta is a molecular docking toolkit with many related programs, used by many large-scale HPC science applications. This example shows how to run FlexPepDock docking on ALCF systems. The Swift scripts, configuration, and application-specific inputs are found in the rosetta directory. To run the example:

cd rosetta
./runmira.sh #run on mira

To change the scale, size, and/or inputs of the run, change the location of the input files in the Swift source file (rosetta.swift), like so:

file pdb_files[] <filesys_mapper; location="hlac_complex", suffix=".pdb">;

In the above line, all .pdb files in the directory hlac_complex will be processed in parallel. Change the location from hlac_complex to the directory that holds your input files.
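
If your inputs live elsewhere, one simple alternative to editing the script is to stage them into hlac_complex (a sketch; the source path is hypothetical):

mkdir -p hlac_complex
cp /path/to/my_pdbs/*.pdb hlac_complex/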

To change the number of structures generated by the FlexPepDock program, change the nstruct argument in the call to the rosetta application, like so:

(scorefile, rosetta_output, rosetta_error) = rosetta(pdb, 2);

In the above line, the number 2 indicates the number of structures to be generated by the docking; change this value to the desired number. To adjust other parameters, edit the command line invoked in the app definition of the Swift script, like so:

bgsh \
"/home/ketan/openmp-gnu-july16-mini/build/src/debug/linux/2.6/64/ppc64/xlc/static-mpi/FlexPepDocking.staticmpi.linuxxlcdebug" \
"-database" "/home/ketan/minirosetta_database" \
"-pep_refine" "-s" @pdb_file "-ex1" "-ex2aro" \
"-use_input_sc" "-nstruct" nstruct "-overwrite" \
"-scorefile" @_scorefile stdout=@out stderr=@err;

5.3. Dock

Dock is another molecular docking program used by many applications. The Swift source, configuration, and application-related inputs can be found in the dock directory. To run the Dock example:

cd dock
./runcetus.sh #run on cetus
./runmira.sh #run on mira

6. Diving deeper

The key driver of Swift sub-block jobs is a script called bg.sh, which performs the sub-block calculations and other chores for the user. The script looks as follows:

#!/bin/bash

set -x

mname=$(hostname)

# vesta and mira have a different mapper path than cetus
if [[ $mname == *vesta* || $mname == *mira* ]]
then
    export PATH=/soft/cobalt/bgq_hardware_mapper:$PATH
else
    export PATH=/soft/cobalt/cetus/bgq_hardware_mapper:$PATH
fi

#export SUBBLOCK_SIZE=16

# Prepare shape based on subblock size
# provided by user in sites environment
case "$SUBBLOCK_SIZE" in
1) SHAPE="1x1x1x1x1"
;;
8) SHAPE="1x2x2x2x1"
;;
16) SHAPE="2x2x2x2x1"
;;
32) SHAPE="2x2x2x2x2"
;;
64) SHAPE="2x2x4x2x2"
;;
128) SHAPE="2x4x4x2x2"
;;
256) SHAPE="2x4x4x4x2"
;;
512) SHAPE="4x4x4x4x2"
;;
*) echo "SUBBLOCK_SIZE not set or incorrectly set: will not use subblock jobs"
   unset SUBBLOCK_SIZE
;;
esac

# If subblock size is provided, do subblock business
if [ "$SUBBLOCK_SIZE"_ != "_" ]
then
    # get-corners.py (from the Cobalt hardware mapper) prints one corner
    # location per subblock of the given shape within the outer block
    export SWIFT_SUBBLOCKS=$(get-corners.py "$COBALT_PARTNAME" $SHAPE)
    export SWIFT_SUBBLOCK_ARRAY=($SWIFT_SUBBLOCKS)

    #echo "$0": SWIFT_SUBBLOCKS="$SWIFT_SUBBLOCKS"

    if [ "_$SWIFT_SUBBLOCKS" = _ ]; then
      echo ERROR: "$0": SWIFT_SUBBLOCKS is null.
      exit 1
    fi

    nsb=${#SWIFT_SUBBLOCK_ARRAY[@]}

    # SWIFT_JOB_SLOT is set for each worker slot by Swift; use it to pick this task's corner
    CORNER=${SWIFT_SUBBLOCK_ARRAY[$SWIFT_JOB_SLOT]}

    #Some logging
    echo "$0": running BLOCK="$COBALT_PARTNAME" SLOT="$SWIFT_JOB_SLOT"
    echo "$0": running cmd: "$0" args: "$@"
    echo "$0": running runjob --strace none --block "$COBALT_PARTNAME" --corner "$CORNER" --shape "$SHAPE" -p 16 --np "$((16*$SUBBLOCK_SIZE))" : "$@"

    #without timeout
    #runjob --strace none --block "$COBALT_PARTNAME" --corner "$CORNER" --shape "$SHAPE" -p 16 --np "$((16*$SUBBLOCK_SIZE))" : "$@"
    # -p 16: ranks per node; --np: total ranks across the subblock
    runjob --block "$COBALT_PARTNAME" --corner "$CORNER" --shape "$SHAPE" -p 16 --np "$((16*$SUBBLOCK_SIZE))" : "$@"

    echo "Runjob finished."
else
    # run w/o subblocks if no subblock size provided
    echo "Running in nonsubblock mode."
    echo "$0": running runjob -p 16 --block $COBALT_PARTNAME : "$@"

    #strace -o "$HOME/strace.runjob.out" runjob --strace none -p 16 --block $COBALT_PARTNAME : "$@"
    runjob -p 16 --block "$COBALT_PARTNAME" : "$@"

    echo "Finished Running in nonsubblock mode."
fi
exit 0
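
Swift invokes bg.sh automatically through the app definition, but the script can also be exercised by hand inside a script-mode Cobalt job for debugging (a sketch: the application path is hypothetical, and SWIFT_JOB_SLOT is normally set by the Swift worker):

export SUBBLOCK_SIZE=16
export SWIFT_JOB_SLOT=0   # index of the subblock corner to use
./bg.sh /path/to/my_app input.dat output.dat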

7. Further Information

  1. More information about Swift can be found at http://swift-lang.org.

  2. More about ALCF can be found at https://www.alcf.anl.gov.

  3. More about IBM Blue Gene/Q sub-block jobs can be found in IBM's Blue Gene/Q Application Development redbook (PDF).