Implementation


Up: Contents Previous: Parallel partition info

It is important that these commands themselves execute in parallel. In interactive use, it is common to expect a command to complete in a second or less. The parallel version of the same command should not take much longer. This requires that the commands be executed in parallel.

A simple way to arrange for parallel execution is to use recursive subdivision. Each node is given some number of processes to run a command on. It divides that list in half, and sends the upper half to the first processor in that half. This process continues until only one process is left. This takes steps forp processes. A simple form of this is shown in Figure 3 for pls. This sample code has no error checking and assumes a single range of processors from start to end. The names of the nodes are spnodei, for .

Various optimizations of this process are possible. For example, for small numbers, the recursive subdivision may be replaced with a simple loop. Other optimizations can take advantage of the particular structure of a parallel machine, adapting the subdivision strategy to the available communication network and services.

In order to provide maximum parallelism, each subdivision must execute the subdivided processes in the background. It is important to insure that a command does not return until all of its children have completed.


on load over ethernet, provide some timings and analysis

Many of these commands can be implemented in terms of pexec or ppred, perhaps combined with some relatively simple awk or perl scripts. We have chosen to provide a larger set of commands because they represent common cases for which we believe shortcuts should be provided.


 
 
        #! /bin/csh 
        set start = $1 
        set end   = $2 
        shift 
        shift 
        set nodename = `hostname` 
        ls $* |& sed "s/^/$nodename/g"  
        while (1) 
            if ($start >= $end) break 
            # Compute the separator for the tree 
            @ sep = ($start + $end) / 2 
            if ($sep == $start) then 
                @ sep = $sep + 1 
                if ($sep > $end) break 
            endif 
            rsh spnode$sep -n /tmp/pls $sep $end $* & 
            @ end = $sep - 1 
        end 
        wait 

Figure 3: Simplified code for parallel ls

On many systems, the time to load a program from a central filesystem can be significant. On these systems, programs (including these tools) should be loaded from local disks (assumed to be /tmp). Our prototype implementation includes a program ptinit that copies the codes to /tmp. The programs that are executed by these tools (e.g., ps) should also reside on local disks where possible.

To handle the case where a node in the list is not responding or unavailable, the programs should issue a warning message and skip to the next process in the upper half to insure that processors are not missed because their `parent' in the subdivision tree was not available. Because it is time-comsuming to detect down nodes, a replicated database of down nodes should be used.

The implementation of parallel rerun should sort the input script and use the recursive subdivision (or at least collect all commands for the same processor together and send them in a lump).

Also note that all of these commands can execute faster if a server process is always running on each of the parallel processors. Such a server is not required however; the prototype implementation is written entirely in terms of shell scripts.