Many massively parallel processors (MPPs) now provide a full Unix environment on each processor. This has many advantages, including a standard environment that users are familiar with. The disadvantage is that many common tasks, such as listing processes and files on each processor, can now take a significant amount of time and generate too much output to be quickly grasped.
This paper discusses the design of versions of some commonly used Unix tools for this kind of parallel environment as well as some issues relevant to their implementation. Prototype versions of many of these programs have been written as shell scripts and are in use at the High Performance Computing Research Facility at the Argonne National Laboratory. These prototypes are available by anonymous ftp from info.mcs.anl.gov in file pub/ibm_sp1/ptools.tar.Z.
In designing these programs, we set several goals that we believe are crucial to their success:
Familiar to Unix users. They should have easy-to-remember names (we chose to use p<unix-command-name>) and take the same arguments.
Scalable. They should be fast enough to use with the same regularity that users use ls and ps, regardless of the number of processors.
Limited output. It should be possible to restrict the amount of output to a single screenful.
For example, the parallel form of a familiar ps pipeline is
pps -all aux | grep joe
which is almost identical to the uniprocessor version, ps aux | grep joe.
The last requirement, on the amount of output, is difficult to make consistent with the first requirement. That is, if the natural extension of the Unix command to many processors would produce several hundred lines of output, we have no choice but to generate that data. However, we do provide two ways to help achieve this third goal. One way is to generate the output in a form that makes it easy for users to provide their own filters. Another is to provide some additional programs with options that help the user reduce the amount of output. An example of this is looking for a file. On uniprocessor Unix systems, the command ls is often used to check whether a file is present: the user types ls filename and looks at the output to see if the file is indeed there. This is (usually) fine on a uniprocessor system, but on a parallel processor with individual file systems, it could generate hundreds of lines of output. Worse, if the file is present on most but not all processors, it is easy to miss that fact in the massive outpouring of data that executing ls on each processor could produce.
The answer to this problem that we have selected is to look at other
ways that Unix provides to answer the same question. For example, on a
uniprocessor system the user could have executed
test -s filename
if ($status != 0) echo "file does not exist"
We provide a capability like this with ppred, where we have
simplified the interface by combining the test with the action.
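The idea of combining the test with the action can be sketched as follows. This is only an illustration in POSIX sh, not the ppred prototype or its interface: the function name missing_on is hypothetical, and each "node" is stood in for by a local directory so that the sketch runs on a single machine. A real tool would run the test on each remote processor.

```shell
#!/bin/sh
# Hypothetical sketch: print one line for each node on which FILE is
# missing or empty, and nothing for nodes where the test succeeds.
# Assumption: each node is simulated by a local directory.
# usage: missing_on FILE NODE_DIR...
missing_on() {
    file=$1
    shift
    for node_dir in "$@"; do
        # test -s succeeds only if the file exists and has nonzero size
        if ! test -s "$node_dir/$file"; then
            printf '%s: %s missing or empty\n' "$(basename "$node_dir")" "$file"
        fi
    done
}
```

Because only the failing nodes produce output, a file that is absent on one processor out of several hundred yields a single line rather than being buried in a long listing.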
Managing processes has the same problem; executing ps on even a few processors can produce too much data to grasp easily. We introduce a command pfps that provides services similar to find applied to the space of processes instead of files.
An alternative way to manage large volumes of data is to use a graphical rather than a text-based display. We describe a program pdisp that can translate the output from our other tools into a graphical display.
To simplify the processing of output from any of these tools by other Unix tools (including the graphical display tools we will discuss in Section Parallel display), every output line is prefixed with the nodename of the processor that produced it.
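As an illustration of why a per-line node prefix helps, the sketch below summarizes prefixed output with standard tools. It assumes a prefix of the form "nodename: " (name, colon, space); the exact prefix format used by the prototypes is not specified here, and the function name count_by_node is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: read lines of the assumed form "nodename: text"
# on stdin and print "nodename count", one line per node, sorted by name.
count_by_node() {
    awk -F': ' '{ n[$1]++ } END { for (k in n) print k, n[k] }' | sort
}
```

A user could pipe the output of a parallel command through such a filter to reduce several hundred lines to one line per processor.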
It is particularly important that output be "line-atomic"; that is, output sent to stdout from one processor should not appear in the middle of a line generated by another processor.
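One simple way to obtain line-atomic output, shown here only as a sketch and not necessarily how the prototypes do it, is to buffer each node's output separately and then emit the buffers one after another. The function name gather is hypothetical, the "nodes" are simulated by running the same command locally in parallel, the "nodename: " prefix format is an assumption, and node names are assumed not to contain a slash (which would break the sed expression).

```shell
#!/bin/sh
# Hypothetical sketch: run CMD once per node name in parallel, buffer each
# run's prefixed output in its own temporary file, then print the buffers
# in node order so that lines from different nodes never interleave.
# usage: gather CMD NODE...
gather() {
    cmd=$1
    shift
    dir=$(mktemp -d) || return 1
    for node in "$@"; do
        # Each background pipeline writes only to its own file.
        sh -c "$cmd" | sed "s/^/$node: /" > "$dir/$node" &
    done
    wait    # all simulated nodes have finished
    for node in "$@"; do
        cat "$dir/$node"
    done
    rm -rf "$dir"
}
```

The cost of this approach is that no output appears until every node has finished; a tool that streams results would instead need to ensure that each line is written as a single unit.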