MPICH.NT FAQ

What are the system requirements to run MPICH on Windows?

To use the default launcher, mpd, the following must be true:

If the top two are not true, you cannot use MPICH.  If you do not have administrator privileges, you can still use mpd from the command line without installing it.  Download the source distribution and read the manual for instructions on how to use mpd from a command prompt.

Can I run a single MPICH application on multiple Windows and Unix/Linux boxes?

No.  The Unix TCP device uses p4 code while the Windows device uses Windows-specific code, and the two are not compatible.  Even if you had a launcher that could start processes on both Linux and Windows nodes, the Windows device cannot make socket connections to the p4 device on the Linux nodes, and vice versa.

Can I use MS Visual C++ 5.0 or older?

No.  The libraries and project files provided were all created with VC 6 and Visual Fortran 6.  If you want to use an older version of the compiler you will have to re-compile the MPICH source.  This may require editing the project files by hand so they can be read by an older version of Visual Studio.

Can I run MPICH applications on Windows9x/ME?

In a limited way, yes.  The TCP/IP device for Windows has code that only runs on WindowsNT/2000/XP, but you can use the -localonly option to mpirun on a Win9x machine.  This means you can run multiple processes on a single Win9x machine, but you cannot run applications across multiple Win9x machines.  This capability is provided so you can compile and test programs on a single Win9x machine and then run the code on an NT cluster at some other time.  To install on a Win9x machine, download the source distribution, unzip the contents, use mpirun from the bin directory, and make sure the dlls in the lib directory are in your path.  Help files are in the www directory (www\index.html).

Why doesn't mpirun work in the cygwin bash shell?

The cygwin environment has problems with the Windows API function CreateProcess.  A workaround was introduced in mpich.nt.1.2.2 (Oct 10, 2001).  That version and more recent versions of mpirun work in a bash shell running in a command prompt.  mpirun does not work in the XFree86 windowing environment.

How do I debug my application?

1) Put printf and fflush(stdout) calls in your program.

2)  The help pages describe how to launch an application by hand without using mpirun.  It involves setting environment variables by hand in a command prompt and then executing the application.  You can use this method to debug an application.  First, bring up two command prompts and set the environment variables so that you can run a two process job.  Then instead of running the application, execute "msdev myapp.exe".  This will bring up the developer studio and then you can step through the code using the debugger.  See the help pages for the specific environment variables to set.
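As a minimal sketch of option 1, the helper below prints one tagged line and flushes it right away, so the text is not left sitting in a buffer if the process crashes on the next statement.  The function name debug_trace and the message format are made up for this illustration; in an MPI program the rank argument would typically be the value returned by MPI_Comm_rank.

```c
#include <stdio.h>

/* Hypothetical helper (not part of MPICH): print one trace line tagged
 * with the caller's rank and flush stdout immediately.
 * Returns 0 on success, -1 if printing or flushing failed. */
int debug_trace(int rank, const char *where)
{
    if (printf("[rank %d] reached %s\n", rank, where) < 0)
        return -1;
    /* Without the flush, buffered output can be lost when a process dies. */
    return fflush(stdout) == 0 ? 0 : -1;
}
```

Calling debug_trace(rank, "after MPI_Init") at a few key points is often enough to see which process stops and where.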

Why do I get this error "LaunchProcess failed, CreateProcessAsUser failed, No more connections can be made to this remote computer at this time because there are already as many connections as the computer can accept."?

This error usually occurs when you try to launch an executable from a shared directory on WindowsNT Workstation, Windows 2000 Professional, or WindowsXP Professional.  The professional versions of Windows, as opposed to the server editions, limit the number of simultaneous file sharing connections.  Place the executable on a network share on a server machine or copy the executable to the local drive of each machine.

I'm having a problem with mpirun. When I use the command-line interface my application loads fine and works. When I try using a configuration file or use guimpirun, for some reason, my application is unable to create a window.

1) The process launcher for MPICH, mpd, runs as a service. When it launches processes they are put in their own hidden desktop. Any windows these processes bring up are hidden from view. If you must be able to see your windows, you can allow processes to share the default desktop by re-installing mpd with the interact option. Execute "mpd -remove" to uninstall and then execute "mpd -install -interact" to re-install.

This will not work for a terminal services session. This will only allow windows to show up on the default logon desktop (the monitor directly connected to the host). Also, there may be permission issues if a user is logged on to a machine and a different user attempts to launch a process on the same machine. So this is not the default nor recommended method of installation.

2) But sometimes I can see my windows, even with the default installation.  This is true.  If mpirun determines that you are only running processes on the local machine, it bypasses mpd and launches the processes in the current context, which allows you to see your windows.  When mpirun parses a configuration file, it always uses mpd.  guiMPIRun always uses mpd.

Why doesn't the mpirun option do what the help pages say it should do?

mpirun options must be specified before the name of the executable.  Any options specified after the executable are passed as arguments to the executable and are not parsed as mpirun options.  For example, "mpirun -np 5 myapp.exe -machinefile filename" will not use the machine file specified by 'filename' because mpirun considers this an argument to the application.  To use the machine file, write "mpirun -np 5 -machinefile filename myapp.exe" instead.
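The ordering rule can be pictured with a small sketch.  This is not mpirun's actual parser; it is an illustration that assumes options start with '-' and that -np and -machinefile each take one value, as in the examples above.  The function name first_app_arg is hypothetical.

```c
#include <string.h>

/* Illustration only, not mpirun source code.
 * Returns the index of the first argument that is not a launcher option,
 * i.e. the executable name, or -1 if there is none.  Everything from that
 * index onward would belong to the application. */
int first_app_arg(int argc, char *argv[])
{
    int i = 1;                        /* argv[0] is the launcher itself */
    while (i < argc && argv[i][0] == '-') {
        /* Assumption: these two options each consume one following value. */
        if (strcmp(argv[i], "-np") == 0 ||
            strcmp(argv[i], "-machinefile") == 0)
            i += 2;
        else
            i += 1;
    }
    return i < argc ? i : -1;
}
```

Under these assumptions, for "mpirun -np 5 myapp.exe -machinefile filename" the function returns the index of myapp.exe, so "-machinefile filename" falls on the application's side of the split.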

What does this error mean?: Unable to create a temporary file on 'host'  FAIL The directory name is invalid.

The TEMP value specified in MPIConfig must be a local path.  You cannot use a network path such as \\server\share as a temporary path.  It must be something like c:\ or c:\temp.
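If you generate or validate configuration values yourself, a quick check along these lines can catch a network path before it reaches MPIConfig.  This is only a sketch of the rule stated above, not something MPIConfig does; the function name looks_like_local_path is hypothetical, and the check only recognizes the drive-letter form (c:\...).

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `path` starts like a local drive path
 * such as c:\ or c:\temp, and 0 otherwise (for example a UNC path such as
 * \\server\share, an empty string, or NULL). */
int looks_like_local_path(const char *path)
{
    if (path == NULL || strlen(path) < 3)
        return 0;
    return isalpha((unsigned char)path[0]) &&
           path[1] == ':' &&
           path[2] == '\\';
}
```

Note that a mapped network drive also has a drive letter, so passing this check does not prove the directory is physically local.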

Why do my Fortran programs work on some machines in the cluster but not all of them?

The Fortran dlls that come with the Fortran compilers are not very version friendly.  Make sure you have the same version of the compiler dlls on all the machines.  Alternatively, place the compiler dlls in the same directory as your executable.  Windows loads dlls from the executable directory before searching the path, so this will ensure that all the processes use the same dlls.

Why does mpirun return immediately without any output and apparently without running any processes?

There are several things that can cause your job to crash before it even starts.  The process launcher by default prevents any popup windows from appearing when your processes crash.  So if the job crashes at startup you may not know that it has even run.  The two main causes are:

1) You are missing a dll required by the process.  Many times you will compile a program that works on the local machine and then crashes on a remote machine because the remote machine does not have the necessary dlls.  If you compiled with cygwin you need to make sure the cygwin dll is on all the nodes.  If you compile with the Microsoft Visual tools you need to make sure those libraries are available.  One way to solve this problem without copying files to the remote nodes is to place all necessary dlls in the same directory as the executable.  Windows looks in the directory of the executable first before searching the path for dlls.

2) Your executable is broken.  If you place a very large array in the global space (popular with Fortran developers) the process will crash at load time if this exceeds the process's reserved global variable space.

Why do I get this error, "Logon failure: unknown user name or bad password"?

You must have the same account credentials on all the nodes participating in the mpich job.  If your cluster is set up with a domain controller then you can use a domain account to launch an mpich job.  If you do not have a domain controller then you must set up user accounts on all the nodes individually with the same credentials on each node.  Each user can have whatever password they choose, but they must use the same password on all the nodes.  In other words, UserA-PasswordA must be the same on all the nodes and UserB-PasswordB must be the same on all the nodes, etc.

Why do I get this error, "LaunchProcess failed, CreateProcessAsUser failed, The system cannot find the file specified."?

The executable used in an mpich job must be available to all the nodes participating in the job.  The path to the executable must be valid on all the nodes.  This can be accomplished by copying the executable to a common location on all the nodes or copying it to a shared location.  For example, you could copy cpi.exe to c:\temp\cpi.exe on all the nodes and run "mpirun -np 3 c:\temp\cpi.exe".  Or you could copy cpi.exe to a shared directory \\myhost\myshare\cpi.exe and then execute "mpirun -np 3 \\myhost\myshare\cpi.exe".

Why do I get this error, "CreateThread failed in ControlLoopThread. Not enough storage is available to process this command."?

MPICH uses two to four threads internally.  Some users may want to increase the thread stack size so they can have large static variables.  This is common with many Fortran programs.  The problem is that your program runs out of memory for the stack.  You can't set the thread stack size greater than one quarter of the total available stack memory.  The easiest solution is to leave the stack size alone and allocate your large variables with malloc.
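A minimal sketch of the malloc approach in C: instead of declaring a large fixed-size array, request the memory from the heap at run time and release it when done.  The function name alloc_matrix and the row-major layout are choices made for this example only.

```c
#include <stdint.h>
#include <stdlib.h>

/* Example helper: allocate a zero-initialized rows x cols matrix of
 * doubles on the heap, stored row-major in one contiguous block.
 * Returns NULL if a dimension is zero, the size would overflow, or
 * memory is exhausted.  The caller releases the block with free(). */
double *alloc_matrix(size_t rows, size_t cols)
{
    if (rows == 0 || cols == 0)
        return NULL;
    if (rows > SIZE_MAX / cols)       /* rows * cols would overflow */
        return NULL;
    return calloc(rows * cols, sizeof(double));
}
```

Element (i, j) is then accessed as m[i * cols + j].  Because the memory comes from the heap, its size does not count against the thread stack or the space reserved for global variables.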