Date: Fri, 6 Jul 2001 15:59:43 +0200 (MET DST)
From: Georg Bisseling
Subject: Clarifying and extending the definition of MPI_Comm_disconnect
To: mpi-21@XXXXXXXXXXXXX
Cc: Hans-Christian.Hoppe@XXXXXXXXXX, Alexander.Supalov@XXXXXXXXXX

Dear Group,

I want to bring up an issue that was explained to us by application
programmers who use process spawning and MPI_Comm_disconnect in a
master-slave manner.

The master spawns child process groups one after another. Because of
resource limits, the programmers needed to know exactly when the child
processes were gone and their respective CPUs were available again. So
the programmers wanted an MPI_Comm_disconnect that waits for the
termination of the child processes.

As mentioned in the discussion about the (non)collective nature of
MPI_Finalize(), there are other programming models (a.k.a. "fire and
forget") that are also very useful and sensible. But they would require
an MPI_Comm_disconnect that does not wait for the termination of the
child processes.

I think application programmers would really appreciate a predefined
key (e.g. MPI_DISCONNECT_WAIT) for communicators that specifies which
flavor of MPI_Comm_disconnect to use for that comm.

There are some problems with that:

- One has to keep a reference count for each process, detect when the
  last comm lets that process go, and wait for its termination only
  then. This way the communicators can be disconnected in any order
  without breaking the progress requirement.

- What if the comm that is disconnected last does not contain the
  parent process in the OS's sense? Then nobody can call wait! This
  problem has many incarnations, depending on the flavor of (remote)
  execution used.

Any implementation chosen would require careful documentation of its
behavior.
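The reference counting described above can be illustrated with a small
model. This is only a sketch in plain Python, not MPI code: the class
DisconnectTracker, its methods, and the names "spawn_comm",
"merged_comm", "child0" and "child1" are all hypothetical. It also
assumes that the wait flag of the communicator that drops the last
reference decides whether to wait, which is one possible reading of the
proposal and not something the proposal fixes.

```python
from collections import Counter


class DisconnectTracker:
    """Toy model: counts how many communicators still reference each process."""

    def __init__(self):
        self.refs = Counter()  # process id -> number of live communicators
        self.comms = {}        # communicator name -> (process ids, wait flag)
        self.waited = []       # processes whose termination was awaited

    def connect(self, name, procs, wait=True):
        """Register a communicator containing the given remote processes."""
        self.comms[name] = (frozenset(procs), wait)
        for p in procs:
            self.refs[p] += 1

    def disconnect(self, name):
        """Drop a communicator; return the processes released by this call."""
        procs, wait = self.comms.pop(name)
        released = []
        for p in sorted(procs):
            self.refs[p] -= 1
            if self.refs[p] == 0:
                # This was the last communicator holding on to process p.
                del self.refs[p]
                released.append(p)
                if wait:
                    # A real implementation would block here until p exits.
                    self.waited.append(p)
        return released


# Hypothetical scenario: a spawned comm and a comm merged from it.
tracker = DisconnectTracker()
tracker.connect("spawn_comm", {"child0", "child1"})
tracker.connect("merged_comm", {"child0", "child1"})

first = tracker.disconnect("merged_comm")  # children are still referenced
second = tracker.disconnect("spawn_comm")  # last reference is dropped here
```

In this model the two communicators can be disconnected in either
order. Only the second call releases the children, so the wait can
happen only there, which is what keeps the first call from blocking.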
Maybe one could guarantee the expected behavior only for spawned comms
and for intracomms that are merged spawned comms.

Maybe there are other OS- or batch-system-dependent requirements on the
behavior of MPI_Comm_disconnect that I have not even thought of?

Regards
Georg

// pallas GmbH ............ Georg Bisseling ...........
Hermuelheimer Str. 10        Software Engineer
D-50321 Bruehl, Germany      Georg.Bisseling@XXXXXXXXXX
fax +49-(0)2232-1896-29      phone +49-(0)2232-1896-0
http://www.pallas.com        direct +49-(0)2232-1896-44
..........................................................