| File | FIXME Note |
|---|---|
| src/mpi/pt2pt/sendrecv.c | Performance for small messages might be better if MPID_Send() were used here instead of MPID_Isend() |
| src/mpi/pt2pt/sendrecv.c | should we cancel the pending (possibly completed) receive request or wait for it to complete? |
| src/mpi/pt2pt/bsendutil.c | This is the wrong error text (notimpl) |
| src/mpi/pt2pt/bsendutil.c | |
| src/mpi/pt2pt/sendrecv_rep.c | should we cancel the pending (possibly completed) receive request or wait for it to complete? |
| src/mpi/pt2pt/mpir_request.c | are Ibsend requests added to the send queue? |
| src/mpi/pt2pt/mpir_request.c | MPIR_SENDQ_FORGET(request_ptr); -- it appears that persistent sends are not currently being added to the send queue. should they be, or should this release be conditional? |
| src/mpi/pt2pt/mpir_request.c | What is this routine for? [BRT] it is used by testall, although looking at testall now, I think the algorithm can be change slightly and eliminate the need for this routine |
| src/mpi/pt2pt/mpir_request.c | Why do we need to get the thread-private storage here? |
| src/mpi/pt2pt/ibsend.c | There should be no unreferenced args! |
| src/mpi/pt2pt/ibsend.c | use the memory management macros |
| src/mpi/pt2pt/ibsend.c | should we be setting the request at all in the case of an error? |
| src/mpi/pt2pt/bsend.c | Check for MPID_WOULD_BLOCK? |
| src/mpi/pt2pt/iprobe.c | Is this correct for intercomms? |
| src/mpi/pt2pt/greq_start.c | MT this function is not thread safe when using fine-grained threading |
| src/mpi/init/async.c | The CS_ENTER/EXIT code should be abstracted out correctly, instead of relying on the #if protection here. |
| src/mpi/init/async.c | We assume that waiting on some request forces progress on all requests. With fine-grained threads, will this still work as expected? We can imagine an approach where a request on a non-conflicting communicator would not touch the remaining requests to avoid locking issues. Once the fine-grained threads code is fully functional, we need to revisit this and, if appropriate, either change what we do in this thread, or delete this comment. |
| src/mpi/init/initthread.c | Move to os-dependent interface? |
| src/mpi/init/initthread.c | This severly degrades performance but fixes alignment issues with the datatype code. |
| src/mpi/init/initthread.c | The call to err init should be within an ifdef HAVE_ ERROR_CHECKING block (as must all uses of Err_create_code) |
| src/mpi/init/initthread.c | the default behavior for all MPI routines is to abort. This isn't always convenient, because there's no other way to get this routine to simply return. But we should provide some sort of control for that and follow the default defined by the standard |
| src/mpi/init/initthread.c | Define these in the interface. Does Timer init belong here? |
| src/mpi/init/initthread.c | Does this need to come before the call to MPID_InitComplete? For some debugger support, MPIR_WaitForDebugger may want to use MPI communication routines to collect information for the debugger |
| src/mpi/init/initthread.c | Can we get away without locking every time. Now, we need a MPID_CS_ENTER/EXIT around MPI_Init and MPI_Init_thread. Progress may be called within MPI_Init, e.g., by a spawned child process. Within progress, the lock is released and reacquired when blocking. If the lock isn't acquired before then, the release in progress is incorrect. Furthermore, if we don't release the lock after progress, we'll deadlock the next time this process tries to acquire the lock. MPID_CS_ENTER/EXIT functions are used here instead of MPIU_THREAD_CS_ENTER/EXIT because MPIR_ThreadInfo.isThreaded hasn't been initialized yet. |
| src/mpi/init/abort.c | 100 is arbitrary and may not be long enough |
| src/mpi/init/abort.c | It isn't clear that Abort should wait for a critical section, since that could result in the Abort hanging if another routine is hung holding the critical section. Also note the "not thread-safe" comment in the description of MPI_Abort above. |
| src/mpi/init/abort.c | This is not internationalized |
| src/mpi/init/finalize.c | Why is this not one of the finalize callbacks?. Do we need pre and post MPID_Finalize callbacks? |
| src/mpi/init/finalize.c | Should this also be a finalize callback? |
| src/mpi/init/finalize.c | Many of these debugging items could/should be callbacks, added to the finalize callback list |
| src/mpi/init/finalize.c | Both the memory tracing and debug nesting code blocks should be finalize callbacks |
| src/mpi/init/finalize.c | We'd like to arrange for the mem dump output to go to separate files or to be sorted by rank (note that the rank is at the head of the line) |
| src/mpi/init/init.c | Can we get away without locking every time. Now, we need a MPID_CS_ENTER/EXIT around MPI_Init and MPI_Init_thread. Progress may be called within MPI_Init, e.g., by a spawned child process. Within progress, the lock is released and reacquired when blocking. If the lock isn't acquired before then, the release in progress is incorrect. Furthermore, if we don't release the lock after progress, we'll deadlock the next time this process tries to acquire the lock. MPID_CS_ENTER/EXIT functions are used here instead of MPIU_THREAD_SINGLE_CS_ENTER/EXIT because MPIR_ThreadInfo.isThreaded hasn't been initialized yet. |
| src/mpi/attr/comm_get_attr.c | We could initialize some of these here; only tag_ub is used in the error checking. |
| src/mpi/attr/comm_get_attr.c | Attributes must be visable from all languages |
| src/mpi/attr/win_set_attr.c | communicator of window? |
| src/mpi/comm/comm_create.c | BUBBLE SORT |
| src/mpi/comm/comm_split.c | Bubble sort |
| src/mpi/comm/comm_group.c | Add a sanity check that the size of the group is the same as the size of the communicator. This helps catch corrupted communicators |
| src/mpi/comm/intercomm_create.c | A temporary version for lpids within my comm world |
| src/mpi/comm/intercomm_create.c | A temp for lpids within my comm world |
| src/mpi/comm/intercomm_create.c | for MPI1, all process ids are relative to MPI_COMM_WORLD. For MPI2, we'll need to do something more complex |
| src/mpi/comm/intercomm_create.c | Error |
| src/mpi/comm/intercomm_create.c | Should be using the local_size argument |
| src/mpi/comm/intercomm_merge.c | For the intracomm, we need a consistent context id. That means that one of the two groups needs to use the recvcontext_id and the other must use the context_id |
| src/mpi/comm/commutil.c | Reusing context ids can lead to a race condition if (as is desirable) MPI_Comm_free does not include a barrier. Consider the following: Process A frees the communicator. Process A creates a new communicator, reusing the just released id Process B sends a message to A on the old communicator. Process A receives the message, and believes that it belongs to the new communicator. Process B then cancels the message, and frees the communicator. The likelyhood of this happening can be reduced by introducing a gap between when a context id is released and when it is reused. An alternative is to use an explicit message (in the implementation of MPI_Comm_free) to indicate that a communicator is being freed; this will often require less communication than a barrier in MPI_Comm_free, and will ensure that no messages are later sent to the same communicator (we may also want to have a similar check when building fault-tolerant versions of MPI). |
| src/mpi/comm/commutil.c | this is an alternative constructor that doesn't use MPIR_Comm_create! |
| src/mpi/comm/commutil.c | No local functions for the collectives |
| src/mpi/comm/commutil.c | No local functions for the topology routines |
| src/mpi/comm/commutil.c | TEMP for current yield definition |
| src/mpi/comm/commutil.c | we need an internal tag or communication channel. Can we use a different context instead?. Or can we use the tag provided in the intercomm routine? (not on a dup, but in that case it can use the collective context) |
| src/mpi/spawn/lookup_name.c | change **fail to something more meaningful |
| src/mpi/spawn/lookup_name.c | change **fail to something more meaningful |
| src/mpi/spawn/comm_disconnect.c | MT should we be checking this? |
| src/mpi/spawn/open_port.c | If info_ptr is non-null, we should validate it |
| src/mpi/datatype/status_set_elements.c | The device may want something else here |
| src/mpi/datatype/typeutil.c | does the order of this list need to correspond to anything in particular? There are several lists of predefined types sprinkled throughout the codebase and it's unclear which (if any) of them must match exactly. [goodell@ 2009-03-17] |
| src/mpi/datatype/typeutil.c | This is actually an error, since this routine should only be called once |
| src/mpi/datatype/typeutil.c | This needs to be make thread safe |
| src/mpi/datatype/type_match_size.c | Should make use of Fortran optional types (e.g., MPI_INTEGER2) |
| src/mpi/datatype/register_datarep.c | UNIMPLEMENTED |
| src/mpi/timer/mpidtime.c | We need to cleanup the use of the MPID_Generic_wtick prototype |
| src/mpi/topo/topoutil.c | thread safe code needs a thread lock here, followed by another test on the keyval to see if a different thread got there first |
| src/mpi/topo/topoutil.c | free the attribute data structure |
| src/mpi/topo/cart_create.c | nranks could be zero (?) |
| src/mpi/topo/dist_gr_neighb_count.c | Why does this routine require a CS? |
| src/mpi/topo/dist_gr_neighb.c | Why does this routine need a CS |
| src/mpi/topo/dims_create.c | This routine does not work well with non-powers of two (and probably with collections of different factors. For example, factoring squares like 6*6 or 7*7 in to 2 dimensions doesn't yield the expected results (18,2) and (49,1) in current tests |
| src/mpi/topo/dims_create.c | |
| src/mpi/errhan/comm_set_errhandler.c | Add a check that this is a comm errhandler. This test should be a macro so that it can be used in all calls, along with similar calls for win and file |
| src/mpi/errhan/dynerrutil.c | Does this need a thread-safe init? |
| src/mpi/errhan/dynerrutil.c | Unallocated error code? |
| src/mpi/errhan/dynerrutil.c | Unallocated error code? |
| src/mpi/errhan/dynerrutil.c | For robustness, we should make sure that the associated string is initialized to null |
| src/mpi/errhan/file_get_errhandler.c | check for a valid file handle (fh) before converting to a pointer |
| src/mpi/errhan/file_get_errhandler.c | Is this obsolete now? |
| src/mpi/errhan/errutil.c | Now that error codes are chained, this does not produce a valid error code since there is no valid ring index corresponding to this code |
| src/mpi/errhan/errutil.c | We probably want to set this to MPI_ERR_UNKNOWN and discard the rest of the bits |
| src/mpi/errhan/errutil.c | Should this check against dynamically-created error classes? |
| src/mpi/errhan/errutil.c | These strings need to be internationalized |
| src/mpi/errhan/errutil.c | This should really be the same as MPI_MAX_ERROR_STRING, or in the worst case, defined in terms of that |
| src/mpi/errhan/errutil.c | Not internationalized |
| src/mpi/errhan/errutil.c | Not internationalized |
| src/mpi/errhan/errutil.c | This routine isn't quite right yet |
| src/mpi/errhan/errutil.c | not internationalized |
| src/mpi/errhan/errutil.c | These functions are not thread safe |
| src/mpi/errhan/errutil.c | default is not thread safe |
| src/mpi/errhan/errutil.c | A check for MPI_IN_PLACE should only be used where that is valid |
| src/mpi/errhan/errutil.c | We may want to use 0x%p for systems that (including Windows) that don't prefix %p with 0x. This must be done with a capability, not a test on particular OS or header files |
| src/mpi/errhan/errutil.c | Where is the documentation for this function? What is it for? |
| src/mpi/errhan/errutil.c | This needs to be made consistent with the different thread levels, since in the "global" thread level, an extra thread mutex is not required. |
| src/mpi/errhan/errutil.c | ERR_OTHER is overloaded; this may mean "OTHER" or it may mean "No additional error, just routine stack info" |
| src/mpi/errhan/errutil.c | Internal error. Generate some debugging information; Fix for the general release |
| src/mpi/errhan/errutil.c | This flag wasn't documented in the release specs, and in any event shouldn't be controlled through source-code changes (i.e., make it either a configure option or a runtime option) |
| src/mpi/errhan/errutil.c | This is wrong. The only way that you can get here without errcode beign MPI_SUCCESS is if there is an error in the processing of the error codes. Dropping through into the next level of code (particularly when that code doesn't check for valid error codes!) is erroneous |
| src/mpi/errhan/errutil.c | Not internationalized |
| src/mpi/errhan/errutil.c | Shouldn't str be const char * ? |
| src/mpi/errhan/errutil.c | Don't use Snprint to append a string ! |
| src/mpi/errhan/errutil.c | The following code is broken as described above (if the errcode is not valid, then this code is just going to cause more problems) |
| src/mpi/errhan/errutil.c | (Here and elsewhere) Make sure any string is non-null before you use it |
| src/mpi/errhan/errutil.c | Not internationalized |
| src/mpi/errhan/errutil.c | Remove the bogus fn argument from all uses of this routine |
| src/mpi/errhan/errutil.c | How do we determine that we failed to unwind the stack? |
| src/mpi/errhan/file_call_errhandler.c | check for a valid file handle (fh) before converting to a pointer |
| src/mpi/errhan/file_call_errhandler.c | Is this obsolete now? |
| src/mpi/errhan/file_set_errhandler.c | check for a valid file handle (fh) before converting to a pointer |
| src/mpi/errhan/file_set_errhandler.c | Is this obsolete now? |
| src/mpi/errhan/file_set_errhandler.c | We need an error return |
| src/mpi/errhan/file_set_errhandler.c | We need an error return |
| src/mpi/errhan/add_error_code.c | verify that errorclass is a dynamic class |
| src/mpi/coll/helper_fns.c | should we cancel the pending (possibly completed) receive request or wait for it to complete? |
| src/mpi/coll/helper_fns.c | For the brief-global and finer-grain control, we must ensure that the global lock is *not* held when this routine is called. (unless we change progress_start/end to grab the lock, in which case we must *still* make sure that the lock is not held when this routine is called). |
| src/mpi/coll/reduce.c | does this need to be checked after each uop invocation for predefined operators? |
| src/mpi/coll/reduce.c | does this need to be checked after each uop invocation for predefined operators? |
| src/mpi/coll/allreduce.c | mpi_errno is error CODE, not necessarily the error class MPI_ERR_OP. In MPICH2, we can get the error class with errorclass = mpi_errno & ERROR_CLASS_MASK; |
| src/mpi/coll/red_scat.c | should we be checking the op_errno here? |
| src/mpi/coll/red_scat.c | this version only works for power of 2 procs |
| src/mpi/coll/oputil.h | Is MPI_Fint really OK here? |
| src/mpi/coll/bcast.c | move to somewhere else |
| src/mpi/coll/bcast.c | it would be nice if we could refactor things to minimize duplication between this and MPIR_Scatter and friends. We can't use MPIR_Scatter as is without inducing an extra copy in the noncontig case. |
| src/mpi/coll/bcast.c | This function uses some heuristsics based off of some testing on a cluster at Argonne. We need a better system for detrmining and controlling the cutoff points for these algorithms. If I've done this right, you should be able to make changes along these lines almost exclusively in this function and some new functions. [goodell@ 2008/01/07] |
| src/mpi/coll/bcast.c | binomial may not be the best algorithm for on-node bcast. We need a more comprehensive system for selecting the right algorithms here. |
| src/mpi/coll/bcast.c | binomial may not be the best algorithm for on-node bcast. We need a more comprehensive system for selecting the right algorithms here. |
| src/mpi/coll/bcast.c | It would be good to have an SMP-aware version of this algorithm that (at least approximately) minimized internode communication. |
| src/mpi/coll/allgather.c | saving an MPI_Aint into an int |
| src/mpi/coll/allgather.c | saving an MPI_Aint into an int |
| src/mpi/debugger/tvtest.c | Move these into dbgstub.c |
| src/mpi/debugger/dbginit.c | In MPICH2, the executables may not have the information on the other processes; this is part of the Process Manager Interface (PMI). We need another way to provide this information to a debugger |
| src/mpi/debugger/dll_mpich2.c | include the mpichconf.h file? |
| src/mpi/debugger/dll_mpich2.c | handle size properly (declared as 64 in mpi_interface.h) |
| src/mpi/romio/mpi-io/get_extent.c | handle other file data representations |
| src/mpi/romio/mpi-io/mpioimpl.h | We should be more restricted in what we include from MPICH2 into ROMIO |
| src/mpi/romio/mpi-io/iotestall.c | THIS IS NOT CORRECT (see above). But most applications won't care |
| src/mpi/romio/mpi-io/get_view.c | It is wrong to use MPI_Type_contiguous; the user could choose to re-implement MPI_Type_contiguous in an unexpected way. Either use NMPI_Barrier as in MPICH2 or PMPI_Type_contiguous |
| src/mpi/romio/mpi-io/get_view.c | Ditto for MPI_Type_commit - use NMPI or PMPI |
| src/mpi/romio/mpi-io/get_view.c | Ditto for MPI_Type_xxx - use NMPI or PMPI |
| src/mpi/romio/mpi-io/read_ord.c | Check for error code from ReadStridedColl? |
| src/mpi/romio/mpi-io/write_ordb.c | Check for error code from WriteStridedColl? |
| src/mpi/romio/mpi-io/write_alle.c | we should really ensure that the split_datatype remains valid by incrementing the ref count in the write_allb.c routine and decrement it here after setting the bytes |
| src/mpi/romio/mpi-io/seek_sh.c | explain why the barrier is necessary |
| src/mpi/romio/mpi-io/write_ord.c | Check for error code from WriteStridedColl? |
| src/mpi/romio/mpi-io/close.c | It is wrong to use MPI_Barrier; the user could choose to re-implement MPI_Barrier in an unexpected way. Either use NMPI_Barrier as in MPICH2 or PMPI_Barrier |
| src/mpi/romio/mpi-io/glue/mpich2/mpio_err.c | These external prototypes should be included from mpich2/src/include/mpiext.h |
| src/mpi/romio/mpi-io/glue/mpich2/mpio_err.c | This is a hack in case no error handler was set |
| src/mpi/romio/adio/common/ad_init.c | should be checking error code from MPI_Info_create here |
| src/mpi/romio/adio/common/req_malloc.c | Insert error here |
| src/mpi/romio/adio/common/strfns.c | Really need a check for varargs.h vs stdarg.h |
| src/mpi/romio/adio/common/strfns.c | Assumes ASCII |
| src/mpi/romio/adio/common/lock.c | This is a temporary hack until we use flock64 where available. It also doesn't fix the broken Solaris header sys/types.h header file, which declars off_t as a UNION ! Configure tests to see if the off64_t is a union if large file support is requested; if so, it does not select large file support. |
| src/mpi/romio/adio/common/lock.c | This should use the error message system, especially for MPICH2 |
| src/mpi/romio/adio/ad_ntfs/ad_ntfs_iwrite.c | Validate the args -- has it already been done by the caller ? |
| src/mpi/romio/adio/ad_ntfs/ad_ntfs_iwrite.c | Validate the args -- has it already been done by the caller ? |
| src/mpi/romio/adio/ad_ntfs/ad_ntfs_iwrite.c | Is the timeout in seconds ? |
| src/mpi/romio/adio/ad_ntfs/ad_ntfs_iwrite.c | Validate the args -- has it already been done by the caller ? |
| src/mpi/romio/adio/ad_ntfs/ad_ntfs_iwrite.c | Pass appropriate error code to user |
| src/mpi/romio/adio/ad_ntfs/ad_ntfs_iwrite.c | Pass appropriate error code to user |
| src/mpid/ch3/include/mpidimpl.h | We want to avoid even storing information about the builtins if we can |
| src/mpid/ch3/include/mpidimpl.h | XXX DJG for TLS hack |
| src/mpid/ch3/include/mpidimpl.h | This must be used within an MPIDCOMM critical section. |
| src/mpid/ch3/include/mpidimpl.h | Why does a send request need the match information? Is that for debugging information? In case the initial envelope cannot be sent? Ditto for the dev.user_buf, count, and datatype fields when the data is sent eagerly. The following fields needed to be set: datatype_ptr status.MPI_ERROR Note that this macro requires that rank, tag, context_offset, comm, buf, datatype, and count all be available with those names (they are not arguments to the routine) |
| src/mpid/ch3/include/mpidimpl.h | We've moved to allow finer-grain critical sections... |
| src/mpid/ch3/include/mpidimpl.h | Determine which of these functions should be exported to all of the MPICH routines and which are internal to the device implementation |
| src/mpid/ch3/include/mpidimpl.h | MPIDI_PG_Get_vc_set_active is a macro, not a routine |
| src/mpid/ch3/include/mpidimpl.h | This duplicates a value in util/sock/ch3usock.h |
| src/mpid/ch3/include/mpidimpl.h | Switch this to use the common debug code |
| src/mpid/ch3/include/mpidimpl.h | This does not belong here |
| src/mpid/ch3/include/mpidimpl.h | These should be defined only when these particular utility packages are used. Best would be to keep these prototypes in the related util/xxx directories, and either copy them into an include directory used only for builds or add (yet another) include path |
| src/mpid/ch3/include/mpidimpl.h | Where should this go? |
| src/mpid/ch3/include/mpidimpl.h | This is a macro! |
| src/mpid/ch3/include/mpidimpl.h | Move these prototypes into header files in the appropriate util directories |
| src/mpid/ch3/include/mpidimpl.h | Make these part of the channel support headers |
| src/mpid/ch3/include/mpidimpl.h | Currently forcing the PER_OBJECT case to do a global lock. We need a better way of fixing this. |
| src/mpid/ch3/include/mpidpost.h | mpidpost.h is included by mpiimpl.h . However, mpiimpl.h should refer only to the ADI3 prototypes and should never include prototypes specific to any particular device. Factor the include files to maintain better modularity by providing mpiimpl.h with only the definitions that it needs |
| src/mpid/ch3/include/mpidpre.h | This header should contain only the definitions exported to the mpiimpl.h level |
| src/mpid/ch3/include/mpidpre.h | Who defines this name |
| src/mpid/ch3/include/mpidpre.h | the precise meaning of this field is unclear, comments/docs about it should be added |
| src/mpid/ch3/include/mpidpre.h | This ifndef test is a temp until mpidpre is cleaned of all items that do not belong (e.g., all items not needed by the top layers of MPICH2) |
| src/mpid/ch3/include/mpidpre.h | The progress routines will be made into ch3-common definitions, not channel specific. Channels that need more will need to piggy back or otherwise override |
| src/mpid/ch3/include/mpidpkt.h | Having predefined names makes it harder to add new message types, such as different RMA types. |
| src/mpid/ch3/include/mpidpkt.h | no sync eager |
| src/mpid/ch3/include/mpidpkt.h | should be stream put |
| src/mpid/ch3/include/mpidpkt.h | Unused |
| src/mpid/ch3/include/mpidpkt.h | Experimental for now |
| src/mpid/ch3/util/sock/findinterfaces.c | THIS IS WRONG. INSTEAD, SPECIFY THE SPECIFIC FEATURE LEVEL NEEDED, AND THEN ONLY IF A CONFIGURE TEST SHOWS THAT IT IS REQUIRED THESE NEED TO BE SET FOR ALL COMPILATIONS TO AVOID HAVING DIFFERENT FILES COMPILED WITH DIFFERENT AND INCOMPATIBLE HEADER FILES. WHAT IS APPARENTLY NEEDED (SEE /usr/include/features.h ON A LINUX SYSTEM) IS EITHER _BSD_SOURCE OR _SVID_SOURCE; THIS MUST BE SET CONSISTENTLY FOR ALL FILES COMPILED AS PART OF MPICH2 |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | We need a little security here to avoid having a random port scan crash the process. Perhaps a "secret" value for each process could be published in the key-val space and subsequently sent in the open pkt. |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | Describe what these routines do |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | Clean up use of private packets (open/accept) |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | This size is unchanging, so get it only once (at most); we might prefer for connections to simply point at the single process group to which the remote process belong |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | Why does the name include "to_root"? |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | Describe the algorithm for the connection logic |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | where does this vc get freed? |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | There may need to be an additional routine here, to ensure that the channel is initialized for this pair of process groups (this process and the remote process to which the vc will connect). |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | To avoid this global (MPIDI_CH3I_sock_set) which is used only in ch3_progress.c and ch3_progress_connect.c in the channels, this should be a call into the channel, asking it to setup the socket for a connection and return the connection. That will keep the socket set out of the general ch3 code, even if this is the socket utility functions. |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | These are small routines; we may want to bring them together into a more specific post-connection-for-sock |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | This is a hack to allow Windows to continue to use the host description string instead of the interface address bytes when posting a socket connection. This should be fixed by changing the Sock_post_connect to only accept interface address. Note also that Windows does not have the inet_pton routine; the Windows version of this routine will need to be identified or written. See also channels/sock/ch3_progress.c and channels/ssm/ch3_progress_connect.c |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | We should start switching to getaddrinfo instead of gethostbyname |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | We don't make use of the ifname in Windows in order to provide backward compatibility with the (undocumented) host description string used by the socket connection routine MPIDU_Sock_post_connect. We need to change to an interface-address (already resolved) based description for better scalability and to eliminate reliance on fragile DNS services. Note that this is also more scalable, since the DNS server may serialize address requests. On most systems, asking for the host info of yourself is resolved locally (i.e., perfectly parallel). Regrettably, not all systems do this (e.g., some versions of FreeBSD). |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | What does the above comment mean? |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | separate out the accept and connect sides to make it easier to follow the logic |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | Is there an assumption about conn->state? |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | where does this vc get freed? |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | Possible ambiguous state (two ways to get to OPEN_LSEND) |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | is this the correct assert? |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | Should conn->vc be freed? Who allocated? Why not? |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | Should probably reduce ref count on conn->vc |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | What happens to the state of the associated VC? Why isn't it changed? Is there an assert here, such as conn->vc->conn != conn (there is another connection chosen for the vc)? |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | What does post close do here? |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | This should really be combined with handle_conn_event |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | This routine is called when? What is valid in conn? |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | the connect side of this sets conn->vc to NULL. Why is this different? The code that checks CONN_STATE_CLOSING uses conn == NULL to identify intentional close, which this appears to be. |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | What does this do? |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | This is a hack to allow Windows to continue to use the host description string instead of the interface address bytes when posting a socket connection. This should be fixed by changing the Sock_post_connect to only accept interface address. See also channels/ssm/ch3_progress_connect.c |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | What does this do? |
| src/mpid/ch3/util/sock/ch3u_connect_sock.c | This function also used in channels/sock/src/ch3_progress.c |
| src/mpid/ch3/util/sock/ch3u_init_sock.c | Why are these unused? |
| src/mpid/ch3/util/sock/ch3u_init_sock.c | Get the size from the process group |
| src/mpid/ch3/util/sock/ch3u_init_sock.c | This should probably be the same as MPIDI_VC_InitSock. If not, why not? |
| src/mpid/ch3/util/sock/ch3u_init_sock.c | Note that MPIDI_CH3_VC_Init sets state, sendq_head and tail. so this should be MPIDI_CH3_VC_Init( &pg_p->vct[p] ); followed by MPIDI_VC_InitSock( ditto ); In fact, there should be a single VC_Init call here |
| src/mpid/ch3/util/sock/ch3u_init_sock.c | Why isn't this MPIDI_VC_Init( vc, NULL, 0 )? |
| src/mpid/ch3/util/sock/ch3u_init_sock.c | This doesn't belong here, since the pg is not created in this routine |
| src/mpid/ch3/util/unordered/unordered.c | This should probably be *rreqp = NULL? |
| src/mpid/ch3/util/unordered/unordered.c | processing send cancel requests requires that we be aware of pkts in the reorder queue |
| src/mpid/ch3/util/shmbase/ch3_shm.c | |
| src/mpid/ch3/util/shmproc/shmproc.c | Do we need these routines for all shmem or only for some options? |
| src/mpid/ch3/util/shmproc/shmproc.c | Now what? Why not continue to read? |
| src/mpid/ch3/util/shm/ch3u_connect_sshm.c | Packet types used for establishing connections |
| src/mpid/ch3/util/shm/ch3u_connect_sshm.c | What does this routine do? |
| src/mpid/ch3/util/shm/ch3u_connect_sshm.c | It appears that this routine is used as part of the initial "get the peers connected" code in comm_connect/accept and spawn |
| src/mpid/ch3/util/shm/ch3u_connect_sshm.c | where does this vc get freed? |
| src/mpid/ch3/util/shm/ch3u_connect_sshm.c | Do we want to free the vc instead? Or put this into the fail block? |
| src/mpid/ch3/util/shm/ch3u_connect_sshm.c | Non conforming (not internationalizable) error message. why not just MPIU_ERR_POP? |
| src/mpid/ch3/util/shm/ch3u_init_sshm.c | This file contains too many ifdefs. A cleaner design is needed; examples may be found in MPICH1. For example, separate files for each type of shared memory and common routines implementing a shared memory abstraction |
| src/mpid/ch3/util/shm/ch3u_init_sshm.c | Why is this commented out? |
| src/mpid/ch3/util/shm/ch3u_init_sshm.c | This code probably reflects a bug caused by commenting out the above code |
| src/mpid/ch3/util/shm/ch3u_init_sshm.c | MPIU_g_nLockSpinCount is in a different module (mpid/common/locks) where it is used (and defined) only in some cases. Commenting out this bogus access for now; uncommenting it may cause link failures. Update - see FIXME in mpidu_process_locks - there are serious problem with how configuration information is passed to mpidu_process_locks.h , leading to *different* choices in the files in mpid/ch3 that include it (like this one) and in mpidu_process_locks.c (!). For that reason, I've left this statement in the code, but once mpidu_process_locks is fixed, this will need to be as well. |
| src/mpid/ch3/util/shm/ch3u_init_sshm.c | Is this initialization correct (it should have been done when the pg was created. If not, then the process by which channels and PG's are initialized needs to be documented and explained! An alternate approach to having this done after the PG is created is to define a method initialization routine that is called after the pg is created but before the final, channel-specific virtual connection initialization is done. |
| src/mpid/ch3/util/shm/ch3u_finalize_sshm.c | Should this be registered as a finalize handler? Should there be a corresponding abort handler? |
| src/mpid/ch3/util/shm/ch3i_shm_bootstrapq.c | What is this routine for? Why does this routine duplicate so much of the code in the MPIDI_CH3I_SHM_Get_mem function (but not exactly; e.g., is there a reason that this routine and the Get_mem version use slightly different arguments to the shm_open routine and take different action on failure? |
| src/mpid/ch3/util/shm/ch3i_bootstrapq.c | Are these needed here (these were defined within an unrelated ifdef in mpidimpl.h) |
| src/mpid/ch3/util/shm/ch3i_bootstrapq.c | These routines allocate memory but rarely free it. They need a finalize handler to remove the allocated memory. A clear description of the purpose of these routines and their scope of use would make that easier. |
| src/mpid/ch3/util/shm/ch3i_bootstrapq.c | These routines should *first* decide on the method of shared memory and queue support, then implement the Bootstrap interface in terms of those routines, if necessary using large (but single) ifdef regions rather than the constant ifdef/endif code used here. In addition, common service routines should be defined so that the overall code can simply |
| src/mpid/ch3/util/shm/ch3i_bootstrapq.c | These routines need a description and explanation. For example, where is the bootstrap q pointer exported? Is there more than one bootstrap queue? |
| src/mpid/ch3/util/shm/ch3i_bootstrapq.c | What does this routine do? |
| src/mpid/ch3/util/shm/ch3i_bootstrapq.c | This memory is not freed |
| src/mpid/ch3/channels/nemesis/src/ch3i_comm.c | need a better solution here. e.g., allocate some barrier_vars on the first barrier for the life of the communicator (as is the case now), others must be allocated for each barrier, then released. If we run out of barrier_vars after that, then use msg_barrier. |
| src/mpid/ch3/channels/nemesis/src/ch3_init.c | Circular dependency. Before calling MPIDI_CH3_Init, MPID_Init calls InitPG which calls MPIDI_PG_Create which calls MPIDI_CH3_VC_Init. But MPIDI_CH3_VC_Init needs nemesis to be initialized. We can't call MPIDI_CH3_Init before initializing the PG, because it needs the PG. There is a hook called MPIDI_CH3_PG_Init which is called from MPIDI_PG_Create before MPIDI_CH3_VC_Init, but we don't have the pg_rank in that function. Maybe this shouldn't really be a FIXME, since I believe that this issue will be moot once nemesis is a device. So what we do now, is do nothing if MPIDI_CH3_VC_Init is called before MPIDI_CH3_Init, and call MPIDI_CH3_VC_Init from inside MPIDI_CH3_Init after initializing nemesis |
| src/mpid/ch3/channels/nemesis/src/ch3_init.c | where does this vc get freed? ANSWER (goodell@) - ch3u_port.c FreeNewVC (but the VC_Destroy is in this file) |
| src/mpid/ch3/channels/nemesis/nemesis/include/mpid_nem_queue.h | We shouldn't really be using the MPID_Thread_mutex_* code but the MPIDU_Process_locks code is a total mess right now. In the long term we need to resolve this, but in the short run it should be safe on most (all?) platforms to use these instead. Usually they will both boil down to a pthread_mutex_t and and associated functions. |
| src/mpid/ch3/channels/nemesis/nemesis/include/mpid_nem_datatypes.h | We are using this as an interprocess lock in the queue code, although it's not strictly guaranteed to work for this scenario. These should really use the "process locks" code, but it's in such terrible shape that it doesn't really work for us right now. Also, because it has some inline assembly it's not really a fair comparison for studying the impact of atomic instructions. [goodell@ 2009-01-16] |
| src/mpid/ch3/channels/nemesis/nemesis/include/mpid_nem_post.h | Do not export these definitions all the way into mpiimpl.h (which will happen if this is included in mpidpost.h or mpidpre.h) |
| src/mpid/ch3/channels/nemesis/nemesis/include/mpid_nem_defs.h | This definition should be gotten from mpidi_ch3_impl.h |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/socksm.c | trace/log all the state transitions |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/socksm.c | need to handle error condition better |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/socksm.c | Z1 |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/socksm.c | Z1 |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/socksm.c | is it needed ? |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/socksm.c | Z1 |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/socksm.c | Z1 |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/socksm.c | Z1 |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/socksm.c | We need to set addr and port using bc. If a process is dynamically spawned, vc->pg is NULL. In that case, same procedure is done in MPID_nem_tcp_connect_to_root() |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/socksm.c | XXX DJG do we need to do anything here to ensure that the final close(TRUE) packet has made it into a writev call? The code might have a race for queued messages. |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/socksm.c | retry 'n' number of retries before signalling an error to VC layer. |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/tcp_impl.h | How do we find the max length of a bc? |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/tcp_getip.c | This should use the standard debug/logging routines and macros |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/tcp_getip.c | This more-or-less duplicates the code in ch3/util/sock/ch3u_getintefaces.c , which is already more thoroughly tested and more portable than the code here. |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/socksm.h | should plfd really be const const? Some handlers may need to change the plfd entry. |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/tcp/socksm.h | bc actually contains port_name info |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/wintcp_finalize.c | Move all externs to say socksm_globals.h |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/wintcp_finalize.c | Why don't we have a finalize for sm - MPID_nem_newtcp_module_finalize_sm() - ? |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/wintcp_finalize.c | Shouldn't the order of finalize() be the reverse order of init() ? i.e., *finalize_sm(); *poll_finalize(); *send_finalize(); |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/wintcp_poll.c | Get rid of these dummy funcs |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/wintcp_utility.c | At least on windows there is no guarantee that a successful socket call resets the socket error code We will assume that sock is connected for now and let the state machine handle errors |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Why should the listen sock conn be global ? |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Tune the skip poll values since we now have a fast executive (instead of select) |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | We won't need a common place holder for these handlers. Remove it eventually. |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | We need to check whether a non-linear growth of the sc_tbl will be better for performance for large jobs |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Use if-else and get rid of goto |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | The index could be invalid in the case of an error |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | need to handle error condition better |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Post a write instead of assuming a write would succeed Looks like we should be *usually* ok here since the conn is just established and socket send buffer > sizeof iov |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Post a write instead of assuming a write would succeed Looks like we should be *usually* ok here since the conn is just established and socket send buffer > sizeof iov |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | We should ideally do a shutdown() not close () |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | We should ideally do a shutdown() not close () |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Remove the assumption that we will be able to read the complete id/tmp_vc info in a single read... |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Just retry if we don't read the complete header |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | We might want to use sc->tmp_buf - after hdr for reading the pg_id |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Post a read if read fails - instead of aborting |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Z1 |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | is it needed ? |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Don't assume read will read the whole pg_id/tmpvc |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Z1 |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | We are assuming the command packets are only sent during connection phase - enough sock buffer for the calls to succeed |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Z1 |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | We need to set addr and port using bc. If a process is dynamically spawned, vc->pg is NULL. In that case, same procedure is done in MPID_nem_newtcp_module_connect_to_root() |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Post a close and let the quiescent handler take care of adding sc to freeq |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | How can the sc->fd be invalid here ? |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | XXX DJG do we need to do anything here to ensure that the final close(TRUE) packet has made it into a writev call? The code might have a race for queued messages. |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | WINTCP ASYNC MPIU_Assert(VC_FIELD(vc, sc_ref_count) == 0); |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | We are calling the cntd success handler explicitly here. Get rid of one of these states |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Danger add error string actual string : "**cannot send idinfo" |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Don't assume that the whole hdr is read |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Remove this restriction |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Add more states instead of setting handlers explicitly.. OR get rid of all states |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Is this reqd ? - there is a *accept_fail_handler |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Implement connect retry ... |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Is this needed ? On accept side the sc won't have a vc right now |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Rename the status type to look like a non-MACRO |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Re-introduce after we get rid of explicit calls to handlers MPIU_Assert(nb > 0); |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Get rid of found_better_sc() in the C_CNTD state. Always have the connection ACK/NACK by the listen side There are 2 cases here: 1) A connect() is in progress - Send a NACK - Let the other connection succeed 2) A connection already exists - Send a NACK |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | When is snd_nak true ? |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Create post thread here |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Kill post thread here |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | We also need to tune the number of times we need to wait for an event. eg: SOCK_PROGRESS_EVENTS_MAX = 2 x No_of_fds_in_ex_set |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Try to avoid calling the rd handler explicitly |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.c | Get rid of static allocn of listen sock and post a close |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/wintcp_init.c | Move all externs to say socksm_globals.h |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/wintcp_init.c | All MPI util funcs should return an MPI error code |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/wintcp_init.c | No error code returned ! |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/wintcp_init.c | WINTCP ASYNC MPID_nem_newtcp_module_g_lstn_sc.handler = MPID_nem_newtcp_module_state_listening_handler; |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/wintcp_init.c | Why happens on an error ? |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/wintcp_impl.h | WINTCP ASYNC int MPID_nem_newtcp_module_conn_wr_enable (struct MPIDI_VC *const vc); int MPID_nem_newtcp_module_conn_wr_disable (struct MPIDI_VC *const vc); |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/wintcp_impl.h | WINTCP ASYNC MPID_NEM_NEWTCP_MODULE_SOCK_STATUS_t MPID_nem_newtcp_module_check_sock_status(MPIU_SOCKW_Waitset_sock_hnd_t fd_ws_hnd); |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/wintcp_impl.h | WINTCP ASYNC int MPID_nem_newtcp_module_recv_handler (MPIU_SOCKW_Waitset_sock_hnd_t fd_ws_hnd, sockconn_t *sc); |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/wintcp_impl.h | WINTCP ASYNC int MPID_nem_newtcp_module_state_listening_handler(MPIU_SOCKW_Waitset_sock_hnd_t fd_ws_hnd, sockconn_t *const l_sc); |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.h | This structure can be moved to sock wrappers once MPIU_Sock_fd_t becomes more opaque OR we replace MPIU_Sock_fd_t with an opaque MPIU_Sock_t Also, look into the MSMPI code on how to organize this private struct members into different states, listen_state/accept_state/read_state/write_state, depending on the type of socket. |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.h | Check whether the elements of sockconn is aligned correctly |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.h | Remove the index. We no longer need it |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.h | CONTAINING_RECORD is only defined for windows - Use offsetof() to define CONTAINING_RECORD() on unix |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/wintcp/socksm.h | bc actually contains port_name info |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/mx/mx_poll.c | this might not work in the case of multiple netmods |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/mx/mx_poll.c | |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/mx/mx_poll.c | this function can't return an error because it's called from a recvq function that doesn't check for errors. For now, I'm replacing the chkandjump with an assertp. |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/mx/mx_poll.c | how can we report MPI_ERR_TRUNC in this case? |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/mx/mx_send.c | match_info is used uninitialized |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/mx/mx_init.c | create a real error string for this |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/mx/mx_init.c | create a real error string for this |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/mx/mx_cancel.c | The vc is only needed to find which function to call |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/mx/mx_cancel.c | this test is probably not correct with multiple netmods |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/newmad/newmad_poll.c | this might not work in the case of multiple netmods |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/newmad/newmad_probe.c | |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/newmad/newmad_probe.c | |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/gm/gm_init.c | create a real error string for this |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/gm/gm_init.c | create a real error string for this |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/gm/gm_init.c | this is only valid for processes in COMM_WORLD |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/elan/elan_init.c | create a real error string for this |
| src/mpid/ch3/channels/nemesis/nemesis/netmod/elan/elan_init.c | create a real error string for this |
| src/mpid/ch3/channels/nemesis/nemesis/src/mpid_nem_lmt_vmsplice.c | we should write our own function that isn't dependent on the in-request iov array. This will let us use IOVs that are larger than MPID_IOV_LIMIT. |
| src/mpid/ch3/channels/nemesis/nemesis/src/mpid_nem_lmt_dma.c | randomly chosen |
| src/mpid/ch3/channels/nemesis/nemesis/src/mpid_nem_lmt_dma.c | The knem module iovec is potentially different from the system iovec. This causes all sorts of fun if you don't realize it and use the system iovec directly instead. Eventually we need to either unify them or avoid this extra copy. |
| src/mpid/ch3/channels/nemesis/nemesis/src/mpid_nem_lmt_dma.c | The knem module iovec is potentially different from the system iovec. This causes all sorts of fun if you don't realize it and use the system iovec directly instead. Eventually we need to either unify them or avoid this extra copy. |
| src/mpid/ch3/channels/nemesis/nemesis/src/mpid_nem_lmt_dma.c | we should write our own function that isn't dependent on the in-request iov array. This will let us use IOVs that are larger than MPID_IOV_LIMIT. |
| src/mpid/ch3/channels/nemesis/nemesis/src/mpid_nem_lmt.c | where does this request get freed? |
| src/mpid/ch3/channels/nemesis/nemesis/src/mpid_nem_lmt.c | where does this request get freed? |
| src/mpid/ch3/channels/nemesis/nemesis/src/mpid_nem_finalize.c | I removed this PMI_Barrier in PMI2. The barrier seems not to be necessary, but I'm keeping this here until I'm convinced that this isn't here to handle some timing-related bug. DARIUS |
| src/mpid/ch3/channels/nemesis/nemesis/src/mpid_nem_init.c | DARIUS set these to default for now |
| src/mpid/ch3/channels/nemesis/nemesis/src/mpid_nem_init.c | DARIUS -- enable this assert once these functions are implemented |
| src/mpid/ch3/channels/nemesis/nemesis/src/mpid_nem_init.c | ch3 assumes there is a field called sendq_head in the ch portion of the vc. This is unused in nemesis and should be set to NULL |
| src/mpid/ch3/channels/nemesis/nemesis/src/mpid_nem_barrier.c | this is not a scalable algorithm because everyone is polling on the same cacheline |
| src/mpid/ch3/channels/nemesis/include/mpidi_ch3_pre.h | we should really use the copy of these values that is stored in the MPID_Comm structure |
| src/mpid/ch3/channels/nemesis/include/mpidi_ch3_impl.h | used for debugging |
| src/mpid/ch3/channels/nemesis/include/mpidi_ch3_impl.h | ch3 assumes there is a field called sendq_head in the ch portion of the vc. This is unused in nemesis and should be set to NULL |
| src/mpid/ch3/channels/shm/src/ch3_finalize.c | This should be used in abort as well |
| src/mpid/ch3/channels/shm/src/ch3_init.c | the vc's must be set to active for the close protocol to work in the shm channel |
| src/mpid/ch3/channels/shm/src/ch3_init.c | Should the malloc be within the init? |
| src/mpid/ch3/channels/shm/src/ch3_init.c | Need to free this request when the vc is removed |
| src/mpid/ch3/channels/shm/src/ch3_init.c | Should these also be set in the VC_Init, or is VC_Init moot (never called because this channel does not support dynamic processes?) |
| src/mpid/ch3/channels/shm/src/ch3_init.c | Figure out whether this is a common feature of process groups (and thus make it part of the general PG_Init) or something else. Avoid a "get" routine because of the danger in using "get" where "dup" is required. |
| src/mpid/ch3/channels/shm/src/ch3_init.c | Move this code to the general init pg for shared memory |
| src/mpid/ch3/channels/shm/src/ch3_init.c | This should call a routine from the ch3/util/shm directory to initialize the use of shared memory for processes WITHIN this process group |
| src/mpid/ch3/channels/shm/src/ch3_isendv.c | the current code only agressively writes the first IOV. Eventually it should be changed to agressively write as much as possible. Ideally, the code would be shared between the send routines and the progress engine. |
| src/mpid/ch3/channels/shm/include/mpidi_ch3_impl.h | We need to ensure read/write barriers for correctness, if the processor might possibly reorder |
| src/mpid/ch3/channels/shm/include/mpidi_ch3_impl.h | The ssm channel uses these names instead. We'll switch to them eventually (since there is a great deal of commonality between the ssm and shm code, and that commonality should be reflected in common support routines) |
| src/mpid/ch3/channels/dllchan/include/mpidi_ch3_impl.h | Old code from sock below here |
| src/mpid/ch3/channels/dllchan/include/mpidi_ch3_impl.h | Any of these used in the ch3->channel interface should be defined in a header file in ch3/include that defines the channel interface |
| src/mpid/ch3/channels/dllchan/include/mpidi_ch3_pre.h | These should be removed |
| src/mpid/ch3/channels/dllchan/include/mpidi_ch3_pre.h | Are the following packet extensions? Can the socket connect/accept packets be made part of the util/sock support? |
| src/mpid/ch3/channels/dllchan/include/mpidi_ch3_pre.h | We need a little security here to avoid having a random port scan crash the process. Perhaps a "secret" value for each process could be published in the key-val space and subsequently sent in the open pkt. |
| src/mpid/ch3/channels/dllchan/include/mpidi_ch3_pre.h | Explain these; why is this separate from the VC state? |
| src/mpid/ch3/channels/dllchan/include/mpidi_ch3_post.h | We need to document all of these parameters. There should be no ifdef that is not documented, other than those set by configure |
| src/mpid/ch3/channels/dllchan/src/ch3impl.c | As a workaround for the use of PG_Destroy *after* CH3_Finalize is called, this entry point is set to something benign, and the VC_Destroy that PG_Destroy may invoke, is also set to ignore. |
| src/mpid/ch3/channels/ssm/src/ch3_progress_sock.c | This is nowhere set to true. The name is non-conforming if it is not static |
| src/mpid/ch3/channels/ssm/src/ch3_progress_sock.c | the following should be handled by the close protocol |
| src/mpid/ch3/channels/ssm/src/ch3_progress_sock.c | where does this vc get freed? |
| src/mpid/ch3/channels/ssm/src/ch3_progress_connect.c | This is a hack to allow Windows to continue to use the host description string instead of the interface address bytes when posting a socket connection. This should be fixed by changing the Sock_post_connect to only accept interface address. See also channels/sock/ch3_progress.c |
| src/mpid/ch3/channels/ssm/src/ch3_init.c | These VC debug print routines belong in mpid_vc.c, with channel-specific code in an appropriate file. Much of that channel-specific code might belong in ch3/util/sock or ch3/util/shm |
| src/mpid/ch3/channels/ssm/src/ch3_init.c | This doesn't allocate and only inits one field. Is this now part of the channel-specific hook for channel-specific VC info? |
| src/mpid/ch3/channels/ssm/src/ch3_init.c | This should call a routine from the ch3/util/shm directory to initialize the use of shared memory for processes WITHIN this process group |
| src/mpid/ch3/channels/ssm/src/ch3_progress.c | It looks like this is a relic of code to occassional call sleep to implement a yield. As no code defined the use sleep yield macro, that code was removed, exposing this odd construction. |
| src/mpid/ch3/channels/ssm/src/ch3_progress.c | copied from sock # if (USE_THREAD_IMPL == MPICH_THREAD_IMPL_GLOBAL_MUTEX) { MPID_Thread_cond_create(&MPIDI_CH3I_progress_completion_cond, NULL); } # endif |
| src/mpid/ch3/channels/ssm/src/ch3_progress.c | Cleanly shutdown other socks and MPIU_Free connection structures. (close protocol?) |
| src/mpid/ch3/channels/ssm/src/ch3_progress.c | copied from sock # if (USE_THREAD_IMPL == MPICH_THREAD_IMPL_GLOBAL_MUTEX) { MPID_Thread_cond_destroy(&MPIDI_CH3I_progress_completion_cond, NULL); } # endif |
| src/mpid/ch3/channels/ssm/src/ch3_progress.c | What is this routine for? |
| src/mpid/ch3/channels/ssm/src/ch3_progress.c | This is a temp to free memory allocated in this file |
| src/mpid/ch3/channels/ssm/src/ch3_isendv.c | the current code only agressively writes the first IOV. Eventually it should be changed to agressively write as much as possible. Ideally, the code would be shared between the send routines and the progress engine. |
| src/mpid/ch3/channels/ssm/src/ch3_isend.c | Integrate this with ch3_isendv.c so that there are fewer files, and so that common code is in one place and not duplicated (or almost duplicated) across files |
| src/mpid/ch3/channels/ssm/src/ch3_isend.c | For fault tolerance, retry the operation before marking the connection as failed |
| src/mpid/ch3/channels/ssm/include/mpidi_ch3_impl.h | We should only include the ones of these that we use within this channel |
| src/mpid/ch3/channels/ssm/include/mpidi_ch3_impl.h | Note the use of the USE_SYSV_MQ to only include this header includes when needed. The other headers should be similarly organized |
| src/mpid/ch3/channels/ssm/include/mpidi_ch3_impl.h | We need to move these queue routines out of the header files to allow dynamic loading/selection of channels, and to provide better modularity of the ch3 code |
| src/mpid/ch3/channels/ssm/include/mpidi_ch3_impl.h | Why 100? |
| src/mpid/ch3/channels/ssm/include/mpidi_ch3_impl.h | These should be known only by the code that is managing the business cards |
| src/mpid/ch3/channels/ssm/include/mpidi_ch3_post.h | These sizes need to be documented at least, and it should be easier to change them, at least at configure/build time |
| src/mpid/ch3/channels/ssm/include/mpidi_ch3_post.h | Are these the same definitions used by all channels? If so, they should be defined in the same place (and it would make sense for the progress routines in the ch3 device to have the same interface independent of channel) |
| src/mpid/ch3/channels/ssm/include/ch3i_progress.h | We need an Undefined for the tick type |
| src/mpid/ch3/channels/ssm/include/ch3i_progress.h | These common definitions (accessing the timer) should be in a single file |
| src/mpid/ch3/channels/ssm/include/ch3i_progress.h | What are these used for (possibly one of the spinwait variations?) Document the use |
| src/mpid/ch3/channels/sock/src/ch3_isendv.c | the current code only agressively writes the first IOV. Eventually it should be changed to agressively write as much as possible. Ideally, the code would be shared between the send routines and the progress engine. |
| src/mpid/ch3/channels/sock/src/ch3_isendv.c | Shouldn't the vc->state also change? |
| src/mpid/ch3/channels/sock/src/ch3_isend.c | Shouldn't the vc->state also change? |
| src/mpid/ch3/channels/sock/src/ch3_progress.c | Move thread stuff into some set of abstractions in order to remove ifdefs |
| src/mpid/ch3/channels/sock/src/ch3_progress.c | Was (USE_THREAD_IMPL == MPICH_THREAD_IMPL_NOT_IMPLEMENTED), which really meant not-using-global-mutex-thread model . This was true for the single threaded case, but was probably not intended for that case |
| src/mpid/ch3/channels/sock/src/ch3_progress.c | Cleanly shutdown other socks and free connection structures. (close protocol?) |
| src/mpid/ch3/channels/sock/src/ch3_progress.c | the following should be handled by the close protocol |
| src/mpid/ch3/channels/sock/src/ch3_progress.c | (a) what does this do and where is it used and (b) we could replace it with a #define for the single-method case |
| src/mpid/ch3/channels/sock/src/ch3_progress.c | This function also used in ch3u_connect_sock.c |
| src/mpid/ch3/channels/sock/src/ch3_progress.c | Is dequeue/get next the operation we really want? |
| src/mpid/ch3/channels/sock/src/ch3_progress.c | What is this routine for? |
| src/mpid/ch3/channels/sock/include/mpidi_ch3_pre.h | These should be removed |
| src/mpid/ch3/channels/sock/include/mpidi_ch3_pre.h | Are the following packet extensions? Can the socket connect/accept packets be made part of the util/sock support? |
| src/mpid/ch3/channels/sock/include/mpidi_ch3_pre.h | We need a little security here to avoid having a random port scan crash the process. Perhaps a "secret" value for each process could be published in the key-val space and subsequently sent in the open pkt. |
| src/mpid/ch3/channels/sock/include/mpidi_ch3_pre.h | Explain these; why is this separate from the VC state? |
| src/mpid/ch3/channels/sock/include/mpidi_ch3_post.h | We need to document all of these parameters. There should be no ifdef that is not documented, other than those set by configure |
| src/mpid/ch3/channels/sock/include/mpidi_ch3_impl.h | Any of these used in the ch3->channel interface should be defined in a header file in ch3/include that defines the channel interface |
| src/mpid/ch3/channels/sctp/src/ch3_progress.c | is it possible for tmp_vc to be non-NULL and pkt to be an accept pkt? |
| src/mpid/ch3/channels/sctp/src/ch3_progress.c | what if not a full pkt (!MSG_EOR)? |
| src/mpid/ch3/channels/sctp/src/ch3_progress.c | why would the vc ref_count be zero w/o being destroyed yet? A check against zero like this is usually not thread-safe... |
| src/mpid/ch3/channels/sctp/src/ch3_progress.c | done elsewhere so add this to a function |
| src/mpid/ch3/channels/sctp/src/ch3_progress.c | define error code |
| src/mpid/ch3/channels/sctp/src/ch3_progress.c | TODO : change error codes to have SCTP errors |
| src/mpid/ch3/channels/sctp/src/ch3_progress.c | atoi doesn't detect errors (e.g., non-digits) |
| src/mpid/ch3/channels/sctp/src/ch3_progress.c | atoi doesn't detect errors (e.g., non-digits) |
| src/mpid/ch3/channels/sctp/src/ch3_progress.c | might need to be more sophisticated with multiple fd's |
| src/mpid/ch3/channels/sctp/src/sctp_common.c | assumes success |
| src/mpid/ch3/channels/sctp/src/sctp_util.c | Is this documented? How does it work if the process manager cannot give each process a different value for an environment name? What if a different interface is needed? |
| src/mpid/ch3/channels/sctp/src/sctp_util.c | Try to get the environment variable that uses the rank in comm world, i.e., MPICH_INTERFACE_HOSTNAME_R_%d. For this, we'll need to know the rank for this process. |
| src/mpid/ch3/channels/sctp/src/sctp_util.c | We should start switching to getaddrinfo instead of gethostbyname |
| src/mpid/ch3/channels/sctp/src/sctp_util.c | We don't make use of the ifname in Windows in order to provide backward compatibility with the (undocumented) host description string used by the socket connection routine MPIDU_Sock_post_connect. We need to change to an interface-address (already resolved) based description for better scalability and to eliminate reliance on fragile DNS services. Note that this is also more scalable, since the DNS server may serialize address requests. On most systems, asking for the host info of yourself is resolved locally (i.e., perfectly parallel). |
| src/mpid/ch3/channels/sctp/src/sctp_util.c | This is a hack to allow Windows to continue to use the host description string instead of the interface address bytes when posting a socket connection. This should be fixed by changing the Sock_post_connect to only accept interface address. Note also that Windows does not have the inet_pton routine; the Windows version of this routine will need to be identified or written. See also channels/sock/ch3_progress.c and channels/ssm/ch3_progress_connect.c |
| src/mpid/ch3/channels/sctp/src/ch3_init.c | define error code |
| src/mpid/ch3/channels/sctp/src/ch3_init.c | define error code |
| src/mpid/ch3/channels/sctp/src/ch3_isendv.c | check specific sendQ and send_active |
| src/mpid/ch3/channels/sctp/src/ch3_isendv.c | the current code only agressively writes the first IOV. Eventually it should be changed to agressively write as much as possible. Ideally, the code would be shared between the send routines and the progress engine. |
| src/mpid/ch3/channels/sctp/src/ch3_isendv.c | enqueue x |
| src/mpid/ch3/channels/sctp/src/ch3_istartmsg.c | change error code |
| src/mpid/ch3/channels/sctp/include/mpidi_ch3_pre.h | These should be removed |
| src/mpid/ch3/channels/sctp/include/mpidi_ch3_pre.h | Are the following packet extensions? Can the socket connect/accept packets be made part of the util/sock support? |
| src/mpid/ch3/channels/sctp/include/mpidi_ch3_pre.h | We need a little security here to avoid having a random port scan crash the process. Perhaps a "secret" value for each process could be published in the key-val space and subsequently sent in the open pkt. |
| src/mpid/ch3/channels/sctp/include/mpidi_ch3_pre.h | this is called in mpid_vc.c |
| src/mpid/ch3/channels/sctp/include/mpidi_ch3_impl.h | Any of these used in the ch3->channel interface should be defined in a header file in ch3/include that defines the channel interface |
| src/mpid/ch3/channels/sctp/include/mpidi_ch3_post.h | We need to document all of these parameters. There should be no ifdef that is not documented, other than those set by configure |
| src/mpid/ch3/src/mpidi_pg.c | These routines need a description. What is their purpose? Who calls them and why? What does each one do? |
| src/mpid/ch3/src/mpidi_pg.c | straighten out the use of PMI_Finalize - no use after PG_Finalize |
| src/mpid/ch3/src/mpidi_pg.c | This routine needs to make it clear that the pg_id, for example is saved; thus, if the pg_id is a string, then that string is not copied and must be freed by a PG_Destroy routine |
| src/mpid/ch3/src/mpidi_pg.c | it would be good if we could make this assertion. Unfortunately, either: 1) We're not being disciplined and some caller of this function doesn't bother to manage all the refcounts because he thinks he knows better. Annoying, but not strictly a bug. (wdg - actually, that is a bug - managing the ref counts IS required and missing one is a bug.) 2) There is a real bug lurking out there somewhere and we just haven't hit it in the tests yet. |
| src/mpid/ch3/src/mpidi_pg.c | What does DEV_IMPLEMENTS_KVS mean? Why is it used? Who uses PG_To_string and why? |
| src/mpid/ch3/src/mpidi_pg.c | This is a temporary hack for devices that do not define MPIDI_DEV_IMPLEMENTS_KVS FIXME: MPIDI_DEV_IMPLEMENTS_KVS should be removed |
| src/mpid/ch3/src/mpidi_pg.c | This is a hack to avoid including shared-memory queue names in the business card that may be used by processes that were not part of the same COMM_WORLD. To fix this, the shared memory channels should look at the returned connection info and decide whether to use sockets or shared memory by determining whether the process is in the same MPI_COMM_WORLD. |
| src/mpid/ch3/src/mpidi_pg.c | The more general problem is that the connection information needs to include some information on the range of validity (e.g., all processes, same comm world, particular ranks), and that representation needs to be scalable |
| src/mpid/ch3/src/mpidi_pg.c | this should be a failure to call this routine |
| src/mpid/ch3/src/mpidi_pg.c | Turn this into a valid error code create/return |
| src/mpid/ch3/src/mpidi_pg.c | This is a hack, and it doesn't even work |
| src/mpid/ch3/src/mpidi_pg.c | MT: This should be a fetch and increment for thread-safety |
| src/mpid/ch3/src/mpidi_pg.c | This routine should invoke a close method on the connection, rather than have all of the code here |
| src/mpid/ch3/src/mpidi_pg.c | Remove this IFDEF |
| src/mpid/ch3/src/mpidi_pg.c | If the reference count for the vc is not 0, something is wrong |
| src/mpid/ch3/src/mpid_cancel_send.c | This should call a channel-provided routine to deliver the cancel message, once the code decides that the request can still be cancelled |
| src/mpid/ch3/src/mpid_cancel_send.c | if send cancellation packets are allowed to arrive out-of-order with respect to send packets, then we need to timestamp send and cancel packets to insure that a cancellation request does not bypass the send packet to be cancelled and erroneously cancel a previously sent message with the same request handle. |
| src/mpid/ch3/src/mpid_cancel_send.c | A timestamp is more than is necessary; a message sequence number should be adequate. |
| src/mpid/ch3/src/mpid_cancel_send.c | Note that this routine is called from within the packet handler. If the message queue mutex is different from the progress mutex, this must be protected within a message-queue mutex |
| src/mpid/ch3/src/mpid_cancel_send.c | This is called within the packet handler |
| src/mpid/ch3/src/ch3u_handle_send_req.c | Should this test be an MPIU_Assert? |
| src/mpid/ch3/src/ch3u_handle_send_req.c | MT: this has to be done atomically |
| src/mpid/ch3/src/ch3u_rma_sync.c | MT: this needs to be done atomically because other procs have the address and could decrement it. |
| src/mpid/ch3/src/ch3u_rma_sync.c | We can't do this because fence_cnt must be updated collectively |
| src/mpid/ch3/src/ch3u_rma_sync.c | We can't do this because fence_cnt must be updated collectively |
| src/mpid/ch3/src/ch3u_rma_sync.c | MT: this has to be done atomically |
| src/mpid/ch3/src/ch3u_rma_sync.c | We can't do this because fence_cnt must be updated collectively |
| src/mpid/ch3/src/ch3u_rma_sync.c | MT: this has to be done atomically |
| src/mpid/ch3/src/ch3u_rma_sync.c | Only change the handling of completion if post_data_receive reset the handler. There should be a cleaner way to do this |
| src/mpid/ch3/src/ch3u_rma_sync.c | Only change the handling of completion if post_data_receive reset the handler. There should be a cleaner way to do this |
| src/mpid/ch3/src/ch3u_rma_sync.c | MT: This may need to be done atomically. |
| src/mpid/ch3/src/ch3u_rma_sync.c | MT: The queuing may need to be done atomically. |
| src/mpid/ch3/src/ch3u_rma_sync.c | MT: This may need to be done atomically. |
| src/mpid/ch3/src/ch3u_rma_sync.c | MT: The queuing may need to be done atomically. |
| src/mpid/ch3/src/ch3u_rma_sync.c | Only change the handling of completion if post_data_receive reset the handler. There should be a cleaner way to do this |
| src/mpid/ch3/src/ch3u_rma_sync.c | It is likely that this cannot happen (never perform a get with a 0-sized item). In that case, change this to an MPIU_Assert (and do the same for accumulate and put) |
| src/mpid/ch3/src/mpid_getpname.c | Make thread safe |
| src/mpid/ch3/src/ch3u_port.c | If dynamic processes are not supported, this file will contain no code and some compilers may warn about an "empty translation unit" |
| src/mpid/ch3/src/ch3u_port.c | pg_translation is used for ? |
| src/mpid/ch3/src/ch3u_port.c | Describe the algorithm used here, and what routine is user on the other side of this connection |
| src/mpid/ch3/src/ch3u_port.c | This code is still broken for the following case: If the same process opens connections to the multiple processes, this context ID might get out of sync. |
| src/mpid/ch3/src/ch3u_port.c | we probably need a unique context_id. |
| src/mpid/ch3/src/ch3u_port.c | Explain why |
| src/mpid/ch3/src/ch3u_port.c | Why do we do a dup here? |
| src/mpid/ch3/src/ch3u_port.c | Who sets? Why? Where is this defined? Document. Why is this not done as part of the VC initialization? |
| src/mpid/ch3/src/ch3u_port.c | A hook should not be such a specific function; instead, it should invoke a function pointer defined in the channel interface structure |
| src/mpid/ch3/src/ch3u_port.c | If we fail to connect, someone needs to free this newcomm |
| src/mpid/ch3/src/ch3u_port.c | why is this commented out? |
| src/mpid/ch3/src/ch3u_port.c | If this is really needed, make it a destroy callback on the process group rather than an SSHM-specific item |
| src/mpid/ch3/src/ch3u_port.c | Need to understand this and either remove or make common to all channels |
| src/mpid/ch3/src/ch3u_port.c | Error, the pg_list is broken |
| src/mpid/ch3/src/ch3u_port.c | We should use the PG destroy function for this, and ensure that the PG fields are valid for that function |
| src/mpid/ch3/src/ch3u_port.c | why is this commented out? |
| src/mpid/ch3/src/ch3u_port.c | How much of this could/should be common with the upper level (src/mpi/comm/ *.c) code? For best robustness, this should use the same routine (not copy/paste code) as in the upper level code. |
| src/mpid/ch3/src/ch3u_port.c | The free and the create routines should be in the same file |
| src/mpid/ch3/src/ch3u_port.c | remove this ifdef - method on connection? |
| src/mpid/ch3/src/ch3u_port.c | should this be an MPIU_CALL macro? |
| src/mpid/ch3/src/ch3u_port.c | What is an Accept queue and who uses it? Is this part of the connect/accept support? These routines appear to be called by channel progress routines; perhaps this belongs in util/sock (note the use of a port_name_tag in the dequeue code, though this could be any string). Are the locks required? If this is only called within the progress engine, then the progress engine locks should be sufficient. If a finer grain lock model is used, it needs to be very carefully designed and documented. |
| src/mpid/ch3/src/ch3u_port.c | Use CHKPMEM |
| src/mpid/ch3/src/ch3u_port.c | Stack or queue? |
| src/mpid/ch3/src/mpid_isend.c | HOMOGENEOUS SYSTEMS ONLY -- no data conversion is performed |
| src/mpid/ch3/src/mpid_isend.c | The routines MPID_Isend, MPID_Issend, MPID_Irsend are nearly identical. It would be better if these did roughly: MPID_Irsend -> always eager send (with ready mode for error detection) MPID_Issend -> always rendezvous send MPID_Isend -> chose eager/rendezvous based on a threshold (and consider making the threshold configurable at either compile time (for best low-latency performance) or run-time (for application tuning). Then the 3 routines share little code, particularly if the eager/rendezvous implementations are in their own routine |
| src/mpid/ch3/src/mpid_isend.c | flow control: limit number of outstanding eager messsages containing data and need to be buffered by the receiver |
| src/mpid/ch3/src/mpid_isend.c | fill temporary IOV or pack temporary buffer after send to hide some latency. This requires synchronization because the CTS packet could arrive and be processed before the above iStartmsg completes (depending on the progress engine, threads, etc.). |
| src/mpid/ch3/src/ch3u_handle_connection.c | What is this routine for? It appears to be used only in ch3_progress, ch3_progress_connect, or ch3_progress_sock files. Is this a general operation, or does it belong in util/sock ? It appears to be used in multiple channels, but probably belongs in mpid_vc, along with the vc exit code that is currently in MPID_Finalize |
| src/mpid/ch3/src/ch3u_handle_connection.c | The only event is event_terminated. Should this have a different name/expected function? |
| src/mpid/ch3/src/ch3u_handle_connection.c | Decrement the reference count? Who increments? |
| src/mpid/ch3/src/ch3u_handle_connection.c | The reference count is often already 0. But not always |
| src/mpid/ch3/src/ch3u_handle_connection.c | Who increments the reference count that this is decrementing? |
| src/mpid/ch3/src/ch3u_handle_connection.c | This should be done when the reference count of the vc is first decremented |
| src/mpid/ch3/src/ch3u_handle_connection.c | Remove this IFDEF |
| src/mpid/ch3/src/ch3u_handle_connection.c | This situation has been seen with the ssm device, when running on a single process. It may indicate that the close protocol, for shared memory connections, can occasionally reach this state when both sides start closing the connection. We will act as if this is duplicate information can be ignored (rather than triggering the Assert in the next case) |
| src/mpid/ch3/src/ch3u_handle_connection.c | Debugging |
| src/mpid/ch3/src/mpid_rsend.c | HOMOGENEOUS SYSTEMS ONLY -- no data conversion is performed |
| src/mpid/ch3/src/mpid_rsend.c | How does this differ from eager send? It should differ in only a few bits (e.g., indicate that the send is ready and should fail if there is no matching receive) |
| src/mpid/ch3/src/mpid_startall.c | This bsend header shouldn't be needed (the function prototype should be in mpiimpl.h), to allow a devices MPID_Startall to use the MPIR_Bsend_isend function |
| src/mpid/ch3/src/mpid_startall.c | Consider using function pointers for invoking persistent requests; if we made those part of the public request structure, the top-level routine could implement startall unless the device wanted to study the requests and reorder them |
| src/mpid/ch3/src/mpid_startall.c | Where is the memory registration call in the init routines, in case the channel wishes to take special action (such as pinning for DMA) the memory? This was part of the design. |
| src/mpid/ch3/src/mpid_startall.c | The odd 7th arg (match.context_id - comm->context_id) is probably to get the context offset. Do we really need the context offset? Is there any case where the offset isn't zero? |
| src/mpid/ch3/src/mpid_startall.c | Move the routines that initialize the persistent requests into this file, since startall must be used with all of them |
| src/mpid/ch3/src/mpid_comm_disconnect.c | How can we get to a state where we are still connecting a VC but the MPIR_Comm_release will find that the ref count decrements to zero (it may be that some operation fails to increase/decrease the reference count. A patch could be to increment the reference count while connecting, then decrement it. But the increment in the reference count should come from the step that caused the connection steps to be initiated. Possibility: if the send queue is not empty, the ref count should be higher. |
| src/mpid/ch3/src/mpid_comm_disconnect.c | This doesn't work yet |
| src/mpid/ch3/src/mpid_comm_disconnect.c | Describe what more might be required |
| src/mpid/ch3/src/ch3u_handle_recv_pkt.c | We can turn this into something like MPIU_Assert(pkt->type <= MAX_PACKET_TYPE); mpi_errno = MPIDI_CH3_ProgressFunctions[pkt->type](vc,pkt,rreqp); in the progress engine itself. Then this routine is not necessary. |
| src/mpid/ch3/src/ch3u_handle_recv_pkt.c | We want to set the OnDataAvail to the appropriate function, which depends on whether this is an RMA request or a pt-to-pt request. |
| src/mpid/ch3/src/ch3u_handle_recv_pkt.c | Set OnDataAvail to 0? If not, why not? |
| src/mpid/ch3/src/ch3u_handle_recv_pkt.c | to improve performance, allocate temporary buffer from a specialized buffer pool. |
| src/mpid/ch3/src/ch3u_handle_recv_pkt.c | to avoid memory exhaustion, integrate buffer pool management with flow control |
| src/mpid/ch3/src/ch3u_handle_recv_pkt.c | We want to set the OnDataAvail to the appropriate function, which depends on whether this is an RMA request or a pt-to-pt request. |
| src/mpid/ch3/src/ch3u_handle_recv_pkt.c | to improve performance, allocate temporary buffer from a specialized buffer pool. |
| src/mpid/ch3/src/ch3u_handle_recv_pkt.c | to avoid memory exhaustion, integrate buffer pool management with flow control |
| src/mpid/ch3/src/ch3u_handle_recv_pkt.c | we still need to implement flow control. As a reminder, we don't mark these parameters as unused, because a full implementation of this routine will need to make use of all 4 parameters |
| src/mpid/ch3/src/ch3u_handle_recv_pkt.c | This should be initialized by a separate, RMA routine. That would allow different RMA implementations. We could even do lazy initialization (make this part of win_create) |
| src/mpid/ch3/src/ch3u_eager.c | Why do we set the message type? |
| src/mpid/ch3/src/ch3u_eager.c | We want to set the OnDataAvail to the appropriate function, which depends on whether this is an RMA request or a pt-to-pt request. |
| src/mpid/ch3/src/ch3u_eager.c | The MPICH2 tests do not exercise this branch |
| src/mpid/ch3/src/ch3u_eager.c | When eagershort is enabled, provide a preallocated space for short messages (which is used even if eager short is not used), since we don't want to have a separate check to figure out which buffer we're using (or perhaps we should have a free-buffer-pointer, which can be null if it isn't a buffer that we've allocated). |
| src/mpid/ch3/src/ch3u_eager.c | This is not optimized for short messages, which should have the data in the same packet when the data is particularly short (e.g., one 8 byte long word) |
| src/mpid/ch3/src/ch3u_eager.c | an error packet should be sent back to the sender indicating that the ready-send failed. On the send side, the error handler for the communicator can be invoked even if the ready-send request has already completed. |
| src/mpid/ch3/src/mpid_finalize.c | This routine needs to be factored into finalize actions per module, In addition, we should consider registering callbacks for those actions rather than direct routine calls. |
| src/mpid/ch3/src/mpid_finalize.c | The correct action here is to begin a shutdown protocol that lets other processes know that this process is now in finalize. Note that only requests that have been freed with MPI_Request_free are valid at this point; other pending receives can be ignored since a valid program should wait or test for them before entering finalize. The easist fix is to allow an MPI_Barrier over comm_world (and any connected processes in the MPI-2 case). Once the barrier completes, all processes are in finalize and any remaining unmatched receives will never be matched (by a correct program; a program with a send in a separate thread that continues after some thread calls MPI_Finalize is erroneous). Avoiding the barrier is hard. Consider this sequence of steps: Send in-finalize message to all connected processes. Include information on whether there are pending receives. (Note that a posted receive with any source is a problem) (If there are many connections, then this may take longer than the barrier) Allow connection requests from anyone who has not previously connected only if there is an possible outstanding receive; reject others with a failure (causing the source process to fail). Respond to an in-finalize message with the number of posted receives remaining. If both processes have no remaining receives, they can both close the connection. Processes with no pending receives and no connections can exit, calling PMI_Finalize to let the process manager know that they are in a controlled exit. Processes that still have open connections must then try to contact the remaining processes. |
| src/mpid/ch3/src/mpid_finalize.c | Using MPI_Barrier on MPI_COMM_WORLD here is dangerous. It is fine, of course, for correct programs, but incorrect programs, for examples, ones that call MPI_Barrier(MPI_COMM_WORLD) in some but not all processes will show unexpected symptoms (e.g., the otherwise Unmatched MPI_Barrier calls will match this barrier, and then MPI_Finalize will hang. To fix this, we need a separate Barrier operation, either an independent Barrier or an independent communicator that is not used by any other (user) routine. |
| src/mpid/ch3/src/mpid_finalize.c | The close actions should use the same code as the other connection close code |
| src/mpid/ch3/src/mpidi_printf.c | What are these routines for? Who uses them? Why are they different from the src/util/dbg routines? |
| src/mpid/ch3/src/mpidi_printf.c | This "unreferenced_arg" is an example of a problem with the API (unneeded level argument) or the code (failure to check the level argument). Inserting these "unreference_arg" macros erroneously suggests that the code is correct with this ununsed argument, and thus commits the grave harm of obscuring a real problem |
| src/mpid/ch3/src/mpidi_printf.c | It would be much better if the routine to print packets used routines defined by packet type, making it easier to modify the handling of packet types (and allowing channels to customize the printing of packets). For example, an array of function pointers, indexed by packet type, could be used. Also, these routines should not use MPIU_DBG_PRINTF, instead they should us a simple fprintf with a style allowance (so that the style checker won't flag the use as a possible problem). This should switch to using a table of functions MPIDI_PktPrintFunctions[pkt->type](stdout,pkt); |
| src/mpid/ch3/src/mpidi_printf.c | Move these RMA descriptions into the RMA code files |
| src/mpid/ch3/src/mpid_ssend.c | HOMOGENEOUS SYSTEMS ONLY -- no data conversion is performed |
| src/mpid/ch3/src/mpid_abort.c | Who uses/sets MPIDI_DEV_IMPLEMENTS_ABORT? |
| src/mpid/ch3/src/mpid_abort.c | We should move this into a header file so that we don't need the ifdef. Also, don't use exit (add to coding check) since not safe in windows. To avoid confusion, define a RobustExit? or MPIU_Exit? |
| src/mpid/ch3/src/mpid_abort.c | This routine *or* MPI_Abort should provide abort callbacks, similar to the support in MPI_Finalize |
| src/mpid/ch3/src/mpid_abort.c | Do we want the rank of the input communicator here or the rank of comm world? The message gives the rank but not the communicator, so using other than the rank in comm world does not identify the process, as the message suggests |
| src/mpid/ch3/src/mpid_abort.c | Not internationalized |
| src/mpid/ch3/src/mpid_abort.c | Not internationalized |
| src/mpid/ch3/src/mpid_abort.c | This should not use an ifelse chain. Either define the function by name or set a function pointer |
| src/mpid/ch3/src/mpid_abort.c | What is the scope for PMI_Abort? Shouldn't it be one or more process groups? Shouldn't abort of a communicator abort either the process groups of the communicator or only the current process? Should PMI_Abort have a parameter for which of these two cases to perform? |
| src/mpid/ch3/src/mpid_irecv.c | if the user buffer is contiguous, just move the data without using a separate routine call |
| src/mpid/ch3/src/mpid_init.c | This does not belong here |
| src/mpid/ch3/src/mpid_init.c | the PMI init function should ONLY do the PMI operations, not the process group or bc operations. These should be in a separate routine |
| src/mpid/ch3/src/mpid_init.c | This is a good place to check for environment variables and command line options that may control the device |
| src/mpid/ch3/src/mpid_init.c | Why are pg_size and pg_rank handled differently? |
| src/mpid/ch3/src/mpid_init.c | Why do we add a ref to pg here? |
| src/mpid/ch3/src/mpid_init.c | To allow just the "root" process to request the port and then use MPIR_Bcast to distribute it to the rest of the processes, we need to perform the Bcast after MPI is otherwise initialized. We could do this by adding another MPID call that the MPI_Init(_thread) routine would make after the rest of MPI is initialized, but before MPI_Init returns. In fact, such a routine could be used to perform various checks, including parameter consistency value (e.g., all processes have the same environment variable values). Alternately, we could allow a few routines to operate with predefined parameter choices (e.g., bcast, allreduce) for the purposes of initialization. |
| src/mpid/ch3/src/mpid_init.c | Check that this intercommunicator gets freed in MPI_Finalize if not already freed. |
| src/mpid/ch3/src/mpid_init.c | Document this |
| src/mpid/ch3/src/mpid_init.c | We may want to allow the channel to ifdef out the use of PMI calls, or ask the channel to provide stubs that return errors if the routines are in fact used |
| src/mpid/ch3/src/mpid_init.c | We can allow the channels to tell the PG how to get connection information by passing the pg to the channel init routine |
| src/mpid/ch3/src/mpid_init.c | Who is this for and where does it belong? |
| src/mpid/ch3/src/mpid_init.c | has_args and has_env need to come from PMI eventually... |
| src/mpid/ch3/src/mpid_init.c | The PG code should supply these, since it knows how the pg_ids and other data are represented |
| src/mpid/ch3/src/mpid_comm_spawn_multiple.c | Correct description of function |
| src/mpid/ch3/src/mpidi_isend_self.c | Explain this function |
| src/mpid/ch3/src/mpidi_isend_self.c | should there be a simpler version of this for short messages, particularly contiguous ones? See also the FIXME about buffering short messages |
| src/mpid/ch3/src/mpidi_isend_self.c | Insert code here to buffer small sends in a temporary buffer? |
| src/mpid/ch3/src/ch3u_handle_recv_req.c | MT: this has to be done atomically |
| src/mpid/ch3/src/ch3u_handle_recv_req.c | What is this variable for (it is never referenced)? |
| src/mpid/ch3/src/ch3u_handle_recv_req.c | Temp to avoid SEGV when memory tracing |
| src/mpid/ch3/src/ch3u_handle_recv_req.c | MT: Must be done atomically |
| src/mpid/ch3/src/ch3u_handle_recv_req.c | MT: The setting of the lock type must be done atomically |
| src/mpid/ch3/src/ch3u_handle_recv_req.c | MT: All queue accesses need to be made atomic |
| src/mpid/ch3/src/ch3u_handle_recv_req.c | MT: Must be done atomically |
| src/mpid/ch3/src/ch3u_handle_recv_req.c | MT: The setting of the lock type must be done atomically |
| src/mpid/ch3/src/mpid_issend.c | HOMOGENEOUS SYSTEMS ONLY -- no data conversion is performed |
| src/mpid/ch3/src/mpid_issend.c | fill temporary IOV or pack temporary buffer after send to hide some latency. This requires synchronization because the CTS packet could arrive and be processed before the above iStartmsg completes (depending on the progress engine, threads, etc.). |
| src/mpid/ch3/src/ch3u_eagersync.c | This sometimes segfaults |
| src/mpid/ch3/src/mpid_recv.c | in the common case, we want to simply complete the message and make as few updates as possible. Note in addition that this routine is used only by MPI_Recv (a blocking routine; the intent of the interface (which returns a request) was to simplify the handling of the case where the message was not found in the unexpected queue. |
| src/mpid/ch3/src/mpid_recv.c | We do not need to add a datatype reference if the request is blocking. This is currently added because of the actions that are taken when a request is freed. (specifically, the datatype and comm both have their refs decremented, and are freed if the refs are zero) |
| src/mpid/ch3/src/mpid_port.c | MPIU_xxx routines should return regular mpi error codes |
| src/mpid/ch3/src/mpid_port.c | We should instead ask the mpid_pg routines to give us a connection string. There may need to be a separate step to restrict us to a connection information that is only valid for connections between processes that are started separately (e.g., may not use shared memory). We may need a channel-specific function to create an exportable connection string. |
| src/mpid/ch3/src/mpid_irsend.c | HOMOGENEOUS SYSTEMS ONLY -- no data conversion is performed |
| src/mpid/ch3/src/ch3u_rma_ops.c | There should be no unreferenced args |
| src/mpid/ch3/src/ch3u_rma_ops.c | This needs to be fixed for heterogeneous systems |
| src/mpid/ch3/src/ch3u_rma_ops.c | It makes sense to save the rank (and size) of the communicator in the window structure to speed up these operations, or to save a pointer to the communicator structure, rather than just the handle |
| src/mpid/ch3/src/ch3u_rma_ops.c | Where does this memory get freed? |
| src/mpid/ch3/src/ch3u_rma_ops.c | It makes sense to save the rank (and size) of the communicator in the window structure to speed up these operations |
| src/mpid/ch3/src/ch3u_rma_ops.c | It makes sense to save the rank (and size) of the communicator in the window structure to speed up these operations, or to save a pointer to the communicator structure, rather than just the handle |
| src/mpid/ch3/src/mpid_iprobe.c | The routine CH3U_Recvq_FU is used only by the probe functions; it should atomically return the flag and status rather than create a request. Note that in some cases it will be possible to atomically query the unexpected receive list (which is what the probe routines are for). |
| src/mpid/ch3/src/mpid_iprobe.c | It would be helpful to know if the Progress_poke operation causes any change in state; we could then avoid a second test of the receive queue if we knew that nothing had changed |
| src/mpid/ch3/src/ch3u_rndv.c | fill temporary IOV or pack temporary buffer after send to hide some latency. This requires synchronization because the CTS packet could arrive and be processed before the above iStartmsg completes (depending on the progress engine, threads, etc.). |
| src/mpid/ch3/src/ch3u_rndv.c | What if the receive user buffer is not big enough to hold the data about to be cleared for sending? |
| src/mpid/ch3/src/ch3u_rndv.c | is that a ch3 or an mpid critical section? |
| src/mpid/ch3/src/ch3u_rndv.c | Ideally we could specify that a req not be returned. This would avoid our having to decrement the reference count on a req we don't want/need. |
| src/mpid/ch3/src/mpid_vc.c | What should this do? See proc group and vc discussion |
| src/mpid/ch3/src/mpid_vc.c | Need a better way to define how vc's are closed that takes into account pending operations on vcs, including close events received from other processes. |
| src/mpid/ch3/src/mpid_vc.c | this is still bogus, the VCRT may contain a mix of dynamic and non-dynamic VCs, so the ref_count isn't guaranteed to have started at 2. The best thing to do might be to avoid overloading the reference counting this way and use a separate check for dynamic VCs (another flag? compare PGs?) |
| src/mpid/ch3/src/mpid_vc.c | the correct test is ACTIVE or REMOTE_CLOSE |
| src/mpid/ch3/src/mpid_vc.c | These routines belong in a different place |
| src/mpid/ch3/src/mpid_vc.c | a quick check on the min/max values of the lpid for this process group could help speed this search |
| src/mpid/ch3/src/mpid_vc.c | For testing, we just test in place |
| src/mpid/ch3/src/mpid_vc.c | We need a cleaner way to handle this case than using an ifdef. We could have an empty version of MPID_PG_BCast in ch3u_port.c, but that's a rather crude way of addressing this problem. Better is to make the handling of local and remote PIDS for the dynamic process case part of the dynamic process "module"; devices that don't support dynamic processes (and hence have only COMM_WORLD) could optimize for that case |
| src/mpid/ch3/src/mpid_vc.c | Printf for debugging |
| src/mpid/ch3/src/mpid_vc.c | We shouldn't have any references to the channel-specific fields in this part of the code. This case should actually not be needed; if there is a pending send element, the top-level state should not be inactive |
| src/mpid/ch3/src/mpid_vc.c | Printf for debugging |
| src/mpid/ch3/src/mpid_vc.c | We need a better abstraction for initializing the thread state for an object |
| src/mpid/ch3/src/mpid_vc.c | need to handle oddeven cliques DARIUS |
| src/mpid/ch3/src/mpid_vc.c | this routine can't handle the dynamic process case at this time. This will require more support from the process manager. |
| src/mpid/ch3/src/mpid_vc.c | 'PMI_process_mapping' only applies for the original PG (MPI_COMM_WORLD) |
| src/mpid/ch3/src/mpid_vc.c | need a better algorithm -- this one does O(N^2) strncmp()s! |
| src/mpid/ch3/src/ch3u_comm_spawn_multiple.c | Place all of this within the mpid_comm_spawn_multiple file |
| src/mpid/ch3/src/ch3u_comm_spawn_multiple.c | We can avoid these two routines if we define PMI as using MPI info values |
| src/mpid/ch3/src/ch3u_comm_spawn_multiple.c | This is *really* awkward. We should either Fix on MPI-style info data structures for PMI (avoid unnecessary duplication) or add an MPIU_Info_getall(...) that creates the necessary arrays of key/value pairs |
| src/mpid/ch3/src/ch3u_comm_spawn_multiple.c | info may be needed for port name |
| src/mpid/ch3/src/ch3u_comm_spawn_multiple.c | translate the pmi error codes here |
| src/mpid/ch3/src/ch3u_comm_spawn_multiple.c | What does this function do? Who calls it? Can we assume that it is called only dynamic process operations (specifically spawn) are supported? Do we need the concept of a port? For example, could a channel that supported only shared memory call this (it doesn't look like it right now, so this could go into util/sock, perhaps? It might make more sense to have this function provided as a function pointer as part of the channel init setup, particularly since this function appears to access channel-specific storage (MPIDI_CH3_Process) |
| src/mpid/ch3/src/ch3u_comm_spawn_multiple.c | DARIUS |
| src/mpid/ch3/src/ch3u_buffer.c | Explain the contents of this file |
| src/mpid/ch3/src/ch3u_buffer.c | Explain this routine . Used indirectly by mpid_irecv, mpid_recv (through MPIDI_CH3_RecvFromSelf) and in mpidi_isend_self.c |
| src/mpid/ch3/src/mpid_send.c | HOMOGENEOUS SYSTEMS ONLY -- no data conversion is performed |
| src/mpid/ch3/src/mpid_send.c | this is a fatal error because a sequence number has already been allocated. If sequence numbers are not being used then this could be a recoverable error. A check needs to be added that sets the error to fatal or recoverable depending on the use of sequence numbers. |
| src/mpid/ch3/src/mpid_send.c | flow control: limit number of outstanding eager messsages containing data and need to be buffered by the receiver |
| src/mpid/ch3/src/ch3u_recvq.c | Recvq_lock/unlock removed because it is not needed for the SINGLE_CS approach and we might want a different, non-lock-based approach in a finer-grained thread-sync version. For example, some routines can be implemented in a lock-free fashion (because the user is required to guarantee non-conflicting accesses, such as doing a probe and a receive that matches in different threads). There are a lot of routines here. Do we really need them all? The search criteria can be implemented in a single 64 bit compare on systems with efficient 64-bit operations. The rank and contextid can also in many cases be combined into a single 32-bit word for the comparison (in which case the message info should be stored in the queue in a naturally aligned, 64-bit word. |
| src/mpid/ch3/src/ch3u_recvq.c | If this routine is only used by probe/iprobe, then we don't need to set the cancelled field in status (only set for nonblocking requests) |
| src/mpid/ch3/src/ch3u_recvq.c | Why doesn't this exit after it finds the first match? |
| src/mpid/ch3/src/ch3u_request.c | This makes request creation expensive. We need to trim this to the basics, with additional setup for special-purpose requests (think base class and inheritance). For example, do we really* want to set the kind to UNDEFINED? And should the RMA values be set only for RMA requests? |
| src/mpid/ch3/src/ch3u_request.c | status fields meaningful only for receive, and even then should not need to be set. |
| src/mpid/ch3/src/ch3u_request.c | RMA ops shouldn't need to be set except when creating a request for RMA operations |
| src/mpid/ch3/src/ch3u_request.c | This fails to fail if debugging is turned off |
| src/mpid/ch3/src/ch3u_request.c | We need a lighter-weight version of this to avoid all of the extra checks. One posibility would be a single, no special case (no comm, datatype, or srbuf to check) and a more general (check everything) version. |
| src/mpid/ch3/src/ch3u_request.c | We need a better way to handle these so that we do not always need to initialize these fields and check them when we destroy a request |
| src/mpid/ch3/src/ch3u_request.c | We need a way to call these routines ONLY when the related ref count has become zero. |
| src/mpid/ch3/src/ch3u_request.c | we should drain the data off the pipe here, but we don't have a buffer to drain it into. should this be a fatal error? |
| src/mpid/common/datatype/mpid_ext32_segment.h | !!! TODO!!! FOO!!! DO THIS!!! DETECT ME!!! Include mpiimpl.h up here too, taking it out of mpid_ext32_segment.c Consider using MPIU_INT64_T etc. types instead of the EIGHT_BYTE_BASIC_TYPE stuff, or put #defines at the top of this file assigning them in a simple manner. Doing that might require that we create MPIU_UINT64_T types (etc), because it looks like we really want to have unsigned types for the various convert functions below. |
| src/mpid/common/datatype/mpid_ext32_segment.h | This is necessary with some compilers or compiler settings |
| src/mpid/common/datatype/mpid_ext32_segment.h | Who defines __BYTE_ORDER or __BIG_ENDIAN? They aren't part of C |
| src/mpid/common/datatype/mpid_ext32_segment.h | The else test assumes that the byte order is little endian, whereas it may simply have been undetermined. This should instead use either a configure-time test (for which there are macros) or a runtime test and not use this non-portable check |
| src/mpid/common/datatype/mpid_ext32_segment.h | "BLENDIAN" is a non-conforming name - it could conflict with some other definition in a non-mpich2 header file |
| src/mpid/common/datatype/mpid_datatype.h | calculating this value this way is foolish, we should make this more automatic and less error prone |
| src/mpid/common/datatype/mpid_datatype_contents.c | Why is this here? The value is never used. Should it be? |
| src/mpid/common/datatype/mpid_type_create_resized.c | ??? |
| src/mpid/common/datatype/mpid_contents_support.c | Is this routine complete? Why is it needed? If it is needed, it must have a comment that describes why it is needed and the arguments must have ATTRIBUTE((unused)) |
| src/mpid/common/datatype/dataloop/segment_ops.c | The blkidx and index functions are not used since they optimize the count by coalescing contiguous segments, while functions using the count do not optimize in the same way (e.g., flatten code) |
| src/mpid/common/locks/mpidu_process_locks.h | This inclusion is disgusting but necessary for now. Including mpidimpl.h will get us USE_BUSY_LOCKS from the ssm channel (or any other device that defines it), that's why we do it. Unfortunately, it sucks in the whole MPI headers ball of wax at the same time, and implies a potential circular dependency. Doing the right thing here basically means a total gut job, which is out of scope at this time. [goodell@ 2008-03-19] |
| src/mpid/common/locks/mpidu_process_locks.h | It should be sufficient to include mpidpre.h (updating the ssm etc devices if necessary). That's still not great, but better than all of mpidimpl.h, which should not be used outside of the device code. WDG - 8/8/08 |
| src/mpid/common/locks/mpidu_process_locks.h | Including mpidimpl.h includes mpiimpl.h, which is a superset of mpishared.h. mpishared.h should only be included when the regular mpiimpl.h headers are not used. |
| src/mpid/common/locks/mpidu_process_locks.h | First use the configure ifdefs to decide on an approach for locks. Then put all lock code in one place, or at least guarded by the same "USE_xxx" ifdef. It is nearly impossible with the current code to determine, for example, what is the definition of MPIDU_Process_lock_t. (Specifically, for the Intel compiler on an x86, it appears to be missing a volatile, needed when using the _InterlockedExchange inline function |
| src/mpid/common/locks/mpidu_process_locks.h | This uses an invalid prefix |
| src/mpid/common/locks/mpidu_process_locks.c | The build scheme for this file is broken, and this definition illustrates the problem. The issue is that some files include mpidu_process_locks.h with *different* definitions than are used to compile this file! That shows up when this variable, which is only used for busy locks, is only conditionally defined when busy locks are defined. The problem is that code in, for example, the ch3:ssm channel includes mpidu_process_locks.h with USE_BUSY_LOCKS defined, but the configure in the mpid/common/locks directory will never* define USE_BUSY_LOCKS |
| src/mpid/common/locks/mpidu_process_locks.c | Why is this here? Is this a misnamed use-inline-locks? |
| src/mpid/common/sock/mpidu_sock.h | This is not the right way to add error values to an MPICH module. Note that (a) the last class values are not respected by the error handling code, (b) the entire point of codes and classes is to provide a natural grouping of codes to a class, (c) this approach can only be used by one module and hence breaks any component design, and (d) this is what the MPI dynamic error codes and classes was designed for. |
| src/mpid/common/sock/poll/mpidu_socki.h | !!! |
| src/mpid/common/sock/poll/sock_immed.i | Why is this the _immed file (what does immed stand for?) |
| src/mpid/common/sock/poll/sock_immed.i | What do any of these routines do? What are the arguments? Special conditions (see the FIXME on len = SSIZE_MAX)? preconditions? postconditions? |
| src/mpid/common/sock/poll/sock_immed.i | What does this function do? What are its arguments? It appears to execute a nonblocking accept call |
| src/mpid/common/sock/poll/sock_immed.i | Either use the syscall macro or correctly wrap this in a test for EINTR |
| src/mpid/common/sock/poll/sock_immed.i | There should be a simpler macro for reporting errno messages |
| src/mpid/common/sock/poll/sock_immed.i | Who sets the socket buffer size? Why isn't the test made at that time? |
| src/mpid/common/sock/poll/sock_immed.i | There's normally no need to check that the socket buffer size was set to the requested size. This should only be part of some more verbose diagnostic output, not a general action |
| src/mpid/common/sock/poll/sock_immed.i | There's normally no need to check that the socket buffer size was set to the requested size. This should only be part of some more verbose diagnostic output, not a general action |
| src/mpid/common/sock/poll/sock_immed.i | Cut and paste code is a disaster waiting to happen. Particularly in any non-performance critical section, create a separate routine instead of using cut and paste. |
| src/mpid/common/sock/poll/sock_immed.i | multiple passes should be made if len > SSIZE_MAX and nb == SSIZE_MAX |
| src/mpid/common/sock/poll/sock_immed.i | This is a scary test/assignment. It needs an explanation (presumably that this routine will be called again if len is shortened. However, in that case, the description of the routine (which is also missing!!!!) needs to be very clear about this requirement. |
| src/mpid/common/sock/poll/sock_immed.i | multiple passes should be made if len > SSIZE_MAX and nb == SSIZE_MAX |
| src/mpid/common/sock/poll/sock_immed.i | We need (1) a standardized test for including multithreaded code and (2) include support for user requests for a lower-level of thread safety. Finally, things like this should probably be implemented as an abstraction (e.g., wakeup_progress_threads?) rather than this specific code. |
| src/mpid/common/sock/poll/sock_init.i | The usual missing documentation (what are these routines for? preconditions? who calls? post conditions? |
| src/mpid/common/sock/poll/sock_init.i | Who calls? When? Should this be a finalize handler instead? |
| src/mpid/common/sock/poll/sock_set.i | Move the thread-specific operations into thread-specific routines (to allow for alternative thread sync models and for runtime control of thread level) |
| src/mpid/common/sock/poll/sock_post.i | Provide an overview for the functions in this file |
| src/mpid/common/sock/poll/sock_post.i | It would be better to include a special formatting clue for system error messages (e.g., %dSE; in the recommended revision for error reporting (that is, value (errno) is an int, but should be interpreted as an System Error string) |
| src/mpid/common/sock/poll/sock_post.i | What does this routine do? Why does it take a host description instead of an interface name or address? |
| src/mpid/common/sock/poll/sock_post.i | strtok may change the contents of host_description. Shouldn't the host description be a const char [] and not modified by this routine? |
| src/mpid/common/sock/poll/sock_post.i | For ipv6, we should use getaddrinfo |
| src/mpid/common/sock/poll/sock_post.i | Set error |
| src/mpid/common/sock/poll/sock_post.i | Use the parameter interface and document this |
| src/mpid/common/sock/poll/sock_post.i | What does this function do? |
| src/mpid/common/sock/poll/sock_wait.i | return code? If an error occurs do we return it instead of the error specified in the event? |
| src/mpid/common/sock/poll/sock.c | What do these mean? Why is 32 a good size (e.g., is it because 32*32 = 1024 if these are bits in a 4-byte int? In that case, should these be related to a maximum processor count or an OS-defined fd limit? |
| src/mpid/common/sock/poll/sock.c | Why aren't these static |
| src/mpid/common/sock/poll/sock.c | Why are these files included in this way? Why not make them either separate files or simply part of (one admittedly large) source file? |
| src/mpid/common/sock/poll/sock_misc.i | Who uses this and why? |
| src/mpid/common/sock/poll/sock_misc.i | This routine is misnamed; it is really get_interface_name (in the case where there are several networks available to the calling process, this picks one but even in the current code can pick a different interface if a particular environment variable is set) . This routine is used in smpd so we should not change its name yet. |
| src/mpid/common/sock/poll/sock_misc.i | Is this documented? How does it work if the process manager cannot give each process a different value for an environment name? What if a different interface is needed? |
| src/mpid/common/sock/poll/sock_misc.i | What is this for? It appears to be used in debug printing and in smpd |
| src/mpid/common/sock/poll/sock_misc.i | This function violates the internationalization design by using English language strings rather than the error translation mechanism. This unnecessarily breaks the goal of allowing internationalization. Read the design documentation and if there is a problem, raise it rather than ignoring it. |
| src/mpid/common/sock/poll/sock_misc.i | This appears to only be used in smpd |
| src/mpid/common/sock/poll/sock_misc.i | It appears that this function was used instead of making use of the existing MPI-2 features to extend MPI error classes and code, of to export messages to non-MPI application (smpd?) |
| src/mpid/common/sock/poll/socki_util.i | These need to separate the operations from the thread-related synchronization to ensure that the code that is independent of threads is always the same. Also, the thread-level check needs to be identical to all others, and there should be an option, possibly embedded within special thread macros, to allow runtime control of the thread level |
| src/mpid/common/sock/poll/socki_util.i | Does this need a runtime check on whether threads are in use? |
| src/mpid/common/sock/poll/socki_util.i | Low usage operations like this should be a function for better readability, modularity, and code size |
| src/mpid/common/sock/poll/socki_util.i | Are these really optional? Based on their definitions, it looks like they should only be used when debugging the code. |
| src/mpid/common/sock/poll/socki_util.i | Should this use the CHKPMEM macros (perm malloc)? |
| src/mpid/common/sock/poll/socki_util.i | We need an abstraction for the thread sync operations |
| src/mpid/common/sock/poll/socki_util.i | move last element into current position and update sock associated with last element. |
| src/mpid/common/sock/poll/socki_util.i | Shouldn't this be an mpi error code? |
| src/mpid/common/sock/poll/socki_util.i | Who allocates eventq tables? Should there be a check that these tables are empty first? |
| src/mpid/common/sock/poll/socki_util.i | Is this the name that we want to use (this was chosen to match the original, undocumented name) |
| src/mpid/common/sock/iocp/sock.c | Is this correct for resource errors like WSAENOBUFS or an interrupted operation? |
| src/mpid/common/sock/iocp/sock.c | does this data need to be handled immediately? |
| src/util/mem/handlemem.c | The alloc_complete routine should be removed. It is used only in typeutil.c (in MPIR_Datatype_init, which is only executed from within the MPI_Init/MPI_Init_thread startup and hence is guaranteed to be single threaded). When used by the obj_alloc, it adds unnecessary overhead, particularly when MPI is single threaded |
| src/util/dbg/dbg_printf.c | This should use MPIU_Param_get_string |
| src/util/dbg/dbg_printf.c | put everything on one line until a \n is found |
| src/util/dbg/dbg_printf.c | |
| src/util/dbg/dbg_printf.c | Need to know how many MPI_COMM_WORLDs are known |
| src/util/dbg/dbg_printf.c | This is a hack to handle the common case of two worlds |
| src/util/dbg/dbg_printf.c | Get world number |
| src/util/dbg/dbg_control.c | Delete this file (functionality switched to MPIU_DBG_MSG interface |
| src/util/multichannel/mpi.c | Should we allow for short wrapper names like 'mpe'? |
| src/util/wrappers/mpiu_process_wrappers.h | Currently this functionality is also implemented in mpidu_process_locks.h . Should we use MPIU_Thread_yield() ? |
| src/util/wrappers/mpiu_process_wrappers.h | Allow fallbacks with sleep/select |
| src/util/wrappers/mpiu_sock_wrappers.h | Replace wrapper funcs implemented as complex macros with static inline funcs |
| src/util/wrappers/mpiu_sock_wrappers.h | Add more comments |
| src/util/wrappers/mpiu_sock_wrappers.h | Try receiving data with AcceptEx() call to improve perf |
| src/util/wrappers/mpiu_sock_wrappers.h | This is cheating... Expand dynamically - we need to take care of updating sock Handles if we expand dynamically. |
| src/util/wrappers/mpiu_sock_wrappers.h | Cheating - Expand dynamically |
| src/util/wrappers/mpiu_sock_wrappers.h | Try not to use MPIU_Assert(). Utils should not depend on a device impl of assert |
| src/util/wrappers/mpiu_shm_wrappers.h | Add underscore to the end of funcs/macro names not to be exposed to user |
| src/util/wrappers/mpiu_shm_wrappers.h | Reduce the conditional checks for wrapper-internal utility funcs/macros. |
| src/util/wrappers/mpiu_shm_wrappers.h | What if val is a non-null terminated string ? |
| src/util/wrappers/mpiu_shm_wrappers.h | Don't print ENGLISH strings on error. Define the error strings in errnames.txt |
| src/util/wrappers/mpiu_shm_wrappers.h | Pass (void **)shm_addr_ptr instead of (char **) shm_addr_ptr since a util func should be generic Currently not passing (void **) to reduce warning in nemesis code which passes (char **) ptrs to be attached to a seg |
| src/util/wrappers/mpiu_shm_wrappers.h | Close local handle only when closing the shm handle |
| src/util/logging/rlog/trace_input.c | Improve failure reporting |
| src/util/logging/rlog/printrlog.c | Add a structured comment for the man page generate to create the basic documentation on this routine, particularly since this routine only prints a subset of information by default |
| src/util/logging/rlog/printrlog.c | This should also check for the GNU-standard --help, --usage, and -h options. |
| src/util/logging/rlog/printrlog.c | What is the default behavior with just an rlogfile? |
| src/util/logging/rlog/rlogtime.c | This name needs to be changed to ensure no conflicts with user-defined globals |
| src/util/param/param.c | This is the code from EnvGetRange |
| src/util/procmap/local_proc.c | this is including a ch3 include file! Implement these functions at the device level. |
| src/util/procmap/local_proc.c | this could/should be a list of ranks on the local node, which should take up much less space on a typical thin(ish)-node system. |
| src/util/info/info_dup.c | multithreaded |
| src/util/info/info_get.c | The real fix is to change MPIU_Strncpy to set the null at the end (always!) and return an error if it had to truncate the result. |
| src/util/ex/ex.c | Change ExCreateSet() to allow adding the set to a list & allow sending this list to MPIU_ExProcessCompletions() MPIU_ExSetHandle_t MPIU_ExCreateSet(MPIU_ExCreateSetList_t list) |
| src/util/ex/ex.c | Should'nt there be a retry count ? |
| src/util/thread/mpiu_thread.c | faster allocation, or avoid it all together? |
| src/util/thread/mpiu_thread.c | convert error to an MPIU_THREAD_ERR value |
| src/util/thread/mpiu_thread.c | faster allocation, or avoid it all together? |
| src/util/thread/mpiu_thread.c | convert error to an MPIU_THREAD_ERR value |
| src/pm/util/process.c | kill children |
| src/pm/util/process.c | kill children |
| src/pm/util/process.c | Should initialize with the number of created processes |
| src/pm/util/process.c | Indicate whether all processes have exited. Then mpiexec programs can decide (probably based on a debugging flag) what to do if they have not all exited. |
| src/pm/util/process.c | This is an ugly hack to ensure that the macros to manipulate the cpu_set_t are defined - without these, you can't use the affinity routines |
| src/pm/util/process.c | How do we bind the threads when we don't have direct access to them? |
| src/pm/util/simple_pmiutil2.c | Decide what role PMIU_printf should have (if any) and select the appropriate MPIU routine |
| src/pm/util/simple_pmiutil2.c | Make this an optional output |
| src/pm/util/cmnargs.c | Get values from the environment first. Command line options override the environment |
| src/pm/util/cmnargs.c | Move this routine else where; perhaps a pmutil.c? |
| src/pm/util/cmnargs.c | handle sign, invalid input |
| src/pm/util/cmnargs.c | handle negative increments, and e not s + k incr |
| src/pm/util/pmiserv.h | We need to specify a minimum length for both key and value. The "business cards" used by the ch3 channel implementations for sockets and for sockets+shared memory can exceed 128 characters. |
| src/pm/util/pmiserv.h | This is a temporary declaration (non-conforming name). Who added this, and why? |
| src/pm/util/pmiserv.c | We may need to record some information, such as the curPMIGroup, in the pState or pmiprocess entry |
| src/pm/util/pmiserv.c | What should be the defaults for the spawned env? Should the default be the env ov the spawner? |
| src/pm/util/pmiserv.c | Otherwise, we have a problem |
| src/pm/util/pmiserv.c | call user-specified info handler, if any. Unspecified info keys are ignored |
| src/pm/util/pmiserv.c | Make this an optional output |
| src/pm/util/env.c | This should be getGenv (so allocation/init in one place) |
| src/pm/util/rm.c | need path and external designation of file names |
| src/pm/util/ioloop.c | Occassionally, data from stdout has been lost when the job is exiting. I don't know whether data is being lost because the writer is discarding it or the reader (mpiexec) is failing to finish reading from all of the sockets before exiting. |
| src/pm/util/ioloop.c | an EINTR may also mean that a process has exited (SIGCHILD). If all processes have exited, we may want to exit |
| src/pm/util/pmiport.c | simple_pmi should *never* start mpiexec with a bogus interface name |
| src/pm/remshell/mpiexec.c | The singleton code goes here |
| src/pm/remshell/mpiexec.c | This should be part of the PMI initialization in the clients |
| src/pm/remshell/mpiexec.c | |
| src/pm/smpd/smpd_read_write.c | why does this return success on truncation? |
| src/pm/smpd/smpd_read_write.c | Error on truncation? if (*str != '\0') { num_decoded = n; return SMPD_FAIL; } |
| src/pm/smpd/smpd_connect.c | this insertion needs to be thread safe |
| src/pm/smpd/smpd_connect.c | cleanup sspi structures |
| src/pm/smpd/smpd_connect.c | this could overflow |
| src/pm/smpd/smpd_connect.c | Insert code here to parse a compressed host string |
| src/pm/smpd/smpd_connect.c | this could overflow |
| src/pm/smpd/smpd_connect.c | this could overflow |
| src/pm/smpd/smpd_service.c | Remove hardcoded values -- eg: Length of error Mesg, 256 |
| src/pm/smpd/smpd_affinitize.c | Is this defined for win32 ? case RelationProcessorPackage : return prioPackage; |
| src/pm/smpd/smpd_iov.h | How is IOV_LIMIT chosen? |
| src/pm/smpd/smpd_host_util.c | If these are not NULL should they be freed? |
| src/pm/smpd/smpd.c | Get rid of this hack - we already create local KVS for all singleton clients by default |
| src/pm/smpd/mp_parse_command_line.c | We don't detect overflow case in atoi |
| src/pm/smpd/mp_parse_command_line.c | How do we tell the difference between an error parsing the file and parsing the last entry? |
| src/pm/smpd/smpd_launch_process.c | The return vals of these functions are not checked ! |
| src/pm/smpd/smpd_database.c | Should this cleanup be done here ? This lets one forget to do a smpd_dbs_destroy() for every smpd_dbs_create() |
| src/pm/smpd/smpd_database.c | Replace SMPD_DBS_SUCCESS/... error codes with SMPD_SUCCESS/... |
| src/pm/smpd/smpd_database.c | Use routines below to replace dbs_first(), dbs_next() routines |
| src/pm/smpd/smpd_database.c | Go for a macro here ? |
| src/pm/smpd/smpd_database.c | Desc error msgs ? |
| src/pm/smpd/smpd_user_data.c | function not implemented |
| src/pm/smpd/smpd_handle_spawn.c | create a mechanism to timeout spawned processes and only those spawned processes |
| src/pm/smpd/smpd_handle_spawn.c | This will turn on exit code printing for all processes. Implement a new mechanism for only printing codes for an individual process group. |
| src/pm/smpd/smpd_handle_spawn.c | Tell MPICH that the spawned processes will not make any SMPD calls, including SMPD_Init - so don't to a comm_accept or it will hang! |
| src/pm/smpd/mpiexec.c | Get rid of this hack - we already create local KVS for all singleton clients by default |
| src/pm/smpd/smpd_state_machine.c | what does 0 bytes written mean? Does it mean that no bytes could be written at that moment or does it mean that there is an error on the socket? |
| src/pm/smpd/smpd_state_machine.c | the smpd password needs to be able to be set |
| src/pm/smpd/smpd_state_machine.c | Instead of encoding a pointer to the context, add an integer id to the context structure and look up the context based on this id from a global list of contexts. |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | This assumes that the server knows to post a write of the delegate command because it knows that no buffer will be returned by the iter command |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Instead of encoding a pointer to the context, add an integer id to the context structure and look up the context based on this id from a global list of contexts. |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_state_machine.c | Instead of encoding a pointer to the context, add an integer id to the context structure and look up the context based on this id from a global list of contexts. |
| src/pm/smpd/smpd_state_machine.c | Instead of encoding a pointer to the context, add an integer id to the context structure and look up the context based on this id from a global list of contexts. |
| src/pm/smpd/smpd_state_machine.c | Reintroduce event.error == SMPDU_SOCK_ERR_CONN_CLOSED once we have a better error code |
| src/pm/smpd/smpd_state_machine.c | should we send a message to the root process manager to close stdin to the root process? |
| src/pm/smpd/smpd_handle_command.c | smpd_get_hostname() does the same thing |
| src/pm/smpd/smpd_handle_command.c | We shouldn't abort if a process fails to launch as part of a spawn command |
| src/pm/smpd/smpd_handle_command.c | mismatch btw size of connect_to->host and max host name len |
| src/pm/smpd/smpd_handle_command.c | decrypt the password |
| src/pm/smpd/smpd_handle_command.c | This assumes that the server knows to post a write of the delegate command because it knows that no buffer will be returned by the iter command |
| src/pm/smpd/smpd_handle_command.c | Add code to cleanup sspi structures |
| src/pm/smpd/smpd_handle_command.c | username and password needed to map a drive |
| src/pm/smpd/smpd_handle_command.c | create a new return class, for now use SMPD_DBS_RETURN |
| src/pm/smpd/smpd_handle_command.c | decrypt the password |
| src/pm/smpd/smpd_handle_command.c | How do we determine whether to provide user credentials or impersonate the current user? |
| src/pm/smpd/smpd_handle_command.c | Assign the appropriate src/dst for singinit/die |
| src/pm/smpd/mpiexec_rsh.c | Why is this func defined here ? - shouldn't this be in smpd_util*.lib ? |
| src/pm/smpd/mpiexec_rsh.c | Do we need a generic handle type for smpd thread/proc ? |
| src/pm/smpd/smpd_implthread.h | SMPD internally calls the utils defined for the SMPD device eg: SMPDU_Sock_wait() -- Remove this dependency by copying the required socket code to smpd |
| src/pm/smpd/smpd.h | Use an enum instead enum SMPD_RetVal{ SMPD_ERR_INVALID_USER=-3, SMPD_ABORT, SMPD_FAIL, SMPD_SUCCESS, SMPD_DBS_RETURN, SMPD_CONNECTED, SMPD_CLOSE, SMPD_EXIT, SMPD_LAST_RETVAL }; |
| src/pm/smpd/smpd.h | Do something sensible like mpitypedefs.h |
| src/pm/smpd/smpd.h | Remove this |
| src/pm/smpd/smpd.h | Remove this |
| src/pm/smpd/smpd.h | Cleanup this struct after we start using spn list handles everywhere |
| src/pm/smpd/smpd_printf.c | For now ... translate correctly later |
| src/pm/smpd/sock/poll/smpd_sock_post.i | Provide an overview for the functions in this file |
| src/pm/smpd/sock/poll/smpd_sock_post.i | It would be better to include a special formatting clue for system error messages (e.g., %dSE; in the recommended revision for error reporting (that is, value (errno) is an int, but should be interpreted as an System Error string) |
| src/pm/smpd/sock/poll/smpd_sock_post.i | What does this routine do? Why does it take a host description instead of an interface name or address? |
| src/pm/smpd/sock/poll/smpd_sock_post.i | strtok may change the contents of host_description. Shouldn't the host description be a const char [] and not modified by this routine? |
| src/pm/smpd/sock/poll/smpd_sock_post.i | For ipv6, we should use getaddrinfo |
| src/pm/smpd/sock/poll/smpd_sock_post.i | Set error |
| src/pm/smpd/sock/poll/smpd_sock_post.i | Use the parameter interface and document this |
| src/pm/smpd/sock/poll/smpd_sock_post.i | What does this function do? |
| src/pm/smpd/sock/poll/smpd_sock_wait.i | return code? If an error occurs do we return it instead of the error specified in the event? |
| src/pm/smpd/sock/poll/smpd_util_socki.h | !!! |
| src/pm/smpd/sock/poll/smpd_util_sock.c | What do these mean? Why is 32 a good size (e.g., is it because 32*32 = 1024 if these are bits in a 4-byte int? In that case, should these be related to a maximum processor count or an OS-defined fd limit? |
| src/pm/smpd/sock/poll/smpd_util_sock.c | Why aren't these static |
| src/pm/smpd/sock/poll/smpd_util_sock.c | Why are these files included in this way? Why not make them either separate files or simply part of (one admittedly large) source file? |
| src/pm/smpd/sock/poll/smpd_sock_immed.i | Why is this the _immed file (what does immed stand for?) |
| src/pm/smpd/sock/poll/smpd_sock_immed.i | What do any of these routines do? What are the arguments? Special conditions (see the FIXME on len = SSIZE_MAX)? preconditions? postconditions? |
| src/pm/smpd/sock/poll/smpd_sock_immed.i | What does this function do? What are its arguments? It appears to execute a nonblocking accept call |
| src/pm/smpd/sock/poll/smpd_sock_immed.i | Either use the syscall macro or correctly wrap this in a test for EINTR |
| src/pm/smpd/sock/poll/smpd_sock_immed.i | There should be a simpler macro for reporting errno messages |
| src/pm/smpd/sock/poll/smpd_sock_immed.i | Who sets the socket buffer size? Why isn't the test made at that time? |
| src/pm/smpd/sock/poll/smpd_sock_immed.i | multiple passes should be made if len > SSIZE_MAX and nb == SSIZE_MAX |
| src/pm/smpd/sock/poll/smpd_sock_immed.i | This is a scary test/assignment. It needs an explanation (presumably that this routine will be called again if len is shortened. However, in that case, the description of the routine (which is also missing!!!!) needs to be very clear about this requirement. |
| src/pm/smpd/sock/poll/smpd_sock_immed.i | multiple passes should be made if len > SSIZE_MAX and nb == SSIZE_MAX |
| src/pm/smpd/sock/poll/smpd_sock_immed.i | We need (1) a standardized test for including multithreaded code and (2) include support for user requests for a lower-level of thread safety. Finally, things like this should probably be implemented as an abstraction (e.g., wakeup_progress_threads?) rather than this specific code. |
| src/pm/smpd/sock/poll/smpd_sock_misc.i | Who uses this and why? |
| src/pm/smpd/sock/poll/smpd_sock_misc.i | This routine is misnamed; it is really get_interface_name (in the case where there are several networks available to the calling process, this picks one but even in the current code can pick a different interface if a particular environment variable is set) . This routine is used in smpd so we should not change its name yet. |
| src/pm/smpd/sock/poll/smpd_sock_misc.i | Is this documented? How does it work if the process manager cannot give each process a different value for an environment name? What if a different interface is needed? |
| src/pm/smpd/sock/poll/smpd_sock_misc.i | What is this for? It appears to be used in debug printing and in smpd |
| src/pm/smpd/sock/poll/smpd_sock_misc.i | This function violates the internationalization design by using English language strings rather than the error translation mechanism. This unnecessarily breaks the goal of allowing internationalization. Read the design documentation and if there is a problem, raise it rather than ignoring it. |
| src/pm/smpd/sock/poll/smpd_sock_misc.i | This appears to only be used in smpd |
| src/pm/smpd/sock/poll/smpd_sock_misc.i | It appears that this function was used instead of making use of the existing MPI-2 features to extend MPI error classes and code, of to export messages to non-MPI application (smpd?) |
| src/pm/smpd/sock/poll/smpd_sock_set.i | Move the thread-specific operations into thread-specific routines (to allow for alternative thread sync models and for runtime control of thread level) |
| src/pm/smpd/sock/poll/smpd_sock_init.i | The usual missing documentation (what are these routines for? preconditions? who calls? post conditions? |
| src/pm/smpd/sock/poll/smpd_sock_init.i | Who calls? When? Should this be a finalize handler instead? |
| src/pm/smpd/sock/poll/smpd_socki_util.i | These need to separate the operations from the thread-related synchronization to ensure that the code that is independent of threads is always the same. Also, the thread-level check needs to be identical to all others, and there should be an option, possibly embedded within special thread macros, to allow runtime control of the thread level |
| src/pm/smpd/sock/poll/smpd_socki_util.i | Low usage operations like this should be a function for better readability, modularity, and code size |
| src/pm/smpd/sock/poll/smpd_socki_util.i | Are these really optional? Based on their definitions, it looks like they should only be used when debugging the code. |
| src/pm/smpd/sock/poll/smpd_socki_util.i | Should this use the CHKPMEM macros (perm malloc)? |
| src/pm/smpd/sock/poll/smpd_socki_util.i | move last element into current position and update sock associated with last element. |
| src/pm/smpd/sock/poll/smpd_socki_util.i | Shouldn't this be an mpi error code? |
| src/pm/smpd/sock/poll/smpd_socki_util.i | Who allocates eventq tables? Should there be a check that these tables are empty first? |
| src/pm/smpd/sock/poll/smpd_socki_util.i | Is this the name that we want to use (this was chosen to match the original, undocumented name) |
| src/pm/smpd/sock/iocp/smpd_util_sock.c | Use WSADuplicateSocket() instead |
| src/pm/smpd/sock/iocp/smpd_util_sock.c | Use WSADuplicateSocket instead |
| src/pm/smpd/sock/iocp/smpd_util_sock.c | Do we need this any more ? - PM sock util is single threaded |
| src/pm/smpd/sock/iocp/smpd_util_sock.c | Use WSADuplicateSocket() instead |
| src/pm/smpd/sock/iocp/smpd_util_sock.c | Is this correct for resource errors like WSAENOBUFS or an interrupted operation? |
| src/pm/smpd/sock/iocp/smpd_util_sock.c | Why don't we check for errors during post ? |
| src/pm/smpd/sock/iocp/smpd_util_sock.c | Wait on progress engine instead |
| src/pm/smpd/sock/iocp/smpd_util_sock.c | does this data need to be handled immediately? |
| src/pm/smpd/sock/iocp/smpd_util_sock.c | Provide a rich set of error codes as in mpid/common/sock |
| src/pm/gforker/mpiexec.c | The following should be a single routine in pmiport |
| src/pm/gforker/mpiexec.c | This should be part of the PMI initialization in the clients |
| src/pm/hydra/mpl/include/mpltrmem.h | Consider an option of specifying __attribute__((malloc)) for gcc - this lets gcc-style compilers know that the returned pointer does not alias any pointer prior to the call. |
| src/pm/hydra/mpl/src/mpltrmem.c | We should use generalized parameter handling here to allow use of the command line as well as environment variables |
| src/pm/hydra/mpl/src/mpltrmem.c | why do we skip the first few ints? [goodell@] |
| src/pm/hydra/mpl/src/mplenv.c | We need to provide a way to signal this error |
| src/pm/hydra/mpl/src/mplstr.c | Assumes ASCII |
| src/pm/hydra/tools/bootstrap/utils/bscu_wait.c | We rely on gettimeofday here. This needs to detect the timer type available and use that. Probably more of an MPL functionality than Hydra's. |
| src/pm/hydra/tools/bind/hwloc/hwloc/include/hwloc.h | agregate nbobjs from different levels? |
| src/pm/hydra/tools/bind/hwloc/hwloc/include/hwloc/helper.h | agregate nbobjs from different levels? |
| src/pm/hydra/tools/bind/hwloc/hwloc/src/traversal.c | add an invalid value ? |
| src/pm/hydra/tools/bind/hwloc/hwloc/src/topology-linux.c | gather page_size_kB as well? MaMI needs it |
| src/pm/hydra/tools/bind/hwloc/hwloc/src/topology-linux.c | get the right DMI info of each machine |
| src/pm/hydra/tools/bind/hwloc/hwloc/src/topology-linux.c | gather page_size_kB as well? MaMI needs it |
| src/pm/hydra/pm/pmiserv/pmi_common.c | We initialize to whatever PMI version we detect while reading the PMI command, instead of relying on what the init command gave us. This part of the code should not know anything about PMI-1 vs. PMI-2. But the simple PMI client-side code is so hacked up, that commands can arrive out-of-order and this is necessary. |
| src/pm/hydra/pm/pmiserv/pmiserv_cb.c | Instead of doing this from this process itself, fork a bunch of processes to do this. |
| src/pm/hydra/pm/pmiserv/pmiserv_pmi_v2.c | This should be an output parameter. We need to change the structure of the PMI function table to be able to take additional arguments, and not just the ones passed on the wire. |
| src/pm/hydra/pm/pmiserv/pmi_common.h | PMI-1 uses 256, PMI-2 uses 1024; we use the MAX |
| src/pmi/pmi2/simple_pmiutil.c | Decide what role PMI2U_printf should have (if any) and select the appropriate PMI2U routine |
| src/pmi/pmi2/simple_pmiutil.c | Make this an optional output |
| src/pmi/pmi2/simple2pmi.c | Why is setvbuf commented out? |
| src/pmi/pmi2/simple2pmi.c | What if the output should be fully buffered (directed to file)? unbuffered (user explicitly set?) |
| src/pmi/pmi2/simple2pmi.c | (protocol design flaw): command line arguments may contain both = and <space> (and even tab!). |
| src/pmi/pmi2/simple2pmi.c | Use a valid hostname |
| src/pmi/pmi2/simple2pmi.c | We need to support a distinct kvsname for each process group |
| src/pmi/pmi2/simple2pmi.c | Check for valid integer after : |
| src/pmi/pmi2/simple2pmi.c | This is an example of an incorrect fix - writeline can change the second argument in some cases, and that will break the const'ness of request. Instead, writeline should take a const item and return an error in the case in which it currently truncates the data. |
| src/pmi/simple/simple_pmiutil.c | Decide what role PMIU_printf should have (if any) and select the appropriate MPIU routine |
| src/pmi/simple/simple_pmiutil.c | Make this an optional output |
| src/pmi/simple/simple_pmi.c | Why is setvbuf commented out? |
| src/pmi/simple/simple_pmi.c | What if the output should be fully buffered (directed to file)? unbuffered (user explicitly set?) |
| src/pmi/simple/simple_pmi.c | Why does this depend on their being a port??? |
| src/pmi/simple/simple_pmi.c | What is this for? |
| src/pmi/simple/simple_pmi.c | This should use a cmd/response rather than a expecting the server to set a value in this and only this case |
| src/pmi/simple/simple_pmi.c | And it most ceratainly should not happen *before* the initialization handshake |
| src/pmi/simple/simple_pmi.c | This is something that the PM should tell the process, rather than deliver it through the environment |
| src/pmi/simple/simple_pmi.c | My name should be cached rather than re-acquired, as it is unchanging (after singleton init) |
| src/pmi/simple/simple_pmi.c | We need to support a distinct kvsname for each process group |
| src/pmi/simple/simple_pmi.c | What is this function? How is it defined and used? |
| src/pmi/simple/simple_pmi.c | Check for tempbuf too short |
| src/pmi/simple/simple_pmi.c | This should have used rc and msg |
| src/pmi/simple/simple_pmi.c | Do correct error reporting |
| src/pmi/simple/simple_pmi.c | Do correct error reporting |
| src/pmi/simple/simple_pmi.c | (protocol design flaw): command line arguments may contain both = and <space> (and even tab!). |
| src/pmi/simple/simple_pmi.c | Check for tempbuf too short |
| src/pmi/simple/simple_pmi.c | This mixes init with get maxes |
| src/pmi/simple/simple_pmi.c | This is an example of an incorrect fix - writeline can change the second argument in some cases, and that will break the const'ness of request. Instead, writeline should take a const item and return an error in the case in which it currently truncates the data. |
| src/pmi/simple/simple_pmi.c | Use a valid hostname |
| src/pmi/simple/simple_pmi.c | We need to support a distinct kvsname for each process group |
| src/pmi/simple/simple_pmi.c | Check for valid integer after : |
| src/pmi/smpd/smpd_ipmi.c | Send the actual exe name |
| src/pmi/smpd/smpd_ipmi.c | Can we send a pid_t as an int ? |
| src/pmi/smpd/smpd_ipmi.c | Currently only used for singleton init -- mostly only one pair of (key, val) . Inefficient for large number of (key,val)s |
| src/pmi/smpd/smpd_ipmi.c | Switch to PMI v2 to recognize non-MPICH2 mpiexecs |
| src/pmi/smpd/smpd_ipmi.c | Reduce size of rank_str & size_str |
| src/pmi/smpd/smpd_ipmi.c | Make sure the newly created mpiexec process is also killed in the case of an error |
| src/pmi/smpd/smpd_ipmi.c | On failure do we have a local KVS ? |
| src/pmi/smpd/smpd_ipmi.c | Do we need a check for PMI_KVS to determine if client is singleton ? |
| src/pmi/smpd/smpd_ipmi.c | hack - Is there a better way ? |
| src/pmi/smpd/smpd_ipmi.c | Get rid of this hack - we already create local KVS for all singleton clients by default |
| src/pmi/smpd/smpd_ipmi.c | Test only for singleton init proc |
| src/pmi/smpd/smpd_ipmi.c | We don't support user environment infos for spawn() |
| src/pmi/smpd/smpd_ipmi.c | Should it be an error to call setup_name_service twice? |
| src/pmi/smpd/smpd_ipmi.c | Why is this func defined here ? - shouldn't this be in smpd_util*.lib ? |
| src/nameserv/file/file_nameserv.c | Determine if the directory exists before trying to create it |
| src/nameserv/file/file_nameserv.c | An error. Ignore most ? For example, ignore EEXIST? |
| src/nameserv/file/file_nameserv.c | This should use internationalization calls |