
1997 Abstracts of MCS Reports and Preprints


Reports

S. Balay, W. Gropp, L. C. McInnes, and B. Smith, "PETSc 2.0 Users Manual, Rev. 2.0.16," Technical Report ANL-95/11, February 1997. This manual describes the use of PETSc 2.0 for the numerical solution of partial differential equations and related problems on high-performance computers. The Portable, Extensible Toolkit for Scientific Computation (PETSc) is a suite of data structures and routines that provide the building blocks for the implementation of large-scale application codes on parallel (and serial) computers. PETSc 2.0 uses the MPI standard for all message-passing communication. PETSc includes an expanding suite of parallel linear and nonlinear equation solvers that may be used in application codes written in Fortran, C, and C++. PETSc provides many of the mechanisms needed within parallel application codes, such as simple parallel matrix and vector assembly routines that allow the overlap of communication and computation. In addition, PETSc includes growing support for distributed arrays. The library is organized hierarchically, enabling users to employ the level of abstraction that is most appropriate for a particular problem. By using techniques of object-oriented programming, PETSc provides enormous flexibility for users. PETSc is a sophisticated set of software tools; as such, for some users it initially has a much steeper learning curve than a simple subroutine library. In particular, for individuals without some computer science background or experience programming in C, Pascal, or C++, it may require a large amount of time to take full advantage of the features that enable efficient software use. However, the power of the PETSc design and the algorithms it incorporates make the efficient implementation of many application codes much simpler than ``rolling them'' yourself. For many simple (or even relatively complicated) tasks a package such as Matlab is often the best tool; PETSc is not intended for the classes of problems for which effective Matlab code can be written. Since PETSc is still under development, small changes in usage and calling sequences of PETSc routines will continue to occur. Although keeping one's code up to date can be somewhat annoying, all PETSc users will be rewarded in the long run with a cleaner, better designed, and easier-to-use interface.
Joseph Czyzyk, Sanjay Mehrotra, and Stephen J. Wright, "PCx User Guide," Technical Report ANL/MCS-TM-217, March 1997. We describe the code PCx, a primal-dual interior-point code for linear programming. Information is given about problem formulation and the underlying algorithm, along with instructions for installing, invoking, and using the code. Computational results on standard test problems are tabulated. The current version is 1.0.
John M. Herbert, "A General Formula for Rayleigh-Schrodinger Perturbation Energy Utilizing a Power Series Expansion of the Quantum Mechanical Hamiltonian," Technical Memorandum ANL/MCS-TM-222, February 1997. Perturbation theory has long been utilized by quantum chemists as a method for approximating solutions to the Schrodinger equation. Perturbation treatments represent a system's energy as a power series in which each additional term further corrects the total energy; it is therefore convenient to have an explicit formula for the nth-order energy correction term. If all perturbations are collected into a single Hamiltonian operator, such a closed-form expression for the nth-order energy correction is well known; however, use of a single perturbed Hamiltonian often leads to divergent energy series, while superior convergence behavior is obtained by expanding the perturbed Hamiltonian in a power series. This report presents a closed-form expression for the nth-order energy correction obtained using Rayleigh-Schrodinger perturbation theory and a power series expansion of the Hamiltonian.
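For orientation, a minimal sketch of the single-Hamiltonian case that the report above generalizes (standard textbook material, not the report's result): with all perturbations collected into one operator, H = H_0 + \lambda V, the familiar low-order Rayleigh-Schrodinger corrections are

    E_n^{(1)} = \langle \psi_n^{(0)} | V | \psi_n^{(0)} \rangle,
    E_n^{(2)} = \sum_{m \ne n} |\langle \psi_m^{(0)} | V | \psi_n^{(0)} \rangle|^2 / (E_n^{(0)} - E_m^{(0)}).

The report's closed-form expression is the analogue of these corrections, to arbitrary order n, when the Hamiltonian itself is expanded as a power series in the perturbation parameter.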
John Michalakes, "FLIC: A Translator for Same-Source Parallel Implementation of Regular Grid Applications," Technical Memorandum ANL/MCS-TM-223, February 1997. FLIC, a Fortran loop and index converter, is a parser-based source translation tool that automates the conversion of program loops and array indices for distributed-memory parallel computers. This conversion is important in the implementation of gridded models on distributed-memory computers because it allows for decomposition and shrinking of model data structures. FLIC does not provide the parallel services itself, but rather provides an automated and transparent mapping of the source code to calls or directives of the user's choice of run-time systems or parallel libraries. The amount of user-supplied input required by FLIC to direct the conversion is small enough to fit as command-line arguments to the tool. The tool requires no additional statements, comments, or directives in the source code, thus avoiding the pervasiveness and intrusiveness imposed by directives-based preprocessors and parallelizing compilers. FLIC is lightweight and suitable for use as a precompiler, and it facilitates a same-source approach to operability on diverse computer architectures. FLIC is targeted to new or existing applications that employ regular gridded domains, such as weather models, that will be parallelized by data-domain decomposition.
R. Thakur, E. Lusk, and W. Gropp, "Users Guide for ROMIO: A High-Performance, Portable MPI-IO Implementation," Technical Memorandum ANL/MCS-TM-234, October 1997. ROMIO is a high-performance, portable implementation of MPI-IO (the I/O chapter in MPI-2). This document describes how to install and use ROMIO version 1.0.0 on various machines.
M. Wenzel, J. Czyzyk, and S. Wright, "Computational Experience with a Dense Column Feature for Interior-Point Methods," Technical Memorandum ANL/MCS-TM-227, August 1997. Most software that implements interior-point methods for linear programming formulates the linear algebra at each iteration as a system of normal equations. This approach can be extremely inefficient when the constraint matrix has dense columns, because the normal equations matrix is then much denser than the constraint matrix and the system is expensive to solve. In this report we describe a more efficient approach for this case, which handles the dense columns by using a Schur-complement method and conjugate gradient iteration. We report numerical results with the code PCx, into which our technique has now been incorporated.
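A brief sketch of the standard splitting behind such a dense-column treatment (the generic idea; the report spells out the exact algorithm): partition the constraint matrix as A = [A_s A_d], where A_d holds the dense columns, and split the diagonal scaling matrix D conformally. The normal equations matrix then decomposes as

    A D A^T = A_s D_s A_s^T + A_d D_d A_d^T = M + U U^T,   with U = A_d D_d^{1/2},

so one can factor only the sparse part M and treat U U^T as a low-rank correction through a small Schur complement whose order equals the number of dense columns, with conjugate gradient iteration used to control the error introduced by this splitting.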
Larry Wos, "Experiments Concerning the Automated Search for Elegant Proofs," Technical Memorandum ANL/MCS-TM-221, July 1997. The research reported in this technical report was spawned by the request to find an elegant proof (of a theorem from Boolean algebra) to replace the known proof consisting of 816 deduced steps. The request was met by finding a proof consisting of 100 deduced steps. The methodology used to obtain the far shorter proof is presented in detail through a sequence of experiments. Although clearly not an algorithm, the methodology is sufficiently general to enable its use for seeking elegant proofs regardless of the domain of study. The methodology relies heavily on the assistance of McCune's automated reasoning program OTTER. Of the aspects of proof elegance, the main focus here is on proof length, with brief attention paid to the type of term present, the number of variables required, and the complexity of deduced steps. The methodology is iterative, relying heavily on the use of three strategies: the resonance strategy, the hotlist strategy, and McCune's ratio strategy. To provide some insight regarding the value of the methodology, I discuss its successful application to other problems from Boolean algebra and to problems from lattice theory. Research suggestions and challenges are also offered.
L. Wos, "Experiments with the Hot List Strategy," Technical Memorandum ANL/MCS-TM-232, October 1997. Experimentation strongly suggests that, for attacking deep questions and hard problems with the assistance of an automated reasoning program, the more effective paradigms rely on the retention of deduced information. A significant obstacle ordinarily presented by such a paradigm is the deduction and retention of one or more needed conclusions whose complexity sharply delays their consideration. To mitigate the severity of the cited obstacle, I formulated and feature in this report the hot list strategy. The hot list strategy asks the researcher to choose, usually from among the input statements, one or more clauses that are conjectured to play a key role for assignment completion. The chosen clauses---conjectured to merit revisiting, again and again---are placed in an input list of clauses, called the hot list. When an automated reasoning program has decided to retain a new conclusion C---before any other clause is chosen to initiate conclusion drawing---the presence of a nonempty hot list (with an appropriate assignment of the input parameter known as heat) causes each inference rule in use to be applied to C together with the appropriate number of members of the hot list. Members of the hot list are used to complete applications of inference rules and not to initiate applications. The use of the hot list strategy thus enables an automated reasoning program to briefly consider a newly retained conclusion whose complexity would otherwise prevent its use for perhaps many CPU-hours. To give evidence of the value of the strategy, I focus on four contexts: (1) dramatically reducing the CPU time required to reach a desired goal; (2) finding a proof of a theorem that had previously resisted all but the more inventive automated attempts; (3) discovering a proof that is more elegant than previously known; and (4) answering a question that had steadfastly eluded researchers relying on an automated reasoning program. I also discuss a related strategy, the dynamic hot list strategy (formulated by my colleague W. McCune), that enables the program during a run to augment the contents of the hot list. In the Appendix, I give useful input files and interesting proofs. Because of frequent requests to do so, I include for consideration challenge problems and questions whose answers are unknown, commentary on my approach to experimentation and research, and suggestions to guide one in the use of McCune's automated reasoning program OTTER.
Po-Ting Wu, Christian H. Bischof, and Paul D. Hovland, ``Using ADIFOR and ADIC to Provide Jacobians for the SNES Component of PETSc," Technical Memorandum ANL/MCS-TM-233, November 1997. The solution of large-scale nonlinear problems is important to many areas of computational science. The SNES component of PETSc provides a robust and flexible suite of numerical routines for solving such problems. These routines generally utilize the Jacobian matrix. We present a strategy for using ADIFOR or ADIC to assist in the development of a subroutine for computing this matrix. We illustrate this strategy using one of the PETSc example programs and four different approaches to computing the Jacobian via automatic differentiation.

Preprints

S. Balay, W. D. Gropp, L. C. McInnes, and B. F. Smith, "Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries," to appear in Modern Software Tools in Scientific Computing, E. Arge, A. M. Bruaset, and H. P. Langtangen, Eds., Birkhauser Press, 1997. Also Preprint  ANL/MCS-P634-0197, January 1997. Parallel numerical software based on the message-passing model is enormously complicated. This paper introduces a set of techniques to manage the complexity, while maintaining high efficiency and ease of use. The PETSc 2.0 package uses object-oriented programming to conceal the details of the message passing, without concealing the parallelism, in a high-quality set of numerical software libraries. In fact, the programming used by PETSc is also the most appropriate for NUMA shared-memory machines, since they require the same careful attention to memory hierarchies as do distributed-memory machines. Thus, the concepts discussed are appropriate for all scalable computing systems. The PETSc libraries provide many of the data structures and numerical kernels required for the scalable solution of PDEs, offering performance portability.
J. Abate, C. Bischof, L. Roh, and A. Carle, "Algorithms and Design for a Second-Order Automatic Differentiation Module," Proc. Int. Symp. on Symbolic and Algebraic Computing (ISSAC) '97, pp. 149-155, ACM, New York, 1997. Also Preprint ANL/MCS-P636-0197, April 1997. This article describes approaches to computing second-order derivatives with automatic differentiation (AD) based on the forward mode and the propagation of univariate Taylor series. Performance results are given that show the speedup possible with these techniques relative to existing approaches. We also describe a new source transformation AD module for computing second-order derivatives of C and Fortran codes and the underlying infrastructure used to create a language-independent translation tool.
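A minimal sketch of the Taylor-series mechanism mentioned in the abstract (standard automatic-differentiation material, not the module's implementation details): propagating a univariate Taylor polynomial in a direction v through f exposes a second-order directional derivative in the degree-two coefficient,

    f(x + t v) = f(x) + t \nabla f(x)^T v + (t^2/2) v^T \nabla^2 f(x) v + O(t^3),

so evaluating f on truncated series for a modest set of directions v (for example, the coordinate directions and their pairwise sums) recovers all entries of the Hessian by interpolation.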
C. H. Bischof and P.-T. Wu, "Time-Parallel Computation of Pseudo-Adjoints for a Leapfrog Scheme," Preprint ANL/MCS-P639-0197. The leapfrog scheme is a commonly used second-order difference scheme for solving differential equations. If Z(t) denotes the state of the system at time t, the leapfrog scheme computes the state at the next time step as Z(t+1) = H(Z(t),Z(t-1),W), where H is the nonlinear timestepping operator and W are parameters that are not time dependent. In this article, we show how the associativity of the chain rule of differential calculus can be used to compute a so-called adjoint x^T * (dZ(t)/d[Z(0),W]) efficiently in a parallel fashion. To this end, we (1) employ the reverse mode of automatic differentiation at the outermost level, (2) use a sparsity-exploiting incarnation of the forward mode of automatic differentiation to compute derivatives of H at every time step, and (3) exploit chain rule associativity to compute derivatives at individual time steps in parallel. We report on experimental results with a 2-D shallow-water equation model problem on an IBM SP parallel computer and a network of Sun SPARCstations.
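The chain-rule structure being exploited can be sketched as follows (simplified here to a one-step recurrence Z(t+1) = H(Z(t), W); the leapfrog scheme itself is two-step):

    x^T (dZ(T)/dZ(0)) = x^T J_T J_{T-1} \cdots J_1,   where J_t = \partial H / \partial Z evaluated at step t.

Because the product is associative, the local Jacobians J_t can be formed independently, and therefore in parallel, at each time step, while the final vector-matrix products are accumulated from the left as in the reverse mode of automatic differentiation.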
T. Disz, M. E. Papka, and R. Stevens, ``UbiWorld: An Environment Integrating Virtual Reality, Supercomputing, and Design," Preprint ANL/MCS-P641-0197. UbiWorld is a concept being developed by the Futures Laboratory group at Argonne National Laboratory that ties together the notion of ubiquitous computing (Ubicomp) with that of using virtual reality for rapid prototyping. The goal is to develop an environment where one can explore Ubicomp-type concepts without having to build real Ubicomp hardware. The basic notion is to extend object models in a virtual world by using distributed wide area heterogeneous computing technology to provide complex networking and processing capabilities to virtual reality objects.
W. McCune, "Solution of the Robbins Problem," J. of Automated Reasoning 19(3) (1997), pp. 263-276.  Also Preprint ANL/MCS-P642-0197, January 1997. In this article we show that the three equations known as commutativity, associativity, and the Robbins equation are a basis for the variety of Boolean algebras. The problem was posed by Herbert Robbins in the 1930s. The proof was found automatically by EQP, a theorem-proving program for equational logic. We present the proof and the search strategies that enabled the program to find the proof.
S. J. Wright, "Superlinear Convergence of a Stabilized SQP Method to a Degenerate Solution," Computational Optimization and Applications 11 (1998), pp. 253-275. Also Preprint ANL/MCS-P643-0297, March 1997. We describe a slight modification of the well-known sequential quadratic programming method for nonlinear programming that attains superlinear convergence to a primal-dual solution even when the Jacobian of the active constraints is rank deficient at the solution. We show that rapid convergence occurs even in the presence of the roundoff errors that are introduced when the algorithm is implemented in floating-point arithmetic.
E. Y. Bobrovnikova and S. A. Vavasis, ``Accurate Solution of Weighted Least Squares by Iterative Methods,'' Preprint  ANL/MCS-P644-0297, February 1997. We consider the weighted least-squares (WLS) problem with a very ill-conditioned weight matrix. Weighted least-squares problems arise in many applications including linear programming, electrical networks, boundary value problems, and structures. Because of roundoff errors, standard iterative methods for solving a WLS problem with ill-conditioned weights may not give the correct answer. Indeed, the difference between the true and computed solution (forward error) may be large. We propose an iterative algorithm, called MINRES-L, for solving WLS problems. The MINRES-L method is the application of MINRES, a Krylov-space method due to Paige and Saunders, to a certain layered linear system. Using a simplified model of the effects of roundoff error, we prove that MINRES-L gives answers with small forward error. We present computational experiments for some applications.
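For orientation, the underlying problem (the layered system and the MINRES-L details are in the preprint): with a diagonal weight matrix D whose entries may span many orders of magnitude, the WLS problem and its normal equations are

    min_x (Ax - b)^T D (Ax - b),   A^T D A x = A^T D b,

and it is the extreme ill-conditioning of A^T D A that makes the forward error of standard iterative methods unreliable in floating-point arithmetic.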
T. Disz, R. Olson, and R. Stevens, ``Performance Model of the Argonne Voyager Multimedia Server,'' Preprint ANL/MCS-P647-0297, July 1997. The Argonne Voyager Multimedia Server is being developed in the Futures Lab of the Mathematics and Computer Science Division at Argonne National Laboratory. As a network-based service for recording and playing multimedia streams, it is important that the Voyager system be capable of sustaining certain minimal levels of performance in order for it to be a viable system. In this article, we examine the performance characteristics of the server. As we examine the architecture of the system, we try to determine where bottlenecks lie, show actual vs. potential performance, and recommend areas for improvement through custom architectures and system tuning.
R. B. Lehoucq, ``Truncated QR Algorithms and the Solution of Large-Scale Eigenvalue Problems,'' Preprint ANL/MCS-P648-0297. The QR algorithm has emerged as the general-purpose method of choice for computing the Schur decomposition of a matrix. For most large eigenvalue problems, however, the QR algorithm cannot be used because of the explicit storage of the matrix and because often only the action of the matrix upon a vector (or group of vectors) is available. Typically, only a small number of eigenvalues and the associated invariant subspace are required. This article considers a truncated QR algorithm. We show that a truncated QR algorithm is a generalization of Sorensen's implicitly restarted Arnoldi method to block Arnoldi reductions. Moreover, implicitly restarting an Arnoldi reduction is simultaneous iteration with an implicit projection step to accelerate convergence to the invariant subspace of interest. This is a generalization of the Rayleigh-Ritz procedure on a block Krylov subspace for a non-Hermitian matrix. The moral of our story is that the large-scale eigenvalue problem is intimately involved with the dense one.
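For context, a minimal statement of the (single-vector) Arnoldi reduction that is being truncated and restarted; the block version discussed above replaces the vectors by blocks:

    A V_k = V_k H_k + f_k e_k^T,   V_k^T V_k = I_k,   V_k^T f_k = 0,

where H_k is a k x k upper Hessenberg matrix whose eigenvalues (the Ritz values) approximate eigenvalues of A. Implicit restarting applies shifted QR steps to the small dense matrix H_k, which is one way to see why the large-scale problem remains tied to the dense one.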
W. Gropp and E. Lusk, ``Sowing MPICH: A Case Study in the Dissemination of a Portable Environment for Parallel Scientific Computing,'' International Journal of Supercomputer Applications (to appear). Also Preprint ANL/MCS-P650-0297, March 1997. MPICH is an implementation of the MPI specification for a standard message-passing library interface. In this article we focus on the lessons learned from preparing MPICH for diverse parallel computing environments. These lessons include how to prepare software for configuration in unknown environments; how to structure software to absorb contributions by others; how to automate the preparation of man pages, Web Pages, and other documentation; how to automate prerelease testing for both correctness and performance; and how to manage the inevitable problem reports with a minimum of resources for support.
W. Gropp and E. Lusk, ``A High-Performance MPI Implementation on a Shared-Memory Vector Supercomputer,'' Parallel Computing (to appear). Also Preprint ANL/MCS-P651-0297, March 1997. In this article we recount the sequence of steps by which MPICH, a high-performance, portable implementation of the Message-Passing Interface (MPI) standard, was ported to the NEC SX-4, a high-performance parallel supercomputer. Each step in the sequence raised issues that are important for shared-memory programming in general and shed light on both MPICH and the SX-4. The result is a low-latency, very high bandwidth implementation of MPI for the NEC SX-4. In the process, MPICH was also improved in several general ways.
T. L. Disz, R. Olson, M. E. Papka, R. Stevens, M. Szymanski, and R. J. Firby, "Two Implementations of Shared Virtual Space Environments,''  Preprint ANL/MCS-P652-0297. While many issues in the area of virtual reality (VR) research have been addressed in recent years, the constant leaps forward in technology continue to push the field forward. VR research no longer is focused only on computer graphics, but instead has become even more interdisciplinary, combining the fields of networking, distributed computing, and even artificial intelligence. In this article we discuss some of the issues associated with distributed, collaborative virtual reality, as well as lessons learned during the development of two distributed virtual reality applications. 
T. Disz, I. Judson, R. Olson, and R. Stevens, ``The Argonne Voyager Multimedia Server,'' Preprint ANL/MCS-P653-0397. With the growing presence of multimedia-enabled systems, we will see an integration of collaborative computing concepts into the everyday environments of future scientific and technical workplaces. Desktop teleconferencing is in common use today, while more complex desktop teleconferencing technology that relies on the availability of multipoint (greater than two nodes) enabled tools is now starting to become available on PCs. A critical problem when using these collaboration tools is the inability to easily archive multistream, multipoint meetings and make the content available to others. Ideally one would like the ability to capture, record, playback, index, annotate, and distribute multimedia stream data as easily as we currently handle text or still image data. While the ultimate goal is still some years away, the Argonne Voyager project is aimed at exploring and developing media server technology needed to provide a flexible virtual multipoint recording/playback capability. In this article we describe the motivating requirements, architecture, implementation, operation, performance, and related work.
W. Gropp and J. J. More', ``Optimization Environments and the NEOS Server,'' Preprint ANL/MCS-P654-0397, March 1997. The Network-Enabled Optimization System (NEOS) is an Internet-based service that provides information, software, and problem-solving services for optimization. The main components of NEOS are the NEOS Guide and the NEOS Server. Additional information on the various services provided by NEOS can be obtained from the home page of the Optimization Technology Center http://www.mcs.anl.gov/home/otc/.
L. Wos, ``Programs That Offer Fast, Flawless, Logical Reasoning,'' Communications of the ACM (to appear). Also Preprint ANL/MCS-P655-0397, April 1997. Sherlock Holmes and Mr. Spock of Star Trek could reason logically and flawlessly, always. Some people you know have that ability, sometimes. Unfortunately, without perfect reasoning, diverse problems arise:
  • Bugs in computer programs. If a sort program places Sun before Intel, more than disappointment is experienced.
  • Flaws in chip design. One Pentium chip became famous because of a flaw.
  • Errors in mathematical proofs. Papers having a title of the form ``On an Error by MacLane'' are well remembered, but not with pleasure (at least by MacLane).

How can the likelihood of the various cited disasters be reduced? One answer is automated reasoning. (For further information on automated reasoning at Argonne National Laboratory, see http://www.mcs.anl.gov/home/mccune/ar/, which will give you all of the needed pointers for obtaining a versatile automated reasoning program, for new results, for various neat proofs, and for puzzles.)

V. E. Taylor, M. Huang, T. Canfield, R. Stevens, D. Reed, and S. Lamm, ``Performance Modeling of Interactive, Immersive Virtual Environments for Finite Element Simulations,'' Preprint ANL/MCS-P656-0397, May 1997. Interactive, immersive virtual environments allow observers to move freely about computer-generated 3D objects and to explore new environments. The effectiveness of these environments is dependent upon the graphics used to model reality and the end-to-end lag time (i.e., the delay between a user's action and the display of the result of that action). In this paper we focus on the latter issue, which has been found to be as important as frame rate for interactive displays. In particular, we analyze the components of lag time resulting from executing a finite element simulation on a multiprocessor system located in Argonne, Illinois, connected via ATM to the interactive visualization display located in San Diego, California. The primary application involves the analysis of a disk brake system that was demonstrated at the Supercomputing 1995 conference as part of the Information Wide Area Year (IWAY) project, which entailed the interconnection of various supercomputing centers via a high-bandwidth, limited-access ATM network. The results of the study indicate that the major components of the end-to-end lag are simulation, synchronization, and rendering times; the use of the ATM network resulted in the network time comprising only a small fraction of the end-to-end lag time.
L. Freitag and C. Ollivier-Gooch, ``Tetrahedral Mesh Improvement Using Swapping and Smoothing,'' Preprint ANL/MCS-P657-0497. Automatic mesh generation and adaptive refinement methods for complex three-dimensional domains have proven to be very successful tools for the efficient solution of complex application problems. These methods can, however, produce poorly shaped elements that cause the numerical solution to be less accurate and more difficult to compute. Fortunately, the shape of the elements can be improved through several mechanisms, including face- and edge-swapping techniques, which change local connectivity, and optimization-based mesh smoothing methods, which adjust mesh point location. We consider several criteria for each of these two methods and compare the quality of several meshes obtained by using different combinations of swapping and smoothing. Computational experiments show that swapping is critical to the improvement of general mesh quality and that optimization-based smoothing is highly effective in eliminating very small and very large angles. High-quality meshes are obtained in a computationally efficient manner by using optimization-based smoothing to improve only the worst elements and a smart variant of Laplacian smoothing on the remaining elements. Based on our experiments, we offer several recommendations for the improvement of tetrahedral meshes.
Yuan-Jye Jason Wu, ``Downdating a Rank-Revealing URV Decomposition,'' Preprint ANL/MCS-P658-0597, May 1997. The rank-revealing URV decomposition is a useful tool for the subspace-tracking problem in digital signal processing. Updating the decomposition is a stable process. However, downdating a rank-revealing URV decomposition can be unstable because the R factor is ill-conditioned. In this article, we review some existing downdating algorithms for the full-rank URV decomposition in the absence of the U factor and develop a new combined algorithm. The combined algorithm has the merits of low cost and no intermediate breakdown, so the downdate is always computable in floating-point arithmetic. For the rank-revealing URV decomposition, we develop a two-step method that applies full-rank downdating algorithms to the signal and noise parts separately without using hyperbolic rotations. We prove that Park and Olden's reduction algorithm and the combined algorithm have relational stability for both the full-rank and rank-revealing cases. We demonstrate the efficiency and accuracy of our combined algorithm on ill-conditioned problems.
J. Michalakes, ``MM90: A Scalable Parallel Implementation of the Penn State/NCAR Mesoscale Model (MM5),'' Preprint ANL/MCS-P659-0597, May 1997. This paper describes MM90, a parallel regional weather model based on the Penn State/NCAR MM5. Parallelization of finite differencing, horizontal interpolation, and nesting on distributed-memory (message-passing) computers is handled transparently by using the RSL library package. Fortran90 modules, derived data types, dynamic memory allocation, pointers, and recursion are used, making the code modular, flexible, extensible, and run-time configurable. The model can dynamically sense and correct load imbalances. The paper provides performance, scaling, and load-balancing data collected on the IBM SP2 computers at Argonne National Laboratory and NASA Ames Laboratory. Future work will address the impact of parallel modifications on existing modeling software; an approach using commercially available source translation software is described.
D. P. Diachin and J. A. Herzog, "Analytic Streamline Calculations on Linear Tetrahedra," 13th AIAA Computational Fluid Dynamics Conference, Snowmass, CO, 1997. Also Preprint ANL/MCS-P660-0597. Analytic solutions for streamlines within tetrahedra are used to define operators that accurately and efficiently compute streamlines. The method presented here is based on linear interpolation, and therefore produces exact results for linear velocity fields. In addition, the method requires less computation than the forward Euler numerical method. Results are presented that compare accuracy measurements of the method with forward Euler and fourth-order Runge-Kutta applied to both a linear and a nonlinear velocity field.
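As a sketch of why exact streamlines are available on linear elements (the generic fact underlying the paper; the paper's operators are built from it): within a tetrahedron carrying a linearly interpolated velocity u(x) = A x + c, the streamline satisfies the linear ODE

    dx/dt = A x + c,   x(0) = x_0,

whose closed-form solution is x(t) = e^{At} x_0 + \int_0^t e^{A(t-s)} c ds, so the path can be advanced exactly, up to evaluating functions of a 3 x 3 matrix, rather than by a truncated time-stepping scheme such as forward Euler.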
I. Foster, G. K. Thiruvathukal, and S. Tuecke, "Technologies for Ubiquitous Supercomputing: A Java Interface to the Nexus Communication System," Concurrency: Practice and Experience 9(11) (6/97), pp. 465-475, special issue, G. C. Fox, ed.  Also Preprint ANL/MCS-P661-0597. We use the term ubiquitous supercomputing to refer to systems that integrate low- and mid-range computing systems, advanced networks, and remote high-end computers with the goal of enhancing the computational power accessible from local environments.  Such systems promise to enable new applications in areas as diverse as smart instruments and collaborative environments.  However, they also demand tools for transporting code between computers and for establishing flexible, dynamic communication structures.  In this article, we propose that these requirements be satisfied by introducing Java classes that implement the global pointer and remote service request mechanisms defined by a communication library called Nexus.   Java supports transportable code; Nexus provides communication support and represents the core communication framework for Globus, a project building infrastructure for ubiquitous supercomputing.  We explain how this Nexus.Java library is implemented and illustrate its use with examples.
S. Fitzgerald, I. Foster, C. Kesselman, G. von Laszewski, W. Smith, and S. Tuecke, ``A Directory Service for Configuring High-Performance Distributed Computations,'' Proc. Sixth IEEE International Symposium on High Performance Distributed Computing, 1997. Also Preprint ANL/MCS-P662-0597, June 1997. High-performance execution in distributed computing environments often requires careful selection and configuration not only of computers, networks, and other resources but also of the protocols and algorithms used by applications. Selection and configuration in turn require access to accurate, up-to-date information on the structure and state of available resources. Unfortunately, no standard mechanism exists for organizing or accessing such information. Consequently, different tools and applications adopt ad hoc mechanisms, or they compromise their portability and performance by using default configurations. We propose a Metacomputing Directory Service that provides efficient and scalable access to diverse, dynamic, and distributed information about resource structure and state. We define an extensible data model to represent required information and present a scalable, high-performance, distributed implementation. The data representation and application programming interface are adopted from the Lightweight Directory Access Protocol; the data model and implementation are new. We use the Globus distributed computing toolkit to illustrate how this directory service enables the development of more flexible and efficient distributed computing services and applications.
J. G. Michalakes, ``RSL: A Parallel Runtime System Library for Regional Atmospheric Models with Nesting,'' Preprint ANL/MCS-P663-0597, July 1997. RSL is a parallel runtime system library developed at Argonne National Laboratory that is tailored to regular-grid atmospheric models with mesh refinement in the form of two-way interacting nested grids. RSL provides high-level stencil and interdomain communication, irregular domain decomposition, automatic local/global index translation, distributed I/O, and dynamic load balancing. RSL was used with Fortran90 to parallelize a well-known and widely used regional weather model, the Penn State/NCAR Mesoscale Model.
C. V. Rao, S. J. Wright, and J. B. Rawlings, ``Application of Interior-Point Methods to Model Predictive Control,'' Preprint ANL/MCS-P664-0597, July 1997. We present a structured interior-point method for the efficient solution of the optimal control problem in model predictive control (MPC). The cost of this approach is linear in the horizon length, compared with cubic growth for a naive approach. We use a discrete Riccati recursion to solve the linear equations efficiently at each iteration of the interior-point method, and show that we can expect this recursion to be numerically stable although it was motivated originally by structural rather than numerical considerations. We demonstrate the effectiveness of our approach by solving three process control problems.
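For reference, the structure that makes the linear-in-horizon cost possible (a generic statement of the MPC subproblem; the preprint's formulation includes inequality constraints handled by the interior-point method):

    min over u_0, ..., u_{N-1} of \sum_{k=0}^{N-1} (x_k^T Q x_k + u_k^T R u_k) + x_N^T P x_N
    subject to x_{k+1} = A x_k + B u_k, k = 0, ..., N-1.

The linear systems arising at each interior-point iteration are block banded in the stage index k, so a Riccati-like backward recursion solves them in O(N) work, in contrast to the O(N^3) cost of treating them as dense.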
T. L. Disz, M. Henderson, W. Nickless, R. Olson, M. E. Papka, and R. Stevens, "Argonne's Computing and Communications Infrastructure Futures Laboratory: Designing the Future," Preprint ANL/MCS-P665-0597, July 1997. The Futures Lab was founded within the Mathematics and Computer Science Division of Argonne National Laboratory in the fall of 1994. The goal of the lab is to develop new technology and systems to support collaborative science. In order to meet this goal, the lab is organized around three research areas: advanced networking, multimedia, and virtual environments.
W. Gropp and E. Lusk, ``Why Are PVM and MPI So Different?,'' Preprint   ANL/MCS-P667-0697, July 1997. PVM and MPI are often compared. These comparisons usually start with the unspoken assumption that PVM and MPI represent different solutions to the same problem. In this paper we show that, in fact, the two systems often are solving different problems. In cases where the problems do match but the solution chosen by PVM and MPI are different, we explain the reasons for the differences. Usually such differences can be traced to explicit differences in the goals of the two systems, their origins, or the relationship between their specifications and their implementations.
L. Freitag, M. Jones, and P. Plassmann, ``A Parallel Algorithm for Mesh Smoothing,'' Preprint ANL/MCS-P668-0697, July 1997. Maintaining good mesh quality during the generation and refinement of unstructured meshes in finite-element applications is an important aspect of obtaining accurate discretizations and well-conditioned linear systems. In this article, we present a mesh-smoothing algorithm based on nonsmooth optimization techniques and a scalable implementation of this algorithm. We prove that the parallel algorithm has a fast runtime bound and executes correctly under a PRAM computational model. We extend the PRAM algorithm to distributed-memory computers and report results for two- and three-dimensional simplicial meshes that demonstrate the efficiency and scalability of this approach for a number of different test cases. We also examine the effect of different architectures on the parallel algorithm and present results for the IBM SP supercomputer and an ATM-connected network of SPARC Ultras.
J. N. Lyness and S. Joe, "A Nonabstract Approach to Lattice Rule Canonical Forms," Preprint ANL/MCS-P669-0697, July 1997. The rank and invariants of a general lattice rule conventionally are defined in terms of the group-theoretic properties of the rule. Here we give a nonabstract definition of the rank and invariants by exploiting what we term the Sylow p-decomposition of a lattice rule. This decomposition allows a canonical D-Z form to be calculated for any lattice rule. A new set of necessary and sufficient conditions for recognizing a canonical form is given.
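For readers unfamiliar with the notation, a D-Z form expresses an s-dimensional lattice rule as (a standard definition; conventions for the invariants vary slightly between authors)

    Qf = \frac{1}{d_1 d_2 \cdots d_m} \sum_{j_1=1}^{d_1} \cdots \sum_{j_m=1}^{d_m} f( \{ \sum_{i=1}^{m} (j_i / d_i) z_i \} ),

where the braces denote taking the fractional part of each component of the argument. In a canonical form the integers d_i (the invariants) satisfy d_{i+1} | d_i, and the number of invariants exceeding 1 is the rank of the rule.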
J. M. Herbert and W. C. Ermler, "Symbolic Implementation of Arbitrary-order Perturbation Theory using Computer Algebra: Application to Vibrational-Rotational Analysis of Diatomic Molecules," Computers Chem. 22, nos. 2-3 (1998), pp. 169-184.  Also Preprint ANL/MCS-P671-0597.

The file is not available here.  Contact beumer@mcs.anl.gov for a reprint.

Theoretical details necessary to calculate arbitrary-order correction terms to vibrational-rotational energies and wave functions in Rayleigh-Schroedinger perturbation theory are presented. Since manual derivation of high-order perturbation formulae is not feasible due to the lengthy algebra involved, the commercial computer algebra software Mathematica® is employed to perform the symbolic manipulations necessary to derive the requisite correction formulae in terms of universal constants, molecular constants, and quantum numbers. Correction terms through sixth order for ^1\Sigma diatomic molecules are derived and then evaluated for H_2, HD, N_2, CO, and HF. It is thus possible, with the aid of computer-generated algebra, to apply arbitrarily high-order perturbation theory successfully to the problem of intramolecular nuclear motion.
Michael Batdorf, Lori A. Freitag, and Carl Ollivier-Gooch, "A Computational Study of the Effect of Unstructured Mesh Quality on Solution Efficiency," Preprint ANL/MCS-P672-0697, July 1997. It is well known that mesh quality affects both efficiency and accuracy of CFD solutions. Meshes with distorted elements make solutions both more difficult to compute and less accurate. We review a recently proposed technique for improving mesh quality as measured by element angle (dihedral angle in three dimensions) using a combination of optimization-based smoothing techniques and local reconnection schemes. Typical results that quantify mesh improvement for a number of application meshes are presented. We then examine effects of mesh quality as measured by the maximum angle in the mesh on the convergence rates of two commonly used CFD solution techniques. Numerical experiments are performed that quantify the cost and benefit of using mesh optimization schemes for incompressible flow over a cylinder and weakly compressible flow over a cylinder.
P. Hovland and M. Heath, ``Adaptive SOR: A Case Study in Automatic Differentiation of Algorithm Parameters,'' Preprint ANL/MCS-P673-0797, July 1997. Many algorithms make use of one or more parameters to control the behavior of the algorithm. Examples include the damping factor \alpha in a damped Newton method or the relaxation parameter \omega in a successive over-relaxation (SOR) iterative solver. The optimal value of such parameters is problem dependent and difficult to determine for most problems. We describe the use of automatic differentiation (AD) to adjust algorithm parameters toward optimal behavior. We demonstrate how AD can be used to transform an SOR solver with fixed \omega into an adaptive SOR solver that adjusts \omega toward its optimal value. We provide experimental evidence that for large problems with lax convergence criteria such an adaptive solver may converge faster than a solver using an optimal, but fixed, value for \omega. These are exactly the conditions that apply when SOR is used as a preconditioner, its most common use in modern scientific computing.
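A minimal sketch of the fixed-\omega SOR sweep that such a tool would differentiate (illustrative C code, not the authors' implementation; applying automatic differentiation to this routine with respect to omega yields the derivative information used to adapt the parameter):

    #include <stddef.h>

    /* One SOR sweep for the n x n system A x = b (dense storage, row-major).
       The relaxation parameter omega is the quantity the adaptive variant
       adjusts; automatic differentiation of this routine with respect to
       omega yields dx/domega, which can drive that adjustment. */
    void sor_sweep(size_t n, const double *A, const double *b,
                   double *x, double omega)
    {
        for (size_t i = 0; i < n; i++) {
            double sigma = 0.0;
            for (size_t j = 0; j < n; j++)
                if (j != i)
                    sigma += A[i*n + j] * x[j];
            double gs = (b[i] - sigma) / A[i*n + i];   /* Gauss-Seidel value */
            x[i] = (1.0 - omega) * x[i] + omega * gs;  /* relaxed update     */
        }
    }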
A. C. Morlet, ``Further Properties of a Continuum of Model Equations with Globally Defined Flux," Preprint ANL/MCS-P678-0897, August 1997. To develop an understanding of singularity formation in vortex sheets, we consider model equations that exhibit shared characteristics with the vortex sheet equation but are slightly easier to analyze. A model equation is obtained by replacing the flux term in Burgers' equation by alternatives that contain contributions depending globally on the solution. We consider the continuum of partial differential equations u_t = \theta(H(u)u)_x + (1-\theta)H(u)u_x + vu_{xx}, 0 <= \theta <= 1, v >= 0, where H(u) is the Hilbert transform of u. We show that when \theta = 1/2, for v > 0, the solution of the equation exists for all time and is unique. We also show with a combination of analytical and numerical means that the solution when \theta = 1/2 and v > 0 is analytic. Using a pseudo-spectral method in space and the Adams-Moulton fourth-order predictor-corrector in time, we compute the numerical solution of the equation with \theta = 1/2 for various viscosities. The results confirm that for v = 0 and \theta = 1/2, the solution becomes singular in finite time and that finite viscosity prevents singularity formation. We also present, for a certain class of initial conditions, solutions of the equation, with 0 < \theta < 1/3 and \theta = 1, that become infinite for v >= 0 in finite time.
Y. Zhang, C. H. Bischof, R. C. Easter, and P.-T. Wu, "Sensitivity Analysis of a Mixed-Phase Chemical Mechanism using Automatic Differentiation," Preprint ANL/MCS-P680-0897, January 1998. A mixed-phase chemistry box model is applied to study heterogeneous chemistry and its effect on tropospheric gas-phase chemistry, particularly on photochemical production of O_3 and photochemical indicators for O_3-NO_x-hydrocarbon sensitivity, under a variety of atmospheric conditions ranging from remote marine to heavily polluted atmospheres. A subsequent sensitivity analysis of the mixed-phase chemical mechanism is conducted using the novel automatic differentiation tool ADIFOR, which calculates the local sensitivities of species concentrations in the gas, aqueous, and aerosol phases with respect to a variety of model parameters. The main chemical reaction pathways in all phases, interfacial mass transfer processes, and ambient physical parameters that affect tropospheric O_3 formation, O_3-precursor relations, and photochemical indicators under all modeled conditions are identified and analyzed. The results show that the presence of clouds and aerosols not only reduces many gas-phase species concentrations and the total oxidizing capacity but also alters O_3-precursor relations. Cloud chemistry is the dominant heterogeneous process under the remote and marine atmospheres. Aerosols are important scavengers for gaseous species in polluted atmospheres when the total aerosol surface area is larger than 1000 \mu m^2 cm^{-3}. Decreases in concentrations and formation rates of O_3 can be up to 27% and 100%, respectively, depending on the initial atmospheric conditions and preexisting surfaces. The significant decreases in photochemical O_3 formation are primarily caused by the aqueous-phase reactions of O_2^- with dissolved HO_2 and O_3 under most cloudy conditions and heterogeneous uptake of O_3, HCHO, HO_2, H_2O_2, and NO_2 in the presence of aerosols. Heterogeneous chemical processes also affect the photochemical indicators and their sensitivities to many model parameters in a variety of ways.
S. Wright, "On the Convergence of the Newton/Log-Barrier Method," Preprint ANL/MCS-P681-0897, August 1997. In the Newton/log-barrier method, Newton steps are taken for the log barrier function for a fixed value of the barrier parameter until a certain convergence criterion is satisfied. The barrier parameter is then decreased and the Newton process is repeated. A naive analysis indicates that Newton's method does not exhibit superlinear convergence to the minimizer of each instance of the log-barrier function until it reaches a very small neighborhood of the minimizer. By partitioning according to the subspace of active constraint gradients, however, we show that this neighborhood is actually quite large, thus explaining why reasonably fast local convergence can be attained in practice. Finally, we show that the overall convergence rate of the Newton/log-barrier algorithm is superlinear in the number of function/derivative evaluations, provided that the nonlinear program is formulated with a linear objective and the schedule for decreasing the barrier parameter is related in a certain way to the convergence criterion for each Newton process.
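For reference, the function whose minimizers are tracked (standard log-barrier notation; the preprint states its precise assumptions): for the problem of minimizing f(x) subject to c_i(x) >= 0, the log-barrier function with parameter \mu > 0 is

    B(x; \mu) = f(x) - \mu \sum_i \ln c_i(x),

and the method applies Newton's method to B(. ; \mu) for a fixed \mu until a convergence criterion is met, then decreases \mu and repeats.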
C.-J. Lin and J. J. More', "Incomplete Cholesky Factorizations with Limited Memory," SIAM J. on Scientific Computing 21(1) (1999), pp. 24-45.  Also Preprint ANL/MCS-P682-0897, August 1997. We propose an incomplete Cholesky factorization for the solution of large-scale trust region subproblems and positive definite systems of linear equations. This factorization depends on a parameter p that specifies the amount of additional memory (in multiples of n, the dimension of the problem) that is available; there is no need to specify a drop tolerance. Our numerical results show that the number of conjugate gradient iterations and the computing time are reduced dramatically for small values of p. We also show that in contrast with drop tolerance strategies, the new approach is more stable in terms of number of iterations and memory requirements.
H. G. Kaper, B. Wang, and S. Wang, "Determining Nodes for the Ginzburg-Landau Equations of Superconductivity," Discrete and Continuous Dynamical Systems 4, no. 2 (April 1998), pp. 205-224.  Also Preprint  ANL/MCS-P683-0897, September 1997. It is shown that a solution of the time-independent Ginzburg-Landau equations of superconductivity is determined completely and exactly by its values at a finite but sufficiently dense set of determining nodes in the domain. If the applied magnetic field is time dependent and asymptotically stationary, the large-time asymptotic behavior of a solution of the time-dependent Ginzburg-Landau equations of superconductivity is determined similarly by its values at a finite set of determining nodes, whose positions may vary with time.
L. Wos with G. W. Pieper, "The Hot List Strategy," J. Automated Reasoning 22 (1999), pp. 1-44.  Also Preprint  ANL/MCS-P684-0897. Experimentation strongly suggests that, for attacking deep questions and hard problems with the assistance of an automated reasoning program, the more effective paradigms rely on the retention of deduced information. A significant obstacle ordinarily presented by such a paradigm is the deduction and retention of one or more needed conclusions whose complexity sharply delays their consideration. To mitigate the severity of the cited obstacle, I formulated and feature in this article the hot list strategy. The hot list strategy asks the researcher to choose, usually from among the input statements characterizing the problem under study, one or more statements that are conjectured to play a key role for assignment completion. The chosen statements---conjectured to merit revisiting, again and again---are placed in an input list of statements, called the hot list. When an automated reasoning program has decided to retain a new conclusion C---before any other statement is chosen to initiate conclusion drawing---the presence of a nonempty hot list (with an appropriate assignment of the input parameter known as heat) causes each inference rule in use to be applied to C together with the appropriate number of members of the hot list. Members of the hot list are used to complete applications of inference rules and not to initiate applications. The use of the hot list strategy thus enables an automated reasoning program to briefly consider a newly retained conclusion whose complexity would otherwise prevent its use for perhaps many CPU-hours. To give evidence of the value of the strategy, I focus on four contexts: (1) dramatically reducing the CPU time required to reach a desired goal, (2) finding a proof of a theorem that had previously resisted all but the more inventive automated attempts, (3) discovering a proof that is more elegant than previously known, and (4) answering a question that had steadfastly eluded researchers relying on an automated reasoning program. I also discuss a related strategy, the dynamic hot list strategy (formulated by my colleague W. McCune), that enables the program during a run to augment the contents of the hot list. In the Appendix, I give useful input files and interesting proofs. Because of frequent requests to do so, I include challenge problems to consider, commentary on my approach to experimentation and research, and suggestions to guide one in the use of McCune's automated reasoning program OTTER.
Anne C. Morlet, "An Improved Model Equation with Globally Defined Flux for the Vortex Sheet Equation: Analytical Results," Preprint ANL/MCS-P685-0997, September 1997. We present an improved model for the vortex sheet equation that combines some of the features of the models of Beale and Schaeffer, Dhanak, and Baker et al. We regularize the Beale-Schaeffer equation with a second-order viscous regularizing term, and we add a globally defined flux term in conservative form. We obtain u_t^v + iu_x^v = [H(u^v)u^v]_x + [|u_x^v|^2 u_x^v]_x + vu_{xx}^v, where i^2 = -1, and H(u^v) is the Hilbert transform of u^v. We derive bounds for the solution of the equation and its first-order spatial derivatives in L^2 and in the maximum norm, independent of v. We show that the function u_t^v satisfies an L^2 norm bound that depends linearly on v; all the other derivatives satisfy bounds that depend on negative powers of v. We show that, for v > 0, the solution exists and is unique. We also prove that, for v > 0, in the limit of v -> 0, the sequence of functions (u^v)_{v>0} has a weak limit; the weak limit may not be unique.
P. Hovland, B. Mohammadi, and C. Bischof, "Automatic Differentiation and Navier-Stokes Computations," Preprint ANL/MCS-P687-0997, October 1997. We describe the use of automatic differentiation (AD) to enhance a compressible Navier-Stokes model. Within the solver, AD is used to accelerate convergence by more than an order of magnitude. Outside the solver, AD is used to compute the derivatives needed for optimization. We emphasize the potential for performance gains if the programmer does not treat AD as a black box, but instead utilizes high-level knowledge about the nature of the application.
R. B. Lehoucq, S. K. Gray, D.-H. Zhang, and J. C. Light, "Vibrational Eigenstates of Four-Atom Molecules: A Parallel Strategy Employing the Implicitly Restarted Lanczos Method," Preprint ANL/MCS-P689-0997, September 1997. We present an approach for determining the vibrational eigenstates of four-atom molecules. The primary representation of the (six-dimensional) eigenstates involves a finite basis or quantum number representation, whereas Hamiltonian matrix-vector products are evaluated with the aid of certain grid or discrete variable representations. This approach leads to computational and memory demands that are within acceptable limits. The implicitly restarted Lanczos method, as implemented in the ARPACK suite of codes, is then applied to determine some of the corresponding vibrational eigenstates. A distributed-memory parallel implementation of the method allows very large symmetric matrix eigenvalue problems--on the order of N = 2x10^6--to be tackled. The lowest fifty vibrational states of the HOCO molecule, with zero total angular momentum and even parity, are accurately computed on Argonne National Laboratory's IBM SP computer.
ANL/MCS-P690-0897. Superseded by ANL/MCS-P738-0199.
M. Tobis, I. T. Foster, C. M. Schafer, R. L. Jacob, and J. R. Anderson, "FOAM: Expanding the Horizons of Climate Modeling," Preprint ANL/MCS-P691-0997, October 1997.

The file is not available here. Contact beumer@mcs.anl.gov  for a hard copy of the preprint. 

We report here on a project that expands the applicability of dynamic climate modeling to very long time scales. The Fast Ocean Atmosphere Model (FOAM) is a coupled ocean atmosphere model that incorporates physics of interest in understanding decade to century time scale variability. It addresses the high computational cost of this endeavor with a combination of improved ocean model formulation, low atmosphere resolution, and efficient coupling. It also uses message passing parallel processing techniques, allowing for the use of cost effective distributed memory platforms. The resulting model runs over 6000 times faster than real time with good fidelity, and has yielded significant results.
R. Sheng and F. A. Potra, "Nonsymmetric Search Directions for Semidefinite Programming," Preprint ANL/MCS-P692-0997, September 1997. Two nonsymmetric search directions for semidefinite programming, the XZ and ZX search directions, are proposed. They are derived from a nonsymmetric formulation of the semidefinite programming problem. The XZ direction corresponds to the direct linearization of the central path equation XZ = vI, while the ZX direction corresponds to ZX = vI. The XZ and ZX directions are well defined if both X and Z are positive definite matrices, where X may be nonsymmetric. We present an algorithm using the XZ and ZX directions alternately following the Mehrotra predictor-corrector framework. Numerical results show that the XZ/ZX algorithm is, in most cases, faster than the XZ+ZX method of Alizadeh, Overton, and Haeberly (AHO) while achieving similar accuracy.
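A compact way to see where the two directions come from (a sketch consistent with the abstract, not the full derivation): applying Newton's method directly to the central-path equation XZ = vI gives the linearized system

    \Delta X Z + X \Delta Z = v I - X Z,

whose solution defines the XZ direction; linearizing ZX = vI instead gives the ZX direction. Neither linearization forces the update \Delta X to be symmetric, which is why the resulting search directions are nonsymmetric and why X itself is allowed to be nonsymmetric in this formulation.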
R. Overbeek, N. Larsen, N. Maltsev, G. D. Pusch, and E. Selkov, "Metabolic Reconstruction Using the WIT/WIT2 Systems," in Bioinformatics: Databases and Systems, S. I. Letovsky, ed., Kluwer Academic (1999). Also Preprint ANL/MCS-P694-0997. For the past few years, we have been developing metabolic reconstructions for organisms that have been sequenced, and we have made a number of these working models available. By the term metabolic reconstruction we mean the process of inferring the metabolism of an organism from its genetic sequence data supplemented by known biochemical and phenotypic data. Our initial software system to support metabolic reconstruction was called WIT (for "What Is There?") and has been in use since mid-1995 (http://www.cme.msu.edu/WIT/). Recently, a second system, which we have called WIT2, has been made available (http://www.mcs.anl.gov/home/overbeek/WIT2/CGI/user.cgi). In this chapter we discuss the central design issues in constructing such systems, along with the basic steps that must be supported by any such system.
R. Jacob, J. Anderson, M. Tobis, I. Foster, and C. Schafer, "Decadal and Century Scale Variability in the FOAM Coupled GCM," Preprint ANL/MCS-P695-1097, November 1997.

The file is not available here. Contact beumer@mcs.anl.gov  for a hard copy of the preprint.

FOAM (the Fast Ocean-Atmosphere Model) is a coupled ocean-atmosphere model intended for the explicit dynamic simulation of climatic phenomena on time scales of decades to centuries. The principal computational strategies used are massive parallelism through domain decomposition, efficient algorithmic implementation (particularly of ocean dynamics), and some sacrifice of spatial resolution for temporal span. (Production runs are implemented at R15 resolution.) A closed hydrological cycle including a river runoff model is implemented, conserving fresh water in the model system, preventing salinity drift, and allowing for phenomena linking the ocean circulation through atmospheric precipitation to freshwater influx forcing of the salinity field. The resulting non-flux-corrected closed-hydrology model has been run for a century of model time. Preliminary results from that initial run are presented here.
W. McCune, "Automatic Proofs and Counterexamples for Some Ortholattice Identities," Information Processing Letters 65 (1998), pp. 285-291. Also Preprint ANL/MCS-P697-1097, October 1997. This note answers questions on whether three identities known to hold for orthomodular lattices are true also for ortholattices. One identity is shown to fail by MACE, a program that searches for counterexamples, and the other two are proved to hold by EQP, an equational theorem prover. The problems, from work in quantum logic, were given to us by Norman Megill.
C. Bischof, L. Roh, N. Chang, K. Lee, V. Kanevsky, O. S. Nakagawa, and S.-Y. Oh, "Statistical On-Chip Interconnect Modeling: An Application of Automatic Differentiation," Preprint ANL/MCS-P698-1097, May 1998. Automatic differentiation is a technique for computing derivatives accurately and efficiently with minimal human effort.  We employed this technique to generate derivative information of FCAP2 (2-D) and FCAP3 (3-D) programs that simulate the parasitic effects of interconnects and devices.  This derivative information is used in the statistical modeling of worst-case interconnect delays and on-chip cross talks.  The ADIC (Automatic Differentiation in C) tool generated new versions of FCAP2 and FCAP3 programs that compute both the original results and the derivative information.  Given the ANSI C source code for the function, ADIC generates new code that computes derivatives of the model output with respect to the input parameters.  We report on the use of automatic differentiation and divided difference approaches for computing derivatives for FCAP3 programs.  The results show that ADIC-generated code computes derivatives more accurately, more robustly, and faster than the divided difference approach.
R. Thakur, E. Lusk, and W. Gropp, "I/O in Parallel Applications: The Weakest Link," to appear in special issue "I/O in Parallel Applications" of International Journal of Supercomputer Applications and High Performance Computing, 1998. Also Preprint ANL/MCS-P700-1197, November 1997. Parallel computers are increasingly being used to run large-scale applications that also have huge I/O requirements. However, many applications obtain poor I/O performance on modern parallel machines. This special issue of IJSA contains papers that describe the I/O requirements and the techniques used to perform I/O in real parallel applications. We first explain how the I/O application program interface (API) plays a critical role in enabling such applications to achieve high I/O performance. We describe how the commonly used Unix I/O interface is inappropriate for parallel I/O and how an explicitly parallel API with support for collective I/O can help the underlying I/O hardware and software perform I/O efficiently. We then describe MPI-IO, a recently defined, standard, portable API specifically designed for high-performance parallel I/O. We conclude with an overview of the papers in this special issue.
B. F. Smith, "An Interface for Efficient Vector Scatters and Gathers on Parallel Machines," Preprint ANL/MCS-P701-1197, December 1997. Scatter and gather type operations form the heart of most communication kernels in parallel partial differential equation solvers. This paper introduces a simple interface for defining and applying scatter and gather operations on both distributed and shared memory computers. A key feature of the interface is that it allows efficient implementations that can take advantage of underlying structure in the indexing of the scatters and gathers, such as striped indexing or indexing of blocks of data. A discussion of an implementation using MPI in the software package PETSc (the Portable, Extensible Toolkit for Scientific computation) is included. The interface is fully usable from Fortran 77, C, and C++.
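A minimal sketch of what such an interface can look like (hypothetical C names and signatures chosen for illustration; the actual interface and its PETSc implementation are described in the paper):

    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical scatter interface: a plan is created once from the source
       and destination index sets, so repeated applications can detect and
       reuse structure in the indices (strides, contiguous blocks, ...).
       This sequential reference version just copies; a parallel version
       would post sends in scatter_begin() and complete receives in
       scatter_end(), allowing overlap with unrelated computation. */
    typedef struct {
        int  n;
        int *src_idx;
        int *dst_idx;
    } ScatterPlan;

    ScatterPlan *scatter_create(int n, const int *src_idx, const int *dst_idx)
    {
        ScatterPlan *p = malloc(sizeof *p);
        p->n = n;
        p->src_idx = malloc(n * sizeof *p->src_idx);
        p->dst_idx = malloc(n * sizeof *p->dst_idx);
        memcpy(p->src_idx, src_idx, n * sizeof *p->src_idx);
        memcpy(p->dst_idx, dst_idx, n * sizeof *p->dst_idx);
        return p;
    }

    void scatter_begin(ScatterPlan *p, const double *src, double *dst)
    {
        for (int i = 0; i < p->n; i++)      /* dst[dst_idx[i]] = src[src_idx[i]] */
            dst[p->dst_idx[i]] = src[p->src_idx[i]];
    }

    void scatter_end(ScatterPlan *p, const double *src, double *dst)
    {
        (void)p; (void)src; (void)dst;      /* nothing left to finish sequentially */
    }

    void scatter_destroy(ScatterPlan *p)
    {
        free(p->src_idx);
        free(p->dst_idx);
        free(p);
    }

Splitting the application into begin and end phases is one natural way to let an implementation overlap communication with computation, in the spirit of the overlap mentioned in the PETSc users manual abstract above.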