
Quarterly Newsletter, April 2022

April 26, 2022 by carns

New tools

Mochi-json-vis
- https://github.com/mochi-hpc/mochi-json-vis
- A command-line tool that can be used to generate a visual representation of a Mochi Bedrock configuration.
- This can be helpful to sanity check or better understand service configuration details such as the mapping of providers to execution streams.
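
As a rough illustration of the kind of mapping such a visualization makes visible, the sketch below walks a small, made-up configuration and lists which execution streams serve each provider. It is not how mochi-json-vis works internally. The JSON layout (pools and xstreams under "margo"/"argobots", providers referencing a pool by name) is an assumption about the general shape of a Bedrock configuration, and every name in it is hypothetical.

```python
import json

# Hypothetical, trimmed-down configuration. The layout is an assumption about
# the general shape of a Bedrock JSON document; all names are made up.
CONFIG_TEXT = """
{
  "margo": {
    "argobots": {
      "pools": [{"name": "__primary__"}, {"name": "io_pool"}],
      "xstreams": [
        {"name": "__primary__", "scheduler": {"pools": ["__primary__"]}},
        {"name": "es1", "scheduler": {"pools": ["io_pool"]}},
        {"name": "es2", "scheduler": {"pools": ["io_pool"]}}
      ]
    }
  },
  "providers": [
    {"name": "kv", "type": "yokan", "provider_id": 1, "pool": "io_pool"},
    {"name": "admin", "type": "example", "provider_id": 2, "pool": "__primary__"}
  ]
}
"""


def providers_to_xstreams(config):
    """Map each provider name to the sorted names of xstreams serving its pool."""
    argobots = config.get("margo", {}).get("argobots", {})

    # Invert the xstream -> pools relation so pools can be looked up directly.
    pool_to_xstreams = {}
    for xstream in argobots.get("xstreams", []):
        for pool in xstream.get("scheduler", {}).get("pools", []):
            pool_to_xstreams.setdefault(pool, []).append(xstream["name"])

    return {
        provider["name"]: sorted(pool_to_xstreams.get(provider.get("pool"), []))
        for provider in config.get("providers", [])
    }


mapping = providers_to_xstreams(json.loads(CONFIG_TEXT))
for provider, xstreams in mapping.items():
    print(f"{provider} -> {', '.join(xstreams) or '(no xstream)'}")
```

A provider whose pool is not scheduled on any execution stream shows up as "(no xstream)", which is the sort of misconfiguration a visual check is meant to catch.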

Software updates

Mochi-thallium 0.10.1 (C++ bindings to Mochi)
- Adds support for timer_callback
- Adds logger class and logging functionality
- Adds access to margo’s underlying configuration, pools, and xstreams
Mochi-bedrock 0.4.1 (service configuration framework)
- Ability to initialize the server with a JX9 script instead of a JSON configuration

Publications

Matthieu Dorier, Zhe Wang, Utkarsh Ayachit, Shane Snyder, Robert Ross, Manish Parashar. “Colza: Enabling Elastic In Situ Visualization for High-performance Computing Simulations.” In Proceedings of the 36th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2022) (TO APPEAR).
Bradley Settlemyer, George Amvrosiadis, Philip Carns, and Robert Ross. “It’s time to talk about HPC storage: Perspectives on the past and future.” Computing in Science & Engineering, 23(6):63–68, 2021. https://ieeexplore.ieee.org/document/9658238

Upcoming events

Building Custom Data Services with Mochi (public BoF)

May 12th, 11:00 AM eastern time

We will provide general updates on the Mochi project, highlight key capabilities related to service composition and key/value stores, and share work from guest speakers about the Mochi messaging layer and successful Mochi use cases:

Mercury: platform updates and optimizations for RPC and RDMA communication (Jerome Soumagne, The HDF Group)
Chimbuko: scalable application performance analysis and provenance (Chris Kelly, Brookhaven National Laboratory)
DataSpaces: extreme-scale data management framework (Philip Davis, University of Utah)

To register, follow this link, expand the Mochi BoF description, and click “Register” — this should provide you with a Zoom link to attend the BoF: https://www.exascaleproject.org/event/ecp-community-bof-days-2022/

Mochi BoF at the ECP Community BoF Days, May 12, 2022

April 25, 2022 by carns

We would like to invite everyone to attend a virtual Mochi BoF session as part of the ECP Community BoF Days:

Building Custom Data Services with Mochi
May 12th, 11:00 AM eastern time

Mercury: platform updates and optimizations for RPC and RDMA communication (Jerome Soumagne, The HDF Group)
Chimbuko: scalable application performance analysis and provenance (Chris Kelly, Brookhaven National Laboratory)
DataSpaces: extreme-scale data management framework (Philip Davis, University of Utah)

Thanks!
–Mochi team

Quarterly newsletter, January 2022

January 27, 2022 by carns

New microservices

Mochi-quintain
- https://github.com/mochi-hpc/mochi-quintain
- Includes a provider that can be embedded in other services, via Mochi-bedrock or other means, to provide synthetic workload testing capability (i.e., “self-test”)
- Includes an MPI benchmark that can be used to issue parameterized RPCs to the quintain provider from a large number of concurrent clients to measure response times from a heavily loaded server
- Some preliminary plotting tools to help understand response time distributions and tail latency

Example distribution of response times for a Quintain provider under load.
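
To make "tail latency" concrete, here is a minimal sketch of summarizing a set of response times with nearest-rank percentiles. It is not taken from the Quintain plotting tools. The sample values are made up, and the helper names (percentile, summarize) are hypothetical.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples <= it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(samples):
    """Return the mean plus a few percentiles that describe the tail."""
    return {
        "mean": statistics.mean(samples),
        "p50": percentile(samples, 50),
        "p99": percentile(samples, 99),
        "max": percentile(samples, 100),
    }


# Made-up response times in milliseconds: 98 fast requests and 2 slow ones.
response_times_ms = [1.0] * 98 + [50.0, 200.0]
summary = summarize(response_times_ms)
print(summary)
```

In this made-up data set the median stays at the fast-path value while the 99th percentile and maximum are far larger, which is why a loaded server is better characterized by its distribution than by a single average.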

Software updates

Mochi-ssg 0.5.2
- Several important API changes since 0.4.x
- https://lists.mcs.anl.gov/pipermail/mochi-devel/2021-December/000147.html
- Error handling, API clarity, and utility functions
Mochi-margo 0.9.7
- Bug fixes and a new user-facing timer API
Yokan 0.2
- https://lists.mcs.anl.gov/pipermail/mochi-devel/2021-December/000146.html
- Document storage, user-defined filters, and new execution modes
Mercury 2.1.0
- A variety of enhancements and fixes
- Support for the UCX transport library

Platform support

Slingshot NIC support
- The first systems are coming online with Slingshot NICs and a corresponding libfabric CXI driver.
- https://docs.olcf.ornl.gov/systems/crusher_quick_start_guide.html
- We are actively testing the Mochi stack there

Publications (updated)

Srinivasan Ramesh, Robert B Ross, Matthieu Dorier, Allen D Malony, Philip Carns, and Kevin Huck. SYMBIOMON: A High Performance, Composable Monitoring Service. In 28th IEEE International Conference on High Performance Computing, Data, & Analytics (HiPC). IEEE, 2021.

Upcoming events

ECP annual meeting
- https://www.ecpannualmeeting.com/
- Tentatively scheduled for May
- We will host some BoF and/or tutorial content
  - All material will be made publicly available after the event
- What topics would you like to see covered?

Mochi one of three ANL technologies to capture a 2021 R&D 100 award

November 8, 2021 by carns

See this Argonne press release for details.

Quarterly Newsletter, October 2021

October 28, 2021 by carns

Project News:

R&D World Magazine has announced Mochi as a recipient of a 2021 R&D 100 award in the software/services category!

New Microservices:

We are happy to introduce a new key/value microservice, called Yokan, to the Mochi framework. You can find more details in the Yokan documentation and Yokan GitHub repository. Yokan aims to provide state-of-the-art key/value storage capabilities on top of Margo, following the best practices of the Mochi methodology. It provides many backends, including BerkeleyDB, GDBM, LevelDB, LMDB, RocksDB, TKRZW, Unqlite, and a number of in-memory key/value stores. It was designed to be highly configurable and highly flexible, making it easy to configure databases using JSON, and to provide your own database implementation if the ones we offer don’t satisfy you. Yokan also provides C++ and Python APIs in addition to the usual C API.

Software updates:

Libfabric 1.13.2 has resolved multiple outstanding bugs that impacted Mochi, particularly with the RXM provider which is used on TCP and Verbs networks. Please try it out and report if you have any problems.
Mercury version 2.1.0rc2 is now available. This is very close to the final 2.1.0 release of Mercury and is the default version supported in the mochi-spack-packages repository. It includes a UCX network driver, improvements to the shared memory transport, new threading options, and miscellaneous bug fixes.
Margo version 0.9.6 has also been released; it includes support for the upcoming Mercury 2.1.0 and performance enhancements that take advantage of upcoming features in Argobots 1.2.

Platform support:

Please remember to refer to the Mochi platform configurations repository for suggested configurations for various platforms. We have recently updated several example Spack environment files. Feel free to contribute more!

Contribution policy:

The Mochi Contributor License Agreement (CLA) has been updated to streamline the process of contributing source code to the project. We have also installed a GitHub action that will automatically prompt you to digitally agree to the CLA terms when you open your first pull request. Let us know if you have any questions.

New/Upcoming Publications:

Srinivasan Ramesh, Robert Ross, Matthieu Dorier, Allen Malony, Philip Carns and Kevin Huck. SYMBIOMON: A High Performance, Composable Monitoring Service. TO APPEAR in the 28th IEEE International Conference on High Performance Computing, Data, & Analytics (HiPC 2021)

Mochi selected as a 2021 R&D 100 finalist

September 1, 2021 (updated October 22, 2021) by carns

UPDATE: Mochi was announced as an R&D 100 winner in the Software/Services category on October 21, 2021! We will post more details soon.

The Mochi project, a collaboration between Argonne National Laboratory, Carnegie Mellon University, Los Alamos National Laboratory, and The HDF Group, has been selected as a finalist for the 2021 R&D 100 Awards. The R&D 100 Awards have served as the most prestigious innovation awards program for the past 58 years; their mission is to identify and honor the top 100 new technologies of the year. Winners will be announced in November 2021.

Quarterly Newsletter, July 2021

July 29, 2021 by carns

Publication news:

Pierre Matri and Robert Ross. “Neon: Low-Latency Streaming Pipelines for HPC”, to appear in IEEE Cloud 2021, Sept 5-10 2021.
- Introduces a new Mochi service for stream processing
Stay tuned for more Mochi-related publications at SC21 in November. More details will be posted once the SC21 technical program is announced.

Recent development updates:

A proof-of-concept of UCX support in Mercury is available in the master-ucx version of Mercury in the Mochi Spack repository
- Please contact us if you are interested in this capability; it is under active development and should be considered experimental at this time.
The git origin/main branch of Margo includes new safety checks to ensure compatible Argobots runtime parameters if Argobots is initialized outside of Margo. This will be available in an upcoming release after coordinating updates to other Mochi packages.
Both Mochi and Margo have new Contributor License Agreement (CLA) documents available online as of July 2021 with more relaxed language than the previous version. We will soon streamline these even further with online electronic forms that will be activated within the GitHub contribution process.

Debugging tips:

We have encountered several bug reports on Libfabric 1.13.0 in the last few days, especially with the RXM provider. Debugging is in progress, but in the meantime you may want to consider reverting to an earlier release if you encounter communication problems.
Recent Libfabric releases also include a new PSM3 provider. PSM3 is not directly supported by Mercury / Mochi, but enabling it in Libfabric may interfere with the performance of the traditional PSM2 provider. The libfabric package in the Mochi Spack repository disables PSM3 by default for now to avoid this problem.

Quarterly newsletter, April 2021

April 29, 2021 by carns

New presentation materials:

The Mochi team presented a BoF session entitled “Using Mochi to build data services: Overview and Updates” at the 2021 ECP Annual Meeting, April 13, 2021.
- The slides include information about getting started with Mochi, recent project updates, and highlights from Bedrock (a tool to aid with Mochi composition), Mercury (the underlying RPC framework for Mochi), and Mochi profiling tools.

GitHub migration complete:

See https://github.com/mochi-hpc now for most packages, and see our news post for tips on how to update Spack.

New software releases:

Argobots 1.1
- Underlying user-level threading package for Mochi
- includes performance improvements, broader platform support, and new profiling and debugging capabilities (more on that later)
Mercury 2.0.1rc3
- Underlying RPC communication package for Mochi
- improved logging and several performance optimizations
- final 2.0.1 release coming soon
Mochi-sdskv 0.1.12
- Key/Value store microservice
- Bedrock support
- various packaging (cmake, pkgconfig, and dependency) improvements
Bedrock 0.2.1
- Flexible service composition tool
- various packaging (cmake, pkgconfig) improvements
Sonata 0.6.2
- Document store microservice
- various packaging (cmake) improvements

Performance regressions from previous quarterly newsletter resolved:

Power9 CPU mutex locking performance regression is resolved in Argobots 1.1
OmniPath network performance regression is resolved in Mercury 2.0.1rc3

New debugging/profiling/maintenance features:

Margo is now using munit for unit testing
- Available in origin/main (or mochi-margo@main in Spack)
- Coverage is limited for now but will be expanded over time
- We will also be leveraging this framework in additional components over time
Recent Argobots updates include multiple (optional) stack guard methods
- See Argobots documentation or Spack package variants. Notable options:
  - “mprotect”: real time detection of stack overruns (with some performance overhead; just use this for debugging)
  - “canary”: lightweight deferred stack overrun detection (lighter weight, but will not report that a stack overflow occurred until shutdown)
margo_state_dump() function
- Available in origin/main (or mochi-margo@main in Spack)
- Function that can be called at any time to dump point-in-time state to a text file or stdout for debugging purposes
- Includes Margo JSON configuration, Argobots configuration, current Argobots ES layout, Argobots performance profile, in-flight RPC counts, stack dump for blocked user-level threads, etc. See https://github.com/mochi-hpc/mochi-margo/blob/main/doc/debugging.md for details.

2021 ECP Mochi BoF materials available online

April 29, 2021 by carns

The Mochi team presented a BoF session entitled “Using Mochi to build data services: Overview and Updates” at the 2021 ECP Annual Meeting, April 13, 2021. The slides include information about getting started with Mochi, recent project updates, and highlights from Bedrock (a tool to aid with Mochi composition), Mercury (the underlying RPC framework for Mochi), and Mochi profiling tools.

The Mochi GitHub migration is complete

March 24, 2021 by carns

All Mochi source code repositories have been migrated to github.com at https://github.com/mochi-hpc/ as of March 22, 2021.

If you are already using Spack to install Mochi components, please update your Mochi repository at your earliest convenience:

spack repo rm mochi
git clone https://github.com/mochi-hpc/mochi-spack-packages.git
spack repo add mochi-spack-packages

The package names have not changed; this will just enable you to retrieve new versions as they are released by updating your cloned copy of the mochi-spack-packages repo.
