"Application Performance Evaluation of the HTMT Architecture"
M. Hereld, I. R. Judson, R. Stevens
Technical Memorandum ANL/MCS- TM-258
Preprint Version: [pdf]
In this report we summarize findings from a study of the predicted performance of a suite of application codes taken from the research environment and analyzed against a modeling framework for the HTMT architecture. We find that the inward bandwidth of the data vortex may be a limiting factor for some applications. We also find that available memory in the cryogenic layer is a constraining factor in the partitioning of applications into parcels. The architecture in several examples may be inadequately exploited; in particular, applications typically did not capitalize well on the available computational power or data organizational capability in the PIM layers. The application suite provided significant examples of wide excursions from the accepted (if simplified) program execution model – in particular, by required complex in-SPELL synchronization between parcels. The availability of the HTMT-C emulation environment did not contribute significantly to the ability to analyze applications, because of the large gap between the available hardware descriptions and parameters in the modeling framework and the types of data that could be collected via HTMT-C emulation runs. Detailed analysis of application performance, and indeed further credible development of the HTMT-inspired program execution model and system architecture, requires development of much better tools. Chief among them are cycle-accurate simulation tools for computational, network, and memory components. Additionally, there is a critical need for a whole system simulation tool to allow detailed programming exercises and performance tests to be developed.