Publications

Refereed Journal Papers

  1. Abdelhalim Amer, Milind Chabbi, Huiwei Lu, Yanjie Wei, Jeff Hammond, Satoshi Matsuoka, and Pavan Balaji. Locking Contention Management in Multithreaded MPI. ACM Transactions on Parallel Computing (TOPC), 2018. [paper]

  2. Sangmin Seo, Abdelhalim Amer, Pavan Balaji, Cyril Bordage, George Bosilca, Alex Brooks, Philip Carns, Adrian Castello, Damien Genet, Thomas Herault, Shintaro Iwasaki, Prateek Jindal, Laxmikant V. Kale, Sriram Krishnamoorthy, Jonathan Lifflander, Huiwei Lu, Esteban Meneses, Marc Snir, Yanhua Sun, Kenjiro Taura, and Pete Beckman. Argobots: A Lightweight Low-Level Threading and Tasking Framework. IEEE Transactions on Parallel and Distributed Systems, 2017. [paper]

Refereed Conference Papers

  1. Shintaro Iwasaki, Abdelhalim Amer, Kenjiro Taura, and Pavan Balaji. Lessons Learned from Analyzing Dynamic Promotion for User-Level Threading. To appear in the IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis (SC). Nov. 11-16, 2018, Dallas, TX, USA.

  2. Kenneth J. Raffenetti, Abdelhalim Amer, Lena Oden, Charles Archer, Wesley Bland, Hajime Fujita, Yanfei Guo, Tomislav Janjusic, Dmitry Durnov, Michael Blocksome, Min Si, Sangmin Seo, Akhil Langer, Gengbin Zheng, Masamichi Takagi, Paul Coffman, Jithin Jose, Sayantan Sur, Alexander Sannikov, Sergey Oblomov, Michael Chuvelev, Masayuki Hatanaka, Xin Zhao, Paul Fischer, Thilina Rathnayake, Matt Otten, Misun Min, and Pavan Balaji. Why is MPI so Slow? Analyzing the Fundamental Limits in Implementing MPI-3.1. IEEE/ACM International Conference on High Performance Computing, Networking, Storage and Analysis (SC). Nov. 12-17, 2017, Denver, Colorado. [paper]

  3. Milind Chabbi, Abdelhalim Amer, Shasha Wen, and Xu Liu. An Efficient Abortable-locking Protocol for Multi-level NUMA Systems. ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 2017 (PPoPP’17). February 4-8, 2017, Austin, TX, USA. [paper]

  4. Hoang-Vu Dang, Sangmin Seo, Abdelhalim Amer, and Pavan Balaji. Advanced Thread Synchronization for Multithreaded MPI Implementations. 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid’17). Madrid, Spain, May 14-17, 2017. [paper]

  5. Abdelhalim Amer, Huiwei Lu, Yanjie Wei, Pavan Balaji and Satoshi Matsuoka. MPI+Threads: Runtime Contention and Remedies. ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 2015 (PPoPP’15). Feb. 7-11, 2015, San Francisco, California. [paper] [slides]

  6. Abdelhalim Amer, Naoya Maruyama, Miquel Pericàs, Kenjiro Taura, Rio Yokota, and Satoshi Matsuoka. Fork-Join and Data-Driven Execution Models on Multi-core Architectures: Case Study of the FMM. International Supercomputing Conference 2013 (ISC’13), 255-266. [paper] [slides]

  7. Abdelhalim Amer, Ahmed Toufik, Walid-Khaled Hidouci, and Satoshi Matsuoka. Using BitTorrent and SVC for efficient video sharing and streaming. IEEE Symposium on Computers and Communication 2012 (ISCC’12): 537-543. [paper] [slides]

Refereed Workshop Publications

  1. Abdelhalim Amer, Satoshi Matsuoka, Miquel Pericàs, Naoya Maruyama, Kenjiro Taura, Rio Yokota, and Pavan Balaji. Scaling FMM with Data-Driven OpenMP Tasks on Multicore Architectures. To appear at the 12th International Workshop on OpenMP (IWOMP) 2016. [paper] [slides]

  2. Daniel Ellsworth, Tapasya Patki, Swann Perarnau, Sangmin Seo, Kazutomo Yoshii, Abdelhalim Amer, Rinku Gupta, Judicael Zounmevo, Henry Hoffman, Allen Malony, Martin Schulz, and Pete Beckman. Systemwide Power Management with Argo. To appear at the Workshop on High-Performance, Power-Aware Computing (HPPAC) 2016.

  3. Abdelhalim Amer, Huiwei Lu, Pavan Balaji, and Satoshi Matsuoka. Characterizing MPI and Hybrid MPI+Threads Applications at Scale: Case Study with BFS. Workshop on Parallel Programming Model for the Masses (PPMM); held in conjunction with IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGrid). May 4, 2015, Shenzhen, China. [paper] [slides]

  4. Miquel Pericàs, Abdelhalim Amer, Kenjiro Taura, and Satoshi Matsuoka. Analysis of Data Reuse in Task-Parallel Runtimes. 4th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS’13), Denver, November 2013. [paper] [slides]

  5. Miquel Pericàs, Abdelhalim Amer, Kenjiro Taura, and Satoshi Matsuoka. Analysis of Data Reuse in Task-Parallel Runtimes [extended version]. Lecture Notes in Computer Science, Springer, High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation, pp. 73-87, 2014. [paper]