To accomplish the project goals, multi-institution teams are conducting the following research activities:
- Performance portability: We are developing performance measurement, modeling and autotuning technology for petascale and heterogeneous systems, thus permitting scientists to exploit a wide range of high-end systems from a common code base. Team leader: Mary Hall, Utah.
- Energy efficiency: Relatively minor code changes can enable significant energy savings in some applications with little or no impact on performance. We are investigating application-level energy efficiency techniques to help reduce DOE’s energy costs. Team leader: Laura Carrington, UC San Diego.
- Resilience: Petascale calculations are pressing the limits of reliability in both hardware and system software. We are exploring strategies to enable petascale applications to be resilient in the face of faults. Team leader: Bronis R. de Supinski, LLNL.
- Optimization: We are building on tools from the mathematical optimization community to develop strategies that collectively optimize performance, energy efficiency and resilience. This work involves optimization models for different components and the use of multi-objective optimization techniques. Team leader: Paul Hovland, ANL.
SUPER researchers actively seek collaborations with other SciDAC-3 institutes and with the broader DOE computational sciences community. This engagement focuses our research on the real challenges facing petascale scientific computing and ensures that our results have broad and immediate impact. We work with DOE centers and HPC vendors to integrate, deploy, test, and document tools that achieve our objectives, and we collaborate with DOE center staff and scientific application teams to address high-priority codes and codes with special needs. Multi-institution teams are conducting the following engagement activities:
- Application engagement: We provide measurement and analysis tools, assist application developers with modeling and optimizing their codes, and help them decide whether closer collaboration with SUPER would be useful. We work with the computing centers to enhance user-level measurement and analysis capabilities and to identify applications that need further performance, energy or resilience enhancement. We work with other SciDAC-3 institutes to analyze and optimize their software and its use in application codes. Team leader: Patrick H. Worley, ORNL.
- Tool integration: We are developing an integrated tool suite to enable end-to-end application optimization, including automated experiments for performance analysis and autotuning, power-aware instrumentation for energy evaluation, fault detection and vulnerability analysis, and scalable data management and presentation. Team leader: Allen D. Malony, Oregon.
SUPER is also involved in inter-institute activities with other SciDAC institutes:
- The Roofline Model project is a collaboration with the FastMATH Institute. Lawrence Berkeley National Laboratory, Argonne National Laboratory, and the University of Oregon recently began building a Roofline-based performance modeling and analysis toolkit that will enable automated prediction and analysis of the computational kernels within various scientific applications. Moreover, general characterization of emerging architectures will allow applied math researchers in FastMATH and various application partnerships to tailor algorithmic parameters to the true capabilities of the machine rather than to marketing hype.
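
As a rough illustration of the idea behind a Roofline analysis, the sketch below computes the standard Roofline bound: attainable performance is the smaller of the machine's peak compute rate and its peak memory bandwidth multiplied by a kernel's arithmetic intensity (flops per byte moved). This is not code from the toolkit described above. The function names, the machine numbers, and the kernel intensities are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def attainable_gflops(peak_gflops, peak_bandwidth_gbs, arithmetic_intensity):
    """Roofline bound in GFLOP/s.

    arithmetic_intensity is in flops per byte of memory traffic.
    """
    return min(peak_gflops, peak_bandwidth_gbs * arithmetic_intensity)


def ridge_point(peak_gflops, peak_bandwidth_gbs):
    """Arithmetic intensity at which a kernel stops being memory-bound."""
    return peak_gflops / peak_bandwidth_gbs


# Hypothetical machine: 1000 GFLOP/s peak compute, 100 GB/s peak bandwidth.
PEAK_GFLOPS = 1000.0
PEAK_BANDWIDTH_GBS = 100.0

# Hypothetical kernels with assumed arithmetic intensities (flops/byte).
kernels = {
    "stream_like": 0.125,
    "stencil_like": 0.5,
    "dense_matmul_like": 32.0,
}

if __name__ == "__main__":
    ridge = ridge_point(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS)
    for name, intensity in kernels.items():
        bound = attainable_gflops(PEAK_GFLOPS, PEAK_BANDWIDTH_GBS, intensity)
        regime = "memory-bound" if intensity < ridge else "compute-bound"
        print(f"{name}: at most {bound:.1f} GFLOP/s ({regime})")
```

On this assumed machine the ridge point is 10 flops per byte, so the two low-intensity kernels are limited by memory bandwidth and only the high-intensity kernel can approach peak compute. A real characterization would replace the constants with measured bandwidth and compute ceilings for the target architecture.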