Upcoming HPC systems will face strict limits on both power and energy consumption, imposed by technological, political and facility constraints. To keep executing applications efficiently under these limits, we will have to actively manage power and energy in a way that matches both system and application needs.
In particular, we need to identify where power or energy is needed to make progress and then direct it to those sections of the execution. This requires the ability to monitor power and energy utilization and to correlate the monitoring data with application context, which is a data analytics problem in its own right. Further, we will need the ability to adapt applications and systems at runtime to implement optimizations derived from the observed data. In this talk I will present the current state of the art at the Leibniz Supercomputing Centre in Munich, one of the leading institutions in power and energy management, as well as several activities to integrate power and energy usage and optimization into runtime environments.
Combined, these efforts will form one of the pillars for breaking through the power and energy wall we are facing and will help us continue to scale the capabilities of our HPC systems.
More information can be found at: http://hpbdis.csp.escience.cn/dct/page/70040
Prof. Dr. rer. nat. Martin Schulz