Current HPC environments and applications are rigid, and MPI’s inability to efficiently support malleability, i.e., the ability to grow and shrink the computational resources associated with a job at runtime, is a significant part of the problem. Future generations of HPC systems, however, will require a more flexible approach, e.g., to support a greater level of fault tolerance, to adjust to changing levels of available resources, or to match more complex workflows. This will also require MPI to change. In this talk, I will discuss the challenges facing MPI in these scenarios, as well as several approaches that are first steps towards supporting malleability in MPI. These approaches will open the door for MPI both to support a new generation of applications and to provide more flexible runtime support for higher-level programming models.
Chair of Computer Architecture and Parallel Systems