Parallel Programming Systems (IN2365)

Lecturer (assistant)
  • Michael Klemm
Number: 0000002315
Dates: See TUMonline

Lecturer

Dr.-Ing. Michael Klemm is a Principal Member of Technical Staff in the Compilers, Languages, Runtimes & Tools team of the Machine Learning & Software Engineering group at AMD. He is part of the OpenMP compiler team, focusing on application and kernel performance for AMD Instinct accelerators for High Performance and Throughput Computing. Michael is also the Chief Executive Officer of the OpenMP Architecture Review Board (http://www.openmp.org), which oversees the OpenMP API specification. He holds a Doctor of Engineering degree (Dr.-Ing.) in Computer Science from the Friedrich Alexander University, Erlangen, Germany; his research focused on compilers and runtime systems for distributed systems. His areas of interest include compiler construction, programming language design, parallel programming, and performance analysis and tuning.

Content

This lecture focuses on the implementation aspects of parallel programming systems. Parallel programming models need compiler and runtime support to map their rich feature set to actual parallel hardware. To achieve high performance and efficiency, this mapping must take the specific architectural properties of the underlying computer architecture into account. The lecture briefly reviews key concepts presented in the lectures "Parallel Programming" (IN2147) and "Microprocessors" (IN2075) or "Advanced Computer Architecture" (IN2076). It then turns to the fundamental algorithms used to implement the concepts of parallel programming models and how they interact with modern processors. While the lecture focuses on the general mechanisms, x86 architectures are used to exemplify the discussed implementation concepts.
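As a flavor of the implementation questions addressed in the lecture, the following minimal OpenMP example in C is a sketch of the kind of construct whose mapping is examined: the compiler outlines the parallel region into a separate function, and a runtime library (such as the LLVM OpenMP runtime listed under Recommended Literature) creates the thread team, distributes the loop iterations, and synchronizes the threads at the implicit barrier. The code is an illustrative sketch only, not part of the official module description.

  /* Illustrative sketch: a minimal OpenMP parallel loop. The compiler
   * outlines the region below; the OpenMP runtime forks the thread team,
   * schedules the iterations, and joins at the implicit barrier --
   * exactly the mechanisms the lecture studies at the implementation level.
   * Compile, for example, with: clang -fopenmp saxpy.c */
  #include <stdio.h>
  #include <omp.h>

  #define N 1000000

  int main(void) {
      static float x[N], y[N];
      const float a = 2.0f;

      /* Initialize input vectors. */
      for (int i = 0; i < N; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

      /* Parallel SAXPY: iterations are divided statically among threads. */
      #pragma omp parallel for schedule(static)
      for (int i = 0; i < N; ++i)
          y[i] = a * x[i] + y[i];

      printf("y[0] = %f, max threads: %d\n", y[0], omp_get_max_threads());
      return 0;
  }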

Recommended Literature

  • John Hennessy and David Patterson: Computer Architecture: A Quantitative Approach (Morgan Kaufmann Series in Computer Architecture and Design). Morgan Kaufmann, 6th edition, ISBN 978-0128119051.
  • William Stallings: Computer Organization and Architecture: Designing for Performance. Prentice Hall, 7th edition, ISBN 978-0131856448.
  • Yan Solihin: Fundamentals of Parallel Multicore Architecture. Apple Academic Press, ISBN 978-1482211184.
  • Sources of the LLVM OpenMP Runtime Implementation. openmp.llvm.org.
  • Sources of the Threading Building Blocks. www.threadingbuildingblocks.org.
  • Selected research papers on barrier implementations, lock implementations, task-graph scheduling, etc.
  • Intel Corporation: Intel® 64 and IA-32 Architectures Optimization Reference Manual, document ID 248966-040.
  • Barbara Chapman, Gabriele Jost, and Ruud van der Pas: Using OpenMP - Portable Shared Memory Parallel Programming. MIT Press, ISBN 978-0262533027.
  • Ruud van der Pas, Eric Stotzer, and Christian Terboven: Using OpenMP - The Next Step: Affinity, Accelerators, Tasking, and SIMD. MIT Press, ISBN 978-0262534789.
  • James Reinders: Threading Building Blocks. O'Reilly, ISBN 978-0596514808.
  • Alexander Supalov: Inside the Message Passing Interface - Creating Fast Communication Libraries, De|G Press, ISBN 978-1501515545.