PD Dr. rer. nat. Josef Weidendorfer

Technische Universität München
Informatik 10 - Lehrstuhl für Rechnertechnik und parallele Systeme (Prof. Schulz)
Boltzmannstr. 3, 85748 Garching b. München
josef.weidendorfer (at) tum.de

Leibniz Rechenzentrum der Bayerischen Akademie der Wissenschaften
Boltzmannstr. 1, 85748 Garching b. München
Tel.: +49 (89) 35831-8766, josef.weidendorfer (at) lrz.de

 

Short CV

Josef Weidendorfer is a qualified private lecturer (Privatdozent) in Computer Science at Technische Universität München (TUM). He works at the Leibniz Supercomputing Centre (LRZ) as lead of the Future Computing Group, which develops smooth migration strategies for future HPC systems and evaluates novel technologies. This includes improving system-level and workload analysis tools as well as parallel programming models, with a view to new usage models for LRZ hardware. He cooperates closely with the Chair of Computer Architecture and Parallel Systems (CAPS, Prof. Dr. M. Schulz) at TUM. Previously, he was a senior researcher and teaching assistant at TUM, where his research covered the best use of accelerators, heterogeneous computing, and tuning strategies for parallel code, including dynamic code generation techniques. Josef completed his habilitation at TUM in 2016 on simulation-driven performance analysis for parallel code, focusing on capturing bottlenecks in the memory hierarchy of modern architectures and presenting them in a way that hints at suitable performance optimizations. He received his Ph.D. from TUM in 2003 for studying load balancing issues in car crash simulation on industrial code at BMW AG.

Research Interests

Parallel Computer architectures, High Performance Computing, Multi-/Manycore architectures, GPGPU, Performance analysis and optimization, Cache Simulation, Virtual Machines, dynamic code generation. Josef is interested in all kinds of strategies for improving the efficiency of computations on various hardware structures, on both general-purpose and specialized accelerator hardware (mostly for HPC codes), including the required tools (e.g. for performance analysis) and techniques at the SW/HW boundary (code generation, cache exploitation, ...). To this end, he regularly organizes the UCHPC workshop (since 2010 with Euro-Par) on the use of "unconventional" hardware ideas for HPC. Furthermore, being interested in performance analysis tools, he is co-organizer of the PSTI workshop series and the LRZ contact person for the VI-HPS interest group.

He is the maintainer of the open-source tools Callgrind/KCachegrind for cache simulation.

Memberships: ACM, GI, Zuse-Gesellschaft, VI-HPS

Teaching

For student work (bachelor's/master's theses, IDP, guided research), see here (look for LRZ-related topics). Or, even better, email me to arrange a meeting to discuss currently open topics.

Winter Term 24/25
Summer Term 24
Winter Term 23/24
Summer Term 23
Winter Term 22/23
Summer Term 22
Winter Term 21/22
Summer Term 21
Winter Term 20/21
Winter Term 19/20
Winter Term 18/19
  • Lecture Virtualization Techniques
  • Master-Seminar: Programming models and code generation
Winter Term 17/18
  • Lecture "Einführung in die Rechnerarchitektur" (introduction to computer architecture)
  • Master-Seminar: Programming models and code generation
  • Master-Seminar: "Hochleistungsrechner: Aktuelle Trends und Entwicklungen" (high-performance computers: current trends and developments)
Summer Term 17
  • Master-Seminar Virtualization Techniques
  • Proseminar Multicore Architectures
  • Lab Course Efficient Programming of Multicore-Systems and Supercomputers
  • Seminar Trends in Computing
Winter Term 16/17
  • Lecture Virtualization Techniques
  • Master-Seminar: Programming models and code generation
  • Master-Seminar: "Hochleistungsrechner: Aktuelle Trends und Entwicklungen" (high-performance computers: current trends and developments)
  • Introduction to computer architecture: central exercise (microcode programming)
Summer Term 16
  • Master-Seminar Virtualization Techniques
  • Proseminar Multicore Architectures
  • Lab Course Efficient Programming of Multicore-Systems and Supercomputers
  • Seminar Trends in Computing
Winter Term 15/16
  • Lecture Virtualization Techniques
  • Master-Seminar: Programming models and code generation
  • Master-Seminar: "Hochleistungsrechner: Aktuelle Trends und Entwicklungen" (high-performance computers: current trends and developments)
  • Introduction to computer architecture: central exercise (microcode programming)
Summer Term 15
  • Seminar Virtualization Techniques
  • Seminar Resource-Aware Computing
  • Proseminar Multicore Architectures
  • Lab Course Efficient Programming of Multicore-Systems and Supercomputers
Winter Term 14/15
  • Lecture Virtualization Techniques
  • Seminar Programming models and code generation
  • Seminar "Akzeleratorarchitekturen" (accelerator architectures)
  • Introduction to computer architecture: central exercise (microcode programming)

Supervised Student Work

  • Alisa Parashchenko: Design and Implementation of an RDMA backend supporting malleability for LAIK. Bachelor Thesis 2024 (ongoing)
  • Leon Spoerl: Streamlined Software Stack Deployment and Verification for LRZ Benchmarking Platform. Bachelor Thesis 2024 (ongoing)
  • Zafer Yilmazer: Dynamic Resources for the MPI Backend in LAIK. Guided Research 2024 (ongoing)
  • Ahmad Belbeisi: Modeling and Simulation of Power Consumption of MPI Applications. Master Thesis 2024 (ongoing)
  • Leonhardt Tizian: Low Overhead Sampling Techniques for HPC Runtimes. Master Thesis 2024 (ongoing)
  • Peter Goldammer: Evaluation of Energy and Power Management Frameworks for HPC Workloads. Bachelor Thesis, April 2024
  • Vyas Giridharan: Implementation and Evaluation of Matrix Profile Algorithms on the Cerebras Wafer-Scale Engine. Master Thesis, March 2024
  • Ibrahim Erdurucan: Malleability for the HPCG Benchmark Using LAIK. Bachelor Thesis, December 2023
  • Lukas Heine: Topology-Aware Communication Optimization in LAIK. Master Thesis, December 2023
  • Jakob Schaeffeler: Exploring Performance Differences of GPU Offloading Programming Models. Guided Research, October 2023
  • Lukas Neef: Development and Comparison of Shared Memory Usage Model Schemes in a LAIK Backend. Bachelor Thesis, September 2023
  • Moritz Unseld: TRAIL: TRansfer learning And Intelligent data pruning for Language models. Bachelor Thesis, July 2023
  • Tobias Bauer: Optimization of Data Pipelines and Performance of AI-based Fire Propagation Models. Master Thesis, April 2023
  • Julian Scheipl: Phase-aware Statistical Sampling for Always-on Performance Monitoring of HPC Systems. Bachelor Thesis, February 2023
  • Robert Hubinger: Design und Implementierung eines Shared Memory Backends für LAIK. Bachelor Thesis, September 2022
  • Orgil Dorj: Extension of BEAST Benchmarking Infrastructure with Automation, Postprocessing, and Modeling Features. Bachelor Thesis, August 2022
  • Aleksandr Balakirev: Exploring the Benefits of Non-volatile Memory in HPC Applications. Bachelor Thesis, February 2022
  • Sergej Breiter: Evaluating Sector Caches in High-Performance Computing. Master Thesis (LMU, advisor), January 2022
  • Maximilian Mayr: Design and Implementation of an Infrastructure for the automatic Evaluation of Benchmarks in the LRZ Testbed. Bachelor Thesis, September 2021
  • Vincent Bode: Application-Integrated Fault Tolerance in HPC. Master Thesis, November 2019
  • Alexander Kurtz: Design and Implementation of a Lightweight Communication Backend for HPC/Distributed Applications, Master Thesis, May 2018
  • K. Pröll: Adaptive data layout optimizations for stencil-code using binary rewriting, Master Thesis, March 2018
  • T. Asheim: Evaluation of Binary Rewriting Techniques for MPI, Guided Research, April 2017
  • J. Rodrigues: Mutual Influence of Memory- and Compute-Intensive Parallel Applications - Characterization for Prediction, Master Thesis, Feb 2017
  • M. Eiler: Analysing and Using OpenCL for Processing Laser Scanning Data, Master Thesis, Aug 2016
  • M. Kruk: Evaluation of MPI vs. PGAS for Cache-optimized Benchmarks, Bachelor Thesis, Sep 2016
  • A. Engelke: Using LLVM to Optimize Binary Re-Writing at Runtime, Guided Research, Oct 2016
  • J. Rodrigues: Mutual Influence of Applications for Co-Scheduling, Guided Research, Oct 2016
  • D.A. Suarez Trujillo: Design and Implementation of a Feature Detection Algorithm for Space Debris Detection on the High Performance Data Processor (HPDP), Master Thesis, Oct 2015
  • D.A. Ortiz-Yepes: Page Migration Strategies on NUMA Systems Based on Sampling, Master Thesis, Nov 2015
  • T. Geissler: A tool for efficient analysis of Memory Access Behaviour of HPC Applications, Bachelor Thesis, Mar 2015
  • S. Bartels: Investigation of the Portability of an Image Processing Algorithm on a Reconfigurable Space-borne Parallel Processor, Master Thesis, Aug 2014
  • L. Kowalczyk: Design and Implementation of an Automatic Tuning Solution for GPU Programs, Master Thesis, May 2014
  • I. Vadasz: Hardware Transactional Memory for Cache Simulation, Master Thesis, April 2014
  • G. Kukreja: Host compiled simulation to estimate time and power consumption of embedded systems, Master Thesis, Nov 2014
  • S. Hertle: Adaptive Usage of Hardware Transactional Memory on Haswell Processors, Bachelor Thesis, Oct. 2014
  • J. Kranz: Generating Fast Code Generators, Interdisciplinary Project, 2013
  • M. Plichta: Faster Sparse Matrix Operations by Code Generation Embedding Prefetching, Bachelor Thesis, Aug. 2013

Research

Projects

  • upcoming: EU Project SEANERGYS (January 2025 - December 2028, run by Future Computing Group at LRZ, PI)
  • BMBF Project ScalNext (October 2022 - September 2025, run by Future Computing Group at LRZ, PI)
  • BMBF Project CoMPS (November 2022 - October 2025, run by Future Computing Group at LRZ)
  • EU Project REGALE (April 2021 - March 2024, run by Future Computing Group at LRZ)
  • EU Project DEEP-SEA (April 2021 - March 2024, run by Future Computing Group at LRZ)
  • BMBF Project Envelope (January 2017 - December 2019, PI at LRR/CAPS TUM)
  • BMBF Project FAST (January 2014 - December 2016, PI at LRR TUM)
  • DFG SFB Transregio 89: Invasive Computing (2010 - 2022, advised PhD candidates at LRR TUM)
  • Old: KAUST Simulating CO2 Sequestration, Virtual Arabia, IGSSE 1.08 Hardware-aware simulation, MAC/IGSSE MAPCO

Open Source Software Projects

(only most important projects related to current research)

  • LAIK, a library for elastic parallel computing (started as part of work on BMBF project Envelope)
  • DBrew, a library for dynamic binary rewriting
  • KCachegrind, visualization GUI for profile performance data
  • Callgrind, part of Valgrind, cache simulation via dynamic binary instrumentation

Talks

  • Cache Performance Analysis with Callgrind and KCachegrind, 45th VI-HPS Tuning workshop, LRZ, Garching. June 2024.
  • Democratizing AI Accelerators for HPC Applications: Challenges, Success, and Support. BoF at ISC24, Hamburg, May 2024.
  • Stand Project ScalNext. Statustagung der Gauss-Allianz 2024, Dresden, April 2024.
  • Phase 2 of the LRZ Flagship System SuperMUC-NG. IXPUG Workshop at SC23, Denver, November 2023.
  • BEAST Lab: A Practical Course on Experimental Evaluation of Diverse Modern HPC Architectures and Accelerators. Workshop Best Practices for HPC training and Education at SC23, Denver. November 2023.
  • Recent Energy Efficiency at LRZ: REGALE - the European Realization of the PowerStack-Approach. EnviroInfo 2023. LRZ, Garching, October 2023
  • A Unified Affinity-Aware Parallel Programming Model. ScalPerf23. Bertinoro, Italy. September 2023
  • Ansätze zur Nutzung von heterogener Hardware und smarten Netzwerken. Statustagung der Gauss-Allianz 2023, Karlsruhe, June 2023.
  • REGALE - die europäische Verwirklichung des PowerStack-Ansatzes. Statustagung der Gauss-Allianz 2023, Karlsruhe, June 2023.
  • Phase-aware System-Side Sampling for HPC. Computing Frontiers 2023. Bologna, Italy. May 2023.
  • Benchmarking across HPC Architectures - The LRZ Perspective. BoF on Benchmarking at SC22. Dallas, US, November 2022.
  • Giving away Control for Scalability - Incremental Improvements for Legacy HPC Programming Models. ScalPerf22. Bertinoro, Italy. September 2022
  • Cache Analysis with Callgrind. Code Optimization Workshop. LRZ, Garching. June 2022.
  • Malleability for HPC. ZIH-Colloquium. TUD, Dresden. June 2022.
  • Activities Towards the Upcoming Extension of the LRZ Flagship System. IXPUG Workshop at ISC22. Hamburg. June 2022.
  • Transparent Application-integrated Fault Tolerance in Parallel Programming Models. ScalPerf21. Bertinoro, Italy. September 2021.
  • Cache Performance Analysis with Callgrind and KCachegrind. 40th VI-HPS Tuning Workshop. LRZ, Garching. June 2021.
  • Future Computing am LRZ. Statustagung der Gauss-Allianz 2020, October 2020.
  • Fujitsu A64FX - Ein erster Eindruck. GCS Meeting, September 2020.
  • More Science by Better Energy efficiency and Power Management, ScalPerf'19, Bertinoro, IT, September 2019.
  • System-wide Low-frequency Sampling for Large HPC Systems, 13th Parallel Tools Workshop, Dresden, DE, September 2019.
  • LAIK - ein fehlertolerantes Programmiermodell für HPC, Diskussionskreis Fehlertoleranz, Dresden, DE, December 2018.
  • Are existing Parallel Programming Models Ready for Future HPC Systems? ScalPerf'18, Bertinoro, IT, September 2018.
  • Cache Performance Analysis with Callgrind and KCachegrind. 27th VI-HPS Tuning Workshop, Garching, DE, April 2018.
  • Malleability for HPC Applications: Easing the Path for Legacy Codes with LAIK. Invited Talk. MULTIPROG 2018. Manchester, UK, January 2018.
  • LAIK: A Library for Application Integrated Fault Tolerance. ScalPerf'17, Bertinoro, IT, September 2017.
  • Using LLVM for Optimized Lightweight Binary Re-Writing at Runtime. 22nd int. Workshop on high-level parallel programming models and supportive environments (HIPS 2017). Orlando, US, May 2017.
  • On the Applicability of Virtualization in an Industrial HPC Environment. 2nd Workshop on Co-Scheduling of HPC Applications, HiPEAC 2017, Stockholm, Sweden, January 2017.
  • Dynamic binary rewriting for optimization of parametric data processing. ScalPerf'16, Bertinoro, IT, September 2016.
  • DBrew - A library for dynamic binary rewriting. 4th Workshop on Runtime and Operating Systems for the Many-core Era (ROME 2016). Invited talk. Grenoble, FR, August 2016.
  • Inclusive Cost Attribution for Cache Use Profiling. Workshop for Tools for Program Development and Analysis in Computational Science at ICCS 2016 (TOOLs 2016), San Diego, US, June 2016.
  • Cache Performance Analysis with Callgrind and KCachegrind. 21st VI-HPS Tuning Workshop, Garching, April 2016.
  • Co-Scheduling: Increasing Efficiency for HPC. Minisymposium "Middleware in the Era of Extreme Scale Computing", 17th SIAM PP, Paris, April 2016.
  • Detailed Characterization of HPC Applications for Co-Scheduling. 1st Workshop on Co-Scheduling of HPC Applications, HiPEAC 2016, Prague, Czech Republic, January 2016.
  • Dynamic code generation for HPC. ScalPerf'15, Bertinoro, Italy, September 2015.
  • Analysis and Optimization of the Memory Access Behavior of Applications. Summer School "École Optimisation 2014", Université de Strasbourg, July 2014
  • Valgrind, Dynamic Binary Instrumentation and what you can do with it. Invited talk, EDF, Paris, March 2014
  • Effiziente System-Virtualisierung auf ARM-Architekturen. Invited talk as part of the lecture "Virtualisierte Systeme" (Vitalian Danciu), LMU, January 2014.
  • Architecture Simulation for HPC Programmers. Invited talk, "RBP-Vortragsreihe", LRZ, Munich. November 2013.
  • Data Transfer Requirement Analysis with Bandwidth Curves. At 6th Workshop on Productivity and Performance, Euro-Par 2013. Aachen, Germany, August 2013.
  • Implicit and Task-Based Approaches to Heterogeneous Programming. Invited lecture (with hands-on sessions) at CEA-EDF-Inria Computer Science Summer School 2013 (Programming Heterogeneous Parallel Architectures). Cadarache, France, June/July 2013.
  • Message-passing and threads. Invited talk at ComplexHPC Spring School 2013 (EU COST Action IC0805). Uppsala, Sweden. June 2013.
  • Architecture Simulation for Performance Optimization. Invited talk, LMU, Munich. January 2013.

(Co-)Organized Scientific Events

  • November 2023, Denver, US: Workshop on Programming and Performance Visualization Tools (ProTools 2023) at SC23
  • January 2019, Valencia, ES: Workshop on Novel Challenges for Scheduling of HPC Applications (COSH 2019, with HiPEAC 2019)
  • November 2019, Denver, US: Workshop on Programming and Performance Visualization Tools (ProTools 19), with SC19
  • January 2018, Manchester, UK: 3rd Workshop on Co-Scheduling of HPC Applications 2018 (COSH 2018, with HiPEAC 2018)
  • August 2017, Santiago de Compostela, ES: 10th Workshop on UnConventional High Performance Computing 2017 (UCHPC 2017, with Euro-Par 2017)
  • August 2017, Bristol, UK: 7th International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2017, with ICPP 2017) 
  • January 2017, Stockholm, SE: 2nd Workshop on Co-Scheduling of HPC Applications 2017 (COSH 2017, with HiPEAC 2017) 
  • December 2016, Granada, ES: First International Workshop on Data Locality in Modern Computing Systems (DLMCS 2016, with ICA3PP 2016) 
  • August 2016, Grenoble, FR: 9th Workshop on UnConventional High Performance Computing 2016 (UCHPC 2016, with Euro-Par 2016) 
  • August 2016, Philadelphia, US: Sixth International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2016)
  • January 2016, Prague, CZ: 1st Workshop on Co-Scheduling of HPC Applications 2016 (COSH 2016, with HiPEAC 2016) 
  • August 2015, Vienna, AT: UCHPC 2015, Workshop at Euro-Par 2015 
  • September 2014, Minneapolis, US: Fifth International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2014)
  • August 2014, Porto, Portugal: 7th Workshop on UnConventional High Performance Computing 2014 (UCHPC 2014)
  • May 2014, Barcelona, ES: Thematic Session on Dynamic co-optimization of applications and resource management, HiPEAC Computer Systems Week. 
  • October 2013, Lyon, France: Fourth International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2013)
  • September 2013, Munich, Germany: PhD Forum at International Conference on Parallel Computing (ParCo2013)
  • August 2013, Aachen, Germany: 6th Workshop on UnConventional High Performance Computing 2013 (UCHPC'13)
  • May 2013, Ischia, Italy: Special Session on "Emerging Trends in Dataflow Computing" at the ACM International Conference on Computing Frontiers 2013 (see here).

Program Committee Memberships

  • IEEE Cluster 2024
  • IEEE International Conference on Computer Design 2024 (ICCD 2024)
  • ISC High Performance 2024, Poster PC
  • Computing Frontiers 2024 (CF24)
  • International Conference on Parallel Processing and Applied Mathematics (PPAM24)
  • International Symposium on Parallel and Distributed Computing 2024 (ISPDC24)
  • International Workshop on SYCL and OpenCL 2024 (IWOCL 2024)
  • Workshop on Programming and Performance Visualization Tools (ProTools 2024) at SC24
  • Sustainability in HPC: Vision and Opportunities 2024, Workshop at SC24
  • IEEE International Conference on Computer Design 2023 (ICCD 2023)
  • International Conference for High Performance Computing, Networking, Storage, and Analysis (SC23), Workshop Selection Committee
  • Computing Frontiers 2023 (CF23)
  • GI/ITG International Conference on Architecture of Computing Systems 2023 (ARCS23)
  • International Symposium on Parallel and Distributed Computing 2023 (ISPDC23)
  • Sustainability in HPC: Vision and Opportunities 2023, Workshop at SC23
  • Malleability Techniques Applications in High-Performance Computing (HPCMALL23), Workshop at SC23
  • International Workshop on SYCL and OpenCL 2023 (IWOCL 2023)
  • Workshop on Tools for Data Locality, Power and Performance (TDLPP 2023) at EuroPar23
  • Computing Frontiers 2022 (CF22), PC Main Track + Posters
  • HPC ASIA 2022 - SuperComputing Asia
  • International Conference for High Performance Computing, Networking, Storage, and Analysis (SC22), Workshop Selection Committee
  • International Symposium on Parallel and Distributed Computing 2022 (ISPDC22)
  • 27th International Conference on Parallel and Distributed Computing (Euro-Par 2021)
  • ACM International Conference on Computing Frontiers 2021 (CF 2021)
  • 20th IEEE International Symposium on Parallel and Distributed Computing (ISPDC-2021)
  • 9th International Workshop on OpenCL and SYCL (IWOCL/SYCLcon 2021)
  • Int. Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2022)
  • Workshop on Performance Monitoring and Analysis of Cluster Systems (PMACS 2021), with EuroPar 2021
  • ACM International Conference on Computing Frontiers 2020 (CF 2020)
  • 19th IEEE International Symposium on Parallel and Distributed Computing (ISPDC-2020)
  • 8th International Workshop on OpenCL and SYCL (IWOCL/SYCLcon 2020)
  • ISC High Performance 2019 (ISC 2019)
  • ACM International Conference on Computing Frontiers 2019 (CF 2019)
  • 18th IEEE International Symposium on Parallel and Distributed Computing (ISPDC-2019)
  • 7th Workshop on Runtime and Operating Systems for the Many-core Era (ROME 2019)
  • 4th Workshop on Performance and Scalability of Storage Systems 2019 (WOPSSS 2019)
  • Workshop on Tools for Program Development and Analysis in Computational Science (TOOLS 2019), with ICCS 2019
  • 7th International Workshop on OpenCL (IWOCL 2019)
  • IEEE International Parallel and Distributed Processing Symposium 2018 (IPDPS 2018)
  • ISC High Performance 2018 (ISC 2018), Performance Track
  • International Symposium on Parallel and Distributed Computing (ISPDC-2018)
  • 6th Workshop on Runtime and Operating Systems for the Many-core Era (ROME 2018)
  • 3rd Workshop on Performance and Scalability of Storage Systems 2018 (WOPSSS 2018)
  • Parallel Computing Conference 2017 (ParCo 2017)
  • ACM International Conference on Computing Frontiers 2017 (CF 2017)
  • 23rd International Conference on Parallel and Distributed Computing (Euro-Par 2017)
  • 16th International Symposium on Parallel and Distributed Computing (ISPDC 2017) 
  • International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2017) 
  • 2nd Workshop on Co-Scheduling of HPC Applications 2017 (COSH 2017) 
  • Tools for Program Development and Analysis in Computational Science (TOOLS 2017) at ICCS 2017 
  • 2nd SYCL Programming Workshop (SYCL17) 
  • 1st Workshop on Computer Architectures in Space (CompSpace17) at ARCS 2017 
  • 22nd Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS 2017) at IPDPS 17 
  • 13th Workshop on Dependability and Fault Tolerance (VERFE17, with ARCS 2017) 
  • 5th Workshop on Runtime and Operating Systems for the Many-core Era (ROME 2017) at Euro-Par 2017 
  • 10th Workshop on UnConventional High Performance Computing 2017 (UCHPC 2017 at Euro-Par 2017) 
  • ACM International Conference on Computing Frontiers 2016 (CF 2016)
  • IEEE International Parallel and Distributed Processing Symposium 2016 (IPDPS 2016)
  • International Conference on High Performance Computing and Simulation (HPCS 2016)
  • 2nd IEEE International Conference on Green High Performance Computing (ICGHPC16)
  • 1st SYCL Programming Workshop (SYCL16, with PPoPP16) 
  • 12th Workshop on Dependability and Fault Tolerance (VERFE16, with ARCS 2016) 
  • 1st Workshop on Co-Scheduling of HPC Applications 2016 (COSH 2016, with HiPEAC 2016) 
  • 1st Workshop on Performance and Scalability of Storage Systems 2016 (WOPSSS, with ISC 2016) 
  • International Workshop on High Performance Platform Management 2016 (HPPM 2016)
  • Ninth Workshop on UnConventional High Performance Computing 2016 (UCHPC 2016, with Euro-Par 2016)
  • Sixth International Workshop on Parallel Software Tools and Tool Infrastructures (PSTI 2016)
  • International Workshop on Performance, Power and Energy-Efficiency Optimization in Heterogeneous Systems (PPEO 2016, with VECPAR16) 
  • 4th Workshop on Runtime and Operating Systems for the Many-core Era (ROME16, with Euro-Par 2016)
  • CF15, IPDPS15, ISPDC15, HIPC15, E-MuCoCoS15, UCHPC15, ROME15
  • ISPDC14, PSTI 2014, UCHPC14, ROME 2014, VERFE14
  • CF13, PARCO13, UCHPC13
  • SC11, Facing the Multicore-Challenge 2011, PROPER11, HipHac11

Other Organizational Roles

  • Reproducibility Chair, SC25
  • Steering Committee Member for Computing Frontiers Conference
  • General Chair, CF 2024
  • Publication Chair, CF 2019 - 2023
  • Program Chair, CF 2018
  • Local Arrangements Chair, ICS 2014 
  • Web Chair, Computing Frontiers 2014, SERESSA 2017 
  • Publicity Chair, Computing Frontiers 2013 
  • Special Session Chair, Computing Frontiers 2012
  • Publicity Chair, Computing Frontiers 2010 
  • Workshop Chair, Computing Frontiers 2009

Publications

2024

  • Amir Raoofy, Bengisu Elis, Vincent Bode, Minh Thanh Chung, Sergej Breiter, Maron Schlemon, Dennis-Florian Herr, Karl Fuerlinger, Martin Schulz, and Josef Weidendorfer. The BEAST LAB: A Practical Course on Experimental Evaluation of Diverse Modern HPC Architectures and Accelerators. Journal of Computational Science Education, volume 15. Shodor Education Foundation, 2024 (DOI:10.22369/issn.2153-4136/15/1/5).
  • Jakob Schäffeler, Bengisu Elis, Amir Raoofy, Josef Weidendorfer, and Martin Schulz. A Portable Tool to Compare Performance Profiles from GPU Offloading Programming Models. In Proceedings of the 21st ACM International Conference on Computing Frontiers. Association for Computing Machinery. New York, NY, USA, 2024 (DOI:10.1145/3649153.3652997).

2023

  • Minh Thanh Chung, Josef Weidendorfer, Karl Fürlinger, and Dieter Kranzlmüller. From reactive to proactive load balancing for task-based parallel applications in distributed memory machines. Concurrency and Computation: Practice and Experience. Wiley, 2023 (DOI:10.1002/cpe.7828).
  • Julian Scheipl, Amir Raoofy, Michael Ott, and Josef Weidendorfer. Phase-aware System-Side Sampling for HPC. In Proceedings of the 20th ACM International Conference on Computing Frontiers, pages 220-221. ACM, 2023 (DOI:10.1145/3587135.3592181).
  • Minh Thanh Chung, Josef Weidendorfer, Karl Fürlinger, and Dieter Kranzlmüller. Proactive Task Offloading for Load Balancing in Iterative Applications. In Roman Wyrzykowski, Jack Dongarra, Ewa Deelman, and Konrad Karczewski, editors, Parallel Processing and Applied Mathematics. Springer International Publishing. Cham, 2023.
  • Sergej Breiter, Josef Weidendorfer, Minh Thanh Chung, and Karl Fürlinger. A Profiling-Based Approach to Cache Partitioning of Program Data. In Hiroyuki Takizawa, Hong Shen, Toshihiro Hanawa, Jong Hyuk Park, Hui Tian, and Ryusuke Egawa, editors, Parallel and Distributed Computing, Applications and Technologies (PDCAT), pages 453-463. Springer Nature Switzerland. Cham, 2023.

2022

  • Amir Raoofy, Josef Weidendorfer, and Michael Ott. Always-on Instrumentation for Application Introspection in HPC. In Proceedings of the 19th ACM International Conference on Computing Frontiers. Association for Computing Machinery. New York, NY, USA, 2022 (DOI:10.1145/3528416.3530863).
  • Eishi Arima, Minjoon Kang, Issa Saba, Josef Weidendorfer, Carsten Trinitis, and Martin Schulz. Optimizing Hardware Resource Partitioning and Job Allocations on Modern GPUs under Power Caps. In Workshop Proceedings of the 51st International Conference on Parallel Processing. Association for Computing Machinery. New York, NY, USA, 2022 (DOI:10.1145/3547276.3548630).

2021

2020

  • Minh Thanh Chung, Josef Weidendorfer, Philipp Samfass, Karl Fuerlinger, and Dieter Kranzlmüller. Scheduling across Multiple Applications using Task-Based Programming Models. In Proceedings of the Fourth Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware (IPDRM 2020). IEEE. Atlanta, US, November 2020.

2019

  • Amir Raoofy, Dai Yang, Josef Weidendorfer, Carsten Trinitis, and Martin Schulz. Enabling Malleability for Livermore Unstructured Lagrangian Explicit Shock Hydrodynamics using LAIK. In Proceedings of the 29th PARS Workshop (PARS 2019). Berlin, DE, 2019.
  • David Boehme, Kevin Huck, Jonathan Madsen, and Josef Weidendorfer. The Case for a Common Instrumentation Interface for HPC Codes. In Proceedings of the 2019 IEEE/ACM International Workshop on Programming and Performance Visualization Tools (ProTools 2019). Denver, US, 2019.
  • Abhinav Bhatele, David Boehme, Tom Vierjahn, and Josef Weidendorfer, editors. Proceedings of the IEEE/ACM International Workshop on Programming and Performance Visualization Tools (ProTools 2019). IEEE. Denver, US, November 2019.
  • Josef Weidendorfer, Carla Guillen, and Michael Ott. Proceedings of the 13th Parallel Tools Workshop, chapter System-wide Low-frequency Sampling for Large HPC Systems. Dresden, DE, 2019.

2018

  • Carsten Trinitis and Josef Weidendorfer, editors. Proceedings of the 3rd Workshop on Co-Scheduling of HPC Applications (COSH 2018). TUM Library. Manchester, UK, January 2018 (DOI:10.14459/2018md1428535).
  • Minh Lê and Josef Weidendorfer. A Message-Passing based Algorithm for k-Terminal Reliability. In 14th European Dependable Computing Conference (EDCC 2018). Iasi, RO, 2018.

2017

  • Carsten Clauss, Stefan Lankes, Carsten Trinitis, and Josef Weidendorfer, editors. Proceedings of the Joined Workshops COSH 2017 and VisorHPC 2017. TUM Library, January 2017 (DOI:10.14459/2017md1344297).
  • Tilman Küstner, Carsten Trinitis, Josef Weidendorfer, Andreas Blaszczyk, Patrik Kaufmann, and Marcus Johansson. On the Applicability of Virtualization in an Industrial HPC Environment. In 1st Workshop on Virtualization Solutions for High-Performance Computing (VisorHPC'17). Stockholm, SE, 2017.
  • Carsten Trinitis and Josef Weidendorfer, editors. Co-Scheduling of HPC Applications. Volume 28 of Advances in Parallel Computing. IOS Press. Amsterdam, The Netherlands, January 2017.
  • Carsten Trinitis, Josef Weidendorfer, and André Brinkmann. Co-Scheduling of HPC Applications, chapter Co-Scheduling: Prospects and Challenges, pages 1-11. IOS Press. Amsterdam, The Netherlands, 2017.
  • Jens Breitbart and Josef Weidendorfer. Co-Scheduling of HPC Applications, chapter Detailed Application Characterization and Its Use for Effective Co-Scheduling, pages 69-94. IOS Press. Amsterdam, The Netherlands, 2017.
  • Alexis Engelke and Josef Weidendorfer. Using LLVM for Optimized Light-Weight Binary Re-Writing at Runtime. In Proceedings of the 22nd int. Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS 2017). Orlando, US, 2017 (Slides).
  • Josef Weidendorfer, Dai Yang, and Carsten Trinitis. LAIK: A Library for Fault Tolerant Distribution of Global Data for Parallel Applications. In Proceedings of the 27th PARS Workshop (PARS 2017). Hagen, DE, 2017.
  • Carsten Trinitis and Josef Weidendorfer. Cachepartionierung im Kontext von Co-Scheduling. In Proceedings of the 27th PARS Workshop (PARS 2017). Hagen, DE, 2017.
  • Frédéric Desprez, Pierre-François Dutot, Christos Kaklamanis, Loris Marchal, Korbinian Molitorisz, Laura Ricci, Vittorio Scarano, Miguel A. Vega-Rodriguez, Ana Lucia Varbanescu, Sascha Hunold, Stephen L. Scott, Stefan Lankes, and Josef Weidendorfer, editors. Euro-Par 2016: Parallel Processing Workshops, volume 10104 of Lecture Notes in Computer Science. Springer, 2017.
  • Jens Breitbart, Simon Pickartz, Josef Weidendorfer, Stefan Lankes, and Antonello Monti. Dynamic Co-scheduling Driven by Main Memory Bandwidth Utilization. In 2016 IEEE International Conference on Cluster Computing (CLUSTER 2017), September 2017.
  • Dai Yang, Josef Weidendorfer, Carsten Trinitis, Tilman Küstner, and Sibylle Ziegler. Enabling Application-Integrated Proactive Fault Tolerance. In ParCo 2017: Proceedings of Parallel Computing 2017. IOS Press. Bologna, Italy, September 2017.

2016

  • Carsten Trinitis and Josef Weidendorfer, editors. Proceedings of the 1st COSH Workshop on Co-Scheduling of HPC Applications. TUM Library, January 2016 (DOI:10.14459/2016md1285489).
  • Josef Weidendorfer and Jens Breitbart. Detailed Characterization of HPC Applications for Co-Scheduling. In Proceedings of the first Workshop on Co-Scheduling of HPC Applications (COSH 2016). Prague, Czech Republic, 2016.
  • Josef Weidendorfer and Jens Breitbart. The Case for Binary Rewriting at Runtime for Efficient Implementation of High-Level Programming Models in HPC. In Proceedings of the 21st int. Workshop on High-Level Parallel Programming Models and Supportive Environments (HIPS 2016). Chicago, US, 2016 (Slides).
  • Josef Weidendorfer and Jens Breitbart. Inclusive Cost Attribution for Cache Use Profiling. In Proceedings of the Workshop for Tools for Program Development and Analysis in Computational Science at ICCS 2016 (TOOLs 2016). San Diego, US, 2016.
  • Jens Breitbart, Josef Weidendorfer, and Carsten Trinitis. Automatic Co-scheduling based on Main Memory Bandwidth Usage. In Proceedings of the 20th Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP 2016). Chicago, US, 2016.
  • Josef Weidendorfer. Simulation Driven Performance Analysis for Software Optimization. Technische Universität München, 2016. Habilitation Thesis (DOI:10.14459/2015MD1303711).
  • Jens Breitbart, Simon Pickartz, Josef Weidendorfer, and Antonello Monti. Viability of Virtual Machines in HPC. In 4th Workshop on Runtime and Operating Systems for the Many-core Era (ROME 2016). Grenoble, FR, 2016.
  • Jesus Carretero, Javier Garcia-Blas, Victor Gergel, Vladimir Voevodin, Iosif Meyerov, Juan A. Rico-Gallego, Juan C. Díaz-Martín, Pedro Alonso, Juan Durillo, José Daniel Garcia Sánchez, Alexey L. Lastovetsky, Fabrizio Marozzo, Qin Liu, Zakirul Alam Bhuiyan, Karl Fürlinger, Josef Weidendorfer, and José Gracia, editors. ICA3PP 2016 Collocated Workshops: SCDT, TAPEMS, BigTrust, UCER, DLMCS. Algorithms and Architectures for Parallel Processing, volume 10049 of Lecture Notes in Computer Science. Springer, 2016.

2015

  • Jens Breitbart, Josef Weidendorfer, and Carsten Trinitis. Case Study on Co-Scheduling for HPC Applications. In International Workshop on Scheduling and Resource Management for Parallel and Distributed Systems (SRMPDS 2015). Beijing, China, 2015.
  • S. Hunold, A. Costan, D. Giménez, A. Iosup, L. Ricci, M.E. Gómez Requena, V. Scarano, A.L. Varbanescu, S.L. Scott, S. Lankes, J. Weidendorfer, and M. Alexander, editors. Proceedings of UCHPC15. In Euro-Par 2015: Parallel Processing Workshops, volume 9523 of Lecture Notes in Computer Science. Springer, 2015.

2014

  • Minh Le, Max Walter, and Josef Weidendorfer. Improving the Kuo-Lu-Yeh algorithm for assessing Two-Terminal Reliability. 10th European Dependable Computing Conference (EDCC 2014). Newcastle, UK, 2014.
  • Minh Le, Josef Weidendorfer, and Max Walter. A Novel Variable Ordering Heuristic for BDD-Based k-Terminal Reliability. 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2014). Atlanta, US, 2014.
  • Dieter an Mey, Michael Alexander, Paolo Bientinesi, Mario Cannataro, Carsten Clauss, Alexandru Costan, Gabor Kecskemeti, Christine Morin, Laura Ricci, Julio Sahuquillo, Martin Schulz, Vittorio Scarano, Stephen L. Scott, and Josef Weidendorfer, editors. Proceedings of UCHPC13. Euro-Par 2013: Parallel Processing Workshops. BigDataCloud, DIHC, FedICI, HeteroPar, HiBB, LSDVE, MHPC, OMHI, PADABS, PROPER, Resilience, ROME, and UCHPC 2013. Aachen, Germany, August 26-27, 2013. Revised Selected Papers, volume 8374 of Lecture Notes in Computer Science. Springer, 2014.

2013

  • David Büttner, Jean-Thomas Aquaviva, and Josef Weidendorfer. Real Asynchronous MPI Communication in Hybrid Codes through OpenMP Communication Tasks. In 19th IEEE International Conference on Parallel and Distributed Systems. Seoul, Korea, 2013.
  • Josef Weidendorfer. Data Transfer Requirement Analysis with Bandwidth Curves. In 6th Workshop on Productivity and Performance. Euro-Par 2013 Workshops. Aachen, Germany, 2013.
  • Ioannis Caragiannis, Michael Alexander, Rosa Maria Badia, Mario Cannataro, Alexandru Costan, Marco Danelutto, Frédéric Desprez, Bettina Krammer, Julio Sahuquillo, Stephen L. Scott, and Josef Weidendorfer, editors. Euro-Par 2012: Parallel Processing Workshops. BDMC, CGWS, HeteroPar, HiBB, OMHI, Paraphrase, PROPER, Resilience, UCHPC, VHPC, Rhodes Islands, Greece, August 27-31, 2012. Revised Selected Papers. Euro-Par Workshops, volume 7640 of Lecture Notes in Computer Science. Springer, 2013.
  • Thomas Müller, Josef Weidendorfer, and Andreas Blaszczyk. Expression Tree Evaluation by Dynamic Code Generation - Are Accelerators up for the Task? In Proceedings of the 2013 International Conference on Parallel Processing (ICPP-2013). Lyon, France, 2013.

2012

  • Alin F. Murarasu, Gerrit Buse, Dirk Pflüger, Josef Weidendorfer, and Arndt Bode. fastsg: A Fast Routines Library for Sparse Grids. In Proceedings of the International Conference on Computational Science (ICCS 2012), volume 9 of Procedia CS, pages 354-363, 2012.
  • A. F. Murarasu and J. Weidendorfer. Building Input Adaptive Parallel Applications: A Case Study of Sparse Grid Interpolation. In Proceedings of the 15th IEEE International Conference on Computational Science and Engineering (ICCSE), 2012.
  • Michael Gerndt, Frank Hannig, Andreas Herkersdorf, Andreas Hollmann, Marcel Meyer, Sascha Roloff, Josef Weidendorfer, Thomas Wild, and Aurang Zaib. An integrated Simulation Framework for Invasive Computing. In Forum on specification and Design Languages (FDL 2012), pages 209-216, 2012.
  • Michael Gerndt, Andreas Hollmann, Marcel Meyer, Martin Schreiber, and Josef Weidendorfer. Invasive computing with iOMP. In Forum on specification and Design Languages (FDL 2012), pages 225-231, 2012.
  • Michael Alexander, Pasqua D'Ambra, Adam Belloum, George Bosilca, Mario Cannataro, Marco Danelutto, Beniamino Di Martino, Michael Gerndt, Emmanuel Jeannot, Raymond Namyst, Jean Roman, Stephen L. Scott, Jesper Larsson Träff, Geoffroy Vallée, and Josef Weidendorfer, editors. Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29 - September 2, 2011, Revised Selected Papers. Euro-Par Workshops, volume 7155 of Lecture Notes in Computer Science. Springer, 2012.
  • Minh Lê, Max Walter, and Josef Weidendorfer. A Memory-efficient Bounding Algorithm for the Two-terminal Reliability Problem. In Second Workshop on Quantitative Models for Performance and Dependability (QMPD 2012), volume 291 of Electronic Notes in Theoretical Computer Science, pages 15-25. Elsevier, 2012.

2011

  • Alin Murarasu, Josef Weidendorfer, and Arndt Bode. Workload Balancing on Heterogeneous Systems: A Case Study of Sparse Grid Interpolation. In The 4th Workshop on UnConventional High Performance Computing 2011 (UCHPC 2011). Springer, 2011.
  • Michael Bader, Hans-Joachim Bungartz, Michael Gerndt, Andreas Hollmann, and Josef Weidendorfer. Invasive Programming as a Concept for HPC. In Proceedings of the 10th IASTED International Conference on Parallel and Distributed Computing and Networks 2011 (PDCN 2011). Innsbruck, Austria, 2011.
  • Alin Murarasu, Josef Weidendorfer, Gerrit Buse, Daniel Butnaru, and Dirk Pflüger. Compact Data Structure and Scalable Algorithms for the Sparse Grid Technique. In Proceedings of the 16th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'11), February 2011.
  • Josef Weidendorfer, Tilman Küstner, and Sally A. McKee. Performance Optimization by Dynamic Code Transformation. In Proceedings of Computing Frontiers 2011. ACM Press, 2011. Poster Abstract.
  • Josef Weidendorfer. Intel Core Microarchitecture, x86 Processor Family. In David Padua, editor, Encyclopedia of Parallel Computing. Springer, 2011.

2010

  • Alexander Heinecke, Carsten Trinitis, and Josef Weidendorfer. Porting existing cache-oblivious Linear Algebra HPC Modules to Larrabee Architecture. In Proceedings of Computing Frontiers 2010. ACM Press, May 2010.
  • Hans Hacker, Carsten Trinitis, Josef Weidendorfer, and Matthias Brehm. Considering GPGPU for HPC Centers: Is it Worth the Effort? In Proceedings of Conference for Young Scientists: Facing the Multicore-Challenge. Heidelberg, 2010.
  • Carsten Trinitis, Tilman Küstner, Josef Weidendorfer, and J. Smajic. Sparse Matrix Operations on Multi-Core Architectures. Journal of Supercomputing (selected papers of PaCT-2009). Springer, 2010 (DOI:10.1007/s11227-010-0428-9).
  • Tilman Küstner, Peter Pedron, Jasmine Schirmer, Melanie Hohberg, Josef Weidendorfer, and Sibylle I. Ziegler. Fast System Matrix Generation Using the Detector Response Function Model on Nvidia Fermi GPUs. In Proceedings of IEEE Medical Imaging Conference 2010. IEEE, 2010. Extended Abstract.

2009

  • Tilman Küstner, Josef Weidendorfer, and Tobias Weinzierl. Argument Controlled Profiling. In Proceedings of 2nd Workshop on Productivity and Performance (PROPER 2009). Springer, 2009.
  • Tilman Küstner, Josef Weidendorfer, Jasmin Schirmer, T. Klug, C. Trinitis, and S. Ziegler. Parallel MLEM on Multicore Architectures. In Proceedings of 9th International Conference on Computational Science (ICCS 2009), number 5544 of LNCS, pages 491-500. Springer, 2009.
  • Michael Bader and Josef Weidendorfer. Exploiting Memory Hierarchies in Scientific Computing. Extended Abstract. In Proceedings of the 2009 International Conference on High Performance Computing and Simulation (HPCS2009). IEEE. Leipzig, Germany, 2009.
  • Carsten Trinitis, Tilman Küstner, Josef Weidendorfer, and J. Smajic. Sparse Matrix Operations on Multi-Core Architectures. In Proceedings of 11th International Conference on Parallel Computing Technologies (PaCT 2009). Springer, 2009.
  • Stephan M. Günther and Josef Weidendorfer. Assessing cache false sharing effects by dynamic binary instrumentation. In WBIA '09: Proceedings of the Workshop on Binary Instrumentation and Applications, pages 26-33. ACM Press. New York, 2009.

2008

  • Tobias Klug, Michael Ott, Josef Weidendorfer, and Carsten Trinitis. Autopin - Automated Optimization of Thread-to-Core Pinning on Multicore Systems. Transactions on High-Performance Embedded Architectures and Compilers (Transactions on HiPEAC), volume 3 (4), pages 219-235, 2008.
  • Josef Weidendorfer and Carsten Trinitis. Off-loading application controlled data prefetching in numerical codes for multicore processors. International Journal of Computational Science and Engineering (IJCSE), volume 4 (1), pages 22-28, 2008.
  • Josef Weidendorfer. Tools for high performance computing: Proceedings of the 2nd International Workshop on Parallel Tools for High Performance Computing, chapter Sequential Performance Analysis with Callgrind and KCachegrind. Springer. Stuttgart, July 2008.
  • Michael Ott, Tobias Klug, Josef Weidendorfer, and Carsten Trinitis. autopin - Automated Optimization of Thread-to-Core Pinning on Multicore Systems. Proceedings of 1st Workshop on Programmability Issues for Multi-Core Computers (MULTIPROG), 2008.
  • Julian Seward, Nick Nethercote, and Josef Weidendorfer. Valgrind 3.3. Advanced Debugging and Profiling for GNU/Linux applications. Network Theory Limited. UK, March 2008.

2007

  • Josef Weidendorfer, Michael Ott, Tobias Klug, and Carsten Trinitis. Latencies of Conflicting Writes on Contemporary Multi-core Architectures. In Ninth International Conference on Parallel Computing Technologies (PaCT-2007), number 4671 of LNCS, pages 318-327. Springer. Pereslavl-Zalessky, Russia, 2007.
  • Josef Minde, Josef Weidendorfer, Tobias Klug, and Carsten Trinitis. PET-Bildrekonstruktion auf der Cell-BE. In Tagungsband Kommunikation in Clusterrechnern und Clusterverbundsystemen, 3. Tagung. RWTH Aachen. Aachen, Germany, December 2007.
  • Josef Weidendorfer, Michael Ott, Tobias Klug, and Carsten Trinitis. False Sharing auf aktuellen Mikroprozessoren. In Tagungsband 2. Workshop "Kommunikation in Clusterrechnern und Clusterverbundsystemen" (KiCC), volume ISSN 0947-5125. Chemnitz, Germany, February 2007.
  • Josef Weidendorfer. Understanding Memory Access Bottlenecks on Multi-core. In Book of Abstracts of the International Conference ParCo 2007, volume 37 of NIC Series. Forschungszentrum Jülich, 2007.

2006

  • Josef Weidendorfer and Carsten Trinitis. Block Prefetching for Numerical Codes. In Proceedings of 19th Symposium on Simulation Techniques (ASIM 2006). Hannover, Germany, September 2006.
  • Carsten Trinitis, Tobias Klug, Max Walter, and Josef Weidendorfer. Automatic High Voltage Apparatus Optimization. Dresden, Germany, June 2006.
  • Josef Weidendorfer and Carsten Trinitis. Cache Optimizations for Iterative Numerical Codes Aware of Hardware Prefetching. In Applied Parallel Computing, State of the Art in Scientific Computing, 7th International Workshop, PARA 2004, Lyngby, Denmark, June 20-23, 2004, Revised Selected Papers, volume 3732 of LNCS, pages 921-927. Springer, 2006 (Slides).

2005

  • Josef Weidendorfer and Carsten Trinitis. Collecting and Exploiting Cache-Reuse Metrics. In ICCS 2005: 5th International Conference on Computational Science, volume 3515 of LNCS, pages 191-198. Springer, May 2005 (Slides).
  • Jie Tao, Jürgen Jeitner, Carsten Trinitis, Wolfgang Karl, and Josef Weidendorfer. Comprehensive Cache Inspection with Hardware Monitors. In Proceedings of 9th International Conference on Parallel Computing Technologies (PaCT 2005), number 3606 of LNCS, pages 331-345. Springer, 2005.

2003

  • Josef Weidendorfer. Konzepte zur Optimierung der Skalierbarkeit von parallelen Fahrzeugkollisionsberechnungen und ihre industrielle Realisierbarkeit. Dissertation. Volume 29 of Research Report Series, Lehrstuhl für Rechnertechnik und Rechnerorganisation, Technische Universität München. Shaker Publishing, 2003.

2001

  • Josef Weidendorfer and Peter Luksch. A Framework for Transparent Load Balancing in Parallel Numerical Simulation. In Proceedings of the 34th annual symposium on Simulation, pages 125-132. IEEE Computer Society. Washington, DC, USA, 2001.

1999

  • Hermann Hellwagner and Josef Weidendorfer. SCI Sockets Library. In SCI: Scalable Coherent Interface, Architecture and Software for High-Performance Compute Clusters, volume 1734 of LNCS, pages 209-229. Springer. London, UK, 1999.
  • Josef Weidendorfer. Load balancing Contact 36 in PAMCRASH MPP. In Proceedings of EURO-PAM'99. Darmstadt, Germany, 1999.

1998

  • Michael Eberl, Hermann Hellwagner, Wolfgang Karl, Markus Leberecht, and Josef Weidendorfer. Fast Communication Libraries on a SCI Cluster. In Hermann Hellwagner and Alexander Reinefeld, editors, Scalable Coherent Interface: Technology and Applications. Proceedings of SCI Europe'98, pages 165-175. Cheshire Henbury, Tamworth House, P.O. Box 103, Macclesfield SK11 8UW, UK, 1998.

1997

  • Josef Weidendorfer. Entwurf und Implementierung einer Socket-Bibliothek für ein SCI-Netzwerk. Diploma thesis. Institut für Informatik, Technische Universität München. February 1997.