Seminars
Time-Division-Multiplexed Networks-on-Chip
Description
Overview:
Modern MPSoCs rely heavily on efficient and scalable interconnects. However fast or numerous the processors may be, the system cannot take advantage of these compute resources unless data and messages can be shared effectively. For this reason, networks-on-chip (NoCs) are a vital part of the design of modern SoCs. NoCs are highly scalable while still achieving low latency and high bandwidth utilisation.
However, current NoCs are not always suited for time-sensitive applications. Standard NoC designs use a "best effort" approach; this offers good average performance and can be used with the vast majority of workloads without requiring any modification of NoC components. However, best-effort NoCs offer no guarantee that a given transaction completes within a given timeframe, which makes them wholly unsuited for real-time systems with hard deadlines.
A well-known alternative to best effort is time-division multiplexing (TDM). In TDM NoCs, a global schedule is constructed and each node is allocated certain time slots in which it may transmit. Transmission times for a given program can therefore be determined exactly at compile time.
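The determinism of a TDM schedule can be illustrated with a small sketch. The slot table below is a hypothetical example (nodes "A" to "D", period of six slots), not taken from any of the cited NoCs; it shows how a node's worst-case wait for its next allocated slot follows directly from the table and is thus known at compile time.

```python
# Hypothetical TDM slot table: each entry of the repeating schedule
# names the node allowed to transmit in that slot.
SLOT_TABLE = ["A", "B", "C", "A", "D", "B"]  # period = 6 slots

def worst_case_wait(node, table):
    """Worst-case number of slots `node` waits until its next
    allocated slot, over all release times within one period.
    `node` must appear in `table` at least once."""
    period = len(table)
    waits = []
    for start in range(period):
        # distance from `start` to the next slot owned by `node`
        wait = next(d for d in range(period)
                    if table[(start + d) % period] == node)
        waits.append(wait)
    return max(waits)

# Node "A" owns two slots, so its worst-case wait is short;
# node "C" owns only one slot, so it may wait nearly a full period.
print(worst_case_wait("A", SLOT_TABLE))  # 2
print(worst_case_wait("C", SLOT_TABLE))  # 5
```

Because the schedule is fixed, such bounds hold regardless of the traffic other nodes generate, which is exactly the property hard real-time systems require.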
Task:
For this seminar, the student will investigate TDM and mixed best-effort/TDM NoCs, with the goal of exploring and summarising state-of-the-art TDM NoC techniques, as well as the performance trade-offs of TDM NoCs compared to standard best-effort NoCs.
Relevant literature:
R. A. Stefan, A. Molnos and K. Goossens, "dAElite: A TDM NoC Supporting QoS, Multicast, and Fast Connection Set-Up," in IEEE Transactions on Computers, vol. 63, no. 3, pp. 583-594, March 2014
M. Schoeberl, F. Brandner, J. Sparsø and E. Kasapaki, "A Statically Scheduled Time-Division-Multiplexed Network-on-Chip for Real-Time Systems," 2012 IEEE/ACM Sixth International Symposium on Networks-on-Chip, Lyngby, Denmark, 2012
S. Hesham, D. Goehringer and M. A. Abd El Ghany, "HPPT-NoC: A Dark-Silicon Inspired Hierarchical TDM NoC with Efficient Power-Performance Trading," in IEEE Transactions on Parallel and Distributed Systems, vol. 31, no. 3, pp. 675-694, 1 March 2020
Contact
William Wulff
william.wulff@tum.de
Supervisor:
On-The-Fly Lossless Data Compression Techniques
Description
As systems-on-chip (SoCs) integrate increasing numbers of processing elements, whether separate CPU cores in traditional SoCs or distinct processing dies in chiplet-based architectures, the bandwidth requirements between these and other system elements keep rising. While higher bandwidth can be achieved by scaling the transfer rate and the number of parallel transmission lanes, as seen across PCIe's generations, the area and power consumption of the interconnect rise as well.
Another approach to managing interconnect load is to reduce the amount of data to transfer in the first place. Besides optimizing the system architecture and applications to require fewer transfers between system components, on-the-fly data compression can be used in systems where area and power consumption are more critical than transmission latency. After data is generated or received as input, it can be compressed before being transmitted, particularly over longer-distance links. On the receiving end, it can either be decompressed or used as is, depending on the application and the destination element. Example use cases are compressing sensor values or camera images before they are processed or stored in memory on the other side of the link, or compressing (sparse) matrices into a format suitable for processing by AI applications.
Many algorithms for lossless data compression exist. Those prioritizing compression and/or decompression speed over compression ratio are most relevant for a system as described. A potential candidate is the LZ4 (Lempel-Ziv 4) algorithm. This seminar work should investigate the viability of this and other lossless compression algorithms, how data should be structured for efficient operation, and which applications could especially benefit from this approach compared to more classical methods of handling high interconnect bandwidth requirements. Further literature research could look into hardware implementations of the considered algorithms.
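As a minimal illustration of the lossless roundtrip property and the speed/ratio trade-off, the sketch below uses Python's standard-library zlib (DEFLATE) as a stand-in for fast codecs such as LZ4; the sample data and compression levels are illustrative choices, not measurements from the cited works.

```python
import zlib

def roundtrip(data: bytes, level: int) -> int:
    """Compress, then verify exact reconstruction; return compressed size."""
    compressed = zlib.compress(data, level)
    restored = zlib.decompress(compressed)
    assert restored == data  # lossless: bit-exact reconstruction
    return len(compressed)

# Repetitive sensor-like data compresses well; level 1 favors speed,
# level 9 favors compression ratio.
sample = b"sensor:0421;" * 256
size_fast = roundtrip(sample, level=1)
size_best = roundtrip(sample, level=9)
print(len(sample), size_fast, size_best)
```

A system favoring link bandwidth would pick the higher ratio; a system where compression latency sits on the critical path would pick the faster setting, which is the trade-off the seminar is meant to explore.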
Potential starting points could be the following papers:
https://ieeexplore.ieee.org/abstract/document/1549812
https://ieeexplore.ieee.org/abstract/document/7818601
https://koreascience.kr/article/JAKO201313660603091.page
https://cdn.zeropoint-tech.com/f/174713/x/2ef77c7d31/ziptilion-memorycompression-ip-zeropoint-technology-whitepaper-2023-10-18-ver-2-6.pdf
Contact
michael.meidinger@tum.de
Supervisor:
High Dynamic Range Camera Sensors for Advanced Driver Assistance Systems and Autonomous Drive
Description
Camera sensors are an important input to Advanced Driver Assistance Systems (ADAS) and Autonomous Drive (AD) in cars. A challenge for these sensors is the very high dynamic range of the input signal and the variation in illumination of the environment. The candidate should work on understanding the principles of high dynamic range (HDR) image capture, different pixel technologies for HDR sensing, exposure control for HDR images, relations to LED flicker mitigation, algorithms to create HDR images from the captured input data, and algorithms to compress high dynamic range images for display to a human driver or a vision processing system.
Contact
Dr. Stephan Herrmann
NXP Semiconductors Germany, Munich
Email: stephan.herrmann@nxp.com
Supervisor:
Modern GPU Synchronization Methods in Parallel Computing
GPU, multi-threading, synchronization
Description
As GPU architectures continue to evolve, their ability to execute thousands of parallel threads has become fundamental to accelerating workloads in fields such as deep learning, scientific computing, and real-time graphics. However, this massive parallelism introduces significant challenges in coordinating thread execution and data access across GPU cores and multiple GPUs. Effective synchronization is therefore critical to ensure correct program behaviour, maximize hardware utilization, and achieve optimal performance.
This seminar topic focuses on investigating modern GPU synchronization methods, which provide the necessary mechanisms to coordinate parallel execution while minimizing overhead. A starting point of literature will be provided.
Through this seminar, participants are expected to gain deeper insight into parallel execution and GPU synchronization, preparing them to tackle synchronization challenges in high-performance computing, heterogeneous system design, and GPU programming.
Prerequisites
A fundamental understanding of how GPUs work
Contact
shichen.huang@tum.de
Supervisor:
Categorization of Ethernet-Detected Anomalies Induced by Processing Unit Deviations
Description
Sporadic anomalies in automotive systems can degrade performance over time and may originate from various system components. In automotive applications, anomalies are often observed at the sensor and ECU levels, with potential propagation through the in-vehicle network via Ethernet. Such anomalies may be the result of deviations in electronic control units, highlighting the importance of monitoring these signals over Ethernet.
Not all processing anomalies are equally detectable over Ethernet due to inherent limitations in the monitoring techniques and the nature of the anomalies. This seminar will explore various anomaly categories, investigate their potential causes, and assess the likelihood of their propagation through the network.
The goal of this seminar is to provide a comprehensive analysis of these anomaly categories, evaluate the underlying causes, and discuss the potential for their detection and mitigation when monitored over Ethernet.
Contact
Zafer Attal
zafer.attal@tum.de
Supervisor:
Comparative Analysis of Local vs. Cloud Processing Approaches
Description
In today’s data-driven world, processing approaches are typically divided between cloud-based solutions—with virtually unlimited resources—and localized processing, which is constrained by hardware limitations. While the cloud offers extensive computational power, localized processing is often required for real-time applications where latency and data security are critical concerns.
To bridge this gap, various algorithms have been developed to pre-process data or extract essential information before it is sent to the cloud.
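The idea of local pre-processing can be sketched with a hypothetical example: instead of streaming raw samples to the cloud, the local node extracts a few compact features and transmits only those. The feature set here (count, mean, maximum) is an illustrative placeholder for the algorithms the seminar will survey.

```python
def extract_features(samples: list[float]) -> dict:
    """Reduce a raw sample stream to a small summary for upstream transfer."""
    return {
        "count": len(samples),
        "mean": sum(samples) / len(samples),
        "max": max(samples),
    }

raw = [0.1 * i for i in range(1000)]  # 1000 raw values captured locally
features = extract_features(raw)      # only 3 values sent to the cloud
print(features)
```

The trade-off is exactly the one the seminar examines: the summary costs local compute but shrinks the transmitted payload by orders of magnitude, at the price of discarding information the cloud side can no longer recover.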
The goal of this seminar is to explore and compare these algorithms, evaluating their computational load on local hardware and their overall impact on system performance.
Contact
Zafer Attal
zafer.attal@tum.de
Supervisor:
Analysis Algorithms for Processor Traces and Instructions
Description
Modern CPUs execute a vast number of instructions while managing large volumes of data. On-chip debugging modules, located adjacent to the CPU, play a critical role in capturing valuable execution information. This data is essential for analyzing system behavior and detecting anomalies—such as timing issues or execution faults—that may occur in the processing unit.
Over time, various algorithms have been developed to analyze processor traces and instructions. These algorithms not only deepen our understanding of system behavior but also support the debugging of potential faults and anomalies.
The goal of this seminar is to explore and compare different trace analysis algorithms, evaluating their efficiency, performance, and potential applications in debugging and optimizing processor operations.
Contact
Zafer Attal
zafer.attal@tum.de