Wissenschaftliches Seminar Integrierte Systeme
| Lecturer (contributors) | |
|---|---|
| Type | Seminar |
| Scope | 3 SWS |
| Semester | Winter semester 2025/26 |
| Language of instruction | German |
Dates
Participation Criteria
Note: Limited number of participants! Registration in TUMonline from 22.09.2025 to 19.10.2025. Students must choose a seminar topic by 31 October 2025; to do so, please contact the supervisor of the topic you are interested in. Topics are assigned on a first-come, first-served basis. The individual topics will be announced from 06.10.2025 at https://www.ce.cit.tum.de/lis/lehre/seminare/seminar-integrierte-systeme/. Attendance at the introductory session on Thursday, 23 October 2025, at 10:30 in room N2128 is mandatory in order to secure your topic. Three requirements must be met to successfully enroll in this course: (1) registration via TUMonline, (2) attendance at the introductory session, and (3) confirmation of a topic by a supervisor.
Learning Objectives
Description
Participants independently work through current scientific contributions, prepare a written report that is graded, present their contribution in a colloquium, and take part in the colloquium discussions.
Prerequisites (Subject Matter)
as well as their applications.
Teaching and Learning Methods
Depending on their individual topic, each participant is assigned a dedicated supervisor. The supervisor supports the student particularly at the start of the work by introducing the subject, providing suitable literature, and giving helpful advice both on the technical work and on preparing the written report and the presentation.
Once all three conditions described in the "Participation Criteria & Registration" section are met, you may optionally attend the writing and presentation workshop of the EDA chair (15:00–16:30, room 2999, held in English):
03.11.2025: Scientific Writing
17.11.2025: Presentation Training
(Due to limited room capacity, only a few places are available; allocation on a first-come, first-served basis.)
Assessment
- 50% written report (typically 4 pages)
- 50% presentation: 15 minutes plus 5 minutes of discussion
Recommended Literature
Links
Offered Topics
Assigned Topics
Low-power Asynchronous Neural Networks
Description
Neural networks (NNs) have improved dramatically over the last decades and have consequently been adopted for a multitude of applications. While far more capable in certain areas than prior solutions, NNs have one major drawback.
Neural networks require much more power than traditional computational models, making them generally unsuitable for embedded devices. Their rapid adoption also poses challenges for high-performance models, as the processing power required for widespread use strains the existing power grid, with construction of AI data centers significantly outpacing construction of new power plants. Clearly, this growth is unsustainable unless these challenges are addressed.
In part to address these issues, research has been ongoing into techniques that avoid the high computational cost and power dissipation of standard neural networks such as convolutional neural networks (CNNs). For event-driven computation in particular, models such as spiking neural networks (SNNs) and asynchronous neural networks offer potentially significant benefits: since event-driven applications only require computation once a new event occurs, power can be saved by being active only when a computation is pending. Asynchronous circuits take this idea to the extreme by avoiding all dynamic power dissipation except when subcircuits have valid inputs available.
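To illustrate the event-driven principle described above, the following sketch models a leaky integrate-and-fire (LIF) neuron that performs work only when an input spike arrives. The class, parameter values, and update rule are illustrative assumptions, not taken from any of the cited designs.

```python
# Minimal sketch (illustrative, not from any cited paper): a leaky
# integrate-and-fire neuron that is updated only when an input event
# (spike) arrives, showing why event-driven models can idle between inputs.
import math

class LIFNeuron:
    def __init__(self, tau=10.0, threshold=1.0):
        self.tau = tau              # membrane time constant
        self.threshold = threshold  # firing threshold
        self.v = 0.0                # membrane potential
        self.last_t = 0.0           # time of the last processed event

    def on_event(self, t, weight):
        """Process one input spike at time t; return True if the neuron fires."""
        # Apply the leak for the whole idle interval in one closed-form step,
        # instead of integrating at every clock tick as a synchronous design would.
        self.v *= math.exp(-(t - self.last_t) / self.tau)
        self.last_t = t
        self.v += weight
        if self.v >= self.threshold:
            self.v = 0.0  # reset after firing
            return True
        return False

neuron = LIFNeuron()
# Two closely spaced spikes push the potential over threshold; a third spike
# arriving much later finds a fully decayed membrane and does not fire.
spikes = [neuron.on_event(t, 0.6) for t in (1.0, 2.0, 50.0)]
```

Between events the neuron does nothing at all; the exponential decay is folded into the next update, which is exactly the property that lets asynchronous hardware gate off dynamic power while idle.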
Task
For this seminar topic, the student is expected to survey the state of the art in asynchronous neural networks and provide a summary of relevant research. Papers that could serve as starting points are listed below, but the student is free to pursue the topic as they wish, within the scope given in this description.
Starting points
- A 28nm Configurable Asynchronous SNN Accelerator with Energy-Efficient Learning
- DYNAP-SE2: a scalable multi-core dynamic neuromorphic asynchronous spiking neural network processor
- Design and Tool Flow of a Reconfigurable Asynchronous Neural Network Accelerator
- A 2048-Neuron Spiking Neural Network Accelerator with Neuro-Inspired Pruning and Asynchronous Network on Chip in 40nm CMOS
Supervisor:
The PULP Platform and Its Efforts Around the CVA6 RISC-V Core
Description
The Parallel Ultra-Low Power (PULP) platform is an open-source hardware and software ecosystem developed by ETH Zürich and the University of Bologna to explore energy-efficient computing. It provides scalable multi-core architectures, SoC components, and toolchains designed for applications where power consumption is critical, such as edge AI, IoT, and embedded sensing.
At its core, PULP focuses on parallelism and low-power techniques, combining lightweight RISC-V processors, tightly coupled memory hierarchies, and domain-specific accelerators. The modular and flexible platform enables researchers and developers to prototype custom system-on-chip designs while leveraging a growing suite of open-source IP blocks, including the CVA6 RISC-V core.
This CPU core, formerly Ariane, is a 64-bit, in-order, six-stage-pipelined application-class processor compatible with the RISC-V RV64GC instruction set. Though optimized for energy efficiency, it is powerful enough to boot operating systems such as Linux or FreeRTOS. With standard interfaces like AXI for memory and peripheral access, a rich toolchain, and a release under the permissive Solderpad license, the CVA6 core is very useful for system-on-chip integration in research.
In this seminar topic, the student should further investigate the PULP platform and its contributions, including the CVA6 core, and especially its recent projects on novel system architectures. Examples of case studies are the lightweight Cheshire and coherence-focused Culsans platforms, or the Occamy chiplet system comprising Snitch clusters. Besides the conceptual aspects, their performance, resource utilization, and tapeout characteristics should be analyzed. Another focus of this seminar should be the toolchains provided by the PULP platform and the workflow of integrating, adapting, and verifying their designs in other projects.
Possible starting points for literature research are listed below.
https://pulp-platform.org/index.html
https://pulp-platform.org/docs/Ariane_detailed.pdf
https://ieeexplore.ieee.org/abstract/document/8777130
https://ieeexplore.ieee.org/abstract/document/10163410
https://arxiv.org/abs/2407.19895
https://ieeexplore.ieee.org/abstract/document/10631529
Contact
Michael Meidinger
michael.meidinger@tum.de
Supervisor:
One Flow, Many Cores: Challenges and Solutions in Multicore Network Processing
Description
(The original description was lost to a character-encoding error; only the terms CPU, DPU, RDMA/RoCE, and NFV are recoverable from the source.)
Prerequisites
Contact
Shichen Huang
shichen.huang@tum.de
Supervisor:
Prefetching Techniques for GPGPU
Description
Prefetching is a widely used technique in modern processors to mitigate the memory wall problem by fetching data into faster memory levels before it is actually needed. While prefetching has been extensively studied and deployed in CPUs, where hierarchical cache designs (L1, L2, and off-chip DRAM) dominate, GPUs present a very different challenge.
GPUs are now central to diverse applications ranging from artificial intelligence to scientific computing. Their massively parallel architecture and SIMT execution model create distinct memory access behaviors compared to CPUs. Consequently, conventional CPU prefetching mechanisms are often ineffective or even harmful when applied to GPUs. This has led to the development of GPU-specific prefetching strategies that account for the unique architectural features and execution patterns of GPUs.
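For context on the CPU-side baseline that GPU-specific schemes depart from, the sketch below models a classic per-PC stride prefetcher, which detects a regular address stride for each load instruction and fetches ahead of it. The table layout, API, and prefetch degree are simplified, hypothetical choices for illustration.

```python
# Hypothetical sketch of a per-PC stride prefetcher (the classic CPU
# baseline). Each load PC gets a table entry tracking its last address and
# last observed stride; a repeated stride triggers prefetches ahead.
class StridePrefetcher:
    def __init__(self, degree=2):
        self.table = {}       # pc -> (last_addr, last_stride)
        self.degree = degree  # how many addresses to fetch ahead

    def access(self, pc, addr):
        """Record one demand access; return the addresses to prefetch."""
        prefetches = []
        if pc in self.table:
            last_addr, last_stride = self.table[pc]
            stride = addr - last_addr
            if stride != 0 and stride == last_stride:
                # Stride confirmed: prefetch `degree` strides ahead.
                prefetches = [addr + stride * i
                              for i in range(1, self.degree + 1)]
            self.table[pc] = (addr, stride)
        else:
            self.table[pc] = (addr, 0)
        return prefetches

pf = StridePrefetcher(degree=2)
for a in (100, 164, 228):
    out = pf.access(pc=0x400, addr=a)
# The third access confirms the 64-byte stride, so 292 and 356 are prefetched.
```

On a GPU, thousands of threads interleave their accesses, so the per-PC stride seen at the cache is scrambled across warps; this is one concrete reason such CPU-style tables misfire there, motivating the warp- and SIMT-aware designs this seminar surveys.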
The goal of this seminar is to study and compare different GPU prefetching mechanisms. By reviewing recent research papers, participants will gain an understanding of how these mechanisms work, their advantages and limitations, and under what conditions they can improve GPU performance.
Prerequisites
- Basic knowledge of computer architecture
- Good English skills
Contact
Yuanji Ye
yuanji.ye@tum.de
Supervisor:
Categorization of Ethernet-Detected Anomalies Induced by Processing Unit Deviations
Description
Sporadic anomalies in automotive systems can degrade performance over time and may originate from various system components. In automotive applications, anomalies are often observed at the sensor and ECU levels, with potential propagation through the in-vehicle network via Ethernet. Such anomalies may be the result of deviations in electronic control units, highlighting the importance of monitoring these signals over Ethernet.
Not all processing anomalies are equally detectable over Ethernet due to inherent limitations in the monitoring techniques and the nature of the anomalies. This seminar will explore various anomaly categories, investigate their potential causes, and assess the likelihood of their propagation through the network.
The goal of this seminar is to provide a comprehensive analysis of these anomaly categories, evaluate the underlying causes, and discuss the potential for their detection and mitigation when monitored over Ethernet.
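As a minimal illustration of one category that is detectable on the network, the sketch below flags frames of a periodic ECU message stream whose inter-arrival time deviates from the expected cycle. The detection rule, period, and tolerance are illustrative assumptions, not a method from the literature under discussion.

```python
# Illustrative sketch (not a production monitor): flag frames of a periodic
# message stream whose inter-arrival gap deviates from the expected cycle,
# one anomaly category observable on the Ethernet link.
def timing_anomalies(timestamps, period, tolerance=0.2):
    """Return indices of frames whose gap to the previous frame lies
    outside period * (1 +/- tolerance)."""
    anomalies = []
    for i in range(1, len(timestamps)):
        gap = timestamps[i] - timestamps[i - 1]
        if abs(gap - period) > tolerance * period:
            anomalies.append(i)
    return anomalies

# A 10 ms periodic stream in which the fourth frame is delayed by 5 ms.
# Note that a single delayed frame distorts two consecutive gaps, so both
# the late frame and its successor are flagged.
ts = [0.000, 0.010, 0.020, 0.035, 0.040]
flagged = timing_anomalies(ts, period=0.010)
```

Even this toy example shows why not every processing deviation is equally visible: only deviations that perturb an observable network property (here, frame timing) survive propagation onto the link, which is exactly the detectability question this seminar examines.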
Contact
Zafer Attal
zafer.attal@tum.de
Supervisor:
Comparative Analysis of Local vs. Cloud Processing Approaches
Description
In today's data-driven world, processing approaches are typically divided between cloud-based solutions, which offer virtually unlimited resources, and localized processing, which is constrained by hardware limitations. While the cloud offers extensive computational power, localized processing is often required for real-time applications where latency and data security are critical concerns.
To bridge this gap, various algorithms have been developed to pre-process data or extract essential information before it is sent to the cloud.
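As a toy illustration of this idea, the sketch below reduces a raw sample window to a few summary features before transmission, trading a small amount of local compute for a large reduction in upload volume. The chosen features (mean, peak, RMS) and window size are assumptions for illustration only, not a prescribed algorithm.

```python
# Hedged sketch: extract a compact feature vector locally instead of
# uploading the raw signal. The feature set (mean, peak, RMS) and the
# window size are illustrative assumptions.
import math

def extract_features(samples):
    """Reduce a raw sample window to three summary features."""
    n = len(samples)
    mean = sum(samples) / n
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    return {"mean": mean, "peak": peak, "rms": rms}

raw = [0.0, 1.0, 0.0, -1.0] * 256      # 1024 raw samples captured locally
features = extract_features(raw)        # only 3 values leave the device
reduction = len(raw) / len(features)    # payload shrinks by ~341x
```

The comparison the seminar asks for lives exactly in this trade-off: the local device spends O(n) arithmetic per window, while the network and cloud see orders of magnitude less data, at the cost of discarding information the raw signal contained.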
The goal of this seminar is to explore and compare these algorithms, evaluating their computational load on local hardware and their overall impact on system performance.
Contact
Zafer Attal
zafer.attal@tum.de