
CeCaS
Mannheim CeCaS is a supra-regional research project funded by the BMBF to develop a "Central Car Server" for future automated, connected and electrified vehicles. The project network consists of numerous industrial partners, accompanied by several academic research groups.
Overarching objective: an automotive supercomputing platform, i.e. a powerful Central Car Server concept based on new automotive-qualified high-performance processors in FinFET technology, supported by application-specific accelerators and an adaptive automotive software stack for highly automated, connected vehicles.
At the Technical University of Munich, three chairs (TUM-AIR, TUM-LIS, TUM-SEC) are involved in the CeCaS project network, contributing in the areas of model-based development, requirements management, software architecture, memory technology, and security.
Contribution of LIS
TUM-LIS is developing approaches for intelligent pre-fetching and write-back of data by the memory controller to increase the performance of the automotive processor. In addition, a prediction model for future addresses and data accesses is being investigated using machine learning methods such as reinforcement learning.
The current approach provides a wrapper layer around the DDR controller that realizes this functionality. It reduces the access latency to external volatile and non-volatile main memories in two ways: by adaptively prefetching data and instructions into fast on-chip SRAM, and by intelligently writing modified data from the SRAM back to the external main memory.
For the wrapper layer we cooperate with TUM-SEC, who investigate error correction codes as well as suitable lightweight techniques for transparent on-the-fly encryption and decryption of data stored in external memory to prevent unauthorized access.
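The interplay of prefetching and write-back can be illustrated with a small behavioral sketch. This is not the project's wrapper implementation: the class name `PrefetchWrapper`, the 64-byte line size, the LRU replacement and the fixed next-line prefetch depth are all assumptions chosen for illustration, and the adaptive part of the prefetcher is deliberately left out.

```python
from collections import OrderedDict

LINE_SIZE = 64  # bytes per buffer line (assumed value)


class PrefetchWrapper:
    """Toy model of an on-chip SRAM buffer in front of a slow main memory."""

    def __init__(self, capacity_lines=8, prefetch_depth=2):
        self.capacity = capacity_lines
        self.depth = prefetch_depth
        self.lines = OrderedDict()  # line number -> dirty flag, oldest first
        self.hits = 0
        self.misses = 0
        self.writebacks = 0

    def _insert(self, line, dirty=False):
        if line in self.lines:
            # Already buffered: keep an existing dirty flag and refresh LRU order.
            self.lines[line] = self.lines[line] or dirty
            self.lines.move_to_end(line)
            return
        if len(self.lines) >= self.capacity:
            _, was_dirty = self.lines.popitem(last=False)  # evict the LRU line
            if was_dirty:
                self.writebacks += 1  # modified data must go back to main memory
        self.lines[line] = dirty

    def access(self, addr, write=False):
        line = addr // LINE_SIZE
        if line in self.lines:
            self.hits += 1
            self._insert(line, dirty=write)
        else:
            self.misses += 1
            self._insert(line, dirty=write)
            # Simple next-line prefetch; a real design would adapt this.
            for offset in range(1, self.depth + 1):
                self._insert(line + offset)


wrapper = PrefetchWrapper(capacity_lines=4, prefetch_depth=1)
for address in (0, 64, 128, 192):
    wrapper.access(address)
print(wrapper.hits, wrapper.misses)
```

With a sequential read stream, every second access in this toy model is served from the buffer because the preceding miss already fetched the next line.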

Workflow
In the CeCaS project we take a two-sided approach. On the one hand, we examine various implementation concepts and approaches with a SystemC-based simulation model together with our partners. On the other hand, we are working on an FPGA implementation, which operates at a lower level of abstraction and therefore allows even more precise analyses. Both areas regularly offer topics for student work.
Involved Researchers
Open Student Work
Current Student Work
Evaluation Framework for a SystemC MPSoC Prototype Architecture
Description
The subject of this bachelor's thesis is the development of a compile flow with which various benchmarks, e.g. from EEMBC, can be compiled and run on a SystemC-based prototype architecture. Different benchmarks, possibly with different parameters, should be integrated into the system in such a way that every team member can easily compile and run them.
The SystemC model uses a cycle-accurate model of a processor from the Synopsys ARC family to issue memory accesses and thus test and evaluate the memory hierarchy under realistic conditions.
Depending on the progress of the work, the benchmark results can then be evaluated.
Prerequisites
- Good knowledge of MPSoC systems
- Knowledge of Python programming
- High motivation
- Ability to work independently
Supervisor:
Design and Integration of a Hardware Performance Counter Unit for Memory Access Statistics
Description
Its main advantages, a simple design with only one transistor per bit and a high memory density, make DRAM omnipresent in most computer architectures. However, DRAM accesses are rather slow and require a dedicated DRAM controller that coordinates the read and write accesses to the DRAM as well as the refresh cycles. To reduce the DRAM access latency, memory prefetching is a common technique for fetching data before it is actually used. However, this requires sophisticated prediction algorithms in order to prefetch the right data at the right time.
The goal of this thesis is to extend an existing DRAM preloading mechanism on an FPGA-based prototype platform with a hardware performance counter and statistics module. This requires a profound understanding of AHB communication protocols as well as of the cache and memory hierarchy of an MPSoC system. The new component should be integrated into the overall architecture design and then tested and evaluated under different scenarios.
Towards this goal, you'll complete the following tasks:
1. Understand the existing memory access and preloading mechanism
2. Implement the refined preloading functionalities in VHDL
3. Write and execute small bare-metal test programs
4. Analyse and discuss the performance results
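As a rough idea of what such a counter and statistics module could track, the following behavioral sketch models a bank of saturating event counters in Python. It is a reference-model style illustration only: the class `PerfCounterUnit`, the event names and the 32-bit default width are hypothetical and are not taken from the existing design, which is implemented in hardware.

```python
class PerfCounterUnit:
    """Behavioral sketch of a bank of saturating event counters."""

    # Hypothetical event names, chosen for illustration only.
    EVENTS = ("read", "write", "prefetch_hit", "prefetch_miss")

    def __init__(self, width_bits=32):
        self.max_value = (1 << width_bits) - 1
        self.counters = {name: 0 for name in self.EVENTS}

    def record(self, event):
        if event not in self.counters:
            raise ValueError(f"unknown event: {event}")
        # Saturate at the maximum instead of wrapping around to zero.
        self.counters[event] = min(self.counters[event] + 1, self.max_value)

    def prefetch_hit_rate(self):
        hits = self.counters["prefetch_hit"]
        total = hits + self.counters["prefetch_miss"]
        return hits / total if total else 0.0

    def reset(self):
        for name in self.counters:
            self.counters[name] = 0


unit = PerfCounterUnit()
for event in ("read", "prefetch_hit", "prefetch_hit", "prefetch_miss"):
    unit.record(event)
print(unit.counters, unit.prefetch_hit_rate())
```

Saturating counters are used in this sketch so that an overflow during a long test run is visible as a pinned maximum rather than as a misleadingly small value.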
Prerequisites
- Good knowledge of MPSoC systems
- Knowledge of Python programming
- High motivation
- Ability to work independently
Contact
Oliver Lenke
o.lenke@tum.de
Supervisor:
Integration of a Hardware Preload Unit into an AXI-based CVA6 Architecture
Description
Its main advantages, a simple design with only one transistor per bit and a high memory density, make DRAM omnipresent in most computer architectures. However, DRAM accesses are rather slow and require a dedicated DRAM controller that coordinates the read and write accesses to the DRAM as well as the refresh cycles. To reduce the DRAM access latency, memory prefetching is a common technique for fetching data before it is actually used. However, this requires sophisticated prediction algorithms in order to prefetch the right data at the right time.
The goal of this thesis is to transfer an existing DRAM preloading mechanism to an FPGA-based prototype platform of the RISC-V CVA6 architecture. This requires a profound understanding of the AHB and AXI communication protocols as well as of the cache and memory hierarchy of an MPSoC system.
Towards this goal, you'll complete the following tasks:
1. Understand the existing memory access and preloading mechanism
2. Implement the refined preloading functionalities in VHDL
3. Write and execute small bare-metal test programs
4. Analyse and discuss the performance results
Prerequisites
- Good knowledge of MPSoC systems
- Knowledge of Python programming
- High motivation
- Ability to work independently
Contact
Oliver Lenke
o.lenke@tum.de
Supervisor:
Profiling-based Prefetcher Design
Description
In a memory hierarchy, multi-level caches hold data close to the processor to avoid the long latency of DRAM accesses. However, when cache misses occur, the long memory access latency still stalls program execution. To further improve performance, prefetching techniques are widely used in modern processors. A prefetcher predicts which data will be needed and fetches it into a cache or buffer before it is actually accessed, thereby hiding memory access latency.
Our prefetcher reacts to cache load misses by prefetching large memory regions. While simple, this can severely burden the DRAM bandwidth and flood the buffer, especially when many of the prefetched regions are not actually needed.
Applications exhibit varied memory access patterns, and some memory regions have characteristics that make them better candidates for prefetching. By profiling a program in advance, it is possible to determine which memory regions should be prefetched and which should be evicted earlier.
In this internship, the student will help to implement the prefetching priority and eviction policy in our existing SystemC model. By using the profiling results in the policy, we expect a performance improvement compared to the original model.
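One conceivable way to derive such profiling information is sketched below. It is only an illustration under stated assumptions: the trace is taken to be a plain list of byte addresses, the region and line sizes are assumed values, and "fraction of cache lines touched" is just one possible metric for ranking regions. The function names are hypothetical and do not refer to the existing SystemC model.

```python
from collections import defaultdict

REGION_SIZE = 4096  # bytes per prefetch region (assumed value)
LINE_SIZE = 64      # bytes per cache line (assumed value)


def profile_regions(addresses):
    """Return {region: fraction of its cache lines touched in the trace}."""
    touched = defaultdict(set)
    for addr in addresses:
        region = addr // REGION_SIZE
        line_in_region = (addr % REGION_SIZE) // LINE_SIZE
        touched[region].add(line_in_region)
    lines_per_region = REGION_SIZE // LINE_SIZE
    return {region: len(lines) / lines_per_region
            for region, lines in touched.items()}


def prefetch_candidates(utilization, threshold=0.5):
    """Regions whose utilization reaches the threshold, densest first."""
    selected = [r for r, u in utilization.items() if u >= threshold]
    return sorted(selected, key=lambda r: utilization[r], reverse=True)


# Region 0 is swept completely, region 2 is touched only once.
trace = list(range(0, REGION_SIZE, LINE_SIZE)) + [2 * REGION_SIZE]
utilization = profile_regions(trace)
print(prefetch_candidates(utilization))
```

In this sketch, densely used regions are worth prefetching as a whole, whereas sparsely touched regions would mostly waste DRAM bandwidth and buffer space and could be given low priority or evicted early.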
Prerequisites
- Basic computer architecture knowledge
- Experience with Python programming
- SystemC knowledge is a plus
Supervisor:
Analysis and Visualization of Cache Access Behavior in CPU Clusters
Description
The subject of this bachelor's thesis is the development of a Python tool that generates various statistics about the memory accesses of an MPSoC architecture. It uses simulation-based traces in which all memory accesses are recorded. For each access, the trace documents the point in time, whether it was a cache hit or miss, and which core issued it.
Various statistics are to be derived from these traces. To this end, a Python program has to be written that evaluates the traces and plots the results.
Possible statistics include:
- How often was each page accessed?
- How many consecutive accesses fall into the same page on average?
- How are accesses to the different pages distributed over time?
- What is the time interval between accesses to the same page?
These data are intended to help analyse the memory access patterns of different applications in order to develop an efficient mechanism for preloading selected memory contents.
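A minimal sketch of three of these statistics is given below. The trace format is an assumption made for illustration: each record is a tuple `(time, core, address, hit)`, and the page size of 4096 bytes is likewise assumed. The real trace files may look different, and plotting is omitted here.

```python
from collections import Counter

PAGE_SIZE = 4096  # bytes per page (assumed value)


def page_of(addr):
    return addr // PAGE_SIZE


def accesses_per_page(trace):
    """Number of accesses that hit each page."""
    return Counter(page_of(addr) for _, _, addr, _ in trace)


def mean_run_length(trace):
    """Average number of consecutive accesses that stay in the same page."""
    runs = []
    previous = None
    for _, _, addr, _ in trace:
        page = page_of(addr)
        if page == previous:
            runs[-1] += 1
        else:
            runs.append(1)
            previous = page
    return sum(runs) / len(runs) if runs else 0.0


def reuse_intervals(trace):
    """Time gaps between successive accesses to the same page, per page."""
    last_seen = {}
    gaps = {}
    for time, _, addr, _ in trace:
        page = page_of(addr)
        if page in last_seen:
            gaps.setdefault(page, []).append(time - last_seen[page])
        last_seen[page] = time
    return gaps


# Hypothetical records: (time, core, address, cache hit?)
trace = [
    (0, 0, 0x0000, True),
    (10, 0, 0x0040, True),
    (20, 1, 0x1000, False),
    (35, 0, 0x0080, True),
]
print(accesses_per_page(trace))
print(mean_run_length(trace))
print(reuse_intervals(trace))
```

The resulting dictionaries and counters can be passed directly to a plotting library of choice, for example as a bar chart of accesses per page or a histogram of reuse intervals.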
Prerequisites
- Good knowledge of MPSoC systems
- Knowledge of Python programming
- High motivation
- Ability to work independently
Contact
Oliver Lenke
o.lenke@tum.de
Supervisor:
Completed Student Work