Some of the advertised research internships (Forschungspraxis) or MSCE Internships can also be carried out as an assignment within the Projektpraktikum Integrated Systems. Where this applies, it is stated explicitly in the text of the respective announcement.
Open Theses
Interested in a student project or final thesis? Our research groups often have topics in preparation that are not yet listed here. In some cases it is also possible to define a topic according to your specific interests. To do so, simply contact a staff member from the relevant research area. If you also have general questions about carrying out a thesis at LIS, please contact Dr. Thomas Wild.
High-Performance Hardware Tracing of SmartNIC Packet Processing Pipelines
Description
With the advent of research on 6G, the next generation of mobile communications, we are exploring architecture extensions for Smart Network Interface Cards (SmartNICs). To enable adaptive, energy-efficient and low-latency network interfaces, we are prototyping a custom packet processing pipeline on FPGA-based NICs, partially based on the open-nic project (https://github.com/Xilinx/open-nic).
Modern server architectures face constant challenges in performance and energy efficiency. SmartNICs offer a promising solution by offloading packet preprocessing and collecting real-time traffic analytics. These capabilities allow servers to dynamically adapt to changing network conditions and processing demands. However, operating at speeds of 100 Gbps generates massive data volumes that require sophisticated monitoring and debugging capabilities.
This thesis focuses on designing and implementing advanced hardware extensions for debugging and tracing SmartNIC packet processing pipelines in a hardware description language (HDL). The developed system will provide critical visibility into high-speed packet processing operations and monitoring logic.
Developing trace collection mechanisms compatible with 100 Gbps line rates
Engineering efficient solutions for capturing, moving, and storing large volumes of trace data
Implementing strategies to avoid performance degradation during trace collection
Applying suitable postprocessing and generating visualizations of key information
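To get a feeling for the trace volumes involved, the following minimal sketch estimates the worst-case packet rate on a 100 Gbps Ethernet link and the resulting trace bandwidth. It assumes standard Ethernet framing overhead (preamble, start-of-frame delimiter and inter-frame gap, 20 bytes in total) and one fixed-size trace record per packet. The record size of 16 bytes and the function names are illustrative assumptions, not part of the project.

```python
# Per-frame overhead on the wire: preamble (7) + start-of-frame delimiter (1)
# + minimum inter-frame gap (12), all in bytes.
ETH_OVERHEAD_BYTES = 20


def packets_per_second(line_rate_bps: float, frame_bytes: int) -> float:
    """Maximum number of frames per second for a given frame size."""
    bits_on_wire = (frame_bytes + ETH_OVERHEAD_BYTES) * 8
    return line_rate_bps / bits_on_wire


def trace_bandwidth_bps(line_rate_bps: float, frame_bytes: int,
                        record_bytes: int) -> float:
    """Trace bandwidth if every packet produces one record.

    record_bytes is a hypothetical trace record size chosen for illustration.
    """
    return packets_per_second(line_rate_bps, frame_bytes) * record_bytes * 8
```

For minimum-size 64-byte frames this gives roughly 148.8 million packets per second, so even a 16-byte record per packet would produce about 19 Gbit/s of trace data. This is why capturing, moving and storing traces without disturbing the data path is a central part of the work.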
Prerequisites
Programming skills in VHDL/Verilog, C, Python and preferably Rust
Practical experience with FPGA design and implementation
Good knowledge of computer architecture, low-level software and the OSI network model
In the BCDC project, a working group at TUM collaborates on designing a RISC-V-based chiplet demonstration chip, of which at least two will be connected via an interposer to simulate a system of interconnected chiplets. At LIS, we work on a high-performance, low-latency chiplet interconnect with additional application-specific features managed by a smart protocol controller. It closes the gap between the underlying physical layer that takes care of data transmission across the interposer, and the system bus that attaches the inter-chiplet interface to the other components of the demonstration chip.
A high-level simulation of our system should be set up during this research internship to investigate the viability and performance of different architecture configurations. The simplest topology involves two connected identical chiplets; more complex arrangements could consist of more chiplets or other architectural elements, like an FPGA. The chiplets should be abstracted so that they mainly generate and process data in a manner similar to the RISC-V CPU cores and further processing units attached to the AXI bus. A bus functional model should represent the interconnect and simulate it with respect to transmission width, throughput, and latency. The modeled interconnect standard (for example UCIe, PCIe, or modified versions of MII or SPI) and the level of modeling detail are to be explored.
As a first step, existing approaches for simulating chiplet architectures specifically should be surveyed. After choosing a suitable framework, e.g., SystemC or Matlab/Simulink, the system model should be created, and different configurations should be investigated. Ultimately, the simulation should help identify the benefits and drawbacks of these configurations and support a future HDL implementation.
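As a rough illustration of what such a high-level interconnect model captures, the sketch below describes a link by its transmission width, clock frequency and a fixed per-transfer latency, and derives transfer time and effective throughput from these. It is a deliberately simplified analytical model in Python, not a SystemC or Simulink model, and the class name, parameters and example values are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class LinkModel:
    """Hypothetical analytical model of a chiplet-to-chiplet link."""
    width_bits: int      # bits transferred per link clock cycle
    clock_hz: float      # link clock frequency
    latency_cycles: int  # fixed protocol/pipeline latency per transfer

    def transfer_cycles(self, payload_bytes: int) -> int:
        # Number of beats needed for the payload, rounded up (integer ceil).
        beats = -(-payload_bytes * 8 // self.width_bits)
        return self.latency_cycles + beats

    def transfer_time_s(self, payload_bytes: int) -> float:
        return self.transfer_cycles(payload_bytes) / self.clock_hz

    def effective_throughput_bps(self, payload_bytes: int) -> float:
        # Payload bits delivered per second, including the fixed latency.
        return payload_bytes * 8 / self.transfer_time_s(payload_bytes)
```

With an assumed 64-bit link at 1 GHz and 10 cycles of latency, a 64-byte transfer takes 18 cycles, so small transfers reach well under half of the 64 Gbit/s peak. Comparing such numbers across widths, latencies and payload sizes is the kind of question the full simulation is meant to answer in more detail.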
Prerequisites
Basic understanding of chiplet architectures
Experience with high-level simulation
Structured and independent way of working and strong problem-solving skills
Localizing Automotive Diagnostic Solutions: Software Migration and PS/PL Interface Implementation on ZCU102
Description
About the Project: Future cars rely on a wide variety of sensors, including cameras, LiDARs, and RADARs, that generate enormous amounts of data. This data flows through the intra-vehicular network (IVN) to processing nodes, ultimately triggering actuators. With strict timing constraints essential for vehicle safety, time-sensitive networking (TSN) is now a critical component in modern automotive systems. Within the context of the EMDRIVE project, our team is developing new monitoring and diagnostic approaches to detect errors early and maintain functional safety in highly automated driving environments.
Project Description: The primary goal of this project is to migrate existing software packages, which are used to record ECU traces and analyze processing anomalies, onto the ZCU102 board. This migration will enable local processing of anomalies and establish a robust PS/PL interface between the anomaly detection hardware (implemented on the FPGA) and the processing system running the software.
The key tasks include:
TAS Tool Configuration: Bring up the TAS tool and configure it to work with the Multi Core Debug Solution (MCDS) for trace recording.
Trace Analyzer Deployment: Bring up and configure the Trace Analyzer to parse recorded traces and detect deviations in processing.
Software Migration: Migrate the existing software packages to run on the Processing System (PS) of the ZCU102 board.
Interface Integration: Develop and integrate a stable interface between the Programmable Logic (PL) and the PS, ensuring efficient sharing of data, status, and configuration information.
Key Responsibilities:
Analyze existing software packages and understand the hardware integration requirements.
Configure and validate both the TAS tool and the Trace Analyzer.
Adapt and optimize software for deployment on the ZCU102 board.
Develop and implement a robust PS/PL interface for seamless communication between hardware and software.
Collaborate with interdisciplinary teams to integrate and test the complete system.
Prerequisites
Required Skills:
Proficiency in C programming.
Strong understanding of System-on-Chip (SoC) architectures and microcontroller modules.
Background in automotive applications and systems.
Experience with hardware description languages (e.g., VHDL) and embedded systems (preferred).
Familiarity with Linux-based systems and FPGA integration is a plus.
Benefits:
Hands-on experience with cutting-edge automotive diagnostic technology.
Exposure to advanced hardware-software integration and embedded systems.
Opportunity to contribute to projects that enhance the safety and reliability of future vehicles.
Collaborative work environment with industry-leading partners.
In the memory hierarchy, multi-level caches hold data to avoid the long latency of DRAM accesses. However, when a cache miss occurs, the long memory access latency still stalls program execution. To further improve performance, prefetching techniques are widely used in modern processors. A prefetcher predicts which data will be needed and fetches it into a cache or buffer before it is actually accessed, thereby hiding memory access latency.
Our prefetcher reacts to cache load misses by prefetching large memory regions. While simple, this can severely burden the DRAM bandwidth and flood the prefetch buffer, especially when many of the prefetched regions are not actually needed.
Applications exhibit varied memory access patterns, and some memory regions have characteristics that make them better candidates for prefetching. By profiling a program in advance, it is possible to determine which memory regions should be prefetched and which should be evicted earlier.
In this internship, the student will help implement the prefetching priority and eviction policy in our existing SystemC model. By using the profiling results in the policy, we expect a performance improvement over the original model.
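One possible shape of such a policy is sketched below in Python. It is purely illustrative and does not reflect the existing SystemC model: it assumes that offline profiling yields an integer priority per memory region, that regions below a threshold are not prefetched at all, and that the buffer evicts the lowest-priority (and, among equals, oldest) region first. All names and the priority scheme are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PrefetchBuffer:
    """Hypothetical region prefetch buffer with profile-guided priorities."""
    capacity: int          # maximum number of resident regions
    region_priority: dict  # region id -> priority from offline profiling
    default_priority: int = 0
    _entries: dict = field(default_factory=dict)  # region id -> insertion tick
    _tick: int = 0

    def priority(self, region) -> int:
        return self.region_priority.get(region, self.default_priority)

    def resident(self) -> list:
        return list(self._entries)

    def on_load_miss(self, region, min_priority: int = 1) -> bool:
        """Return True if the region was prefetched in response to the miss."""
        if region in self._entries:
            return False  # already in the buffer, nothing to fetch
        if self.priority(region) < min_priority:
            return False  # profiling marks this region as not worth fetching
        if len(self._entries) >= self.capacity:
            # Victim: lowest priority first, oldest insertion as tie-breaker.
            victim = min(self._entries,
                         key=lambda r: (self.priority(r), self._entries[r]))
            if self.priority(victim) > self.priority(region):
                return False  # do not displace more valuable regions
            del self._entries[victim]
        self._tick += 1
        self._entries[region] = self._tick
        return True
```

In this sketch, filtering by priority reduces unnecessary DRAM traffic, while priority-aware eviction keeps regions that profiling found useful in the buffer for longer. How priorities are actually derived from the profiling results is left open here.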