Research Internships (Forschungspraxis) / MSCE Research Internships

Some of the advertised research internships (Forschungspraxis) or MSCE internships can also be carried out as a task within the Projektpraktikum Integrated Systems. Where this is possible, it is stated explicitly in the text of the posting.

Open Topics

Interested in a student research project or a thesis?
Our research groups often have topics in preparation that are not yet listed here. In some cases, it is also possible to define a topic that matches your particular interests. To do so, simply contact a staff member from the relevant research area. If you also have general questions about carrying out a thesis at LIS, please contact Dr. Thomas Wild.

Ongoing Work

Exploration of Deadlock-Avoidance Algorithms for FPGA-Based Network-on-Chips

Description

The fast pace at which new online services emerge leads to a rapid surge in the volume of network traffic and the associated computing demands. A recent approach that the research community has proposed to tackle this issue is in-network computing, which means that network devices (e.g., smart network interface cards (SmartNICs) and switches) perform more types of computations than before. As a result, processing demands become more varied, requiring flexible packet-processing architectures. Since FPGA-based networks-on-chip (NoCs) provide high flexibility and scalability, they can be used to provide high-speed communication in SmartNICs.

This project aims to explore deadlock-avoidance algorithms for FPGA-based NoCs. A literature review must be performed to identify state-of-the-art approaches and estimate their complexity and impact on performance. One or two promising methods must be implemented in SystemVerilog and integrated into an existing NoC-based SmartNIC architecture. The achievable throughput and latency must be evaluated and compared with the baseline via cycle-accurate register-transfer-level simulations in Vivado. Furthermore, the resource usage on an Alveo U55C must be determined by running synthesis and implementation in Vivado.
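
As an illustration of what deadlock avoidance means in practice, the following Python sketch checks a well-known sufficient condition: a deterministic routing function is deadlock-free if its channel dependency graph contains no cycle. The example uses dimension-ordered (XY) routing on a small 2D mesh. Both the mesh topology and the choice of XY routing are assumptions made for illustration; the posting does not state which topology or algorithm the existing SmartNIC architecture uses.

```python
from itertools import product


def xy_route(src, dst):
    """Return the nodes visited from src to dst, moving in X first, then in Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


def channel_dependencies(width, height, route):
    """Build the channel dependency graph of a routing function on a mesh.

    A channel is a directed link (node_a, node_b). An edge c1 -> c2 means
    that some packet may hold c1 while requesting c2.
    """
    deps = {}
    nodes = list(product(range(width), range(height)))
    for src, dst in product(nodes, nodes):
        if src == dst:
            continue
        path = route(src, dst)
        channels = list(zip(path, path[1:]))
        for c1, c2 in zip(channels, channels[1:]):
            deps.setdefault(c1, set()).add(c2)
            deps.setdefault(c2, set())
    return deps


def has_cycle(graph):
    """Detect a cycle in a directed graph with an iterative three-colour DFS."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {node: WHITE for node in graph}
    for start in graph:
        if colour[start] != WHITE:
            continue
        colour[start] = GREY
        stack = [(start, iter(graph[start]))]
        while stack:
            node, successors = stack[-1]
            for nxt in successors:
                if colour[nxt] == GREY:
                    return True  # back edge: a cyclic dependency exists
                if colour[nxt] == WHITE:
                    colour[nxt] = GREY
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                colour[node] = BLACK
                stack.pop()
    return False


if __name__ == "__main__":
    deps = channel_dependencies(3, 3, xy_route)
    print("XY routing on a 3x3 mesh has cyclic dependencies:", has_cycle(deps))
```

The same check can be applied to any candidate routing function found in the literature review by passing it as the `route` argument.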

Supervisor:

Klajd Zyla

Implementation of a FPGA-based Intersatellite Network Switch for High-Speed Traffic

Description

The demand for high-speed communication links has significantly increased in recent years. Additionally, satellite telecom constellations can extend connectivity to the most remote areas and assist in handling higher payloads associated with emerging technologies like 6G. Each satellite node within these constellations requires routing capabilities to manage such data traffic. In this context, MPLS (Multi-Protocol Label Switching) and high-speed switching hardware are crucial for supporting this growth. MPLS enhances network performance by directing data based on short path labels instead of long network addresses, and advanced hardware enables the efficient handling of this data traffic.
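
The label-based forwarding described above can be sketched in a few lines of Python. This is a simplified model for illustration only: the table layout, the names `LfibEntry` and `forward`, the label values, and the port numbers are all assumptions and do not describe the MPLS Switch IP used in this project. Only the two basic operations "swap" and "pop" are modelled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LfibEntry:
    """One row of a (hypothetical) label forwarding table."""
    action: str                # "swap" or "pop"
    out_label: Optional[int]   # new top label for "swap", None for "pop"
    out_port: int              # egress port of the switch


def forward(label_stack, lfib):
    """Forward a packet based only on its top label.

    label_stack[0] is the top of the stack. Returns the new label stack
    and the egress port. No IP address lookup is involved.
    """
    entry = lfib[label_stack[0]]
    if entry.action == "swap":
        new_stack = [entry.out_label] + label_stack[1:]
    elif entry.action == "pop":
        new_stack = label_stack[1:]
    else:
        raise ValueError(f"unsupported action: {entry.action}")
    return new_stack, entry.out_port


if __name__ == "__main__":
    # Illustrative table: label 100 is swapped to 200, label 200 is popped.
    lfib = {
        100: LfibEntry("swap", 200, out_port=1),
        200: LfibEntry("pop", None, out_port=2),
    }
    print(forward([100], lfib))
    print(forward([200], lfib))
```

The point of the sketch is that the forwarding decision is a single exact-match lookup on a short label, which is what makes the scheme attractive for high-speed switching hardware.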

The goal of this work is to validate a high-speed (~600 Gb/s) switch that utilizes MPLS technology. The system under evaluation is composed of a Multi-Rate MAC, a traffic generator, an MPLS Switch IP, and a PetaLinux build to manage MPLS traffic. The Multi-Rate MAC, traffic generator, and MPLS Switch are implemented on the FPGA embedded in the Versal ACAP, while PetaLinux operates on one of its ARM cores. The project employs Vivado, Vitis, and XSDB (Xilinx System Debugger) as the primary software tools, and the Versal ACAP and a DVB-S2X/RCS2 Native Modem as the hardware to integrate and implement different parts and functionalities of the project.

Specific Tasks

  • Log project telemetry data to the PetaLinux filesystem
  • Implement a script that runs on the Versal to trace the system
  • Show metrics in a GUI using HTML or a Python library (e.g., Tkinter)
  • Define a meaningful test case for the system
  • Run the defined tests, adapt the system if necessary and debug

 

Prerequisites

  • Proficient in VHDL/Verilog, C, and Python programming languages
  • Strong understanding of computer networks, including the OSI model and various protocols (with the focus on MPLS and IP)
  • Comfortable with using the Linux command line and writing bash scripts
  • Practical experience in FPGA and ACAP design and implementation

Supervisor:

Lars Nolte - Michael Hanh (Airbus)

Design and Implementation of a Stride Prefetching Mechanism in VHDL

Description

Since DRAM typically comes with much higher access latency than SRAM, many approaches to reducing DRAM latency have already been explored, such as caching, access predictors, and row buffers.

In the CeCaS research project, we plan to employ an additional mechanism: preloading a certain fraction of the DRAM content into a small on-chip SRAM buffer. This requires predicting the cache lines that are likely to be accessed next, preloading them into the SRAM, and answering subsequent memory requests for this data from the SRAM instead of forwarding them to the DRAM.

This functionality should be implemented as a cycle-accurate VHDL model. A baseline system will be provided; the goal is to implement this functionality in its simplest form first. Depending on the progress, it can be extended or refined in subsequent steps.

Close supervision, especially during the initial phase, is guaranteed. Nevertheless, some experience with VHDL programming is required.
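
To make the idea of stride prediction concrete, here is a minimal behavioural sketch in Python. It is not the CeCaS design and not cycle accurate: the class name `StridePrefetcher`, the 64-byte cache line size, the rule "prefetch once the same stride has been seen twice in a row", and the unbounded buffer are all simplifying assumptions.

```python
class StridePrefetcher:
    """Behavioural model of a single-stream stride prefetcher."""

    def __init__(self, line_size=64):
        self.line_size = line_size   # assumed cache line size in bytes
        self.last_line = None        # cache line index of the previous request
        self.last_stride = None      # stride between the two previous requests
        self.buffer = set()          # cache lines preloaded into the SRAM buffer
                                     # (unbounded here; a real buffer is small)

    def access(self, address):
        """Process one memory request; return True if it hits the SRAM buffer."""
        line = address // self.line_size
        hit = line in self.buffer
        if self.last_line is not None:
            stride = line - self.last_line
            # Prefetch only when the same non-zero stride repeats.
            if stride != 0 and stride == self.last_stride:
                self.buffer.add(line + stride)
            self.last_stride = stride
        self.last_line = line
        return hit


if __name__ == "__main__":
    prefetcher = StridePrefetcher()
    for address in (0, 128, 256, 384, 512):
        served_from = "SRAM" if prefetcher.access(address) else "DRAM"
        print(f"address {address:4d} served from {served_from}")
```

With a constant stride of two cache lines, the first three requests train the predictor and go to the DRAM, and the following requests are served from the buffer.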

 

Prerequisites

  • Experience with VHDL coding
  • Basic knowledge of MPSoCs, cache hierarchies, etc.
  • B.Sc. in Electrical Engineering or similar

 

Supervisor:

Oliver Lenke

Evaluation of a Modern RISC-V Vector Processor

Description

Vector processors provide a means of incorporating tightly coupled parallel processing in general-purpose processors. The move from pure scalar, or limited SIMD, architectures yields increased efficiency for certain computational tasks. At the same time, vector processors require less hardware and dissipate less power than comparable solutions such as GPUs, and are easier to program. This makes vector processors of interest in both low-end and high-end applications: embedded systems benefit from increased efficiency and lower latency, and HPC systems and supercomputers can take advantage of SIMD for computations with complex control flows.

The same advantages lead us to consider using vector processors as embedded processors on smartNICs for the HyperNIC project. Current smartNICs may incorporate CPUs for packet processing; however, these are standard Arm cores and are typically inefficient for parallel workloads. GPUs are also used for this purpose, but their latency can be high and they may struggle with control-heavy applications. Vector processors lie somewhere in the middle, having the flexibility of standard CPUs while still being SIMD capable.

In this project, the performance and suitability of RISC-V V processors for packet processing on smartNICs will be investigated. A simple vector program should be written and its performance compared to a scalar baseline on a suitable open-source RISC-V processor. Ideally, the processor should also be implemented on an FPGA and compared to other processors in terms of clock frequency, area, and power consumption.

 

Contact

William Wulff
william.wulff@tum.de

Supervisor:

William Wulff

Software Implementation on ZCU102 Zynq Board PS in Correlation to TAS Server

Description

Context:

Future cars have a wide variety of sensors, such as cameras, LiDARs, and RADARs that generate a large amount of data. This data has to be sent via an intra-vehicular network (IVN) to further processing nodes, and, ultimately, actuators have to react to the sensor input. In between the processing steps, the intra-vehicular network has to ensure that all of the data and control signals reach their destination in time. Hence, next to a large amount of data, there are also strict timing constraints that the intra-vehicular network has to cope with. Therefore, time-sensitive networking (TSN) has been introduced. The functional safety of such networks plays an important role against the background of highly automated driving. Emerging errors have to be detected early and potential countermeasures have to be taken to keep the vehicle in a safe state. Therefore, highly sophisticated monitoring and diagnosis algorithms are a key requirement for future cars. When an anomaly is detected, the TAS server (a tool developed by Infineon) is used to request trace information from the MultiCore Debug Solution (MCDS), a hardware feature available on Aurix boards that is used for debugging and tracing core and bus activities (see Project EMDRIVE).

The Zynq board consists of two parts: Programmable Logic (PL) and Processing System (PS). In this part of the work, the PL will implement the Companion Box, which will continuously monitor the traffic over the Ethernet. The PS part handles the tasks related to the TAS server and MCDS configuration, which requires installing Linux on the Zynq board to run the TAS server. When the Companion Box detects an anomaly, a flag is set so that the software can detect it and react accordingly. Then, a set of configurations will be automatically defined by the TAS server and sent to the MCDS of the targeted Aurix board that generated the anomaly. Once the new configuration is applied, the TAS server will retrieve the traces from the MCDS.

RESEARCH INTERNSHIP (FORSCHUNGSPRAXIS):

This work comprises the following tasks:

  1. Install Linux on the ZCU102 PS.
  2. Test functionality of Linux.
  3. Install TAS server on the Linux OS of ZCU102 board.
  4. Configure MCDS of the Aurix boards using the TAS server and retrieve traces.
  5. Establish a connection between SW and HW of ZCU102 board (Flag assertion, Memory access).
  6. Automate the process of MCDS configuration and Trace retrieval. 

If you are interested, feel free to contact me! Please send your CV as well as a recent transcript.

Prerequisites

The primary skills that will be developed and needed during this project are the following:

  • Proficiency in Verilog/SystemVerilog for FPGA design.
  • A solid understanding of Linux OS.
  • An understanding of HW/SW co-design.
  • A strong background in System-on-Chip design.
  • A good knowledge of Python and Shell scripting.

Contact

zafer.attal@tum.de

Supervisor:

Zafer Attal

Automotive Ethernet Anomaly Detection for Burst of Packets - ZCU102 Implementation

Description

Context:

Future cars have a wide variety of sensors, such as cameras, LiDARs, and RADARs that generate a large amount of data. This data has to be sent via an intra-vehicular network (IVN) to further processing nodes, and, ultimately, actuators have to react to the sensor input. In between the processing steps, the intra-vehicular network has to ensure that all of the data and control signals reach their destination in time. Hence, next to a large amount of data, there are also strict timing constraints that the intra-vehicular network has to cope with. Therefore, the so-called time-sensitive networking (TSN) has been introduced. The functional safety of such networks plays an important role against the background of highly automated driving. Emerging errors have to be detected early and potential countermeasures have to be taken to keep the vehicle in a safe state. Therefore, highly sophisticated monitoring and diagnosis algorithms are a key requirement for future cars. (See Project EMDRIVE)  

Our approach for such diagnosis builds on non-intrusively monitoring the intra-vehicular network by snooping on data traffic at an interconnect in the car. An analysis of the traffic shall give information about anomalies that occur inside the network as symptoms of an error inside the electrical architecture.

RESEARCH INTERNSHIP (FORSCHUNGSPRAXIS):

The first step of this work is to become familiar with an existing design of an anomaly detection module that monitors individual packets in a flow. Based on this existing work, several extensions have to be implemented (Verilog/SystemVerilog) in the hardware design to support anomaly detection for bursts of packets. The fault and anomaly types are:

  1. Arrival time of the Burst 
  2. Timing in-between packets in a single Burst
  3. Number of packets in a single Burst  

The system should be capable of detecting these fault classes and alerting the software about the detected anomaly, for example by raising a flag. It should also be able to inject these fault classes on request during a demonstration. The design should be simulated and implemented on an FPGA (ZCU102 Zynq Board).
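
The three fault classes listed above can be expressed as simple checks on packet timestamps. The following Python sketch is only a software reference model for illustration; the names `BurstSpec` and `check_burst`, the microsecond time unit, and all threshold values are assumptions and are not taken from the existing hardware design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BurstSpec:
    """Expected behaviour of one burst (all values are illustrative)."""
    start_tolerance_us: int   # allowed deviation of the burst arrival time
    max_gap_us: int           # largest allowed gap between two packets
    packets_per_burst: int    # expected number of packets in a burst


def check_burst(timestamps, expected_start_us, spec):
    """Return the list of fault classes detected in one burst.

    timestamps holds the sorted arrival times (in microseconds) of the
    packets that belong to the burst.
    """
    if not timestamps:
        return ["packet_count"]
    anomalies = []
    # 1. Arrival time of the burst
    if abs(timestamps[0] - expected_start_us) > spec.start_tolerance_us:
        anomalies.append("burst_arrival_time")
    # 2. Timing in between packets in a single burst
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if any(gap > spec.max_gap_us for gap in gaps):
        anomalies.append("inter_packet_gap")
    # 3. Number of packets in a single burst
    if len(timestamps) != spec.packets_per_burst:
        anomalies.append("packet_count")
    return anomalies


if __name__ == "__main__":
    spec = BurstSpec(start_tolerance_us=5, max_gap_us=10, packets_per_burst=4)
    print(check_burst([100, 105, 110, 115], 100, spec))
    print(check_burst([100, 105, 130, 135], 100, spec))
```

A model like this can also serve as a golden reference when comparing simulation results of the hardware module against expected alerts.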

If you are interested, feel free to contact me! Please send your CV as well as a recent transcript.

Prerequisites

The primary skills that will be developed and needed during this project are the following:

  • Proficiency in Verilog/SystemVerilog for FPGA design.
  • Ability to design and implement hardware modules.
  • Experience with FPGA simulation tools (e.g., ModelSim).
  • A strong background in System-on-Chip design.
  • A good understanding of network protocols and their implementation on FPGA platforms

Contact

zafer.attal@tum.de

Supervisor:

Zafer Attal

Functional Chain on Aurix TC3x Boards Implementation - Optical Flow Detection

Description

The Aurix TC3x boards are used as ECU emulators for in-vehicle network communication in a car. The boards reproduce this communication behavior, which serves as a benchmark for other network traffic monitors and fault detection modules.

To showcase the functional chain on the Aurix boards, an Optical Flow Detection algorithm is proposed. Its input is real-time video from a camera, and its output is the processed video, displayed on a screen or on the Aurix LCD.

The functional chain should be divided into three sub-functions (F1-F2-F3) that together represent the algorithm, with each Aurix board implementing a single function. Data is transferred from one board to the next via an Ethernet switch, using standard Ethernet protocols for communication.

This encompasses the following sub-tasks:

  • Bring up the Aurix boards, including the Aurix development environment.
  • Implement a functional chain consisting of (F1-F2-F3) that represents an Optical Flow Detection algorithm. 
  • Display the results on a screen or on an Aurix board LCD.
  • Establish Ethernet-based data exchange.

Prerequisites

  • Good knowledge of C programming
  • A solid understanding of System-on-Chip design and the modules of a typical microcontroller

Supervisor:

Zafer Attal

Fine granular Page Preloading Mechanism on an FPGA Prototype

Keywords:
VHDL, C Programming, Distributed Memory, Data Migration, Task Migration, Hardware Accelerator

Description

Its main advantages, a simple design with only one transistor per bit and a high memory density, make DRAM omnipresent in most computer architectures. However, DRAM accesses are rather slow and require a dedicated DRAM controller that coordinates the read and write accesses to the DRAM as well as the refresh cycles. In order to reduce the DRAM access latency, memory prefetching is a common technique to access data prior to their actual usage. However, this requires sophisticated prediction algorithms in order to prefetch the right data at the right time.


The goal of this thesis is to refine an existing DRAM preloading mechanism on an FPGA-based prototype platform. Instead of preloading a whole memory page in a single atomic operation, the refinement should lead to fine-granular page preloading, i.e., loading multiple small fractions of a page step by step while allowing regular memory accesses to be prioritized in between.
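
The difference between atomic and fine-granular preloading can be illustrated with a small scheduling model in Python. This is a sketch under strong assumptions and not a description of the existing mechanism: time is divided into abstract slots, every preload chunk and every regular access occupies exactly one slot, and the function name `schedule` and the page and chunk sizes are chosen freely.

```python
def schedule(page_size, chunk_size, regular_arrivals):
    """Interleave chunk-wise page preloading with regular memory accesses.

    regular_arrivals maps a slot number to the number of regular requests
    arriving in that slot. Regular requests always take priority over
    pending preload chunks. Returns the operation issued in each slot.
    """
    chunks_left = -(-page_size // chunk_size)   # ceiling division
    last_arrival = max(regular_arrivals, default=-1)
    pending = 0
    slot = 0
    log = []
    while chunks_left > 0 or pending > 0 or slot <= last_arrival:
        pending += regular_arrivals.get(slot, 0)
        if pending > 0:
            pending -= 1
            log.append("regular")    # regular access is served first
        elif chunks_left > 0:
            chunks_left -= 1
            log.append("preload")    # otherwise continue the preload
        else:
            log.append("idle")
        slot += 1
    return log


if __name__ == "__main__":
    # A 4 KiB page preloaded in 1 KiB chunks; two regular requests arrive in slot 1.
    print(schedule(4096, 1024, {1: 2}))
```

In this model, a regular request waits for at most one chunk transfer, whereas with an atomic preload it could have to wait for the entire page.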


Towards this goal, you'll complete the following tasks:
1. Understanding the existing Memory Access and Preloading mechanism
2. VHDL implementation of the refined preloading functionalities
3. Write and execute small bare-metal test programs
4. Analyse and discuss the performance results

Prerequisites

  • Good knowledge of MPSoCs
  • Good VHDL skills
  • Good C programming skills
  • High motivation
  • Independent working style

Contact

Oliver Lenke

o.lenke@tum.de

Supervisor:

Oliver Lenke

FPGA-based Network Tester for 100 Gbps

Description

With the advent of research on the next generation of mobile communications (6G), we are engaged in exploring architecture extensions for Smart Network Interface Cards (SmartNICs). To enable adaptive, energy-efficient and low-latency network interfaces, we are prototyping a custom packet processing pipeline on FPGA-based NICs, partially based on the open-nic project (https://github.com/Xilinx/open-nic).

To test the performance of a SmartNIC-assisted server under peak loads and achieve precise measurements of key performance indicators (KPIs) such as throughput and latency, an FPGA-based Network Tester for 100 Gbps links shall be implemented and tested. For this, Alveo U55C FPGA-based SmartNICs shall be used. Performing packet generation as well as throughput and latency measurements in hardware should yield maximum performance and precision.

The goal of this work is to implement the required logic modules in HDL (Verilog), integrate these modules into the OpenNIC Shell platform, and test the design on the Alveo U55C FPGAs. Additionally, a software interface to control the network tester can be developed, building on a previous 10 Gbps Network Tester design. The design should also be evaluated regarding the performance of the packet generation as well as the precision of the throughput and latency measurements.
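
To give a feeling for the numbers involved, the following Python sketch computes the maximum Ethernet frame rate on a 100 Gbps link and converts a hardware timestamp difference into a latency. The per-frame overhead of 8 bytes of preamble and start-of-frame delimiter plus 12 bytes of minimum inter-frame gap is standard Ethernet; the 250 MHz timestamp clock and the function names are assumptions made for illustration only.

```python
PREAMBLE_SFD_BYTES = 8      # preamble and start-of-frame delimiter
INTER_FRAME_GAP_BYTES = 12  # minimum gap between two frames


def max_frame_rate(link_bps, frame_bytes):
    """Maximum number of frames per second for a given frame size."""
    wire_bits = (frame_bytes + PREAMBLE_SFD_BYTES + INTER_FRAME_GAP_BYTES) * 8
    return link_bps / wire_bits


def latency_ns(tx_cycle, rx_cycle, clock_hz):
    """Latency between two cycle-counter timestamps in nanoseconds."""
    return (rx_cycle - tx_cycle) * 1e9 / clock_hz


if __name__ == "__main__":
    for size in (64, 1518):
        rate = max_frame_rate(100e9, size)
        print(f"{size:5d}-byte frames: {rate / 1e6:.2f} million frames per second")
    # Assumed 250 MHz timestamp clock, i.e. a resolution of 4 ns per cycle.
    print(f"latency: {latency_ns(1000, 1250, 250e6):.0f} ns")
```

At the minimum frame size, a new frame has to be generated roughly every 6.7 ns, which is why packet generation and timestamping are done in hardware rather than in software.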

Prerequisites

  • Programming skills in VHDL/Verilog and C (and Python)
  • Good knowledge of computer networks, the OSI layer model, and protocols
  • Comfortable with the Linux command line and bash
  • Preferably practical experience in FPGA design and implementation

Supervisor:

Marco Liess