Seminar on Integrated Systems
Lecturer (assistant) | |
---|---|
Type | seminar |
Duration | 3 SWS |
Term | Sommersemester 2024 |
Language of instruction | English |
Dates
- 22.04.2024 13:15-14:45 N2128, Seminarraum
Admission information
Note: Limited number of participants! Registration via TUMonline from March 27th to April 21st 2024 is required. Students have to choose a seminar topic before the introduction lesson; to do so, contact the supervisor of the topic you are interested in. Topics are assigned on a first-come, first-served basis. Topics will be published on April 8th 2024 at https://www.ce.cit.tum.de/en/lis/teaching/seminars/seminar-integrierte-systeme/.
Objectives
The following competencies will be acquired:
* The student is able to independently analyze state-of-the-art concepts in the field of integrated systems.
* The student is able to present a topic in a structured way according to problem formulation, state of the art, goals, methods, and results.
* The student can present a topic according to the structure given above, both orally with a set of slides and in a written report.
Description
The participants independently work on a current scientific topic, write a paper, design a poster, and present their topic in a talk. In the subsequent discussion, the topic will be treated in depth.
Prerequisites
Teaching and learning methods
Examination
- 50 % paper of 4 pages in IEEE format
- 50 % presentation of 15-20 minutes and subsequent questions
Recommended literature
Links
Assigned Topics
Seminars
Prefetching Techniques Based on Machine Learning
Description
Prefetching techniques are widely used in digital systems to enhance performance. A prefetcher predicts and fetches data before it is actually accessed, thereby hiding memory access latency.
Traditional prefetchers typically consider only one program context and work well with regular memory access patterns. Recently, machine learning techniques such as neural networks and reinforcement learning have been employed in prefetcher design. These machine-learning-based prefetchers take into account more program and system-level information, allowing them to make smarter decisions. As a result, they often achieve higher accuracy, coverage, and timeliness, leading to improved system performance.
The goal of this seminar is to study and compare prefetching mechanisms based on different machine learning methodologies. After reading a selection of papers, you should understand the advantages of using machine learning in prefetching as well as the challenges associated with its implementation. Starting-point literature will be provided.
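As a rough illustration of two of the metrics named above, accuracy and coverage are commonly computed as in the following Python sketch. The function name, its parameters, and the example counts are assumptions made for illustration and are not taken from any specific paper; timeliness is not modeled here.

```python
def prefetch_metrics(issued, useful, baseline_misses):
    """Return (accuracy, coverage) for a prefetcher.

    issued:          number of prefetches issued
    useful:          prefetched lines that were actually referenced later
    baseline_misses: demand misses of the same run without prefetching
    """
    # Accuracy: fraction of issued prefetches that turned out to be useful.
    accuracy = useful / issued if issued else 0.0
    # Coverage: fraction of the original misses removed by prefetching.
    coverage = useful / baseline_misses if baseline_misses else 0.0
    return accuracy, coverage


# Hypothetical counts, chosen only to show the arithmetic.
acc, cov = prefetch_metrics(issued=200, useful=150, baseline_misses=500)
print(f"accuracy={acc:.2f} coverage={cov:.2f}")  # accuracy=0.75 coverage=0.30
```

A prefetcher can score high on one metric and low on the other, which is one reason comparisons between designs usually report both.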
Prerequisites
For MSCE/MSEI students
Contact
Yuanji Ye
yuanji.ye@tum.de
Supervisor:
Algorithms for Memory Prefetching
Description
DRAM modules are indispensable for modern computer architectures. Their main advantages are a simple cell design with only one transistor per bit and a high memory density.
However, DRAM accesses are rather slow and require a dedicated DRAM controller that coordinates the read and write accesses to the DRAM as well as the refresh cycles.
To reduce DRAM access latency, memory prefetching is a common technique that fetches data before it is actually used. However, this requires sophisticated prediction algorithms in order to prefetch the right data at the right time.
The goal of this seminar is to study and compare several memory prefetching algorithms and present their benefits and use cases. Starting-point literature will be provided.
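To make the idea of a prediction algorithm concrete, the following Python sketch shows a simple stride prefetcher, one classic approach among many. It is a minimal sketch under stated assumptions, not one of the algorithms the seminar literature will necessarily cover: the class name, the table layout, and the `degree` parameter are hypothetical, and the table is unbounded, unlike a real hardware structure.

```python
class StridePrefetcher:
    """Per-instruction stride detector (illustrative, unbounded table)."""

    def __init__(self, degree=1):
        self.table = {}       # pc -> (last address, last observed stride)
        self.degree = degree  # how many strides ahead to prefetch

    def access(self, pc, addr):
        """Record a memory access and return the addresses to prefetch."""
        entry = self.table.get(pc)
        if entry is None:
            # First access by this instruction: nothing to predict yet.
            self.table[pc] = (addr, 0)
            return []
        last_addr, last_stride = entry
        stride = addr - last_addr
        self.table[pc] = (addr, stride)
        # Prefetch only once the same non-zero stride was seen twice in a row.
        if stride == 0 or stride != last_stride:
            return []
        return [addr + stride * i for i in range(1, self.degree + 1)]


pf = StridePrefetcher(degree=2)
for addr in (100, 164, 228):
    print(addr, pf.access(pc=0x400, addr=addr))
# 100 []
# 164 []
# 228 [292, 356]
```

Requiring the stride to repeat before prefetching trades a slower start for fewer useless prefetches on irregular access patterns.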
Contact
Oliver Lenke
o.lenke@tum.de
Supervisor:
Developments in Asynchronous SRAM Design and Verification
Description
One of the barriers to implementing asynchronous logic in commercial ASICs is the lack of asynchronous SRAM primitives and design tools. Current VLSI design tools may be coaxed into supporting implementation and layout of asynchronous logic, but often this has to be combined with traditional synchronous SRAM. Furthermore, while custom cells for asynchronous SRAM are a theoretical possibility, the lack of EDA support for design and layout of asynchronous memory arrays significantly increases the effort required. Lastly, the efficient design of asynchronous SRAM cells and controllers, and their verification, has not been explored nearly as comprehensively as for their synchronous counterparts.
For this topic, the student will look into current design strategies for 6-transistor asynchronous SRAM cells and the required controllers. Additionally, the state of the art in asynchronous memory compilers and verification tools and strategies should be explored.
Supervisor:
A Comparison of Recent Memory Prefetching Techniques
Description
DRAM modules are indispensable for modern computer architectures. Their main advantages are a simple cell design with only one transistor per bit and a high memory density.
However, DRAM accesses are rather slow and require a dedicated DRAM controller that coordinates the read and write accesses to the DRAM as well as the refresh cycles.
To reduce DRAM access latency, the cache hierarchy can be extended with dedicated hardware access predictors that preload certain data into the caches before it is actually accessed.
The goal of this seminar is to study and compare cache-level prefetching mechanisms and access predictors with several optimizations and present their benefits and use cases. Starting-point literature will be provided.
Prerequisites
B.Sc. in Electrical Engineering or similar degree
Contact
Oliver Lenke
o.lenke@tum.de
Supervisor:
A Survey of Recent Prefetching Techniques for Processor Caches
Description
Caches incur some compulsory misses by design: the first access to a given cache line typically results in a cache miss, since the data is not yet present in the cache hierarchy.
To reduce these misses, caches can be extended with prefetching mechanisms that speculatively fetch cache lines before they are first accessed.
The goal of this seminar is to study and compare different cache prefetcher designs and present their benefits and use cases. Starting-point literature will be provided.
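The effect of prefetching on compulsory misses can be illustrated with a small Python sketch of a next-line prefetcher. This is a deliberately idealized model: the function name and trace are hypothetical, the cache is assumed to be unlimited in size (so every miss is a compulsory miss), and prefetched lines are assumed to arrive before they are needed.

```python
def count_misses(line_addresses, next_line_prefetch=False):
    """Count demand misses for a trace of cache-line addresses.

    The cache never evicts, so each miss is a first access to a line.
    """
    cached = set()
    misses = 0
    for line in line_addresses:
        if line not in cached:
            misses += 1
            cached.add(line)
        if next_line_prefetch:
            # Speculatively bring in the following line on every access.
            cached.add(line + 1)
    return misses


trace = list(range(8))  # sequential walk over 8 cache lines
print(count_misses(trace))                           # 8
print(count_misses(trace, next_line_prefetch=True))  # 1
```

For a sequential walk, only the very first line still misses. For irregular traces the same prefetcher helps far less and mostly adds useless traffic, which motivates the more elaborate designs this topic surveys.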
Prerequisites
B.Sc. in Electrical Engineering or similar degree
Contact
Oliver Lenke
o.lenke@tum.de
Supervisor:
Asynchronous Design Using Standard EDA Tools
Description
Asynchronous logic has several advantages over conventional, clocked circuits, which make it of interest for certain application areas, such as networks-on-chip, mixed-mode electronics, and arithmetic processors. Furthermore, a properly designed asynchronous circuit may offer both better performance and significantly lower power consumption than a synchronous equivalent.
Modern EDA tools, however, are not optimised for asynchronous design. This unfortunately complicates everything from architectural descriptions to synthesis and implementation, to verification and testing. A major concern lies in the fact that most tools rely on global clocks for optimisation as well as timing checks. For asynchronous circuits, where all functional blocks are self-timed, this means that EDA tools will not be able to properly use clock constraints to optimise the critical path, thereby nullifying any speed advantages. More critically, EDA tools are not even guaranteed to produce functioning netlists. As such, in order to produce and test asynchronous circuits of non-trivial complexity, the standard design flow must be modified to take the characteristics of asynchronous logic into account.
For this seminar, the student should research the state of the art in asynchronous logic design and testing with current industry-standard EDA tools, and which design flow modifications are required to produce robust and efficient asynchronous circuits.
Supervisor:
Simulation of Chiplet-based Systems
Description
With technology nodes approaching their physical limit, Moore’s law becomes continually more difficult to keep up with. As a strategy to allow further scaling, chiplet-based architectures will likely become more prevalent as they offer benefits regarding development effort and manufacturing yield.
Even while reusing IP, creating an entire multi-chiplet system is still a complicated task. Following a top-down approach, a high-level simulation can help design the system architecture before going to the register transfer level. As most available simulators cater to classical SoCs, setting up a simulation for chiplet-based systems might require special attention in selecting a framework and effort in its adaptation.
This seminar work should investigate what needs to be considered when simulating chiplet-based systems compared to SoCs, which simulation frameworks are viable, and what challenges the simulation of chiplets, and especially of their interconnect, brings.
A starting point for literature could be the following paper:
https://dl.acm.org/doi/abs/10.1145/3477206.3477459
Contact
michael.meidinger@tum.de
Supervisor:
Exploration of Deadlock-Avoidance Algorithms for FPGA-Based Network-on-Chips
Description
Network-on-chip (NoC) is a communication architecture used in multi-core and many-core systems to interconnect processing elements (PEs), such as CPUs, GPUs, accelerators, and memory controllers, using packet-switched networks similar to those found in computer networks. It replaces traditional bus-based interconnects with a scalable and modular network infrastructure, offering higher performance, lower latency, and improved scalability. In a NoC, PEs are connected through a network of routers and links, forming a mesh, torus, or other topologies. Each router is responsible for forwarding packets between neighboring PEs using routing algorithms. NoC architectures can vary greatly in terms of topology, routing algorithms, flow control mechanisms, and other parameters, depending on the specific application requirements and design constraints.
Field-Programmable Gate Arrays (FPGAs) are integrated circuits that contain an array of configurable logic blocks interconnected through programmable routing resources. They provide a versatile and powerful platform for implementing digital circuits and systems, offering flexibility, reconfigurability, parallelism, and hardware acceleration capabilities. Hence, they are well-suited for a wide range of applications across various domains, including telecommunications, networking, automotive, aerospace, consumer electronics, and industrial automation.
FPGA-optimized NoCs are tailored to exploit the unique features and capabilities of FPGAs while addressing the challenges of communication and interconnection in FPGA-based systems. They play a crucial role in enabling efficient and scalable communication infrastructure for FPGA-based applications across a wide range of domains. The goal of this seminar work is to investigate state-of-the-art deadlock-avoidance algorithms for FPGA-based NoCs.
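As a point of reference for what a deadlock-avoidance scheme can look like, the following Python sketch implements dimension-ordered (XY) routing on a 2D mesh. XY routing is a classic textbook example: because a packet never turns from the Y dimension back into the X dimension, cyclic channel dependencies cannot form on a mesh. It is shown here only as an illustration and is not claimed to be the scheme used in the listed literature; the function name and coordinate convention are assumptions.

```python
def xy_route(src, dst):
    """Return the routers visited from src to dst under XY routing.

    src and dst are (x, y) router coordinates in a 2D mesh.
    """
    x, y = src
    dst_x, dst_y = dst
    path = [(x, y)]
    # First travel along the X dimension until the column matches.
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append((x, y))
    # Only then travel along the Y dimension.
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append((x, y))
    return path


print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
```

The price of this simplicity is that each source and destination pair has exactly one path, so the network cannot route around congestion. Adaptive routing and virtual-channel-based schemes relax that restriction at a higher resource cost, which matters on FPGAs.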
Relevant literature
[1] Monemi, Alireza, et al. "ProNoC: A low latency network-on-chip based many-core system-on-chip prototyping platform." Microprocessors and Microsystems 54 (2017): 60-74.
[2] Becker, Daniel U. Efficient microarchitecture for network-on-chip routers. Stanford University, 2012.
[3] Xu, Yi, et al. "Simple virtual channel allocation for high throughput and high frequency on-chip routers." HPCA-16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture. IEEE, 2010.
Supervisor:
An Overview of Service Migration in Modern Edge Computer Networks
Description
In modern Edge computer networks, applications and services should adhere to service-level agreements (SLAs) such as low latency or minimum throughput. Depending on demand and resource availability, these services have to be migrated between compute nodes to meet these SLAs.
Service migration is a critical aspect of Edge computing, enabling the movement of services closer to the data source or end-users for improved performance and reduced latency. However, it comes with its own set of challenges, such as maintaining service continuity and managing resource constraints. This involves checkpointing and restarting of the applications (potentially in containers), as well as moving the data from one compute node to the other. This data movement could be further improved with RDMA technology.
This seminar should provide a background overview of the required technologies for service migration and explore recent improvements for low-latency service migration in both hardware and software.
Contact
marco.liess@tum.de