Seminar Integrierte Systeme (Integrated Systems Seminar)
Lecturer (assistant) | |
Type | Seminar |
Duration | 3 SWS (weekly hours per semester) |
Term | Winter semester 2024/25 |
Language of instruction | German |
- 25.10.2024 10:00-11:30 N2128, seminar room
Note: Limited number of participants! Registration in TUMonline from 23.09.2024 to 24.10.2024. Each student must choose a seminar topic before the introductory session by contacting the respective topic supervisor. Topics are assigned in the order in which requests are received. The individual topics will be announced at <a href=""></a> from 07.10.2024.
Participants independently work through current scientific publications, write a graded paper, present their topic in a colloquium, and contribute to the colloquium discussions.
Prerequisites
Teaching and learning methods
Each participant is assigned an individual supervisor depending on the chosen topic. The supervisor supports the student particularly at the start of the work by introducing the subject area, providing suitable literature, and giving helpful advice on the technical work as well as on preparing the written paper and the presentation.
Course and examination requirements
- 50 % written paper (typically 4 pages)
- 50 % presentation (20 minutes) plus discussion (5 minutes)
Recommended literature
Offered topics
Assigned topics
Prefetching Techniques Based on Machine Learning
Prefetching techniques are widely used in digital systems to enhance performance. A prefetcher predicts and fetches data before it is actually accessed, thereby hiding memory access latency.
Traditional prefetchers typically consider only one program context and work well with regular memory access patterns. Recently, machine learning techniques such as neural networks and reinforcement learning have been employed in prefetcher design. These machine learning based prefetchers take into account more program and system-level information, allowing them to make smarter decisions. As a result, they often achieve higher accuracy, coverage, and timeliness, leading to improved system performance.
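As a rough illustration of what a "regular memory access pattern" means for a traditional prefetcher, the following sketch models a single-stream stride prefetcher and reports its coverage and accuracy on an address trace. It is not taken from the seminar literature: the class `StridePrefetcher`, the function `evaluate`, and the simplification that every access counts as a would-be miss (no cache is modelled, and timeliness is ignored) are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StridePrefetcher:
    """Toy single-stream stride prefetcher (hypothetical, for illustration only)."""
    last_addr: Optional[int] = None
    last_stride: Optional[int] = None

    def access(self, addr: int) -> Optional[int]:
        """Observe a demand access; return an address to prefetch, or None."""
        prediction = None
        if self.last_addr is not None:
            stride = addr - self.last_addr
            # Predict only after the same non-zero stride was seen twice in a row.
            if stride != 0 and stride == self.last_stride:
                prediction = addr + stride
            self.last_stride = stride
        self.last_addr = addr
        return prediction


def evaluate(trace):
    """Return (coverage, accuracy) of the prefetcher on an address trace.

    Assumption: every access would miss without prefetching, so
    coverage = useful prefetches / accesses and
    accuracy = useful prefetches / issued prefetches.
    """
    prefetcher = StridePrefetcher()
    issued = set()
    useful = 0
    for addr in trace:
        if addr in issued:
            useful += 1
        prediction = prefetcher.access(addr)
        if prediction is not None:
            issued.add(prediction)
    coverage = useful / len(trace)
    accuracy = useful / len(issued) if issued else 0.0
    return coverage, accuracy


regular = [0, 64, 128, 192, 256, 320]   # constant stride of 64 bytes
irregular = [0, 100, 7, 512, 33, 900]   # no repeating stride
print(evaluate(regular))
print(evaluate(irregular))
```

On the constant-stride trace the prefetcher starts predicting after two identical strides, whereas on the irregular trace it never issues a prefetch. Irregular patterns of this kind are where the additional context used by learning-based designs is expected to help.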
The goal of this seminar is to study and compare prefetching mechanisms based on different machine learning methodologies. After reading a selection of papers, you should understand the advantages of using machine learning in prefetching, as well as the challenges associated with its implementation. Starting literature will be provided.
For MSCE/MSEI students
Yuanji Ye
Algorithms for Memory Prefetching
DRAM modules are indispensable for modern computer architectures. Their main advantages are a simple cell design with only one transistor per bit and a high memory density.
However, DRAM accesses are rather slow and require a dedicated DRAM controller that coordinates the read and write accesses to the DRAM as well as the refresh cycles.
To reduce DRAM access latency, memory prefetching is commonly used to fetch data before it is actually needed. However, this requires sophisticated prediction algorithms in order to prefetch the right data at the right time.
The goal of this seminar is to study and compare several memory prefetching algorithms and to present their benefits and use cases. Starting literature will be provided.
Oliver Lenke
Developments in Asynchronous SRAM Design and Verification
One of the barriers to implementing asynchronous logic in commercial ASICs is the lack of asynchronous SRAM primitives and design tools. Current VLSI design tools may be coaxed into supporting implementation and layout of asynchronous logic, but often this has to be combined with traditional synchronous SRAM. Furthermore, while custom cells for asynchronous SRAM are a theoretical possibility, the lack of EDA support for design and layout of asynchronous memory arrays significantly increases the effort required. Lastly, the efficient design of asynchronous SRAM cells and controllers, and their verification, has not been explored nearly as comprehensively as for their synchronous counterparts.
For this topic, the student will look into current design strategies for 6-transistor asynchronous SRAM cells and the required controllers. Additionally, the state of the art in asynchronous memory compilers and in verification tools and strategies should be explored.
A Comparison of Recent Memory Prefetching Techniques
DRAM modules are indispensable for modern computer architectures. Their main advantages are a simple cell design with only one transistor per bit and a high memory density.
However, DRAM accesses are rather slow and require a dedicated DRAM controller that coordinates the read and write accesses to the DRAM as well as the refresh cycles.
To reduce DRAM access latency, the cache hierarchy can be extended with dedicated hardware access predictors that preload data into the caches before it is actually accessed.
The goal of this seminar is to study and compare cache-level prefetching mechanisms and access predictors, including several optimizations, and to present their benefits and use cases. Starting literature will be provided.
B.Sc. in Electrical Engineering or a similar degree
Oliver Lenke
A Survey of Recent Prefetching Techniques for Processor Caches
Caches incur some compulsory misses by design, i.e. the first access to a given cache line typically results in a cache miss, since the data is not yet present in the cache hierarchy.
In order to reduce this, caches can be extended by prefetching mechanisms that speculatively prefetch some cachelines before they first get accessed.
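The effect of speculative prefetching on compulsory misses can be sketched with a very small model. The example below assumes an idealised cache with unlimited capacity, so every miss is a first-touch miss, and a next-line prefetcher whose data arrives instantly. The line size of 64 bytes and the function name `count_misses` are assumptions chosen for illustration, not parameters from the seminar material.

```python
LINE_SIZE = 64  # bytes per cache line (assumed value)


def count_misses(addresses, next_line_prefetch=False):
    """Count misses in an idealised cache with unlimited capacity.

    With unlimited capacity nothing is ever evicted, so every miss
    counted here is a compulsory (first-touch) miss.
    """
    cached_lines = set()
    misses = 0
    for addr in addresses:
        line = addr // LINE_SIZE
        if line not in cached_lines:
            misses += 1
            cached_lines.add(line)
            if next_line_prefetch:
                # Speculatively bring in the following line on every miss.
                cached_lines.add(line + 1)
    return misses


# Sequential scan over 1 KiB in 8-byte steps, touching 16 cache lines.
scan = range(0, 1024, 8)
print(count_misses(scan))                           # 16
print(count_misses(scan, next_line_prefetch=True))  # 8
```

In this idealised setting the sequential scan sees half as many misses with next-line prefetching. A real design would also have to account for limited capacity, prefetch timeliness, and the bandwidth cost of useless prefetches.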
The goal of this seminar is to study and compare different cache prefetcher designs and to present their benefits and use cases. Starting literature will be provided.
B.Sc. in Electrical Engineering or a similar degree
Oliver Lenke
Asynchronous Design Using Standard EDA Tools
Asynchronous logic has several advantages over conventional clocked circuits, which makes it of interest for certain application areas, such as networks-on-chip, mixed-mode electronics, and arithmetic processors. Furthermore, a properly designed asynchronous circuit may offer both better performance and significantly lower power consumption than a synchronous equivalent.
Modern EDA tools, however, are not optimised for asynchronous design. This complicates everything from architectural description through synthesis and implementation to verification and testing. A major concern is that most tools rely on global clocks for optimisation as well as for timing checks. For asynchronous circuits, where all functional blocks are self-timed, this means that EDA tools cannot properly use clock constraints to optimise the critical path, thereby nullifying any speed advantages. Critically, EDA tools are not even guaranteed to produce functioning netlists. As such, in order to produce and test asynchronous circuits of non-trivial complexity, the standard design flow must be modified to take the characteristics of asynchronous logic into account.
For this seminar, the student should research the state of the art in asynchronous logic design and testing with current industry-standard EDA tools, and which design flow modifications are required to produce robust and efficient asynchronous circuits.
Exploration of Deadlock-Avoidance Algorithms for FPGA-Based Networks-on-Chip
Network-on-chip (NoC) is a communication architecture used in multi-core and many-core systems to interconnect processing elements (PEs), such as CPUs, GPUs, accelerators, and memory controllers, using packet-switched networks similar to those found in computer networks. It replaces traditional bus-based interconnects with a scalable and modular network infrastructure, offering higher performance, lower latency, and improved scalability. In a NoC, PEs are connected through a network of routers and links, forming a mesh, torus, or other topologies. Each router is responsible for forwarding packets between neighboring PEs using routing algorithms. NoC architectures can vary greatly in terms of topology, routing algorithms, flow control mechanisms, and other parameters, depending on the specific application requirements and design constraints.
Field-Programmable Gate Arrays (FPGAs) are integrated circuits that contain an array of configurable logic blocks interconnected through programmable routing resources. They provide a versatile and powerful platform for implementing digital circuits and systems, offering flexibility, reconfigurability, parallelism, and hardware acceleration capabilities. Hence, they are well-suited for a wide range of applications across various domains, including telecommunications, networking, automotive, aerospace, consumer electronics, and industrial automation.
FPGA-optimized NoCs are tailored to exploit the unique features and capabilities of FPGAs while addressing the challenges of communication and interconnection in FPGA-based systems. They play a crucial role in enabling efficient and scalable communication infrastructure for FPGA-based applications across a wide range of domains. The goal of this seminar work is to investigate state-of-the-art deadlock-avoidance algorithms for FPGA-based NoCs.
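To give a concrete flavour of deadlock avoidance by routing restriction, the sketch below implements dimension-order (XY) routing on a 2D mesh, a textbook scheme that is not necessarily the one used in the literature listed for this topic. Packets first travel along the X dimension and only then along Y. Because a packet never turns from Y back to X, the channel dependency graph of a mesh contains no cycles. The coordinate convention and the function name `xy_route` are assumptions made for this example.

```python
def xy_route(src, dst):
    """Return the router coordinates visited from src to dst on a 2D mesh.

    Dimension-order (XY) routing: move along X until the column matches,
    then along Y. Turns from Y back to X never occur.
    """
    x, y = src
    dst_x, dst_y = dst
    path = [(x, y)]
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append((x, y))
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append((x, y))
    return path


print(xy_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
print(xy_route((3, 2), (1, 2)))  # [(3, 2), (2, 2), (1, 2)]
```

The price of this simplicity is that the route is fully deterministic, so the network cannot steer around congested links. Topologies with wrap-around links, such as a torus, need additional measures (for example virtual channels) because the wrap-around links reintroduce cyclic dependencies.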
Relevant literature
[1] Monemi, Alireza, et al. "ProNoC: A low latency network-on-chip based many-core system-on-chip prototyping platform." Microprocessors and Microsystems 54 (2017): 60-74.
[2] Becker, Daniel U. Efficient microarchitecture for network-on-chip routers. Stanford University, 2012.
[3] Xu, Yi, et al. "Simple virtual channel allocation for high throughput and high frequency on-chip routers." HPCA-16 2010 The Sixteenth International Symposium on High-Performance Computer Architecture. IEEE, 2010.
An Overview of Service Migration in Modern Edge Computer Networks
In modern Edge computer networks, applications and services should adhere to service-level agreements (SLAs) such as low latency or minimum throughput. Depending on demand and resource availability, these services have to be migrated between compute nodes to meet these SLAs.
Service migration is a critical aspect of Edge computing, enabling the movement of services closer to the data source or end-users for improved performance and reduced latency. However, it comes with its own set of challenges, such as maintaining service continuity and managing resource constraints. This involves checkpointing and restarting of the applications (potentially in containers), as well as moving the data from one compute node to the other. This data movement could be further improved with RDMA technology.
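The checkpoint, transfer, and restart sequence described above can be sketched at application level as follows. This is only a toy under strong assumptions: the service `CounterService` is hypothetical, its state is a plain Python dictionary serialised with the standard `pickle` module, and the transfer between nodes is represented by simply handing over a byte string. Real migrations typically checkpoint whole processes or containers and move the data over the network.

```python
import pickle


class CounterService:
    """Hypothetical stateful service, used only for illustration."""

    def __init__(self):
        self.requests_served = 0

    def handle(self):
        self.requests_served += 1
        return self.requests_served


def checkpoint(service):
    """Serialise the service state into bytes that could be shipped to another node."""
    return pickle.dumps(service.__dict__)


def restore(blob):
    """Recreate the service on the target node from a checkpoint."""
    service = CounterService()
    service.__dict__.update(pickle.loads(blob))
    return service


# Source node: serve two requests, then checkpoint.
source = CounterService()
source.handle()
source.handle()
blob = checkpoint(source)

# Target node: restore and continue where the source left off.
target = restore(blob)
print(target.handle())  # 3
```

The size of `blob` is what has to cross the network during migration, which is why the amount of state and the transfer mechanism dominate migration latency in this simplified picture.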
This seminar should provide a background overview of the required technologies for service migration and explore recent improvements for low-latency service migration in both hardware and software.