Seminar on Development and Integration of Hardware Accelerators (IN2107)

Dirk Stober and Xiaorang Guo

Dates:

Kick-off Meeting mid October

Kick-off meeting:

t.b.d

Regular meetings: t.b.d
Presentations: t.b.d
ECTS: 5
Language: English
Type: Seminar, 2SWS
Registration:

Matching System

Questions: Contact dirk.stober(at)tum.de
Requirements: 

Good understanding of computer architecture and digital design required!

 

This course is part of the BB-KI (Brandenburg / Bayern Aktion für KI-Hardware) chips project, aimed at offering practical courses in the area of dedicated AI Hardware.


Topics

The goal of the seminar is to explore approaches to developing and integrating new domain-specific architectures and accelerators, and to study existing accelerators. Possible topics include, but are not limited to:

Hardware Development:
  • Open-source HLS (e.g. PandA-Bambu, XLS) and FPGA toolchains (Yosys, etc.)
  • Domain-specific languages: SYCL, OpenACC, HPX, DaCe, ...
  • More abstract hardware description languages and frameworks (Chisel, FIRRTL, Magma and more)
  • Machine Learning on FPGAs (hls4ml, FINN, ...)
HW/SW Interface and Integration:
  • Hardware & Software Co-Design (Development, Debugging and Simulation)
  • ASIC accelerators: Bus Interfaces and Programming Languages (TPU, NVDLA, Cerebras,...)
  • Memory Hierarchy on Heterogeneous Platforms: Shared Memory, Caches, Scratchpads, Coherence and Virtual Memory Support
  • Integration of FPGAs into Cloud Environments
Novel AI Accelerators/Architectures:
  • Vision Transformer (ViT) Accelerator (Supervisor: Xiaorang Guo) (paper)
  • Graph convolutional neural networks (Supervisor: Xiaorang Guo) (paper)
  • AI for Quantum Application: Efficient ML architecture for qubit readout (Supervisor: Xiaorang Guo) (paper)
  • Processing In Memory: Microarchitecture, Workloads, Synchronization and Software Interface (Samsung Function-In-Memory, Academic Approaches) (paper)
  • ASIC accelerators for ML: Microarchitecture (TPU, Cerebras, Eyeriss,...)
  • Coarse-Grained Reconfigurable Arrays (CGRA): Application Cases, Architectures, Programming Models

 

Your own topic in the area?

 


Organization

Report:

  • Literature survey (at least 3-4 scientific papers)
  • Concepts and trade-offs
  • If applicable, experience from implementation
  • Reviewed by another student
  • Deadline: t.b.d

Presentation:

  • ~20 min presentation + 10 min questions
  • Short (2 min) pre-recorded video summary of your work/presentation, to be shown to the other students

Review:

  • Review of another student's paper

Grading:

The work is done individually. The final grade is based on the sum of the three grades (report, presentation and review), and all three tasks are mandatory to pass.


Prerequisites

  • Interest in computer architecture and/or low-level programming
  • Ability to work independently