Experimental Evaluation of xApp-related Vulnerabilities in FlexRIC
Description
In previous mobile network generations, Radio Access Networks (RAN) have been treated as a proprietary, closed network segment that is specific to every operator. To accelerate development and innovation, new initiatives such as the O-RAN ALLIANCE were born, aiming to split the RAN into different components and standardize the open interfaces that connect them.
Fundamentally, O-RAN leverages the concept of Software Defined RAN (SD-RAN) by decoupling the RAN data plane from the control plane and introducing several new RAN-controlling components. One of the central components is the near real-time RAN Intelligent Controller (nearRT-RIC), which manages the RAN (network slices, handovers, etc.). The nearRT-RIC is designed to allow both the use of traditional, rule-based policies and Machine Learning or data-driven ones to optimize the RAN operation. The logic of these policies is encapsulated in applications called xApps that run on the nearRT-RIC platform and can read and modify different parameters of the RAN.
Among open-source implementations of the nearRT-RIC, OpenAirInterface FlexRIC is one of the most prominent, providing a flexible platform to experiment with custom RIC functionality [1].
While providing opportunities for efficient resource management, the nearRT-RIC is also a prospective target for attackers because of its control power over the RAN. Specifically, one attack vector is a malicious xApp that can interfere with other legitimate xApps running on the nearRT-RIC.
Near-RT RIC implementations are still in their infancy and exhibit various bugs and security vulnerabilities, particularly at the E2 interface [2].
To investigate the broader impact of such issues, we intend to examine whether known vulnerabilities - originally identified in other nearRT-RIC platforms [3, 4] - can be reproduced in the OpenAirInterface FlexRIC [1]. Specifically, we analyze the extent to which crafted messages from malicious xApps can disrupt FlexRIC’s operation through the E2 interface. Understanding the susceptibility of FlexRIC to these types of attacks is essential for evaluating its robustness and for hardening open-source RIC implementations more generally. It will also help determine whether the vulnerabilities are caused by specific implementations or are common problems in the design of nearRT-RIC systems.
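To give a flavour of the experimental methodology, a minimal probe of this kind could look as follows. This is a sketch only: it assumes a locally deployed FlexRIC whose E2 termination is reachable over SCTP, and the host, port number, and payload bytes are illustrative assumptions rather than FlexRIC specifics.

```python
# Hedged sketch: open an SCTP association to a (local, lab-only) nearRT-RIC
# E2 termination and send deliberately malformed bytes to observe how the
# RIC reacts (crash, disconnect, error reply). Host/port are assumptions.
import socket

RIC_HOST = "127.0.0.1"   # assumption: FlexRIC running on the same machine
E2_PORT = 36421          # assumption: commonly used E2AP SCTP port

def probe(payload: bytes) -> None:
    # SCTP sockets are available in CPython on Linux
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_SCTP)
    sock.settimeout(5.0)
    try:
        sock.connect((RIC_HOST, E2_PORT))
        sock.send(payload)                 # e.g., a truncated E2AP PDU
        print("RIC reply:", sock.recv(4096))
    except OSError as exc:
        print("behaviour under malformed input:", exc)
    finally:
        sock.close()

if __name__ == "__main__":
    probe(b"\x00" * 16)    # not a valid ASN.1 APER-encoded E2AP message
```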
Objectives
The goal of this Student Work is to reproduce the attacks on the OSC RIC discussed in [2] and [4] against the OpenAirInterface FlexRIC. Additionally, after reproducing the existing attacks and understanding the FlexRIC platform, the student is expected to explore new attack attempts with the same goal of disrupting the nearRT-RIC. Special focus will be put on the critical interfaces of the system, such as the E2 and E42 interfaces.
[1] R. Schmidt, M. Irazabal, and N. Nikaein, “FlexRIC: an SDK for next-generation SD-RANs,” in Proceedings of the 17th International Conference on Emerging Networking EXperiments and Technologies, ser. CoNEXT ’21. New York, NY, USA: Association for Computing Machinery, 2021, pp. 411–425. [Online]. Available: https://doi.org/10.1145/3485983.3494870
[2] C.-F. Hung, Y.-R. Chen, C.-H. Tseng, and S.-M. Cheng, “Security Threats to xApps Access Control and E2 Interface in O-RAN,” IEEE Open Journal of the Communications Society, vol. 5, pp. 1197–1203, 2024.
[3] “O-RAN SC Projects,” https://docs.o-ran-sc.org/en/latest/projects.html#near-realtime-ran-intelligent-controller-ric, accessed: 2025-03-01.
[4] “Opening Critical Infrastructure: The Current State of Open RAN Security,” https://www.trendmicro.com/en_us/research/23/l/the-current-state-of-open-ran-security.html, accessed: 2025-03-01.
Prerequisites
- Interest in network security and hands-on approach
- Experience with C/C++ or Python; familiarity with Linux-based development
- Basic knowledge of mobile networks and software-defined networking (SDN)
Contact
- Razvan-Mihai Ursu (razvan.ursu@tum.de)
Supervisor:
Evaluating the Energy Consumption of Indoor Networks
Description
Join us in exploring the future of energy-efficient in-building networks! As demand for high-performance wireless connectivity grows, optimizing Access Point (AP) placement is critical to reducing energy consumption and operational costs.
In this work, you will:
- Measure power consumption of WiFi and LiFi networks under different AP placement strategies.
- Use planning tools to simulate and evaluate network layouts.
- Compare energy efficiency and cost per bit for various deployment scenarios (see the sketch after this list).
- Develop recommendations for optimizing AP placement to improve sustainability and reduce costs.
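To make the energy-per-bit and cost-per-bit comparison concrete, here is a minimal sketch of the arithmetic involved; all power, throughput, and price numbers are illustrative placeholders, not measurement results.

```python
# Hedged sketch: energy per bit (J/bit = W / (bit/s)) and electricity cost
# per gigabit for a few hypothetical AP placement scenarios.
AVG_POWER_W = {"wifi_dense": 9.5, "wifi_sparse": 6.8, "lifi": 4.2}
AVG_THROUGHPUT_MBPS = {"wifi_dense": 480, "wifi_sparse": 310, "lifi": 150}
PRICE_EUR_PER_KWH = 0.30   # assumed electricity price

for scenario, power_w in AVG_POWER_W.items():
    bits_per_s = AVG_THROUGHPUT_MBPS[scenario] * 1e6
    energy_j_per_bit = power_w / bits_per_s
    # 1 kWh = 3.6e6 J, so EUR/Gbit = (J/bit * 1e9 bits) / 3.6e6 * price
    cost_eur_per_gbit = energy_j_per_bit * 1e9 / 3.6e6 * PRICE_EUR_PER_KWH
    print(f"{scenario}: {energy_j_per_bit * 1e9:.2f} nJ/bit, "
          f"{cost_eur_per_gbit:.5f} EUR/Gbit")
```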
Prerequisites
- Background in wireless networking and communication systems
- Experience with Python
- Strong problem-solving skills
- Availability to work in-presence on the testbed
Supervisor:
Identifying Challenges in Reliability and Security of Multi-Domain QKD Networks
Description
This research internship focuses on investigating the state-of-the-art reliability and security mechanisms in multi-domain Quantum Key Distribution (QKD) networks. The first objective is to gain a deep understanding of multi-domain networks and how QKD is applied in this concept. Then, the goal is to explore existing architectures, key management protocols, and inter-domain coordination strategies to understand how QKD networks ensure secure key exchange across multiple administrative domains. A key objective is to identify challenges and open research questions related to key relay trust models, authentication between domains, error correction, and resilience against failures or attacks. The study will also assess the impact of network heterogeneity, policy conflicts, and potential attack vectors, such as man-in-the-middle threats in QKD link handovers. The findings will contribute to a structured gap analysis, providing insights into future research directions to enhance the scalability, robustness, and trustworthiness of multi-domain QKD networks.
Prerequisites
- Strong networking background (knowledge on QKD is a plus)
- Research motivation and critical thinking
Supervisor:
Analysis of Dependability and Resilience of BGP and End-To-End Routing in Multi-Domain Networks
Description
The Border Gateway Protocol (BGP) is the fundamental mechanism enabling coordination and routing across multiple autonomous systems in the global Internet. Despite its widespread adoption, BGP faces significant challenges related to dependability and resilience, including route convergence delays, misconfigurations, route hijacking, and limited support for fast recovery after failures. Ensuring dependable inter-domain coordination is critical for the stability of the Internet and large-scale communication networks. The first objective of the research internship is to gain a deep understanding of how BGP is implemented in multi-domain networks and what the challenges and gaps are regarding its dependability and resilience. Once BGP is well understood, another issue is how it is used for end-to-end demand routing in multi-domain networks. The second part of the research internship should focus on studying the state of the art regarding routing schemes and the different routing parameters considered in multi-domain networks, according to different types of networks, applications, and use cases. The goal is to identify parameters that are still missing when optimizing demand routing in multi-domain networks and how their absence affects reliability and resilience.
Prerequisites
- Strong networking background
- Research motivation and critical thinking
Contact
maria.samonaki@tum.de
Supervisor:
Student Assistant for Programmable Communication Networks Lab Summer Semester 2025
Description
The PCN lab offers the opportunity to become familiar with OpenFlow and P4 for computer networks. For the next semester, a position is available to assist the participants during the labs and the project phase. The lab is planned to be held on-site every Wednesday from 13:00 to 17:00.
Prerequisites
- Knowledge of communication networks.
- Solid programming skills: Python.
- Linux knowledge.
Contact
kaan.aykurt@tum.de
nicolai.kroeger@tum.de
Supervisor:
Mobile Communication Message Security Analysis with Machine Learning
Description
Background
This topic concerns the analysis of mobile communication messages in 5G. There are several different kinds of these messages, with different functions and levels of information content. The analysis should be done using state-of-the-art machine learning algorithms such as random forests or neural networks. A focus should be on how to identify differences between network vendors.
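As a rough illustration of the intended direction, the sketch below trains a random forest on hand-crafted per-message features and inspects which features separate vendors. The features, the synthetic data, and the three-vendor setup are purely illustrative assumptions, not the project's actual pipeline.

```python
# Hedged sketch: vendor fingerprinting of protocol messages with a random
# forest over simple numeric features. Data here is random noise, standing
# in for features extracted from captured 5G messages.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
# toy features per message: [length, num_info_elements, has_optional_field]
X = rng.random((600, 3))
y = rng.integers(0, 3, size=600)           # 3 hypothetical vendors

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
# feature_importances_ hints at which message fields differ across vendors
print(clf.feature_importances_)
```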
Objectives
This thesis provides an opportunity for students to gain hands-on experience with 5G technology through the following activities (varying complexity, depending on the type of thesis):
• Security and privacy analysis of Mobile Communication Messages
• Focus on machine learning algorithms
• Documentation and Reporting: Document the research process, experimental setups, findings, and challenges encountered during the research.
Application Process
Note that this thesis is offered by ZITiS and academically supervised by LKN. Therefore, applicants should follow the application process as described below. In case of issues with the process, please contact Nicolai Kröger.
All applications must be submitted through the application website INTERAMT.
• Master Thesis: https://interamt.de/koop/app/stelle?0&id=1242375
• Bachelor Thesis: https://www.interamt.de/koop/app/stelle?id=1242370
Carefully note the information provided on the site to avoid any issues with your application.
Your application should include
• a short CV
• a current transcript of records
• the keyword “T3-MK-ANALYSIS” as a comment
For any questions or further details regarding this thesis and the application process, please feel free to contact
• TUM contact: nicolai.kroeger@tum.de
• Forschungreferat T3 (ZITiS): t3@zitis.bund.de
Prerequisites
Candidates should possess basic programming skills (Python) and machine learning skills. Interest in networking and wireless communication technologies is required. Although prior knowledge of 5G technology is beneficial, it is not mandatory. Familiarity with wireless communication protocols and network security principles is advantageous.
Supervisor:
Mobile Communication Broadcast Message Security
Description
Background
In this topic, an analysis of broadcast messages in 4G and 5G should be done. There exist several different kinds of these messages, with different functions and levels of information content. The analysis should consider privacy and security aspects. After the theoretical review and analysis, the practical part should focus on one aspect of the findings. An implementation of one security and privacy aspect should be done as a proof of concept with open-source hardware and software.
Objective
This thesis provides an opportunity for students to gain hands-on experience with 4G and 5G technology through the following activities (varying complexity, depending on the type of thesis):
• Security and privacy analysis of Broadcast Messages
• Focus on cell priorities and implementation as a proof-of-concept
• Practical evaluation with testing
• Documentation and Reporting: Document the research process, experimental setups, findings, and challenges encountered during the research.
Application Process
The offered thesis is external (from ZITiS: https://www.zitis.bund.de/) and academically supervised by TUM LKN. Thus, all applications must be submitted through the application website INTERAMT.
• Master Thesis: https://interamt.de/koop/app/stelle?0&id=1242375
• Bachelor Thesis: https://www.interamt.de/koop/app/stelle?id=1242370
Carefully note the information provided on the site to avoid any issues with your application.
Your application should include
• a short CV
• a current transcript of records
• the keyword “T3-MK-BROADCAST” as a comment
For any questions or further details regarding this thesis and the application process, please feel free to contact
• TUM contact: nicolai.kroeger@tum.de
• Forschungreferat T3 (ZITiS): t3@zitis.bund.de
Prerequisites
Candidates should possess basic programming skills (C/C++ and Python) and have an interest in networking and wireless communication technologies. Although prior knowledge of 4G and 5G technology is beneficial, it is not mandatory. Familiarity with wireless communication protocols, network security principles, and basic hardware interfacing is advantageous.
Supervisor:
Resource Optimization for Multi-Link Operation towards Wi-Fi 8
Description
Wi-Fi 7 introduces Multi-Link Operation (MLO), a technology that enables devices to transmit and receive data across multiple frequency bands simultaneously. This innovation improves network performance and reliability, especially in high-density environments.
In this thesis, you'll develop and evaluate cutting-edge resource allocation strategies for MLO using the ns-3 network simulator. Join this exciting research and help shape the next generation of wireless communication!
Related Reading:
- M. Carrascosa-Zamacois, G. Geraci, E. Knightly and B. Bellalta, "Wi-Fi Multi-Link Operation: An Experimental Study of Latency and Throughput," in IEEE/ACM Transactions on Networking, vol. 32, no. 1, pp. 308-322, Feb. 2024, doi: 10.1109/TNET.2023.3283154.
- L. Zhang, H. Yin, S. Roy, L. Cao, X. Gao and V. Sathya, "IEEE 802.11be Network Throughput Optimization With Multi-Link Operation and AP Controller," in IEEE Internet of Things Journal, doi: 10.1109/JIOT.2024.3386653.
If you are interested in this work, please send me an email with a short introduction of yourself along with your CV and grade transcript.
Prerequisites
- Experience with programming in C/C++
- Strong foundation in wireless networking concepts
- Motivation to learn multi-link concepts
Supervisor:
Most energy efficient Core on a private Telco Cloud: Energy optimized redundancy model for telco applications
Kubernetes, Energy Efficiency, 5G Core Network
Description
Motivation:
Deutsche Telekom operates and constantly develops and improves its own cloud to run internet and telephony services. The Kubernetes cloud and the telco applications are combined to form a TaaP (Telco as a Platform). The TaaP comprises thousands of servers and hundreds of applications. The energy efficiency of the TaaP is a key success criterion in order to optimize costs, energy consumption, and carbon emissions. Hence, the concept of Full Stack Energy Management has been established. The focus is to optimize hardware, software, and services towards energy efficiency without affecting service availability and robustness.
Problem & Challenge:
In the Telco industry, so far, HW redundancy has been the baseline for service robustness and resilience. The introduction of virtualization and containerization concepts resulted in an additional redundancy level above the hardware. Classical redundancy models don’t apply to this multi-layer redundancy any longer. Moreover, there is no mathematical model that calculates the service availability for such a case.
Specific Problem Formulation:
On a TaaP, there are multiple layers of redundancy in hardware and software. On the one hand, there are multiple site deployments, where each site has several hundred servers. On the other hand, at each site, each server has multiple redundant hardware parts such as power supplies. Moreover, a Kubernetes cluster, which is homed at one site, hosts multiple microservices, each with a different redundancy concept such as active/passive, n+1, n+m, etc. This mix of HW and SW redundancy causes inefficiency and is not easy to calculate or simulate in terms of overall service, network, site, redundancy, and energy consumption.
Solution Approach:
There are multiple different parameters in HW and SW that impact service availability and energy consumption. Firstly, a comprehensive list of these parameters is required, including modeling of their dependencies. Secondly, a model needs to be set up that combines all of these parameters into “one equation”.
Expected Outcome:
A simulation and mathematical model should be developed that considers software and hardware redundancy across multiple sites and SW layers in order to calculate the network-wide service availability. Moreover, the model should allow the optimization of the following parameters: least required HW based on predefined service availability, least energy consumption, and best redundancy.
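As a first impression of the modeling task, here is a minimal sketch of one building block such a model could start from: the steady-state availability of an n+m replica scheme combined in series with a site's availability. All numbers are illustrative assumptions, not TaaP figures.

```python
# Hedged sketch: a service backed by n+m replicas is up if at least n
# replicas are up (binomial k-out-of-n model), and the replicas only help
# while their hosting site is up (simple series combination).
from math import comb

def k_out_of_n_availability(a: float, n: int, m: int) -> float:
    """a: availability of one replica; service needs >= n of n+m replicas."""
    total = n + m
    return sum(comb(total, k) * a**k * (1 - a)**(total - k)
               for k in range(n, total + 1))

replica_a = 0.995   # illustrative single-replica availability
site_a = 0.999      # illustrative site (power/network) availability
service_a = site_a * k_out_of_n_availability(replica_a, n=3, m=1)
print(f"service availability: {service_a:.6f}")
# energy angle: every extra replica (larger m) adds power draw, so the
# optimization trades watts against nines across sites and SW layers
```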
Prerequisites
- Familiarity with tools such as GitLab and Wiki platforms.
- Proficiency in English. The project language is English and the team spans four EU countries.
- Basic Kubernetes know-how.
- High level of self-engagement and motivation.
Contact
- Manuel Keipert (manuel.keipert@telekom.de)
- Valentin Haider (valentin.haider@tum.de)
- Razvan-Mihai Ursu (razvan.ursu@tum.de)
Supervisor:
Early Warning Model (EWM) for Anomalies in Deutsche Telekom Streaming Data
Description
Through its nationwide communication infrastructure, Deutsche Telekom operates a large variety of services targeted at the needs of customers and their devices. With technological advances reaching many industries, the set of such networked daily-use devices includes not only phones but also TV attachments and many more. Naturally, this combination of a high number of users plus the variety of services and devices produces a large amount of heterogeneous data. Unexpected events and anomalous behavior can easily cause service disruptions and even downtime for the system. Therefore, it is important to identify points within the streaming data that indicate deviations from normal system operation. In this context, the thesis aims to evaluate the ability to flag such anomalies early on or even predict them in advance, essentially creating an early warning model (EWM).
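As a minimal illustration of the early-warning idea (far simpler than what the thesis would build), the sketch below flags points in a stream whose rolling z-score exceeds a threshold, using only past data for each decision. Data, window, and threshold are illustrative assumptions.

```python
# Hedged sketch: causal rolling z-score anomaly flagging on a toy stream.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
series = pd.Series(rng.normal(100, 5, 1000))   # stand-in for a KPI stream
series.iloc[700] += 40                         # injected anomaly

window, threshold = 50, 4.0
mean = series.rolling(window).mean()
std = series.rolling(window).std()
zscore = (series - mean.shift(1)) / std.shift(1)   # shift(1): past data only
alarms = series.index[zscore.abs() > threshold]
print("early-warning alarms at indices:", list(alarms))
```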
Prerequisites
- Knowledge of Python programming.
- Familiarity with supervised learning, sensitivity analysis, and time series.
- Skills in working with data (especially Elastic and pandas).
- Willingness to self-teach and strong problem-solving skills.
Supervisor:
Evaluating the Necessity of an Orchestration Tool in Kubernetes-Based CNF Deployments: A Design Science Approach
Kubernetes, Cloud Orchestration, 5G Core Network, Cloud-Native Network Functions
Description
In the ongoing digital transformation, telecommunications companies are shifting from Virtual Network Functions (VNFs) to Cloud-Native Network Functions (CNFs) to meet the demand for agile, scalable, and resilient services. Deutsche Telekom is at the forefront of this transition, moving its network services onto a self-hosted bare-metal cloud infrastructure using Kubernetes as the core platform for container orchestration.
Kubernetes, widely recognized for its robust orchestration capabilities, is the foundation of Deutsche Telekom's cloud-native strategy. However, as network services are usually complex software solutions, deploying and provisioning CNFs pose several orchestration challenges that may require additional tooling. Various tools on the market are designed to manage these orchestration complexities, but the necessity and efficiency of such tools in a Kubernetes-based environment remain an open question.
This thesis seeks to answer the following question: "Is an additional orchestration tool necessary for managing CNF deployments in Kubernetes, or can a custom Kubernetes operator effectively address these orchestration needs?". The purpose of this master's thesis is to evaluate whether a dedicated orchestration tool is needed when deploying and managing CNFs in a Kubernetes setup, where Kubernetes already acts as an orchestrator. This thesis will also explore the design and development of a Kubernetes operator as a potential alternative to using an external orchestration tool.
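To make the "custom Kubernetes operator" alternative concrete, below is a minimal sketch using the kopf framework and the official Python client. The CNFDeployment resource, its fields, and the example.com API group are hypothetical placeholders, not Deutsche Telekom's actual CRDs.

```python
# Hedged sketch of an operator that reconciles a hypothetical CNFDeployment
# custom resource into a plain Kubernetes Deployment.
import kopf
import kubernetes

@kopf.on.startup()
def configure(**_):
    kubernetes.config.load_kube_config()   # or load_incluster_config()

@kopf.on.create('example.com', 'v1', 'cnfdeployments')
def create_cnf(spec, name, namespace, logger, **_):
    # Reconcile: render the high-level CNF description into a Deployment.
    c = kubernetes.client
    body = c.V1Deployment(
        metadata=c.V1ObjectMeta(name=name),
        spec=c.V1DeploymentSpec(
            replicas=spec.get('replicas', 1),
            selector=c.V1LabelSelector(match_labels={'app': name}),
            template=c.V1PodTemplateSpec(
                metadata=c.V1ObjectMeta(labels={'app': name}),
                spec=c.V1PodSpec(containers=[
                    c.V1Container(name=name, image=spec['image'])]))))
    c.AppsV1Api().create_namespaced_deployment(namespace=namespace, body=body)
    logger.info("created Deployment for CNF %s", name)
# run with: kopf run operator.py
```

The thesis would then weigh such an operator (update/delete handling, rollback, dependency ordering) against what a dedicated orchestration tool provides out of the box.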
For more details, please check the PDF with the thesis description.
Prerequisites
We’re looking for motivated and technically skilled individuals to undertake a challenging and rewarding thesis project. To ensure success, the following prerequisites are essential:
- Strong Technical Acumen: A solid understanding of technical concepts and the ability to quickly adapt to and adopt new technologies.
- Programming Expertise: Proficiency in programming, ideally with experience in Go.
- Containerization Knowledge: Familiarity with container technologies for software deployment (e.g., Docker).
- (Kubernetes Experience): Prior exposure to Kubernetes is a plus but not mandatory.
Contact
- Dr. Patrick Derckx (patrick.derckx@telekom.de)
- Razvan-Mihai Ursu (razvan.ursu@tum.de)
Supervisor:
Advancing Kubernetes Simulations: Modeling Multi-Tier Services with Shadow
Kubernetes, software-in-the-loop, simulations
Description
Shadow [1] is a discrete-event network simulator that directly executes real application code by co-opting native Linux processes into a high-performance network simulation. It achieves this by intercepting system calls and emulating necessary functionalities, allowing applications to operate within a simulated network environment without modification. While initially developed to model large-scale Tor networks, Shadow can also be adapted to simulate other complex systems.
The primary goal of this master’s thesis is to explore the feasibility and methodology of simulating multi-tier Kubernetes-based cloud deployments using the Shadow simulator. This involves setting up and extending Shadow to accurately represent the components and operations of a Kubernetes cluster and evaluating the performance and accuracy of this simulation approach.
[1] Jansen, R., et al. (2022). Co-opting Linux Processes for High-Performance Network Simulation. 2022 USENIX Annual Technical Conference (USENIX ATC ’22). USENIX Association. Retrieved from (https://www.usenix.org/system/files/atc22-jansen.pdf)
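As a rough illustration of where the thesis could start, the sketch below generates a Shadow host configuration in which a toy "API server" and several "node" processes run as native Linux programs inside the simulation. It assumes Shadow 2.x's YAML schema and its built-in 1-gigabit switch topology; the script names are placeholders.

```python
# Hedged sketch: emit a Shadow config for a toy multi-host Kubernetes-like
# control-plane experiment. Paths, scripts, and timings are illustrative.
import yaml

hosts = {"apiserver": {"network_node_id": 0,
                       "processes": [{"path": "/usr/bin/python3",
                                      "args": "apiserver.py",
                                      "start_time": "1s"}]}}
for i in range(3):
    hosts[f"node{i}"] = {"network_node_id": 0,
                         "processes": [{"path": "/usr/bin/python3",
                                        "args": f"kubelet_sim.py node{i}",
                                        "start_time": "2s"}]}

config = {"general": {"stop_time": "10 min"},
          "network": {"graph": {"type": "1_gbit_switch"}},
          "hosts": hosts}
print(yaml.safe_dump(config, sort_keys=False))  # feed to: shadow config.yaml
```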
Prerequisites
- Strong background in computer networks and distributed systems.
- Proficiency in Linux systems and experience with simulation/emulation tools.
- Familiarity with Kubernetes architecture and operations.
- Programming skills in languages such as C, Python, and Rust.
Contact
- Razvan-Mihai Ursu (razvan.ursu@tum.de)
Supervisor:
Automated Configuration of Complex Networks Using AI-Driven Intent-Based Networking
Networks, Artificial Intelligence, Intent-Based Networking, Large Language Models
Description
In today’s business landscape, the demand for highly available, secure, and scalable networks is continuously increasing, particularly for large enterprises.
Conventional network management faces challenges such as complexity, with manual configurations being error-prone and time-consuming. It also struggles with scalability issues due to slow adaptation to changing needs and limited automation, which requires deep expertise. Modern solutions like SDN, NFV, and AI-driven automation address these problems by enabling dynamic, scalable, and policy-driven network management.
The traditional network management approach relies on manual implementation, requiring expertise in routing, Quality of Service (QoS), and encryption mechanisms. This results in high operational costs and makes the network prone to misconfigurations. Intent-Based Network Configuration Management is a modern approach to managing and automating networks, where the operator defines "what they want the network to do" (the intent) rather than specifying "how to configure the network" (manual steps). The system interprets these high-level intents and automates the necessary configurations and adjustments to achieve the desired outcome.
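To illustrate the pipeline, here is a deliberately simplistic sketch: a natural-language intent is mapped to a structured specification (the step an AI/LLM component would perform) and then compiled into device-level configuration. All names, fields, and the rendered pseudo-CLI lines are illustrative assumptions.

```python
# Hedged sketch of intent-based configuration. interpret() stands in for
# the AI component; render() stands in for the automation layer.
INTENT = ("Give video conferencing traffic priority and guarantee "
          "at least 50 Mbps between site A and site B")

def interpret(intent: str) -> dict:
    # a real system would parse the sentence; here the result is hard-coded
    return {"traffic_class": "video-conf", "min_bandwidth_mbps": 50,
            "endpoints": ("site-a", "site-b")}

def render(spec: dict) -> list[str]:
    # illustrative, vendor-agnostic pseudo-CLI lines
    a, b = spec["endpoints"]
    return [f"qos class {spec['traffic_class']} priority high",
            f"qos class {spec['traffic_class']} min-rate "
            f"{spec['min_bandwidth_mbps']}mbps",
            f"apply qos-policy on path {a} -> {b}"]

for line in render(interpret(INTENT)):
    print(line)
```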
Prerequisites
• Knowledge in Network Automation and Network Orchestration
• AI and Machine Learning Fundamentals
• Proficiency in programming and scripting, with a strong focus on Python and knowledge in libraries such as TensorFlow and PyTorch
• High level of self-motivation, independence, and problem-solving capability
Contact
kaan.aykurt@tum.de
philip.ulrich@telekom.de
klaus.scheepers@telekom.de
Supervisor:
Mobile Communication RRC Message Security Analysis
5G, SDR, Security, RAN
Description
This topic concerns the analysis of RRC messages in 4G and 5G. There exist several different kinds of these messages, with different functions and levels of information content. The focus should lie on messages related to connection release. The analysis should consider privacy and security aspects. After the theoretical review and analysis, the practical part should focus on an attack: an implementation of one security and privacy aspect should be done as a proof of concept with open-source hardware and software.
The following things are requested to be designed, implemented, and evaluated (most likely via proof-of-concept) in this thesis:
• Security and availability analysis of specific RRC messages
• Implementation of an attack
• Practical evaluation with testing of commercial smartphones
We will offer you:
• Initial literature
- https://doi.org/10.14722/NDSS.2016.23236
• Smart working environment
• Close contact with supervisors and a lot of discussion and knowledge exchange
A detailed description of the topic will be formulated with you in initial meetings. The report must be written according to the university's requirements, and a detailed documentation as well as a handover of the complete project with all sources is expected. Depending on the chosen thesis type, the complexity of the content will be adapted.
All applications must be submitted through our application website INTERAMT:
https://interamt.de/koop/app/stelle?0&id=1242375
Carefully note the information provided on the site to avoid any issues with your application.
Please include
• a short CV
• current overview of your grades
• the keyword "T3-MK-RRC" as comment
in your application.
For any questions or further details regarding this thesis and the application process, please don't hesitate to contact:
• TUM contact: nicolai.kroeger@tum.de, serkut.ayvasik@tum.de
• Forschungreferat T3 (ZITiS), Email: t3@zitis.bund.de
Prerequisites
Knowledge in the following fields is required:
• C/C++
Knowledge in the following fields would be an advantage:
• Mobile Communication 4G, 5G
Contact
• TUM contact: nicolai.kroeger@tum.de, serkut.ayvasik@tum.de
• Forschungreferat T3 (ZITiS), Email: t3@zitis.bund.de
Supervisor:
Latency and Reliability Guarantees in Multi-domain Networks
Multi-domain networks
Description
One of the aspects not covered by 5G networks is multi-domain networks, comprising one or more campus networks. These are private networks, including the Radio Access Network and Core Network, that are owned not by the cellular operators but by entities such as universities or hospitals. There will be scenarios in which the transmitter is within a different campus network from the receiver, and the data would have to traverse networks operated by different entities.
Given the different operators managing the “transmitter” and “receiver” networks, providing any end-to-end performance guarantees in terms of latency and reliability can pose significant challenges in multi-domain networks. For example, if there is a maximum latency that a packet can tolerate in the communication cycle between the transmitter and receiver, the former experiencing given channel conditions would require a given amount of RAN resources to meet that latency. The receiver, on the other end of the communication path, will most probably experience different channel conditions. Therefore, it will require a different amount of resources to satisfy the end-to-end latency requirement. Finding an optimal resource allocation approach across different networks that would lead to latency and reliability guarantees in a multi-domain network will be the topic of this thesis.
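As a taste of the queueing-theoretic angle, the sketch below splits an end-to-end latency budget between two domains, each modelled as an M/M/1 queue whose mean sojourn time is 1/(mu - lam), and computes the service rate each domain must then provision. Arrival rates and the budget are illustrative assumptions.

```python
# Hedged sketch: size per-domain service rates for a shared latency budget.
lam = 800.0        # packets/s offered to each domain (illustrative)
budget_s = 0.010   # end-to-end mean-latency budget: 10 ms

for split in (0.3, 0.5, 0.7):             # share of budget given to domain A
    mu_a = lam + 1.0 / (split * budget_s)          # from 1/(mu - lam) = d_A
    mu_b = lam + 1.0 / ((1.0 - split) * budget_s)
    print(f"split {split:.1f}: mu_A={mu_a:7.1f}/s, mu_B={mu_b:7.1f}/s, "
          f"total={mu_a + mu_b:8.1f}/s")
# an optimal allocation minimizes total provisioned resources under the
# budget; reliability targets add tail (not just mean) delay constraints
```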
Prerequisites
The approach used to solve these problems will rely on queueing theory. A good knowledge of any programming language is required.
Supervisor:
Decentralized Federated Learning on Constrained IoT Devices
Description
The Internet of Things (IoT) is an increasingly prominent aspect of our daily lives, with connected devices offering unprecedented convenience and efficiency. As we move towards a more interconnected world, ensuring the privacy and security of data generated by these devices is paramount. That is where decentralized federated learning comes in.
Federated Learning (FL) is a machine-learning paradigm that enables multiple parties to collaboratively train a model without sharing their data directly. This thesis focuses on taking FL one step further by removing the need for a central server, allowing IoT devices to directly collaborate in a peer-to-peer manner.
In this project, you will explore and develop decentralized federated learning frameworks specifically tailored for constrained IoT devices with limited computational power, memory, and energy resources. The aim is to design and implement efficient algorithms that can harness the collective power of these devices while ensuring data privacy and device autonomy. This involves tackling challenges related to resource-constrained environments, heterogeneous device capabilities, and maintaining security and privacy guarantees.
The project offers a unique opportunity to contribute to cutting-edge research with real-world impact. Successful outcomes will enable secure and private machine learning on IoT devices, fostering new applications in areas such as smart homes, industrial automation, and wearable health monitoring.
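As a minimal illustration of the peer-to-peer idea (not the framework to be built), the sketch below performs gossip averaging of toy model vectors over a ring topology with no central server; the "local training" step is a random stand-in for SGD on private data.

```python
# Hedged sketch: decentralized FL round = local update + neighbor averaging.
import numpy as np

rng = np.random.default_rng(0)
n_devices, dim = 5, 10
weights = [rng.normal(size=dim) for _ in range(n_devices)]
neighbors = {i: [(i - 1) % n_devices, (i + 1) % n_devices]   # ring topology
             for i in range(n_devices)}

for rnd in range(20):
    # 1) local update (stand-in for SGD on each device's private data)
    weights = [w - 0.1 * rng.normal(scale=0.01, size=dim) for w in weights]
    # 2) peer-to-peer gossip averaging: no central server involved
    new = []
    for i in range(n_devices):
        group = [weights[i]] + [weights[j] for j in neighbors[i]]
        new.append(np.mean(group, axis=0))
    weights = new

spread = max(np.linalg.norm(w - weights[0]) for w in weights)
print(f"disagreement between devices after gossip: {spread:.4f}")
```

On real constrained devices, the interesting questions start here: communication cost per round, quantized exchanges, dropouts, and heterogeneous compute.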
Responsibilities:
- Literature review on decentralized federated learning, especially in relation to IoT and decentralized systems.
- Design and development of decentralized FL frameworks suitable for constrained IoT devices.
- Implementation and evaluation of the proposed framework using real-world datasets and testbeds.
- Analysis of security and privacy aspects, along with resource utilization.
- Documentation and presentation of findings in a thesis report, possibly leading to publications in top venues.
Requirements:
- Enrollment in a Master's program in Computer Engineering, Computer Science, Electrical Engineering or related fields
- Solid understanding of machine learning algorithms and frameworks (e.g., TensorFlow, PyTorch)
- Proficiency in C and Python programming language
- Experience with IoT devices and embedded systems development
- Excellent analytical skills and a systematic problem-solving approach
Nice to Have:
- Knowledge of cybersecurity and privacy principles
- Familiarity with blockchain or other decentralized technologies
- Interest in distributed computing and edge computing paradigms
Contact
Email: navid.asadi@tum.de
Supervisor:
Attacks on Cloud Autoscaling Mechanisms
Cloud Computing, Kubernetes, autoscaling, low and slow attacks, Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), cloud security, container orchestration
Description
In the era of cloud-native computing, Kubernetes has emerged as a leading container orchestration platform, enabling seamless scalability and reliability for modern applications.
However, with its widespread adoption comes a new frontier in cybersecurity challenges, particularly low and slow attacks that exploit autoscaling features to disrupt services subtly yet effectively.
This project aims to delve into the intricacies of these attacks, examining their impact on Kubernetes' Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA), and proposing mitigation strategies for more resilient systems.
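To see why such attacks are feasible, recall that the HPA scales with desired = ceil(current * metric / target). The sketch below simulates that rule under a slowly ramping load; pod capacity and the ramp factor are illustrative assumptions.

```python
# Hedged sketch: an attacker ramping load slowly keeps per-pod utilization
# just above target, forcing steady, costly replica growth without the
# bursts that rate limiters or anomaly detectors would catch.
import math

target_utilization = 0.60
replicas, capacity_per_pod = 2, 100.0    # illustrative

def hpa_step(replicas: int, offered_load: float) -> int:
    current = offered_load / (replicas * capacity_per_pod)  # per-pod util.
    desired = math.ceil(replicas * current / target_utilization)
    return max(1, desired)

load = 130.0
for minute in range(10):
    load *= 1.08          # low-and-slow ramp
    replicas = hpa_step(replicas, load)
    print(f"t={minute:2d} min load={load:6.1f} -> replicas={replicas}")
```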
Responsibilities:
- Conduct a thorough literature review to identify existing knowledge gaps and research on similar attacks.
- Develop methodologies to simulate low and slow attack scenarios on Kubernetes clusters with varying configurations of autoscaling mechanisms.
- Analyze the impact of these attacks on resource utilization, service availability, and overall system performance.
- Evaluate current defense mechanisms and propose novel strategies to enhance the resilience of Kubernetes' autoscaling features.
- Implement and test selected mitigation approaches in a controlled environment.
- Document findings, present a comparative analysis of effectiveness, and discuss implications for future development in cloud security practices.
Requirements:
- A strong background in computer engineering, computer science or a related field.
- Familiarity with Kubernetes architecture and container orchestration concepts.
- Experience in deploying and managing applications on Kubernetes clusters.
- Proficiency in at least one scripting/programming language (e.g., Python, Go).
- Understanding of cloud computing and cybersecurity fundamentals.
Nice to Have:
- Prior research or hands-on experience in cloud security, particularly in the context of Kubernetes.
- Knowledge of network protocols and low-level system interactions.
- Experience with DevOps tools and practices.
Contact
Email: navid.asadi@tum.de
Supervisor:
Working Student/Research Internship - On-Device Training on Microcontrollers
Description
We are seeking a highly motivated and skilled student to replicate a research paper that explores the application of pruning techniques for on-device training on microcontrollers. The original paper demonstrated the feasibility of deploying deep neural networks on resource-constrained devices, and achieved significant reductions in model size and computational requirements while maintaining acceptable accuracy.
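For orientation, a minimal sketch of the core technique to replicate is shown below: global magnitude pruning, i.e., zeroing the smallest-magnitude weights of a layer. The toy layer and sparsity level are illustrative assumptions, not the paper's setup.

```python
# Hedged sketch: magnitude pruning of one dense layer with numpy.
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero the `sparsity` fraction of weights with the smallest magnitude."""
    threshold = np.quantile(np.abs(weights), sparsity)
    mask = np.abs(weights) >= threshold
    return weights * mask

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)   # one toy dense layer
w_pruned = magnitude_prune(w, sparsity=0.8)
print(f"nonzero weights: {np.count_nonzero(w_pruned)} / {w.size}")
# on a microcontroller, the mask would be fixed and the surviving weights
# stored in a compact sparse format to realize the memory savings
```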
Responsibilities:
- Extend our existing framework by implementing the pruning techniques on a microcontroller-based platform (e.g., Arduino, ESP32)
- Replicate the experiments described in the original paper to validate the results
- Evaluate the performance of the pruned models on various benchmark datasets
- Compare the results with the original paper and identify areas for improvement
- Document the replication process, results, and findings in a clear and concise manner
Requirements:
- Strong programming skills in C and Python
- Experience with deep learning frameworks (e.g., TensorFlow, PyTorch) and microcontroller-based platforms
- Familiarity with pruning techniques for neural networks is a plus
- Excellent analytical and problem-solving skills
- Ability to work independently and manage time effectively
- Strong communication and documentation skills
Contact
Email: navid.asadi@tum.de
Supervisor:
Working Student - Machine Learning Serving on Kubernetes
Machine Learning, Kubernetes, Containerization, Docker, Orchestration, Cloud Computing, MLOps, Machine Learning Operations, DevOps, Microservices Architecture
Description
We are seeking an ambitious and forward-thinking working student to join our dynamic team working at the intersection of Machine Learning (ML) and Kubernetes. In this exciting role, you will be immersed in a cutting-edge environment where advanced ML models meet the power of container orchestration through Kubernetes. Your contributions will directly impact the development and optimization of scalable and robust ML serving systems leveraging the benefits of Kubernetes.
If you are a student passionate about both Machine Learning and Kubernetes, we invite you to join us on this exciting journey! We offer the chance to pioneer cutting-edge solutions that leverage the power of these two transformative technologies.
Responsibilities:
- Collaborate with a cross-functional team to design and implement ML workflows on Kubernetes.
- Assist in packaging and deploying ML models as microservices using containers (Docker) and managing them effectively through Kubernetes.
- Optimize resource allocation, scheduling, and scaling strategies for efficient model serving at varying workloads.
- Implement monitoring solutions specific to ML inference tasks within the Kubernetes cluster.
- Troubleshoot and debug issues related to containerized ML applications
- Document best practices, tutorials, and guides on leveraging Kubernetes for ML serving
Requirements:
- Currently enrolled in a Bachelor's or Master's program in the School of CIT
- Strong programming skills in Python with experience in software development lifecycle methodologies.
- Familiarity with machine learning frameworks such as TensorFlow and PyTorch.
- Proficiency in container technologies. Docker and Kubernetes certification would be a plus but not mandatory.
- Experience with cloud computing platforms; e.g., AWS, GCP or Azure.
- Demonstrated ability to work independently with effective time management and strong problem-solving analytical skills.
- Excellent communication and teamwork capabilities.
Nice to Have:
- Kubernetes Certification: Having a valid Kubernetes certification (CKA, CKAD, or CKS) demonstrates your expertise in container orchestration and can be a significant advantage.
- Experience with DevOps and/or MLOps Tools: Familiarity with MLOps tools such as MLflow, Kubeflow, or TensorFlow Extended (TFX) can help you streamline the machine learning workflow and improve collaboration. Experience with OpenTelemetry, Jaeger, Istio, and monitoring tools is a plus.
- Knowledge of Distributed Systems: Understanding distributed systems architecture and design patterns can help you optimize the performance and scalability of your machine learning models.
- Contributions to Open-Source Projects: Having contributed to open-source projects related to Kubernetes, machine learning, or MLOps demonstrates your ability to collaborate with others and adapt to new technologies.
- Familiarity with Agile Methodologies: Knowledge of agile development methodologies such as Scrum or Kanban can help you work efficiently in a fast-paced environment and deliver results quickly.
- Cloud-Native Application Development: Experience with cloud-native application development using frameworks like Cloud Foundry or AWS Cloud Development Kit (CDK) can be beneficial in designing scalable and efficient machine learning workflows.
Contact
Email: navid.asadi@tum.de
Supervisor:
Working Student for the Edge AI Testbed
IoT, Edge Computing, Machine Learning, Measurement, Power Characterization
Description
We are seeking a highly motivated and enthusiastic Working Student to join our team as part of the Edge AI Testbed project. As a key member of our research team, you will contribute to the development and testing of cutting-edge Artificial Intelligence (AI) systems at the edge of the network. You will work closely with our researchers and engineers to design, implement, and evaluate innovative AI solutions that can operate efficiently on resource-constrained edge devices.
Responsibilities:
- Assist in designing and implementing AI models for edge computing
- Develop and test software components for the Edge AI Testbed
- Collaborate with team members to integrate AI models with edge hardware platforms
- Participate in performance optimization and evaluation of AI systems on edge devices
- Contribute to the development of tools and scripts for automated testing and deployment
- Document and report on project progress, results, and findings
If you are a motivated and talented student looking to gain hands-on experience in Edge AI, we encourage you to apply for this exciting opportunity!
Requirements:
- Currently enrolled in a Bachelor's or Master's program in the School of CIT
- Strong programming skills in languages such as Python and C++
- Experience with AI frameworks such as TensorFlow, PyTorch, or Keras
- Familiarity with edge computing platforms and devices (e.g., Raspberry Pi, NVIDIA Jetson)
- Basic knowledge of Linux operating systems and shell scripting
- Excellent problem-solving skills and ability to work independently
- Strong communication and teamwork skills
Nice to Have:
- Experience with containerization using Docker
- Familiarity with cloud computing platforms (e.g., Kubernetes)
- Experience with Apache Ray
- Knowledge of computer vision or natural language processing
- Participation in open-source projects or personal projects related to AI and edge computing
Contact
Email: navid.asadi@tum.de
Supervisor:
An AI Benchmarking Suite for Microservices-Based Applications
Kubernetes, Deep Learning, Video Analytics, Microservices
Description
In the realm of AI applications, the deployment strategy significantly impacts performance metrics.
This research internship aims to investigate and benchmark AI applications in two predominant deployment configurations: monolithic and microservices-based, specifically within Kubernetes environments.
The central question revolves around understanding how these deployment strategies affect various performance metrics and determining the more efficient configuration. This inquiry is crucial as the deployment strategy plays a pivotal role in the operational efficiency of AI applications.
Currently, the field lacks a comprehensive benchmarking suite that evaluates AI applications from an end-to-end deployment perspective. Our approach includes the development of a benchmarking suite tailored for microservice-based AI applications.
This suite will capture metrics such as CPU/GPU/Memory utilization, interservice communication, end-to-end and per-service latency, and cache misses.
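As an illustrative fragment of what such a suite could record, the sketch below measures end-to-end and per-service latency with nested timing spans; the service names and sleep calls merely stand in for real microservices.

```python
# Hedged sketch: nested latency spans for a two-stage video pipeline.
import time
from contextlib import contextmanager

timings: dict[str, list[float]] = {}

@contextmanager
def span(name: str):
    start = time.perf_counter()
    try:
        yield
    finally:
        timings.setdefault(name, []).append(time.perf_counter() - start)

with span("end_to_end"):
    with span("decode"):
        time.sleep(0.005)        # stand-in for a video-decoding microservice
    with span("inference"):
        time.sleep(0.020)        # stand-in for the model-serving microservice

for name, vals in timings.items():
    print(f"{name}: {1000 * sum(vals) / len(vals):.1f} ms (n={len(vals)})")
```

In the actual suite, these spans would be complemented by CPU/GPU/memory counters and inter-service traffic measurements collected from the cluster.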
Requirements:
- Familiarity with Kubernetes
- Familiarity with Deep Learning frameworks (e.g., PyTorch or TensorFlow)
- Basics of computer networking
Contact
Email: navid.asadi@tum.de
Supervisor:
Performance Evaluation of Serverless Frameworks
Serverless, Function as a Service, Machine Learning, Distributed ML
Description
Serverless computing is a cloud computing paradigm that separates infrastructure management from software development and deployment. It offers advantages such as low development overhead, fine-grained unmanaged autoscaling, and reduced customer billing. From the cloud provider's perspective, serverless reduces operational costs through multi-tenant resource multiplexing and infrastructure heterogeneity.
However, the serverless paradigm also comes with its challenges. First, a systematic methodology is needed to assess the performance of heterogeneous open-source serverless solutions; to our knowledge, existing surveys lack a thorough comparison of these frameworks. Second, there are inherent challenges associated with the serverless architecture, specifically due to its short-lived and stateless nature.
Requirements:
- Familiarity with Kubernetes
- Basics of computer networking
Contact
Email: navid.asadi@tum.de
Supervisor:
Investigation of Flexibility vs. Sustainability Tradeoffs in 6G
Description
5G networks brought significant performance improvements for different service types like augmented reality, virtual reality, online gaming, live video streaming, robotic surgeries, etc., by providing higher throughput, lower latency, and higher reliability, as well as the possibility to successfully serve a large number of users. However, these improvements do not come without costs. The main consequence of satisfying the stringent traffic requirements of the aforementioned applications is excessive energy consumption.
Therefore, making the cellular networks sustainable, i.e., constraining their power consumption, is of utmost importance in the next generation of cellular networks, i.e., 6G. This goal is of interest mostly to cellular network operators. Of course, while achieving network sustainability, the satisfaction of all traffic requirements, which is of interest to cellular users, must be ensured at all times. While these are opposing goals, a certain balance has to be achieved.
In this thesis, the focus is on the type of services known as eMBB (enhanced mobile broadband). These are services that are latency-tolerant to a certain extent, but sensitive to the throughput and its stability. Live video streaming is a use case falling into this category. For these applications, on the one side, higher data rates imply higher energy consumption. On the other side, the users can be satisfied with a slightly lower throughput as long as the provided data rate is constant, which corresponds to the flexibility that the network operator can exploit. Hence, the question that needs to be answered in this thesis is: what is the optimal trade-off between the data rate and the energy consumption in a cellular network with eMBB users? To answer this question, the entire communication process will be considered, i.e., from the transmitting user through the base station and core network to the receiving end. The student will need to formulate an optimization problem, which they will then solve with exact optimization solvers, but also by proposing simpler algorithms (heuristics) that reduce the solution time while not considerably deteriorating the system performance.
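In hedged form, a starting-point formulation could look as follows; all symbols (the user set, the rate-to-power map, the resource-share functions) are assumptions to be refined during the thesis.

```latex
\begin{align*}
\min_{r_u \ge 0}\quad & \sum_{u \in \mathcal{U}} P_u(r_u)
  && \text{total power to serve all eMBB users}\\
\text{s.t.}\quad & r_u \ge r_u^{\min} && \forall u \in \mathcal{U}
  \quad \text{(lowest constant rate users accept)}\\
& \sum_{u \in \mathcal{U}} \rho_u(r_u) \le 1
  && \text{(shared RAN resource budget, e.g., PRBs)}
\end{align*}
```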
Prerequisites
- Good knowledge of any programming language
- Good mathematical and analytical thinking skills
- High level of self-engagement and motivation
Contact
valentin.haider@tum.de
fidan.mehmeti@tum.de
Supervisor:
Intel's IPU: Starting from the beginning
Description
Intel develops network devices consisting of an FPGA and a general-purpose processor: the so-called IPUs (Infrastructure Processing Units). The goal of this thesis/position is to get such an IPU (Intel IPU F2000X) up and running and to evaluate its potential. The aim is to program a custom IPU application and evaluate metrics such as latency, throughput, and many more under varying circumstances.
Prerequisites
- Basic Knowledge Linux Terminal
- Basic Knowledge C/C++
- Basic Knowledge of and about FPGAs
Supervisor:
DPU as Measurement Cards and Load Generators
Description
Data centers experience ever higher and more demanding network loads and traffic. Companies like Nvidia have developed special networking hardware to fulfill these demands (the Nvidia BlueField line-up). These cards promise high throughput and high precision. The features required to achieve this also make BlueField cards candidates for use as measurement cards or load generators.
The goal of this thesis/position is to evaluate the performance and feasibility of this approach.
For more information, please contact me directly (philip.diederich@tum.de)
Prerequisites
- Basic Knowledge Linux Terminal
- Basic Knowledge Python
- Basic Knowledge C/C++
Supervisor:
Advancing Real-time Network Simulations to Real World Behaviour
Description
Testing real-time applications and networks is very timing-sensitive, and it is hard to achieve the required precision and accuracy in the real world. However, the real world itself also behaves differently than simulations. Our simulator behaves as the theory dictates and allows us to obtain these precise timings, but it needs to be tested and extended to behave more like a real network would.
Requirements
- Knowledge of NS-3
- Knowledge of Python
- Knowledge of C/C++
Please contact me for more information (philip.diederich@tum.de)
Supervisor:
Working Student - Real-Time Network Controller for Research
Description
Chameleon is a real-time network controller that guarantees packet latencies for admitted flows. However, Chameleon is designed to work in high-performance environments. For research and development, a different approach that offers more debugging and extension capabilities would suit us better.
Goals:
- Create Real-time Network Controller
- Controller needs to be easy to debug
- Controller needs to be easy to extend
- Controller needs to have good logging and tracing
Requirements:
- Advanced Knowledge of C/C++
- Advanced Knowledge of Python
Please contact me for more information (philip.diederich@tum.de)
Amaury Van Bemten, Nemanja Đerić, Amir Varasteh, Stefan Schmid, Carmen Mas-Machuca, Andreas Blenk, and Wolfgang Kellerer. 2020. Chameleon: Predictable Latency and High Utilization with Queue-Aware and Adaptive Source Routing. In The 16th International Conference on emerging Networking EXperiments and Technologies (CoNEXT ’20), December 1–4, 2020, Barcelona, Spain. ACM, New York, NY, USA, 15 pages. https://doi.org/10.1145/3386367.3432879
Supervisor:
Controlling Stochastic Network Flows for Real-time Networking
Description
Any data that is sent in a real-time network is monitored and accounted for. This allows us, with the help of mathematical frameworks, to calculate upper bounds on the latency of a flow. These frameworks and controllers often consider hard real-time guarantees, meaning that every packet arrives in time, every time. With soft real-time guarantees, this is not the case: here, we are allowed some leeway.
In this thesis, we want to explore how we can model and admit network flows that have a stochastic nature.
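As a minimal illustration of that leeway, consider admission under a simple stochastic model: for an M/M/1 queue the sojourn time T is exponential with P(T > d) = exp(-(mu - lam) d), so a flow can be admitted whenever the deadline-violation probability stays within the allowed epsilon. The model choice and all numbers below are illustrative assumptions.

```python
# Hedged sketch: soft real-time admission test under an M/M/1 model.
import math

def admit(lam: float, mu: float, deadline_s: float, epsilon: float) -> bool:
    if lam >= mu:
        return False                       # unstable queue: no bound exists
    return math.exp(-(mu - lam) * deadline_s) <= epsilon

print(admit(lam=700, mu=1000, deadline_s=0.02, epsilon=1e-2))  # admitted
print(admit(lam=950, mu=1000, deadline_s=0.02, epsilon=1e-2))  # rejected
```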
Please contact me for more information (philip.diederich@tum.de)!
Supervisor:
Working Student: Framework for Testing Realtime Networks
Description
Testing a Network Controller, custom real-time protocols, or verifying simulations with emulations requires a lot of computing effort. This is why we are developing a framework that helps you run parallel networking experiments. This framework also increases the reproducibility of any networking experiment.
The main task of this position is to help develop the general-purpose framework for executing parallel networking experiments.
Tasks:
- Continue developing the Framework for multi-server / multi-app usage
- Extend Web Capabilities of the Framework
- Automate Starting and Stopping
- Ease-of-use Improvements
- Test the functionality
Requirements:
- Knowledge of Python
- Basic Knowledge of Web-App Development (FastAPI, React, etc.)
- Basic Knowledge of System Architecture Development
Feel free to contact me per mail (philip.diederich@tum.de)
Supervisor:
Working Student Infrastructure Service Management
Description
We are seeking a highly motivated and detail-oriented Working Student to join our data center team. As a Working Student, you will assist in the daily operations of our data center, gaining hands-on experience in a fast-paced and dynamic environment.
Responsibilities:
Assist with regular data center tasks, such as:
- Rack and Stack equipment
- Cable Management and organization
- Perform basic troubleshooting and maintenance tasks
- Assist with inventory management
- Monitor data center systems and report any discrepancies or issues
- Create the basis for our Data Center Infrastructure Management
- Develop and maintain documentation of data center procedures and policies
- Perform other duties as required to support the data center operations
Requirements
- Availability to work 8 - 10 hours per week with flexible scheduling to accommodate academic commitments
- Basic knowledge of computer systems, networks, and data center operations
- Basic knowledge in Python
Supervisor:
Student Assistant for Wireless Sensor Networks Lab Summer Semester 2025
Description
The Wireless Sensor Networks lab offers the opportunity to develop software solutions for wireless sensor networking systems, targeting innovative applications. For the next semester, a position is available to assist the participants in learning the programming environment and during the project development phase. The lab is planned to be held on-site every Tuesday from 15:00 to 17:00.
Prerequisites
- Solid knowledge in Wireless Communication: PHY, MAC, and network layers.
- Solid programming skills: C/C++.
- Linux knowledge.
- Experience with embedded systems and microcontroller programming knowledge is preferable.
Contact
yash.deshpande@tum.de
alexander.wietfeld@tum.de
Supervisor:
Distributed Deep Learning for Video Analytics
Distributed Deep Learning, Distributed Computing, Video Analytics, Edge Computing, Edge AI
Description
In recent years, deep learning-based algorithms have demonstrated superior accuracy in video analysis tasks, and scaling up such models, i.e., designing and training larger models with more parameters, can improve their accuracy even more.
On the other hand, due to strict latency requirements as well as privacy concerns, there is a tendency towards deploying video analysis tasks close to data sources, i.e., at the edge. However, compared to dedicated cloud infrastructures, edge devices (e.g., smartphones and IoT devices) as well as edge clouds are constrained in terms of compute, memory, and storage resources, which consequently leads to a trade-off between response time and accuracy.
Considering video analysis tasks such as image classification and object detection as the application at the heart of this project, the goal is to evaluate different deep learning model distribution techniques for a scenario of interest.
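One family of techniques to evaluate is splitting a model at a layer boundary between a constrained device and an edge server. The toy CNN and split point below are illustrative assumptions, and the network transport between the two halves is elided.

```python
# Hedged sketch: layer-wise model splitting for distributed inference.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(), nn.Linear(32 * 56 * 56, 10),
)
head, tail = model[:3], model[3:]          # split point after the first block

frame = torch.randn(1, 3, 224, 224)        # stand-in for a camera frame
with torch.no_grad():
    intermediate = head(frame)             # runs on the constrained device
    # ... intermediate would be serialized and sent over the network ...
    logits = tail(intermediate)            # runs on the edge server
print(intermediate.shape, logits.shape)
```

The evaluation would then compare such splits against alternatives (e.g., replicating small models or partitioning along channels) in terms of latency, traffic, and accuracy.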
Contact
Email: navid.asadi@tum.de
Supervisor:
Edge AI in Adversarial Environment: A Simplistic Byzantine Scenario
Distributed Deep Learning, Distributed Computing, Byzantine Attack, Adversarial Inference
Description
This project considers an environment consisting of several low-performance machines connected across a network.
Edge AI has drawn the attention of both academia and industry as a way to bring intelligence to edge devices to enhance data privacy as well as latency.
Prior works investigated improving the accuracy-latency trade-off of Edge AI by distributing a model across multiple available and idle machines. Building on top of those works, this project adds one more dimension: a scenario where $f$ out of $n$ contributing nodes are adversarial.
Therefore, for each data sample, an adversary (1) may not provide an output (it can also be considered a faulty node) or (2) may provide an arbitrary (i.e., randomly generated) output.
The goal is to evaluate the robustness of different parallelism techniques in terms of achievable accuracy in the presence of malicious contributors and/or faulty nodes.
Note that, contrary to the mainstream of existing literature, this project mainly focuses on the inference (i.e., serving) phase of deep learning algorithms; although the robustness of the training phase can be considered as well, it has a much lower priority.
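A minimal sketch of the kind of experiment intended, under a strong simplification (honest nodes always output the true label, Byzantine nodes answer uniformly at random), measuring how majority-vote accuracy degrades with $f$:

```python
# Hedged sketch: majority voting over n node outputs with f Byzantine nodes.
import numpy as np

rng = np.random.default_rng(0)
n, n_classes, n_samples = 7, 10, 1000
true_labels = rng.integers(0, n_classes, n_samples)

def vote_accuracy(f: int) -> float:
    correct = 0
    for y in true_labels:
        honest = [int(y)] * (n - f)         # idealized: honest nodes are right
        byzantine = list(rng.integers(0, n_classes, f))   # arbitrary outputs
        majority = np.bincount(honest + byzantine,
                               minlength=n_classes).argmax()
        correct += int(majority == y)
    return correct / n_samples

for f in range(n):
    print(f"f={f}: accuracy {vote_accuracy(f):.3f}")
```

Real parallelism schemes split a single model across nodes, so the aggregation step and its failure modes are more subtle than this voting toy suggests; quantifying that is the point of the project.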
Contact
Email: navid.asadi@tum.de
Supervisor:
On the Efficiency of Deep Learning Parallelism Schemes
Distributed Deep Learning, Parallel Computing, Inference, AI Serving
Description
Deep Learning models are becoming increasingly larger so that most of the state-of-the-art model architectures are either too big to be deployed on a single machine or cause performance issues such as undesired delays.
This is not only true for the largest models being deployed in high performance cloud infrastructures but also for smaller and more efficient models that are designed to have fewer parameters (and hence, lower accuracy) to be deployed on edge devices.
That said, this project considers the latter environment, where multiple resource-constrained machines are connected through a network.
Continuing the research on distributing deep learning models across multiple machines, the objective is to generate more efficient variants/submodels compared to existing deep learning parallelism algorithms.
Note that this project mainly focuses on the inference (i.e., serving) phase of deep learning algorithms, and although efficiency of the training phase can be considered as well, it has a much lower priority.
Contact
Email: navid.asadi@tum.de
Supervisor:
Optimizing Distributed Deep Learning Inference
deep learning, distributed systems, parallel computing, model parallelism, communication overhead reduction, performance evaluation, edge devices
Description
The rapid growth in size and complexity of deep learning models has led to significant challenges in deploying these architectures across resource-constrained machines interconnected through a network. This research project focuses on optimizing the deployment of deep learning models at the edge, where limited computational resources and high-latency networks hinder performance. The main objective is to develop efficient distributed inference techniques that can overcome the limitations of edge devices, ensuring real-time processing and decision-making.
The successful candidate will work on addressing the following challenges:
- Employing model parallelism techniques to distribute workload across compute nodes while minimizing communication overhead associated with exchanging intermediate tensors between nodes.
- Reducing inter-operator blocking to improve overall system throughput.
- Developing efficient compression techniques tailored for deep learning data exchanges to minimize network latency (see the sketch after this list).
- Evaluating the performance of proposed modifications using standard deep learning benchmarks and real-world datasets.
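As one concrete instance of such a compression step, the sketch below applies uniform int8 quantization to an intermediate activation tensor before it crosses the network, with dequantization on the receiving node; tensor shape and data are illustrative assumptions.

```python
# Hedged sketch: int8 quantization of an intermediate tensor (4x smaller
# than float32) plus the reconstruction error this introduces.
import numpy as np

def quantize(t: np.ndarray):
    scale = max(float(np.abs(t).max()), 1e-8) / 127.0
    return (t / scale).round().astype(np.int8), scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
activation = rng.normal(size=(1, 64, 56, 56)).astype(np.float32)
q, scale = quantize(activation)
restored = dequantize(q, scale)
err = float(np.abs(activation - restored).max())
print(f"bytes: {activation.nbytes} -> {q.nbytes}, max abs error {err:.4f}")
```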
Responsibilities:
- Implement and evaluate various parallelism techniques, such as model parallelism and variant parallelism, from a communication efficiency perspective.
- Identify and implement mechanisms to minimize the exchange of intermediate tensors between compute nodes, potentially using advanced compression techniques tailored for deep learning data exchanges.
- Conduct comprehensive performance evaluations of proposed modifications using standard deep learning benchmarks and real-world datasets. Assess improvements in latency, resource efficiency, and overall system throughput compared to baseline configurations.
- Write technical reports and publications detailing the research findings.
Requirements:
- Pursuing a Master's degree in the School of CIT
- Strong background in deep learning, distributed systems, and parallel computing.
- Proficiency in Python and experience with deep learning frameworks (e.g., TensorFlow, PyTorch).
- Excellent problem-solving skills and the ability to work independently and collaboratively as part of a team.
- Strong communication and writing skills for technical reports and publications.
Contact
Email: navid.asadi@tum.de