Open Theses, Internship Opportunities, and Working Student Positions

Decentralized Federated Learning on Constrained IoT Devices

Description

The Internet of Things (IoT) is an increasingly prominent aspect of our daily lives, with connected devices offering unprecedented convenience and efficiency. As we move towards a more interconnected world, ensuring the privacy and security of data generated by these devices is paramount. That is where decentralized federated learning comes in.

Federated Learning (FL) is a machine-learning paradigm that enables multiple parties to collaboratively train a model without sharing their data directly. This thesis focuses on taking FL one step further by removing the need for a central server, allowing IoT devices to directly collaborate in a peer-to-peer manner.
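As a rough illustration of the server-free idea (a hedged sketch, not the project's prescribed design): each device keeps its own weights and periodically averages them with its neighbors, so no central coordinator is needed. The three-device topology and toy weight vectors below are illustrative assumptions.

```python
def gossip_round(models, neighbors):
    """Every device averages its weights with its neighbors' weights."""
    new_models = []
    for i, w in enumerate(models):
        group = [w] + [models[j] for j in neighbors[i]]
        new_models.append([sum(ws) / len(group) for ws in zip(*group)])
    return new_models

# Three devices, fully connected; repeated rounds drive every local model
# toward the global average without any central server.
models = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
for _ in range(10):
    models = gossip_round(models, neighbors)
```

On constrained IoT devices, the interesting questions start where this sketch ends: sparse topologies, unreliable links, and heterogeneous compute budgets.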

In this project, you will explore and develop decentralized federated learning frameworks specifically tailored for constrained IoT devices with limited computational power, memory, and energy resources. The aim is to design and implement efficient algorithms that can harness the collective power of these devices while ensuring data privacy and device autonomy. This involves tackling challenges related to resource-constrained environments, heterogeneous device capabilities, and maintaining security and privacy guarantees.

The project offers a unique opportunity to contribute to cutting-edge research with real-world impact. Successful outcomes will enable secure and private machine learning on IoT devices, fostering new applications in areas such as smart homes, industrial automation, and wearable health monitoring.

Responsibilities:

  • Literature review on decentralized federated learning, especially in relation to IoT and decentralized systems.
  • Design and development of decentralized FL frameworks suitable for constrained IoT devices.
  • Implementation and evaluation of the proposed framework using real-world datasets and testbeds.
  • Analysis of security and privacy aspects, along with resource utilization.
  • Documentation and presentation of findings in a thesis report, possibly leading to publications in top venues.

Requirements:

  • Enrollment in a Master's program in Computer Engineering, Computer Science, Electrical Engineering or related fields
  • Solid understanding of machine learning algorithms and frameworks (e.g., TensorFlow, PyTorch)
  • Proficiency in the C and Python programming languages
  • Experience with IoT devices and embedded systems development
  • Excellent analytical skills and a systematic problem-solving approach


Nice to Have:

  • Knowledge of cybersecurity and privacy principles
  • Familiarity with blockchain or other decentralized technologies
  • Interest in distributed computing and edge computing paradigms

Contact

Email: navid.asadi@tum.de

Supervisor:

Navidreza Asadi

Attacks on Cloud Autoscaling Mechanisms

Keywords:
Cloud Computing, Kubernetes, autoscaling, low and slow attacks, Horizontal Pod Autoscaler (HPA), Vertical Pod Autoscaler (VPA), cloud security, container orchestration

Description

In the era of cloud-native computing, Kubernetes has emerged as a leading container orchestration platform, enabling seamless scalability and reliability for modern applications.

However, with its widespread adoption comes a new frontier in cybersecurity challenges, particularly low and slow attacks that exploit autoscaling features to disrupt services subtly yet effectively.

This project aims to delve into the intricacies of these attacks, examining their impact on Kubernetes' Horizontal Pod Autoscaler (HPA) and Vertical Pod Autoscaler (VPA), and proposing mitigation strategies for more resilient systems.
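For context, the replica-count rule that Kubernetes documents for the HPA can be sketched as below. The attack intuition: an adversary who keeps the observed metric only slightly above the target drives steady, costly scale-out with no obvious traffic spike. All numbers are illustrative assumptions, not measurements.

```python
import math

def hpa_desired_replicas(current_replicas, current_metric, target_metric):
    """The documented HPA rule: ceil(current * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_metric / target_metric)

replicas = 2
for _ in range(5):  # five HPA evaluation cycles under a subtle overload
    observed_cpu = 1.1 * 0.5  # 10% above a 50% CPU-utilization target
    replicas = hpa_desired_replicas(replicas, observed_cpu, 0.5)
# replicas has grown from 2 to 7 without the load ever looking anomalous
```

Real HPA behavior adds tolerances and stabilization windows, which is exactly what makes the low-and-slow regime worth studying.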

Responsibilities:

  • Conduct a thorough literature review to identify existing knowledge gaps and research on similar attacks.
  • Develop methodologies to simulate low and slow attack scenarios on Kubernetes clusters with varying configurations of autoscaling mechanisms.
  • Analyze the impact of these attacks on resource utilization, service availability, and overall system performance.
  • Evaluate current defense mechanisms and propose novel strategies to enhance the resilience of Kubernetes' autoscaling features.
  • Implement and test selected mitigation approaches in a controlled environment.
  • Document findings, present a comparative analysis of effectiveness, and discuss implications for future development in cloud security practices.


Requirements:

  • A strong background in computer engineering, computer science or a related field.
  • Familiarity with Kubernetes architecture and container orchestration concepts.
  • Experience in deploying and managing applications on Kubernetes clusters.
  • Proficiency in at least one scripting/programming language (e.g., Python, Go).
  • Understanding of cloud computing and cybersecurity fundamentals.


Nice to Have:

  • Prior research or hands-on experience in cloud security, particularly in the context of Kubernetes.
  • Knowledge of network protocols and low-level system interactions.
  • Experience with DevOps tools and practices.

 

Contact

Email: navid.asadi@tum.de

Supervisor:

Navidreza Asadi

Working Student/Research Internship - On-Device Training on Microcontrollers

Description

We are seeking a highly motivated and skilled student to replicate a research paper that explores the application of pruning techniques for on-device training on microcontrollers. The original paper demonstrated the feasibility of deploying deep neural networks on resource-constrained devices and achieved significant reductions in model size and computational requirements while maintaining acceptable accuracy.
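As a toy illustration of the pruning family involved (magnitude pruning is an assumption here; the paper's exact criterion may differ), the idea is to zero out the weights with the smallest absolute values so the model needs less memory and compute on a microcontroller:

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest |w|."""
    k = int(len(weights) * sparsity)  # how many weights to zero
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

# Half of the weights are pruned; the large-magnitude ones survive.
pruned = magnitude_prune([0.9, -0.05, 0.4, -0.01, 0.7, 0.02], sparsity=0.5)
```

A real implementation would operate on framework tensors and exploit the resulting sparsity in storage and in the training kernels.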

Responsibilities:

  • Extend our existing framework by implementing the pruning techniques on a microcontroller-based platform (e.g., Arduino, ESP32)
  • Replicate the experiments described in the original paper to validate the results
  • Evaluate the performance of the pruned models on various benchmark datasets
  • Compare the results with the original paper and identify areas for improvement
  • Document the replication process, results, and findings in a clear and concise manner

Requirements:

  • Strong programming skills in C and Python
  • Experience with deep learning frameworks (e.g., TensorFlow, PyTorch) and microcontroller-based platforms
  • Familiarity with pruning techniques for neural networks is a plus
  • Excellent analytical and problem-solving skills
  • Ability to work independently and manage time effectively
  • Strong communication and documentation skills

 

Contact

Email: navid.asadi@tum.de

Supervisor:

Navidreza Asadi

Working Student - Machine Learning Serving on Kubernetes

Keywords:
Machine Learning, Kubernetes, Containerization, Docker, Orchestration, Cloud Computing, MLOps, Machine Learning Operations, DevOps, Microservices Architecture

Description

We are seeking an ambitious and forward-thinking working student to join our dynamic team working at the intersection of Machine Learning (ML) and Kubernetes. In this exciting role, you will be immersed in a cutting-edge environment where advanced ML models meet the power of container orchestration through Kubernetes. Your contributions will directly impact the development and optimization of scalable and robust ML serving systems leveraging the benefits of Kubernetes.

If you are a student passionate about both Machine Learning and Kubernetes, we invite you to join us on this exciting journey! We offer the chance to pioneer cutting-edge solutions that leverage the power of these two transformative technologies.

Responsibilities:

  • Collaborate with a cross-functional team to design and implement ML workflows on Kubernetes.
  • Assist in packaging and deploying ML models as microservices using containers (Docker) and managing them effectively through Kubernetes.
  • Optimize resource allocation, scheduling, and scaling strategies for efficient model serving at varying workloads.
  • Implement monitoring solutions specific to ML inference tasks within the Kubernetes cluster.
  • Troubleshoot and debug issues related to containerized ML applications.
  • Document best practices, tutorials, and guides on leveraging Kubernetes for ML serving.

Requirements:

  • Currently enrolled in a Bachelor's or Master's program in the School of CIT
  • Strong programming skills in Python with experience in software development lifecycle methodologies.
  • Familiarity with machine learning frameworks such as TensorFlow and PyTorch.
  • Proficiency in container technologies; a Docker or Kubernetes certification would be a plus but is not mandatory.
  • Experience with cloud computing platforms (e.g., AWS, GCP, or Azure).
  • Demonstrated ability to work independently with effective time management and strong problem-solving analytical skills.
  • Excellent communication and teamwork capabilities.

Nice to Have:

  • Kubernetes Certification: Having a valid Kubernetes certification (CKA, CKAD, or CKS) demonstrates your expertise in container orchestration and can be a significant advantage.
  • Experience with DevOps and/or MLOps Tools: Familiarity with MLOps tools such as MLflow, Kubeflow, or TensorFlow Extended (TFX) can help you streamline the machine learning workflow and improve collaboration. Experience with OpenTelemetry, Jaeger, Istio, and monitoring tools is a plus.
  • Knowledge of Distributed Systems: Understanding distributed systems architecture and design patterns can help you optimize the performance and scalability of your machine learning models.
  • Contributions to Open-Source Projects: Having contributed to open-source projects related to Kubernetes, machine learning, or MLOps demonstrates your ability to collaborate with others and adapt to new technologies.
  • Familiarity with Agile Methodologies: Knowledge of agile development methodologies such as Scrum or Kanban can help you work efficiently in a fast-paced environment and deliver results quickly.
  • Cloud-Native Application Development: Experience with cloud-native application development using frameworks like Cloud Foundry or AWS Cloud Development Kit (CDK) can be beneficial in designing scalable and efficient machine learning workflows.

 

Contact

Email: navid.asadi@tum.de

 

Supervisor:

Navidreza Asadi

Working Student for the Edge AI Testbed

Keywords:
IoT, Edge Computing, Machine Learning, Measurement, Power Characterization

Description

We are seeking a highly motivated and enthusiastic Working Student to join our team as part of the Edge AI Testbed project. As a Working Student and key member of our research team, you will contribute to the development and testing of cutting-edge Artificial Intelligence (AI) systems at the edge of the network. You will work closely with our researchers and engineers to design, implement, and evaluate innovative AI solutions that can operate efficiently on resource-constrained edge devices.

Responsibilities:

  • Assist in designing and implementing AI models for edge computing
  • Develop and test software components for the Edge AI Testbed
  • Collaborate with team members to integrate AI models with edge hardware platforms
  • Participate in performance optimization and evaluation of AI systems on edge devices
  • Contribute to the development of tools and scripts for automated testing and deployment
  • Document and report on project progress, results, and findings


If you are a motivated and talented student looking to gain hands-on experience in Edge AI, we encourage you to apply for this exciting opportunity!

Requirements:

  • Currently enrolled in a Bachelor's or Master's program in the School of CIT
  • Strong programming skills in languages such as Python and C++
  • Experience with AI frameworks such as TensorFlow, PyTorch, or Keras
  • Familiarity with edge computing platforms and devices (e.g., Raspberry Pi, NVIDIA Jetson)
  • Basic knowledge of Linux operating systems and shell scripting
  • Excellent problem-solving skills and ability to work independently
  • Strong communication and teamwork skills


Nice to Have:

  • Experience with containerization using Docker
  • Familiarity with container orchestration platforms (e.g., Kubernetes)
  • Experience with Ray
  • Knowledge of computer vision or natural language processing
  • Participation in open-source projects or personal projects related to AI and edge computing

 

Contact

Email: navid.asadi@tum.de

 

Supervisor:

Navidreza Asadi

An AI Benchmarking Suite for Microservices-Based Applications

Keywords:
Kubernetes, Deep Learning, Video Analytics, Microservices

Description

In the realm of AI applications, the deployment strategy significantly impacts performance metrics.

This research internship aims to investigate and benchmark AI applications in two predominant deployment configurations: monolithic and microservices-based, specifically within Kubernetes environments.

The central question revolves around understanding how these deployment strategies affect various performance metrics and determining the more efficient configuration. This inquiry is crucial as the deployment strategy plays a pivotal role in the operational efficiency of AI applications.

Currently, the field lacks a comprehensive benchmarking suite that evaluates AI applications from an end-to-end deployment perspective. Our approach includes the development of a benchmarking suite tailored for microservice-based AI applications.

This suite will capture metrics such as CPU/GPU/memory utilization, inter-service communication, end-to-end and per-service latency, and cache misses.

Requirements:

  • Familiarity with Kubernetes
  • Familiarity with Deep Learning frameworks (e.g., PyTorch or TensorFlow)
  • Basics of computer networking

 

Contact

Email: navid.asadi@tum.de

 

Supervisor:

Navidreza Asadi

Performance Evaluation of Serverless Frameworks

Keywords:
Serverless, Function as a Service, Machine Learning, Distributed ML

Description

Serverless computing is a cloud computing paradigm that separates infrastructure management from software development and deployment. It offers advantages such as low development overhead, fine-grained unmanaged autoscaling, and reduced customer billing. From the cloud provider's perspective, serverless reduces operational costs through multi-tenant resource multiplexing and infrastructure heterogeneity.

However, the serverless paradigm also comes with its challenges. First, a systematic methodology is needed to assess the performance of heterogeneous open-source serverless solutions; to our knowledge, existing surveys lack a thorough comparison of these frameworks. Second, there are inherent challenges associated with the serverless architecture, specifically due to its short-lived and stateless nature.

Requirements:

  • Familiarity with Kubernetes
  • Basics of computer networking

 

 

Contact

Email: navid.asadi@tum.de

 

Supervisor:

Navidreza Asadi

Distributed Deep Learning for Video Analytics

Keywords:
Distributed Deep Learning, Distributed Computing, Video Analytics, Edge Computing, Edge AI

Description

In recent years, deep learning-based algorithms have demonstrated superior accuracy in video analysis tasks, and scaling such models up, i.e., designing and training larger models with more parameters, can improve their accuracy even further.

On the other hand, due to strict latency requirements as well as privacy concerns, there is a tendency towards deploying video analysis tasks close to data sources, i.e., at the edge. However, compared to dedicated cloud infrastructures, edge devices (e.g., smartphones and IoT devices) as well as edge clouds are constrained in terms of compute, memory, and storage resources, which leads to a trade-off between response time and accuracy.

Considering video analysis tasks such as image classification and object detection as the application at the heart of this project, the goal is to evaluate different deep learning model distribution techniques for a scenario of interest.

Contact

Email: navid.asadi@tum.de

 

Supervisor:

Navidreza Asadi

Edge AI in Adversarial Environment: A Simplistic Byzantine Scenario

Keywords:
Distributed Deep Learning, Distributed Computing, Byzantine Attack, Adversarial Inference

Description

This project considers an environment consisting of several low performance machines which are connected together across a network. 

Edge AI has drawn the attention of both academia and industry as a way to bring intelligence to edge devices to enhance data privacy as well as latency. 

Prior works investigated improving the accuracy-latency trade-off of Edge AI by distributing a model across multiple available and idle machines. Building on those works, this project adds one more dimension: a scenario where $f$ out of $n$ contributing nodes are adversarial.

Therefore, for each data sample, an adversary (1) may not provide an output (which can also be considered a faulty node), or (2) may provide an arbitrary (i.e., randomly generated) output.

The goal is to evaluate the robustness of different parallelism techniques, in terms of achievable accuracy, in the presence of malicious contributors and/or faulty nodes.
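A minimal sketch of this failure model, under the simplifying assumption that each node emits a class label: adversaries either stay silent (`None`) or return an arbitrary label, and a plain majority vote is one baseline aggregation whose robustness such an evaluation would measure.

```python
from collections import Counter

def aggregate(predictions):
    """Majority vote over received outputs; None marks a silent/faulty node."""
    received = [p for p in predictions if p is not None]
    if not received:
        return None  # nothing survived to vote on
    return Counter(received).most_common(1)[0][0]

# n = 5 contributing nodes, f = 2 adversarial:
# one stays silent, one returns a random label.
result = aggregate(["cat", "cat", "cat", None, "boat"])
```

Real parallelism schemes exchange partial computations rather than final labels, which is precisely why their Byzantine robustness is non-trivial to characterize.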

Note that contrary to the mainstream existing literature, this project mainly focuses on the inference (i.e., serving) phase of deep learning algorithms, and although robustness of the training phase can be considered as well, it has a much lower priority.

Contact

Email: navid.asadi@tum.de

 

Supervisor:

Navidreza Asadi

On the Efficiency of Deep Learning Parallelism Schemes

Keywords:
Distributed Deep Learning, Parallel Computing, Inference, AI Serving

Description

Deep Learning models are becoming increasingly large, to the point that most state-of-the-art model architectures are either too big to be deployed on a single machine or cause performance issues such as undesired delays.

This is true not only for the largest models deployed in high-performance cloud infrastructures but also for smaller, more efficient models designed with fewer parameters (and hence lower accuracy) for deployment on edge devices.

That said, this project considers the second environment, where multiple resource-constrained machines are connected through a network.

Continuing the research on distributing deep learning models across multiple machines, the objective is to generate more efficient variants/submodels compared to existing deep learning parallelism algorithms.

Note that this project mainly focuses on the inference (i.e., serving) phase of deep learning algorithms, and although efficiency of the training phase can be considered as well, it has a much lower priority.

Contact

Email: navid.asadi@tum.de

 

Supervisor:

Navidreza Asadi

Optimizing Distributed Deep Learning Inference

Keywords:
deep learning, distributed systems, parallel computing, model parallelism, communication overhead reduction, performance evaluation, edge devices

Description

The rapid growth in size and complexity of deep learning models has led to significant challenges in deploying these architectures across resource-constrained machines interconnected through a network. This research project focuses on optimizing the deployment of deep learning models at the edge, where limited computational resources and high-latency networks hinder performance. The main objective is to develop efficient distributed inference techniques that can overcome the limitations of edge devices, ensuring real-time processing and decision-making.

The successful candidate will work on addressing the following challenges:

  • Employing model parallelism techniques to distribute workload across compute nodes while minimizing communication overhead associated with exchanging intermediate tensors between nodes.
  • Reducing inter-operator blocking to improve overall system throughput.
  • Developing efficient compression techniques tailored for deep learning data exchanges to minimize network latency.
  • Evaluating the performance of proposed modifications using standard deep learning benchmarks and real-world datasets.
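One possible shape of the compression direction listed above (a sketch, not the project's chosen method) is top-k sparsification: send only the k largest-magnitude entries of an intermediate tensor, plus their indices, instead of the full tensor.

```python
def topk_sparsify(tensor, k):
    """Sender side: keep the k largest-|value| entries as (index, value) pairs."""
    ranked = sorted(range(len(tensor)), key=lambda i: -abs(tensor[i]))
    kept = sorted(ranked[:k])  # transmit in index order
    return [(i, tensor[i]) for i in kept]

def densify(pairs, length):
    """Receiver side: rebuild a dense tensor, zeros elsewhere."""
    out = [0.0] * length
    for i, v in pairs:
        out[i] = v
    return out

packed = topk_sparsify([0.1, -2.0, 0.0, 3.5, -0.2, 0.4], k=2)
restored = densify(packed, 6)
```

The evaluation question is then how this lossy exchange affects end-to-end accuracy and latency compared to shipping full tensors.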

 

Responsibilities:

  • Implement and evaluate various parallelism techniques, such as model parallelism and variant parallelism, from a communication efficiency perspective.
  • Identify and implement mechanisms to minimize the exchange of intermediate tensors between compute nodes, potentially using advanced compression techniques tailored for deep learning data exchanges.
  • Conduct comprehensive performance evaluations of proposed modifications using standard deep learning benchmarks and real-world datasets. Assess improvements in latency, resource efficiency, and overall system throughput compared to baseline configurations.
  • Write technical reports and publications detailing the research findings.

Requirements:

  • Pursuing a Master's degree in the School of CIT
  • Strong background in deep learning, distributed systems, and parallel computing.
  • Proficiency in Python and experience with deep learning frameworks (e.g., TensorFlow, PyTorch).
  • Excellent problem-solving skills and the ability to work independently and collaboratively as part of a team.
  • Strong communication and writing skills for technical reports and publications.

 

Contact

Email: navid.asadi@tum.de

Supervisor:

Navidreza Asadi

Ongoing Theses

VM Selection for Financial Exchanges in the Cloud

Keywords:
Cloud Computing, Financial Exchange, Fairness, Subset Selection

Description

Financial exchanges are considering a migration to the cloud for scalability, robustness, and cost-efficiency. Jasper presents a scalable and fair multicast solution for cloud-based exchanges, addressing the cloud's lack of native mechanisms for fair multicast.

To achieve this, Jasper employs an overlay multicast tree, leveraging clock synchronization, kernel-bypass techniques, and more. However, there are opportunities for enhancement, particularly by confronting inconsistent VM performance across identical instances. LemonDrop tackles this problem by detecting under-performing VMs in a cluster and selecting a subset of VMs optimized for a given application's latency needs. Yet, we believe that LemonDrop's approach, which relies on time-expensive all-to-all latency measurements and an optimization routine for the problem framed as a Quadratic Assignment Problem (QAP), is overly complex.

The proposed work aims to develop a simpler, scalable heuristic that achieves reasonably good results within Jasper's time constraints.

Contact

Email: navid.asadi@tum.de

 

Supervisor:

Navidreza Asadi

Towards Improving Model Generation in Variant Parallelism

Keywords:
Distributed Deep Learning, Parallel Computing, Inference, Communication Efficiency

Description

Resource constraints of edge devices are a major bottleneck when deploying large AI models in edge computing scenarios. Not only are such models difficult to fit onto small devices, they are also slow at inference time, given today's need for rapid decision-making. One major technique developed to address this issue is Variant Parallelism, an ensemble-based deep-learning distribution method in which different variants of a main model are created and deployed on separate machines, and their decisions are combined to produce the final output.

The method provides graceful degradation in the presence of faulty nodes or poor connectivity while achieving an accuracy similar to the base model.

However, the existing variant-generation technique can fail to scale: combining smaller variants with near-identical characteristics may not yield a significant accuracy boost unless they are retrained with different random seeds. This research will therefore focus on improving Variant Parallelism by exploring other ways to generate variants. We will apply knowledge distillation (KD), where a teacher model of a certain type (e.g., ResNet-50) is used to train a smaller student model or a model of a completely different structure (e.g., MobileNet).
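In its common Hinton-style form, the KD objective is a cross-entropy between temperature-softened teacher and student output distributions. The sketch below uses toy logits and is an assumption about the setup; the project may use a different KD variant.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature: higher T gives softer distributions."""
    exps = [math.exp(z / temperature) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student's softened outputs against the teacher's."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))

# A student that mimics the teacher incurs a lower loss than one that doesn't.
aligned = kd_loss([4.0, 1.0, 0.5], [3.8, 1.1, 0.4])
misaligned = kd_loss([4.0, 1.0, 0.5], [0.5, 1.0, 4.0])
```

In practice this term is usually combined with the ordinary hard-label loss when training each student variant.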

We aim to develop a variant generation technique where we can generate as many variants as there are participating devices while boosting accuracy and inference speed. Additionally, we will create an optimization scenario that dynamically creates a smaller student model based on specific requirements, such as hardware characteristics and end-to-end performance metrics.

Supervisor:

Navidreza Asadi

Towards Improving Class Parallelism for Edge Environments

Keywords:
Distributed ML, Parallel Computing, CNN, Deep Learning

Description

Mainstream serving paradigms for distributed models, such as data parallelism and model parallelism, are not suitable for inference tasks that require low latency and have atomic input streams. A recent effort, Sensai, proposes a new generic approach called class parallelism that aims to distribute a base convolutional neural network (CNN) model across several homogeneous machines.

The model distribution paradigm decomposes a CNN into disconnected subnets, each responsible for predicting specific classes or groups of classes. They claim that this approach enables fast, in-parallel inference on live data with minimal communication overhead, significantly reducing inference latency on single data items without compromising accuracy.

Class parallelism, however, comes with its own set of challenges and limitations. For instance, since the sub-models are generated in a homogeneous manner, they share similar characteristics. Further, regardless of the input, all sub-models have to be executed to obtain the final prediction, which directly impacts the robustness and scalability of the system.

During the first stage of the thesis, our goal is to reproduce the results from the paper. Later, we want to improve the existing method to become more robust and possibly extend it to new use cases besides image classification. Finally, if time permits, we want to evaluate the trained models in an edge environment.

Supervisor:

Navidreza Asadi

Performance Evaluation of Serverless Frameworks

Keywords:
Serverless, Function as a Service, Machine Learning, Distributed ML

Description

Serverless computing is a cloud computing paradigm that separates infrastructure management from software development and deployment. It offers advantages such as low development overhead, fine-grained unmanaged autoscaling, and reduced customer billing. From the cloud provider's perspective, serverless reduces operational costs through multi-tenant resource multiplexing and infrastructure heterogeneity.

 

However, the serverless paradigm also comes with its challenges. First, a systematic methodology is needed to assess the performance of heterogeneous open-source serverless solutions; to our knowledge, existing surveys lack a thorough comparison of these frameworks. Second, there are inherent challenges associated with the serverless architecture, specifically due to its short-lived and stateless nature.

Supervisor:

Navidreza Asadi

A Study on Learning-Based Horizontal Autoscaling on Kubernetes

Keywords:
Autoscaling, Kubernetes, Edge Computing

Description

The rapid growth of edge computing has introduced new challenges in managing and scaling workloads in distributed environments to maintain stable service performance while saving resources. To address this, this research internship aims to explore the feasibility and implications of extending the AWARE framework (Qiu et al., 2023), which was developed as an automated workload autoscaling solution for production cloud systems, to edge environments.

 

AWARE utilizes reinforcement learning, meta-learning, and bootstrapping to scale workloads out horizontally (by increasing the number of deployment instances) and up vertically (by increasing the resources allocated to a deployment instance). We will employ resource-limited edge infrastructures running a lightweight distribution of the Kubernetes (K8s) container orchestration tool, with the goal of gaining insights into the performance, adaptability, and limitations of this approach.

Supervisor:

Navidreza Asadi

Load Generation for Benchmarking Kubernetes Autoscaler

Keywords:
Horizontal Pod Autoscaler (HPA), Kubernetes (K8s), Benchmarking

Description

Kubernetes (K8s) has become the de facto standard for orchestrating containerized applications. K8s is an open-source framework which among many features, provides automated scaling and management of services. 

Considering a microservice-based architecture, where each application is composed of multiple independent services (usually each providing a single functionality), K8s' Horizontal Pod Autoscaler (HPA) can be leveraged to dynamically change the number of instances (also known as Pods) based on the workload and incoming request pattern.

The main focus of this project is to benchmark the HPA behavior of a Kubernetes cluster running a microservice-based application having multiple services chained together. That means, there is a dependency between multiple services, and by sending a request to a certain service, other services might be called once or multiple times.

This project aims to generate incoming request load patterns that lead to an increase in either the operational cost of the Kubernetes cluster or response time of the requests. This potentially helps to identify corner cases of the algorithm and/or weak spots of the system; hence called adversarial benchmarking.
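As a minimal sketch of one such adversarial pattern (all values are illustrative assumptions), consider a square wave whose period is tuned against the autoscaler's reaction time, so replicas are added just as the load drops again, incurring cost without benefit:

```python
def square_wave(high_rps, low_rps, period, duration):
    """Target requests/second for each second of a benchmark run."""
    return [high_rps if (t // period) % 2 == 0 else low_rps
            for t in range(duration)]

# 4 minutes of load alternating every 60 s between a burst and near-idle;
# a real run would feed this schedule into a load generator against the cluster.
pattern = square_wave(high_rps=500, low_rps=10, period=60, duration=240)
```

Sweeping the period around the HPA's sync interval and stabilization window is one way to search for the corner cases the project targets.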

The applications can be selected from commonly used benchmarks such as DeathStarBench*. The objective is to investigate the dependencies between services and how different sequences of incoming request patterns affect each service as well as the whole system.

* https://github.com/delimitrou/DeathStarBench/blob/master/hotelReservation/README.md

Supervisor:

Navidreza Asadi, Razvan-Mihai Ursu