Offered Theses

Please contact the doctoral researchers directly if you are interested in a Bachelor's or Master's thesis, a student job, an "Ingenieurspraxis" or a "Forschungspraxis". It is usually also possible to find a topic that matches your specific interests. Each doctoral researcher's research topics and contact details are available on their personal website.

Please include a curriculum vitae together with a list of attended courses when applying for a thesis.

If your "Ingenieurspraxis" is selected to be supervised by one of our professors, please hand in the documents to Doris Dorn (Room N2401).

Bachelor's Theses

Coding theory in different metrics

Description

In this thesis, the student will study the mathematics of codes in different metrics such as the Hamming metric, (sum-)rank metric, column/row-cover metric, etc.

The focus can lie on similar mathematical ideas shared across different metrics, such as bounds on codes, good code constructions, decoding algorithms, and applications.
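As a small illustration of how the choice of metric changes the notion of distance, the following sketch compares the Hamming distance with the rank distance over GF(2). The function names are hypothetical and chosen only for this example; the rank is computed with a simple XOR-based elimination.

```python
def hamming_distance(x, y):
    """Number of positions in which two equal-length words differ."""
    if len(x) != len(y):
        raise ValueError("words must have equal length")
    return sum(a != b for a, b in zip(x, y))


def gf2_rank(rows):
    """Rank over GF(2) of a binary matrix given as a list of 0/1 rows."""
    basis = []
    for row in rows:
        # Pack the row into an integer so that row addition becomes XOR.
        v = sum(bit << i for i, bit in enumerate(row))
        for b in basis:
            # Clears the leading bit of b in v whenever it is set.
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)


def rank_distance(X, Y):
    """Rank of the difference X - Y of two binary matrices (XOR over GF(2))."""
    diff = [[a ^ b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]
    return gf2_rank(diff)


# An all-ones 2x3 matrix versus the zero matrix:
X = [[1, 1, 1], [1, 1, 1]]
Z = [[0, 0, 0], [0, 0, 0]]
flat = lambda M: [bit for row in M for bit in row]
print(hamming_distance(flat(X), flat(Z)), rank_distance(X, Z))  # 6 1
```

The two matrices differ in every entry, so they are far apart in the Hamming metric, yet their difference has rank one, so they are close in the rank metric.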

Supervisor:

Code-based Cryptography

Description

In this thesis, the student will study the mathematics of linear codes and how they can be used to design cryptosystems. 

Supervisor:

Master's Theses

Private and Secure Federated Learning

Description

In federated learning, a machine learning model shall be trained on private user data with the help of a central server, the so-called federator. This setting differs from other machine learning settings in that the user data shall not be shared with the federator for privacy reasons and/or to decrease the communication load of the system.

Even though only intermediate results are shared, extra care is necessary to guarantee data privacy. An additional challenge arises if the system includes malicious users that breach protocol and send corrupt computation results.

The goal of this work is to design, implement and analyze coding- and information-theoretic solutions for privacy and security in federated learning.
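To make the privacy aspect concrete, here is a minimal sketch of one well-known idea, additive masking with pairwise pads that cancel in the sum. It is an illustrative assumption, not necessarily the approach pursued in this thesis: the modulus Q, the function names and the use of a single seeded generator in place of pairwise key agreement are all hypothetical simplifications, and the sketch does not protect against malicious users.

```python
import random

Q = 2**31 - 1  # hypothetical modulus; updates are integers in [0, Q)


def mask_updates(updates, seed=0):
    """Add pairwise pads so that each masked update hides its user's data."""
    rng = random.Random(seed)  # stands in for pairwise shared randomness
    n, dim = len(updates), len(updates[0])
    masked = [list(u) for u in updates]
    for i in range(n):
        for j in range(i + 1, n):
            pad = [rng.randrange(Q) for _ in range(dim)]
            # User i adds the pad, user j subtracts it: the pair cancels.
            masked[i] = [(a + p) % Q for a, p in zip(masked[i], pad)]
            masked[j] = [(a - p) % Q for a, p in zip(masked[j], pad)]
    return masked


def aggregate(masked):
    """The federator sums the masked updates coordinate-wise modulo Q."""
    return [sum(col) % Q for col in zip(*masked)]


updates = [[1, 2], [3, 4], [5, 6]]
print(aggregate(mask_updates(updates)))  # [9, 12]
```

The federator recovers only the sum of the updates, since every pad is added once and subtracted once.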

Prerequisites

  • Information Theory
  • Coding Theory (e.g., Channel Coding)
  • Machine Learning (Theory and Practice)

Supervisor:

Coding theory for NVMs

Short Description:
Coding problems motivated by properties and asymmetries of NVMs

Description

Non-volatile memories (NVMs) are electronic data-storage technologies that do not require a continuous power supply to retain data; unlike traditional magnetic or optical media, they do not utilize mechanically movable components and can therefore offer better performance, and allow for three-dimensional scaling of storage devices. Under most realistic workloads, they also offer better energy efficiency.

However, these technologies also feature imbalances in behavior, performance and consequences between the processes of reading data and writing it. To wit, in memory cells which represent data by the level of held charge (traditionally allowing for representation of several logical levels), the process of charge-injection is simple and efficient, whereas charge-depletion is both technically complex (requiring the depletion of entire blocks of cells) and destructive, a main driver of cell-degradation over the device's life cycle.

Different coding-theoretic approaches have been explored to alleviate this imbalance, including coding schemes that delay charge-depletion cycles [1]-[3], and schemes that seek to mitigate the effects of defective memory cells once those appear in a device [4], [5].

Theses are available in extending either approach, as well as combining them.
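As an illustration of how coding can delay charge-depletion cycles, the sketch below implements the classical Rivest-Shamir write-once-memory code, which stores two bits twice in three binary cells whose values may only change from 0 to 1. This textbook construction is used here as an assumption for illustration; it is not taken from references [1]-[5], and the function names are hypothetical.

```python
# First-generation codewords have weight at most 1; second-generation
# codewords are their bitwise complements.
FIRST = {"00": "000", "01": "100", "10": "010", "11": "001"}
SECOND = {d: "".join("1" if c == "0" else "0" for c in w)
          for d, w in FIRST.items()}
DECODE = {w: d for table in (FIRST, SECOND) for d, w in table.items()}


def decode(state):
    """Recover the two stored bits from the three-cell state."""
    return DECODE[state]


def write(state, data):
    """Return the new cell state storing `data`, only raising cells 0 -> 1."""
    if decode(state) == data:
        return state  # nothing to change
    target = FIRST[data] if state == "000" else SECOND[data]
    if any(s == "1" and t == "0" for s, t in zip(state, target)):
        raise ValueError("cells exhausted: a block erase is required")
    return target


state = write("000", "01")   # first write
state = write(state, "10")   # second write, no erase needed
print(state, decode(state))  # 101 10
```

Two writes of two bits each use three cells instead of four, and a third write of different data triggers the error that stands in for an erase cycle.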

[1] A. Jiang, R. Mateescu, M. Schwartz and J. Bruck, "Rank Modulation for Flash Memories," in IEEE Transactions on Information Theory, vol. 55, no. 6, pp. 2659-2673, June 2009, doi: 10.1109/TIT.2009.2018336.

[2] M. Horovitz and E. Yaakobi, "On the Capacity of Write-Once Memories," in IEEE Transactions on Information Theory, vol. 63, no. 8, pp. 5124-5137, Aug. 2017, doi: 10.1109/TIT.2017.2689034.

[3] M. Horovitz and T. Etzion, "Local Rank Modulation for Flash Memories," in IEEE Transactions on Information Theory, vol. 65, no. 3, pp. 1705-1713, March 2019, doi: 10.1109/TIT.2018.2859403.

[4] V. Sidorenko, G. Schmidt, E. Gabidulin, M. Bossert and V. Afanassiev, "On polyalphabetic block codes," IEEE Information Theory Workshop, Rotorua, New Zealand, 2005, 4 pp., doi: 10.1109/ITW.2005.1531889.

[5] Y. Yehezkeally, H. A. Kim, S. Puchinger and A. Wachter-Zeh, "Bounds on Mixed Codes with Finite Alphabets," 2023 IEEE Information Theory Workshop (ITW), Saint-Malo, France, 2023, pp. 389-394, doi: 10.1109/ITW55543.2023.10161655.

Supervisor:

DNA-based data storage

Short Description:
Coding-theoretic problems motivated by DNA-based storage

Description

Contemporary global demand for storage capacity is increasing exponentially, even as traditional magnetic storage media have exhausted their potential for optimization and require unsustainable investments in capital, energy and space for both production and maintenance.

One promising potential medium for archival data storage is DNA; it features high density, extreme longevity, convenient scalability and a lower maintenance footprint. Complete DNA-based storage ecosystems are in active development, raising multiple coding-theoretic (as well as engineering, algorithmic, and biotechnological) challenges; correspondingly, increasing attention has recently been given to the study of such systems, and they are drawing significant investments from both governments and the private sector.

The following theses are available (other topics in this domain will also be entertained):

  • The torn paper channel models the effects of DNA strand breakage in storage or processing. It has been studied from both an average-case and a worst-case [1] perspective, with several distinct adversarial models. Recently, the t-break model was studied [2] as a refinement of the previously studied min-max constraint. These developments open the way to study new problems in this setting.
  • Reconstruction from substring spectra is a model motivated by the process of shotgun sequencing, where short strands are drawn sufficiently many times to reconstruct a long information sequence. Adapting existing literature [3,4] to more realistic models is an open problem.
  • Nanopore sequencing is a nascent technology that reads single-stranded DNA molecules by passing them through a narrow pore while passing electric current through it. More work studying its properties and designing codes capable of handling its relatively high error rate, extending existing literature [5,6], is necessary.
  • Duplications are a type of mutation occurring in the process of cell replication, which may be responsible for large portions of our current genome. For data storage schemes in in vivo DNA (e.g., for watermarking research material) it is an error model that needs to be countered. We aim to extend and build upon existing literature [7,8].
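Three of the models listed above can be stated in a few lines each. The following is a simplified sketch with hypothetical function names: break positions are passed in explicitly rather than chosen by an adversary or at random, and the duplication shown is a tandem duplication, which is only one of several duplication models.

```python
def torn_paper(strand, breaks):
    """Cut `strand` at the given positions; the order of fragments is lost."""
    cuts = [0] + sorted(set(breaks)) + [len(strand)]
    fragments = [strand[a:b] for a, b in zip(cuts, cuts[1:]) if a < b]
    return sorted(fragments)  # sorting discards the positional information


def substring_spectrum(strand, length):
    """Multiset (as a sorted list) of all substrings of the given length."""
    return sorted(strand[i:i + length]
                  for i in range(len(strand) - length + 1))


def tandem_duplication(strand, start, length):
    """Insert a copy of strand[start:start+length] right after itself."""
    segment = strand[start:start + length]
    return strand[:start + length] + segment + strand[start + length:]


print(torn_paper("ACGTACGT", [3, 5]))      # ['ACG', 'CGT', 'TA']
print(substring_spectrum("ACGT", 2))       # ['AC', 'CG', 'GT']
print(tandem_duplication("ACGT", 1, 2))    # ACGCGT
```

In each case the coding problem is to choose the stored strands so that the information can still be recovered from such outputs.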

[1] D. Bar-Lev, S. Marcovich, E. Yaakobi and Y. Yehezkeally, "Adversarial Torn-Paper Codes," in IEEE Transactions on Information Theory, vol. 69, no. 10, pp. 6414-6427, Oct. 2023, doi: 10.1109/TIT.2023.3292895.

[2] C. Wang, J. Sima and N. Raviv, "Break-Resilient Codes for Forensic 3D Fingerprinting," arXiv preprint arXiv:2310.03897v1 [cs.IT], Oct. 2023, doi: 10.48550/arXiv.2310.03897.

[3] Y. Yehezkeally, D. Bar-Lev, S. Marcovich and E. Yaakobi, "Generalized Unique Reconstruction From Substrings," in IEEE Transactions on Information Theory, vol. 69, no. 9, pp. 5648-5659, Sept. 2023, doi: 10.1109/TIT.2023.3269124.

[4] H. Wei, M. Schwartz and G. Ge, "Reconstruction from Noisy Substrings," arXiv preprint arXiv:2312.04790v1 [cs.IT], Dec. 2023, doi: 10.48550/arXiv.2312.04790.

[5] A. Banerjee, Y. Yehezkeally, A. Wachter-Zeh and E. Yaakobi, "Error-Correcting Codes for Nanopore Sequencing," in IEEE Transactions on Information Theory, doi: 10.1109/TIT.2024.3380615.

[6] A. Banerjee, Y. Yehezkeally, A. Wachter-Zeh and E. Yaakobi, "Correcting a Single Deletion in Reads from a Nanopore Sequencer," arXiv preprint arXiv:2401.15939v2 [cs.IT], May 2024, doi: 10.48550/arXiv.2401.15939.

[7] S. Jain, F. Farnoud Hassanzadeh, M. Schwartz and J. Bruck, "Duplication-Correcting Codes for Data Storage in the DNA of Living Organisms," in IEEE Transactions on Information Theory, vol. 63, no. 8, pp. 4996-5010, Aug. 2017, doi: 10.1109/TIT.2017.2688361.

[8] D. Goshkoder, N. Polyanskii and I. Vorobyev, "Codes Correcting Long Duplication Errors," in IEEE Transactions on Molecular, Biological, and Multi-Scale Communications, doi: 10.1109/TMBMC.2024.3403755.

Supervisor:

Homomorphic Encryption for Machine Learning

Keywords:
Partial/Somewhat Homomorphic Encryption, Federated Learning

Description

Homomorphic encryption (HE) schemes are increasingly attracting attention in the era of large-scale computing. While lattice-based approaches have been well studied, initial progress has recently been made towards establishing code-based alternatives. Preliminary results show that such alternative approaches might enable functionalities not present in current lattice-based schemes. In this project, we particularly study novel code-based Partial/Somewhat HE schemes tailored to applications in artificial intelligence and federated learning.

After becoming familiar with state-of-the-art methods in the relevant fields (such as [1]), the student should analyze the requirements of the use cases at hand and explore suitable modifications to current schemes as well as novel approaches.

Please note that this Master's thesis is also designed to open up the possibility of a subsequent PhD position in homomorphic encryption with Prof. Dr.-Ing. Antonia Wachter-Zeh.

[1] Aguilar-Melchor, Carlos, Victor Dyseryn, and Philippe Gaborit, "Somewhat Homomorphic Encryption based on Random Codes," Cryptology ePrint Archive (2023).

Prerequisites

- Strong foundation in linear algebra
- Channel Coding
- Security in Communications and Storage
- Basic understanding of Machine Learning concepts

Supervisor:

List decoding for multivariate polynomial codes

Description

Multivariate polynomials have been attracting increasing interest for constructing codes that can repair failures by accessing only a small number of the available symbols, a property required to build failure-resistant distributed storage systems.

List decoding is a technique to decode beyond half the minimum distance and has shown its advantages for codes based on univariate polynomials.

In this thesis, the student is expected to learn from the literature about list decoding algorithms for multivariate polynomial codes and to develop new decoding techniques based on them.
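The notion of list decoding beyond half the minimum distance can be illustrated on a toy example. The sketch below uses a univariate Reed-Solomon code over GF(7) with length 6 and dimension 2 (minimum distance 5, unique decoding radius 2) and decodes by brute force over all messages. The parameters and function names are assumptions chosen for illustration; this is neither an efficient algorithm nor a multivariate code.

```python
from itertools import product

P = 7                         # field size (prime, so arithmetic is mod P)
POINTS = [0, 1, 2, 3, 4, 5]   # evaluation points, code length n = 6
K = 2                         # dimension: polynomials of degree < 2


def encode(msg):
    """Evaluate the message polynomial at all points."""
    return tuple(sum(c * pow(x, i, P) for i, c in enumerate(msg)) % P
                 for x in POINTS)


def list_decode(received, radius):
    """Return all messages whose codeword is within `radius` of `received`."""
    return [msg for msg in product(range(P), repeat=K)
            if sum(a != b for a, b in zip(encode(msg), received)) <= radius]


# Two errors: still within the unique decoding radius.
print(list_decode((0, 0, 5, 0, 2, 4), 2))   # [(1, 2)]
# Three errors from two different codewords: the list has two entries.
print(list_decode((0, 0, 0, 1, 1, 1), 3))   # [(0, 0), (1, 0)]
```

At radius 3 the decoder can no longer return a single answer in general, but the list stays short, which is what makes list decoding useful.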

Supervisor:

Automatic Bias Control for Optical IQ Modulation in Quantum Communications

Description

For our Advanced Technology team in Munich/Martinsried, we are looking for a motivated Master's thesis student at the intersection of optical (quantum) communications, electrical engineering, and cyber security. In continuous-variable quantum key distribution (CV-QKD), information can be encoded on the quadratures of the incident electromagnetic field, as in commercial optical communication systems. This allows the information on coherent states of light to be captured at the receiver. Similar to coherent optical communications, in-phase and quadrature (IQ) modulation is a crucial step in generating the QKD transmit signal. In this step the modulator, usually a Mach-Zehnder modulator (MZM), imprints the electrical baseband DAC output onto the continuous-wave laser light.

Throughout this thesis work, the student’s main task will be to research, analyze and implement a practical IQ modulator for QKD applications.

Contact

utku.akin@advasecurity.com

Supervisor:

Utku Akin - Utku Akin (Adva Network Security GmbH)

Error correction for DNA storage

Description

DNA-based data storage is a novel approach to long-term digital data archiving.

Due to the unique nature of writing and reading DNA, the channel associated with these processes is still relatively poorly understood and varies over different synthesis (writing) and sequencing (reading) technologies. The task of the student is to evaluate various decoding strategies for certain error-correcting schemes tailored for the DNA storage channel.
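A common simplified abstraction of such a channel combines insertions, deletions and substitutions. The sketch below is an assumption made for illustration: the independent per-base error model, the parameter names and the function name are hypothetical, and real synthesis and sequencing technologies show more structured error behavior.

```python
import random

ALPHABET = "ACGT"


def ids_channel(strand, p_ins, p_del, p_sub, rng=None):
    """Pass `strand` through an i.i.d. insertion/deletion/substitution channel."""
    rng = rng or random.Random()
    out = []
    for base in strand:
        if rng.random() < p_ins:
            out.append(rng.choice(ALPHABET))   # random base inserted before
        r = rng.random()
        if r < p_del:
            continue                           # base is deleted
        if r < p_del + p_sub:
            # base is replaced by one of the three other bases
            out.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            out.append(base)                   # base is read correctly
    return "".join(out)


noisy = ids_channel("ACGTACGTACGT", p_ins=0.05, p_del=0.05, p_sub=0.05,
                    rng=random.Random(1))
print(noisy)
```

A decoding strategy can then be evaluated by encoding data, passing it through such a channel many times and measuring how often the original data is recovered.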

Prerequisites

- Basic principles of stochastics and algebra
- Channel Coding
- Information Theory

Supervisor:

Anisha Banerjee

Efficient Block Propagation in Cryptocurrency Networks

Description

Cryptocurrencies like Bitcoin and Ethereum use a decentralized ledger called a blockchain to track transactions. Whenever a new block is added to the blockchain, the change is spread through the network using a gossip-like protocol. This process is known as block propagation.

To increase scalability, the efficiency of block propagation is crucial. This thesis aims to explore the information theoretic limits of block propagation, derive realistic models based on real data, and investigate innovative and efficient techniques for block propagation.
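The gossip-like spreading described above can be sketched with a very simple push model, in which every node that already holds the block forwards it to one uniformly chosen peer per round. The fully connected topology, the synchronous rounds and the function name are assumptions for illustration only; real cryptocurrency networks use more elaborate relay rules and sparse peer graphs.

```python
import random


def push_gossip(n_nodes, seed=0, max_rounds=10_000):
    """Return the number of informed nodes after each round of push gossip."""
    rng = random.Random(seed)
    informed = {0}            # node 0 mined the new block
    history = [1]
    while len(informed) < n_nodes and len(history) <= max_rounds:
        newly = set()
        for node in informed:
            peer = rng.randrange(n_nodes - 1)
            if peer >= node:
                peer += 1     # shift so that a node never picks itself
            newly.add(peer)
        informed |= newly
        history.append(len(informed))
    return history


history = push_gossip(50)
print(len(history) - 1, "rounds until all 50 nodes hold the block")
```

The number of informed nodes can at most double per round, so even this idealized model needs on the order of log2(n) rounds; redundant transmissions to already informed peers are the overhead that more efficient schemes try to reduce.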

The thesis will be conducted at the Institute of Communications and Navigation at DLR (German Aerospace Center) in Oberpfaffenhofen.

Prerequisites

Required qualifications are

  • Basic knowledge of information theory
  • Programming experience in MATLAB, C, or Python
  • Interest in cryptocurrencies

Contact

Interested applicants may contact Dr. Francisco Lázaro via email at francisco.lazaroblasco@dlr.de.

Supervisor:

Juan Diego Lentner Ibanez - Dr. Francisco Lázaro (DLR (German Aerospace Center))

Research Internships (Forschungspraxis)

Private and Secure Federated Learning

Description

In federated learning, a machine learning model shall be trained on private user data with the help of a central server, the so-called federator. This setting differs from other machine learning settings in that the user data shall not be shared with the federator for privacy reasons and/or to decrease the communication load of the system.

Even though only intermediate results are shared, extra care is necessary to guarantee data privacy. An additional challenge arises if the system includes malicious users that breach protocol and send corrupt computation results.

The goal of this work is to design, implement and analyze coding- and information-theoretic solutions for privacy and security in federated learning.

Prerequisites

  • Information Theory
  • Coding Theory (e.g., Channel Coding)
  • Machine Learning (Theory and Practice)

Supervisor:

Coding theory for NVMs

Short Description:
Coding problems motivated by properties and asymmetries of NVMs

Description

Non-volatile memories (NVMs) are electronic data-storage technologies that do not require a continuous power supply to retain data; unlike traditional magnetic or optical media, they do not utilize mechanically movable components and can therefore offer better performance, and allow for three-dimensional scaling of storage devices. Under most realistic workloads, they also offer better energy efficiency.

However, these technologies also feature imbalances in behavior, performance and consequences between the processes of reading data and writing it. To wit, in memory cells which represent data by the level of held charge (traditionally allowing for representation of several logical levels), the process of charge-injection is simple and efficient, whereas charge-depletion is both technically complex (requiring the depletion of entire blocks of cells) and destructive, a main driver of cell-degradation over the device's life cycle.

Different coding-theoretic approaches have been explored to alleviate this imbalance, including coding schemes that delay charge-depletion cycles [1]-[3], and schemes that seek to mitigate the effects of defective memory cells once those appear in a device [4], [5].

Theses are available in extending either approach, as well as combining them.

[1] A. Jiang, R. Mateescu, M. Schwartz and J. Bruck, "Rank Modulation for Flash Memories," in IEEE Transactions on Information Theory, vol. 55, no. 6, pp. 2659-2673, June 2009, doi: 10.1109/TIT.2009.2018336.

[2] M. Horovitz and E. Yaakobi, "On the Capacity of Write-Once Memories," in IEEE Transactions on Information Theory, vol. 63, no. 8, pp. 5124-5137, Aug. 2017, doi: 10.1109/TIT.2017.2689034.

[3] M. Horovitz and T. Etzion, "Local Rank Modulation for Flash Memories," in IEEE Transactions on Information Theory, vol. 65, no. 3, pp. 1705-1713, March 2019, doi: 10.1109/TIT.2018.2859403.

[4] V. Sidorenko, G. Schmidt, E. Gabidulin, M. Bossert and V. Afanassiev, "On polyalphabetic block codes," IEEE Information Theory Workshop, Rotorua, New Zealand, 2005, 4 pp., doi: 10.1109/ITW.2005.1531889.

[5] Y. Yehezkeally, H. A. Kim, S. Puchinger and A. Wachter-Zeh, "Bounds on Mixed Codes with Finite Alphabets," 2023 IEEE Information Theory Workshop (ITW), Saint-Malo, France, 2023, pp. 389-394, doi: 10.1109/ITW55543.2023.10161655.

Supervisor:

DNA-based data storage

Short Description:
Coding-theoretic problems motivated by DNA-based storage

Description

Contemporary global demand for storage capacity is increasing exponentially, even as traditional magnetic storage media have exhausted their potential for optimization and require unsustainable investments in capital, energy and space for both production and maintenance.

One promising potential medium for archival data storage is DNA; it features high density, extreme longevity, convenient scalability and a lower maintenance footprint. Complete DNA-based storage ecosystems are in active development, raising multiple coding-theoretic (as well as engineering, algorithmic, and biotechnological) challenges; correspondingly, increasing attention has recently been given to the study of such systems, and they are drawing significant investments from both governments and the private sector.

The following theses are available (other topics in this domain will also be entertained):

  • The torn paper channel models the effects of DNA strand breakage in storage or processing. It has been studied from both an average-case and a worst-case [1] perspective, with several distinct adversarial models. Recently, the t-break model was studied [2] as a refinement of the previously studied min-max constraint. These developments open the way to study new problems in this setting.
  • Reconstruction from substring spectra is a model motivated by the process of shotgun sequencing, where short strands are drawn sufficiently many times to reconstruct a long information sequence. Adapting existing literature [3,4] to more realistic models is an open problem.
  • Nanopore sequencing is a nascent technology that reads single-stranded DNA molecules by passing them through a narrow pore while passing electric current through it. More work studying its properties and designing codes capable of handling its relatively high error rate, extending existing literature [5,6], is necessary.
  • Duplications are a type of mutation occurring in the process of cell replication, which may be responsible for large portions of our current genome. For data storage schemes in in vivo DNA (e.g., for watermarking research material) it is an error model that needs to be countered. We aim to extend and build upon existing literature [7,8].

[1] D. Bar-Lev, S. Marcovich, E. Yaakobi and Y. Yehezkeally, "Adversarial Torn-Paper Codes," in IEEE Transactions on Information Theory, vol. 69, no. 10, pp. 6414-6427, Oct. 2023, doi: 10.1109/TIT.2023.3292895.

[2] C. Wang, J. Sima and N. Raviv, "Break-Resilient Codes for Forensic 3D Fingerprinting," arXiv preprint arXiv:2310.03897v1 [cs.IT], Oct. 2023, doi: 10.48550/arXiv.2310.03897.

[3] Y. Yehezkeally, D. Bar-Lev, S. Marcovich and E. Yaakobi, "Generalized Unique Reconstruction From Substrings," in IEEE Transactions on Information Theory, vol. 69, no. 9, pp. 5648-5659, Sept. 2023, doi: 10.1109/TIT.2023.3269124.

[4] H. Wei, M. Schwartz and G. Ge, "Reconstruction from Noisy Substrings," arXiv preprint arXiv:2312.04790v1 [cs.IT], Dec. 2023, doi: 10.48550/arXiv.2312.04790.

[5] A. Banerjee, Y. Yehezkeally, A. Wachter-Zeh and E. Yaakobi, "Error-Correcting Codes for Nanopore Sequencing," in IEEE Transactions on Information Theory, doi: 10.1109/TIT.2024.3380615.

[6] A. Banerjee, Y. Yehezkeally, A. Wachter-Zeh and E. Yaakobi, "Correcting a Single Deletion in Reads from a Nanopore Sequencer," arXiv preprint arXiv:2401.15939v2 [cs.IT], May 2024, doi: 10.48550/arXiv.2401.15939.

[7] S. Jain, F. Farnoud Hassanzadeh, M. Schwartz and J. Bruck, "Duplication-Correcting Codes for Data Storage in the DNA of Living Organisms," in IEEE Transactions on Information Theory, vol. 63, no. 8, pp. 4996-5010, Aug. 2017, doi: 10.1109/TIT.2017.2688361.

[8] D. Goshkoder, N. Polyanskii and I. Vorobyev, "Codes Correcting Long Duplication Errors," in IEEE Transactions on Molecular, Biological, and Multi-Scale Communications, doi: 10.1109/TMBMC.2024.3403755.

Supervisor:

Homomorphic Encryption for Machine Learning

Keywords:
Partial/Somewhat Homomorphic Encryption, Federated Learning

Description

Homomorphic encryption (HE) schemes are increasingly attracting attention in the era of large-scale computing. While lattice-based approaches have been well studied, initial progress has recently been made towards establishing code-based alternatives. Preliminary results show that such alternative approaches might enable functionalities not present in current lattice-based schemes. In this project, we particularly study novel code-based Partial/Somewhat HE schemes tailored to applications in artificial intelligence and federated learning.

After becoming familiar with state-of-the-art methods in the relevant fields (such as [1]), the student should analyze the requirements of the use cases at hand and explore suitable modifications to current schemes as well as novel approaches.

Please note that this Master's thesis is also designed to open up the possibility of a subsequent PhD position in homomorphic encryption with Prof. Dr.-Ing. Antonia Wachter-Zeh.

[1] Aguilar-Melchor, Carlos, Victor Dyseryn, and Philippe Gaborit, "Somewhat Homomorphic Encryption based on Random Codes," Cryptology ePrint Archive (2023).

Prerequisites

- Strong foundation in linear algebra
- Channel Coding
- Security in Communications and Storage
- Basic understanding of Machine Learning concepts

Supervisor:

Error correction for DNA storage

Description

DNA-based data storage is a novel approach to long-term digital data archiving.

Due to the unique nature of writing and reading DNA, the channel associated with these processes is still relatively poorly understood and varies over different synthesis (writing) and sequencing (reading) technologies. The task of the student is to evaluate various decoding strategies for certain error-correcting schemes tailored for the DNA storage channel.

Prerequisites

- Basic principles of stochastics and algebra
- Channel Coding
- Information Theory

Supervisor:

Anisha Banerjee

Seminar Topics

The three seminars "Seminar on Coding and Cryptography", "Seminar on Digital Communications" and "Seminar on Optical Communications" are organized jointly.

You can find more information at Seminar Topics.