Measuring Uncertainty in Structured Stochastic Processes
Description
The entropy rate of a stochastic process quantifies the average uncertainty per time step. In the context of Hidden Markov Models (HMMs)—where we observe a sequence of outputs generated by an underlying Markov process—this notion becomes particularly intriguing. While standard Markov chains have a well-defined structure for computing entropy rates, the introduction of a hidden layer complicates matters significantly. The entropy rate of an HMM reflects the long-term unpredictability of its observable outputs, capturing both the randomness of the state transitions and the ambiguity introduced by the observation process.
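For a fully observed Markov chain the entropy rate has a simple closed form, H = -Σ_i π_i Σ_j P_ij log P_ij, where π is the stationary distribution; the difficulty described here comes from the hidden layer, for which no such formula exists in general. A minimal sketch of the closed-form case (the transition matrix is an arbitrary example):

```python
import numpy as np

def markov_entropy_rate(P):
    """Entropy rate (bits/step) of an ergodic Markov chain with transition matrix P."""
    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    evals, evecs = np.linalg.eig(P.T)
    pi = np.real(evecs[:, np.argmin(np.abs(evals - 1))])
    pi = pi / pi.sum()
    # H = -sum_i pi_i sum_j P_ij log2 P_ij
    with np.errstate(divide="ignore", invalid="ignore"):
        logP = np.where(P > 0, np.log2(P), 0.0)
    return float(-(pi[:, None] * P * logP).sum())

# Example: a two-state chain
P = np.array([[0.9, 0.1],
              [0.4, 0.6]])
print(markov_entropy_rate(P))  # ≈ 0.57 bits per step
```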
Understanding the entropy rate of HMMs has broad implications, from speech recognition and bioinformatics to machine learning and signal processing. Despite their practical importance, entropy rates of HMMs are notoriously hard to compute exactly, often requiring approximate methods or asymptotic analysis. This challenge opens up interesting theoretical questions and motivates efficient computational approaches. In this project, we investigate how structure, memory, and noise in the model impact entropy rate, and what this reveals about the information limits of systems modeled by HMMs.
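One standard approximate route, in the spirit of the "approximate methods" mentioned above, combines the Shannon–McMillan–Breiman theorem with the HMM forward algorithm: simulate a long observation sequence and evaluate -(1/n) log P(y_1, ..., y_n), which converges to the entropy rate of the observed process. The sketch below uses arbitrary example parameters and is one common approach, not necessarily the one the project will follow:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative HMM: 2 hidden states, 2 output symbols (all parameters are arbitrary examples).
P = np.array([[0.9, 0.1],    # hidden-state transition matrix
              [0.4, 0.6]])
B = np.array([[0.8, 0.2],    # emission probabilities B[state, symbol]
              [0.3, 0.7]])
pi0 = np.array([0.8, 0.2])   # initial (here: stationary) hidden-state distribution

def sample_observations(n):
    """Simulate n observations from the HMM."""
    s = rng.choice(2, p=pi0)
    ys = np.empty(n, dtype=int)
    for t in range(n):
        ys[t] = rng.choice(2, p=B[s])
        s = rng.choice(2, p=P[s])
    return ys

def neg_log_likelihood_rate(ys):
    """-(1/n) log2 P(y_1, ..., y_n) via the normalized forward recursion."""
    alpha = pi0 * B[:, ys[0]]
    log_prob = np.log2(alpha.sum())
    alpha /= alpha.sum()
    for y in ys[1:]:
        alpha = (alpha @ P) * B[:, y]
        log_prob += np.log2(alpha.sum())
        alpha /= alpha.sum()
    return -log_prob / len(ys)

ys = sample_observations(100_000)
print(neg_log_likelihood_rate(ys))  # ≈ entropy rate of the observed process, in bits/step
```

For long sequences this estimate stabilizes, but its convergence rate and the dependence of the limit on the model's structure, memory, and observation noise are exactly the kinds of questions the project addresses.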
Prerequisites
Information Theory
Supervisor:
Arimoto Channel Coding Converse and Rényi Divergence
Description
Shannon entropy is a foundational concept in information theory. However, it does not always provide sufficient flexibility for optimization problems, robustness considerations, or scenarios that require fine control over uncertainty quantification. In 1961, Rényi introduced a parametric generalization of Shannon entropy [1], allowing for a more nuanced analysis of information measures.
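A quick way to see the generalization is to evaluate the Rényi entropy H_α(X) = (1/(1-α)) log Σ_x p(x)^α for several orders α and watch it recover Shannon entropy as α → 1. A small illustrative sketch (the distribution is an arbitrary example):

```python
import numpy as np

def renyi_entropy(p, alpha):
    """Rényi entropy of order alpha (in bits); alpha = 1 is the Shannon limit."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    if np.isclose(alpha, 1.0):
        return float(-(p * np.log2(p)).sum())            # Shannon entropy
    return float(np.log2((p ** alpha).sum()) / (1.0 - alpha))

p = [0.5, 0.25, 0.125, 0.125]
for a in [0.5, 0.999, 1.0, 1.001, 2.0]:
    print(a, renyi_entropy(p, a))
# H_alpha is non-increasing in alpha and approaches the Shannon entropy (1.75 bits) as alpha -> 1.
```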
The goal of this project is to understand the difference between Shannon and Rényi entropy, conditional entropy, and divergence, as well as Polyanskiy and Verdú's result [2]: a derivation of the Arimoto converse [3] based on the data-processing inequality for Rényi divergence. This result generalizes to codes with feedback and yields the simplest known proof of the strong converse for the DMC with feedback. It also shows that the sphere-packing bound is strictly tighter than the Arimoto converse for all channels, blocklengths, and rates. Finally, these results can be generalized to other (non-Rényi) divergence measures.
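The key ingredient in [2] is the data-processing inequality for Rényi divergence, D_α(PW ∥ QW) ≤ D_α(P ∥ Q) for any channel W, where D_α(P ∥ Q) = (1/(α-1)) log Σ_x P(x)^α Q(x)^{1-α}. The sketch below is only a numerical sanity check of this inequality on random distributions and a random channel, not a reproduction of the paper's argument:

```python
import numpy as np

rng = np.random.default_rng(1)

def renyi_divergence(p, q, alpha):
    """Rényi divergence D_alpha(p || q) in bits (finite alphabets, alpha != 1)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.log2((p ** alpha * q ** (1 - alpha)).sum()) / (alpha - 1))

# Random input distributions and a random channel W (rows: inputs, columns: outputs).
p = rng.dirichlet(np.ones(4))
q = rng.dirichlet(np.ones(4))
W = rng.dirichlet(np.ones(5), size=4)

for alpha in [0.5, 2.0, 5.0]:
    before = renyi_divergence(p, q, alpha)
    after = renyi_divergence(p @ W, q @ W, alpha)   # push both distributions through the same channel
    print(alpha, after <= before + 1e-12)           # data processing: divergence cannot increase
```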
[1] A. Rényi, "On measures of entropy and information," in Proc. 4th Berkeley Symp. on Mathematical Statistics and Probability, vol. 1, Berkeley, CA, USA, 1961, pp. 547–561.
[2] Y. Polyanskiy and S. Verdú, "Arimoto channel coding converse and Rényi divergence," in Proc. 48th Annual Allerton Conf. on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 2010, pp. 1327–1333.
[3] S. Arimoto, "On the converse to the coding theorem for discrete memoryless channels," IEEE Trans. Inf. Theory, vol. 19, no. 3, pp. 357–359, May 1973.
Prerequisites
Information Theory