Nearly Optimal Constructions of PIR and Batch Codes
Prof. Eitan Yaakobi
Technion
Distributed and cloud storage systems today are required to tolerate the failure or unavailability of some of the nodes in the system. The simplest and most commonly method to accomplish this task is by replication, whereby every node is replicated several times, usually three. This solution has clear advantages due to its simplicity, fast recovery, and efficient availability. However, it entails a large storage overhead which becomes costly in large storage systems. Availability in particular plays an important role in supporting high throughput of the system. Consider the case in which a large number of files are stored in a distributed storage system and user requests of the files are received. If each file has one copy or can be recovered only once, and several users request this file, then these requests will have to be served sequentially, which will significantly slow down the system's response time.
In this work, we study two families of codes with availability, namely private information retrieval (PIR) codes and batch codes. While the former requires that every information symbol has k mutually disjoint recovering sets, the latter imposes this property for each multiset request of k information symbols. The main problem under this paradigm is to minimize the number of redundancy symbols. We denote this value by rP(n,k), rB(n,k), for PIR codes, batch codes, respectively, where n is the number of information symbols. Previous results showed that for any constant k, rP(n,k) = (n) and rB(n,k)=O(nlog(n)). We study the asymptotic behavior of these codes for non-constant k and specifically for k=(n). We also study the largest value of k such that the rate of the codes approaches 1, and show that for all <1, rP(n,n) = o(n) and rB(n,n) = o(n). Several more results are proved for the case of fixed k.
Eitan Yaakobi is an Assistant Professor at the Computer Science Department at the Technion – Israel Institute of Technology. He received a PhD. degree in Electrical and Computer Engineering at the University of California, San Diego, under the supervision of Prof. Paul Siegel, Prof. Alexander Vardy, and Prof. Jack Wolf. Between 2011-2014, he was a postdoctoral researcher in the department of Electrical Engineering at the California Institute of Technology, where he worked with Prof. Shuki Bruck and the Center for Memories and Recording Research at the University of California, San Diego, where he worked with Prof. Paul Siegel. He received the Marconi Society Young Scholar Award in 2009 and was a recipient of the Intel Ph.D. fellowship in 2010-2011. His research interests include information and coding theory with applications to non-volatile memories, associative memories, data storage and retrieval, and voting theory.