Information Theory (cs.IT)
Thu, 11 May 2023
1.Finite-State Relative Dimension, dimensions of A. P. subsequences and a Finite-State van Lambalgen's theorem
Authors:Satyadev Nandakumar, Subin Pulari, Akhil S
Abstract: Finite-state dimension (Dai, Lathrop, Lutz, and Mayordomo (2004)) quantifies the information rate in an infinite sequence as measured by finite-state automata. In this paper, we define a relative version of finite-state dimension. The finite-state relative dimension $\dim_{FS}^Y(X)$ of a sequence $X$ relative to $Y$ is the finite-state dimension of $X$ measured using the class of finite-state gamblers with oracle access to $Y$. We show its mathematical robustness by equivalently characterizing this notion using the relative block entropy rate of $X$ conditioned on $Y$. We derive inequalities relating the dimension of a sequence to the relative dimension of its subsequences along any arithmetic progression (A.P.). These enable us to obtain a strengthening of Wall's Theorem on the normality of A.P. subsequences of a normal number, in terms of relative dimension. In contrast to the original theorem, this stronger version has an exact converse, yielding a new characterization of normality. We also obtain finite-state analogues of van Lambalgen's theorem on the symmetry of relative normality.
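For context, the classical statements being strengthened can be summarized as follows (standard background, paraphrased; the paper's relative-dimension refinement is not reproduced here):
$$ X = x_1 x_2 x_3 \cdots \text{ normal} \;\Longrightarrow\; x_a\, x_{a+d}\, x_{a+2d} \cdots \text{ normal for all } a \ge 1,\ d \ge 1; \qquad X \text{ normal} \;\Longleftrightarrow\; \dim_{FS}(X) = 1. $$
The relative dimension $\dim_{FS}^Y(X)$ replaces the qualitative conclusion on the left with a quantitative relation between the dimension of $X$ and the relative dimensions of its A.P. subsequences, which is what admits an exact converse.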
2.Preferential Pliable Index Coding
Authors:Daniel Byrne, Lawrence Ong, Parastoo Sadeghi, Badri N. Vellambi
Abstract: We propose and study a variant of pliable index coding (PICOD) in which receivers have preferences for their unknown messages and assign each unknown message a preference ranking. We call this the preferential pliable index coding (PPICOD) problem and study the Pareto trade-off between the code length and an overall satisfaction metric across all receivers. We derive theoretical characteristics of the PPICOD problem in terms of the interplay between achievable code length and satisfaction metric. We also conceptually characterize two methods for computing the Pareto boundary of the set of all achievable code length-satisfaction pairs. As a coding scheme, we extend the Greedy Cover Algorithm for PICOD of Brahma and Fragouli (2015) to balance the number of satisfied receivers and the average satisfaction metric in each iteration. We present numerical results that show the efficacy of our proposed algorithm in approaching the Pareto boundary, which is found via brute-force computation.
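To make the Pareto trade-off concrete, the sketch below filters a set of hypothetical (code length, satisfaction) pairs down to their Pareto boundary; this is a generic non-domination filter for illustration only, not the computation methods or the extended Greedy Cover Algorithm from the paper.

```python
# Generic Pareto-boundary filter for (code length, satisfaction) pairs.
# Illustrative sketch only: the candidate list and the convention that
# smaller length / larger satisfaction is better are assumptions.

def pareto_boundary(points):
    """Return the non-dominated (length, satisfaction) pairs.

    A pair (l, s) is dominated if another pair (l2, s2) satisfies
    l2 <= l and s2 >= s with at least one strict inequality.
    """
    boundary = []
    for (l, s) in points:
        dominated = any(
            (l2 <= l and s2 >= s) and (l2 < l or s2 > s)
            for (l2, s2) in points
        )
        if not dominated:
            boundary.append((l, s))
    return sorted(set(boundary))   # sorted by code length

if __name__ == "__main__":
    # Hypothetical achievable (code length, satisfaction) pairs.
    candidates = [(3, 0.40), (3, 0.55), (4, 0.55), (4, 0.70), (5, 0.72), (5, 0.65)]
    print(pareto_boundary(candidates))   # [(3, 0.55), (4, 0.7), (5, 0.72)]
```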
3.Designing Compact Repair Groups for Reed-Solomon Codes
Authors:Thi Xinh Dinh, Serdar Boztas, Son Hoang Dau, Emanuele Viterbo
Abstract: Motivated by the application of Reed-Solomon codes to recently emerging decentralized storage systems such as Storj and Filebase/Sia, we study the problem of designing compact repair groups for recovering multiple failures in a decentralized manner. Here, compactness means that the corresponding trace repair schemes of these groups of helpers can be generated from a single or a few seed repair schemes, thus saving the time and space required for finding and storing them. The goal is to design compact repair groups that can tolerate as many failures as possible. It turns out that the maximum number of failures a collection of repair groups can tolerate equals the size of a minimum hitting set of a collection of subsets of the finite field $\mathbb{F}_{q^{\ell}}$, minus one. When the repair groups for each symbol are generated from a single subspace, we establish asymptotically tight lower and upper bounds on the size of such a minimum hitting set. Using Burnside's Lemma and the M\"{o}bius inversion formula, we determine a number of subspaces that together attain the upper bound on the minimum hitting set size when the repair groups are generated from multiple subspaces.
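As a concrete illustration of the combinatorial quantity driving the analysis, the sketch below computes a minimum hitting set of a family of subsets by brute force; the tiny example family is hypothetical and unrelated to the subspace constructions in the paper.

```python
# Brute-force minimum hitting set, illustrating the quantity in the
# abstract (max. tolerable failures = minimum hitting set size minus one).
# The example universe and family are hypothetical.
from itertools import combinations

def min_hitting_set(universe, family):
    """Smallest subset of `universe` intersecting every set in `family`."""
    for size in range(0, len(universe) + 1):
        for candidate in combinations(sorted(universe), size):
            if all(set(candidate) & s for s in family):
                return set(candidate)
    return None  # unreachable when every set in `family` is nonempty

if __name__ == "__main__":
    universe = {1, 2, 3, 4, 5}
    family = [{1, 2}, {2, 3}, {4, 5}]
    hs = min_hitting_set(universe, family)
    print(hs, "-> tolerates", len(hs) - 1, "failures")
```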
4.Zero-Error Distributed Function Compression for Binary Arithmetic Sum
Authors:Xuan Guang, Ruze Zhang
Abstract: In this paper, we put forward a model of zero-error distributed function compression for two binary memoryless sources $X$ and $Y$, in which two encoders En1 and En2 and one decoder De are connected by two channels (En1, De) and (En2, De) with capacity constraints $C_1$ and $C_2$, respectively. The encoder En1 can observe $X$ or $(X,Y)$ and the encoder En2 can observe $Y$ or $(X,Y)$, according to whether the two switches $s_1$ and $s_2$ are open or closed (corresponding to the values 0 or 1). The decoder De is required to compress the binary arithmetic sum $f(X,Y)=X+Y$ with zero error by using the system multiple times. We use $(s_1 s_2; C_1, C_2; f)$ to denote the model, where it is assumed that $C_1 \geq C_2$ by symmetry. The compression capacity of the model is defined as the maximum average number of times that the function $f$ can be compressed with zero error per use of the system, which measures the efficiency of using the system. We fully characterize the compression capacities for all four cases of the model $(s_1 s_2; C_1, C_2; f)$ with $s_1 s_2 = 00, 01, 10, 11$. Here, the characterization of the compression capacity for the case $(01; C_1, C_2; f)$ with $C_1 > C_2$ is highly nontrivial, and a novel graph coloring approach is developed for it. Furthermore, we apply the compression capacity for $(01; C_1, C_2; f)$ to an open problem in network function computation: whether the best known upper bound of Guang et al. on the computing capacity is tight in general.
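The zero-error requirement is classically handled by coloring a confusability graph; the sketch below shows this generic idea on a hypothetical four-symbol graph and is not the novel coloring construction developed in the paper.

```python
# Greedy coloring of a small confusability graph, illustrating the
# classical zero-error idea: source symbols that the decoder could
# confuse must receive distinct colors, and only the color index needs
# to be transmitted. The graph below is a hypothetical example.
import math

def greedy_coloring(adjacency):
    """Assign each vertex the smallest color unused by its already-colored neighbors."""
    colors = {}
    for v in sorted(adjacency):
        taken = {colors[u] for u in adjacency[v] if u in colors}
        colors[v] = next(c for c in range(len(adjacency)) if c not in taken)
    return colors

if __name__ == "__main__":
    confusability = {          # undirected edges between confusable symbols
        0: {1, 2, 3},
        1: {0, 2},
        2: {0, 1},
        3: {0},
    }
    coloring = greedy_coloring(confusability)
    num_colors = len(set(coloring.values()))
    print(coloring)
    print(num_colors, "colors ->", math.ceil(math.log2(num_colors)), "bits per transmission")
```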
5.Joint Identification and Sensing for Discrete Memoryless Channels
Authors:Wafa Labidi, Christian Deppe, Holger Boche
Abstract: In the identification (ID) scheme proposed by Ahlswede and Dueck, the receiver only checks whether a message of special interest to him has been sent or not. In contrast to Shannon transmission codes, the size of ID codes for a Discrete Memoryless Channel (DMC) grows doubly exponentially fast with the blocklength, if randomized encoding is used. This groundbreaking result makes the ID paradigm more efficient than classical Shannon transmission in terms of necessary energy and hardware components. Further gains can be achieved by taking advantage of additional resources such as feedback. We study the problem of joint ID and channel state estimation over a DMC with independent and identically distributed (i.i.d.) state sequences. The sender simultaneously sends an ID message over the DMC with a random state and estimates the channel state via strictly causal observations of the channel output. The random channel state is available to neither the sender nor the receiver. For the proposed system model, we establish a lower bound on the ID capacity-distortion function.
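The double-exponential gap mentioned above is the classical Ahlswede-Dueck scaling, stated here only as background:
$$ M_{\mathrm{Shannon}}(n) \doteq 2^{nC} \qquad\text{vs.}\qquad N_{\mathrm{ID}}(n) \doteq 2^{2^{nC}}, \quad\text{i.e.,}\quad \tfrac{1}{n}\log\log N_{\mathrm{ID}}(n) \to C, $$
where $C$ is the Shannon capacity of the DMC and randomized encoding is used for identification.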
6.Adaptive Privacy-Preserving Coded Computing With Hierarchical Task Partitioning
Authors:Qicheng Zeng, Zhaojun Nan, Sheng Zhou
Abstract: Distributed computing is known as an emerging and efficient technique for supporting various intelligent services, such as large-scale machine learning. However, privacy leakage and random delays from straggling servers pose significant challenges. To address these issues, coded computing, a promising solution that combines coding theory with distributed computing, recovers computation tasks using results from only a subset of workers. In this paper, we propose the adaptive privacy-preserving coded computing (APCC) strategy, which can adaptively provide accurate or approximated results according to the form of the computation functions, so as to suit diverse types of computation tasks. We prove that APCC achieves complete data privacy preservation and demonstrate its optimality in terms of encoding rate, defined as the ratio between the computation loads of tasks before and after encoding. To further alleviate the straggling effect and reduce delay, we integrate hierarchical task partitioning and task cancellation into the coding design of APCC. The corresponding partitioning problems are formulated as mixed-integer nonlinear programming (MINLP) problems with the objective of minimizing task completion delay. We propose a low-complexity maximum value descent (MVD) algorithm to optimally solve these problems. Simulation results show that APCC can reduce task completion delay by at least 42.9% compared to other state-of-the-art benchmarks.
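To illustrate the generic "recover from a subset of workers" mechanism that coded computing relies on, here is a minimal MDS-coded matrix-vector multiplication sketch; the block sizes, the real-valued Vandermonde encoder, and the surviving-worker set are assumptions chosen for the example, and this is not the APCC strategy itself (which additionally handles privacy and hierarchical partitioning).

```python
# Minimal coded-computing sketch: encode k row-blocks of A with an
# n x k real Vandermonde (MDS) matrix so that A @ x can be recovered
# from ANY k of the n worker results. Generic straggler tolerance only,
# not the APCC scheme of the paper.
import numpy as np

k, n = 3, 5                      # k data blocks, n workers
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))  # 6 rows, split into k = 3 blocks of 2
x = rng.standard_normal(4)

blocks = np.split(A, k)          # A_1, ..., A_k
G = np.vander(np.arange(1, n + 1), k, increasing=True).astype(float)  # n x k encoder

# Worker i stores sum_j G[i, j] * A_j and returns its product with x.
coded_blocks = [sum(G[i, j] * blocks[j] for j in range(k)) for i in range(n)]
worker_results = [cb @ x for cb in coded_blocks]

# Suppose only workers {0, 2, 4} respond: invert the corresponding 3 x 3 submatrix of G.
survivors = [0, 2, 4]
G_sub = G[survivors, :]
Y = np.stack([worker_results[i] for i in survivors])   # k x (rows per block)
result = np.linalg.solve(G_sub, Y).reshape(-1)          # recovers A @ x block by block

assert np.allclose(result, A @ x)
print("Recovered A @ x from", len(survivors), "of", n, "workers")
```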
7.Full-Spectrum Wireless Communications for 6G and Beyond: From Microwave, Millimeter-Wave, Terahertz to Lightwave
Authors:Wei Jiang, Hans D. Schotten
Abstract: As of today, 5G is rolling out across the world, but academia and industry have already shifted their attention to the sixth-generation (6G) cellular technology for a fully digitalized, intelligent society in 2030 and beyond. 6G demands far more bandwidth to support extreme performance, exacerbating the problem of spectrum shortage in mobile communications. In this context, this paper proposes a novel concept coined Full-Spectrum Wireless Communications (FSWC). It makes use of all communication-feasible spectral resources over the whole electromagnetic (EM) spectrum, from microwave, millimeter wave, and terahertz (THz) to infrared light, visible light, and ultraviolet light. FSWC not only provides sufficient bandwidth but also enables new paradigms that take advantage of the peculiarities of different EM bands. This paper defines FSWC, justifies its necessity for 6G, and then discusses the opportunities and challenges of exploiting the THz and optical bands.
8.Correcting One Error in Non-Binary Channels with Feedback
Authors:Ilya Vorobyev, Vladimir Lebedev, Alexey Lebedev
Abstract: In this paper, the problem of correcting a single error in a $q$-ary symmetric channel with noiseless feedback is considered. We propose an algorithm that constructs codes with feedback inductively. For all prime powers $q$, we prove that two instances of feedback are sufficient to transmit over the $q$-ary symmetric channel the same number of messages as in the case of complete feedback. Our other contribution is the construction of codes with one-time feedback having the same parameters as Hamming codes for $q$ that is not a prime power. We also construct single-error-correcting codes with one-time feedback of size $q^{n-2}$ for arbitrary $q$ and $n\leq q+1$, which can be seen as an analog of Reed-Solomon codes.
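For reference, the benchmark implicit in the comparison with Hamming codes is the $q$-ary Hamming (sphere-packing) bound for single-error-correcting codes, stated here only as standard background:
$$ M \;\le\; \frac{q^{\,n}}{1 + n(q-1)}, $$
with equality attained by Hamming codes of length $n = \frac{q^r-1}{q-1}$ when $q$ is a prime power; the constructions above aim to match such parameters for non-prime-power $q$ using limited feedback.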
9.A Diagonal Splitting Algorithm for Adaptive Group Testing
Authors:Chaorui Yao, Pavlos Nikolopoulos, Christina Fragouli
Abstract: Group testing makes it possible to identify infected individuals in a population using a smaller number of tests than individual testing. To achieve this, group testing algorithms commonly assume knowledge of the number of infected individuals; nonadaptive and several adaptive algorithms fall into this category. Some adaptive algorithms, like binary splitting, operate without this assumption, but require a number of stages that may scale linearly with the size of the population. In this paper, we contribute a new algorithm, which we term diagonal group testing, that enables a balance between the number of tests and the number of stages used. Diagonal group testing, like binary splitting, does not require knowledge of the number of infected individuals; yet unlike binary splitting, it is order-optimal w.r.t. the expected number of tests it requires and is guaranteed to succeed in a small number of stages that scales at most logarithmically with the size of the population. Numerical evaluations, for diagonal group testing and a hybrid approach we propose, support our theoretical findings.
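As a point of reference for the baseline mentioned above, here is a minimal sketch of classical adaptive binary splitting (not the diagonal algorithm proposed in the paper); the population and infected set below are hypothetical.

```python
# Classical adaptive binary splitting (baseline only): test a group and,
# if it is positive, split it in half and recurse until each infected
# individual is isolated.
def group_test(group, infected):
    """One pooled test: positive iff the group contains an infected item."""
    return any(i in infected for i in group)

def binary_splitting(population, infected, counter):
    if not population:
        return []
    counter[0] += 1
    if not group_test(population, infected):
        return []                        # whole group is clean
    if len(population) == 1:
        return list(population)          # isolated one infected individual
    mid = len(population) // 2
    return (binary_splitting(population[:mid], infected, counter)
            + binary_splitting(population[mid:], infected, counter))

if __name__ == "__main__":
    population = list(range(32))         # hypothetical population of 32
    infected = {5, 17}                   # hypothetical infected set
    tests = [0]
    found = binary_splitting(population, infected, tests)
    print("found:", found, "using", tests[0], "tests")
```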
10.Vector Quantization with Error Uniformly Distributed over an Arbitrary Set
Authors:Chih Wei Ling, Cheuk Ting Li
Abstract: For uniform scalar quantization, the error distribution is approximately a uniform distribution over an interval (which is also a 1-dimensional ball). For lattice vector quantization, however, the error distribution is uniform not over a ball, but over the basic cell of the quantization lattice. In this paper, we construct vector quantizers whose error is uniform over the n-ball, or any other prescribed set. We then prove bounds on the entropy of the quantized signals.
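The opening observation about uniform scalar quantization can be made exact with subtractive dithering; the quick numerical check below illustrates this classical fact (the Gaussian input and step size are arbitrary choices for the example, and this is not the construction from the paper).

```python
# Numerical check: with subtractive dithering, the error of a uniform
# scalar quantizer is exactly uniform on (-Delta/2, Delta/2), regardless
# of the input distribution. Input signal and step size are arbitrary.
import numpy as np

rng = np.random.default_rng(1)
delta = 0.5
x = 3.0 * rng.standard_normal(100_000)            # arbitrary input signal
u = rng.uniform(-delta / 2, delta / 2, x.shape)   # dither known to encoder and decoder

y = delta * np.round((x + u) / delta) - u          # subtractive-dither quantizer
err = y - x

print("error range:", err.min(), err.max())        # stays within (-delta/2, delta/2)
print("error mean/var:", err.mean(), err.var())    # approx 0 and delta**2 / 12
```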
11.MIMO Radar Transmit Signal Optimization for Target Localization Exploiting Prior Information
Authors:Chan Xu, Shuowen Zhang
Abstract: In this paper, we consider a multiple-input multiple-output (MIMO) radar system for localizing a target based on its reflected echo signals. Specifically, we aim to estimate the random and unknown angle information of the target by exploiting its prior distribution information. First, we characterize the estimation performance by deriving the posterior Cram\'er-Rao bound (PCRB), which quantifies a lower bound on the estimation mean-squared error (MSE). Since the PCRB has a complicated form, we derive a tight upper bound on it to approximate the estimation performance. Based on this, we analytically show that by exploiting the prior distribution information, the PCRB is always no larger than the CRB averaged over random angle realizations without prior information exploitation. Next, we formulate the transmit signal optimization problem to minimize the PCRB upper bound. We show that the optimal sample covariance matrix has a rank-one structure, and we derive the optimal signal solution in closed form. Numerical results show that our proposed design achieves significantly improved PCRB performance compared to various benchmark schemes.
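As background, the generic (Van Trees) form of the posterior CRB for a scalar parameter is shown below; this is the standard bound, not the paper's specific derivation for the MIMO radar model:
$$ \mathbb{E}\big[(\hat{\theta}-\theta)^2\big] \;\ge\; J^{-1}, \qquad J \;=\; \underbrace{\mathbb{E}\!\left[-\frac{\partial^2 \ln p(\mathbf{y}\mid\theta)}{\partial\theta^2}\right]}_{\text{data term}} \;+\; \underbrace{\mathbb{E}\!\left[-\frac{\partial^2 \ln p(\theta)}{\partial\theta^2}\right]}_{\text{prior term}}, $$
where the expectations are over both $\mathbf{y}$ and $\theta$; informally, the additional prior term is why exploiting the prior distribution cannot degrade the bound.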
12.Single-Server Pliable Private Information Retrieval With Side Information
Authors:Sarah A. Obead, Hsuan-Yin Lin, Eirik Rosnes
Abstract: We study the problem of pliable private information retrieval with side information (PPIR-SI) for the single-server case. In PPIR, the messages are partitioned into nonoverlapping classes and stored in a number of noncolluding databases. The user wishes to retrieve any one message from a desired class while revealing no information about the desired class identity to the databases. In PPIR-SI, the user has prior access to some side information in the form of messages from different classes and wishes to retrieve any one new message from a desired class, i.e., a message not included in the side information set, while revealing no information about the desired class to the databases. We characterize the capacity of (linear) single-server PPIR-SI for the case where the user's side information is unidentified, i.e., the user is oblivious to the identities of its side information messages and to the database structure. We term this case PPIR-USI. Surprisingly, we show that in PPIR-USI, having side information is disadvantageous in terms of the download rate compared to PPIR.
13.Multi-Antenna Coded Caching for Location-Aware Content Delivery
Authors:Hamidreza Bakhshzad Mahmoodi, MohammadJavad Salehi, Antti Tolli
Abstract: A location-aware coded caching scheme is introduced for applications with location-dependent data requests. An example of such an application is a wireless immersive experience, where users are immersed in a three-dimensional virtual world and their viewpoint varies as they move within the application area. As the wireless connectivity condition of the users also varies with their location due to small- and large-scale fading, a non-uniform memory allocation process is used to avoid excessive delivery time in the bottleneck areas. Then, a well-defined location-aware placement and delivery array (LAPDA) is used for data delivery to utilize unicast transmission with a fast-converging, iterative linear beamforming process. An underlying weighted max-min transmit precoder design enables the proposed scheme to serve users in poor-connectivity areas with smaller amounts of data while simultaneously delivering larger amounts to other users. Unlike previous studies in the literature, our new scheme is not constrained by the number of users or the network parameters (users' cache capacity, number of antennas at the transmitter, etc.) and is suitable for large networks due to its linear transceiver structure. Despite the non-uniform cache placement, the proposed scheme achieves a coded caching gain that is additive to the multiplexing gain and outperforms conventional symmetric coded caching schemes with only a moderate degrees-of-freedom (DoF) loss.
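For orientation, the additive gain structure referred to above is commonly summarized in this line of work as (stated here only as widely cited background, with $K$ users, per-user cache ratio $\gamma$, and $L$ transmit antennas; exact conditions vary by scheme):
$$ \mathrm{DoF} \;=\; K\gamma + L, $$
i.e., the coded caching gain $K\gamma$ adds to the spatial multiplexing gain $L$; the claim above is that this additive behavior is retained despite the non-uniform, location-aware placement.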
14.An Information-Spectrum Approach to Distributed Hypothesis Testing for General Sources
Authors:Ismaila Salihou Adamou, Elsa Dupraz, Tad Matsumoto
Abstract: This paper investigates distributed hypothesis testing (DHT), in which a source $\mathbf{X}$ is encoded given that side information $\mathbf{Y}$ is available at the decoder only. Based on the received coded data, the receiver aims to decide between the two hypotheses $H_0$ and $H_1$ related to the joint distribution of $\mathbf{X}$ and $\mathbf{Y}$. While most existing contributions in the literature on DHT consider i.i.d. assumptions, this paper assumes more generic, non-i.i.d., non-stationary, and non-ergodic source models. It relies on information-spectrum tools to provide general formulas for the achievable Type-II error exponent under a constraint on the Type-I error. The achievability proof is based on a quantize-and-binning scheme. It is shown that with the quantize-and-binning approach, the error exponent boils down to a trade-off between a binning error and a decision error, as already observed for i.i.d. sources. The last part of the paper provides error exponents for particular source models, \emph{e.g.}, Gaussian, stationary, and ergodic models.
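For orientation, in the centralized i.i.d. setting the benchmark is the Chernoff-Stein exponent; the paper's information-spectrum formulas generalize this kind of statement to non-i.i.d. sources and to the distributed, quantize-and-binning setting (the display below is standard background only):
$$ \lim_{n \to \infty} -\frac{1}{n} \log \beta_n(\epsilon) \;=\; D\!\left(P^{(0)}_{XY} \,\middle\|\, P^{(1)}_{XY}\right), \qquad 0 < \epsilon < 1, $$
where $\beta_n(\epsilon)$ is the smallest Type-II error probability subject to a Type-I error of at most $\epsilon$ when both $\mathbf{X}$ and $\mathbf{Y}$ are available to the tester.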
15.On the Advantages of Asynchrony in the Unsourced MAC
Authors:Alexander Fengler, Alejandro Lancho, Krishna Narayanan, Yury Polyanskiy
Abstract: In this work, we demonstrate how a lack of synchronization can in fact be advantageous in the problem of random access. Specifically, we consider a multiple-access problem over a frame-asynchronous 2-user binary-input adder channel in the unsourced setup (2-UBAC). Previous work has shown that under perfect synchronization the per-user rates achievable with linear codes over the 2-UBAC are limited to 0.5 bit per channel use (compared to the capacity of 0.75). In this paper, we first demonstrate that an arbitrarily small (even single-bit) shift between the users' frames enables (random) linear codes to attain the full capacity of 0.75 bit/user. Furthermore, we derive density evolution equations for irregular LDPC codes and prove (via concentration arguments) that they correctly track the asymptotic bit-error rate of a BP decoder. Optimizing the degree distributions, we construct LDPC codes achieving per-user rates of 0.73 bit per channel use.
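As a reminder of what density evolution computes, here is a minimal sketch for the familiar BEC case with a (3,6)-regular ensemble; the actual equations derived in the paper are for the asynchronous 2-UBAC and differ from this textbook template.

```python
# Minimal density-evolution sketch for the BEC with a (3,6)-regular LDPC
# ensemble, locating the erasure threshold by bisection. Standard
# textbook template only, not the 2-UBAC equations from the paper.
def de_converges(eps, dv=3, dc=6, iters=2000, tol=1e-12):
    """Iterate x_{t+1} = eps * (1 - (1 - x_t)**(dc-1))**(dv-1); True if x -> 0."""
    x = eps
    for _ in range(iters):
        x = eps * (1.0 - (1.0 - x) ** (dc - 1)) ** (dv - 1)
        if x < tol:
            return True
    return False

def threshold(dv=3, dc=6):
    lo, hi = 0.0, 1.0
    for _ in range(50):                 # bisection on the channel erasure rate
        mid = (lo + hi) / 2
        if de_converges(mid, dv, dc):
            lo = mid
        else:
            hi = mid
    return lo

if __name__ == "__main__":
    print("BEC threshold of the (3,6)-regular ensemble ~", round(threshold(), 4))
```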
16.The Cardinality Bound on the Information Bottleneck Representations is Tight
Authors:Etam Benger, Shahab Asoodeh, Jun Chen
Abstract: The information bottleneck (IB) method aims to find compressed representations of a variable $X$ that retain the most relevant information about a target variable $Y$. We show that for a wide family of distributions -- namely, when $Y$ is generated by $X$ through a Hamming channel, under mild conditions -- the optimal IB representations require an alphabet strictly larger than that of $X$. This implies that, despite suggestions to the contrary in several recent works, the cardinality bound first identified by Witsenhausen and Wyner in 1975 is tight. At the core of our finding is the observation that the IB function in this setting is not strictly concave, similar to the deterministic case, even though the joint distribution of $X$ and $Y$ has full support. Finally, we provide a complete characterization of the IB function, as well as of the optimal representations, for the Hamming case.
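For reference, the IB problem and the bound in question take the following form (the Lagrangian formulation is standard; the displayed cardinality bound $|\mathcal{T}| \le |\mathcal{X}|+1$ is our reading of the Witsenhausen-Wyner bound referenced above):
$$ \min_{p(t \mid x)\,:\; T - X - Y}\; I(X;T) - \beta\, I(T;Y), \qquad |\mathcal{T}| \;\le\; |\mathcal{X}| + 1. $$
In this notation, the result says that for the Hamming-channel family optimal representations can genuinely need $|\mathcal{X}|+1$ letters, so no smaller alphabet suffices in general.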
17.Computing Unique Information for Poisson and Multinomial Systems
Authors:Chaitanya Goswami, Amanda Merkley, Pulkit Grover
Abstract: Bivariate Partial Information Decomposition (PID) describes how the mutual information between a random variable $M$ and two random variables $Y$ and $Z$ is decomposed into unique, redundant, and synergistic terms. Recently, PID has shown promise as an emerging tool for understanding biological systems and biases in machine learning. However, computing PID is a challenging problem, as it typically involves optimizing over distributions. In this work, we study the problem of computing PID in two systems: the Poisson system inspired by the 'ideal Poisson channel' and the multinomial system inspired by multinomial thinning, for a scalar $M$. We provide sufficient conditions for both systems under which closed-form expressions for many operationally motivated PID definitions can be obtained, thereby allowing us to easily compute the PID for these systems. Our proof consists of showing that one of the unique information terms is zero, which allows the remaining unique, redundant, and synergistic terms to be easily computed using only the marginal and the joint mutual information.
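The decomposition referred to above satisfies the standard consistency equations (Williams-Beer), reproduced here as background; they also make the last claim concrete, since once one unique term vanishes, the remaining terms follow from ordinary mutual informations:
$$ I(M;Y,Z) \;=\; UI(M;Y\!\setminus\!Z) + UI(M;Z\!\setminus\!Y) + RI(M;Y,Z) + SI(M;Y,Z), $$
$$ I(M;Y) \;=\; UI(M;Y\!\setminus\!Z) + RI(M;Y,Z), \qquad I(M;Z) \;=\; UI(M;Z\!\setminus\!Y) + RI(M;Y,Z). $$
For example, if $UI(M;Z\!\setminus\!Y) = 0$, then $RI = I(M;Z)$, $UI(M;Y\!\setminus\!Z) = I(M;Y) - I(M;Z)$, and $SI = I(M;Y,Z) - I(M;Y)$.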