Generalization Bounds and Algorithms for Learning to Communicate over Additive Noise Channels

Nir Weinberger*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


An additive noise channel is considered in which the distribution of the noise is nonparametric and unknown, and the problem of learning encoders and decoders from noise samples is studied. For uncoded communication systems, the problem of choosing a codebook, and possibly also a generalized minimal distance decoder (parameterized by a covariance matrix), is addressed. High-probability generalization bounds are provided for the error probability loss function, as well as for a hinge-type surrogate loss function. A stochastic-gradient-based alternating-minimization algorithm is proposed for the latter loss function. In addition, a Gibbs-based algorithm is proposed that gradually expurgates codewords from an initial codebook in order to obtain a smaller codebook with improved error probability; bounds on its average empirical error and generalization error, as well as a high-probability generalization bound, are stated. Various experiments demonstrate the performance of the proposed algorithms. For coded systems, the problem of maximizing the mutual information between the input and the output with respect to the input distribution is addressed, and uniform convergence bounds are obtained for two different classes of input distributions.
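To make the uncoded setting concrete, the following is a minimal sketch of how an empirical error probability could be estimated from noise samples for a given codebook and a matrix-parameterized minimum-distance decoder. It is an illustration under stated assumptions, not the paper's implementation: the decoder is assumed to pick the codeword minimizing the quadratic form (y - x)^T S (y - x), the error is assumed to be averaged uniformly over codewords, and the names `quad_form`, `decode`, `empirical_error`, and the matrix `S` are hypothetical.

```python
import random

def quad_form(v, S):
    """Return v^T S v for a vector v and a square matrix S (lists of floats)."""
    n = len(v)
    return sum(v[i] * S[i][j] * v[j] for i in range(n) for j in range(n))

def decode(y, codebook, S):
    """Index of the codeword x minimizing (y - x)^T S (y - x).

    Assumption: this quadratic-form rule stands in for the generalized
    minimal distance decoder mentioned in the abstract.
    """
    dists = [quad_form([yi - xi for yi, xi in zip(y, x)], S) for x in codebook]
    return min(range(len(codebook)), key=dists.__getitem__)

def empirical_error(codebook, S, noise_samples):
    """Fraction of (codeword, noise sample) pairs that are decoded incorrectly."""
    errors = 0
    for j, x in enumerate(codebook):
        for z in noise_samples:
            y = [xi + zi for xi, zi in zip(x, z)]  # additive noise channel
            if decode(y, codebook, S) != j:
                errors += 1
    return errors / (len(codebook) * len(noise_samples))

# Illustrative usage with a four-point codebook and synthetic noise samples.
rng = random.Random(0)
codebook = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # S = I gives the plain Euclidean rule
noise = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(200)]
err = empirical_error(codebook, identity, noise)
```

The Gaussian noise here is only a stand-in for data; in the setting of the abstract the noise samples would come from the unknown channel, and the codebook and `S` would be the quantities being learned.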

Original language: English
Pages (from-to): 1886-1921
Number of pages: 36
Journal: IEEE Transactions on Information Theory
Issue number: 3
State: Published - 1 Mar 2022
Externally published: Yes


  • Additive noise channels
  • Alternating optimization algorithm
  • Entropy estimation
  • Expurgation
  • Generalization bounds
  • Gibbs algorithm
  • Hinge loss
  • Minimal distance decoding
  • Mismatch decoding
  • Statistical learning
  • Stochastic gradient descent

