The complexity of approximating entropy

Tukan Batu*, Sanjoy Dasgupta, Ravi Kumar, Ronitt Rubinfeld

*Corresponding author for this work

Research output: Contribution to journalConference articlepeer-review

Abstract

We consider the problem of approximating the entropy of a discrete distribution under several models. If the distribution is given explicitly as an array where the i-th location is the probability of the i-th element, then linear time is both necessary and sufficient for approximating the entropy. We consider a model in which the algorithm is given access only to independent samples from the distribution. Here, we show that a γ-multiplicative approximation to the entropy can be obtained in O (n(1+η)/γ(2) poly(log n)) time for distributions with entropy Ω(γ/η), where n is the size of the domain of the distribution and η is an arbitrarily small positive constant. We show that one cannot get a multiplicative approximation to the entropy in general in this model. Even for the class of distributions to which our upper bound applies, we obtain a lower bound of Ω (nmax(1/(2γ2),2/(5γ2-2))). We next consider a hybrid model in which both the explicit distribution as well as independent samples are available. Here, significantly more efficient algorithms can be achieved: a γ-multiplicative approximation to the entropy can be obtained in O (γ2 log2 n/h2(γ-1)2) time for distributions with entropy Ω(h); we show a lower bound of Ω (log n/h(γ2-1)). Finally, we consider two special families of distributions: those for which the probability of an element decreases monotonically in the label of the element, and those that are uniform over a subset of the domain. In each case, we give more efficient algorithms for approximating the entropy.

Original languageEnglish
Pages (from-to)678-687
Number of pages10
JournalConference Proceedings of the Annual ACM Symposium on Theory of Computing
DOIs
StatePublished - 2002
Externally publishedYes
EventProceedings of the 34th Annual ACM Symposium on Theory of Computing - Montreal, Que., Canada
Duration: 19 May 200221 May 2002

Fingerprint

Dive into the research topics of 'The complexity of approximating entropy'. Together they form a unique fingerprint.

Cite this