TY - GEN
T1 - Sublinear algorithms for approximating string compressibility
AU - Raskhodnikova, Sofya
AU - Ron, Dana
AU - Rubinfeld, Ronitt
AU - Smith, Adam
PY - 2007
Y1 - 2007
N2 - We raise the question of approximating the compressibility of a string with respect to a fixed compression scheme, in sublinear time. We study this question in detail for two popular lossless compression schemes: run-length encoding (RLE) and Lempel-Ziv (LZ), and present sublinear algorithms for approximating compressibility with respect to both schemes. We also give several lower bounds that show that our algorithms for both schemes cannot be improved significantly. Our investigation of LZ yields results whose interest goes beyond the initial questions we set out to study. In particular, we prove combinatorial structural lemmas that relate the compressibility of a string with respect to Lempel-Ziv to the number of distinct short substrings contained in it. In addition, we show that approximating the compressibility with respect to LZ is related to approximating the support size of a distribution.
AB - We raise the question of approximating the compressibility of a string with respect to a fixed compression scheme, in sublinear time. We study this question in detail for two popular lossless compression schemes: run-length encoding (RLE) and Lempel-Ziv (LZ), and present sublinear algorithms for approximating compressibility with respect to both schemes. We also give several lower bounds that show that our algorithms for both schemes cannot be improved significantly. Our investigation of LZ yields results whose interest goes beyond the initial questions we set out to study. In particular, we prove combinatorial structural lemmas that relate the compressibility of a string with respect to Lempel-Ziv to the number of distinct short substrings contained in it. In addition, we show that approximating the compressibility with respect to LZ is related to approximating the support size of a distribution.
UR - http://www.scopus.com/inward/record.url?scp=38049015443&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-74208-1_44
DO - 10.1007/978-3-540-74208-1_44
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:38049015443
SN - 9783540742074
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 609
EP - 623
BT - Approximation, Randomization, and Combinatorial Optimization
PB - Springer Verlag
T2 - 10th International Workshop on Approximation Algorithms for Combinatorial Optimization Problems, APPROX 2007 and 11th International Workshop on Randomization and Computation, RANDOM 2007
Y2 - 20 August 2007 through 22 August 2007
ER -