TY - JOUR

T1 - Logarithmically Larger Deletion Codes of All Distances

AU - Alon, Noga

AU - Bourla, Gabriela

AU - Graham, Ben

AU - He, Xiaoyu

AU - Kravitz, Noah

N1 - Publisher Copyright: © 2023 IEEE.

PY - 2024/1/1

Y1 - 2024/1/1

N2 - The deletion distance between two binary words u, v ∈ {0, 1}^n is the smallest k such that u and v share a common subsequence of length n − k. A set C of binary words of length n is called a k-deletion code if every pair of distinct words in C has deletion distance greater than k. In 1965, Levenshtein initiated the study of deletion codes by showing that, for k ≥ 1 fixed and n going to infinity, a k-deletion code C ⊆ {0, 1}^n of maximum size satisfies Ω_k(2^n/n^{2k}) ≤ |C| ≤ O_k(2^n/n^k). We make the first asymptotic improvement to these bounds by showing that there exist k-deletion codes with size at least Ω_k(2^n log n/n^{2k}). Our proof is inspired by Jiang and Vardy’s improvement to the classical Gilbert–Varshamov bounds. We also establish several related results on the number of longest common subsequences and shortest common supersequences of a pair of words with given length and deletion distance.

AB - The deletion distance between two binary words u, v ∈ {0, 1}^n is the smallest k such that u and v share a common subsequence of length n − k. A set C of binary words of length n is called a k-deletion code if every pair of distinct words in C has deletion distance greater than k. In 1965, Levenshtein initiated the study of deletion codes by showing that, for k ≥ 1 fixed and n going to infinity, a k-deletion code C ⊆ {0, 1}^n of maximum size satisfies Ω_k(2^n/n^{2k}) ≤ |C| ≤ O_k(2^n/n^k). We make the first asymptotic improvement to these bounds by showing that there exist k-deletion codes with size at least Ω_k(2^n log n/n^{2k}). Our proof is inspired by Jiang and Vardy’s improvement to the classical Gilbert–Varshamov bounds. We also establish several related results on the number of longest common subsequences and shortest common supersequences of a pair of words with given length and deletion distance.

KW - Deletion codes

KW - longest common subsequence

KW - probabilistic combinatorics

UR - http://www.scopus.com/inward/record.url?scp=85167805763&partnerID=8YFLogxK

U2 - 10.1109/TIT.2023.3304565

DO - 10.1109/TIT.2023.3304565

M3 - Article

AN - SCOPUS:85167805763

SN - 0018-9448

VL - 70

SP - 125

EP - 130

JO - IEEE Transactions on Information Theory

JF - IEEE Transactions on Information Theory

IS - 1

ER -