TY - GEN
T1 - Sample-Based Distance-Approximation for Subsequence-Freeness
AU - Cohen-Sidon, Omer
AU - Ron, Dana
N1 - Publisher Copyright:
© Omer Cohen Sidon and Dana Ron.
PY - 2023/7
Y1 - 2023/7
AB - In this work, we study the problem of approximating the distance to subsequence-freeness in the sample-based distribution-free model. For a given subsequence (word) $w = w_1 \dots w_k$, a sequence (text) $T = t_1 \dots t_n$ is said to contain $w$ if there exist indices $1 \le i_1 < \dots < i_k \le n$ such that $t_{i_j} = w_j$ for every $1 \le j \le k$. Otherwise, $T$ is $w$-free. Ron and Rosin (ACM TOCT 2022) showed that the number of samples both necessary and sufficient for one-sided error testing of subsequence-freeness in the sample-based distribution-free model is $\Theta(k/\epsilon)$. Denoting by $\Delta(T, w, p)$ the distance of $T$ to $w$-freeness under a distribution $p : [n] \to [0, 1]$, we are interested in obtaining an estimate $\widehat{\Delta}$ such that $|\widehat{\Delta} - \Delta(T, w, p)| \le \delta$ with probability at least $2/3$, for a given distance parameter $\delta$. Our main result is an algorithm whose sample complexity is $\tilde{O}(k^2/\delta^2)$. We first present an algorithm that works when the underlying distribution $p$ is uniform, and then show how it can be modified to work for any (unknown) distribution $p$. We also show that a quadratic dependence on $1/\delta$ is necessary.
KW - Distance Approximation
KW - Property Testing
UR - http://www.scopus.com/inward/record.url?scp=85167335411&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.ICALP.2023.44
DO - 10.4230/LIPIcs.ICALP.2023.44
M3 - Conference contribution
AN - SCOPUS:85167335411
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 50th International Colloquium on Automata, Languages, and Programming, ICALP 2023
A2 - Etessami, Kousha
A2 - Feige, Uriel
A2 - Puppis, Gabriele
PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
T2 - 50th International Colloquium on Automata, Languages, and Programming, ICALP 2023
Y2 - 10 July 2023 through 14 July 2023
ER -