TY - JOUR
T1 - On approximating the number of k-cliques in sublinear time
AU - Eden, Talya
AU - Ron, Dana
AU - Seshadhri, C.
N1 - Publisher Copyright: Copyright © by SIAM.
PY - 2020
Y1 - 2020
N2 - We study the problem of approximating the number of k-cliques in a graph when given query access to the graph. We consider the standard query model for general graphs via (1) degree queries, (2) neighbor queries, and (3) pair queries. Let n denote the number of vertices in the graph, m the number of edges, and C_k the number of k-cliques. We design an algorithm that outputs a (1+ϵ)-approximation (with high probability) for C_k, whose expected query complexity and running time are O(n/C_k^{1/k} + m^{k/2}/C_k) · poly(log n, 1/ϵ, k). Hence, the complexity of the algorithm is sublinear in the size of the graph for C_k = ω(m^{k/2-1}). Furthermore, we prove a lower bound showing that the query complexity of our algorithm is essentially optimal (up to the dependence on log n, 1/ϵ, and k). The previous results in this vein are by Feige [SIAM J. Comput., 35 (2006), pp. 964-984] and by Goldreich and Ron [Random Structures Algorithms, 32 (2008), pp. 473-493] for edge counting (k = 2) and by Eden, Levi, Ron, and Seshadhri [SIAM J. Comput., 46 (2017), pp. 1603-1646] for triangle counting (k = 3). Our result matches the complexities of these results. The previous result by Eden et al. hinges on a certain amortization technique that works only for triangle counting and does not generalize for larger cliques. We obtain a general algorithm that works for any k ≥ 3 by designing a procedure that samples each k-clique incident to one of the vertices of a given set S of vertices with approximately equal probability.
AB - We study the problem of approximating the number of k-cliques in a graph when given query access to the graph. We consider the standard query model for general graphs via (1) degree queries, (2) neighbor queries, and (3) pair queries. Let n denote the number of vertices in the graph, m the number of edges, and C_k the number of k-cliques. We design an algorithm that outputs a (1+ϵ)-approximation (with high probability) for C_k, whose expected query complexity and running time are O(n/C_k^{1/k} + m^{k/2}/C_k) · poly(log n, 1/ϵ, k). Hence, the complexity of the algorithm is sublinear in the size of the graph for C_k = ω(m^{k/2-1}). Furthermore, we prove a lower bound showing that the query complexity of our algorithm is essentially optimal (up to the dependence on log n, 1/ϵ, and k). The previous results in this vein are by Feige [SIAM J. Comput., 35 (2006), pp. 964-984] and by Goldreich and Ron [Random Structures Algorithms, 32 (2008), pp. 473-493] for edge counting (k = 2) and by Eden, Levi, Ron, and Seshadhri [SIAM J. Comput., 46 (2017), pp. 1603-1646] for triangle counting (k = 3). Our result matches the complexities of these results. The previous result by Eden et al. hinges on a certain amortization technique that works only for triangle counting and does not generalize for larger cliques. We obtain a general algorithm that works for any k ≥ 3 by designing a procedure that samples each k-clique incident to one of the vertices of a given set S of vertices with approximately equal probability.
KW - Approximation algorithms
KW - Counting cliques
KW - Sublinear algorithms
UR - http://www.scopus.com/inward/record.url?scp=85091398273&partnerID=8YFLogxK
U2 - 10.1137/18M1176701
DO - 10.1137/18M1176701
M3 - Article
AN - SCOPUS:85091398273
SN - 0097-5397
VL - 49
SP - 747
EP - 771
JO - SIAM Journal on Computing
JF - SIAM Journal on Computing
IS - 4
ER -