TY - JOUR

T1 - On approximating the number of k-cliques in sublinear time

AU - Eden, Talya

AU - Ron, Dana

AU - Seshadhri, C.

N1 - Publisher Copyright: Copyright © by SIAM.

PY - 2020

Y1 - 2020

N2 - We study the problem of approximating the number of k-cliques in a graph when given query access to the graph. We consider the standard query model for general graphs via (1) degree queries, (2) neighbor queries, and (3) pair queries. Let n denote the number of vertices in the graph, m the number of edges, and $C_k$ the number of k-cliques. We design an algorithm that outputs a $(1+\epsilon)$-approximation (with high probability) for $C_k$, whose expected query complexity and running time are $O\left(n/C_k^{1/k} + m^{k/2}/C_k\right) \cdot \mathrm{poly}(\log n, 1/\epsilon, k)$. Hence, the complexity of the algorithm is sublinear in the size of the graph for $C_k = \omega(m^{k/2-1})$. Furthermore, we prove a lower bound showing that the query complexity of our algorithm is essentially optimal (up to the dependence on $\log n$, $1/\epsilon$, and k). The previous results in this vein are by Feige [SIAM J. Comput., 35 (2006), pp. 964-984] and by Goldreich and Ron [Random Structures Algorithms, 32 (2008), pp. 473-493] for edge counting (k = 2) and by Eden, Levi, Ron, and Seshadhri [SIAM J. Comput., 46 (2017), pp. 1603-1646] for triangle counting (k = 3). Our result matches the complexities of these results. The previous result by Eden et al. hinges on a certain amortization technique that works only for triangle counting and does not generalize for larger cliques. We obtain a general algorithm that works for any k ≥ 3 by designing a procedure that samples each k-clique incident to one of the vertices of a given set S of vertices with approximately equal probability.

KW - Approximation algorithms

KW - Counting cliques

KW - Sublinear algorithms

UR - http://www.scopus.com/inward/record.url?scp=85091398273&partnerID=8YFLogxK

U2 - 10.1137/18M1176701

DO - 10.1137/18M1176701

M3 - Article

AN - SCOPUS:85091398273

SN - 0097-5397

VL - 49

SP - 747

EP - 771

JO - SIAM Journal on Computing

JF - SIAM Journal on Computing

IS - 4

ER -