TY - GEN
T1 - Making Progress Based on False Discoveries
AU - Livni, Roi
N1 - Publisher Copyright:
© 2024 Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing. All rights reserved.
PY - 2024/1
Y1 - 2024/1
N2 - We consider Stochastic Convex Optimization as a case-study for Adaptive Data Analysis. A basic question is how many samples are needed in order to compute ε-accurate estimates of O(1/ε^2) gradients queried by gradient descent. We provide two intermediate answers to this question. First, we show that for a general analyst (not necessarily gradient descent) Ω(1/ε^3) samples are required, which is more than the number of samples required to simply optimize the population loss. Our construction builds upon a new lower bound (that may be of interest in its own right) for an analyst that may ask several non-adaptive questions in a batch of fixed and known T rounds of adaptivity and requires a fraction of true discoveries. We show that for such an analyst Ω(√T/ε^2) samples are necessary. Second, we show that, under certain assumptions on the oracle, in an interaction with gradient descent Ω(1/ε^2.5) samples are necessary, which is again suboptimal in terms of optimization. Our assumptions are that the oracle has only first-order access and is post-hoc generalizing. First-order access means that it can only compute the gradients of the sampled function at points queried by the algorithm. Our assumption of post-hoc generalization follows from existing lower bounds for statistical queries. More generally, then, we provide a generic reduction from the standard setting of statistical queries to the problem of estimating gradients queried by gradient descent. Overall, these results are in contrast with classical bounds that show that with O(1/ε^2) samples one can optimize the population risk to accuracy of O(ε) but, as it turns out, with spurious gradients.
AB - We consider Stochastic Convex Optimization as a case-study for Adaptive Data Analysis. A basic question is how many samples are needed in order to compute ε-accurate estimates of O(1/ε^2) gradients queried by gradient descent. We provide two intermediate answers to this question. First, we show that for a general analyst (not necessarily gradient descent) Ω(1/ε^3) samples are required, which is more than the number of samples required to simply optimize the population loss. Our construction builds upon a new lower bound (that may be of interest in its own right) for an analyst that may ask several non-adaptive questions in a batch of fixed and known T rounds of adaptivity and requires a fraction of true discoveries. We show that for such an analyst Ω(√T/ε^2) samples are necessary. Second, we show that, under certain assumptions on the oracle, in an interaction with gradient descent Ω(1/ε^2.5) samples are necessary, which is again suboptimal in terms of optimization. Our assumptions are that the oracle has only first-order access and is post-hoc generalizing. First-order access means that it can only compute the gradients of the sampled function at points queried by the algorithm. Our assumption of post-hoc generalization follows from existing lower bounds for statistical queries. More generally, then, we provide a generic reduction from the standard setting of statistical queries to the problem of estimating gradients queried by gradient descent. Overall, these results are in contrast with classical bounds that show that with O(1/ε^2) samples one can optimize the population risk to accuracy of O(ε) but, as it turns out, with spurious gradients.
KW - Adaptive Data Analysis
KW - Learning Theory
KW - Stochastic Convex Optimization
UR - http://www.scopus.com/inward/record.url?scp=85184136093&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.ITCS.2024.76
DO - 10.4230/LIPIcs.ITCS.2024.76
M3 - Conference contribution
AN - SCOPUS:85184136093
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 15th Innovations in Theoretical Computer Science Conference, ITCS 2024
A2 - Guruswami, Venkatesan
PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
T2 - 15th Innovations in Theoretical Computer Science Conference, ITCS 2024
Y2 - 30 January 2024 through 2 February 2024
ER -