TY - CONF
T1 - A Constructive Prediction of the Generalization Error Across Scales
AU - Rosenfeld, Jonathan S.
AU - Rosenfeld, Amir
AU - Belinkov, Yonatan
AU - Shavit, Nir
N1 - Publisher Copyright: © 2020 8th International Conference on Learning Representations, ICLR 2020. All rights reserved.
PY - 2020
Y1 - 2020
N2 - The dependency of the generalization error of neural networks on model and dataset size is of critical importance both in practice and for understanding the theory of neural networks. Nevertheless, the functional form of this dependency remains elusive. In this work, we present a functional form which approximates well the generalization error in practice. Capitalizing on the successful concept of model scaling (e.g., width, depth), we are able to simultaneously construct such a form and specify the exact models which can attain it across model/data scales. Our construction follows insights obtained from observations conducted over a range of model/data scales, in various model types and datasets, in vision and language tasks. We show that the form both fits the observations well across scales, and provides accurate predictions from small- to large-scale models and data.
UR - http://www.scopus.com/inward/record.url?scp=85150618505&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85150618505
T2 - 8th International Conference on Learning Representations, ICLR 2020
Y2 - 30 April 2020
ER -