TY - CONF
T1 - An Optimization and Generalization Analysis for Max-Pooling Networks
AU - Brutzkus, Alon
AU - Globerson, Amir
N1 - Publisher Copyright:
© 2021 37th Conference on Uncertainty in Artificial Intelligence, UAI 2021. All Rights Reserved.
PY - 2021
Y1 - 2021
N2 - Max-Pooling operations are a core component of deep learning architectures. In particular, they are part of most convolutional architectures used in machine vision, since pooling is a natural approach to pattern detection problems. However, these architectures are not well understood from a theoretical perspective. For example, we do not understand when they can be globally optimized, and what is the effect of over-parameterization on generalization. Here we perform a theoretical analysis of a convolutional max-pooling architecture, proving that it can be globally optimized, and can generalize well even for highly over-parameterized models. Our analysis focuses on a data generating distribution inspired by pattern detection problem, where a “discriminative” pattern needs to be detected among “spurious” patterns. We empirically validate that CNNs significantly outperform fully connected networks in our setting, as predicted by our theoretical results.
AB - Max-Pooling operations are a core component of deep learning architectures. In particular, they are part of most convolutional architectures used in machine vision, since pooling is a natural approach to pattern detection problems. However, these architectures are not well understood from a theoretical perspective. For example, we do not understand when they can be globally optimized, and what is the effect of over-parameterization on generalization. Here we perform a theoretical analysis of a convolutional max-pooling architecture, proving that it can be globally optimized, and can generalize well even for highly over-parameterized models. Our analysis focuses on a data generating distribution inspired by pattern detection problem, where a “discriminative” pattern needs to be detected among “spurious” patterns. We empirically validate that CNNs significantly outperform fully connected networks in our setting, as predicted by our theoretical results.
UR - http://www.scopus.com/inward/record.url?scp=85124332368&partnerID=8YFLogxK
M3 - ???researchoutput.researchoutputtypes.contributiontoconference.paper???
AN - SCOPUS:85124332368
SP - 1650
EP - 1660
T2 - 37th Conference on Uncertainty in Artificial Intelligence, UAI 2021
Y2 - 27 July 2021 through 30 July 2021
ER -