An Optimization and Generalization Analysis for Max-Pooling Networks

Alon Brutzkus, Amir Globerson

Research output: Contribution to conference › Paper › peer-review

13 Scopus citations

Abstract

Max-pooling operations are a core component of deep learning architectures. In particular, they are part of most convolutional architectures used in machine vision, since pooling is a natural approach to pattern detection problems. However, these architectures are not well understood from a theoretical perspective. For example, we do not understand when they can be globally optimized, or what effect over-parameterization has on generalization. Here we perform a theoretical analysis of a convolutional max-pooling architecture, proving that it can be globally optimized and can generalize well even for highly over-parameterized models. Our analysis focuses on a data-generating distribution inspired by pattern detection problems, where a “discriminative” pattern needs to be detected among “spurious” patterns. We empirically validate that CNNs significantly outperform fully connected networks in our setting, as predicted by our theoretical results.
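To make the setting in the abstract concrete, the following is a minimal illustrative sketch, not the authors' exact model or data distribution. It assumes one-hot patterns, with index 0 as the positive discriminative pattern, index 1 as the negative one, and all other indices as spurious patterns. The function names (`make_example`, `forward`) and the choice of fixed +1/-1 output weights over max-pooled ReLU filter responses are assumptions made for illustration only.

```python
import random


def relu(z):
    return max(0.0, z)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def one_hot(i, dim):
    return [1.0 if k == i else 0.0 for k in range(dim)]


def make_example(label, num_patterns, dim, rng):
    """Return a list of patterns: one discriminative, the rest spurious.

    Assumption: index 0 marks the positive class, index 1 the negative
    class, and indices >= 2 are spurious patterns shared by both classes.
    """
    disc = one_hot(0 if label == 1 else 1, dim)
    patterns = [one_hot(rng.randrange(2, dim), dim)
                for _ in range(num_patterns - 1)]
    # Hide the discriminative pattern at a random position.
    patterns.insert(rng.randrange(num_patterns), disc)
    return patterns


def forward(patterns, pos_filters, neg_filters):
    """Convolution (shared filters over patterns), ReLU, max-pool, then +1/-1 sum."""
    pos = sum(max(relu(dot(w, x)) for x in patterns) for w in pos_filters)
    neg = sum(max(relu(dot(u, x)) for x in patterns) for u in neg_filters)
    return pos - neg


rng = random.Random(0)
x_pos = make_example(+1, num_patterns=5, dim=8, rng=rng)
x_neg = make_example(-1, num_patterns=5, dim=8, rng=rng)
# Hand-picked filters that match the two discriminative patterns.
score_pos = forward(x_pos, [one_hot(0, 8)], [one_hot(1, 8)])
score_neg = forward(x_neg, [one_hot(0, 8)], [one_hot(1, 8)])
```

Because the max is taken over all pattern positions, the sign of the output depends only on which discriminative pattern is present, not on where it appears or on which spurious patterns surround it. In this toy version, that position invariance is what separates a pooled convolutional model from a fully connected one.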

Original language: English
Pages: 1650-1660
Number of pages: 11
State: Published - 2021
Event: 37th Conference on Uncertainty in Artificial Intelligence, UAI 2021 - Virtual, Online
Duration: 27 Jul 2021 to 30 Jul 2021
