NEAREST CLASS-CENTER SIMPLIFICATION THROUGH INTERMEDIATE LAYERS

Ido Ben-Shaul, Shai Dekel

Research output: Contribution to journal › Conference article › peer-review

Abstract

Recent advances in neural network theory have introduced geometric properties that emerge during training past the Interpolation Threshold, the point at which the training error reaches zero. We inquire into the phenomenon coined Neural Collapse in the intermediate layers of the network and examine the inner workings of Nearest Class-Center Mismatch inside a deep network. We further show that these processes occur in both vision and language model architectures. Lastly, we propose a Stochastic Variability-Simplification Loss (SVSL) that encourages better geometric features in intermediate layers, yielding improvements in both training metrics and generalization.
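
The abstract names two technical pieces: the Nearest Class-Center (NCC) classifier used to probe intermediate layers, and the proposed SVSL regularizer. Below is a minimal, illustrative PyTorch sketch of both ideas. The network, the function names (SmallNet, ncc_accuracy, svsl_penalty), and the exact form of the penalty are assumptions for illustration, not the authors' implementation; in particular, svsl_penalty is only a plausible within-class variability term applied at a randomly sampled intermediate layer.

# Illustrative sketch only: per-layer NCC probing plus a hypothetical
# SVSL-style penalty. Names and the penalty's exact form are assumptions,
# not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallNet(nn.Module):
    """Toy MLP that exposes its intermediate activations."""
    def __init__(self, dims=(32, 64, 64, 10)):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Linear(dims[i], dims[i + 1]) for i in range(len(dims) - 1)
        )

    def forward(self, x):
        feats = []
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i < len(self.layers) - 1:
                x = F.relu(x)
                feats.append(x)          # intermediate representations
        return x, feats                  # logits + per-layer features

def class_centers(feats, labels, num_classes):
    # Assumes every class appears in the batch (otherwise a mean is NaN).
    return torch.stack(
        [feats[labels == c].mean(dim=0) for c in range(num_classes)]
    )                                    # shape (C, d)

def ncc_accuracy(feats, labels, num_classes):
    """Nearest Class-Center accuracy on one layer's features: assign each
    sample to the class whose feature mean is closest in Euclidean distance."""
    centers = class_centers(feats, labels, num_classes)
    pred = torch.cdist(feats, centers).argmin(dim=1)
    return (pred == labels).float().mean().item()

def svsl_penalty(feats, labels, num_classes):
    """Hypothetical variability-simplification term: mean squared distance
    of each feature vector to its own class center."""
    centers = class_centers(feats, labels, num_classes)
    return ((feats - centers[labels]) ** 2).sum(dim=1).mean()

if __name__ == "__main__":
    torch.manual_seed(0)
    num_classes, net = 10, SmallNet()
    x = torch.randn(256, 32)
    y = torch.randint(0, num_classes, (256,))

    logits, feats = net(x)
    # Stochastic part: sample one intermediate layer per training step.
    layer = torch.randint(len(feats), (1,)).item()
    loss = F.cross_entropy(logits, y) + 0.1 * svsl_penalty(feats[layer], y, num_classes)
    loss.backward()

    for i, f in enumerate(feats):
        print(f"layer {i}: NCC accuracy = {ncc_accuracy(f.detach(), y, num_classes):.3f}")

Per-layer NCC accuracy is the standard probe for intermediate-layer collapse: the better a layer's features cluster around their class centers, the more closely the NCC classifier on that layer agrees with the labels.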

Original language: English
Pages (from-to): 37-47
Number of pages: 11
Journal: Proceedings of Machine Learning Research
Volume: 196
State: Published - 2022
Event: ICML Workshop on Topology, Algebra, and Geometry in Machine Learning, TAG:ML 2022 - Virtual, Online, United States
Duration: 20 Jul 2022 → …
