Recent advances in neural network theory have uncovered geometric properties that emerge during training, past the Interpolation Threshold, where the training error reaches zero. We investigate the phenomenon known as Neural Collapse in the intermediate layers of the network, and examine the inner workings of Nearest Class-Center (NCC) mismatch inside a deep network. We further show that these processes occur in both vision and language model architectures. Lastly, we propose a Stochastic Variability-Simplification Loss (SVSL) that encourages better geometric properties in intermediate layers, yielding improvements in both training metrics and generalization.
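As a rough illustration of the NCC measurement the abstract refers to, the sketch below classifies intermediate-layer features by their nearest class center and reports how often this disagrees with the network's own predictions. This is a minimal, hypothetical sketch with invented names (`ncc_mismatch_rate`, the toy data), not the paper's implementation, and it assumes Euclidean distance to per-class feature means.

```python
import numpy as np

def ncc_mismatch_rate(features, labels, predictions):
    """Fraction of samples where a nearest-class-center (NCC)
    classifier on intermediate features disagrees with the
    network's predictions. Hypothetical helper, not from the paper."""
    classes = np.unique(labels)
    # Class centers: mean feature vector per class.
    centers = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Assign each sample to its nearest center (Euclidean distance).
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    ncc_pred = classes[np.argmin(dists, axis=1)]
    return float(np.mean(ncc_pred != predictions))

# Toy stand-in for intermediate features: two tight clusters,
# mimicking features after (intermediate) collapse.
rng = np.random.default_rng(0)
feats = np.concatenate([rng.normal(0.0, 0.1, (50, 8)),
                        rng.normal(3.0, 0.1, (50, 8))])
labs = np.array([0] * 50 + [1] * 50)
preds = labs.copy()  # pretend the network predicts these labels
print(ncc_mismatch_rate(feats, labs, preds))  # → 0.0 (no mismatch)
```

Under this framing, a mismatch rate near zero at a given layer means an NCC decision rule on that layer's features already agrees with the network's output; tracking this rate across depth is one way to probe where collapse-like structure appears.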
Number of pages: 11
Journal: Proceedings of Machine Learning Research
State: Published - 2022
Event: ICML Workshop on Topology, Algebra, and Geometry in Machine Learning, TAG:ML 2022 - Virtual, Online, United States
Duration: 20 Jul 2022 → …