TY - JOUR
T1 - Robust Large Margin Deep Neural Networks
AU - Sokolić, Jure
AU - Giryes, Raja
AU - Sapiro, Guillermo
AU - Rodrigues, Miguel R.D.
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/8/15
Y1 - 2017/8/15
AB - The generalization error of deep neural networks via their classification margin is studied in this paper. Our approach is based on the Jacobian matrix of a deep neural network and can be applied to networks with arbitrary nonlinearities and pooling layers, and to networks with different architectures such as feed forward networks and residual networks. Our analysis leads to the conclusion that a bounded spectral norm of the network's Jacobian matrix in the neighbourhood of the training samples is crucial for a deep neural network of arbitrary depth and width to generalize well. This is a significant improvement over the current bounds in the literature, which imply that the generalization error grows with either the width or the depth of the network. Moreover, it shows that the recently proposed batch normalization and weight normalization reparametrizations enjoy good generalization properties, and leads to a novel network regularizer based on the network's Jacobian matrix. The analysis is supported with experimental results on the MNIST, CIFAR-10, LaRED, and ImageNet datasets.
KW - Deep learning
KW - deep neural networks
KW - generalization error
KW - robustness
UR - http://www.scopus.com/inward/record.url?scp=85028425694&partnerID=8YFLogxK
DO - 10.1109/TSP.2017.2708039
M3 - Article
AN - SCOPUS:85028425694
SN - 1053-587X
VL - 65
SP - 4265
EP - 4280
JO - IEEE Transactions on Signal Processing
JF - IEEE Transactions on Signal Processing
IS - 16
M1 - 7934087
ER -