TY - JOUR
T1 - Classification with many classes: Challenges and pluses
AU - Abramovich, Felix
AU - Pensky, Marianna
N1 - Publisher Copyright:
© 2019 Elsevier Inc.
PY - 2019/11
Y1 - 2019/11
AB - The objective of the paper is to study the accuracy of multi-class classification in a high-dimensional setting where the number of classes is also large (the “large L, large p, small n” model). While this problem arises in many practical applications and many techniques have recently been developed for its solution, to the best of our knowledge no rigorous theoretical analysis of this important setup has been provided. The purpose of the present paper is to fill this gap. We consider one of the most common settings, classification of high-dimensional normal vectors, where, unlike standard assumptions, the number of classes may be large. We derive non-asymptotic conditions on the effects of significant features, and lower and upper bounds on the distances between classes, required for successful feature selection and classification with a given accuracy. Furthermore, we study an asymptotic setup where the number of classes diverges with the dimension of the feature space while the number of samples per class is possibly limited. We point out an interesting and, at first glance, somewhat counter-intuitive phenomenon: a large number of classes may be a “blessing” rather than a “curse”, since, in certain settings, the precision of classification can improve as the number of classes grows. This is due to more accurate feature selection: even weaker significant features, which are not strong enough to manifest themselves in a coarse classification, have a stronger impact when shared across the classes as their number increases. We supplement our theoretical investigation with a simulation study and a real-data example, where we again observe the above phenomenon.
KW - Feature selection
KW - High-dimensionality
KW - Misclassification error
KW - Multi-class classification
KW - Sparsity
UR - http://www.scopus.com/inward/record.url?scp=85070858613&partnerID=8YFLogxK
DO - 10.1016/j.jmva.2019.104536
M3 - Article
AN - SCOPUS:85070858613
SN - 0047-259X
VL - 174
JO - Journal of Multivariate Analysis
JF - Journal of Multivariate Analysis
M1 - 104536
ER -