TY - JOUR
T1 - Active Learning via Predictive Normalized Maximum Likelihood Minimization
AU - Shayovitz, Shachar
AU - Feder, Meir
N1 - Publisher Copyright:
© 1963-2012 IEEE.
PY - 2024
Y1 - 2024
N2 - Machine learning systems require massive amounts of labeled training data to achieve high accuracy. Active learning uses feedback to label the most informative data points and can significantly reduce the size of the training set. Many heuristics for selecting data points have been developed in recent years, but they are usually tailored to a specific task, and a general unified framework is lacking. In this work, the individual setting is considered and an active learning criterion is proposed. Motivated by universal source coding, the proposed criterion selects data points that minimize the Predictive Normalized Maximum Likelihood (pNML) regret on an unlabeled test set. It is shown that for binary classification and linear regression, the resulting criterion coincides with well-known active learning criteria and thus represents a unified information-theoretic active learning approach for general hypothesis classes. Finally, it is shown on real data that the proposed criterion outperforms other active learning criteria in terms of sample complexity.
AB - Machine learning systems require massive amounts of labeled training data to achieve high accuracy. Active learning uses feedback to label the most informative data points and can significantly reduce the size of the training set. Many heuristics for selecting data points have been developed in recent years, but they are usually tailored to a specific task, and a general unified framework is lacking. In this work, the individual setting is considered and an active learning criterion is proposed. Motivated by universal source coding, the proposed criterion selects data points that minimize the Predictive Normalized Maximum Likelihood (pNML) regret on an unlabeled test set. It is shown that for binary classification and linear regression, the resulting criterion coincides with well-known active learning criteria and thus represents a unified information-theoretic active learning approach for general hypothesis classes. Finally, it is shown on real data that the proposed criterion outperforms other active learning criteria in terms of sample complexity.
KW - Minimax learning
KW - active learning
KW - individual sequences
KW - universal prediction
UR - http://www.scopus.com/inward/record.url?scp=85194870864&partnerID=8YFLogxK
U2 - 10.1109/TIT.2024.3406926
DO - 10.1109/TIT.2024.3406926
M3 - Article
AN - SCOPUS:85194870864
SN - 0018-9448
VL - 70
SP - 5799
EP - 5810
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
IS - 8
ER -