TY - GEN

T1 - Regret minimization with concept drift

AU - Crammer, Koby

AU - Even-Dar, Eyal

AU - Mansour, Yishay

AU - Vaughan, Jennifer Wortman

PY - 2010

Y1 - 2010

N2 - In standard online learning, the goal of the learner is to maintain an average loss close to the loss of the best-performing function in a fixed class. Classic results show that simple algorithms can achieve an average loss arbitrarily close to that of the best function in retrospect, even when input and output pairs are chosen by an adversary. However, in many real-world applications such as spam prediction and classification of news articles, the best target function may be drifting over time. We introduce a novel model of concept drift in which an adversary is given control of both the distribution over input at each time step and the corresponding labels. The goal of the learner is to maintain an average loss close to the 0/1 loss of the best slowly changing sequence of functions with no more than K large shifts. We provide regret bounds for learning in this model using an (inefficient) reduction to the standard no-regret setting. We then go on to provide and analyze an efficient algorithm for learning d-dimensional hyperplanes with drift. We conclude with some simulations illustrating the circumstances under which this algorithm outperforms other commonly studied algorithms when the target hyperplane is drifting.

UR - http://www.scopus.com/inward/record.url?scp=84860605585&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84860605585

SN - 9780982252925

T3 - COLT 2010 - The 23rd Conference on Learning Theory

SP - 168

EP - 180

BT - COLT 2010 - The 23rd Conference on Learning Theory

T2 - 23rd Conference on Learning Theory, COLT 2010

Y2 - 27 June 2010 through 29 June 2010

ER -