TY - CONF

T1 - Online learning for global cost functions

AU - Even-Dar, Eyal

AU - Kleinberg, Robert

AU - Mannor, Shie

AU - Mansour, Yishay

PY - 2009

Y1 - 2009

N2 - We consider an online learning setting where at each time step the decision maker has to choose how to distribute the future loss between k alternatives, and then observes the loss of each alternative. Motivated by load balancing and job scheduling, we consider a global cost function (over the losses incurred by each alternative), rather than a summation of the instantaneous losses as done traditionally in online learning. Such global cost functions include the makespan (the maximum over the alternatives) and the Ld norm (over the alternatives). Based on approachability theory, we design an algorithm that guarantees vanishing regret for this setting, where the regret is measured with respect to the best static decision that selects the same distribution over alternatives at every time step. For the special case of makespan cost we devise a simple and efficient algorithm. In contrast, we show that for concave global cost functions, such as Ld norms for d < 1, the worst-case average regret does not vanish.

AB - We consider an online learning setting where at each time step the decision maker has to choose how to distribute the future loss between k alternatives, and then observes the loss of each alternative. Motivated by load balancing and job scheduling, we consider a global cost function (over the losses incurred by each alternative), rather than a summation of the instantaneous losses as done traditionally in online learning. Such global cost functions include the makespan (the maximum over the alternatives) and the Ld norm (over the alternatives). Based on approachability theory, we design an algorithm that guarantees vanishing regret for this setting, where the regret is measured with respect to the best static decision that selects the same distribution over alternatives at every time step. For the special case of makespan cost we devise a simple and efficient algorithm. In contrast, we show that for concave global cost functions, such as Ld norms for d < 1, the worst-case average regret does not vanish.

UR - http://www.scopus.com/inward/record.url?scp=84898078798&partnerID=8YFLogxK

M3 - ???researchoutput.researchoutputtypes.contributiontoconference.paper???

AN - SCOPUS:84898078798

T2 - 22nd Conference on Learning Theory, COLT 2009

Y2 - 18 June 2009 through 21 June 2009

ER -