Learning efficiently with approximate inference via dual losses

Ofer Meshi, David Sontag, Tommi Jaakkola, Amir Globerson

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Many structured prediction tasks involve complex models where inference is computationally intractable, but where it can be well approximated using a linear programming relaxation. Previous approaches to learning for structured prediction (e.g., cutting-plane, subgradient methods, perceptron) repeatedly make predictions for some of the data points. These approaches are computationally demanding because each prediction involves solving a linear program to optimality. We present a scalable algorithm for learning for structured prediction. The main idea is to instead solve the dual of the structured prediction loss. We formulate the learning task as a convex minimization over both the weights and the dual variables corresponding to each data point. As a result, we can begin to optimize the weights even before completely solving any of the individual prediction problems. We show how the dual variables can be efficiently optimized using coordinate descent. Our algorithm is competitive with state-of-the-art methods such as stochastic subgradient and cutting-plane.
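
As an illustration only (not taken from the paper), the sketch below shows one way the joint objective over weights and per-example dual variables can look for a toy two-node pairwise model: a dual-decomposition upper bound on the loss-augmented MAP score is plugged into the hinge loss, and the weights and dual variables are updated together without ever running inference to optimality. The feature maps, step sizes, and the use of plain subgradient steps on the dual variables (rather than the coordinate-descent updates described in the abstract) are assumptions made for brevity.

# Minimal sketch (not the authors' exact algorithm) of the "dual loss" idea:
# replace the loss-augmented MAP inside the structured hinge loss with a
# dual-decomposition upper bound D(theta, delta) that holds for ANY dual
# variables delta, then minimize jointly over the weights and the duals.
# Toy setup: one training example, a two-node / two-state pairwise model with
# hypothetical features; subgradient steps are used for both blocks.
import numpy as np

np.random.seed(0)

K = 2                      # states per node
y_true = np.array([0, 1])  # gold labeling for the single toy example

# Hypothetical feature maps: node scores theta_i(x_i) = f_i . w_node[i, :, x_i],
# edge scores theta_12(x1, x2) = w_edge[x1, x2] (one weight per joint state).
f1, f2 = np.random.randn(3), np.random.randn(3)
w_node = np.zeros((2, 3, K))   # (node, feature, state)
w_edge = np.zeros((K, K))

def node_scores(w_node):
    return f1 @ w_node[0], f2 @ w_node[1]   # each of shape (K,)

def dual_bound(t1, t2, t_edge, d1, d2):
    """Dual-decomposition upper bound on the loss-augmented MAP score:
    node slaves get the duals added, the edge slave gets them subtracted,
    so the bound is valid for any (d1, d2)."""
    ham1 = (np.arange(K) != y_true[0]).astype(float)   # Hamming loss terms
    ham2 = (np.arange(K) != y_true[1]).astype(float)
    s1 = t1 + ham1 + d1
    s2 = t2 + ham2 + d2
    se = t_edge - d1[:, None] - d2[None, :]
    xe = np.unravel_index(se.argmax(), se.shape)
    return s1.max() + s2.max() + se.max(), s1.argmax(), s2.argmax(), xe

lam, eta = 0.1, 0.05
d1, d2 = np.zeros(K), np.zeros(K)          # per-example dual variables

for it in range(200):
    t1, t2 = node_scores(w_node)
    bound, x1, x2, (xe1, xe2) = dual_bound(t1, t2, w_edge, d1, d2)
    # Subgradient of the bound w.r.t. the duals: disagreement between slaves.
    d1 -= eta * (np.eye(K)[x1] - np.eye(K)[xe1])
    d2 -= eta * (np.eye(K)[x2] - np.eye(K)[xe2])
    # Subgradient w.r.t. the weights: slave maximizer features minus gold features.
    grad_node = np.zeros_like(w_node)
    grad_node[0, :, x1] += f1; grad_node[0, :, y_true[0]] -= f1
    grad_node[1, :, x2] += f2; grad_node[1, :, y_true[1]] -= f2
    grad_edge = np.zeros_like(w_edge)
    grad_edge[xe1, xe2] += 1.0; grad_edge[y_true[0], y_true[1]] -= 1.0
    w_node -= eta * (lam * w_node + grad_node)
    w_edge -= eta * (lam * w_edge + grad_edge)

t1, t2 = node_scores(w_node)
bound, *_ = dual_bound(t1, t2, w_edge, d1, d2)
gold = t1[y_true[0]] + t2[y_true[1]] + w_edge[y_true[0], y_true[1]]
print("dual-loss upper bound on the hinge term:", bound - gold)

Because the bound is an upper bound for every setting of the dual variables, each weight update uses a valid (if loose) surrogate loss, which is what lets the weights start moving before any per-example inference problem is solved to optimality.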

Original language: English
Title of host publication: ICML 2010 - Proceedings, 27th International Conference on Machine Learning
Pages: 783-790
Number of pages: 8
State: Published - 2010
Event: 27th International Conference on Machine Learning, ICML 2010 - Haifa, Israel
Duration: 21 Jun 2010 - 25 Jun 2010

Publication series

Name: ICML 2010 - Proceedings, 27th International Conference on Machine Learning

Conference

Conference: 27th International Conference on Machine Learning, ICML 2010
Country/Territory: Israel
City: Haifa
Period: 21/06/10 - 25/06/10
