Abstract
Several authors have suggested viewing boosting as a gradient descent search for a good fit in function space: at each iteration, observations are re-weighted using the gradient of the underlying loss function. We present a weight-decay approach for observation weights that is equivalent to "robustifying" the underlying loss function. At the extreme end of decay, the approach converges to Bagging, which can be viewed as boosting with a linear underlying loss function. We illustrate the practical usefulness of weight decay for improving prediction performance and present an equivalence between one form of weight decay and "Huberizing", a statistical technique for making loss functions more robust.
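The abstract describes re-weighting observations by the gradient of the loss and decaying those weights toward uniformity. Below is a minimal illustrative sketch in Python, assuming exponential loss (AdaBoost-style weights, where the weight equals the gradient magnitude) and a hypothetical power-law decay parameter; the function `observation_weights` and its `decay` parameterization are illustrations only, not the paper's exact formulation.

```python
import numpy as np

def observation_weights(margins, decay=1.0):
    """Decayed observation weights (hypothetical power-decay scheme).

    margins: y_i * F(x_i) for each observation under the current fit F.
    decay:   1.0 recovers plain boosting weights; as decay -> 0 the
             weights flatten toward uniform, i.e. Bagging-style weighting
             (equivalent to a linear loss, whose gradient is constant).
    """
    # For exponential loss, |dL/dF| = exp(-y_i * F(x_i)): misclassified
    # points (negative margins) receive the largest weights.
    w = np.exp(-np.asarray(margins, dtype=float))
    w = w ** decay  # hypothetical decay step, assumed for illustration
    return w / w.sum()

# Example: well-classified points gain influence back as decay drops.
margins = np.array([2.0, 0.5, -1.0])
print(observation_weights(margins, decay=1.0))   # skewed toward the error
print(observation_weights(margins, decay=0.25))  # partially flattened
print(observation_weights(margins, decay=0.0))   # uniform: Bagging limit
```

At `decay = 0` every observation receives equal weight, matching the abstract's claim that extreme decay converges to Bagging.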
Original language | English
---|---
Pages | 249-255
Number of pages | 7
DOIs |
State | Published - 2005
Externally published | Yes
Event | KDD-2005: 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, IL, United States, 21 Aug 2005 → 24 Aug 2005
Conference
Conference | KDD-2005: 11th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
---|---
Country/Territory | United States
City | Chicago, IL
Period | 21/08/05 → 24/08/05
Keywords
- Bagging
- Boosting
- Robust Fitting