Ranking-based evaluation of regression models

Saharon Rosset, Claudia Perlich, Bianca Zadrozny

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

We suggest the use of ranking-based evaluation measures for regression models, as a complement to the commonly used residual-based evaluation. We argue that in some cases, such as the case study we present, ranking can be the main underlying goal in building a regression model, and ranking performance is then the correct evaluation metric. However, even when ranking is not the contextually correct performance metric, the measures we explore still have significant advantages: they are robust against extreme outliers in the evaluation set, and they are interpretable. The two measures we consider correspond closely to non-parametric correlation coefficients commonly used in data analysis (Spearman's ρ and Kendall's τ), and both have interesting graphical representations which, like ROC curves, offer useful "partial" views of model performance in addition to a one-number summary given by the area under the curve. We illustrate our methods on a case study of evaluating IT Wallet size estimation models for IBM's customers.
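
The robustness claim in the abstract can be illustrated with a small sketch. It is not the authors' implementation, and their measures are only said to "correspond closely" to these coefficients. The code below computes plain Spearman's ρ and Kendall's τ (assuming no ties) next to RMSE, on made-up numbers, to show how a single extreme outlier in the evaluation set affects each. All function names and data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """Return rank positions (1 = smallest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman_rho(y_true, y_pred):
    """Spearman's rho via the no-ties formula 1 - 6*sum(d^2) / (n*(n^2-1))."""
    n = len(y_true)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(y_true), ranks(y_pred)))
    return 1 - 6 * d2 / (n * (n * n - 1))


def kendall_tau(y_true, y_pred):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(y_true)
    score = 0
    for i, j in combinations(range(n), 2):
        prod = (y_true[i] - y_true[j]) * (y_pred[i] - y_pred[j])
        score += (prod > 0) - (prod < 0)
    return score / (n * (n - 1) / 2)


def rmse(y_true, y_pred):
    """Root mean squared error, a typical residual-based measure."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


# Illustrative data only: predictions swap the order of the 2nd and 3rd cases.
y_true = [10.0, 20.0, 30.0, 40.0, 50.0]
y_pred = [12.0, 33.0, 18.0, 39.0, 52.0]

# Same evaluation set, but the largest actual value is an extreme outlier.
# The model still ranks that case highest.
y_true_outlier = y_true[:-1] + [5000.0]

for label, actual in [("clean", y_true), ("outlier", y_true_outlier)]:
    print(
        label,
        "rho=%.3f" % spearman_rho(actual, y_pred),
        "tau=%.3f" % kendall_tau(actual, y_pred),
        "rmse=%.1f" % rmse(actual, y_pred),
    )
```

In this toy setting the ranking measures are unchanged by the outlier because the ordering of the cases is unchanged, while RMSE is dominated by the single large residual.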

Original language: English
Title of host publication: Proceedings - Fifth IEEE International Conference on Data Mining, ICDM 2005
Pages: 370-377
Number of pages: 8
State: Published - 2005
Externally published: Yes
Event: 5th IEEE International Conference on Data Mining, ICDM 2005 - Houston, TX, United States
Duration: 27 Nov 2005 to 30 Nov 2005

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print): 1550-4786

Conference

Conference: 5th IEEE International Conference on Data Mining, ICDM 2005
Country/Territory: United States
City: Houston, TX
Period: 27/11/05 to 30/11/05
