TY - GEN

T1 - Data-enhanced predictive modeling for sales targeting

AU - Rosset, Saharon

AU - Lawrence, Richard D.

PY - 2006

Y1 - 2006

N2 - We describe and analyze the idea of data-enhanced predictive modeling (DEM). The term "enhanced" here refers to the case that the data used for modeling is sampled not from the true target population, but from an alternative (closely related) population, from which much larger samples are available. This leads to a "bias-variance" tradeoff, which implies that in some cases, DEM can improve predictive performance on the true target population. We theoretically analyze this tradeoff for the case of linear regression. We illustrate DEM on a problem of sales targeting for a set of software products. The "correct" learning problem is to differentiate non-customers from newly acquired customers. The latter, however, are scarce. We illustrate how we can build better prediction models by using more flexible definitions of interesting targets, which give bigger learning samples.

UR - http://www.scopus.com/inward/record.url?scp=33745449402&partnerID=8YFLogxK

DO - 10.1137/1.9781611972764.62

M3 - Conference contribution

AN - SCOPUS:33745449402

SN - 089871611X

SN - 9780898716115

T3 - Proceedings of the Sixth SIAM International Conference on Data Mining

SP - 569

EP - 573

BT - Proceedings of the Sixth SIAM International Conference on Data Mining

PB - Society for Industrial and Applied Mathematics

T2 - Sixth SIAM International Conference on Data Mining

Y2 - 20 April 2006 through 22 April 2006

ER -