TY - GEN
T1 - Data-enhanced predictive modeling for sales targeting
AU - Rosset, Saharon
AU - Lawrence, Richard D.
PY - 2006
Y1 - 2006
N2 - We describe and analyze the idea of data-enhanced predictive modeling (DEM). The term "enhanced" here refers to the case that the data used for modeling is sampled not from the true target population, but from an alternative (closely related) population, from which much larger samples are available. This leads to a "bias-variance" tradeoff, which implies that in some cases, DEM can improve predictive performance on the true target population. We theoretically analyze this tradeoff for the case of linear regression. We illustrate DEM on a problem of sales targeting for a set of software products. The "correct" learning problem is to differentiate non-customers from newly acquired customers. The latter, however, are scarce. We illustrate how we can build better prediction models by using more flexible definitions of interesting targets, which give bigger learning samples.
AB - We describe and analyze the idea of data-enhanced predictive modeling (DEM). The term "enhanced" here refers to the case that the data used for modeling is sampled not from the true target population, but from an alternative (closely related) population, from which much larger samples are available. This leads to a "bias-variance" tradeoff, which implies that in some cases, DEM can improve predictive performance on the true target population. We theoretically analyze this tradeoff for the case of linear regression. We illustrate DEM on a problem of sales targeting for a set of software products. The "correct" learning problem is to differentiate non-customers from newly acquired customers. The latter, however, are scarce. We illustrate how we can build better prediction models by using more flexible definitions of interesting targets, which give bigger learning samples.
UR - http://www.scopus.com/inward/record.url?scp=33745449402&partnerID=8YFLogxK
U2 - 10.1137/1.9781611972764.62
DO - 10.1137/1.9781611972764.62
M3 - Conference contribution
AN - SCOPUS:33745449402
SN - 089871611X
SN - 9780898716115
T3 - Proceedings of the Sixth SIAM International Conference on Data Mining
SP - 569
EP - 573
BT - Proceedings of the Sixth SIAM International Conference on Data Mining
PB - Society for Industrial and Applied Mathematics
T2 - Sixth SIAM International Conference on Data Mining
Y2 - 20 April 2006 through 22 April 2006
ER -