TY - JOUR
T1 - Using simulated annealing to optimize the feature selection problem in marketing applications
AU - Meiri, Ronen
AU - Zahavi, Jacob
PY - 2006/6/16
Y1 - 2006/6/16
N2 - The feature selection (also called specification) problem is concerned with finding the most influential subset of predictors for a predictive model from a much larger set of potential predictors, which can number in the hundreds. The problem belongs to the realm of combinatorial optimization, where the objective is to find the subset of variables that optimizes the value of some goodness-of-fit function. Due to the dimensionality of the problem, the feature selection problem belongs to the group of NP-hard problems. Most of the available predictors are noisy or redundant and add very little, if anything, to the predictive power of the model. Using all the predictors in the model often results in strong over-fitting and very poor predictions. Constructing a prediction model by checking all possible subsets is impractical because of the computational volume. Looking at the contribution of each predictor separately is not accurate because it ignores the inter-correlations between predictors. As a result, no analytic solution is available for the feature selection problem, and one must resort to heuristics. In this paper we employ the simulated annealing (SA) approach, one of the leading stochastic search methods, for specifying a large-scale linear regression model. The SA results are compared to the results of the more common stepwise regression (SWR) approach for model specification. The models are applied to realistic data sets in database marketing. We also use simulated data sets to investigate what data characteristics make the SWR approach equivalent to the supposedly superior SA approach.
AB - The feature selection (also called specification) problem is concerned with finding the most influential subset of predictors for a predictive model from a much larger set of potential predictors, which can number in the hundreds. The problem belongs to the realm of combinatorial optimization, where the objective is to find the subset of variables that optimizes the value of some goodness-of-fit function. Due to the dimensionality of the problem, the feature selection problem belongs to the group of NP-hard problems. Most of the available predictors are noisy or redundant and add very little, if anything, to the predictive power of the model. Using all the predictors in the model often results in strong over-fitting and very poor predictions. Constructing a prediction model by checking all possible subsets is impractical because of the computational volume. Looking at the contribution of each predictor separately is not accurate because it ignores the inter-correlations between predictors. As a result, no analytic solution is available for the feature selection problem, and one must resort to heuristics. In this paper we employ the simulated annealing (SA) approach, one of the leading stochastic search methods, for specifying a large-scale linear regression model. The SA results are compared to the results of the more common stepwise regression (SWR) approach for model specification. The models are applied to realistic data sets in database marketing. We also use simulated data sets to investigate what data characteristics make the SWR approach equivalent to the supposedly superior SA approach.
KW - Database marketing
KW - Feature selection
KW - Simulated annealing
KW - Stepwise regression
UR - http://www.scopus.com/inward/record.url?scp=31144448615&partnerID=8YFLogxK
U2 - 10.1016/j.ejor.2004.09.010
DO - 10.1016/j.ejor.2004.09.010
M3 - Article
AN - SCOPUS:31144448615
SN - 0377-2217
VL - 171
SP - 842
EP - 858
JO - European Journal of Operational Research
JF - European Journal of Operational Research
IS - 3
ER -