TY - GEN
T1 - Identifying bundles of product options using mutual information clustering
AU - Perlich, Claudia
AU - Rosset, Saharon
PY - 2007
Y1 - 2007
N2 - Mass-produced goods tend to be highly standardized in order to maximize manufacturing efficiencies. Some high-value goods with limited production quantities remain much less standardized and each sale can be configured to meet the specific requirements of the customer. In this work we suggest a novel methodology to reduce the number of options for complex product configurations by identifying meaningful sets of options that exhibit strong empirical dependencies in previous customer orders. Our approach explores different measures from statistics and information theory to capture the degree of interdependence between the choices for any pair of product components. We use hierarchical clustering to identify meaningful sets of components that can be combined to decrease the number of unique product specifications and increase production standardization. The focus of our analysis is on the influence of different similarity measures, including chi-squared statistics and versions of mutual information, on the ability of the clustering to find meaningful clusters.
AB - Mass-produced goods tend to be highly standardized in order to maximize manufacturing efficiencies. Some high-value goods with limited production quantities remain much less standardized and each sale can be configured to meet the specific requirements of the customer. In this work we suggest a novel methodology to reduce the number of options for complex product configurations by identifying meaningful sets of options that exhibit strong empirical dependencies in previous customer orders. Our approach explores different measures from statistics and information theory to capture the degree of interdependence between the choices for any pair of product components. We use hierarchical clustering to identify meaningful sets of components that can be combined to decrease the number of unique product specifications and increase production standardization. The focus of our analysis is on the influence of different similarity measures, including chi-squared statistics and versions of mutual information, on the ability of the clustering to find meaningful clusters.
UR - http://www.scopus.com/inward/record.url?scp=70449130373&partnerID=8YFLogxK
U2 - 10.1137/1.9781611972771.35
DO - 10.1137/1.9781611972771.35
M3 - Conference contribution
AN - SCOPUS:70449130373
SN - 9780898716306
T3 - Proceedings of the 7th SIAM International Conference on Data Mining
SP - 390
EP - 397
BT - Proceedings of the 7th SIAM International Conference on Data Mining
PB - Society for Industrial and Applied Mathematics (SIAM)
T2 - 7th SIAM International Conference on Data Mining
Y2 - 26 April 2007 through 28 April 2007
ER -