Polar classification of nominal data

Guy Wolf*, Shachar Harussi, Yaniv Shmueli, Amir Averbuch

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Many modern systems record various types of parameter values. Numerical values are relatively convenient for data analysis tools because there are many methods to measure distances and similarities between them. The application of dimensionality reduction techniques to data sets with such values is also a well-known practice. Nominal (i.e., categorical) values, on the other hand, pose problems for current methods. Above all, there is no meaningful distance between possible nominal values, which are either equal or unequal to each other. Since many dimensionality reduction methods rely on preserving some form of similarity or distance measure, their application to such data sets is not straightforward. We propose a method to cluster such data sets by applying the diffusion maps methodology to them. Our method is based on a distance metric that utilizes the effect of the Boolean nature of similarities between nominal values (i.e., equal or unequal) on the diffusion kernel and, in turn, on the embedded space resulting from its principal components. We use a multi-view approach, analyzing small sets of closely related parameters at a time instead of the whole data set. This way, we achieve a comprehensive understanding of the data set from many points of view.
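As a rough illustration of the general idea in the abstract, the following minimal sketch applies a textbook diffusion-maps embedding to nominal records. It is not the authors' algorithm: the Hamming (mismatch-count) distance stands in for the paper's metric, and the function names `hamming_distances` and `diffusion_map`, the kernel width `eps`, and the toy records are all assumptions made for this example.

```python
import numpy as np


def hamming_distances(records):
    """Pairwise number of mismatching nominal attributes (equal/unequal only)."""
    X = np.asarray(records)
    # Broadcasting compares every row with every other row, attribute by attribute.
    return (X[:, None, :] != X[None, :, :]).sum(axis=2).astype(float)


def diffusion_map(D, eps=2.0, n_components=2, t=1):
    """Generic diffusion-maps embedding from a pairwise distance matrix D.

    eps (kernel width) and t (diffusion time) are illustrative defaults.
    """
    K = np.exp(-(D ** 2) / eps)          # Gaussian kernel on the distances
    d = K.sum(axis=1)                    # row sums (degrees)
    # Symmetric conjugate of the row-stochastic matrix P = diag(d)^-1 K.
    A = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(A)       # eigenvalues in ascending order
    order = np.argsort(vals)[::-1]       # reorder to descending
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(d)[:, None]     # right eigenvectors of P
    # Drop the trivial constant eigenvector; scale the rest by eigenvalue^t.
    return psi[:, 1:n_components + 1] * vals[1:n_components + 1] ** t


# Hypothetical toy records: two groups that share most attribute values.
records = [
    ["red", "small", "a"],
    ["red", "small", "b"],
    ["red", "small", "a"],
    ["blue", "large", "c"],
    ["blue", "large", "d"],
    ["blue", "large", "c"],
]
coords = diffusion_map(hamming_distances(records))
```

In this toy data set the first diffusion coordinate takes opposite signs on the two groups, so a simple sign threshold separates them. A multi-view variant could, under the same assumptions, call `hamming_distances` on a subset of columns at a time and embed each subset separately.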

Original language: English
Title of host publication: Numerical Methods for Differential Equations, Optimization, and Technological Problems
Editors: Sergey Repin, Timo Tiihonen, Tero Tuovinen
Publisher: Springer Netherlands
Pages: 253-271
Number of pages: 19
ISBN (Print): 9789400752870
State: Published - 2013
Event: ECCOMAS Thematic Conference Computational Analysis and Optimization, CAO 2011 - Jyvaskyla, Finland
Duration: 9 Jun 2011 to 11 Jun 2011

Publication series

Name: Computational Methods in Applied Sciences
Volume: 27
ISSN (Print): 1871-3033

Conference

Conference: ECCOMAS Thematic Conference Computational Analysis and Optimization, CAO 2011
Country/Territory: Finland
City: Jyvaskyla
Period: 9/06/11 to 11/06/11

Keywords

  • Clustering
  • Diffusion maps
  • Nominal data
  • Unsupervised learning
