TY - JOUR
T1 - Coarse-grained localized diffusion
AU - Wolf, Guy
AU - Rotbart, Aviv
AU - David, Gil
AU - Averbuch, Amir
N1 - Funding Information: This research was partially supported by the Israel Science Foundation (Grant No. 1041/10). The first author was also supported by the Eshkol Fellowship from the Israeli Ministry of Science & Technology.
PY - 2012/11
Y1 - 2012/11
N2 - Data-analysis methods are now expected to handle increasingly large amounts of data. Such massive datasets often contain many redundancies. One effect of these redundancies is the high dimensionality of datasets, which is handled by dimensionality reduction techniques. Another effect is the duplication of very similar observations (or data-points), which can be analyzed together as a cluster. We propose an approach for dealing with both effects by coarse-graining the popular Diffusion Maps (DM) dimensionality reduction framework from the data-point level to the cluster level. This way, the size of the analyzed dataset is decreased by referring only to clusters instead of individual data-points. Then, the dimensionality of the dataset can be decreased by the DM embedding. We show that the essential properties (e.g., ergodicity) of the underlying diffusion process of DM are preserved by the coarse-graining. The affinity that is generated by the coarse-grained process, which we call Localized Diffusion Process (LDP), is strongly related to the recently introduced Localized Diffusion Folders (LDF) hierarchical clustering algorithm [G. David, A. Averbuch, Hierarchical data organization, clustering and denoising via localized diffusion folders, Appl. Comput. Harmon. Anal. (2011), in press]. We show that the LDP coarse-graining is in fact equivalent to the affinity-pruning that is achieved at each folder-level in the LDF hierarchy.
KW - Coarse-graining
KW - Diffusion maps
KW - Dimensionality reduction
KW - Localized diffusion folders
UR - http://www.scopus.com/inward/record.url?scp=84864440338&partnerID=8YFLogxK
U2 - 10.1016/j.acha.2012.02.004
DO - 10.1016/j.acha.2012.02.004
M3 - Article
AN - SCOPUS:84864440338
SN - 1063-5203
VL - 33
SP - 388
EP - 400
JO - Applied and Computational Harmonic Analysis
JF - Applied and Computational Harmonic Analysis
IS - 3
ER -