Euclidean embedding of co-occurrence data

Amir Globerson, Gal Chechik, Fernando Pereira, Naftali Tishby

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Embedding algorithms search for low-dimensional structure in complex data, but most algorithms only handle objects of a single type for which pairwise distances are specified. This paper describes a method for embedding objects of different types, such as images and text, into a single common Euclidean space based on their co-occurrence statistics. The joint distributions are modeled as exponentials of Euclidean distances in the low-dimensional embedding space, which links the problem to convex optimization over positive semidefinite matrices. The local structure of our embedding corresponds to the statistical correlations via random walks in the Euclidean space. We quantify the performance of our method on two text datasets, and show that it consistently and significantly outperforms standard methods of statistical correspondence modeling, such as multidimensional scaling and correspondence analysis.
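
The modeling idea in the abstract, a joint distribution defined through exponentials of Euclidean distances between embedded points, can be illustrated with a minimal sketch. The specific form used here, p(x, y) proportional to exp(-||phi(x) - psi(y)||^2) with no marginal terms, is an assumption made for illustration; the abstract does not state the exact parameterization, and the names `phi`, `psi`, `joint_from_embeddings`, and `log_likelihood` are hypothetical. The embeddings and counts are random placeholders, not data from the paper.

```python
import numpy as np


def joint_from_embeddings(phi, psi):
    """Model joint distribution from two sets of embedded points.

    phi: (n_x, d) embeddings of objects of the first type (e.g. documents).
    psi: (n_y, d) embeddings of objects of the second type (e.g. words).
    Assumed form (illustrative only): p(x, y) ∝ exp(-||phi_x - psi_y||^2).
    """
    diff = phi[:, None, :] - psi[None, :, :]   # (n_x, n_y, d) pairwise differences
    sq_dist = np.sum(diff ** 2, axis=-1)       # squared Euclidean distances
    unnorm = np.exp(-sq_dist)
    return unnorm / unnorm.sum()               # normalize over all (x, y) pairs


def log_likelihood(counts, phi, psi):
    """Average log-likelihood of co-occurrence counts under the model."""
    p_model = joint_from_embeddings(phi, psi)
    p_emp = counts / counts.sum()              # empirical joint distribution
    return float(np.sum(p_emp * np.log(p_model)))


# Placeholder inputs: 4 objects of one type, 3 of another, in a 2-D space.
rng = np.random.default_rng(0)
phi = rng.normal(size=(4, 2))
psi = rng.normal(size=(3, 2))
counts = rng.integers(1, 10, size=(4, 3))

p = joint_from_embeddings(phi, psi)
ll = log_likelihood(counts, phi, psi)
```

Under this assumed form, pairs that co-occur often are pulled close together when the likelihood is maximized over `phi` and `psi`, since nearby points receive higher joint probability. The optimization itself is not shown here.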

Original language: English
Title of host publication: Advances in Neural Information Processing Systems 17 - Proceedings of the 2004 Conference, NIPS 2004
Publisher: Neural information processing systems foundation
ISBN (Print): 0262195348, 9780262195348
State: Published - 2005
Externally published: Yes
Event: 18th Annual Conference on Neural Information Processing Systems, NIPS 2004 - Vancouver, BC, Canada
Duration: 13 Dec 2004 – 16 Dec 2004

Publication series

Name: Advances in Neural Information Processing Systems
ISSN (Print): 1049-5258

Conference

Conference: 18th Annual Conference on Neural Information Processing Systems, NIPS 2004
Country/Territory: Canada
City: Vancouver, BC
Period: 13/12/04 – 16/12/04
