TY - GEN
T1 - Pair distance distribution
T2 - 1st Workshop on Representation Learning for NLP, Rep4NLP 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
AU - Ramni, Yonatan
AU - Maimon, Oded
AU - Khmelnitsky, Evgeni
N1 - Publisher Copyright:
© 2016 Proceedings of the Annual Meeting of the Association for Computational Linguistics. All Rights Reserved.
PY - 2016
Y1 - 2016
N2 - We introduce PDD (Pair Distance Distribution), a novel corpus-based model of semantic representation. Most corpus-based models are VSMs (Vector Space Models), which, while successful, suffer from both practical and theoretical shortcomings. VSM models produce very large, sparse matrices, and dimensionality reduction is usually performed, leading to high computational complexity and obscuring the meaning of the dimensions. Similarity in VSMs is constrained to be both symmetric and transitive, contrary to evidence from human subject tests. PDD is feature-based, created automatically from corpora without producing large, sparse matrices. The dimensions along which words are compared are meaningful, enabling better understanding of the model and providing an explanation as to how any two words are similar. Similarity is neither symmetric nor transitive. The model achieved accuracy of 97.6% on a published semantic similarity test.
AB - We introduce PDD (Pair Distance Distribution), a novel corpus-based model of semantic representation. Most corpus-based models are VSMs (Vector Space Models), which, while successful, suffer from both practical and theoretical shortcomings. VSM models produce very large, sparse matrices, and dimensionality reduction is usually performed, leading to high computational complexity and obscuring the meaning of the dimensions. Similarity in VSMs is constrained to be both symmetric and transitive, contrary to evidence from human subject tests. PDD is feature-based, created automatically from corpora without producing large, sparse matrices. The dimensions along which words are compared are meaningful, enabling better understanding of the model and providing an explanation as to how any two words are similar. Similarity is neither symmetric nor transitive. The model achieved accuracy of 97.6% on a published semantic similarity test.
UR - http://www.scopus.com/inward/record.url?scp=85121276043&partnerID=8YFLogxK
U2 - 10.18653/v1/w16-1621
DO - 10.18653/v1/w16-1621
M3 - Conference contribution
AN - SCOPUS:85121276043
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 184
EP - 192
BT - Proceedings of the 1st Workshop on Representation Learning for NLP, Rep4NLP 2016 at the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
A2 - Blunsom, Phil
A2 - Cho, Kyunghyun
A2 - Cohen, Shay
A2 - Grefenstette, Edward
A2 - Hermann, Karl Moritz
A2 - Rimell, Laura
A2 - Weston, Jason
A2 - Yih, Scott Wen-Tau
PB - Association for Computational Linguistics (ACL)
Y2 - 11 August 2016
ER -