Selective sharing for multilingual dependency parsing

Tahira Naseem, Regina Barzilay, Amir Globerson

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

We present a novel algorithm for multilingual dependency parsing that uses annotations from a diverse set of source languages to parse a new unannotated language. Our motivation is to broaden the advantages of multilingual learning to languages that exhibit significant differences from existing resource-rich languages. The algorithm learns which aspects of the source languages are relevant for the target language and ties model parameters accordingly. The model factorizes the process of generating a dependency tree into two steps: selection of syntactic dependents and their ordering. Being largely languageuniversal, the selection component is learned in a supervised fashion from all the training languages. In contrast, the ordering decisions are only influenced by languages with similar properties. We systematically model this cross-lingual sharing using typological features. In our experiments, the model consistently outperforms a state-of-the-art multilingual parser. The largest improvement is achieved on the non Indo-European languages yielding a gain of 14.4%.

Original languageEnglish
Title of host publication50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Proceedings of the Conference
Pages629-637
Number of pages9
StatePublished - 2012
Externally publishedYes
Event50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Jeju Island, Korea, Republic of
Duration: 8 Jul 201214 Jul 2012

Publication series

Name50th Annual Meeting of the Association for Computational Linguistics, ACL 2012 - Proceedings of the Conference
Volume1

Conference

Conference50th Annual Meeting of the Association for Computational Linguistics, ACL 2012
Country/TerritoryKorea, Republic of
CityJeju Island
Period8/07/1214/07/12

Fingerprint

Dive into the research topics of 'Selective sharing for multilingual dependency parsing'. Together they form a unique fingerprint.

Cite this