TY - GEN
T1 - Span-based semantic parsing for compositional generalization
AU - Herzig, Jonathan
AU - Berant, Jonathan
N1 - Publisher Copyright:
© 2021 Association for Computational Linguistics
PY - 2021
Y1 - 2021
N2 - Despite the success of sequence-to-sequence (seq2seq) models in semantic parsing, recent work has shown that they fail in compositional generalization, i.e., the ability to generalize to new structures built of components observed during training. In this work, we posit that a span-based parser should lead to better compositional generalization. we propose SPANBASEDSP, a parser that predicts a span tree over an input utterance, explicitly encoding how partial programs compose over spans in the input. SPANBASEDSP extends Pasupat et al. (2019) to be comparable to seq2seq models by (i) training from programs, without access to gold trees, treating trees as latent variables, (ii) parsing a class of non-projective trees through an extension to standard CKY. On GEOQUERY, SCAN and CLOSURE datasets, SPANBASEDSP performs similarly to strong seq2seq baselines on random splits, but dramatically improves performance compared to baselines on splits that require compositional generalization: from 61.0 ? 88.9 average accuracy.
AB - Despite the success of sequence-to-sequence (seq2seq) models in semantic parsing, recent work has shown that they fail in compositional generalization, i.e., the ability to generalize to new structures built of components observed during training. In this work, we posit that a span-based parser should lead to better compositional generalization. we propose SPANBASEDSP, a parser that predicts a span tree over an input utterance, explicitly encoding how partial programs compose over spans in the input. SPANBASEDSP extends Pasupat et al. (2019) to be comparable to seq2seq models by (i) training from programs, without access to gold trees, treating trees as latent variables, (ii) parsing a class of non-projective trees through an extension to standard CKY. On GEOQUERY, SCAN and CLOSURE datasets, SPANBASEDSP performs similarly to strong seq2seq baselines on random splits, but dramatically improves performance compared to baselines on splits that require compositional generalization: from 61.0 ? 88.9 average accuracy.
UR - http://www.scopus.com/inward/record.url?scp=85115407376&partnerID=8YFLogxK
U2 - 10.18653/v1/2021.acl-long.74
DO - 10.18653/v1/2021.acl-long.74
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:85115407376
T3 - ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference
SP - 908
EP - 921
BT - ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference
PB - Association for Computational Linguistics (ACL)
T2 - Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021
Y2 - 1 August 2021 through 6 August 2021
ER -