TY - GEN
T1 - Aligned cross entropy for non-autoregressive machine translation
AU - Ghazvininejad, Marjan
AU - Karpukhin, Vladimir
AU - Zettlemoyer, Luke
AU - Levy, Omer
N1 - Publisher Copyright: Copyright 2020 by the author(s).
PY - 2020
Y1 - 2020
N2 - Non-autoregressive machine translation models significantly speed up decoding by allowing for parallel prediction of the entire target sequence. However, modeling word order is more challenging due to the lack of autoregressive factors in the model. This difficulty is compounded during training with cross entropy loss, which can highly penalize small shifts in word order. In this paper, we propose aligned cross entropy (AXE) as an alternative loss function for training of non-autoregressive models. AXE uses a differentiable dynamic program to assign loss based on the best possible monotonic alignment between target tokens and model predictions. AXE-based training of conditional masked language models (CMLMs) substantially improves performance on major WMT benchmarks, while setting a new state of the art for non-autoregressive models.
UR - http://www.scopus.com/inward/record.url?scp=85101715982&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85101715982
T3 - Proceedings of Machine Learning Research
SP - 3515
EP - 3523
BT - Proceedings of the 37th International Conference on Machine Learning
A2 - Daume, Hal
A2 - Singh, Aarti
PB - PMLR
T2 - 37th International Conference on Machine Learning, ICML 2020
Y2 - 13 July 2020 through 18 July 2020
ER -