TY - GEN
T1 - LM-Debugger
T2 - 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
AU - Geva, Mor
AU - Caciularu, Avi
AU - Dar, Guy
AU - Roit, Paul
AU - Sadde, Shoval
AU - Shlain, Micah
AU - Tamir, Bar
AU - Goldberg, Yoav
N1 - Publisher Copyright:
© 2022 Association for Computational Linguistics.
PY - 2022
Y1 - 2022
N2 - The opaque nature and unexplained behavior of transformer-based language models (LMs) have spurred a wide interest in interpreting their predictions. However, current interpretation methods mostly focus on probing models from outside, executing behavioral tests, and analyzing salience input features, while the internal prediction construction process is largely not understood. In this work, we introduce LM-Debugger, an interactive debugger tool for transformer-based LMs, which provides a fine-grained interpretation of the model's internal prediction process, as well as a powerful framework for intervening in LM behavior. For its backbone, LM-Debugger relies on a recent method that interprets the inner token representations and their updates by the feed-forward layers in the vocabulary space. We demonstrate the utility of LM-Debugger for single-prediction debugging, by inspecting the internal disambiguation process done by GPT2. Moreover, we show how easily LM-Debugger allows to shift model behavior in a direction of the user's choice, by identifying a few vectors in the network and inducing effective interventions to the prediction process. We release LM-Debugger as an open-source tool and a demo over GPT2 models.
AB - The opaque nature and unexplained behavior of transformer-based language models (LMs) have spurred a wide interest in interpreting their predictions. However, current interpretation methods mostly focus on probing models from outside, executing behavioral tests, and analyzing salience input features, while the internal prediction construction process is largely not understood. In this work, we introduce LM-Debugger, an interactive debugger tool for transformer-based LMs, which provides a fine-grained interpretation of the model's internal prediction process, as well as a powerful framework for intervening in LM behavior. For its backbone, LM-Debugger relies on a recent method that interprets the inner token representations and their updates by the feed-forward layers in the vocabulary space. We demonstrate the utility of LM-Debugger for single-prediction debugging, by inspecting the internal disambiguation process done by GPT2. Moreover, we show how easily LM-Debugger allows to shift model behavior in a direction of the user's choice, by identifying a few vectors in the network and inducing effective interventions to the prediction process. We release LM-Debugger as an open-source tool and a demo over GPT2 models.
UR - http://www.scopus.com/inward/record.url?scp=85149655827&partnerID=8YFLogxK
M3 - ???researchoutput.researchoutputtypes.contributiontobookanthology.conference???
AN - SCOPUS:85149655827
T3 - EMNLP 2022 - 2022 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Demonstrations Session
SP - 12
EP - 21
BT - EMNLP 2022 - 2022 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Demonstrations Session
A2 - Che, Wanxiang
A2 - Shutova, Ekaterina
PB - Association for Computational Linguistics (ACL)
Y2 - 7 December 2022 through 11 December 2022
ER -