What's in Your Head? Emergent Behaviour in Multi-Task Transformer Models

Mor Geva, Uri Katz, Aviv Ben-Arie, Jonathan Berant

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

5 Scopus citations

Abstract

The primary paradigm for multi-task training in natural language processing is to represent the input with a shared pre-trained language model, and add a small, thin network (head) per task. Given an input, a target head is the head that is selected for outputting the final prediction. In this work, we examine the behaviour of non-target heads, that is, the output of heads when given input that belongs to a different task than the one they were trained for. We find that non-target heads exhibit emergent behaviour, which may either explain the target task, or generalize beyond their original task. For example, in a numerical reasoning task, a span extraction head extracts from the input the arguments to a computation that results in a number generated by a target generative head. In addition, a summarization head that is trained with a target question answering head, outputs query-based summaries when given a question and a context from which the answer is to be extracted. This emergent behaviour suggests that multi-task training leads to nontrivial extrapolation of skills, which can be harnessed for interpretability and generalization.

Original languageEnglish
Title of host publicationEMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings
PublisherAssociation for Computational Linguistics (ACL)
Pages8201-8215
Number of pages15
ISBN (Electronic)9781955917094
StatePublished - 2021
Event2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021 - Virtual, Punta Cana, Dominican Republic
Duration: 7 Nov 202111 Nov 2021

Publication series

NameEMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings

Conference

Conference2021 Conference on Empirical Methods in Natural Language Processing, EMNLP 2021
Country/TerritoryDominican Republic
CityVirtual, Punta Cana
Period7/11/2111/11/21

Funding

FundersFunder number
European Union Horizons 2020 research and innovation programmeDELPHI 802800
Yandex Initiative for Machine Learning
European Research Council

    Fingerprint

    Dive into the research topics of 'What's in Your Head? Emergent Behaviour in Multi-Task Transformer Models'. Together they form a unique fingerprint.

    Cite this