TY - JOUR
T1 - Minimum Description Length Recurrent Neural Networks
AU - Lan, Nur
AU - Geyer, Michal
AU - Chemla, Emmanuel
AU - Katzir, Roni
N1 - Publisher Copyright: © MIT Press Journals. All rights reserved.
PY - 2022/7/27
Y1 - 2022/7/27
N2 - We train neural networks to optimize a Minimum Description Length score, that is, to balance between the complexity of the network and its accuracy at a task. We show that networks optimizing this objective function master tasks involving memory challenges and go beyond context-free languages. These learners master languages such as a^n b^n, a^n b^n c^n, a^n b^{2n}, a^n b^m c^{n+m}, and they perform addition. Moreover, they often do so with 100% accuracy. The networks are small, and their inner workings are transparent. We thus provide formal proofs that their perfect accuracy holds not only on a given test set, but for any input sequence. To our knowledge, no other connectionist model has been shown to capture the underlying grammars for these languages in full generality.
AB - We train neural networks to optimize a Minimum Description Length score, that is, to balance between the complexity of the network and its accuracy at a task. We show that networks optimizing this objective function master tasks involving memory challenges and go beyond context-free languages. These learners master languages such as a^n b^n, a^n b^n c^n, a^n b^{2n}, a^n b^m c^{n+m}, and they perform addition. Moreover, they often do so with 100% accuracy. The networks are small, and their inner workings are transparent. We thus provide formal proofs that their perfect accuracy holds not only on a given test set, but for any input sequence. To our knowledge, no other connectionist model has been shown to capture the underlying grammars for these languages in full generality.
UR - http://www.scopus.com/inward/record.url?scp=85135512589&partnerID=8YFLogxK
U2 - 10.1162/tacl_a_00489
DO - 10.1162/tacl_a_00489
M3 - Article
AN - SCOPUS:85135512589
SN - 2307-387X
VL - 10
SP - 785
EP - 799
JO - Trans. Assoc. Comput. Linguistics
JF - Transactions of the Association for Computational Linguistics
ER -