TY - GEN
T1 - ORB: An Open Reading Benchmark for Comprehensive Evaluation of Machine Reading Comprehension
T2 - 2nd Workshop on Machine Reading for Question Answering, MRQA@EMNLP 2019
AU - Dua, Dheeru
AU - Gottumukkala, Ananth
AU - Talmor, Alon
AU - Singh, Sameer
AU - Gardner, Matt
N1 - Publisher Copyright:
© 2019 MRQA@EMNLP 2019 - Proceedings of the 2nd Workshop on Machine Reading for Question Answering. All rights reserved.
PY - 2019
Y1 - 2019
AB - Reading comprehension is one of the crucial tasks for furthering research in natural language understanding. Many diverse reading comprehension datasets have recently been introduced to study various phenomena in natural language, ranging from simple paraphrase matching and entity typing to entity tracking and understanding the implications of the context. Given the availability of many such datasets, comprehensive and reliable evaluation is tedious and time-consuming for researchers working on this problem. We present an evaluation server, ORB, that reports performance on seven diverse reading comprehension datasets, encouraging and facilitating the testing of a single model's capability to understand a wide variety of reading phenomena. The evaluation server places no restrictions on how models are trained, so it is a suitable test bed for exploring training paradigms and representation learning for general reading facility. As more suitable datasets are released, they will be added to the evaluation server. We also collect and include synthetic augmentations for these datasets, testing how well models can handle out-of-domain questions.
UR - http://www.scopus.com/inward/record.url?scp=85119198060&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85119198060
T3 - MRQA@EMNLP 2019 - Proceedings of the 2nd Workshop on Machine Reading for Question Answering
SP - 147
EP - 153
BT - MRQA@EMNLP 2019 - Proceedings of the 2nd Workshop on Machine Reading for Question Answering
PB - Association for Computational Linguistics (ACL)
Y2 - 4 November 2019
ER -