TY - JOUR
T1 - Learning a generative probabilistic grammar of experience
T2 - A process-level model of language acquisition
AU - Kolodny, Oren
AU - Lotem, Arnon
AU - Edelman, Shimon
N1 - Publisher Copyright:
© 2014 Cognitive Science Society, Inc.
PY - 2015/3/1
Y1 - 2015/3/1
AB - We introduce a set of biologically and computationally motivated design choices for modeling the learning of language, or of other types of sequential, hierarchically structured experience and behavior, and describe an implemented system that conforms to these choices and is capable of unsupervised learning from raw natural-language corpora. Given a stream of linguistic input, our model incrementally learns a grammar that captures its statistical patterns, which can then be used to parse or generate new data. The grammar constructed in this manner takes the form of a directed weighted graph, whose nodes are recursively (hierarchically) defined patterns over the elements of the input stream. We evaluated the model in seventeen experiments, grouped into five studies, which examined, respectively, (a) the generative ability of a grammar learned from a corpus of natural language, (b) the characteristics of the learned representation, (c) sequence segmentation and chunking, (d) artificial grammar learning, and (e) certain types of structure dependence. The model's performance largely vindicates our design choices, suggesting that progress in modeling language acquisition can be made on a broad front, ranging from issues of generativity to the replication of human experimental findings, by bringing biological and computational considerations, as well as lessons from prior efforts, to bear on the modeling approach.
KW - Generative grammar
KW - Grammar of behavior
KW - Graph-based representation
KW - Incremental learning
KW - Language learning
KW - Learning
KW - Linguistic experience
KW - Statistical learning
UR - http://www.scopus.com/inward/record.url?scp=84924759659&partnerID=8YFLogxK
U2 - 10.1111/cogs.12140
DO - 10.1111/cogs.12140
M3 - Article
AN - SCOPUS:84924759659
SN - 0364-0213
VL - 39
SP - 227
EP - 267
JO - Cognitive Science
JF - Cognitive Science
IS - 2
ER -