Waterfall, H.R., Sandbank, B., Onnis, L., & Edelman, S. (in press).
An empirical generative framework for computational modeling of language acquisition. ( pdf )
Journal of Child Language
This paper reports progress in developing a computer model of language acquisition in the form of 1) a generative grammar that is 2) algorithmically learnable from realistic corpus data, 3) viable in its large-scale quantitative performance, and 4) psychologically real. First, we describe new algorithmic methods for unsupervised learning of generative grammars from raw CHILDES data and give an account of the generative performance of the acquired grammars. Next, we summarize findings from recent longitudinal and experimental work that suggests how certain statistically prominent structural properties of child-directed speech may facilitate language acquisition. We then present a series of new analyses of CHILDES data indicating that the desired properties are indeed present in realistic child-directed speech corpora. Finally, we suggest how our computational results, behavioral findings, and corpus-based insights can be integrated into a next-generation model aimed at meeting the four requirements of our modeling framework.

Christiansen, M., Onnis, L., & Hockema, S. (2009).
The secret is in the sound: From unsegmented speech to lexical categories. ( pdf )
Developmental Science, 12(3), 388-395.
When learning language, young children are faced with many seemingly formidable challenges, including discovering words embedded in a continuous stream of sounds and determining what role these words play in syntactic constructions. We suggest that knowledge of phoneme distributions may play a crucial part in helping children segment words and determine their lexical category, and we propose an integrated model of how children might go from unsegmented speech to lexical categories. We corroborated this theoretical model using a two-stage computational analysis of a large corpus of English child-directed speech. First, we used transition probabilities between phonemes to find words in unsegmented speech. Second, we used distributional information about word edges – the beginning and ending phonemes of words – to predict whether the segmented words from the first stage were nouns, verbs, or something else. The results indicate that discovering lexical units and their associated syntactic category in child-directed speech is possible by attending to the statistics of single phoneme transitions and word-initial and final phonemes. Thus, we suggest that a core computational principle in language acquisition is that the same source of information is used to learn about different aspects of linguistic structure.
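As a rough illustration of the first stage described above, the sketch below estimates transition probabilities between adjacent symbols in unsegmented utterances and posits a word boundary wherever the probability dips. It is a minimal sketch under stated assumptions, not the procedure used in the paper: letters stand in for phonemes, the toy corpus is invented, and the fixed `threshold` rule and the function names are hypothetical.

```python
from collections import Counter

def transition_probs(utterances):
    """Estimate P(next symbol | current symbol) from unsegmented utterances."""
    pair_counts = Counter()
    first_counts = Counter()
    for utt in utterances:
        for a, b in zip(utt, utt[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def segment(utt, tp, threshold):
    """Place a boundary wherever the transition probability is below threshold."""
    words, start = [], 0
    for i in range(1, len(utt)):
        # Unseen transitions are treated as probability 0 (an assumption).
        if tp.get((utt[i - 1], utt[i]), 0.0) < threshold:
            words.append(utt[start:i])
            start = i
    words.append(utt[start:])
    return words

# Invented toy corpus: three "words" concatenated in varying orders.
corpus = ["dogcatpig", "catpigdog", "pigdogcat",
          "dogpigcat", "catdogpig", "pigcatdog"]
tp = transition_probs(corpus)
print(segment("catdogcat", tp, threshold=0.9))  # ['cat', 'dog', 'cat']
```

In this toy corpus, transitions inside a word always have probability 1.0, while transitions across word boundaries have probability 0.5 or less, so any threshold between those values recovers the words. Real child-directed speech would not separate so cleanly.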

Onnis, L. & Christiansen, M.H. (2008).
Lexical Categories at the Edge of the Word. ( pdf )
Cognitive Science, 32(1), 184-221.
Language acquisition may be one of the most difficult tasks that children face during development. They have to segment words from fluent speech, figure out the meanings of these words, and discover the syntactic constraints for joining them together into meaningful sentences. Over the past couple of decades, computational modeling has emerged as a new paradigm for gaining insights into the mechanisms by which children may accomplish these feats. Unfortunately, many of these models assume a computational complexity and linguistic knowledge likely to be beyond the abilities of developing young children. This article shows that, using simple statistical procedures, significant correlations exist between the beginnings and endings of a word and its lexical category in English, Dutch, French, and Japanese. Therefore, phonetic information can contribute to individuating higher level structural properties of these languages. This article also presents a simple 2-layer connectionist model that, once trained with an initial small sample of words labeled for lexical category, can infer the lexical category of a large proportion of novel words using only word-edge phonological information, namely the first and last phoneme of a word. The results suggest that simple procedures combined with phonetic information perceptually available to children provide solid scaffolding for emerging lexical categories in language development.

Baroni, M., Lenci, A., & Onnis, L. (2007).
ISA meets Lara: A fully incremental word space model for cognitively plausible simulations of semantic learning. ( pdf )
Proceedings of the 45th Meeting of the Association for Computational Linguistics
We introduce Incremental Semantic Analysis, a fully incremental word space model, and we test it on longitudinal child-directed speech data. On this task, ISA outperforms the related Random Indexing algorithm, as well as a SVD-based technique. In addition, the model has interesting properties that might also be characteristic of the semantic space of children.

Roberts, M., Onnis, L., & Chater, N. (2005).
Language Acquisition and Language Evolution: Two puzzles for the price of one. ( pdf )
Prerequisites for the evolution of language, 334-356.
The quasi-productivity of natural languages appears to pose two difficult problems for language research. Firstly, why do irregularities in natural language not disappear over time, leaving languages completely regular (a transmission problem), and secondly, how did such irregularity arise in the first place (an emergence problem)? To address the transmission problem, we present an artificial, simplicity-based learner capable of acquiring quasi-regular structures. In doing so, we present an explicitly psychological model of a famously problematic aspect of language acquisition known as Baker’s Paradox. We present several simulations of an Iterated Learning Model (ILM) illustrating the emergence and stability of quasi-regular irregularities using a rudimentary language. These simulations offer a possible resolution to the emergence problem. Other possible resolutions are discussed.

Onnis, L., Roberts, M., & Chater, N. (2002).
Simplicity: A cure for overregularizations in language acquisition? ( pdf )
Proceedings of the 24th Conference of the Cognitive Science Society, 720-725.
A formal model of learning as induction, the simplicity principle (e.g. Chater & Vitányi, 2001) states that the cognitive system seeks the hypothesis that provides the briefest representation of the available data, here the linguistic input to the child. Data gathered from the CHILDES database were used as an approximation of the positive input the child receives from adults. We considered linguistic structures that would yield overgeneralization, according to Baker’s paradox (Baker, 1979). A simplicity-based simulation was run incorporating two different hypotheses about the grammar: (1) The child assumes that there are no exceptions to the grammar. This hypothesis leads to overgeneralization. (2) The child assumes that some constructions are not allowed. For small corpora of data, the first hypothesis produced a simpler representation. However, for larger corpora, the second hypothesis was preferred as it led to a shorter input description and eliminated overgeneralization.
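The trade-off between the two hypotheses can be illustrated with a toy two-part description length: the bits needed to state the grammar plus the bits needed to encode the data under it. This is a minimal sketch under stated assumptions, not the simulation reported in the paper: the uniform coding scheme, the numbers of verbs and exceptions, and the function names are all hypothetical, and the data are assumed to contain no disallowed constructions.

```python
import math

def description_lengths(n_obs, n_verbs=10, n_exceptions=2):
    """Return (bits under 'no exceptions', bits under 'some exceptions').

    Assumed setup: each verb can in principle occur in 2 constructions,
    giving 2 * n_verbs verb-construction pairs, coded uniformly.
    """
    n_pairs = 2 * n_verbs
    pair_bits = math.log2(n_pairs)
    # Hypothesis 1: no exceptions to state; every pair is allowed.
    h1 = n_obs * pair_bits
    # Hypothesis 2: pay once to list each exception, then encode each
    # observation over the smaller set of allowed pairs.
    h2 = n_exceptions * pair_bits + n_obs * math.log2(n_pairs - n_exceptions)
    return h1, h2

def preferred(n_obs, **kwargs):
    """Name the hypothesis with the shorter total description."""
    h1, h2 = description_lengths(n_obs, **kwargs)
    return "no_exceptions" if h1 <= h2 else "with_exceptions"

for n in (20, 200):
    print(n, preferred(n))
```

With these toy settings, listing the exceptions costs more than it saves for a corpus of 20 observations, but pays for itself by 200 observations, which mirrors the qualitative pattern described in the abstract.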

Onnis, L. (submitted).
Language-induced constraints on statistical learning: Evidence from Korean and English speakers.
Statistical learning has been indicated as a potentially powerful set of simple mechanisms for inferring language structure from distributional information in the input. Yet it is not clear how statistical learning can be constrained to avoid a combinatorial explosion of hypotheses about the input. We present evidence that learning itself constrains adult statistical learning of novel stimuli. Korean and American speakers exhibited different expectations about the forward and backward transitional probabilities of an artificial grammar, a bias that we ascribe to prior experience with different word order patterns in the two natural languages. Furthermore, although Korean speakers were immersed in an English-speaking environment and had received extensive formal explicit training in English, they exhibited statistical learning biases congruent with their native language. We propose that the same language-induced constraints that afford successful learning in a first language may simultaneously engender difficulties in learning a second language later in life.
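For readers unfamiliar with the distinction, the forward transitional probability of a pair XY is commonly computed as freq(XY) / freq(X), and the backward transitional probability as freq(XY) / freq(Y). The sketch below computes both on an invented toy corpus; the corpus and function name are hypothetical and are not taken from the study.

```python
from collections import Counter

def transitional_probs(sequences):
    """Return (forward, backward) transitional probabilities per adjacent pair."""
    pair, as_first, as_second = Counter(), Counter(), Counter()
    for seq in sequences:
        for x, y in zip(seq, seq[1:]):
            pair[(x, y)] += 1
            as_first[x] += 1    # x seen as the first element of a pair
            as_second[y] += 1   # y seen as the second element of a pair
    forward = {p: n / as_first[p[0]] for p, n in pair.items()}
    backward = {p: n / as_second[p[1]] for p, n in pair.items()}
    return forward, backward

# Invented two-word utterances: a frequent word precedes varied words.
toy = [["the", "dog"], ["the", "cat"], ["the", "pig"], ["a", "dog"]]
forward, backward = transitional_probs(toy)
print(forward[("the", "cat")], backward[("the", "cat")])
```

Here "the" is followed by three different words, so the forward probability of "the cat" is low (1/3), while "cat" is always preceded by "the", so the backward probability is 1.0. Asymmetries of this kind are what make the two measures differentially informative across word orders.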

Onnis, L., Waterfall, H., & Edelman, S. (2008).
Learn locally, act globally: Learning language with variation set cues. ( pdf )
Cognition, 109, 423-430.
Variation set structure – partial overlap of successive utterances in child-directed speech – has been shown to correlate with progress in children’s acquisition of syntax. We demonstrate the benefits of variation set structure directly: in miniature artificial languages, arranging a certain proportion of utterances in a training corpus in variation sets facilitated word and phrase constituent learning in adults. Our findings have implications for understanding the mechanisms of L1 acquisition by children, and for the development of more efficient algorithms for automatic language acquisition, as well as better methods for L2 instruction.
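To make the notion of partial overlap between successive utterances concrete, the sketch below groups adjacent utterances into runs whenever they share at least one word without being identical. It is a simplified illustration under stated assumptions, not the criterion used in the study: the overlap rule, the invented utterances, and the function names are hypothetical, and any shared word (including a function word) counts as overlap here.

```python
def partially_overlaps(u1, u2):
    """True if two utterances share at least one word but are not identical."""
    w1, w2 = u1.split(), u2.split()
    return w1 != w2 and bool(set(w1) & set(w2))

def variation_sets(utterances):
    """Group successive utterances into runs of partially overlapping neighbours."""
    runs, current = [], []
    for utt in utterances:
        if current and partially_overlaps(current[-1], utt):
            current.append(utt)
        else:
            if len(current) > 1:  # keep only runs of two or more utterances
                runs.append(current)
            current = [utt]
    if len(current) > 1:
        runs.append(current)
    return runs

speech = [
    "look at the ball",
    "look at the red ball",
    "the red ball",
    "where is daddy",
    "time for lunch",
    "eat your lunch",
]
sets_found = variation_sets(speech)
proportion = sum(len(run) for run in sets_found) / len(speech)
print(len(sets_found), proportion)
```

The `proportion` value corresponds loosely to the quantity manipulated in the experiments, namely the share of utterances in a corpus that are arranged in variation sets.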

Onnis, L., Baroni, M., Spivey, M., Christiansen, M., & Farmer, T. (2009).
Generalizable distributional regularities aid fluent language processing: The case of semantic valence tendencies. ( pdf )
Italian Journal of Linguistics, 21(2).
We hypothesized that these statistical patterns form units of meaning that imbue lexical items, and their argument structures, with semantic valence tendencies (SVTs), and that such knowledge assists fluent on-line sentence comprehension by facilitating the predictability of upcoming information. First, a sentence completion task elicited such tendencies in adults, suggesting that speakers constrain their free productions to conform to the connotative meaning of words. Second, fluent on-line reading was slowed down significantly in sentences that contained a violation of a valence tendency (e.g. cause optimism). Third, an automated computer algorithm assessed the pervasiveness of valence tendencies in large computerized samples of English, supporting the hypothesis that valence tendencies are a distributional phenomenon. We conclude that not only can aspects of meaning be modeled with word cooccurrence statistics, but that such statistics are likely to be computed by the human brain during the processing of language. They thus simultaneously contribute to our understanding of the use of language and the psychology of language.

Onnis, L., Monaghan, P., Richmond, K., & Chater, N. (2005).
Phonology impacts segmentation in speech processing. ( pdf )
Journal of Memory and Language, 53(2), 225-237.
Peña, Bonatti, Nespor, and Mehler (2002) investigated an artificial language where the structure of words was determined by nonadjacent dependencies between syllables. They found that segmentation of continuous speech could proceed on the basis of these dependencies. However, Peña et al.'s artificial language contained a confound in terms of phonology, in that the dependent syllables began with plosives and the intervening syllables began with continuants. We consider three hypotheses concerning the role of phonology in speech segmentation in this task: (1) participants may recruit probabilistic phonotactic information from their native language to the artificial language learning task; (2) phonetic properties of the stimuli, such as the gaps that precede unvoiced plosives, can influence segmentation; and (3) grouping by phonological similarity between dependent syllables contributes to learning the dependency. In a series of experiments controlling the phonological and statistical structure of the language, we found that segmentation performance is influenced by the three factors to different degrees. Learning of nonadjacent dependencies did not occur when (3) was eliminated. We suggest that phonological processing provides a fundamental contribution to distributional analysis.

Onnis, L., Monaghan, P., Christiansen, M.H., & Chater, N. (2004).
Variability is the spice of learning, and a crucial ingredient for detecting and generalising nonadjacent dependencies. ( pdf )
Proceedings of the 26th Annual Conference of the Cognitive Science Society
An important aspect of language acquisition involves learning the syntactic nonadjacent dependencies that hold between words in sentences, such as subject/verb agreement or tense marking in English. Despite successes in statistical learning of adjacent dependencies, the evidence is not conclusive for learning nonadjacent items. We provide evidence that discovering nonadjacent dependencies is possible through statistical learning, provided it is modulated by the variability of the intervening material between items. We show that generalization to novel syntactic-like categories embedded in nonadjacent dependencies occurs with either zero or large variability. In addition, it can be supported even in more complex learning tasks such as continuous speech, despite earlier failures.

Onnis, L., Christiansen, M., Chater, N., & Gomez, R. (2003).
Reduction of uncertainty in human sequential learning: Evidence from artificial language learning. ( pdf )
Proceedings of the 25th Annual Conference of the Cognitive Science Society, 886-891.
Research on statistical learning in adults and infants has shown that humans are particularly sensitive to statistical properties of the input. Early experiments in artificial grammar learning, for instance, show a sensitivity for transitional n-gram probabilities. It has been argued, however, that this source of information may not help in detecting nonadjacent dependencies in the presence of substantial variability of the intervening material, thus suggesting a different focus of attention involving change versus non-change (Gómez, 2002). Following Gómez's proposal, we contend that alternative sources of information may be attended to simultaneously by learners, in an attempt to reduce uncertainty. With several potential cues in competition, performance crucially depends on which cue is strong enough to be relied upon. By carefully manipulating the statistical environment it is possible to weigh the contribution of each cue. Several implications for the field of statistical learning and language development are drawn.

Onnis, L., and Spivey, M. (submitted).
A new model visualization for the language sciences
We advance a new model conceptualization of the human faculty of language. In analogy with the history of physics, we argue that the sciences of language are clinging to an obsolete model visualization borrowed from box-and-arrows flow charts in the early days of engineering and computer science. This obsolete model assumes that the language faculty is composed of autonomously organized levels of linguistic representation, which in turn are assumed to be modular, organized in rank order of dominance, and feed unidirectionally into one another in stage-like algorithmic procedures. We review relevant literature in psycholinguistics and language acquisition that cannot be accommodated by the received model. Levels of representation in adult language processing and language acquisition appear to be highly integrated and interconnected, and function simultaneously rather than sequentially. Therefore, we submit a new model visualization for language, in which stacked levels of linguistic representation are replaced by trajectories in a multidimensional space. Processing language in the brain equates to traversing such a space in regions afforded by multiple probabilistic cues that simultaneously activate different linguistic representations. We propose new concepts and venues for research that may assist the field in transitioning to a new conceptualization, and provide a clear direction for the next decade.

Goldstein, M., Waterfall, H., Lotem, A., Halpern, J., Schwade, J., Onnis, L., & Edelman, S. (submitted).
General Cognitive Principles for Learning Structure in Time and Space
How are hierarchically structured sequences of objects, events, or actions learned from experience and represented in the brain? When several streams of regularities present themselves, which will be learned and which ignored? Can statistical regularities take effect on their own, or are additional factors such as behavioral outcomes expected to influence statistical learning? Answers to these questions are starting to emerge through a convergence of findings from several disciplines, including naturalistic observations, behavioral experiments, neurobiological studies, and computational analyses and simulations. We propose that a small set of principles is at work in every situation that involves learning of structure from patterns of experience, and we outline a general framework that accounts for such learning.

Christiansen, M., Conway, C., and Onnis, L. (submitted).
The P600 as an Index of Expectation Violations in Language and Statistical Learning
We used event-related potentials (ERPs) to investigate the time course and distribution of brain activity while adults performed (a) a statistical learning task involving sequenced stimuli, and (b) a natural language processing task. The same positive ERP deflection, the P600 effect, typically linked to difficult or ungrammatical syntactic processing, was found for structural incongruencies in both statistical learning and natural language, with similar topographical distributions. Moreover, the within-subject design revealed that the magnitude of the P600 component elicited by the statistical learning task predicted the strength of the P600 effect in the natural language task. These results are interpreted as an indication that the P600 provides an index of violations of expectations for upcoming material when processing complex sequential structure. We conclude that the same neural mechanisms may be recruited for both syntactic processing of linguistic stimuli and statistical learning of sequential patterns more generally.

Christiansen, M.H., Conway, C., & Onnis, L. (2007).
Neural Responses to Structural Incongruencies in Language and Statistical Learning Point to Similar Underlying Mechanisms. ( pdf )
Proceedings of the 29th Annual Meeting of the Cognitive Science Society.
We used event-related potentials (ERPs) to investigate the distribution of brain activity while adults performed (a) a natural language reading task and (b) a statistical learning task involving sequenced stimuli. The same positive ERP deflection, the P600 effect, typically linked to difficult or ungrammatical syntactic processing, was found for structural incongruencies in both natural language and statistical learning, with similar topographical distributions. These results suggest that general learning abilities related to the processing of complex, sequenced material may be implicated in language processing. We conclude that the same neural mechanisms are recruited for both syntactic processing of language stimuli and statistical learning of sequential patterns more generally.
