PERMEABLE MODULES: ON EVOLVING AND ACQUIRING LANGUAGE-SPECIFIC CAPACITIES
Lise Menn, University of Colorado
Ann M. Peters, University of Hawai'i
Stories about beginnings come in only two basic modes. An entity either has an explicit
point of origin...or else it evolves...
(Gould 1991:48)
Introduction
Here at the end of the 20th century, linguists and psycholinguists have increasing
opportunities to contribute to the understanding of how our brains represent and
process language. However, the emerging picture is already too complex to deal with
in terms of conventional notions of discrete language areas and cognitive modules. This is
because we are dealing with an evolved system, not an engineered one - which implies
that it is, like other complex evolved systems, an enormous, wondrous mess. To understand it, we must understand the nature of evolved systems; therefore, we must think about
how evolution works.
The intent of this paper is mostly heuristic: we suggest some ways of thinking about
modularity that will allow us to deal with this wonderful messiness in an evolutionary
context. We will illustrate these approaches by showing how they apply to a relatively simple and very familiar evolved system, the human vocal tract.
We consider the question of modularity in language processing without regard to the
particular theories of grammar that are typically associated with particular views
on modularity. "Autonomy" of syntactic processing and an "autonomous" theory of syntax
are actually very different notions, as Bock (1995) emphasizes, and one's views on one
of them need have no particular connection with one's views on the other. This paper
is intended to be theory-neutral as far as syntax is concerned; the focus is simply
on how a system can have internal structure - internal clumps and layers which may develop
at different times and different rates - and yet still have integrated functioning
in the skilled user.
Psycholinguistics requires a conceptual model for modularity that is subtle enough
to deal with current findings from neurolinguistics, as well as with language evolution.
Rather than adopting a notion of modules as monolithic and impermeable, we consider
a gradient conceptualization of the components of language as complex and interwoven,
with degrees of permeability. The strong notion of a module is a processing component
(transforming some kind of "input" to some kind of "output") whose internal activity is not affected by the state of any other part of the system. The modularity of a
system is the extent to which a system is composed of such self-contained modules;
the modularity of a component is the extent to which its processing is independent
of simultaneously occurring events.
Anatomically, there is no difficulty in realizing a system whose components have differing
degrees of modularity; as Wilkins & Wakefield (1995) point out, this is a matter
of varying the proportion of neural connections that are internal to a component
as compared to the proportion that connects that component to others. The higher the
proportion of internal connections, the more encapsulated the component.
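As a purely illustrative aside, the connection-ratio idea above can be sketched in a few lines of Python. The sketch assumes a toy network in which every connection counts equally; the function name `encapsulation`, the unit labels, and the components "A" and "B" are hypothetical and are not drawn from Wilkins & Wakefield (1995).

```python
from typing import Dict, Iterable, Tuple


def encapsulation(
    connections: Iterable[Tuple[str, str]],
    membership: Dict[str, str],
    component: str,
) -> float:
    """Share of a component's connections that stay inside the component."""
    internal = 0
    external = 0
    for src, dst in connections:
        src_in = membership[src] == component
        dst_in = membership[dst] == component
        if src_in and dst_in:
            internal += 1  # both endpoints lie inside the component
        elif src_in or dst_in:
            external += 1  # the connection crosses the component boundary
    total = internal + external
    return internal / total if total else 0.0


# Hypothetical toy network: units a1-a3 form component "A", b1-b2 form "B".
MEMBERSHIP = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B"}
CONNECTIONS = [("a1", "a2"), ("a2", "a3"), ("a1", "a3"), ("a3", "b1"), ("b1", "b2")]

print(encapsulation(CONNECTIONS, MEMBERSHIP, "A"))  # 0.75
print(encapsulation(CONNECTIONS, MEMBERSHIP, "B"))  # 0.5
```

On this toy measure, component "A" is more encapsulated than "B", yet neither is fully impermeable; the value is a gradient between 0 and 1 rather than a yes/no property.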
An important principle of evolution is that it is characterized by a great deal of
inertia. Nature is, in the classical phrase, "a tinkerer": evolution works on what
is at hand - on the variation inherent in the population under a particular existing
environmental range of conditions. In this view, humans, as well as all our co-inhabitants
of the planet, are contrived from old parts, stretched and twisted and reorganized,
with bits of new tubing and wiring stuck in here and there (for a formal anatomical
exposition, see Wilkins & Wakefield, 1995).
There are two important misconceptions about the evolution of language that we have
found in recent literature. The first is the assumption that, since syntax as it
currently exists is a complex integrated system, it must have been invented "all
at once". The second is the more subtle but related claim - one on which we disagree with Wilkins
& Wakefield - that there must have been a definable moment at which the brain had
reached the point where it was capable of syntactic computations worthy of the name
of Language (cf. Pinker & Bloom, 1990; Wang, 1991).
The modern theory of "punctuated equilibrium" - available to nonspecialists like ourselves
in, for example, Stephen Jay Gould's monthly columns in Natural History magazine - does
not require that enormous linked changes
in the structure and behavior of a species (for example, wings and powered flight)
arise, or even tend to arise, either in tandem or as a result of a single mutation. On the contrary, there can be great asynchrony between changes in structures and
concomitant changes in behaviors: a new structure may make a new behavior possible,
but that behavior may emerge only later. Or, a new environment may bring out a new
behavior, and individuals may then be selected for a structure which makes that behavior
more effective. It follows that the large brain that enables modern human behaviors
and the behaviors themselves did not have to arise either suddenly or simultaneously.
If they indeed arose at different times, as we argue below, we will need to consider
how an integrated system such as syntax could have evolved.
Modularity and evolution of the vocal tract
Consider the human vocal tract and its control mechanisms. The vocal tract is a complex,
integrated, but "lumpy" system involving many elements which have enormously varied
phylogenetic depth and overlaid functions. Its gross anatomical basis is a set of
"old" respiratory and ingestive organs - nose, mouth, and throat - which look roughly
similar in all primates. The vocal tract takes part in many behaviors, most of them
highly automatized, with a rich variety of hard-wired and acquired control loops,
both involuntary and voluntary. The more automatic, hard-wired functions like sneezing
are almost completely introspectively inaccessible and cognitively impenetrable (i.e.
unaffected by knowledge); other functions were acquired at ontogenetically and phylogenetically different times, and are subject to both involuntary and voluntary control.
The entire vocal tract works together smoothly in the normal mature human, but it
may show its seams in cases of brain injury or abnormal development.
Consider the control of airflow in the vocal tract, as outlined in Table 1A. Breathing
is controlled by largely involuntary neural circuitry that responds to at least two
needs: the body's physiological need for oxygen, and involuntary phonation: that
is, involuntary crying, laughing, screaming, sighing, and other emotionally controlled
sounds, driven by the limbic system. The oxygen-supply aspect of the respiratory
control system is evolutionarily very old; more recent, but still at least as old
as the mammals, are yawns, coughs, and sneezes. Yawning and coughing must also involve voluntary
circuits (experienced air travelers can yawn voluntarily).
In humans, the respiratory control apparatus acquires another automatic control capacity:
it learns to produce language-specific prosodic patterns. These appear to the hearer
to be largely in place and automatic by the time a child is one year old, so that we hear the familiar intonation contours of "jargon" prosody. These patterns then
increasingly come under syntactic and pragmatic control over the next several years,
as knowledge of language grows. The syntactic control of prosodic patterns also becomes
automatized: hard to access metalinguistically, and resistant to voluntary control.
We have two kinds of evidence for this: first, fully native-sounding prosody can
rarely be acquired for a second language after middle childhood; second, intonation
contour and timing are difficult for the prelingually profoundly deaf to acquire after early
childhood, even when the normal fundamental frequency range of speech lies within
the range of signals that are made audible with amplification. The syntactic respiratory control mechanism can also be selectively disrupted in adulthood by brain lesions
(leading to the mis-named "foreign accent syndrome").
Tension, depression, amusement, and many other emotional states also have major involuntary
effects on phonatory respiration; depression leads to restricted pitch range, tension
to heightened pitch, and so on (Scherer & Oshinsky, 1977; Williams & Stevens, 1972). Some of these involuntary effects can be controlled with effort, but we have
all experienced others - say, a fit of giggles at a wedding - that cannot be repressed.
In summary, the overall prosodic control mechanism must be in some way specific to
language, must be acquired during a critical window, and varies across languages.
But, it is simultaneously dependent on factors external to language structure: the
respiratory mechanism, and affective social/cognitive interactions. It is under considerable
voluntary control, yet highly susceptible to involuntary influences, only some of
which can be overridden.
TABLE 1. Functions of the vocal tract in evolutionary perspective
A. Respiratory control - airways and diaphragm (a few major functions):
- Land animals:
- Humans, neonate:
    - affective response cries - volume, pitch, duration
    - yawns, coughs, sneezes (to clear respiratory tract)
- Humans, by 1 year:
    - language-specific prosodic patterns, laughing
- Humans, before adulthood:
    - syntactic modulation of prosody
B. Jaw, mouth, tongue movement (a few major functions):
- Complex animals:
- Vertebrates:
- Mammals:
    - chewing
    - affective response cries, facial expression (snarl...)
    - yawns, coughs
    - sucking (until weaned)
- Humans:
    - language-specific babbling patterns
    - lexical/phonological articulatory control
Now let us turn to the control of oral configuration for speech sound articulation,
as outlined in Table 1B. Here again we see a basic set of automatic commands - in
this case, motor commands related to sucking, biting, chewing, swallowing, spitting,
licking, buccal grooming, and buccal components of emotional signalling. We infer that
oral configuration must be responsive to a range of stimuli, processed by a range
of different neural circuits, in all animals that have mouths and guts. From this
basis, the human individual develops, refines, and automatizes speech articulation over the
first seven years or so (Templin, 1957; Irwin & Wong, 1983). This automatic control
is again determined in great detail by the specific language being acquired; and
it is also susceptible to specific disruption by brain damage, as in
acquired dysarthrias.
So linguistic phonation and oral configuration are also controlled by many areas of
the brain that are necessary for their operation, but not exclusively dedicated to it;
some of these control circuits are clearly hard-wired, while others which interact
with them are just as clearly acquired during the first years of life, becoming trained
through internal feedback, and ending up in the adult as fully automatic, and highly
modular, with dedicated neural circuits.
What can we infer about how this vocal tract and its multiple control systems evolved?
Although spoken language is referred to as a communicative function that is overlaid
on the respiratory and ingestive organs, that does not imply that these organs were
simply appropriated and used in their original form. Being used for speech changed
the organs, but the mouth, nose, throat, etc. had to remain good at their original
jobs - after all, we developed no alternative way to eat or to
breathe!
Can we date the evolutionary change which makes speech possible for us, but not for
chimpanzees? The major vocal tract change discernible in the fossil record is an
elongation of the neck, which had not yet taken place in the Neanderthal (Lieberman
and Crelin 1971). This stretching out of the neck greatly increased the size of the pharynx,
which in turn increased the variety of vowels that could be produced. Roughly the
same change is present in the ontogenetic development of each human today between
birth and age two - newborns have almost no neck, as everyone who has bathed one knows.
This increase in speech production capacity comes at considerable cost, however:
an increased risk of inhaling food and choking to death.
The fact that the longer neck came at such a cost has an important implication which
is often not appreciated, although it has been mentioned by Lieberman and others
(Lieberman & Crelin, 1971; Barber & Peters, 1992; Lieberman, 1995). This implication
is that our ancestor whose neck got longer already had a highly vocal communication system
of some sort, even if it was not a Language (having neither syntax nor phonology).
Otherwise, at that point in time, there would have been no compensating payoff for
elongating the neck.1
Bringing the bulk of vocalization under voluntary control required a brain change
that could logically have occurred before or after the neck began to lengthen.2 Current
thinking places it, in fact, much earlier, tying it into the enormous increase in
size of the frontal lobes, as shown from skulls about 250,000 years old (Deacon, 1992).
There is no reason to assume that the payoff for this frontal lobe increase was initially
linguistic; the most likely payoff, from what we know about the functions of the
frontal lobe, would have been increased capacity for long-range planning. Barber and
Peters (1992) argue that early artifacts support such an increase in foresight: core-based
stone tools, which require extensive pre-envisioning of how blades will be struck off from a pre-shaped core, date from 200,000 years ago; and the Lazaret cave, which
was left with a pair of wolf skulls neatly guarding its sheltered entrances, argues
for an active human imagination by 150,000 years ago.
Language ontogeny and evolution
To support our earlier assertion that there is no clear ontogenetic boundary between
pre-language and language, and that therefore there is no reason to expect a sharp
phylogenetic boundary either, we present Table 2, which reviews well-known facts
about the ontogeny of language in terms of language 'design features' (cf. Hockett 1958).
We suggest that a dynamic systems view that sees the 'parts' of language as heterogeneous,
interlocking, and overlaid - just like the 'parts' of the vocal tract and its control systems, but more complex and hooked into many more aspects of cognition -
can help us address both evolutionary and developmental questions. Rather than deciding
whether our ancestors or our children do or do not 'have' a particular function of
the brain, we can consider what kinds of functional precursors could have developed
into the present integrated adult system.
TABLE 2. Design features of the continuum from pre-language to language, in order of
ontogenetic emergence (modified from Barber & Peters 1992, p. 337)
1) Communication, 2-12 months
    Linearity (turn-taking)
    Vocal tract as medium (for hearing infants)
    External feedback, establishment of dynamic stability (imitation)
2) Contrast (discrete symbols), 10-15 months
    Attention, rejection, other pragmatic functions
    Noun-like words, predications
3) Arbitrariness, conventionality, 16-20 months
    Naming explosion
4) Efficiency
    Phonological and syntactic patterning
    Productivity, redundancy
5) Displacement, 22-28 months
6) Recursiveness, perhaps 30 months
A major encouragement for this approach is work in cognitive modeling which demonstrates
that task-specific modularity can develop from an initially undifferentiated processing
system (e.g. Jacobs, Jordan, & Barto, 1991). A relatively undifferentiated neonate cortex might similarly be able to develop a modular structure. Indeed, Posner's
studies of reading (Posner & Carr, 1991) have long supported the viability of the
notion of ontogenetically acquired modularity, and recent models in various cognitive
areas show how automaticity can be acquired through training with internal feedback (Givon,
1995, ch. 9; Markey, 1994). On such a view, the mechanisms supporting a complex hierarchical
structure such as syntactically complex language would not have had to emerge all at once.
We assert further that it is not just unnecessary, but also improbable that language
appeared full-blown, like Athene from the head of Zeus. We support this claim by
considering another property of language development: the intricate interaction between
exposure to language and development of a brain that can process it, within the history
of each individual.
According to widely accepted interpretations of the stories of language-deprived children
(Curtiss, 1977; Johnson & Newport, 1989; Goldin-Meadow, 1979), the human brain can
develop syntactically complex language only if it is exposed fairly early to input
of sufficient complexity. Home sign - the rudimentary signing used in uninstructed
hearing families of deaf children - is not complex enough. Indeed, Christine Yoshinaga-Itano's
research group at the University of Colorado has been finding that outreach to tutor such families in ASL (or its English-adapted variants) gives no guarantee of
linguistic success for the deaf children; the hearing parents may never reach enough
competence in Sign for them to give the children sufficiently complex input. So the
syntactic potential of the modern brain may never be realized in such cases. Furthermore,
Curtiss (1977) indicates that the syntactic potential itself will degenerate and
be lost in the individual, if it is not stimulated to develop.
In order to understand how organisms are shaped by evolution (phylogeny), it is also
necessary to consider the development of the individual (ontogeny). A serious ambiguity
in the term "capacity" can confuse matters here. "Capacity" is sometimes used to
mean what one will have achieved as a mature organism, but at other times to mean what
one could have achieved in all other possible worlds. Because we have neural (and
other kinds of) plasticity, we are born with the potential to develop in different
ways in alternative worlds; that is, we have potentials which will only be realized in particular
environments. In this sense, most of the potential of every organism is latent and
unrealized, since each of us lives in only one of these possible worlds. For example, chimpanzees have the capacity to learn to communicate with signs and lexigrams
(Savage-Rumbaugh, 1986), but they cannot realize that capacity without the special
environment of skilled human intervention (cf. Wilkins & Wakefield, 1995:176). Likewise,
humans require an environment containing a certain level of language use - more than
what some of Yoshinaga-Itano's subjects have been getting - in order to realize their
capacity for language. Apparently, the input must be richer, perhaps as rich as a
pidgin, for syntactically complex language to develop.3 So the first hominids who had
a brain with the capacity for language - that is, who had the developmental potential
for language - could not have actually realized a syntactically complex language,
because there would have been no adequate model for them to hear. This state of affairs
might have persisted for millennia, until at least a pidgin-like level had been reached. The phylogenetic development of language might have been excruciatingly slow,
spiralling up gradually; generations of pre-syntactic speakers might have put together
a few words, gestures, and facial expressions, with recursive structures only at
the discourse level.
We cannot assume that a Great Leap Forward took place even once a level comparable
to recent pidgins was reached, as one might think from reading Bickerton (1981).
Pidgins develop in a world where most of the speakers have command of some other,
fully developed language - although, in the worst case, it is a language that no one around
them knows. Pidgin speakers have therefore become mentally capable of complex syntax
in some language, and so they can push the limited resources of the pidgin to communicate complex ideas, creating patterns of usage which can be grammatized by young learners.
Without that kind of model and that kind of early experience, we suggest that humans
could only have gradually developed to and through the level of syntactic complexity found in pidgins of the modern era. This presumably happened through gradual abbreviation
and automatization, as we see in grammatization of aspects of spoken and signed language
(Traugott & Heine, 1991; Frishberg, 1975) and in the development of writing systems (Daniels & Bright, 1996).
Conclusion
We suggest that psycholinguists should start trying to link up whatever set of 'parts'
they propose, neither as boxes nor as homogenous masses, but as interlocking and
multiply controlled elements, on the analogy of the vocal tract: components partly
hard-wired and partly acquired, partially independent and partially interdependent, having
degrees of autonomy from each other and from other cognitive and pre-cognitive systems.
This is the only way, we think, to cope with the functional imaging reports that
are showing more and more areas of brain activation during language tasks (e.g. Martin,
Wiggs, Ungerleider, & Haxby, 1996) and with reports of localization of category-specific
naming disorders (Warrington & Shallice, 1984; Sheridan &
Humphreys, 1993).
Since the brain is a real-world object and cannot perform language tasks in isolation
from meaning and understanding, models must further consider how and to what extent
conceptually-distinct types of linguistic and real-world information are integrated.
Many specialized, local relations - mini-modules, one might say - may be set up during
the acquisition of skill and knowledge.
If we, as linguists and psycholinguists, do not think clearly and creatively - in
fact, imaginatively and aggressively - about how language may be represented in the
brain, our potential for input to the emerging neurological research paradigms will
simply be seen as irrelevant, and our concerns will be ignored. Surely, Language, like its
subsystem Speech, is a complex, multiply-controlled system whose parts vary in the
nature and richness of their interconnections. We can no longer afford to approach
experimental design and data interpretation with arguments that essentially go "It's modular."
"It isn't." "It is TOO!" "Isn't!" "Phooey on you!"
NOTES
1. Unless the payoff was in some other area, but it's not obvious what that might
be.
2. We might argue that the potential for voluntary phonatory control given by the
larger prefrontal lobes was in fact realized in Neanderthal, and that it helped to
drive the neck elongation. However, any kind of signaling system - continuous (like
pitch range) or discrete (like phonemes), finite (like a small lexicon) or open (with hierarchical
structure and recursion) - can profit from having a greater output range, because
any increase in range will make contrasts among the signals easier to perceive.
3. Recent work on Nicaraguan Sign Language (Morford & Kegl, 1996; Senghas, 1996) may
soon help to specify this threshold further.
REFERENCES
Barber, E. J. W., & Peters, A. M. W. (1992). Ontogeny and phylogeny: What child language
and archaeology have to say to each other. In J. A. Hawkins & M. Gell-Mann (Eds.),
The evolution of human languages (Santa Fe Institute Studies in the Sciences of
Complexity, 9, pp. 305-352). Redwood City, CA: Addison-Wesley.
Bickerton, D. (1981). The roots of language. Ann Arbor: Karoma.
Bock, J. K. (1995). Sentence production: From mind to mouth. In J. L. Miller & P.
D. Eimas (Eds.), Handbook of perception & cognition, Vol. 11: Speech, language, and
communication (pp. 181-216). Orlando: Academic Press.
Curtiss, S. (1977). Genie: A psycholinguistic study of a modern-day "wild child".
New York: Academic Press.
Daniels, P., & Bright, W. (Eds.) (1996). The world's writing systems.
New York: Oxford University Press.
Deacon, T. W. (1992). Brain co-evolution. In J. A. Hawkins & M. Gell-Mann (Eds.),
The evolution of human languages (Santa Fe Institute Studies in the Sciences of
Complexity, 9). Redwood City, CA: Addison-Wesley.
Frishberg, N. (1975). Arbitrariness and iconicity: Historical change in American Sign
Language. Language 51, 696-719.
Givon, T. (1995). Functionalism and grammar,
ch. 9. Amsterdam: Benjamins.
Goldin-Meadow, S. (1979). Structure in a manual communication system developed without
a conventional language model: Language without a helping hand. In H. A. Whitaker
& H. A. Whitaker (Eds.), Studies in neurolinguistics, vol. 4. New York: Academic Press.
Gould, S. J. (1991). The creation myths of Cooperstown. In Bully for Brontosaurus
(pp. 42-58). New York: Norton.
Hockett, C. F. (1958). A course in modern linguistics.
New York: Macmillan.
Irwin, J. V., & Wong, S. P. (Eds.) (1983). Phonological development in children: 18 to
72 months. Carbondale: Southern Illinois University Press.
Jacobs, R. A., Jordan, M. I., & Barto, A. G. (1991). Task decomposition through competition
in a modular connectionist architecture: The what and where vision tasks.
Cognitive Science 15, 219-250.
Johnson, J., & Newport, E. L. (1989). Critical period effects in second language learning:
The influence of maturational state on the acquisition of English as a second language.
Cognitive Psychology 21, 60-99.
Lieberman, P. (1995). Manual versus speech motor control and the evolution of language.
Behavioral and Brain Sciences 18, 197-198.
Lieberman, P., & Crelin, E. S. (1971). On the speech of Neanderthal man.
Linguistic Inquiry 2, 203-222.
Markey, K. L. (1994). The sensorimotor foundations of phonology: A computational model
of early childhood articulatory and phonetic development. Ph.D. thesis. Technical
report CU-CS-752-94. Boulder CO: University of Colorado, Department of Computer Science.
Martin, A., Wiggs, C. L., Ungerleider, L. G., & Haxby, J. V. (1996). Neural correlates
of category-specific knowledge. Nature 379, 649-652.
Morford, J. P., & Kegl, J. (1996) Grammaticization in a newly emerging signed language
in Nicaragua. Fifth International Conference on Theoretical Issues in Sign Language
Research, Montreal, Canada.
Pinker, S., & Bloom, P. (1990). Natural language and natural selection.
Behavioral and Brain Sciences 13, 707-784.
Posner, M. I., & Carr, T. H. (1991). Lexical access and the brain: Anatomical constraints
on models of word recognition. TR91-5, Institute of Cognitive and Decision Sciences,
U. of Oregon.
Savage-Rumbaugh, E. S. (1986). Ape language: From conditioned response to symbol.
New York: Columbia University Press.
Scherer, K. R., & Oshinsky, J. S. (1977). Cue utilization in emotion attribution from
auditory stimuli. Motivation and Emotion 1,
331-346.
Senghas, A. (1996). The creolization of agreement in Nicaraguan Sign Language. Fifth
International Conference on Theoretical Issues in Sign Language Research, Montreal,
Canada.
Sheridan, J., & Humphreys, G. W. (1993). A verbal-semantic category-specific recognition
impairment. Cognitive Neuropsychology 10, 143-184.
Templin, M. C. (1957). Certain language skills in children: Their development and
interrelationships. Institute of Child Welfare Monographs, 26. Minneapolis: University
of Minnesota Press.
Traugott, E. C., & Heine, B. (Eds.) (1991). Approaches to grammaticalization, vol. 1.
Amsterdam: Benjamins.
Wang, W. S-Y. (1991). Explorations in language evolution. In W. S-Y. Wang (Ed.),
Explorations in language (pp. 105-130). Taipei: Pyramid Press.
Warrington, E. K., & Shallice, T. (1984). Category-specific semantic impairments.
Brain 107, 829-854.
Wilkins, W. K., & Wakefield, J. (1995). Brain evolution and neurolinguistic preconditions.
Behavioral and Brain Sciences 18, 161-226.
Williams, C. E., & Stevens, K. N. (1972). Emotions and speech: Some acoustic correlates.
Journal of the Acoustical Society of America 52, 1238-1250.
Yoshinaga-Itano, C. (1997). Personal communication.