FITCH v. BICKERTON: THE DEBATE TO DATE
1. FITCH FIRST ROUND
Introduction
Darwin's "Origin of Species" (Darwin, 1859) made little
mention of human evolution. This initial avoidance of human evolution was no
oversight, but rather a carefully calculated move: Darwin was well aware of the widespread
resistance his theory would meet from scientists, clergymen, and the lay
public, and mention of human evolution might have generated insuperable
opposition. But Darwin's many opponents quickly
seized on the human mind, and language in particular, as a potent weapon in the
battle against Darwin's
new way of thinking. Alfred Wallace, whose independent discovery of the
principle of natural selection spurred Darwin
into finally publishing his long-developing "outline" of the theory
in 1859, didn't help by arguing that natural selection was unable to explain
the origins of the human mind. Although Wallace had reservations about all
evolutionary approaches to the mind, human language provided the most powerful
argument, due to the respectable position of linguistics and philology in
Victorian science.
Darwin's most formidable foe on the
linguistic front was Friedrich Max Müller, professor of linguistics at Oxford University,
a very well-known and well-respected scholar (Stam, 1976). In his
"Lectures on the science of language," delivered at the Royal
Institution of Great Britain in 1861, and rapidly published thereafter (Müller,
1861), Müller launched a full frontal attack on Darwin and Darwinism, using his
credentials in the "science of language" as a powerful bludgeon.
Müller's position was uncomplicated: "language is the Rubicon which
divides man from beast, and no animal will ever cross it … the science of
language will yet enable us to withstand the extreme theories of the
Darwinians, and to draw a hard and fast line between man and brute." For
Müller, "Language" was the key feature distinguishing humans from all
animals. Müller's arguments were seen by many as convincing: his student Noiré
dubbed him "the Darwin of the mind" and considered Müller to be
"the only equal, not to say superior, antagonist, who has entered the
arena against Darwin" (Noiré, 1917, p. 73). Müller's argument about the
unbridgeable, qualitative difference between human language and all forms of
animal communication, combined with Wallace's opinions, provided arguments that
Darwin by
necessity took very seriously.
Thus, when Darwin
finally broached the subject of human evolution in 1871, in his second great
book "The Descent of Man and Selection in Relation to Sex," the need
to provide a credible explanation of language evolution was a central concern. Darwin rose to the
challenge: his "musical protolanguage" model represents a powerful
marriage of comparative data, evolutionary insight, and a biological
perspective on language. Darwin's
view of language was ahead of its time, and his model and arguments remain
surprisingly relevant to contemporary debates. He clearly adopted a
"multicomponent" view of language, one that recognized the necessity
of several distinct mechanisms to produce the complex product that we now call
language, rather than privileging any one factor as the single "key"
to Language in a monolithic sense. Among these several components, he
presciently recognized the necessity for complex vocal learning, and recognized
that this biological capacity, while unusual among mammals, is shared with many
birds. The importance of vocal learning has often been forgotten, but also
frequently reaffirmed by later scholars (Egnor & Hauser, 2004; Fitch, 2000;
Janik & Slater, 1997; Marler, 1976; Nottebohm, 1976).
Darwin also
adopted an empirical, data-driven approach to the problem at hand. In
particular, Darwin
drew on a wide comparative database, exploiting not just his knowledge of
nonhuman primate behaviour, but also insights from many other vertebrates. Finally,
and most characteristically, he resisted any special pleading about human
evolution. He intended his model of human evolution to fit within, and remain
consistent with, a broader theory of evolution that applies to beetles, flowers
and birds. Unlike Wallace, who remained a human exceptionalist to his death
(Wallace, 1905), Darwin
aimed to uncover general principles, like sexual selection and shifts of
function, to provide explanations of unusual or unique human traits. While
gradualistic, his model does not assume any simple continuity of function
between nonhuman primate calls and language, and he clearly recognized the
uniqueness of language in our species. In many ways, then, Darwin's model of
language evolution finds a natural place in the landscape of contemporary
debate concerning language evolution, and it is surprising that his model has
received relatively little detailed consideration in the modern literature (for
exceptions see Donald, 1991; Fitch, 2006).
In this essay, I aim to redress this neglect by considering Darwin's model of language evolution in
detail. After discussing Darwin's main points
and arguments, I will briefly review additional data supporting Darwin's model that has
appeared since his death. I will also discuss the issue of meaning, about which
Darwin had too
little to say, but which can be resolved by the addition of a hypothesis due to
Jespersen (1922). My conclusion is that, suitably modified in the light of
contemporary understanding, Darwin's
model of language evolution, based on a "protolanguage" more musical
than linguistic, provides one of the most convincing frameworks available for
understanding language evolution. The timing of my writing, on the 150th
anniversary of the Origin, and the 200th of Darwin's
birth, is also appropriate for a revival of interest in Darwin's compelling and well-supported
hypothesis.
Language as an "Instinct to Learn"
Chapter Two of the Descent of Man, entitled "Comparison of the
mental powers of man and the lower animals," is one of the most remarkable
in the entire Darwinian corpus, noteworthy for its concision and its breadth of
argument, in considering the evolution of the human mind. The first half of the
chapter lays the groundwork of modern research in comparative cognition, arguing
that animals have emotions, attention, and memory, as well as many other mental
traits in common with humans. However, Darwin's
opponents, notably Müller, had already ceded the point that animals have
memory, experience emotions, and so on. Language was the key issue, and one can
imagine considerable anticipation of both pro- and anti-Darwinian readers as
they turned to the section simply titled "Language".
In ten densely-argued pages, Darwin
considers some theoretical preliminaries, and then lays out his theory of
language evolution. The first stage involved a general increase in intelligence
and complex mental abilities, and the second involved a sexually-selected
attainment of the specific capacity for complex vocal control: singing. The
third stage was the addition of meaning to the "songs" of the second
stage, which was both driven by, and in turn fueled, further increases in
intelligence.
Theoretically, Darwin
makes a number of important observations. First, he recognizes the crucial
distinction between the language faculty (the biological capacity which enables
humans to acquire language) and particular languages (like Latin or English).
The former capacity, which Darwin
refers to as "an instinctive tendency to acquire an art" (p. 56), is
shared by all members of the human species. Darwin neatly bypasses the unproductive
nature/nurture debate that has consumed so much scholarly energy by observing
that language "is not a true instinct, as every language has to be learnt.
It differs, however, from all ordinary arts, for man has an instinctive
tendency to speak, as we see in the babble of our young children" (p. 55).
As ethologist Peter Marler has put it, language is not an instinct, but an
"instinct to learn" whose expression entails that both biological and
environmental preconditions be fulfilled. It is this "instinct to
learn" for which a biological, evolutionary explanation must be sought: a
thoroughly modern perspective.
Second, although he was well-aware of the peculiarities of the human vocal
tract, Darwin
argues that the human capacity for language must be sought in the brain, rather
than the peripheral vocal tract. He acknowledges that "articulate
speech" (by which he means vocalization augmented by controlled movement
of the lips and tongue, p. 59) is "peculiar to man", but he denies
that this mere power of articulation suffices to distinguish human language
"for as every one knows, parrots can talk." Instead, Darwin states that it is not speech, but
humans' "large power of connecting definite sounds with definite
ideas" that is definitive of language, and that this capacity
"obviously depends on the development of the mental faculties" (p.
54). By locating the language capacity in the human brain, Darwin's viewpoint is again thoroughly
modern.
Finally, Darwin
recognized the relevance to language evolution of birdsong, which he considered
the "nearest analogy to language". Like humans, birds have fully
instinctive calls, and an instinct to sing. But the songs themselves are
learned. He recognized the parallel between infant babbling and songbird
"subsong", and recognized the key fact that cultural transmission
ensures the formation of regional dialects in both birdsong and speech.
He also recognized that physiology is not enough for learned song: crows have
a syrinx as complex as a nightingale's but use it only in unmusical croaking.
All of these parallels have been amply confirmed, and further explored, by
modern researchers (Doupe & Kuhl, 1999; Marler, 1970; Nottebohm, 1972,
1975).
Darwin's
"Musical Protolanguage" Hypothesis
Darwin's
model of the phylogenesis of the language faculty, like most models today,
posits that different aspects of language were acquired sequentially, in a
particular order, and under the influence of distinguishable selection pressures.
The hypothetical systems characterized by each addition can be termed,
following Bickerton (1990) and Hewes (1973), "protolanguages". Darwin's
first hypothetical stage in the procession from an ape-like ancestor to modern
humans was a greater development of proto-human cognition: "The mental
powers in some early progenitor of man must have been more highly developed
than in any existing ape, before even the most imperfect form of speech could
have come into use" (p. 57). He elsewhere suggests that both social and
technological factors may have driven this increase in cognitive power.
Next, Darwin
outlines the crucial second step: what I have dubbed "musical
protolanguage" (Fitch, 2006). Having noted multiple similarities with
birdsong, he argues that the evolution of a key aspect of spoken language,
vocal imitation, was driven by sexual selection, and used largely "in
producing true musical cadences, that is in singing". He suggests that
this musical proto-language would have been used in both courtship and
territoriality (as a "challenge to rivals"), as well as in the
expression of emotions like love, jealousy, and triumph. Darwin concludes
"from a widely-spread analogy" (amply documented with comparative
data later in the book) that sexual selection played a crucial role driving
this stage of language evolution, in particular suggesting that the capacity to
imitate vocally evolved analogously in humans and songbirds.
The crucial remaining question is how emotionally-expressive musical
proto-language made the transition to true meaningful language — how, in
Humboldt's words, humans became "a singing creature, only associating
thoughts with the tones" (von Humboldt, 1836, p. 76). This leap, from
non-propositional song to propositionally-meaningful speech, remains the
greatest explanatory challenge for all musical protolanguage theories (cf.
Mithen, 2005). Darwin, citing the previous writings of Müller and Farrar
(1870), suggests that articulate language "owes its origins to the
imitation and modification, aided by signs and gestures, of various natural
sounds, the voices of other animals, and man's own instinctive cries". Darwin thus embraces all
three of the leading theories of word origins among his contemporaries (cf.
Fitch, in press). Once proto-humans had the capacity to imitate vocally, and to
combine such signals with meanings, virtually any source of word forms and
meanings would suffice, including onomatopoeia (an imitated roar for
"lion", or "whoosh" for wind), and controlled imitation of
human emotional vocalizations (mock laughter for "play" or
"happiness"). The attachment of specific and flexible meanings to
vocalizations required only that "some unusually wise ape-like animal
should have thought of imitating the growl of a beast of prey … And this would
have been a first step in the formation of a language".
Darwin does
not suggest that the evolutionary process would stop with the initial
acquisition of meaning. For "as the voice was used more and more, the
vocal organs would have been strengthened and perfected". Additionally,
language would have "reacted on the mind by enabling and encouraging it to
carry on long trains of thought" which "can no more be carried on
without the aid of words, whether spoken or silent, than a long calculation
without the use of figures or algebra". Thus began the interactive
evolutionary spiral that led to modern humans.
Signalling Modality: Vocalization or Gesture?
Darwin also
explicitly acknowledged the role of gesture in conveying meaning, echoing
Condillac's earlier arguments (Condillac, 1971 [1747]) and presaging
contemporary discussions (Arbib, 2005; Corballis, 2003; Hewes, 1973; Stokoe,
1974; Tomasello & Call, 2007). Darwin
was aware of the power of signed language: he reminds us that using his fingers
"a person with practice can report to a deaf man every word of a speech
rapidly delivered at a public meeting" (p. 58). He also acknowledged the
value of gesture in conveying meaning, and allowed that vocal communication
would have been "aided by signs and gestures" (p. 56). Nevertheless,
he argues against gestural theorists, because the pre-existence in all mammals
of "vocal organs, constructed on the same general plan as ours" would
lead any further development of communication to target the vocal organs rather
than the fingers.
Darwin
clearly believes that the power of speech is neural, not peripheral, citing the
early aphasia literature as a demonstration of "the intimate connection
between the brain, as it is now developed in us, and the faculty of
speech". Comparing the vocal organs and brain, he concludes "that the
development of the brain has no doubt been far more important". And
although he uses a continuity argument to support the early and sustained role
of speech, he firmly acknowledges the abrupt modern discontinuity in
the linguistic system that has thus evolved. Thus, like many other insightful
commentators (e.g., Donald, 1991; Hockett & Ascher, 1964), Darwin recognized that posing phylogenetic
continuity and modern discontinuity as in any way opposed is to create a false
dichotomy. The tree-like nature of phylogeny guarantees that both are core
parts of the evolutionary process.
Darwin
Redux: Modern Comparative Data
Summarizing, Darwin
suggested that the first step on the road to human language was a general
increase in intelligence in the hominid lineage. In a typically pluralistic
fashion, he recognized both "social intelligence"
("Machiavellian intelligence" in the modern trope (Byrne &
Whiten, 1988)) and technological/ecological intelligence (e.g. for tool use) as
playing important selective roles. Given our modern understanding of hominid
evolution, this first stage might be provisionally linked to the genus Australopithecus
or perhaps early Homo (e.g. Homo habilis).
The second stage is the least intuitive: that before vocalizations were used
meaningfully they were used, so to speak, aesthetically, to fulfil many of the
same functions for which modern humans use music today (courtship, bonding,
territorial advertisement and defense, competitive displays, etc.). This idea
that complex vocalizations (and thus some aspects of phonology and syntax)
might have preceded the ability of speech to convey propositions and distinct
meanings is the most challenging aspect of Darwin's model. But Darwin uses the comparative database, and
particularly the detailed analogy between learned bird song and human song and
speech, to show that this step is not just plausible but well-documented: it
has occurred in many other species. Indeed, modern data shows that vocal
learning, without propositional meaning, has evolved independently in at least
three other clades of mammals (cetaceans, pinnipeds and bats) and three clades
of birds (parrots, hummingbirds and oscine songbirds) (Janik & Slater,
1997; Jarvis, 2004). Such convergent evolution, or repeated independent
evolutionary developments of a comparable ability, provides our strongest
empirical basis for estimating the likelihood of a particular type of
evolutionary event (Harvey & Pagel, 1991). Many of the chapters in this book
affirm, and extend, the observations of parallels between language learning and
birdsong that Darwin
offered in 1871. Thus, whether intuitive or not, Darwin's focus on, and hypothesis for, the
evolution of vocal learning is consistent with a wealth of evolutionary and
comparative data.
Difficulties with Darwin's
Model: Evolving Phrasal Semantics
"How did man become, as Humboldt somewhere defined him, 'a singing
creature, only associating thoughts with the tones'?" Otto Jespersen
1922 (p. 437)
Despite its many virtues, there remain some important problems with Darwin's model that have
impeded its acceptance today. The first and most important is his explanation
of the addition of meaning. Darwin's
explanation, as was typical for his day, was concerned only with word meanings
(what today would be termed "lexical semantics"). But from the
viewpoint of modern linguistics, his model seems wholly inadequate to deal with
large swaths of semantics, particularly those aspects tied in with the
interpretation of whole phrases and sentences ("phrasal semantics").
Modern formal semantics has developed rigorous models of this aspect of
linguistic meaning (Dowty, Wall, & Peters, 1981; Guttenplan, 1986;
Montague, 1974; Portner, 2005), and it is far more complex and difficult to
explain than lexical semantics. Although one can hardly blame Darwin for not foreseeing these relatively
recent developments in linguistics, they nonetheless raise substantial
difficulties for his model. For much of the syntactic "glue" which
binds sentences together into large, meaningful wholes (function words,
inflection, bound morphemes, word order, and a host of others) cannot be
understood as resulting from onomatopoeia or imitation of emotional
expressions. Nor can they be readily understood as "inventions" of
some uniquely intelligent individual: all evidence suggests that these
indispensable linguistic tools develop reliably in individuals of normal
intelligence (Bickerton, 1981; Kegl, 2002; Mufwene, 2001; Mühlhäusler, 1997;
Senghas, Kita, & Özyürek, 2005). This key aspect of language thus seems to
have a biological basis. Darwin
does recognize the phenomenon today called "grammaticalization": he
states that "conjugations, declensions, &c., originally existed as
distinct words, since joined together" (p. 61). But he offers no model for
the origin of these distinct words, and it is hard to see how onomatopoeia or
similar processes could have generated this original syntactic and semantic
"glue". Thus, complex phrasal semantics remains unexplained by Darwin's model.
However, this oversight was remedied long ago by the linguist Otto Jespersen
(Jespersen, 1922). Jespersen's basic insight involves recognizing the link, in
humans, between musical and linguistic phrases, and working conceptually
backward from there. Jespersen suggested a form of protolanguage in which,
initially, whole propositional meanings attached to entire sung phrases, but
where there was no consistent link between the individual conceptual components
of the meaning, and component parts of the musical phrases (syllables and
notes). Thus, there were no "words" as we now understand them. From
this "holistic" starting point, Jespersen argued that a cognitive
process of analysis started, which slowly isolated individual chunks of the
musical phrase (syllables, or multi-syllabic "phraselets" — what
today we call "words") and associated them with individual components
of the meaning (e.g. nouns, verbs and adjectives, whose precursors were already
present in the conceptual systems of our pre-linguistic ancestors).
Jespersen's hypothesis of a "holistic protolanguage" has recently
been rediscovered and championed by linguist Alison Wray (Wray, 1998, 2000) and
neuroscientist Michael Arbib (Arbib, 2005). Both cite considerable additional
evidence supporting this "analytic" model, including data from modern
adult language, child language acquisition, and cognitive neuroscience.
Supporters of the more intuitive "synthetic" model of protolanguage,
in which words evolved first followed by syntactic operations for combining
them (e.g., Bickerton, 1990), have subjected holistic models to extensive
criticisms (Bickerton, 2007; Tallerman, 2007, 2008). However, I argue that most
of these critiques miss their mark if the notion of a musical protolanguage is
accepted as a starting point (cf. Fitch, in press). Jespersen/Wray's model of
holistic protolanguage thus dovetails nicely with the musical protolanguage
hypothesis, in ways that I believe resolve many, if not all, of these
criticisms (cf. Fitch, 2006; Mithen, 2005).
Sexual Selection
A second problem with Darwin's
model remains unresolved at present: his focus on sexual selection as the force
driving the evolution of musical protolanguage. Appearing as it did as a few
pages of an extensive tome introducing and then extensively documenting the
very idea of sexual selection, this aspect of Darwin's theory has the virtue of explaining
a core aspect of human evolution using a broad principle abundantly
demonstrated in the evolution of other species. As throughout his work, Darwin eschewed
"special pleading" for our own species. The central difficulty for
this beautiful hypothesis is posed by two ugly facts about modern human
language: it is equally developed in males and females, and is expressed very
early in ontogeny, essentially at birth (Fitch, 2005a). These aspects of
language differentiate it sharply from most sexually-selected traits, which are
strongly biased to develop in the more competitive sex (typically males), and
only at sexual maturity. If anything, human females have superior language
skills when compared to men (Henton, 1992; Kimura, 1983; Maccoby & Jacklin,
1974), and language is remarkable in its very early development, with at least
some early tuning to phonology already occurring in utero before birth (DeCasper
& Fifer, 1980; Mehler et al., 1988; Spence & Freeman, 1996).
There are several potential answers to the difficulty that these facts pose:
one is to argue that during the musical protolanguage stage, sexual selection
was the driving force, and song was (as in most bird species) expressed mainly
in males at sexual maturity. Then, at a later stage (presumably during the
evolution of meaningful language) some other selective force kicked in, so that
language became equally (or better) expressed in females, and was pushed to
develop early. A candidate selective force is kin communication: that is, selection
for information transmission between parents and their offspring, or more
generally between adults and their younger kin. I have suggested that kin selection
drove this second stage of the evolution of propositional semantic content
(Fitch, 2004, 2007). For an exploration and critique of this idea, see
Zawidzki (2006). This kin-selection scenario neatly explains the early
ontogenetic appearance of language in infants (the earlier offspring begin
absorbing their elders' knowledge, the better), and its bias towards females
(who are primary caregivers in all hominoids). The continued presence of
meaningful speech in males is easily explained by the dual facts that immature
males must also learn, and that, unusually in humans, adult males play an
important role in child rearing (whether the father, or male siblings of the
mother, is irrelevant to this fact). Finally, this kin-selection model has the
virtue of explaining why language evolved in humans and not in other
"musical" lineages. Humans combine an extended childhood, with ample
time to acquire knowledge, with very small reproductive output. The facts that
ape babies are born singly, and rarely, conspire to make the survival of each
individual hominid infant a crucial component of reproductive success in the
great ape lineage (cf. Fitch, 2007; Hrdy, 1999, 2004).
An alternative possibility is that sexual selection was, and remains, an
important driving force in human cognitive evolution, including language
(Miller, 2001), but that human pair-bonding has "changed the rules"
in significant ways, so that both sexes are choosy, and both compete for
high-quality mates. Some comparative data can be cited in support of this
second option. Recent data shows that female bird song is not so uncommon as
thought by Darwin, who considered female song to be a simple aberration
(Langmore, 2000; Riebel, 2003; Ritchison, 1986). There is some evidence
suggesting that sexual selection can indeed drive female bird song, though it
seems clear that female song is a secondary derivation of male song in most
lineages (Langmore, 1996). While these observations provide some support for
the idea that the dual-sex expression of human language could result from
sexual selection, it is important to recognize that female song still appears
to be, numerically speaking, exceptional, and that any model based on sexual
selection will have difficulty explaining the extremely early development, and
productive use, of language in human infants.
A final possibility is that sexual selection never played a role in the
evolution of music or of language. The popular notion that music evolved for
courtship (Miller, 2000, 2001) stands on a surprisingly weak empirical footing
compared to a less obvious, but better-documented function of music:
mother-infant communication (Trainor, 1996; Trehub, 2003a, 2003b). Mothers sing
to their infants all over the world, even those who claim to be unable to sing
(Street, Young, Tafuri, & Ilari, 2003), and infants both prefer song to
speech, and respond to song in manifestly adaptive ways (e.g. engaging with and
getting excited by play songs, and being lulled to sleep by lullabies) (Trehub
& Trainor, 1998). These observations suggest that music originally
functioned in a childcare context, as it continues to do today. By this model,
the use of music in bonding among adults is simply a side-effect of this
central function, and its occasional use in courtship is a red herring (Dissanayake,
2000; Falk, 2004; Trehub & Trainor, 1998). This final possibility is
clearly compatible with the kin-selection arguments advanced above, but here
there would be no intervening stage of language evolution in which sexual
selection ever played a dominating role. Even Darwin was occasionally wrong.
Terminological Niceties: Musical or Prosodic Protolanguage?
A final, less crucial difficulty with Darwin's
model is terminological. Darwin himself seemed to conceive of his pre-semantic
protolanguage in terms directly comparable to modern day music (or at least he
provides no indication that this is not the case). He concludes that
"musical notes and rhythm" were present in this protolanguage, and
that they were deployed "in producing true musical cadences, that is in
singing." This is why I term his model "musical protolanguage".
However, modern human music consists not just of song, but also instrumental
music, so this appellation might immediately have connotations of drumming,
whistling or flutes that are not, strictly speaking, relevant to language
evolution. More pertinently, if we take the musical protolanguage model
seriously, we must acknowledge that modern music may not necessarily preserve
the state of this protolanguage precisely, and that both music and language
have changed in the interim (cf. Brown, 2000). That is, Darwin's hypothetical communication system
was proto-music, not music per se. Adopting the logic of comparative
reconstruction, we can then ask which aspects of modern speech, and of song,
are shared, and thereby reconstruct this system (Fitch, 2005b). The central
shared aspects are prosodic and phonological: the use of a set of primitives
(syllables) to produce larger, hierarchically-structured units (phrases) which
are discretely distinctive. But two key "musical" aspects are not
shared between speech and song: namely discrete-pitched notes, and temporal
isochrony (a steady beat). I have used this comparison of modern speech and
song to argue for a subtly different model from that of Darwin, which I termed "prosodic"
rather than "musical" protolanguage, in which protolanguage consisted
of sung syllables, but not of notes that could be arranged in a scale, nor
produced with a steady rhythm (Fitch, 2006). This prosodic protolanguage model
thus includes the "sung cadence" aspect of Darwin's model, while rejecting both his
"notes" and "rhythm" (at least as normally construed). Both
of these aspects of (most) modern song are, by hypothesis, more recent
developments in music not present in protolanguage. I see this as an adjustment
of Darwin's
hypothesis, fully in keeping with its spirit. Furthermore, it is unclear from
his writings whether Darwin
would have disagreed with this adjustment.
A different reconstruction of the common ancestor of music and language,
involving both discrete pitches and isochronic rhythm (as well as tone-based
meaning) is given by Brown (2000). Brown also argues that his hypothetical
protolanguage, which he dubs "musilanguage," could not have evolved by
normal neo-Darwinian selection and thus demands a group selection explanation.
This remains its clearest, and most dubious, distinction from what is otherwise
just a rediscovery of Darwin's
basic hypothesis (for critiques see Botha, 2008; Fitch, in press).
Conclusions
I have argued that Darwin's
model for language evolution, "musical protolanguage," suitably
updated, provides a compelling fit to both the phenomenology of modern music
and language, and to a wealth of comparative data. By placing vocal control at
the centre of his model, Darwin availed himself of the rich comparative
database of other species who have independently evolved complex vocal
imitation, and he thus explains two of the features of human language that set
it off most sharply from nonhuman primate communication systems: vocal learning
and cultural transmission. The biggest missing piece in Darwin's model, as I see it, is a reasonable
explanation of phrasal semantics (and the aspects of syntax that go with it),
but this gap was filled by Jespersen by 1922. Together, these hypotheses
provide one of the leading models of language evolution available today (for an
enthusiastic book-length exploration see Mithen, 2005), and one that has been
repeatedly re-discovered by later scholars (e.g., Brown, 2000; Livingstone,
1973; Richman, 1993). While many aspects of what has now become a family of
models remain to be explored empirically (the issues surrounding sexual, kin
and group-selection remain particularly unclear), this is a model worthy of
detailed consideration and elaboration today. Most importantly, Darwin's model makes
numerous testable empirical predictions (for example about the partially
overlapping nature of the brain mechanisms underlying music and spoken
language, and their genetic basis) that can be answered in the coming decades.
This year of Charles Darwin's 200th birthday seems an opportune time for Darwin's own model of
language evolution to regain the prominence it deserves.
2. BICKERTON FIRST ROUND
I yield to no-one in my admiration of Darwin.
But admiration should not blind us to the fact that in many cases he was,
inevitably, limited by the state of knowledge in his time. Not only
Mendelian genetics, but also almost the entire ancestry of humans, was wholly
unknown to him; ethology and the study of non-human communication had yet to be
systematically developed, and linguistics still lay in the womb of
philology. It is truly amazing, not that he was sometimes wrong, but that
he was so often and so stunningly right.
He was right when he saw language as the seed, rather than the fruit, of
human intelligence. But appealing as the notion is, he was wrong in
proposing a scenario in which language issued from a "musical
protolanguage". Tecumseh Fitch argues that his own account,
developed from Darwin's,
is soundly based on principles of evolutionary biology. It is therefore
somewhat surprising that his account pays as little attention to the evolution
of humans (and the ways in which this evolution differed from that of other
primates) as do those of biologically-naïve linguists or psychologists.
The notion of a terrestrial and heavily-predated primate indulging in any
form of vocal activity, especially one that must, in quantity as well as
quality, have exceeded that of all other primates barring gibbons, is simply
bizarre, as I point out in a chapter of my book Adam's Tongue (out
next month) devoted to the "singing ape" hypothesis:
"What could possibly have been the functions of song for a pre-human species
in largely treeless grasslands? Song as a pair-bonding mechanism is
highly unlikely. Human ancestors probably weren't monogamous - great apes
aren't, and neither are we, even if we try or pretend to be, so a monogamous
interval at any time in the past looks unlikely. But suppose we did go
through a monogamous period. If two mates don't happen to be out of sight
of one another up two different trees, there are countless more effective ways
of bonding than yodeling at each other.
"Human ancestors probably weren't territorial, either - at least not in
the sense of holding small, well-defined chunks of territory. Most likely
they had a fission-fusion social structure, like that of contemporary apes,
that's to say groups would be continually splitting up and reforming, merging
with other groups. In open terrain, where different groups might utilize
the same areas at different times without conflict or even contact, what would
be the point of noisily-defended frontiers?
"Furthermore, the terrains in which gibbons and human ancestors lived
were such that for maintaining contact sound was essential in one and useless,
even dangerous, in the other…On the savanna, where there are beasts with keen
hearing far larger and more lethal than our ancestors, to sing out with any
frequency would have been to write one's own death warrant. Moreover, the
absence of trees and the level or undulating nature of most savannas means
that, in contrast with the rain-forest, animals are visible at considerable
distances. To be out of sight is, under those conditions, almost always
to be out of earshot - there's little point in yelling and hoping your friends
will hear you.
"To assume that, even if our ancestors had sung before, they would go
on singing under these conditions is absurd - something you can do only if you
think that behavior and environment are completely divorced from one another…
Conditions on the savanna were such that while they lived there our ancestors
very probably produced less sound than our ape relatives, not more. If
this was indeed the case, a single source for music and language becomes highly
unlikely. Unless, of course, someone succeeds in coming up with some function
pre-humans had to perform, under those same savanna conditions, that they
couldn't have performed by any means other than by singing. It's unlikely
anyone will, but never say never in science."
To persuade us of the "musical protolanguage" theory,
Tecumseh will have to come up with a scenario in which singing (of some kind)
somehow increased human fitness. Here he has proposed mother-child
interaction (as already suggested by Dean Falk in a recent article, "Prelinguistic
evolution in early hominins: Whence motherese?", Behavioral and
Brain Sciences 27(4):491-503, 2004). The problem with this is that
all other primates have mother-child interactions, but only one has picked on
this kind. Why? Why humans? And this doesn't end the
problems that "musical protolanguage" raises.
Tecumseh recognizes that the severest of these problems ("the greatest
explanatory challenge for all musical protolanguage theories") is how
sound acquired sense: how a continuously variable medium with no specific
reference turned into strings of discrete chunks with individual
meanings. However, he skips nimbly over the solution:
"Supporters of the more intuitive
"synthetic" model of protolanguage, in which words evolved first
followed by syntactic operations for combining them (e.g., Bickerton, 1990),
have subjected holistic models to extensive criticisms (Bickerton, 2007;
Tallerman, 2007, 2008). However, I argue that most of these critiques miss
their mark if the notion of a musical protolanguage is accepted as a starting point
(cf. Fitch, in press). Jespersen/Wray's model of holistic protolanguage
thus dovetails nicely with the musical protolanguage hypothesis, in ways that I
believe resolve many, if not all, of these criticisms (cf. Fitch, 2006; Mithen,
2005)."
As I don't have a copy of Fitch (in press), I remain in the dark as to what
these ways are. All I know is that when Dean Falk made the same proposal,
I wrote a commentary that, inter alia, pointed out she gave no account of how
symbolic meaning — symbolic use of words or signs to refer to particular
classes or individuals — emerged from originally meaningless sounds.
Significantly, she responded to all the points I made… except for that one.
Maggie Tallerman and I have made some very specific and pointed criticisms
of the "holistic protolanguage" model, most of which have never been
satisfactorily answered by anyone, as far as I know. If Tecumseh believes
he can answer them, he should show how. He
does point out that "Darwin…
embraces all three of the major leading theories of word origins of his
contemporaries" but he fails to point out that at least two of these are
incompatible with one another. For according to Darwin, "the
attachment of specific and flexible meanings to vocalizations required only
that 'some unusually wise ape-like animal should have thought of imitating the
growl of a beast of prey'" (and of course that some even wiser primates
should have understood what was meant: a lion coming, or lions often hang around
here, or one was seen here last week, or "Gee, guys, see how well I can
imitate a lion!"). But of course this onomatopoeic proposal is
incompatible with "musical protolanguage", since it avoids the
holistic phase altogether and goes straight to the kind of compositional,
already-symbolic protolanguage that Tecumseh rejects. The "lion's
roar" idea needs a good bit of tweaking, but at least it's nearer the mark
than a holistic protolanguage.
A major motive behind "musical protolanguage" is Strict Continuism
— the belief that language grew seamlessly from animal
communication. Animal calls — if translated into humanese, and that
turns out to be a very dodgy business in itself — are, like holophrases, often
the equivalents of whole clauses: "Mate with me"; "Stay off my
territory"; "Terrestrial predator coming — get up a
tree". Split these into their components and for a few
glorious moments it seems that the transition problem has been solved.
But in Adam's Tongue I go more deeply into the transition problem than
anyone ever has before. And it's the transition problem — how any species
could get from a standard animal communication system to even the crudest and
most basic kind of protolanguage — that lies at the very heart of language
evolution, and without which all "explanations" are mere hand-waving,
smoke and mirrors.
3. FITCH SECOND ROUND
The point of my essay, written on Darwin's
birthday, was to revive interest in Darwin's
long-neglected ideas about language evolution, not to offer (or defend) my own
model. Derek Bickerton's critique of Darwin's
musical protolanguage model suggests that our hominin ancestors lived out their
terrified lives on the treeless savannah, cowed into silence by their many
predators. This "fact" renders absurd, Bickerton claims, Darwin's notion that our
ancestors evolved learned, complex vocalizations ("song", for simplicity,
hereafter) before language.
By dubbing Darwin's
idea "bizarre" and "absurd," Bickerton reveals his
unwillingness to engage in a sympathetic interpretation of the musical
protolanguage hypothesis first advanced by Darwin (and others who have followed
in his footsteps). But as philosopher Susanne Langer observed, "The chance
that the key ideas of any professional scholar's work are pure nonsense is
small; much greater the chance that a devastating refutation is based on a
superficial reading or even a distorted one, subconsciously twisted by a desire
to refute" (p. ix, Langer, 1962).
My goal here is only to show that Bickerton's interpretation is of this
superficial and distorted sort and that, because of this and some factual errors
cited below, his attempt at a devastating refutation misses its mark.
First, Bickerton misses the central point of Darwin's hypothesis: to explain
the origin of vocal learning in the hominid line. This is an
indubitable capacity in our species, indubitably lacking in other apes, and
there must be an evolutionary explanation for it. Although it is possible that
vocal learning is a "spandrel", a by-product of some other
evolutionary change (e.g. large brains), it does not seem absurd to suppose
that this capacity was selected, for some reason or another. If it was not in
the mute Australopithecines of the Bickertonian savanna, and did not play a
role in pair- or mother-infant bonding, it must have happened at some other
time, for some other reason - but it happened.
Darwin simply,
and correctly, observed that the capacity for vocal learning is not uniquely
human, but is shared with birds, and it was this central observation upon which
he built his theory. This observation, and the deductions Darwin
drew from it, have subsequently been supported by additional comparative data
from many species, from hummingbirds to seals and whales, of which Darwin was unaware. It is
true, obviously, that language has many other critical components besides vocal
control — vocal learning is one of several speech- and music-related mechanisms
in our species, and language per se involves several others in addition (most
notably complex syntax and semantics). But vocal learning did evolve in our
species, and Darwin's
hypothesis of an evolutionary route through song is a reasonable one,
well-supported by abundant data.
Further, the human capacity for music has much in common with language: it
is another early-developing trait, found in all human cultures, and Darwin's hypothesis has
the virtue of explaining the continued existence of music along with language.
This too, like vocal learning, needs to be explained if we are to understand
human evolution.
From a historical viewpoint, Bickerton's critique is reminiscent of that of
Darwin's
nemesis, the linguist Max Müller. It was Müller who began coining the
(unfortunately long-lived) nicknames for many models of language evolution,
dismissing Darwin's "sing song" theory with the same brief sneer as
the older onomatopoeia and interjectional hypotheses, which he nicknamed
"bow wow" and "pooh-pooh" (Müller, 1861, 1873). Turnabout
being fair play, Müller's own theory of semi-mystical resonance between words
and things was dubbed the "ding dong" theory (Noiré, 1917).
Bickerton's "yodelling Australopithecines" image has the same absurd
comic flair. But sneers and derogatory nicknames, however rhetorically
effective, are not scientific arguments.
To close, I will only point out three key factual errors in Bickerton's
critique:
1) Maggie Tallerman's critique of holistic protolanguage has been answered
convincingly, point by point, by Kenny Smith recently (Smith, 2008). In a
journal issue that, if I'm not mistaken, Bickerton himself co-edited…
2) Most paleoanthropologists now agree that the environment in which much of
human evolution occurred was not the unitary "savannah" imagined by
Bickerton, but an ecologically diverse environment better characterized as
"mixed woodlands" (Kingston, Marino, & Hill, 1994). Our ancestors
probably had plenty of trees to climb, and they probably did so regularly, as
the lasting arboreal adaptations of the Australopithecines attest. Indeed it
seems likely that Australopithecines built nests in the trees for sleeping,
just as do modern chimpanzees and orangutans (Sabater Pi, Veà, & Serrallonga,
1997). Given that all apes persist in using loud vocalizations to stay in
contact, for example chimpanzee pant-hoots, it is surprising that someone with
an imagination as fertile as Bickerton's can't conceive of any function for
loud nocturnal vocalizations in early hominins. He should consult Mithen
(2005) for some inspiration.
3) Bickerton finds it "absurd" to suppose that a
mostly-terrestrial, moderate-sized primate would be highly vocal, producing
loud, repetitive vocalizations. What about Theropithecus gelada, the
grassland-living gelada baboon? These are mostly-terrestrial African primates,
preyed upon by leopards, dogs, and humans, and whose fossil remains are also
known from Olduvai Gorge. Geladas are
extremely vocal and indeed noted for their vocal complexity (Aich, Moos-Heilen,
& Zimmerman, 1990; Richman, 1976), and interestingly are one of the only
primates for whom claims have been made of rhythmic, synchronized vocalizations
(Richman, 1978, 1987). Whether Richman's claims about the musicality of gelada
vocalizations hold up or not, there can be no doubt that geladas are highly
vocal terrestrial primates that evolved in a grassy, nearly treeless
environment, in apparent violation of Bickerton's evolutionary principles.
Most primates, whether arboreal or terrestrial, are highly vocal in at least
some circumstances, as are humans today, and I conclude that Bickerton provides
no good argument to suppose that hominins have ever been otherwise.
In conclusion, the discipline of language evolution is full of questions,
and the field is only likely to make empirical progress if practitioners find
it in their hearts and heads to sympathetically read, understand and compare
multiple hypotheses, even those that initially seem unintuitive or even "absurd".
Far too many unknowns remain about our species' past for Darwin's hypothesis to be dismissed so
quickly, on such scant evidence and weak argument.
Especially on Darwin's
birthday, and the 150th anniversary of his greatest book.
4. BICKERTON SECOND ROUND
Let me answer the points made by Tecumseh Fitch in his latest posting:
A. I don’t explain the origin of vocal learning.
Well, what would select for vocal learning better than having a few words to
learn? Vocal learning is not the result
of some “deep homology” but has arisen spontaneously in a wide variety of
species, each of which required vocal learning for its own particular
needs. Fitch’s assumption that I ought
to be providing a source for vocal learning preceding
the origin of language is based on the strange dogma found in Hauser, Chomsky
& Fitch 2002: that everything needed for language (except, perhaps, the
ability to create recursive structure) must have been present in the hominid
line before language could emerge. To
the best of my knowledge, there’s no hard evidence for this. Evolution does not need to get all its ducks
in a row before it can create novel faculties.
To the contrary, new faculties typically emerge in some very crude and
primitive form and then themselves act as selective pressures for traits that
will subserve and expand those faculties.
B. Kenny Smith’s 2008 article disproves the arguments against a holistic
protolanguage.
Yes, I co-edited the journal issue in question, and I actually did read this
article. It’s one of the better holophrastic
papers, makes some good and useful points against Tallerman, and protracts the
debate, but is far from the tie-breaker that Fitch claims. More to the point, it doesn’t even attempt to
tackle the strongest argument against a holophrastic protolanguage.
A holophrastic protolanguage consists of things that are basically like
animal calls, and it’s true that we can roughly translate many such calls into
humanese sentences: “Come mate with me”,
“Stay off my territory”. The basic idea
in a holophrastic protolanguage is that these calls are then fractionated into
words. But the basic assumption of
holophrasis is that every holophrastic call is actually the exact equivalent of
a particular sentence. It must be so.
How else could you transition from holophrasis to compositionality? How else could you get agreement on what each
of the fractionated parts of the holophrase meant?
In fact, animal calls are not the equivalent of sentences. They are designed for entirely different
purposes and function in entirely different ways. Take the famous “vervet eagle alarm”. It could translate as “Look out, here’s an
eagle”. It could equally well translate
as “Danger from above” or “Quick, hide in the bushes!” How would any primate know whether to
fractionate this into “look” and “eagle”, or “danger” and “above”, or “hide”
and “bushes”? More on this below, see
question 2.
C. Most human evolution did not occur in the savanna.
Did I say it did? The quote from Adam’s
Tongue that I gave did not, naturally, include the pages preceding
it where I discussed at some length the mosaic woodland which, as I am of
course well aware, human ancestors inhabited for three or four million years
after they split from the other apes.
And Fitch himself knew perfectly well that I knew this when he accused
me of imagining a “unitary savannah”. He
knew this because on October 30, 2008, I sent him the chapter in which this
information is contained.
But in any case, the careful reader of my original rebuttal will already
have noted my real point: “To assume that, even if our ancestors had sung
before, they would go on singing under these conditions is absurd.” In other words, my point was that, whatever
australopithecines might have done, hominids would hardly have persisted in
doing it once the great drying set in around 2 mya and they really did have to
subsist in a savanna environment.
D. Geladas are ground-living savanna primates, and they make a lot of noise,
so why shouldn’t human ancestors have done the same?
To me, this is worse than being accused of things I don’t believe. Fitch is a biologist. That means he must know something about
ecology, realize what a niche is and how it determines how a species
behaves. If the niches of geladas and
hominids are totally different, he must surely know that you can’t use geladas
as a model for possible hominid behavior.
Well, their niches differ dramatically.
There are three things and only three that geladas and early Homo have
in common. They are primates, they are
terrestrial, and they live in savannas.
In just about everything else they are different. Geladas eat grass and hardly anything but
grass. Human ancestors could eat pretty
well anything except grass. Geladas
spend most of their foraging time hunkered down, hunching across the grass a
meter or so at a time. The long-legged
Homo erectus wouldn’t have had long legs if he hadn’t had to use them to cover
great distances. The day range of
geladas can be measured in meters. The
day range of Homo had to be measured in kilometers, given the scarcity of
non-grass comestibles in savannas.
Geladas move around in troops of 300 to 400, indeed groups up to 600 or
more have been recorded. Human ancestors could never have foraged in groups
even approaching this size—they must have traveled in much smaller units.
Precisely because they’re grass-eaters, geladas can live together in large
numbers, which (a) reduces the chances of predation—there’s safety in
numbers—and (b) makes irrelevant their use of vocalization—out in the
grassland, a huge bunch of them together, any predator is going to see them
regardless of whether they vocalize or not.
For hominids it was different.
Small groups at long distances from other small groups, they might get
together at night but during the day they’d have to go far and wide to find
food, and for much of that—opportunistic stalking of birds and small mammals,
ambush hunting and so on—they’d have to keep dead quiet. What, exactly, would they have needed
elaborate vocalizations for?
And that brings us to the meat of the matter. Fitch’s four points serve to disguise the
fact that he has no answers to the two really serious questions involved here.
Question No. 1: For what function did
hominids need complex vocalizations?
(Note: it would have to be a function basic and essential enough to
offset the risk from attracting predators—“loud nocturnal vocalizations to stay
in contact” won’t cut it, because at night they would have been damn careful to
keep in close physical contact!)
Question No. 2: How did meaning get into the vocalizations? There have been any number of attempts to
explain this, but I have yet to see one that is even halfway convincing.
One final word. Contra Fitch, the
“singing ape” is not in any way a “key idea” of Darwin’s, any more than my original article
was a “twisted desire to refute.” In a
volume of several hundred pages Darwin
devotes a few sentences to it, alongside a couple of other possible language
origins—and Fitch himself mentions all three!
So to talk about “Darwin's musical
protolanguage model” simply expands and distorts what Darwin actually said. And to genuflect before every casual remark a
writer made is not admiration—it’s idolatry.
Idolatry of Darwin does not increase, but
rather detracts from, appreciation of the many great things that Darwin did say.
5. ????????
Well, folks, that's where it's at as of now. Fitch hasn't answered my last post. It's my belief he hasn't answered it because he can't answer it. So I'm declaring victory and going home until such time as (or IF) he answers my points. If he does, it will appear in this blog. Meantime I'll debate anyone, ANYONE who disputes the conclusions reached in Adam's Tongue. Come one come all, let's see your stuff.