Copyright © 1999, The National Academy of Sciences

Evolution

The evolution of language

Institute for Advanced Study, Princeton, NJ 08540

*To whom reprint requests should be addressed. e-mail: nowak/at/ias.edu.

Communicated by Robert May, University of Oxford, Oxford, United Kingdom

Received March 12, 1999; Accepted May 7, 1999.

Abstract

The emergence of language was a defining moment in the evolution of
modern humans. It was an innovation that changed radically the
character of human society. Here, we provide an approach to language
evolution based on evolutionary game theory. We explore the ways in
which protolanguages can evolve in a nonlinguistic society and how
specific signals can become associated with specific objects. We assume
that early in the evolution of language, errors in signaling and
perception would be common. We model the probability of
misunderstanding a signal and show that this limits the number of
objects that can be described by a protolanguage. This “error
limit” is not overcome by employing more sounds but by combining a
small set of more easily distinguishable sounds into words. The process
of “word formation” enables a language to encode an essentially
unlimited number of objects. Next, we analyze how words can be combined
into sentences and specify the conditions for the evolution of very
simple grammatical rules. We argue that grammar originated as a
simplified rule system that evolved by natural selection to reduce
mistakes in communication. Our theory provides a systematic approach
for thinking about the origin and evolution of human language.

Language remains in the minds of many philosophers, linguists, and
biologists a quintessentially human trait (1–3). Attempts to shed
light on the evolution of human language have come from many areas
including studies of primate social behavior (4–6), the diversity of
existing human languages (7, 8), the development of language in
children (9–11), and the genetic and anatomical correlates of language
competence (12–16), as well as theoretical studies of cultural
evolution (17–21) and of learning and lexicon formation (22). Studies
of bees, birds, and mammals have shown that complex communication can
evolve without the need for a human grammar or for large vocabularies
of symbols (23, 24). All human languages are thought to possess the
same general structure and permit an almost limitless production of
information for communication (25). This limitlessness has been
described as “making infinite use of finite means” (45). The lack
of obvious formal similarities between human language and animal
communication has led some to propose that human language is not a
product of evolution but a side-effect of a large and complex brain
evolved for nonlinguistic purposes (1, 26). Others suggest that
language represents a mix of organic and cultural factors and, as such,
can only be understood fully by investigating its cultural history (16,
27).

One problem in the study of language evolution has been the
tendency to identify contemporary features of human language and
suggest scenarios in which these would be selectively advantageous.
This approach ignores the fact that if language has evolved, it must
have done so from a relatively simple precursor (28, 29). We are
therefore required to provide an explanation that proposes an advantage
for a very simple language in a population that is prelinguistic
(30–32). This work can be seen as part of a recent program to
understand language evolution based on mathematical and computational
modeling (33–37).

The Evolution of Signal–Object Associations. We assume that language
evolved as a means of communicating information between individuals.
In the basic “evolutionary language game,” we imagine a group of
individuals (early hominids) that can produce a variety of sounds.
Information shall be transferred about a number of “objects.” Suppose
there are $m$ sounds and $n$ objects. The matrix $P$ contains the
entries $p_{ij}$, denoting the probability that for a speaker object
$i$ is associated with sound $j$. The matrix $Q$ contains the entries
$q_{ji}$, which denote the probability that for a listener sound $j$
is associated with object $i$. $P$ is called “active matrix,” whereas
$Q$ is called “passive matrix.” A similar formalism was used by
Hurford (22).

Imagine two individuals, A and B, that use slightly different
languages $L$ (given by $P$ and $Q$) and $L'$ (given by $P'$ and
$Q'$). For individual A, $p_{ij}$ denotes the probability of making
sound $j$ when seeing object $i$, whereas $q_{ji}$ denotes the
probability of inferring object $i$ when hearing sound $j$. For
individual B, these probabilities are given by $p'_{ij}$ and
$q'_{ji}$. Suppose A sees object $i$ and signals, then B will infer
object $i$ with probability $\sum_{j=1}^{m} p_{ij} q'_{ji}$.
A measure of A’s ability to convey information to B is given by
summing this probability over all $n$ objects. The overall payoff for
communication between A and B is taken as the average of A’s ability
to convey information to B and B’s ability to convey information to
A. Thus,

$$F(L, L') = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{m} \left( p_{ij} q'_{ji} + p'_{ij} q_{ji} \right). \qquad [1]$$

Hence, we assume that both speaker and listener receive a reward for
mutual understanding. If, for example, only the listener receives a
benefit, then the evolution of language requires cooperation.

In each round of the game, every individual communicates with every
other individual, and the accumulated payoffs are summed up. The total
payoff for each player represents the ability of this player to
communicate information with other individuals of the community.
Following the central assumption of evolutionary game theory (38), the
payoff from the game is interpreted as fitness: individuals with a
higher payoff have a higher survival chance and leave more offspring
who learn the language of their parents by sampling their responses to
individual objects (Fig. 1).

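As an illustration, the payoff described above can be computed directly from the active and passive matrices. The following is a minimal Python sketch under stated assumptions, not the simulation behind Fig. 1: the function names `conveyed` and `payoff` and the two toy languages are introduced here for illustration only, and the payoff is taken to be the average of the two directions of information transfer, as the text states.

```python
def conveyed(P, Q_other):
    """Speaker's ability to convey information: sum_i sum_j p_ij * q'_ji.

    P is the speaker's active matrix (n objects x m sounds);
    Q_other is the listener's passive matrix (m sounds x n objects).
    """
    n, m = len(P), len(P[0])
    return sum(P[i][j] * Q_other[j][i] for i in range(n) for j in range(m))


def payoff(P, Q, P2, Q2):
    """Average of A -> B and B -> A information transfer."""
    return 0.5 * (conveyed(P, Q2) + conveyed(P2, Q))


# Hypothetical example with n = m = 3.
# One-to-one language: each object has its own sound.
P_opt = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
Q_opt = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Ambiguous language: sound 0 is used for both object 0 and object 1,
# so a listener hearing sound 0 guesses either object with probability 1/2.
P_amb = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
Q_amb = [[0.5, 0.5, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(payoff(P_opt, Q_opt, P_opt, Q_opt))  # 3.0
print(payoff(P_amb, Q_amb, P_amb, Q_amb))  # 2.0
print(payoff(P_opt, Q_opt, P_amb, Q_amb))  # 2.25
```

In this toy case the one-to-one language earns the maximum payoff of 3 against itself, whereas the language that reuses one sound for two objects earns only 2.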
For m = n, the evolutionary optimum is
reached if each object is associated with one specific sound and vice
versa. Evolution does not always lead to the optimum solution, but
certain suboptimum solutions, in which the same signal is used for two
(or more) objects, can be evolutionarily stable.

A Linguistic Error Limit. Below, we discuss two essential
extensions of the basic model. First, we include the possibility of
errors in perception: early in the evolution of communication, signals
are likely to have been noisy and can therefore be mistaken for each
other (39). We denote the probability of interpreting sound $i$ as
sound $j$ by $u_{ij}$. The payoff for $L$ communicating with $L'$ is
now given by

$$F(L, L') = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{m} \sum_{k=1}^{m} \left( p_{ij} u_{jk} q'_{ki} + p'_{ij} u_{jk} q_{ki} \right). \qquad [2]$$

The probabilities $u_{ij}$ can be expressed in terms of similarities
between sounds, $s_{ij}$, as $u_{ij} = s_{ij} / \sum_{k=1}^{m} s_{ik}$.
As a simple example, suppose that the similarity between any two
different sounds is a constant, $s_{ij} = \epsilon$, whereas
$s_{ii} = 1$. In this case, the probability of correct understanding
is $u_{ii} = 1/[1 + (m-1)\epsilon]$. The maximum payoff for a language
with $m$ sounds (when communicating with another individual who is
using the same language) is given by $F(m) = \sum_{i=1}^{m} u_{ii}$,
and therefore $F(m) = m/[1 + (m-1)\epsilon]$. The fitness, $F$, is an
increasing function of $m$ converging to a maximum value of
$1/\epsilon$ for large values of $m$. Without error, we would have
$F(m) = m$. Thus, in the presence of error, the maximum capacity of
information transfer is limited and equivalent to what could be
achieved by $1/\epsilon$ sounds without error.

Next, we assume that objects can have different values, $a_i$. (For
example, when a leopard represents a higher risk than a python, the
word “leopard” may be more valuable than “python.”) We have
$F(m) = [1 + (m-1)\epsilon]^{-1} \sum_{i=1}^{m} a_i$, where the
objects are ranked according to their value, $a_1 > a_2 > \ldots$.
This fitness function can adopt a maximum value for a certain number
$m$ and decline if the value of $m$ becomes too big. In this case,
natural selection will limit the number of sounds used in the language
and consequently also limit the number of objects described (Fig. 2).

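The saturation of fitness with the number of sounds can be checked numerically. The sketch below is illustrative only and rests on assumptions introduced here: `eps` stands for the constant similarity between two different sounds (each sound having similarity 1 to itself), the function names are hypothetical, and the object values in the example are invented.

```python
def confusion_matrix(m, eps):
    """u_ij = s_ij / sum_k s_ik with s_ii = 1 and s_ij = eps for i != j."""
    row_total = 1.0 + (m - 1) * eps
    return [[(1.0 if i == j else eps) / row_total for j in range(m)]
            for i in range(m)]


def max_fitness(m, eps):
    """F(m) = m / [1 + (m - 1) eps], the sum of the diagonal entries u_ii."""
    return m / (1.0 + (m - 1) * eps)


def weighted_fitness(values, m, eps):
    """F(m) = (sum of the m most valuable objects) / [1 + (m - 1) eps]."""
    ranked = sorted(values, reverse=True)
    return sum(ranked[:m]) / (1.0 + (m - 1) * eps)


def best_repertoire(values, eps):
    """Number of sounds m that maximizes the value-weighted fitness."""
    sizes = range(1, len(values) + 1)
    return max(sizes, key=lambda m: weighted_fitness(values, m, eps))


eps = 0.1
for m in (1, 10, 100, 1000):
    # F(m) keeps increasing with m but never exceeds 1 / eps = 10.
    print(m, round(max_fitness(m, eps), 3))

# Invented object values: with eps = 0.2 the best repertoire has 2 sounds.
values = [10.0, 5.0, 1.0, 0.5, 0.1]
print(best_repertoire(values, 0.2))  # 2
```

With equal object values, adding sounds always helps a little but with diminishing returns; with unequal values, naming low-value objects can reduce fitness, so an intermediate repertoire size is favored.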
The principal result of the extended model, including misunderstanding,
is that of a “linguistic error limit”: the number of
distinguishable sounds in a protolanguage, and therefore the number of
objects that can be accurately described by this language, is limited.
Adding new sounds increases the number of objects that can be described
but at the cost of an increased probability of making mistakes; the
overall ability to transfer information does not improve. This obstacle
in the evolution of language has interesting parallels with the
error-threshold concept of molecular evolution (40). The origin of life
has been described as a passage from limited to unlimited hereditary
replicators, whereas the origin of language as a transition from
limited to unlimited semantic representation (41).

Word Formation. The way to overcome the error limit is by
combining sounds into words. Words are strings of sounds. As before, we
define the fitness of a language as the total amount of successful
information transfer. The maximum fitness is obtained by summing over
all probabilities of correct understanding of words. For a language
with $m$ sounds (phonemes) and a word-length $l$, the maximum payoff
is given by $F(m,l) = m^l [1 + (m-1)\epsilon]^{-l}$, where $\epsilon$
is the similarity between two different sounds. For large values of
$m$, this converges to $(1/\epsilon)^l$, thus allowing a much greater
potential for communication. This equation assumes that understanding
of a word is based on the correct understanding of each individual
sound.

More realistically, we may assume that correct understanding of a word
is based (to some extent) on matching the perceived string of phonemes
to known words of the language. Consider a language with $N$
words, $w_i$, which are strings of phonemes:
$w_i = (x_{i1}, x_{i2}, \ldots, x_{il})$. For $m$ different phonemes
there are $m^l$ possible words. A particular language will contain a
subset of these words, $N \le m^l$. We define the similarity between
two words as the product of the similarities between individual
phonemes in corresponding positions. The similarity between word $w_i$
and $w_j$ is $S_{ij} = \prod_{k=1}^{l} s_{ij}^{(k)}$, where
$s_{ij}^{(k)}$ denotes the similarity between the $k$-th phonemes of
words $w_i$ and $w_j$. The probability of correctly understanding word
$w_i$ is $P_i = 1/\sum_{j=1}^{m^l} S_{ij} \sigma_j$, where
$\sigma_j = 1$ if word $w_j$ is part of the language, and
$\sigma_j = \sigma$ if word $w_j$ is not part of the language. The
parameter $\sigma$ is a number between 0 and 1 and specifies the
degree to which word recognition is based on correct understanding of
every phoneme versus understanding of the whole word. If $\sigma = 0$,
then each word is only compared with every other word that is a part
of the language; correct understanding of a word consists in comparing
the perceived word with all other words that are part of the lexicon.
An implicit assumption here is that individuals have perfect knowledge
of the whole lexicon. If $\sigma = 1$, then every word is compared
with every other possible word that can be formed by combining the
phonemes. Correct understanding of a word requires a correct
identification of each individual phoneme. The listener does not need
to have a list of the lexicon. A value of $\sigma$ between 0 and 1
blends these two possibilities. In this case, recognition of a word is
to some extent based on identification of each individual phoneme and
to some extent on identification of the word selected from the list of
all words that are contained in the language. The maximum payoff for
such a language is given by $F = \sum_{i=1}^{N} P_i$ (Fig. 3).

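The word-recognition probability can be made concrete with a small enumeration. The following sketch is a simplified illustration, not the computation behind Fig. 3: it assumes that every pair of different phonemes has the same similarity `eps` at every position (the text allows position-specific similarities), represents words as Python strings, and uses hypothetical function names.

```python
from itertools import product


def word_similarity(w1, w2, eps):
    """S_ij: product over positions of phoneme similarities
    (1 for identical phonemes, eps for different ones)."""
    s = 1.0
    for a, b in zip(w1, w2):
        s *= 1.0 if a == b else eps
    return s


def understanding_probability(word, lexicon, phonemes, eps, sigma):
    """P_i = 1 / sum_j S_ij * sigma_j, summed over all m**l possible words.

    sigma_j is 1 for words in the lexicon and sigma for all other strings.
    """
    total = 0.0
    for letters in product(phonemes, repeat=len(word)):
        other = "".join(letters)
        weight = 1.0 if other in lexicon else sigma
        total += word_similarity(word, other, eps) * weight
    return 1.0 / total


def language_fitness(lexicon, phonemes, eps, sigma):
    """F = sum of P_i over the words of the language."""
    return sum(understanding_probability(w, lexicon, phonemes, eps, sigma)
               for w in lexicon)


phonemes = "ab"          # m = 2 phonemes
lexicon = {"aa", "bb"}   # N = 2 words of length l = 2
eps = 0.5

# sigma = 0: words are only compared with other words of the lexicon.
print(language_fitness(lexicon, phonemes, eps, sigma=0.0))  # 1.6
# sigma = 1: every phoneme must be identified; the lexicon gives no help.
print(language_fitness(lexicon, phonemes, eps, sigma=1.0))
```

When the lexicon contains all $m^l$ strings and $\sigma = 1$, this enumeration reduces to the phoneme-by-phoneme formula, $m^l[1 + (m-1)\epsilon]^{-l}$, which is a convenient consistency check.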
Combining sounds into words leads to an essentially unlimited potential
for different words. This step in language evolution can be seen as a
transition from an analogue to a digital system. The repertoire is not
increased by adding more sounds, but by combining a set of easily
distinguishable sounds into words. In all existing human languages,
only a small subset of the sounds producible by the vocal apparatus are
employed to generate a large number of words. These words are then used
to construct an unlimited number of sentences. The crucial difference
between word and sentence formation is that the first consists
essentially of memorizing all (relevant) words of a language, whereas
the second is based on grammatical rules. We do not memorize a list of
all possible sentences.

The Evolution of Basic Grammatical Rules. The next step in
language evolution is the emergence of a basic syntax or grammar.
Recall that by combining sounds into words, the protolanguage achieves
an almost limitless potential for generating words with the power of
describing a large number of objects or actions. Grammar emerges in the
attempt to convey more information by combining these words into
phrases or sentences. Simply naming an object will be less valuable
than naming it and describing its action. (A leopard can be stalking,
in which case it is a serious risk, or merely sleeping and thereby
posing a lesser risk.) There is an obvious advantage to describing both
objects and actions. Suppose there are $n$ objects and $h$ actions;
there are $nh$ possible combinations, but only a fraction, $\phi$, of
them may be relevant (for example: leopard runs; monkey runs; but not
banana runs). A “nongrammatical” approach would be to conceive
$N = \phi nh$ different words for all relevant combinations. A
“grammatical” approach would be to have $n$ words for objects (i.e.,
nouns) and $h$ words for actions (i.e., verbs). Let us compare the
fitness of grammar and nongrammar.

Again, we will include errors, this time as a probability to
mistake words, which can include acoustic misunderstanding and/or
incorrect assignment of meaning. The maximum fitness of a
nongrammatical language with $N$ different words is
$F_{ng} = N/[1 + (N-1)\xi]$. The maximum fitness for a grammatical
language is $F_g = N/\{[1 + (n-1)\xi][1 + (h-1)\xi]\}$.
Here, $\xi$ is the similarity between words. In the
nongrammatical language, each event is described by one word, and
correct communication requires that this word is distinguished from
N − 1 other words. The grammatical language
uses two words for every event: we can say that nouns describe objects
and verbs describe actions. Each noun has to be distinguished from
n − 1 other nouns, and each verb from
h − 1 other verbs. Whether grammar wins in the
evolutionary language game depends on the number of combinations of
nouns and verbs that describe relevant events.
$F_g > F_{ng}$ leads to

$$N > n + h - 1 + (n-1)(h-1)\xi. \qquad [3]$$

From Eq. 3, it follows that a necessary condition for grammar to win
is

$$N > n + h - 1.$$

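The comparison of the two fitness expressions can be worked through step by step. The following derivation uses only $F_{ng}$ and $F_g$ as given above and assumes $\xi > 0$; it is a sketch of the algebra and may differ in form from the way the condition is stated by the authors.

```latex
\begin{align*}
F_g > F_{ng}
  &\iff \frac{N}{[1+(n-1)\xi]\,[1+(h-1)\xi]} > \frac{N}{1+(N-1)\xi} \\
  &\iff 1+(N-1)\xi > 1+(n+h-2)\xi+(n-1)(h-1)\xi^{2} \\
  &\iff N-1 > n+h-2+(n-1)(h-1)\xi \\
  &\iff N > n+h-1+(n-1)(h-1)\xi .
\end{align*}
```

Because the term $(n-1)(h-1)\xi$ is nonnegative, grammar can only win if $N > n + h - 1$, that is, if the number of relevant events exceeds the combined number of nouns and verbs minus one. As a check, for $n = h = 2$ and $N = 4$ the condition reduces to $\xi < 1$, which agrees with $f_1 < f_2$ in the Appendix.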
Thus far, we have specified only those conditions conducive for
grammar to have a higher fitness than nongrammar. We can also formulate
a model describing how grammar can evolve gradually by natural
selection (see Fig. 4).

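The competition between the two kinds of language can also be explored numerically. The sketch below simply evaluates the two maximum-fitness expressions given in the text; the function names and the example numbers (10 nouns, 10 verbs, word similarity 0.1) are assumptions chosen for illustration.

```python
def fitness_nongrammar(N, xi):
    """F_ng = N / [1 + (N - 1) xi]: one word per relevant event."""
    return N / (1.0 + (N - 1) * xi)


def fitness_grammar(N, n, h, xi):
    """F_g = N / {[1 + (n - 1) xi][1 + (h - 1) xi]}: a noun plus a verb per event."""
    return N / ((1.0 + (n - 1) * xi) * (1.0 + (h - 1) * xi))


def grammar_wins(N, n, h, xi):
    """True if the grammatical language has the higher maximum fitness."""
    return fitness_grammar(N, n, h, xi) > fitness_nongrammar(N, xi)


def smallest_winning_N(n, h, xi):
    """Smallest number of relevant events (at most n * h) for which grammar wins,
    or None if grammar never wins for these parameters."""
    for N in range(1, n * h + 1):
        if grammar_wins(N, n, h, xi):
            return N
    return None


n, h, xi = 10, 10, 0.1
for N in (20, 27, 28, 50, 100):
    print(N, round(fitness_nongrammar(N, xi), 3),
          round(fitness_grammar(N, n, h, xi), 3), grammar_wins(N, n, h, xi))

print(smallest_winning_N(n, h, xi))  # 28
```

For these example values, grammar wins only once at least 28 of the 100 possible noun-verb combinations describe relevant events; with fewer relevant events, a separate word for each event does better.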
The model can be extended in many ways. For example, events can
consist of one action and several objects. Objects may be associated
with properties, giving rise to adjectives. Events can have similar
associations, giving rise to adverbs. The essential result is that a
grammatical language that has words for each component of an event
receives a higher payoff in the evolutionary language game than a
nongrammatical language that has words (or a string of words) for the
whole event. In this context, the grammar of human languages evolved to
reflect the “grammar of the real world” (that is, the underlying
logic of how objects relate to actions and other objects).

Conclusions. In this paper, we have outlined simple
mathematical models that provide new insights into how natural
selection can guide three fundamental, early steps in the evolution of
human language.

The question concerning why only humans evolved language is hard
to answer. Interestingly, however, our models do not suggest that a
protolanguage will evolve under all circumstances but outline several
obstacles impeding its emergence. (i) In the simplest model
(Fig. 1), …

We view this paper as a contribution toward formalizing the laws
that governed the evolution of the primordial human languages. There
are, of course, many important and more complex properties of human
language that we have not considered here and that should ultimately be
part of an evolutionary theory of language. We argue, however, that any
such theory has to address the basic questions of signal–object
association, word formation, and the emergence of a simple syntax or
grammar, for these are the atomic units that make up the edifice of
human language.

Acknowledgments

Thanks to Dominic Welsh, Sebastian Bonhoeffer, Lindi Wahl, Nick
Grassly, and Robert May for stimulating discussion. Support from The
Alfred P. Sloan Foundation, The Florence Gould Foundation, The Ambrose
Monell Foundation, and the J. Seward Johnson Trust is gratefully
acknowledged.

Appendix

Consider two objects, $O_1$ and $O_2$, that can cooccur with two
actions, $A_1$ and $A_2$. Thus, there are four events, $O_1A_1$,
$O_2A_1$, $O_1A_2$, and $O_2A_2$. The nongrammatical approach is to
describe each event with a separate word, $W_1$–$W_4$. The grammatical
approach is to have separate words for objects, $N_1$ and $N_2$, and
actions, $V_1$ and $V_2$. Consider mixed strategies that use the
grammatical system with probability $x$. The active
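matrix, $P$, is given by

$$P = \begin{pmatrix}
1-x & 0 & 0 & 0 & x & 0 & 0 & 0 \\
0 & 1-x & 0 & 0 & 0 & x & 0 & 0 \\
0 & 0 & 1-x & 0 & 0 & 0 & x & 0 \\
0 & 0 & 0 & 1-x & 0 & 0 & 0 & x
\end{pmatrix},$$

where the rows correspond to the four events in the order listed above
and the columns to the signals $W_1, \ldots, W_4$, $N_1V_1$, $N_2V_1$,
$N_1V_2$, and $N_2V_2$.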
The system can be completely understood in analytic terms. The payoff
for language x communicating with language y is
given (with Eq. 2) by $F(x,y) = (2 - x - y)f_1 + (x + y)f_2$, where
$f_1 = 4/(1 + 3\xi)$ and $f_2 = 4/(1 + \xi)^2$. These equations hold
for $x$ and $y$ between 0 and 1. Otherwise, we have
$F(x,0) = F(0,x) = (2 - x)f_1$ and $F(x,1) = F(1,x) = (1 + x)f_2$.
The payoffs for nongrammar and grammar are, respectively,
$F(0,0) = 2f_1$ and $F(1,1) = 2f_2$. Because $f_1 < f_2$ and
$f_2 < 2f_1$, we have the following interesting
dynamics: both x = 0 and x = 1 are
evolutionarily stable strategies that cannot invade any other strategy,
but every mixed strategy, x, is invaded and replaced by
every other strategy, y, if x <
y < 1. Thus, the adaptive dynamics flow toward
grammar. Alternatively, one can also assume that the pure strategies
can understand each other, that is, the passive matrices of all
strategies are the same; in this case, grammar (x = 1)
is the only evolutionarily stable strategy and can beat every other
strategy.

References

1. Chomsky N. Language and Mind. New York: Harcourt Brace Jovanovich; 1972.
2. Pinker S. The Language Instinct. New York: Morrow; 1994.
3. Eco U. The Search for the Perfect Language. London: Fontana; 1995.
4. Seyfarth R, Cheney D, Marler P. Science. 1980;210:801–803.
5. Burling R. Curr Anthropol. 1989;34:25–53.
6. Cheney D, Seyfarth R. How Monkeys See the World. Chicago: Univ. of Chicago Press; 1990.
7. Greenberg J H. Language, Culture and Communication. CA: Stanford Univ. Press; 1971.
8. Cavalli-Sforza L L, Cavalli-Sforza F. The Great Human Diasporas. Reading, MA: Addison–Wesley; 1995.
9. Newport E. Cogn Sci. 1990;14:11–28.
10. Bates E. Curr Opin Neurobiol. 1992;2:180–185.
11. Hurford J R. Cognition. 1991;40:159–201.
12. Lieberman P. The Biology and Evolution of Language. Cambridge, MA: Harvard Univ. Press; 1984.
13. Nobre A, Allison T, McCarthy G. Nature (London). 1994;372:260–263.
14. Aboitiz F, Garcia R. Brain Res Rev. 1997;25:381–396.
15. Hutsler J J, Gazzaniga M S. Neuroscientist. 1997;3:61–72.
16. Deacon T. The Symbolic Species. London: Penguin; 1997.
17. Cavalli-Sforza L L, Feldman M W. Cultural Transmission and Evolution: A Quantitative Approach. Princeton: Princeton Univ. Press; 1981.
18. Yasuda N, Cavalli-Sforza L L, Skolnick M, Moroni A. Theor Popul Biol. 1974;5:123–142.
19. Aoki K, Feldman M W. Proc Natl Acad Sci USA. 1987;84:7164–7168.
20. Aoki K, Feldman M W. Theor Popul Biol. 1989;35:181–194.
21. Cavalli-Sforza L L. Proc Natl Acad Sci USA. 1997;94:7719–7724.
22. Hurford J R. Lingua. 1989;77:187–222.
23. Von Frisch K. The Dance Language and Orientation of Bees. Cambridge, MA: Harvard Univ. Press; 1967.
24. Hauser M D. The Evolution of Communication. Cambridge, MA: Harvard Univ. Press; 1996.
25. Chomsky N. Rules and Representations. New York: Columbia Univ. Press; 1980.
26. Bickerton D. Language and Species. Chicago: Univ. of Chicago Press; 1990.
27. de Saussure F. Cours de Linguistique Generale. Paris: Payot; 1916.
28. Pinker S, Bloom P. In: The Adapted Mind: Evolutionary Psychology and the Generation of Culture. Barkow J, Cosmides L, Tooby J, editors. London: Oxford Univ. Press; 1992. pp. 451–493.
29. Dunbar R. Grooming, Gossip and the Evolution of Language. Cambridge, MA: Harvard Univ. Press; 1997.
30. MacLennan B. In: Artificial Life II: SFI Studies in the Sciences of Complexity. Langton C G, Taylor C D F, Rasmussen S, editors. Redwood City, CA: Addison–Wesley; 1992. pp. 631–658.
31. Hutchins E, Hazelhurst B. How to Invent a Lexicon: The Development of Shared Symbols in Interaction. London: UCL; 1995.
32. Akmajian A, Demers R A, Farmer A K, Harnish R M. Linguistics: An Introduction to Language and Communication. Cambridge, MA: MIT Press; 1997.
33. Hurford J R, Studdert-Kennedy M, Knight C. Approaches to the Evolution of Language. Cambridge, U.K.: Cambridge Univ. Press; 1998.
34. Parisi D. Brain Cogn. 1997;34:160–184.
35. Steels L. Evol Commun J. 1997;1(1):1–34.
36. Oliphant M. BioSystems. 1996;37:31–38.
37. Maynard Smith J, Szathmary E. The Major Transitions in Evolution. New York: Freeman; 1995.
38. Maynard Smith J. Evolution and the Theory of Games. Cambridge, U.K.: Cambridge Univ. Press; 1982.
39. Smith W J. The Behavior of Communicating. Cambridge, MA: Harvard Univ. Press; 1977.
40. Eigen M, Schuster P. The Hypercycle: A Principle of Natural Self-Organisation. Berlin: Springer; 1979.
41. Szathmary E, Maynard Smith J. Nature (London). 1995;374:227–232.
42. Nowak M A, Sigmund K. Acta Appl Math. 1990;20:247–265.
43. Metz J A J, Geritz S A H, Meszena F G, Jacobs F J A, van Heerwaarden J S. In: Stochastic and Spatial Structures of Dynamical Systems. Van Strien S J, Verduyn Lunel S M, editors. Amsterdam: North Holland; 1996. pp. 183–231.
44. Hofbauer J, Sigmund K. Evolutionary Games and Replicator Dynamics. Cambridge, U.K.: Cambridge Univ. Press; 1998.
45. von Humboldt W. Ueber die Verschiedenheit des Menschlichen Sprachbaus. Bonn: Dummlers; 1836.