Can iterated learning explain the emergence of graphical symbols?

Simon Garrod¹,², Nicolas Fay², Shane Rogers², Bradley Walker² & Nik Swoboda³
¹University of Glasgow / ²University of Western Australia / ³Universidad Politécnica de Madrid

This paper contrasts two influential theoretical accounts of language change and evolution – Iterated Learning and Social Coordination. The contrast is based on an experiment that compares drawings produced with Garrod et al.’s (2007) ‘pictionary’ task with those produced in an Iterated Learning version of the same task. The main finding is that Iterated Learning does not lead to the systematic simplification and increased symbolicity of graphical signs produced in the standard interactive version of the task. A second finding is that Iterated Learning leads to less conceptual and structural alignment between participants than observed for those in the interactive condition. The paper concludes with a comparison of the two accounts in relation to how each promotes signs that are efficient, systematic and learnable.

Keywords: Graphical communication, Interaction, Iterated learning, Evolution of communication
Introduction
In recent years there has been a resurgence of interest in the science of language evolution, as reflected in publications in major international journals such as Science and Nature (e.g., Atkinson et al., 2008; Lieberman, 2007). Much of the theory behind this work comes from computer modelling of language evolution assuming a process of either cultural evolution or genetic-cultural co-evolution. One influential model assumes an evolutionary principle analogous to iterated learning in which the language is transmitted vertically down generations of speakers. Other accounts attribute evolution to processes of social coordination within communities of communicators. In this paper we report an experiment that contrasts the two types of account using a communication task that tracks the evolution of graphical signs. The main finding from this experiment is that certain phenomena associated with the rapid evolution of graphical signs in an interactive setting do not manifest themselves in an iterated learning version of the task.

First, we briefly discuss the principal theoretical approaches to the evolution of language. Then we consider recent experimental approaches that investigate the evolution of complex non-linguistic communication systems which highlight the importance of interactive communication. Then we report the experiment itself.

Interaction Studies. © John Benjamins Publishing Company
Theoretical approaches to the evolution of language
By analogy with biological evolution, languages may evolve as a result of genetic (evolutionary) changes which then affect language learning processes (Pinker, 1994). For example, Dediu and Ladd (2007) show that the marked geographical distribution of tone languages is a result of the distribution of recently evolved alleles of the brain growth and development genes ASPM and Microcephalin. They suggest that this reflects genetic influences on the ability of their owners to easily acquire the tone languages. However, genetic change can only affect the evolution of language over very long time-scales and recent modelling work indicates that it is insufficient to account for most historical changes that have occurred in human languages (Chater, Reali & Christiansen, 2009; Berwick, 2009). This has led to consideration of a different kind of mechanism termed cultural evolution (Christiansen & Kirby, 2003; Kirby, Dowman, & Griffiths, 2007). Just as biological adaptation optimizes individual learning, cultural evolution (through the transmission of language across and within generations of speakers) may refine language systems by linguistic selection.

There are two influential modelling approaches to cultural evolution of language. The first assumes that languages evolve as a consequence of vertical transmission from adults to children. For example, the Iterated Learning Model (Kirby, 2002; Kirby & Hurford, 1997, 2002) assumes that as language is transmitted from generation to generation it is incrementally influenced by agents’ innate learning biases and constraints on transmission until the language reaches an equilibrium that reflects these prior linguistic biases. Simulations indicate that under the right conditions iterated learning across many generations will lead to the emergence of structured languages. A typical simulation starts out with a small population of agents with two agents in the system at any one time.
One represents an adult ‘speaker’, the other a child ‘learner’. The learner is exposed to strings of characters produced by the speaker and acquires mappings between these signals and their meanings – pairs of features that can take different values. Initially, the signal mappings produced by the speaker are holistic and unstructured. However, the learner can use this data (signal-meaning mappings) to induce their own representation of the language system which they subsequently produce as adult speakers for the next generation of child learners. Because learners are only exposed to a sub-set of the adult’s productions the ‘languages’ change during transmission across generations of speaker-learners. This data bottleneck turns out to be crucial for the emergence of structured languages. If it is too wide (e.g., exposure to all the adult’s productions) the languages do not change and remain essentially random. If it is too narrow the languages become highly unstable. However, if it is neither too wide nor too narrow the languages start to become compositional and the signals decompose into smaller units representing different aspects of meaning. This is presumably because compositional expressions are more likely to ‘survive’ intergenerational transmission than holistic expressions. In other words, constraints on the learning process lead to an adaptation of the language. In this way iterated learning together with an appropriate data bottleneck can simulate the emergence of structure in language. In addition to computer simulations there is recent experimental evidence with human participants that also shows that iterated learning (under the right circumstances) can lead to the emergence of compositional artificial languages (Kirby, Cornish & Smith, 2008; see also Cornish, this issue). Notice that this account does not depend upon any interaction between agents; it depends only upon one-way vertical transmission.

The second kind of model of cultural evolution assumes that communication systems arise through horizontal transmission among communities of language users.
For example, Steels and colleagues’ social coordination simulations (Steels, 1999; Steels, Kaplan, McIntyre, & Van Looveren, 2002) show that when meaning is locally negotiated, a dominant representation propagates horizontally among a population of interacting agents until the entire community converges upon a shared communication system (see also Barr, 2004). Crucially, this model depends upon interaction and feedback between computer agents. This paper sets out to contrast the two accounts, but in the context of the evolution of graphical as opposed to linguistic communication systems. In particular we concentrate on a feature of the evolution of complex graphical communication systems which relates to the emergence of arbitrary symbolic graphical signs. To illustrate the issue we turn next to work on graphical communication and show how this can be used to address questions about the evolution of human communication systems.
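The horizontal, interaction-driven convergence described above for social coordination models can be illustrated with a minimal naming game. This is a simplified stand-in written for illustration, not a reimplementation of Steels and colleagues' simulations: the single shared referent, the update rule (collapse to the agreed name on success, add the name on failure), and the names `naming_game`, `n_agents` and `max_rounds` are all assumptions.

```python
import random


def naming_game(n_agents=20, max_rounds=20000, seed=0):
    """Minimal naming game for one referent (illustrative assumption, not Steels' model).

    Each agent keeps a set of candidate names. On every round a random speaker
    utters one of its names to a random hearer. If the hearer already knows the
    name, both keep only that name; otherwise the hearer adds it.
    Returns (rounds_used, inventories).
    """
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_agents)]
    next_name = 0  # counter used to invent fresh names
    for rounds in range(1, max_rounds + 1):
        speaker, hearer = rng.sample(range(n_agents), 2)
        if not inventories[speaker]:
            # A speaker with no name for the referent invents one.
            inventories[speaker].add(next_name)
            next_name += 1
        name = rng.choice(sorted(inventories[speaker]))
        if name in inventories[hearer]:
            # Successful interaction: both agents align on the name.
            inventories[speaker] = {name}
            inventories[hearer] = {name}
        else:
            # Failed interaction: the hearer learns the name as a candidate.
            inventories[hearer].add(name)
        # Stop once every agent holds the same single name.
        if all(len(inv) == 1 and inv == inventories[0] for inv in inventories):
            return rounds, inventories
    return max_rounds, inventories


rounds_used, final_inventories = naming_game()
print(rounds_used, final_inventories[0])
```

The point of the sketch is that agreement here comes only from local feedback within pairwise interactions; no agent ever receives the population's conventions from a previous generation.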
Experiments on the evolution of graphical communication
Graphical communication tasks can be especially useful in studying the emergence and evolution of communication systems. This is because such tasks allow us to look at how complex communication systems evolve in the absence of a pre-established system of signs. For example, Galantucci (2005) had pairs of participants communicate their positions on a grid using only graphical means. He found that with time most participants were able to do this even though the graphical interface did not allow them to produce any kind of conventional signs. In other studies participants attempted to communicate one of two pieces of music using purely graphical means (Healey, Swoboda, Umata & King, 2007), or specific items from a list (Garrod, Fay, Lee, Oberlander & McLeod, 2007) by drawing on a whiteboard. In all of these studies participants were not permitted to use conventional language, either spoken or written. Hence, they needed to create a novel communication system from scratch. A consistent finding across studies was that participants rapidly established complex communication systems, and that this process depended crucially on interaction and feedback.

The communication paradigm that we concentrate on in this paper was developed by Garrod et al. (2007) and was based on the parlour game ‘Pictionary’. They had participants communicate a series of easily confusable items (e.g., Art gallery, Museum, Drama, Soap Opera) by drawing on a standard whiteboard. Like the game Pictionary, participants were not allowed to use spoken or written language. In one condition pairs communicated a series of recurring items, alternating between drawing and identifying roles from one game to the next. The changing form of the signs used to convey ‘Clint Eastwood’ across six games of the task is shown in Figure 1. What begins as an iconic depiction of the character evolves into a simplified symbolic form (outline of hat plus triangle).

[Figure 1: drawings from Game 1 to Game 6]
Figure 1. Drawing refinement and convergence for the concept ‘Clint Eastwood’ across six games between a pair of interlocutors playing the Pictionary task (adapted from Garrod et al., 2007)

Crucially, such refinement only occurred when partners were allowed to interact graphically (even if this only involved placing a tick next to the drawing to indicate comprehension). In a control condition in which the participants simply repeated their drawings for an imaginary audience the drawings became more complex and retained their iconic character. A second finding was that with extended interaction communicators’ drawings became increasingly similar, or convergent (see Figure 1). Thus, in this task graphical communication systems emerged and evolved through a process of interactive grounding. Specifically, Garrod et al. (2007) argued that icons rapidly evolve into symbols via interaction; icons help ground shared sign systems and interaction promotes a shift in the locus of information from the sign to the users’ memory of the sign’s usage. This shift in information facilitates the evolution of increasingly simple abstract signs that are easy for communicators to produce and interpret. So this graphical communication experiment might tell us something about how communicators overcome the symbol grounding problem (i.e., how language systems might have emerged out of meaningless symbols; Harnad, 1990). The basic argument is that the symbols evolve through the combination of convergence on increasingly simple forms together with alignment on their meaning.

However, this kind of dyadic communication task only addresses the emergence of local communication systems developed by isolated pairs of communicators. To generalise the task Fay, Garrod, Roberts and Swoboda (in press; see also Fay, Garrod & Roberts, 2008) devised a community based version of the game (see Garrod & Doherty, 1994 for a community language version). They created four 8-person laboratory communities, or microsocieties, via the one-to-one interactions of partners drawn from the same pool.
Participants played six consecutive games with a partner, where each game contained the same to-be-communicated items (16 targets plus 4 distracters, presented in a different random order on each game) that were known to both partners. As in the previous example, drawing and identifying roles alternated from game to game. Participants then switched partners and played a further 6 games with a new partner, and continued to do so until they had interacted with each of the other community members. In other words they eventually constituted a completely connected communication network. This Community condition was contrasted with an Isolated Pair condition, in which participants interacted with the same partner over the same number of games (i.e., 42 games). The task was administered using a virtual whiteboard tool (Healey, Swoboda, & King, 2002), with each participant seated at a computer terminal and drawing input and item selection made via a standard mouse. Crucially, participants were unaware of the identity of their partner in any round. Figure 2 illustrates the global and local evolution of the sign representing ‘Brad Pitt’ within a single Community and a corresponding number of Isolated Pairs.
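The structure of the Community condition can be checked with a small scheduling sketch. The text does not say how the order of partners was determined, so the circle-method rotation below is an assumption used purely for illustration, as are the names `round_robin` and `GAMES_PER_PARTNER`. What the sketch does confirm is that an 8-person community meeting every partner once needs 7 rounds, and that 6 games per partner gives the 42 games stated above.

```python
def round_robin(members):
    """Circle-method schedule in which every member meets every other exactly once.

    Illustrative assumption: the original study's partner ordering is not specified.
    """
    members = list(members)
    n = len(members)
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even community size")
    rounds = []
    for _ in range(n - 1):
        # Pair the i-th member with the i-th member counted from the end.
        rounds.append([(members[i], members[n - 1 - i]) for i in range(n // 2)])
        # Keep the first member fixed and rotate the others by one position.
        members = [members[0], members[-1]] + members[1:-1]
    return rounds


GAMES_PER_PARTNER = 6  # six consecutive games with each partner, as in the text
schedule = round_robin(range(1, 9))  # community members numbered 1 to 8
total_games = len(schedule) * GAMES_PER_PARTNER

for round_number, pairs in enumerate(schedule, start=1):
    print(f"Round {round_number}: {pairs}")
print("Games per participant:", total_games)
```

Because every unordered pair of members appears exactly once, the resulting network is completely connected after the final round.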
[Figure 2: four panels: Community drawings at Round 1; Community drawings at Round 7; Pair drawings at Round 1; Pair drawings at Round 7]
Figure 2. Drawing refinement and convergence for the concept ‘Brad Pitt’ among a Community and between Isolated Pairs at Round 1 and Round 7 of the Pictionary task (from Fay et al., in press). Participant numbers are given in bold on the top right of the drawing
The first drawings of ‘Brad Pitt’ (Round 1) illustrate the diversity of graphical signs; some indicate his American origins, others his frequent casting as a ladies’ man, while others use the rebus principle to represent part of the test item (Community members 5 and 6 draw a large hole in the ground to convey a ‘pit’, whereas Isolated Pair member 4 draws an arrow pointing at an arm ‘pit’). Drawing diversity at Round 1 in the Community condition contrasts sharply with drawing uniformity at Round 7, where all Community members have globally converged on a refined version of person 5’s initial ‘pit’ drawing. Unlike Community members, Isolated Pairs locally converged on a shared sign system, but globally diverged across games. Note that in both conditions groups arrived at a series of signs of equal visual complexity.

To summarise, experiments using the pictionary task demonstrate how graphical communication systems (i.e., systems of graphical signs) naturally evolve as a result of interactive communication. In particular, the signs change from complex iconic representations to much simpler symbolic representations as a consequence of grounding. Furthermore, this evolutionary process is apparent both within isolated pairs of graphical communicators and within micro-societies who interact like a small linguistic community. This evolution is particularly interesting because it offers one explanation for how communicators overcome the symbol grounding problem. Next, we ask whether this kind of semiotic evolution can arise as a consequence of vertical transmission and iterated learning in the same way that it occurs with horizontal transmission and interactive grounding.
Comparing iterated learning with interactive communication
One prediction from iterated learning models is that as information is passed down over generations of teachers and learners so its representation converges on the dominant representation for that population. This has been clearly shown in function learning tasks. For example, Kalish, Griffiths and Lewandowsky (2007) had people learn one of four mathematical functions (positive linear, negative linear, U-shaped and random) by giving them corrective feedback after they had guessed the y-value associated with a given x-value. The participants then completed a test phase (a combination of prior training values and novel values, without corrective feedback) whose output then served as training data for the next generation. This procedure continued across nine generations. Irrespective of the data seen by the first learner on each transmission chain, after only a few generations there was a marked tendency for all participants to converge on a positive linear function. Thus, the dominant prior for positive linear functions was amplified across generations until all agents eventually converged on it.

This iterative evolution process might explain the evolution of graphical representations in the pictionary task. As representations are repeatedly generated by different members of the community so those representations should tend to converge on the form that reflects the strongest prior for that population of communicators. On the assumption that the dominant prior is a simple efficient form, this would lead to the emergence over iterations of simpler forms. In other words, the simplification process might occur in the absence of interactive feedback, even though this did not occur when a single drawer repeatedly produced drawings for an imaginary audience. To test this alternative account we devised an iterated diffusion chain version of the pictionary task to compare the drawings produced in this version of the task with those produced by matched interacting pairs.
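The convergence on a dominant prior described above can be sketched as a small Markov chain over hypotheses. Everything numerical here is an assumption chosen for illustration: the prior values, the noise level `eps`, the single-datum bottleneck, and the idea that each learner samples a hypothesis from its Bayesian posterior. None of these figures comes from Kalish et al.'s data, and the function names are hypothetical.

```python
FUNCTIONS = ["positive linear", "negative linear", "U-shaped", "random"]
PRIOR = [0.7, 0.1, 0.1, 0.1]  # illustrative prior favouring positive linear (assumed)
EPS = 0.4                     # probability that the transmitted datum is misleading (assumed)


def transition_matrix(prior, eps):
    """T[h][h2]: chance that a learner taught by a holder of h ends up holding h2."""
    k = len(prior)
    # lik[h][d]: probability that a teacher holding hypothesis h produces datum d.
    lik = [[1 - eps if d == h else eps / (k - 1) for d in range(k)] for h in range(k)]
    T = [[0.0] * k for _ in range(k)]
    for d in range(k):
        evidence = sum(lik[h][d] * prior[h] for h in range(k))
        # Posterior over hypotheses after seeing datum d (Bayes' rule).
        posterior = [lik[h][d] * prior[h] / evidence for h in range(k)]
        for h in range(k):
            for h2 in range(k):
                T[h][h2] += lik[h][d] * posterior[h2]
    return T


def run_chain(start, prior, eps, generations):
    """Distribution over hypotheses after the given number of generations."""
    T = transition_matrix(prior, eps)
    k = len(prior)
    dist = list(start)
    for _ in range(generations):
        dist = [sum(dist[h] * T[h][h2] for h in range(k)) for h2 in range(k)]
    return dist


# Start every chain from the 'random' function and pass it down nine generations.
start = [0.0, 0.0, 0.0, 1.0]
for name, p in zip(FUNCTIONS, run_chain(start, PRIOR, EPS, 9)):
    print(f"{name}: {p:.3f}")
```

Under these assumptions the distribution over hypotheses moves toward the prior whatever the starting point, which is the sense in which the chain 'amplifies' the learners' bias rather than preserving the first teacher's data.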
Experiment
The experiment contrasted situations in which pairs performed the standard pictionary task with matching chains of participants in an iterated learning version of the task.
Participants
One hundred and five undergraduate students participated in exchange for partial course credit. Participants were randomly allocated to one of two conditions: (1) interacting pairs (15 pairs) or (2) iterated learning chains (15 five-person chains).
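As a quick consistency check on the allocation, the participant total can be reproduced from the two conditions. The variable names are illustrative; the comment on chain length relies on the procedure, which states that each six-generation chain is seeded with the game 1 drawings of an interacting pair, so only five new participants are needed per chain.

```python
N_PAIRS = 15
PEOPLE_PER_PAIR = 2
N_CHAINS = 15
# Generation 1 of each chain reuses drawings from an interacting pair,
# so a six-generation chain needs only five additional participants.
NEW_PEOPLE_PER_CHAIN = 5

interactive_participants = N_PAIRS * PEOPLE_PER_PAIR
chain_participants = N_CHAINS * NEW_PEOPLE_PER_CHAIN
total_participants = interactive_participants + chain_participants

print(interactive_participants, chain_participants, total_participants)
```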
Task and procedure
The experimental task is a graphical analogue of a verbal referential communication task (Krauss & Weinheimer, 1964, 1967). The goal for each participant was to graphically communicate a series of confusable items (e.g., Arnold Schwarzenegger, Brad Pitt, Russell Crowe) in such a way that their partner could identify their intended referent. Like the game Pictionary, participants were prohibited from using letters or numbers in their drawings. The director would draw each item from their ordered list (16 targets plus 4 distracters; see Table 1 for a complete listing) and their partner, the matcher, tried to identify each item from their unordered list of the same items.

In the interacting pair condition (interactive) participants played 6 consecutive games of the Pictionary-like task with the same partner, using the same item set on each game (the same target and distracter items were presented in a different random order on each game). In this condition participants alternated between directing and matching roles from game to game (i.e., one participant was the director on games 1, 3 and 5 and the matcher on games 2, 4 and 6). Importantly, irrespective of directing or matching role, participants were able to graphically interact within any trial. Thus, a matcher might provide feedback to the director by annotating part of their depiction or by offering a graphical alternative.

In the iterated learning condition (chain), participants’ graphical signs were passed along a 6-generation diffusion chain. Each diffusion chain was initiated with the first drawing of each experimental item produced by directors in the interacting pair condition (i.e., graphical signs produced by directors from pairs 1 to 15 at game 1). Thus, at game/generation 1 the interacting pair and diffusion chain conditions were comparable. Generation 2 members then tried to identify each drawing before graphically communicating each item (in a different random order) for generation 3. This continued across 6 generations in the diffusion chain condition. Unlike participants in the interaction condition, who were able to graphically interact across six games of the task, diffusion chain participants attempted to identify the referent of each graphical sign once and produced each graphical sign once before it was passed along the chain. Furthermore, there was no opportunity for interaction between adjacent members of diffusion chains (all drawing activity was recorded using a virtual whiteboard tool and played back to the next member of the diffusion chain).

Table 1. The set of items that directors communicated to matchers (distracter items given in italic). Target and distracter items were fixed across conditions and throughout the experiment
Places: Art Gallery, Parliament, Museum, Theatre
People: Arnold Schwarzenegger, Brad Pitt, Hugh Grant, Russell Crowe
Entertainment: Cartoon, Drama, Sci-Fi, Soap Opera
Objects: Computer Monitor, Microwave, Refrigerator, Television
Abstract: Homesick, Loud, Poverty, Sadness
The task was administered using a virtual whiteboard tool (Healey, Swoboda, & King, 2002). Each participant sat at a computer terminal where drawing input and item selection were made via a standard mouse. For the director, each to-be-depicted item was highlighted in white text. Holding down the left mouse button initiated drawing. Director drawing was restricted to black ink, whereas matcher drawing was restricted to green ink (in order to distinguish original drawing from feedback). By clicking an erase button on the interface participants were able to erase parts of their own drawing and their partner’s drawing (only in the interacting pair condition). All drawing and erasing activity was displayed simultaneously on the director’s and matcher’s shared whiteboards. When the matcher believed they had identified the director’s intended referent they clicked a ‘Got It’ button on the interface. Doing so activated the list of competing referents, allowing the matcher to make their selection. Item selection brought the current trial to an end and initiated the next trial. With the exception of the feedback option the drawing and identification process was identical for participants in both the interaction and chain conditions. Participants were given no explicit feedback with regard to their communication success in either condition. Finally, having participants communicate remotely across networked computers ensured they were unaware of the identity of their partner.
Results
Examples of drawings of ‘parliament’ are shown in Figure 3 for one pair in the interactive condition (3a) and the matched sequence of participants in the iterated learning chain condition (3b). As can be seen in the figure, whereas there is a marked reduction in drawing complexity for the interactive pair there is no such reduction across the iterated learning chain. Notice also how the drawings in the interactive condition converge over time in a way that those in the iterated learning chain do not (e.g., compare drawings from games 5 & 6 in both conditions). Below we consider the results in more detail, in terms of identification accuracy, drawing refinement and convergence of drawings over trials.

[Figure 3: (a) drawings from Game 1 to Game 6; (b) drawings from Generation 1 to Generation 6]
Figure 3a and b. Drawing refinement and convergence for the concept ‘Parliament’ (a) across six games between an interacting pair and (b) across six generations in a diffusion chain. Participant numbers are given in bold on the top right of each drawing

Identification accuracy
The first analysis looked at identification accuracy between pairs of interacting partners (interactive) and non-interacting members of iterated learning chains (chain). The graph in Figure 4 shows the change in identification accuracy (%) over games. The results clearly show a steady increase in identification accuracy across games for both the interacting pairs and the iterated learning chains. These observations are confirmed by analysis of variance (ANOVA). Because the interactive condition contains 6 data points (game 1–6) and the chain condition contains only 5 data points (generations 1–5) the ANOVA was run across game 1 to 5 for both task conditions.

[Figure 4: line graph; y-axis: Accuracy (%), 50 to 100; x-axis: Game, 1 to 6; series: Interactive, Chain]
Figure 4. Identification accuracy for players in the interactive pictionary task (Interactive) compared to accuracy of players in the iterated learning chain (Chain) across games (1–6 for interactive; 1–5 for chain)
For simplicity the data were analyzed by item (items remain constant between the interactive and chain conditions, whereas participants do not). Percent correct scores were entered into a within-items Linear Trends ANOVA with Condition (interactive, chain) and Game (1–5) as factors. This returned a main effect of Condition [F(1,15) = 30.58, ηp² = 0.67, p < .05] and Game [F(1,15) = 49.86, ηp² = 0.77, p < .05] but no Condition by Game interaction [F