P1: FYJ Educational Psychology Review [jepr]

PP317-edpr-363440

November 24, 2001

8:41

Style file version Nov. 19th, 1999

Educational Psychology Review, Vol. 14, No. 1, March 2002 (© 2002)

Commentary

Towards an Integrated View of Learning From Text and Visual Displays

Wolfgang Schnotz1,2

Visuo-spatial text adjuncts such as static or animated pictures, geographic maps, thematic maps, graphs, and knowledge maps that have been analyzed in the articles contained in this special issue provide complex pictorial information that complements the verbal information of texts. These spatial text adjuncts are considered as depictive representations that can support communication, thinking, and learning. An essential precondition of this supportive function is that the visuo-spatial displays interact appropriately with human visual perception and the individual’s cognitive system, which is characterized by prior knowledge, cognitive abilities, and learning skills. Accordingly, effective learning with visuo-spatial text adjuncts can be fostered by instructional design and by adequate processing strategies, both dependent on sufficient understanding of how the human cognitive system interacts with these displays. Perspectives for further research in this area are provided.

KEY WORDS: visual displays; spatial displays; adjunct displays; spatial cognition; representations.

1 Department of General and Educational Psychology, University of Koblenz – Landau, Landau, Germany.

2 Correspondence should be addressed to Wolfgang Schnotz, Department of General and Educational Psychology, University of Koblenz – Landau, Thomas-Nast-Street 44, D-76829 Landau, Germany; e-mail: [email protected].

1040-726X/02/0300-0101/0 © 2002 Plenum Publishing Corporation

Visual displays play an increasingly important role not only in our daily life, but also in the field of learning and instruction where instructional materials today include more pictures, diagrams, and graphs than a few decades ago. From a historical perspective, the use of pictorial information in learning and instruction has a long tradition. In the seventeenth century, Comenius (influenced by John Locke’s sensualism) published his “Didactica Magna,” which emphasized that envisioning information is extremely important for effective learning. Scholars of educational science have followed these basic ideas for centuries. However, it is only since the 1970s that these ideas have been investigated in a systematic way on an empirical basis.

The articles in this special issue provide an excellent survey of this research, especially with regard to the last decade. They focus on rather different kinds of visual displays: static and animated illustrations, geographic, thematic, and knowledge maps, and graphs. These visuals look very different and can serve rather different purposes. Nevertheless, all of these spatial text adjuncts have supportive effects on communication, thinking, and learning. The articles of this volume follow a common intention: They specify under which conditions and why these effects take place.

To attain an integrated picture of the empirical results and the underlying theoretical concepts, I consider some representational issues with regard to visuo-spatial text adjuncts. Then, I briefly analyze the interplay of these spatial displays with human visual perception and higher order cognitive processing. Visual displays are considered tools for communication, thinking, and learning that require specific individual prerequisites (especially prior knowledge and cognitive skills) in order to be used effectively. Based on this analysis I discuss instructional consequences with regard to design issues and with regard to processing strategies. Finally, I discuss further perspectives for research on learning from text and visuals.

REPRESENTATIONAL ISSUES

Symbols and Icons

Representations are objects or events that stand for something else (Peterson, 1996). Texts and visual displays are external representations. These external representations are understood when a reader or observer constructs internal mental representations of the content described in the text or shown in the picture. Comprehension is usually task-oriented. That is, the mental construction is performed by the individual in a way that allows him or her to deal effectively with current or anticipated requirements. In other words, comprehension of text and pictures is a task-oriented construction of mental representations.

Text and visual displays are based on different sign systems. A fundamental distinction between different sign systems was introduced by Peirce (1906): the differentiation between symbols and icons. According to Peirce, symbols have an arbitrary structure and are associated with the designated object by a convention. Words and sentences of natural language are
examples of symbols. Icons, on the contrary, do not have an arbitrary structure. Instead, they are associated with the designated object by similarity. Accordingly, all kinds of static as well as animated realistic pictures (or pictorial illustrations, respectively) and all kinds of geographic maps can be considered icons. However, graphs and knowledge maps do not possess similarity with what they represent, and parts of their structure are specified by convention. One could therefore argue that they are symbols rather than icons. Nevertheless, graphs and knowledge maps have more in common with icons than with symbols. This becomes obvious if one characterizes icons in a more general way: Icons can be defined as signs that are associated with their designated object by common structural properties. Similarity, then, is only one kind of structural commonality that is typical for realistic pictures, pictorial illustrations, and geographic maps. Graphs, on the contrary, are characterized by a more abstract kind of structural commonality with the designated object. Knowledge maps that visualize the macrostructure of a learning content can be considered a pictorial display of the corresponding knowledge structure.

Descriptive and Depictive Representations

According to the different sign systems on which they are based, texts and visual displays belong to different classes of representations: descriptive and depictive representations. Texts (as well as mathematical equations, for example) are descriptive representations. A descriptive representation consists of symbols that have an arbitrary structure and that are associated with the content they represent simply by means of a convention. If we describe something in a text, we use nouns to refer to its parts and we use verbs and prepositions to relate these parts to each other. Visual displays, on the contrary, are depictive representations. A depictive representation consists of iconic signs. These signs are associated with the content they represent through common structural features on either a concrete or more abstract level.

Representations can differ from one another with respect to their informational content and their usability. The informational content of a representation is the set of information that can be extracted from the representation with the help of available procedures (Palmer, 1978). Thus, the informational content of a representation depends both on its structure and on the procedures that operate on the structure. Two representations are informationally equivalent if every information item that can be taken from one representation can also be taken from the other representation (Larkin and Simon, 1987). A piece of information can be relevant for some tasks and
irrelevant for other tasks, so it is possible to define the informational content of a representation with respect to a specific set of tasks. Accordingly, two representations are (in a task-specific sense) informationally equivalent if both allow the extraction of the same information required to solve the specific tasks.

When two representations are informationally equivalent, they can nevertheless differ in their usefulness. Representations are used to retrieve information about what they represent. Depending on the structure of the representation and the processes operating on it, information retrieval (which often means the computation of new information) can be easy or difficult. Representations that are not only informationally equivalent but also equivalent in terms of retrieving information are referred to as computationally equivalent (Larkin and Simon, 1987). Two representations are (in a task-specific sense) computationally equivalent if each task-relevant piece of information can be retrieved from one representation as easily as from the other representation. Shah and Hoeffner (this issue) address this issue with regard to graph design. They argue that there is no specific graph format that is generally better than others. Designing graphs or any other external representations always requires taking into account the interplay between the representation and the task demands. The relevant questions are: What kind of procedures have to be performed to solve the task, and how easily can these procedures be performed with the given representation structure?

Descriptive representations and depictive representations have different uses for different purposes. Descriptive representations have a higher representational power than depictive representations. For example, there is no problem in a descriptive representation to express a general negation (No pets allowed!) or a general disjunction (Seat reserved for infirm people and for mothers with babies).
In a depictive representation, however, one can express only specific negations (e.g., a picture showing a dog combined with a prohibitive sign). Disjunctions are depicted through a series of pictures (e.g., a picture showing an old man plus a picture showing a mother with her baby). On the other hand, depictive representations encompass a specific class of information in its entirety. For example, it is possible to read from a geometric figure (such as a triangle) all its geometric properties. Similarly, a picture of an object is not limited to information about its form, but also has information about its size and its orientation in space. In contrast, in a description it is possible to mention only a few geometric characteristics of a figure or to specify only the form of the object without providing information about its size or orientation. Accordingly, depictive representations are especially useful to gain new information from already known information. A depiction constructed on the basis of already known information contains further information that has not been made explicit so far (Kosslyn, 1994). If one draws a triangle based on information about two sides and one angle, one can read the size of the third side, the size of the other two angles, the area of the triangle, and many more geometric characteristics. The new information is not generated in the sense of a logical conclusion, but rather can be read directly from the representation (Johnson-Laird, 1983). Such read-offs have sometimes been called pseudo-inferences (Garrod, 1985).
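The triangle example can be made concrete with a small sketch. The code below is only an illustration under stated assumptions: the given angle is taken to be the one enclosed by the two known sides, the triangle is "drawn" by assigning coordinates to its corners, and all function names (`depict_triangle`, `read_off`, and the helpers) are hypothetical. The point is that the remaining properties are obtained by measuring the constructed figure, not by deriving them from the description.

```python
import math

def depict_triangle(side_b, side_c, angle_a_deg):
    """'Draw' the triangle: corner A at the origin, B on the x-axis,
    C at distance side_b from A under the given angle (assumed to be
    the angle enclosed by the two known sides)."""
    a = math.radians(angle_a_deg)
    A = (0.0, 0.0)
    B = (side_c, 0.0)
    C = (side_b * math.cos(a), side_b * math.sin(a))
    return A, B, C

def distance(p, q):
    """Length of the segment between two points of the depiction."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def angle_at(p, q, r):
    """Angle in degrees at corner p between the rays p->q and p->r."""
    v = (q[0] - p[0], q[1] - p[1])
    w = (r[0] - p[0], r[1] - p[1])
    dot = v[0] * w[0] + v[1] * w[1]
    cross = v[0] * w[1] - v[1] * w[0]
    return math.degrees(math.atan2(abs(cross), dot))

def read_off(A, B, C):
    """Measure properties that were never stated in the description."""
    twice_area = abs((B[0] - A[0]) * (C[1] - A[1]) - (C[0] - A[0]) * (B[1] - A[1]))
    return {
        "side_a": distance(B, C),      # the third side
        "angle_b": angle_at(B, A, C),  # the other two angles
        "angle_c": angle_at(C, A, B),
        "area": twice_area / 2,
    }

# Description: two sides (3 and 4) and the enclosed angle (90 degrees).
A, B, C = depict_triangle(side_b=3.0, side_c=4.0, angle_a_deg=90.0)
props = read_off(A, B, C)
```

The description passed to `depict_triangle` contains three facts; the depiction it produces implicitly fixes every other geometric property, which `read_off` merely inspects.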

Mental Representations

The distinction between descriptions and depictions can be applied not only to external representations such as texts and pictures, but also to internal mental representations, which are constructed during text and picture comprehension. Current approaches in text comprehension research assume that in understanding a text a reader constructs multiple mental representations. The representations include a surface representation of the text, a propositional text base, a mental model of what the text is about, a communication level, and a genre level (Graesser et al., 1997). The text surface representation includes the detailed linguistic information, such as the specific words, phrases, and syntactic structures. The text base represents the semantic content of the text in the form of propositions. The mental model represents the referential content of the text. In narrative texts this is frequently referred to as a situation model (van Dijk and Kintsch, 1983). The mental model is constrained both by the text base and by domain-specific world knowledge. The communication level represents the pragmatic context of the communication between reader and writer. The genre level captures knowledge about the class of text and its corresponding text function. Evidence for a differentiation between the surface code, the text base, and the mental model level has been found in several investigations (Kintsch et al., 1990; Schmalhofer and Glavanov, 1986).

In picture comprehension, the individual also constructs multiple mental representations. These include a surface structure representation, a mental model, a propositional representation as well as a communication level and a genre level representation. The surface structure representation corresponds to the perceptual (visual) image of the picture in the individual’s mind.
The mental model represents the subject matter shown in the picture on the basis of common structural features (i.e., based on an analogy) between the picture and its referential content. The propositional representation contains information that is read from the model and that is encoded in a propositional format. The communication level represents the
pragmatic context of the pictorial communication, whereas the genre level represents knowledge about the class of pictures and their corresponding functions. Propositional representations, whether constructed during text comprehension or during picture comprehension, are descriptive representations. They consist of internal symbols, which can be decomposed, similar to sentences of natural language, into simple symbols. Accordingly, propositional representations are symbolic representations. Propositional representations can be viewed as internal descriptions in the language of the mind (Chafe, 1994). The perceptual images created as surface structure representations during picture comprehension are internal depictive representations. They retain structural characteristics of the picture and use these inherent structural characteristics as a means of representation. Perceptual images created in picture comprehension are sensory specific because they are linked to the visual modality. The proximity of these images to perception can be attributed to the fact that visual images and visual perceptions are based on the same cognitive mechanisms (Kosslyn, 1994). Mental models, whether constructed during picture comprehension or during text comprehension, are also internal depictive representations, as they have inherent structural features in common with the depicted object. That is, they represent the object based on a structural or functional analogy (Johnson-Laird, 1983; Johnson-Laird and Byrne, 1991). Such an analogy does not imply that mental models represent only spatial information. A mental model can represent, for example, also the increase or decrease of birth rates or incomes during a specific period of time (as it can be described in a text or displayed in a line graph), although birth rates and incomes are certainly not spatial information. Contrary to visual images, mental models are not sensory specific. 
For example, a mental model of a spatial configuration (say, of a room) can be constructed not only by visual perception, but also by auditory, kinesthetic, or haptic perception. Because mental models are not bound to specific sensory modalities, they can be considered as more abstract than perceptual images. On the one hand, a mental model constructed from a picture contains less information than the corresponding visual image because of its abstraction. That is, irrelevant pictorial details that are included in the visual image are omitted from the mental model. On the other hand, the mental model contains more information than the corresponding visual image because it also includes prior knowledge that is not present in the visual perception. For example, a mental model of a brake can contain information about causal relationships that are not explicitly included in the corresponding picture of the brake (Mayer, 1997).


THEORIES OF LEARNING FROM TEXTS AND VISUAL DISPLAYS

Dual Coding Theory, Conjoint Processing Theory, and Multimedia Learning Theory

While text comprehension has been investigated rather intensively during the last 25 years (Graesser et al., 1997), research on comprehension of visual displays has received much less attention. It is a special merit of the authors of this special issue to address explicitly comprehension of visual displays as spatial text adjuncts. Former studies on text and picture comprehension focused primarily on the mnemonic function of pictures in texts. The main result of these studies was that text information is remembered better when it is illustrated by pictures than when there is no illustration (Levie and Lentz, 1982; Levin et al., 1987). Carney and Levin (this issue) emphasize that research throughout the 1990s has also demonstrated that carefully constructed pictures as visual text adjuncts can not only have a decorative function, but also have functions of representation, organization, interpretation, and mnemonic encoding (what they refer to as a “transformation function”).

The facilitating effect of pictures on learning from text was usually explained by Paivio’s dual coding theory (Clark and Paivio, 1991; Paivio, 1986). According to this theory, verbal information and pictorial information are processed in different cognitive subsystems: a verbal system and an imagery system. Words and sentences are usually processed and encoded only in the verbal system, whereas pictures are processed and encoded both in the imagery system and in the verbal system. Thus, the high memory for pictorial information and the memory-enhancing effect of pictures in texts are ascribed to the advantage of dual coding as compared to single coding in memory.

As Verdi and Kulhavy (this issue) point out, the mnemonic function of maps combined with texts can also be explained according to the conjoint processing theory (Kulhavy et al., 1993).
They emphasize the simultaneous availability of text information and pictorial information in working memory: An “intact map” requires only little capacity of working memory and therefore leaves enough capacity for processing of text information. Thus, verbal information and pictorial information can be kept simultaneously in working memory and, accordingly, it is easier for the learner to make cross-connections between the two different codes and later retrieve information.

Mayer (1997) has developed a model of multimedia learning that combines the assumptions of dual coding theory with the notion of multilevel mental representations. A main assumption of Mayer’s model is that verbal and pictorial information are processed in different cognitive subsystems
and that processing results in the parallel construction of two kinds of mental models that are finally mapped onto each other. Accordingly, an individual understanding a text with pictures selects relevant words, constructs a propositional representation or text base, and then organizes the selected verbal information into a verbal mental model of the situation described in the text. Similarly, the individual selects relevant images, creates what is called a pictorial representation or image base, and organizes the selected pictorial information into a visual mental model of the situation shown in the picture. The final step is to build connections through a one-to-one-mapping between the text-based model and the picture-based model. Integrative processing is most likely to occur if verbal and visual information are simultaneously available in working memory, that is, the corresponding entities in the two models are mentally available at the same time (Baddeley, 1992; Chandler and Sweller, 1991).

An Integrative Model of Text and Picture Comprehension

The parallelism of text processing and picture processing assumed in Mayer’s model is problematic, however, because texts and pictures are based on different sign systems and use quite different principles of representation. Thus, Schnotz and Bannert (1999) have proposed an integrative model of text and picture comprehension that gives more emphasis to representational principles (cf. Schnotz, 2001). An outline of this model is shown in Fig. 1. It consists of a descriptive (left side) and a depictive (right side) branch of representations. The descriptive branch comprises the (external) text, the (internal) mental representation of the text surface structure, and the propositional representation of the text’s semantic content. The interaction between these descriptive representations is based on symbol processing.
The depictive branch comprises the (external) picture, the (internal) visual perception or image of the picture, and the (also internal) mental model of the subject matter presented in the picture. The interaction between these depictive representations is based on processes of structure mapping due to the structural correspondences (i.e., analogy relations) between the representations (Gentner, 1989).

Fig. 1. Schematic illustration of an integrative model of text and picture comprehension.

In text comprehension, the reader constructs a mental representation of the text surface structure, generates a propositional representation of the semantic content, and constructs from this so-called text base a mental model of the described subject matter (van Dijk and Kintsch, 1983; Schnotz, 1994; Weaver et al., 1995). These construction processes are based on an interaction of bottom-up and top-down activation of cognitive schemata that have both a selective and an organizing function. The selection of task-relevant information is performed by top-down processing, whereas the organizing function is based on the interaction of bottom-up and top-down processing. This interaction results in a specific configuration of activated cognitive schemata that fits best to the incoming information and organizes it into a coherent structure. Text information is processed with regard to morphologic and syntactic aspects by verbal organization processes that lead to a mental
representation of the text surface structure. This text surface structure in turn triggers conceptual organization processes that result in a structured propositional representation and a mental model.

Picture comprehension is based on a specific interplay between visual perception and higher-order cognitive processing. In picture comprehension, the individual first creates through perceptual processing a visual mental representation of the picture’s graphic display. Then, the individual constructs through semantic processing a mental model and a propositional representation of the subject matter shown in the picture. In perceptual processing, task-relevant information is selected through top-down activation of cognitive schemata and then visually organized through automated visual routines (Ullman, 1984). Perceptual processing includes identification and discrimination of graphic entities, as well as the visual organization of these entities according to the Gestalt laws (Wertheimer, 1938; Winn, 1994). The resulting mental representation is the visual perception of the picture in the imagery part of working memory, the so-called visual sketchpad (Baddeley, 1992; Kruley et al., 1994; Sims and Hegarty, 1997). Perception and imagery are based on the same cognitive mechanisms; therefore, the same kind of representation can also be referred to as a perceptual image if the representation is created on the basis of internal world knowledge rather than external sensory data (Kosslyn, 1994; Shepard, 1984).

Semantic processing is required to understand a picture as opposed to merely perceiving it. During this process, the individual constructs a mental model of the depicted subject matter through a schema-driven mapping process in which graphic entities are mapped onto mental model entities and spatial relations are mapped onto semantic relations.
In other words, picture comprehension is a process of analogical structure mapping between a system of visuo-spatial relations and a system of semantic relations (Falkenhainer et al., 1989–90; Schnotz, 1993). This mapping can take place in both directions; it is possible to construct a mental model bottom-up from a picture, and it is also possible to evaluate an existing mental model top-down with a picture. While understanding pictorial illustrations or maps, the individual can use cognitive schemata of everyday perception. While understanding graphs and knowledge maps, however, the individual requires specific cognitive schemata (so-called graphic schemata) in order to be able to read off information from the visuo-spatial configuration (Lowe, 1993; Pinker, 1990).

When a mental model has been constructed, new information can be read from the model through a process of model inspection. The new information gained in this way is made explicit by encoding it in a propositional format. The new propositional information is used to elaborate the propositional representation. In other words, there is a continuous interaction between the propositional representation and the mental model (Baddeley,
1992). In text comprehension, the starting point of this interaction is a propositional representation, which is used to construct a mental model. When understanding pictures, the starting point of the interaction is a mental model which is used to read new information that is added to the propositional representation. Besides the interaction between the propositional representation and the mental model, there may also be an interaction between the text surface representation and the mental model, and between the perceptual representation of the picture and the propositional representation. This is shown in Fig. 1 by the dotted diagonal arrows. Accordingly, there is no one-to-one relationship between external and internal representations. A text as an external descriptive representation leads to both an internal descriptive and an internal depictive mental representation. A picture, on the other hand, as an external depictive representation leads to both an internal depictive and an internal descriptive mental representation. Formally, one can consider the construction of a propositional representation and of a mental model as a kind of dual coding. Nevertheless, my view is fundamentally different from the traditional dual coding theory. First, dual coding presumably applies not only to the processing of pictures, but also to the processing of words and texts. Second, the construction of a mental model is regarded as more than simply adding a further code that elaborates the mental representation and provides a quantitative advantage compared to a single code. Rather, the essential point is that propositional representations and mental models are based on different sign systems and different principles of representation that complement one another.
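The interplay between a propositional representation and a mental model can be illustrated with a deliberately simple sketch. It is a toy under stated assumptions, not an implementation of the model in Fig. 1: propositions are hypothetical tuples of the form `("left_of", x, y)`, the "mental model" is a one-dimensional left-to-right arrangement, and the names `build_model` and `inspect_model` are invented for illustration. The standard-library `graphlib` module (Python 3.9 or later) is used only to find one arrangement consistent with the propositions.

```python
from graphlib import TopologicalSorter

def build_model(propositions):
    """Model construction: turn descriptive propositions into one
    depictive arrangement (a left-to-right list of entities)."""
    sorter = TopologicalSorter()
    for relation, x, y in propositions:
        if relation != "left_of":
            raise ValueError(f"unsupported relation: {relation}")
        # x is a predecessor of y, so x ends up further left than y.
        sorter.add(y, x)
    # Raises graphlib.CycleError if the propositions contradict each other.
    return list(sorter.static_order())

def inspect_model(model):
    """Model inspection: read every left-of relation off the arrangement
    and encode it again in propositional format."""
    return {
        ("left_of", model[i], model[j])
        for i in range(len(model))
        for j in range(i + 1, len(model))
    }

# Propositional representation as it might result from reading a text.
propositions = [
    ("left_of", "fork", "plate"),
    ("left_of", "plate", "knife"),
]

model = build_model(propositions)
# Information that was never stated but can be read from the model.
new_information = inspect_model(model) - set(propositions)
```

Here the relation between fork and knife is not produced by a rule of inference over the propositions; it is read from the constructed arrangement and then added to the propositional representation. When the propositions leave the arrangement underdetermined, the sketch simply commits to one consistent arrangement, so relations read from it may go beyond what the propositions strictly entail.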

Feature Information and Structure Information

Verdi and Kulhavy (this issue) point out that two kinds of information can be distinguished in visual displays: feature information and structure information. Feature information is provided by symbols or pictograms representing an external referent. It tells us what exists at a specific place. Furthermore, feature information helps to activate appropriate prior knowledge schemata. Structure information is provided by the place where a specific feature is located on the display. To summarize: Feature and structure information tell the observer what exists (or happened) where.

The distinction between what and where is also supported by results of brain research and the practice of neurology. Visual information about object attributes and visual information about spatial configurations are processed in different cognitive subsystems: a what-system and a where-system. The what-system contains knowledge about the appearance of objects and is used to identify objects. The where-system contains knowledge about spatial
directions and distances between objects, that is, knowledge about spatial structures and their location (Kosslyn, 1991). One can find patients with partial brain damage who can localize objects but are unable to say what these objects are, and one can find patients who can identify objects but cannot localize them (Farah et al., 1988).

TOOLS FOR COMMUNICATION, THINKING, AND LEARNING

The articles in this special issue deal with various kinds of visual displays: static or animated pictures, geographic maps, thematic maps, graphs, and knowledge maps. These visual displays can be considered as complex pictorial signs, and like other kinds of signs, they can help to communicate information and support thinking or learning processes. Like sentences of natural language (as complex verbal signs), these pictorial signs can be analyzed under a syntactic, a semantic, and a pragmatic perspective. The syntactic perspective deals with the well-formedness of signs. The semantic perspective deals with the meaning of the pictorial signs. The pragmatic perspective deals with the use of pictorial signs in communication, thinking, and learning.

Syntactic constraints on the well-formedness of visual displays derive from the need to maintain similarity or a structural commonality with what they represent and from the requirements of human perception. The syntactic constraints of pictorial illustrations and geographic maps are based on similarity (cf. Carney and Levin, this issue; Verdi and Kulhavy, this issue). The syntactic constraints of graphs and knowledge maps derive from the conventional representation formats (e.g., pie charts, bar charts, line graphs, scatter plots, and box plots), from structural commonalities with the represented subject matter, and from the mechanism of human visual perception, especially the Gestalt laws. As O’Donnell et al.
(this issue) point out, knowledge maps that were constructed according to the Gestalt laws resulted in better learning than other kinds of knowledge maps. Semantic constraints of visual displays are implicitly addressed by all the contributors when they analyze conditions that make comprehension and learning with these displays easier. Finally, all articles deal with the pragmatic perspective on pictorial signs. When Carney and Levin (this issue) distinguish between representation, organization, interpretation, transformation, and decoration as possible functions of pictorial illustrations, they refer to the pragmatic perspective. Similarly, Verdi and Kulhavy (this issue) point out the facilitative function of maps.

As an amendment, it should be noted that both pictorial illustrations and maps can also serve as tools for thinking. An example is the use of pictures by Abraham Wald during World War II: In order to find out which areas of airplanes required more armor, he copied the bullet holes from a large number of returning aircraft on an outline picture of the airplane and put extra armor everyplace else (Wainer, 1992). Another example is Dr John Snow’s use of a map of Central London in 1854, when he plotted the location of deaths from cholera and found that the disease came from the Broad Street water pump (Tufte, 1983). O’Donnell et al. (this issue) also report on the use of knowledge maps as a tool for communication and thinking: These maps can be used, for example, in a counselling setting where a counsellor and a client try to attain a common understanding of a problem situation. Similarly, graphs can be used both by novices and experts in order to communicate about a problem. Shah and Hoeffner (this issue) emphasize that graphs should generally be designed according to their intended usage. If individuals should understand the interaction between three variables, for example, one three-dimensional display would be better suited than there exact values from a graph.

INDIVIDUAL DIFFERENCES

Visuo-spatial text adjuncts and other forms of visual displays can support communication, thinking, and learning only if they interact appropriately with the individual’s cognitive system. Accordingly, the effects of visuo-spatial adjunct aids depend on prior knowledge, cognitive abilities, and learning skills. These factors are, of course, age-dependent. Children in the kindergarten age range are generally skilled in understanding realistic pictures, whereas verbal literacy (as a result of learning to read and a prerequisite of reading to learn) is attained in primary school. Finally, visual literacy, which includes understanding graphs, is acquired (if at all) still later (Shah and Hoeffner, this issue). Carney and Levin (this issue) point out that pictorial illustrations can have a decorative and motivational function in materials for first graders who learn to read. However, these pictures should not illustrate what children are expected to understand from reading the text. Individuals seem to be experts in cognitive economy. They are therefore skillful in finding shortcuts for solving cognitive tasks. Generally speaking, one should not provide alternative routes for understanding when the learner should be trained in understanding a specific kind of representation. Among readers, visual displays can have a supporting function for understanding and learning difficult materials. The more difficult the learning content is, the more frequently learners look at adjunct visual displays (Carney and Levin, this issue). The supportive function of visuo-spatial adjuncts seems to be especially evident with learners of low
prior knowledge and low verbal skills. Previous research has pointed out that comprehension among learners with low domain knowledge (but sufficient visuo-spatial cognitive skills) is increased when pictures are added to a text. Learners with high domain knowledge, on the contrary, are able to construct a mental model without pictorial support (Mayer, 1997). Carney and Levin (this issue) draw a similar conclusion when they argue that a text that is simple and can be easily envisioned by the learner does not need additional pictures. However, if the subject matter is complex and/or if learners have low prior knowledge, then visual displays increase comprehension. This is true not only for pictorial illustrations, but also for knowledge maps. Knowledge maps are especially helpful for learners with low prior knowledge and for learners with low verbal skills (O’Donnell et al., this issue). Verdi and Kulhavy (this issue) also emphasize the role of prior knowledge: Learners with high prior knowledge recall map information better than learners with low prior knowledge. However, maps are highly familiar both for novices and for experts. Both groups show comparable ability in processing map information. Abstract kinds of visual displays such as graphs, however, require knowledge about specific forms of representations. The individual has to acquire specific cognitive schemata (graph-schemata) in order to understand these so-called logical pictures (Pinker, 1990; Shah and Hoeffner, this issue).

INSTRUCTIONAL CONSEQUENCES

Effective learning with visuo-spatial text adjuncts can be fostered through instructional design by the teacher or author of instructional material and through adequate processing strategies by the learner. The contributions of this Special Issue include both of these perspectives.

Instructional Design

All contributors agree that effective learning with visuo-spatial text adjuncts is not dependent on the professional appearance of visuals, but rather on the relation between these displays and the task demands and on the learner’s prior knowledge and cognitive abilities. Instructional design of visual displays therefore requires sufficient understanding of how the human cognitive system interacts with these displays. The authors agree on various points with regard to instructional design: First, if verbal and pictorial information is provided to learners, both kinds of information should be coherent with some semantic overlap (Carney
and Levin, this issue; Shah and Hoeffner, this issue; Mayer and Moreno, this issue). Second, both kinds of information should enter working memory simultaneously in order to make interconnections between them more likely; simultaneous availability of information requires spatial or temporal contiguity (Mayer and Moreno, this issue). Third, semantic processing of verbal and pictorial information requires activation of thematically related prior knowledge. Access to prior knowledge is facilitated in comprehending geographic or thematic maps and graphs if meaningful symbols, colors, or icons are used that can be easily associated with their referent (Verdi and Kulhavy, this issue). Access to prior knowledge is more difficult if a legend is used in a map or a graph because this requires an additional step in order to associate a color or a visual pattern with its external referent (Shah and Hoeffner, this issue). Fourth, if possible, verbal and pictorial information should not both enter working memory through the visual channel in order to avoid cognitive overload. Fifth, the same verbal information should not be presented simultaneously through the visual and the auditory channel (Mayer and Moreno, this issue). There is also agreement that visual displays, ranging from concrete pictorial illustrations to abstract graphs or knowledge maps, should be designed according to the requirements of the human perceptual apparatus. They should, for example, include visual features that can be easily distinguished, and they should arrange visual features according to the Gestalt laws. Finally, visual displays should be designed according to the aim of communication or of teaching and learning. Sometimes, text and picture cannot be presented simultaneously. Based on their own research, Verdi and Kulhavy (this issue) suggest that in this case the picture should be presented first and the text later.
The authors argue that when text processing occurs first, most of the capacity of working memory is used, leaving little capacity for processing the following picture. Processing a picture first requires little space in working memory and, thus, leaves enough capacity for processing text. However, I believe that an alternative and probably simpler explanation would be that a text never describes a subject matter with enough detail to allow only one kind of envisioning. A mental model or visual image constructed only from the text is therefore likely to differ from the picture presented afterwards and, thus, interferes with the picture (cf. Fig. 1). This kind of interference can be avoided by presenting the picture before the text.

Processing Strategies

Visual displays can support communication, thinking, and learning. However, they do not provide this support automatically. Learners often
underestimate the informational content of pictures and believe that a short look would be enough for understanding and for extracting the relevant information (Mokros and Tinker, 1987; Weidenmann, 1989). These individuals do not engage in a schema-driven analysis of the depictive representation, do not read enough information, and thus do not elaborate their propositional representation of the subject matter sufficiently (cf. Fig. 1). Thus, it is not enough that learners possess the cognitive schemata of everyday knowledge required for understanding pictorial illustrations or the cognitive schemata required for understanding graphs. These schemata also must be activated (Shuell, 1988). All articles contained in this special issue emphasize the importance of active cognitive processing of visuo-spatial adjuncts requiring appropriate processing strategies. Carney and Levin (this issue) show that the functions of representation, organization, interpretation, and transformation require appropriate encoding. Similarly, animations are only beneficial for learning if the individual engages in active cognitive processing (Mayer and Moreno, this issue). Shah and Hoeffner (this issue) point out that learners need visual literacy. They emphasize that individuals have difficulty mapping one form of representation into another one, and argue that learning from graphs should be considered a special metacognitive task. The authors also make suggestions as to how individuals could attain visual literacy. Several authors also emphasize that learners should operate with visual displays in an active way that results in a controllable product. The effectiveness of knowledge maps also seems to result from active cognitive processing. Knowledge maps require that learners use a restricted set of semantic relations: The individual has to subsume specific semantic relations from the text under one of the higher-order relations provided in knowledge maps.
This requires deeper semantic processing than simply copying the name of a semantic relation from the text to a link in the conceptual map. O’Donnell et al. (this issue) report that a spatial display of a knowledge map organized in a left–right order can mistakenly trigger a simple text reading strategy. In this case, the knowledge map can be harmful for learning because the map is processed too superficially. The authors show that active construction of knowledge maps by learners helps them to reflect on the material more deeply and enables them to communicate the content more clearly. The use of knowledge maps requires specific strategies that must be learned. It is remarkable that even brief training frequently results in positive transfer, even to learning situations where no knowledge maps are available. One can assume that such training helps learners focus on the structural aspects of knowledge even without an external knowledge map. In other words, learners acquire a general ability to structure and organize knowledge as a general cognitive tool.
FURTHER PERSPECTIVES

The articles contained in this special issue show that visual displays are powerful devices to support teaching and learning as well as other kinds of communication. There is converging evidence that specific principles of designing visuals and of combining them with texts are important to support comprehension and learning. There is also converging evidence that prior knowledge about representation formats and active processing of visuals based on adequate strategies are crucial for effective support of comprehension and learning. Nevertheless, there are still a number of open questions that require investigation. Learning from verbal and pictorial information has generally been considered as (potentially) beneficial for learning. However, research on knowledge acquisition from multiple representations has made it obvious that the use of more representational formats has not only cognitive benefits but also cognitive costs (Ainsworth, 1999). Learning from verbal and pictorial information has also frequently been associated with individual representational preferences and cognitive styles. Examples of this are the distinction between visualizers and verbalizers and between field independence and field dependence (cf. Verdi and Kulhavy, this issue). Research on the relevance of such preferences and cognitive styles, however, has not yet produced clear results, and it is unknown whether matching the learners’ individual preferences really will result in better learning. Accordingly, it remains an open question whether we need to adapt texts and visuo-spatial adjuncts to the assumed aptitude–treatment effects hypothesized by some researchers. The development of new technologies is a specific challenge for the use of verbal and pictorial information in learning and instruction. While traditional print material allows only static visual displays to be presented, computer-based instruction makes it possible to show animated displays.
Many practitioners and researchers consider animation an ideal form for presenting change and development. Empirical results, however, do not generally support this assumption. Further research on the conditions for using animations effectively is required. This research should be based on a well-supported cognitive theory (cf. Mayer and Moreno, this issue). The development of new technologies also casts some well-known kinds of visual displays into a new light. Knowledge maps, for example, can be used not only as scaffolds for generating semantic macrostructures; they can also be used as external visual models of an information space. Thus, tools for communication, thinking, and learning also become tools for information search. Despite these developments, I doubt whether we have to repeat the research on learning from verbal and pictorial information with print media
under the conditions of the new electronic media. I also doubt that the design principles for the new media will be fundamentally different from the design principles developed for the traditional print media. The essential point in this context is whether there are really new qualities emerging from the use of new technologies that are relevant for cognitive processing. Another essential point is whether and in what respect learners might differ in the future from today’s learners. The general constraints of the human cognitive system will certainly not change as a result of new technologies. However, future learners could have new attitudes and processing habits. As humans are exposed to an increasing mass of information that frequently dazzles the eyes, ears, and mind, new standards of presenting information emerge. For example, television stations present short, dynamic, and entertaining information sequences, and most mass media provide an increasing amount of pictorial information that allows easy and rapid information processing. One can assume that learners who have much experience with electronic media and with new kinds of information presentation might have new expectations, new attitudes, and new processing habits that affect their cognitive processing. Cognitive processing, however, is only one factor that contributes to effective learning. Affective and motivational factors must be considered as well. If new media appeal to young learners and if these learners are motivated to interact with a computer-based learning environment longer than with traditional print materials (because it is “more fun”), then this could justify the use of new technologies even when the cognitive effects would be about the same as with traditional print media.
Research on learning from text with visuo-spatial adjuncts will have to be conducted not only from a cognitive, but also from an affective, motivational, and social perspective to reach adequate educational decisions.

REFERENCES

Ainsworth, S. (1999). The functions of multiple representations. Comput. Educ. 33: 131–152.
Baddeley, A. (1992). Working memory. Science 255: 556–559.
Chafe, W. L. (1994). Discourse, Consciousness, and Time, University of Chicago Press, Chicago.
Chandler, P., and Sweller, J. (1991). Cognitive load theory and the format of instruction. Cogn. Instr. 8: 293–332.
Clark, J. M., and Paivio, A. (1991). Dual coding theory and education. Educ. Psychol. Rev. 3: 149–210.
Falkenhainer, B., Forbus, K. D., and Gentner, D. (1989–90). The structure-mapping engine: Algorithm and examples. Artif. Intell. 41: 1–63.
Farah, M. J., Hammond, K. M., Levine, D. N., and Calvanio, R. (1988). Visual and spatial mental imagery: Dissociable systems of representation. Cogn. Psychol. 20: 439–462.
Garrod, S. C. (1985). Incremental pragmatic interpretation versus occasional inferencing during fluent reading. In Rickheit, G., and Strohner, H. (eds.), Inferences in Text Processing, North-Holland, Amsterdam, pp. 161–181.
Gentner, D. (1989). The mechanisms of analogical learning. In Vosniadou, S., and Ortony, A. (eds.), Similarity and Analogical Reasoning, Cambridge University Press, Cambridge, England, pp. 197–241.
Graesser, A. C., Millis, K. K., and Zwaan, R. A. (1997). Discourse comprehension. Annu. Rev. Psychol. 48: 163–189.
Johnson-Laird, P. N. (1983). Mental Models. Towards a Cognitive Science of Language, Inference, and Consciousness, Cambridge University Press, Cambridge, England.
Johnson-Laird, P. N., and Byrne, R. M. J. (1991). Deduction, Erlbaum, Hillsdale, NJ.
Kintsch, W., Welsch, D., Schmalhofer, F., and Zimny, S. (1990). Sentence memory: A theoretical analysis. J. Mem. Lang. 29: 133–159.
Kosslyn, S. M. (1991). A cognitive neuroscience of visual cognition: Further developments. In Logie, R. H., and Denis, M. (eds.), Mental Images in Human Cognition, North-Holland, Amsterdam, pp. 351–381.
Kosslyn, S. M. (1994). Image and Brain. The Resolution of the Imagery Debate, MIT Press, Cambridge, MA.
Kruley, P., Sciama, S. C., and Glenberg, A. M. (1994). On-line processing of textual illustrations in the visuospatial sketchpad: Evidence from dual-task studies. Mem. Cogn. 22: 261–272.
Kulhavy, R. W., Stock, W. A., and Kealy, W. A. (1993). How geographic maps increase recall of instructional text. Educ. Technol. Res. Dev. 41: 47–62.
Larkin, J. H., and Simon, H. A. (1987). Why a diagram is (sometimes) worth ten thousand words. Cogn. Sci. 11: 65–99.
Levie, H. W., and Lentz, R. (1982). Effects of text illustrations: A review of research. Educ. Commun. Technol. J. 30: 195–232.
Levin, J. R., Anglin, G. J., and Carney, R. N. (1987). On empirically validating functions of pictures in prose. In Willows, D. M., and Houghton, H. A. (eds.), The Psychology of Illustration, Vol. 1, Springer, New York, pp. 51–86.
Lowe, R. K. (1993). Constructing a mental representation from an abstract technical diagram. Learn. Instr. 3(3): 157–179.
Mayer, R. E. (1997). Multimedia learning: Are we asking the right questions? Educ. Psychol. 32: 1–19.
Mokros, J. R., and Tinker, R. F. (1987). The impact of microcomputer based labs on children’s ability to interpret graphs. J. Res. Sci. Teach. 24(4): 369–383.
Paivio, A. (1986). Mental Representations: A Dual Coding Approach, Oxford University Press, Oxford, England.
Palmer, S. E. (1978). Fundamental aspects of cognitive representation. In Rosch, E., and Lloyd, B. B. (eds.), Cognition and Categorization, Erlbaum, Hillsdale, NJ, pp. 259–303.
Peirce, C. S. (1906). Prolegomena to an apology for pragmaticism. Monist 492–546.
Peterson, D. (1996). Forms of Representation, Intellect, Exeter.
Pinker, S. (1990). A theory of graph comprehension. In Freedle, R. (ed.), Artificial Intelligence and the Future of Testing, Erlbaum, Hillsdale, NJ, pp. 73–126.
Schmalhofer, F., and Glavanov, D. (1986). Three components of understanding a programmer’s manual: Verbatim, propositional, and situational representations. J. Mem. Lang. 25: 279–294.
Schnotz, W. (1993). On the relation between dual coding and mental models in graphics comprehension. Learn. Instr. 3: 247–249.
Schnotz, W. (1994). Aufbau von Wissensstrukturen. Untersuchungen zur Kohärenzbildung beim Wissenserwerb mit Texten, Psychologie Verlags Union, Weinheim.
Schnotz, W. (2001). Sign systems, technologies, and the acquisition of knowledge. In Rouet, J. F., Levonen, J., and Biardeau, A. (eds.), Multimedia Learning—Cognitive and Instructional Issues, Elsevier, Amsterdam, pp. 9–29.
Schnotz, W., and Bannert, M. (1999). Support and interference effects in learning from multiple representations. In Bagnara, S. (ed.), European Conference on Cognitive Science, 27th–30th Oct. 1999, Istituto di Psicologia, Consiglio Nazionale delle Ricerche, Rome, Italy, pp. 447–452.
Shepard, R. N. (1984). Ecological constraints on internal representations: Resonant kinematics of perceiving, thinking, and dreaming. Psychol. Rev. 91: 417–447.
Shuell, T. J. (1988). The role of the student in learning from instruction. Contemp. Educ. Psychol. 13: 276–295.
Sims, V. K., and Hegarty, M. (1997). Mental animation in the visuospatial sketchpad: Evidence from dual-task studies. Mem. Cogn. 25: 321–332.
Tufte, E. R. (1983). The Visual Display of Quantitative Information, Graphics Press, Cheshire, CT.
Ullman, S. (1984). Visual routines. Cognition 18: 97–159.
Van Dijk, T. A., and Kintsch, W. (1983). Strategies of Discourse Comprehension, Academic Press, New York.
Wainer, H. (1992). Understanding graphs and tables. Educ. Res. 21(1): 14–23.
Weaver, C. A., III, Mannes, S., and Fletcher, C. R. (eds.). (1995). Discourse Comprehension, Erlbaum, Hillsdale, NJ.
Weidenmann, B. (1989). When good pictures fail: An information-processing approach to the effects of illustrations. In Mandl, H., and Levin, J. R. (eds.), Knowledge Acquisition From Text and Pictures, North-Holland, Amsterdam, pp. 157–170.
Wertheimer, M. (1938). Laws of Organization in Perceptual Forms in a Source Book for Gestalt Psychology, Routledge & Kegan Paul, London.
Winn, W. D. (1994). Contributions of perceptual and cognitive processes to the comprehension of graphics. In Schnotz, W., and Kulhavy, R. (eds.), Comprehension of Graphics, Elsevier, Amsterdam, pp. 3–27.
