Instructional Science 32: 115–132, 2004. © 2004 Kluwer Academic Publishers. Printed in the Netherlands.
Assessment of Cognitive Load in Multimedia Learning with Dual-Task Methodology: Auditory Load and Modality Effects
ROLAND BRÜNKEN1,∗, JAN L. PLASS2 AND DETLEV LEUTNER3
1 Georg Elias Müller Institut of Psychology, Georg August University Göttingen, Waldweg 26, D-37073 Göttingen, Germany; 2 The Steinhardt School of Education, New York University (E-mail: [email protected]); 3 Department of Learning Psychology, University Duisburg-Essen (E-mail: [email protected]) (∗ author for correspondence: e-mail: [email protected])
Abstract. Using cognitive load theory and cognitive theory of multimedia learning as a framework, we conducted two within-subject experiments with 10 participants each in order to investigate (1) if the audiovisual presentation of verbal and pictorial learning materials would lead to a higher demand on phonological cognitive capacities than the visual-only presentation of the same material, and (2) if adding seductive background music to an audiovisual information presentation would increase the phonological cognitive load. We employed the dual-task methodology in order to achieve a direct measurement of cognitive load in the phonological system. In both experiments, the modality effect could be confirmed in the patterns of secondary task performance and in the primary learning task.
Over the past decade, educational researchers have conducted a large number of experiments on the instructional design of learning scenarios in order to determine under which conditions learners benefit most from multimedia learning materials (e.g., Mayer 2001). In these experiments, researchers have found various main effects of instructional design on learning outcomes, and differential effects of the learning material presentation depending on different learner characteristics (e.g., Plass, Chun, Mayer and Leutner 1998; Plass, Chun, Mayer and Leutner 2003). Recently, the focus of educational research in multimedia learning has shifted from research on individual instructional design effects to the development and evaluation of more integrated theoretical models of cognitive processing in multimedia learning (Mayer 2001; Schnotz 2001; Sweller 1999). These models are based on assumptions of information processing derived from cognitive theory, such as schema theory (Anderson 1983), generative theory (Wittrock 1990), dual-coding theory (Paivio 1986), and working memory models (Baddeley 1986). These assumptions are integrated into recently published theoretical frameworks, such as cognitive load theory (CLT: Sweller 1999), and cognitive theory of multimedia learning (CTML: Mayer 2001), that are suitable for the explanation of both instructional design effects (such as a modality effect or
a split-attention effect) and individual differences in information processing (such as a prior knowledge effect or a spatial ability effect). The theoretical models CLT or CTML are able to successfully inform the design of learning materials and the management of the learning process. However, most of the empirical evidence underpinning these models is quantity-related, with researchers interested in quantifying the amount of knowledge acquisition. That is, the variant of learning materials used by the experimental group that demonstrates the most knowledge acquisition is seen as the most beneficial. Further, as far as the instructional variants have been derived from cognitive theory, such quantity-based experimental results are seen as empirical validation of the underlying models. One example of this line of quantity-based reasoning on multimedia design is the so-called modality effect. This well-established effect states that learners who receive textual and pictorial materials audiovisually, i.e., using visual images and narration of the text, acquire more knowledge than learners who receive the same material presented only visually, i.e., as visual images and on-screen text (Brünken and Leutner 2001; Mayer and Moreno 1998; Mousavi, Low and Sweller 1995; Tabbers 2002; Tindall-Ford, Chandler and Sweller 1997). Because of this finding and other recent work on working memory models (Baddeley 1986; Baddeley and Logie 1999; Miyake and Shah 1999), it is assumed that the cognitive processing of visually and acoustically presented materials takes place in two separate subsystems of working memory that command separate, independent cognitive resources. Based on this view of working memory, differences in learning outcomes are caused by the different amounts of cognitive capacity available for information processing. 
In the audiovisual presentation condition, the learner can use the cognitive resources of both systems for information processing, while the learners in the visual-only condition can only use the resources of the visual system. Most of the theoretical assumptions underlying CLT and CTML are based on empirical results from various learning experiments and are related to more general models of cognitive theory, such as schema theory or working memory research. However, these assumptions are mainly based on studies that have employed indirect measures, commonly learning outcomes. More research is needed to ground theories of multimedia learning in our knowledge of basic cognitive processes. We therefore need to develop and implement methods that can provide insights into cognitive processes that go beyond learning outcomes, such as eye tracking analysis, neuroimaging techniques, or dual task analysis, which, although new in this domain, are successfully used in general psychology (Brünken, Plass and Leutner 2003; Paas, Tuovinen, Tabbers and Van Gerven 2003).
Limited capacity assumption and learning
In our workgroup we employed the dual task approach in order to study the limited capacity assumption in multimedia learning. This assumption has been outlined in CLT as one key factor in the learning process (Brünken, Steinbacher, Plass and Leutner 2002; Brünken, Plass and Leutner 2003). CLT assumes that the amount of cognitive capacity available in a specific learning situation is limited and has to be distributed over several cognitive processes and their resource requirements. The materials to be learned induce a specific demand on this capacity – called intrinsic load – depending on their complexity and element interactivity (Paas, Renkl and Sweller 2003). Moreover, different types of learning material presentation as well as different instructional designs require different amounts of cognitive capacity, independent from the content of the learning material. The capacity needed to meet these design-related requirements – called extraneous load – is assumed to make no contribution to the learning process because it has to be used to compensate for a “bad” (e.g., non-integrated, or working memory demanding) instructional design. Finally, cognitive capacity is needed for active knowledge construction, for example, schema construction or schema integration. This is called germane load, and is assumed to be the key factor in the understanding and the storing of the learning material. It is argued that total available capacity is limited, and that the three different types of cognitive load are additive in their combined capacity requirement. Therefore, in the development of specific learning materials, we can free up cognitive resources for germane activities by using a design of the materials that minimizes extraneous cognitive load, assuming that the intrinsic load of the materials is constant.
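The additivity argument above can be written compactly. The symbols are introduced here purely for illustration and are not notation used by CLT or by the authors: L denotes a load component and C the total available capacity.

```latex
L_{\text{intrinsic}} + L_{\text{extraneous}} + L_{\text{germane}} \le C_{\text{total}}
\quad\Longrightarrow\quad
L_{\text{germane}} \le C_{\text{total}} - L_{\text{intrinsic}} - L_{\text{extraneous}}
```

Read this way, with the intrinsic load held constant, any reduction in extraneous load raises the upper bound on the capacity that can be devoted to germane activities.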
Virtually all previous research on the different amounts of extraneous load induced by different forms of information presentation has been conducted using learning outcomes as the indirect measure for load, i.e., low outcome scores were interpreted as indicators for high cognitive load. In recent experiments, we introduced a dual-task approach to this type of research that provided direct empirical evidence for a limited processing capacity (Brünken et al. 2002). In these experiments, we studied capacity limitations of visual memory for the modality effect using a secondary task in addition to the multimedia learning scenario, which presented textual and pictorial information either in visual-only format (as pictures and text) or in audiovisual format (as pictures and narrations). The secondary task was a visual monitoring task that had to be performed simultaneously with the learning process. As predicted by CLT, we found that the performance in the visual secondary task is directly related to the modality of the multimedia learning material (visual vs. audiovisual) that served as a primary task – independent of the
amount of information provided, which was equal for both presentation conditions. Learners performed better on the secondary task, i.e., experienced less cognitive load, when the learning material was presented in an audiovisual format than when it was presented in a visual format. This direct approach to the measurement of cognitive load using a secondary monitoring task has not previously been employed in research on multimedia learning (Brünken, Plass and Leutner 2003). Generally, few studies exist that employ dual task methodology for the purpose of analyzing resource requirements in learning processes (Brünken et al. 2002). The strength of this methodology is that it allows us to validate cognitive load effects found with indirect measures in other contexts, and to analyze cognitive load effects in channels other than the visual modality. In the studies presented in this paper, we have used the dual task methodology to study auditory load in the modality effect and to examine a closely related instructional design effect, the seductive background music effect (Moreno and Mayer 2000).
Auditory load and modality effects
Based on a large body of empirical research within the framework of CTML and CLT, the modality effect is known as one of the most reliable and valid instructional design effects in multimedia learning (Brünken and Leutner 2000; Mayer and Moreno 1998; Mousavi et al. 1995; Tabbers 2002). When textual and pictorial learning materials are presented simultaneously, an audiovisual presentation (narration and picture) is more beneficial for learning than a visual-only presentation (written text and pictures) of the same material. The theoretical assumptions that serve as an explanation for this effect are taken from working memory models, as described above.
It is assumed that visual and auditory materials are processed in different subsystems of working memory (dual channel assumption: Mayer 2001), and, moreover, that both subsystems have separate, limited processing capacities that cannot easily be exchanged between the systems. In the case of a visual-only presentation, written text and pictures have to be processed, at least initially, in the visual part of the working memory, and the processing capacity of this memory system has to be split between the two sources of information, while the phonological part of the working memory remains unemployed, and its capacity is not used for information processing. However, if the same learning material is presented audiovisually, i.e., instead of written text a narration is used, the spoken text can be processed in the phonological subsystem and the pictures can be processed in the visual subsystem of working memory. Because the available capacity of both subsystems can be used, more cognitive capacity is available for processing
audiovisual materials compared to visual-only material of the same content. The effect of the different amount of cognitive capacity available for information processing on learning outcomes, which are higher for the audiovisual materials than the visual-only materials, has been consistently replicated (Brünken and Leutner 2000; Mayer and Moreno 1998; Mousavi et al. 1995; Tabbers 2002). This effect was also supported by our own prior research using dual task methodology (Brünken et al. 2002), where we found that participants learning from audiovisual learning materials had more capacity available for processing a visual secondary task than those working with the same learning materials presented in a visual-only format.
Most research on the modality effect focuses on cognitive load in the visual subsystem of working memory. What has not been studied so far is the load induced in the auditory subsystem. Based on the assumptions outlined above, it can be expected that learners working with the audiovisual variant of the learning material should experience more auditory cognitive load than learners working with the visual-only material. We expect these patterns of cognitive load because in learning with the audiovisual variant of the materials, the verbal information is processed in the auditory system, while in the visual-only variant this system remains unemployed. Therefore, if participants learning with the two variants of materials also perform an auditory secondary task, e.g., listen to music or to an environmental sound such as thunder or rain, the visual-only group would be expected to outperform the audiovisual group with respect to this auditory secondary task.
Auditory load and seductive background music
One multimedia learning effect that is related to the auditory load assumption has been reported by Moreno and Mayer (2000) within the framework of CTML and has later been described by Mayer (2001) as coherency principle 2.
Moreno and Mayer conducted two experiments, each with different learning materials, in which they compared two variants of a learning system: one delivering information as narration and animation, the other delivering the same information with the same narration and animation, but adding interesting yet, with respect to the learning goal, irrelevant sounds as well as background music. They found strong evidence for a negative effect of background music on knowledge acquisition. In both experiments, learners working with the material without background music outperformed the learners working with the material containing background music. Moreover, Moreno and Mayer (2000) reported large effect sizes for this difference (d = 1.27 in Experiment 1 and d = 0.96 in Experiment 2). The theoretical explanation Mayer (2001) offers for this effect is closely related to the explanation of the modality effect. He argued that learners cannot ignore the music information despite the fact that it is irrelevant for the instructional goal. Therefore, this information needs to be processed simultaneously with the relevant narration in the phonological subsystem of the working memory. Both the relevant and the irrelevant information compete for the available cognitive resources in auditory working memory. For the learners receiving the materials without background music, this additional demand on phonological memory is not present. These learners therefore have more phonological capacity available for processing the relevant information, and – assuming identical visual capacities in both groups for processing the visual pictorial information – have more total cognitive capacity available for learning than the group receiving background music (Mayer 2001; Moreno and Mayer 2000).
If we were to apply our dual-task approach in a measurement of cognitive load, what would be the expected load patterns for the auditory system and how would this affect the processing of an auditory secondary task? Based on the capacity explanation offered by Moreno and Mayer (2000) and Mayer (2001), we would expect that learners working with a multimedia learning system containing pictures and narrations will have less auditory capacity available for processing an auditory secondary task when interesting but irrelevant background music is added than when no such auditory information is available. Therefore, the performance on the auditory secondary task should be higher without irrelevant background music in the primary learning task than with background music.
Summarizing these assumptions regarding the auditory load induced by narration and background music in multimedia learning environments, a clear prediction can be drawn with respect to the learners’ secondary task performance: In an auditory secondary task, learners’ performance should be highest when the primary task contains no auditory information (relevant or irrelevant), because narration and background music each induce auditory load. Performance should decrease in a multimedia system with visual-only information and seductive background music, i.e., when irrelevant auditory information in the form of background music, but no relevant auditory information, is presented. Secondary task performance should be lowest in an audiovisual multimedia system with seductive background music, i.e., when learners are presented with both relevant and irrelevant auditory information in the primary learning task. Even though primary task performance (i.e., learning from the multimedia system) was not the primary focus of our investigation, we did expect to also find a replication of the modality effect for learners’ knowledge acquisition, where learners presented with an audiovisual primary task outperform learners presented with a visual-only primary task.
In order to test our hypotheses, we conducted two experiments that used the same experimental design, the same secondary task, but different learning contents of the multimedia learning system that served as the primary task.
Experiment 1
Method
Participants and design. The first study was conducted with 10 female students, enrolled in the undergraduate (B.A.) program at Erfurt University. The mean age of the participants was 20.9 years (SD = 1.45). A within-subject design with repeated measures was chosen because of the high individual differences in reaction time measures and because of its successful implementations in prior studies (Brünken et al. 2002). The dependent variable of central interest was the performance on the secondary task, which was operationalized as a reaction time measure. In addition, the performance on the primary task, knowledge acquisition from the learning system, was measured as a control variable. A pre-test was used to assess learners’ domain-specific prior knowledge of the learning material. The independent variable was the amount of auditory information of the primary task. This variable had three levels, (1) no auditory information, (2) background music only, and (3) background music and narration. Within each level of the independent variable, several repeated measures of the secondary task performance were taken. The study design was the same as that used in our prior dual-task investigation on visual cognitive load (Brünken et al. 2002).
Primary task. The multimedia learning system employed in this experiment has been used in several prior experiments (e.g., Brünken and Leutner 2000; Brünken et al. 2002). It contains verbal and pictorial instruction on the functionality of the blood circulation system, presented on 22 screens. Each screen was available in two variants, one containing a picture and a related on-screen text (relevant information in visual-only format), and the other containing the same picture and, instead of the on-screen text, a narration that presented the identical verbal information in a female voice (relevant information in audiovisual format).
In addition, both variants contained background music in the form of an instrumental movie soundtrack without vocals. The presentation time of both variants of the system was the same for each screen, as screens were automatically advanced by the computer system. Following the within-subject design, the material was presented to each learner in a sequence where visual-only and audiovisual pages were alternated screenwise. To avoid interactions between presentation mode and the content of
each page, different screen sequences were randomly assigned to different learners. To assess prior domain-specific knowledge and knowledge acquired in the learning situation, two criterion-referenced parallel tests were constructed (Klauer 1987), containing items about the acquired procedural knowledge (similar to Mayer’s (2001) problem solving test). Each test consisted of 22 items; each item was related to the content of one specific screen page. The page-related item construction was chosen to enable the calculation of subtest scores for each of the two presentation modalities of the learning material. The tests were constructed as multiple-choice tests with four answer alternatives per item. They were administered as paper-and-pencil pre-test before and as post-test immediately after the learning session.
Secondary task. The secondary task employed in this study was based on a program called WinRT, which we developed for and successfully used in previous dual-task studies (Brünken et al. 2002). The task chosen for the purpose of this research was the detection of a simple auditory stimulus. A single tone was presented to the learner at random intervals of 5 to 10 seconds. The learners were instructed to press the space bar on their computer keyboard as soon as they detected the tone. The computer program automatically recorded the lag time between the presentation of the tone and the learners’ reaction. Along with the reaction time, the program stored the system time for each measurement to enable synchronization of reaction time measures and the type of treatment presented on the corresponding screen in the learning system. Depending on the experimental condition, the secondary task was either the only auditory stimulus, or it was presented in addition to the background music and the narration.
Procedure. Individual test sessions were conducted for each participant in the computer lab of the Erfurt University Center for Research on Learning and Instruction (ZLB). The lab was equipped with a 1.2 GHz Pentium IV computer with a 19" monitor set to display a screen resolution of 1152 × 864 pixels. Auditory information was presented through a Philips headset. The learning system was programmed using Asymetrix Toolbook (Asymetrix 1997). The experimenter was present during the entire investigation. The procedure started for each participant with the completion of the untimed paper-and-pencil pre-test. The participants then completed the learning session which began with the presentation of one of the three experimental conditions. The computer program randomized the sequence of the conditions. In condition 1 (V−N−B: no auditory information), the secondary task was presented with an introductory page of the learning system (written
text and pictures) without background music for exactly 3 minutes. In condition 2 (V−N+B: with background music) and condition 3 (V+N+B: narration + background music) the secondary task was presented simultaneously with the respective pages of the learning system. Individual presentation times for each page varied, depending on the content presented, between 45 seconds and 3 minutes. Presentation times of the visual and audiovisual variant of the pages were the same and were determined based on the duration of the narration. The learners could not influence the pace, order, or volume of the presentation under any of the three conditions. After learners viewed all 22 screens in the learning session, the post-test was administered. Time for completing the post-test was not restricted. The entire procedure took approximately 60 minutes.
Results
Scoring and analysis. Prior domain-specific knowledge was assessed using a 22-item pre-test in which each reaction to one of the answer alternatives was scored with one point; test scores were computed as the sum of all points received. Performance on the post-test was scored in the same way. Pre-test and post-test values were compared using paired sample t-tests to assess knowledge acquisition (post–pre). To determine learning outcomes for the two different variants of the materials, the post-test scores were divided into two subtest scores, one for the number of correct answers on items related to visually presented learning system pages (knowledge acquisition visual: K-VIS), the other for the number of correct answers on items related to audiovisually presented learning system pages (knowledge acquisition audiovisual: K-AV). The two post-test subtest scores were compared using paired sample t-tests. All analyses concerning primary task scores were conducted with alpha at the 0.05 level.
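The division of the post-test into the K-VIS and K-AV subtests described above can be sketched in a few lines. This is a minimal illustration, not the authors' scoring program: the function name, the data layout, and the four-screen example values are hypothetical.

```python
def subtest_scores(item_points, item_modality):
    """Split one learner's post-test points into K-VIS and K-AV subtest scores.

    item_points: points earned per item, keyed by the screen the item refers to.
    item_modality: 'visual' or 'audiovisual' per screen, as presented to this
    particular learner (the assignment differed between learners).
    """
    scores = {"K-VIS": 0, "K-AV": 0}
    for screen, points in item_points.items():
        key = "K-VIS" if item_modality[screen] == "visual" else "K-AV"
        scores[key] += points
    return scores


# Hypothetical four-screen example, for illustration only (not study data).
points = {1: 2, 2: 0, 3: 1, 4: 4}
modality = {1: "visual", 2: "audiovisual", 3: "visual", 4: "audiovisual"}
result = subtest_scores(points, modality)
```

Because the modality of each screen is looked up per learner, the same item can count toward K-VIS for one learner and toward K-AV for another, and the two subtest scores always add up to the total post-test score.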
The analysis of the secondary task performance was computed by executing the common three-step procedure of analyzing within-subject repeated measures data. First, individual means on the repeated measures of reaction times were calculated for each learner under each condition. These three individual mean values were then used for further analysis. Second, to compare the three experimental conditions, a repeated measures analysis of variance (RM-ANOVA) was conducted, taking the three experimental conditions as three points of measurement. The third step was the pairwise comparison of the three conditions by computing post-hoc paired sample t-tests with alpha adjustment (total alpha = 0.05). The statistical analysis was equivalent to that used in prior studies (Brünken et al. 2002).
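The first and third steps of this procedure can be sketched as follows. The sketch assumes a Bonferroni-style adjustment, which the text does not name explicitly; the function names and the toy records are hypothetical and serve only to show the data flow.

```python
from statistics import mean


def condition_means(trials):
    """Step 1: average the repeated reaction times per learner and condition.

    trials: iterable of (learner_id, condition, reaction_time_ms) tuples.
    Returns {learner_id: {condition: mean_rt_ms}}.
    """
    grouped = {}
    for learner, condition, rt in trials:
        grouped.setdefault(learner, {}).setdefault(condition, []).append(rt)
    return {
        learner: {cond: mean(rts) for cond, rts in by_cond.items()}
        for learner, by_cond in grouped.items()
    }


def bonferroni_alpha(total_alpha, n_comparisons):
    """Step 3 helper: per-comparison alpha under an assumed Bonferroni adjustment."""
    return total_alpha / n_comparisons


# Hypothetical toy records, for illustration only (not data from the study).
toy_trials = [
    ("s1", "V-N-B", 200.0), ("s1", "V-N-B", 220.0),
    ("s1", "V+N+B", 300.0), ("s1", "V+N+B", 280.0),
]
means = condition_means(toy_trials)
per_test_alpha = bonferroni_alpha(0.05, 3)  # three pairwise comparisons
```

Aggregating to one mean per learner and condition first means that learners with more recorded reactions do not weigh more heavily in the later group comparisons.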
Primary task performance. With regard to prior domain-specific knowledge, learners achieved a mean score of M = 5.2 (SD = 8.18) in the pre-test. The total post-test score was M = 24.2 (SD = 11.25). The difference (post–pre) was statistically highly significant (t(9) = 4.70; p < 0.001; d = 1.93). This difference indicates that the learners had low prior domain-specific knowledge, and that they indeed worked on the primary task and acquired knowledge from the multimedia learning system. Dividing the post-test score into the two modality-related subtest scores, we found a mean visual K-VIS score of M = 8.60 (SD = 8.28) and a mean audiovisual K-AV score of M = 15.6 (SD = 4.88). As in the previous analysis, this difference was statistically highly significant (t(9) = 2.91; p < 0.01; d = 1.03), which indicates a solid modality effect in the knowledge acquisition values (see footnote 1). In summary, learning outcome scores from the primary task are in line with CLT and CTML. They show that, even in a dual-task condition, learners acquire knowledge from the learning materials, and, moreover, that they acquire more knowledge when the information is presented in audiovisual mode rather than visual mode only, as predicted by the modality effect.
Secondary task performance. Concerning secondary task performance, Table 1 shows the descriptive data for individual learners (M, SD, N) for each experimental condition. The number of reaction time measures varies among learners and experimental conditions from N = 21 to N = 56. This variation occurs because the number of measures taken within the fixed total presentation time of each condition depends on the individual reaction times and the randomized time interval between the measures. The mean group values in secondary task performance for each of the three test conditions are illustrated on the left side of Figure 1.
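Why the number of measures differs between learners can be illustrated with a small simulation. The timing model is an assumption made for this sketch and is not taken from the WinRT program: each cycle consists of a random gap of 5 to 10 seconds, the tone, and the learner's reaction time, and the next gap starts once the response has been recorded. The function name and the two reaction times are hypothetical.

```python
import random


def count_measurements(duration_s, reaction_time_s, rng,
                       min_gap_s=5.0, max_gap_s=10.0):
    """Count the tones answered within a fixed presentation time.

    Assumed model: random gap, tone, reaction; a measurement only counts
    if the reaction is completed before the presentation time is over.
    """
    elapsed = 0.0
    count = 0
    while True:
        elapsed += rng.uniform(min_gap_s, max_gap_s) + reaction_time_s
        if elapsed > duration_s:
            return count
        count += 1


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_fast = count_measurements(180.0, 0.20, rng)  # 3-minute condition, fast responder
n_slow = count_measurements(180.0, 0.35, rng)  # same duration, slower responder
```

Under this model a 3-minute condition yields between 17 and 34 measures, with the actual count depending on the random gaps and on how quickly the learner responds, which is the kind of variation in N reported in Table 1.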
As Figure 1 shows, the reaction times on the auditory secondary detection task were similar under conditions V−N−B (without audio information: M = 231.69 msec; SD = 59.23) and V−N+B (background music: M = 235.61 msec; SD = 62.65), but increased dramatically in V+N+B (background music + narration: M = 285.37 msec; SD = 55.39). This pattern was confirmed by the RM-ANOVA, which showed a significant main effect of the factor experimental condition (F(2,8) = 24.15; p < 0.001). Analyzing this effect in more detail, the post-hoc tests revealed that the reaction time differences between the conditions V−N−B and V+N+B (t(9) = 5.26; p < 0.001; d = 0.94) as well as the conditions V−N+B and V+N+B (t(9) = 5.91; p < 0.001; d = 0.84) were statistically highly significant, while those for V−N−B and V−N+B were not (t(9) = 0.32; n.s.). In summary, the results show that secondary task performance decreased when, in addition to the auditory secondary task, background music and narration had to be processed simultaneously, but not when only background music had to be processed. These results are in line with the predictions of CTML and CLT concerning the modality effect.
1 Note that because of the randomized presentation sequence, the differences in the sub-scores could not have been caused by differences in item difficulties, as the sub-scores are based on different test items for different learners.
Table 1. Individual measures in Experiment 1 for the secondary task (reaction time in milliseconds; N, M, SD) for each of the three test conditions
Subject   Condition 1                 Condition 2                 Condition 3
number    (no auditory information)   (background music alone)    (background music + narration)
          N     M       SD            N     M       SD            N     M       SD
1         24    217.6   110.7         52    269.4   137.5         52    328.6   127.0
2         23    179.5   116.2         54    181.5   160.0         52    219.8    55.1
3         24    257.5    92.0         53    248.9    87.2         50    310.1    91.3
4         23    141.4    46.2         51    181.5   108.2         54    222.4   122.0
5         22    328.0   105.1         50    283.7    75.3         51    319.8   105.3
6         21    304.1    98.0         49    369.7   104.2         42    381.3    76.1
7         24    192.3    74.1         55    154.6    51.2         56    245.9   109.6
8         24    185.3    86.4         49    199.9   114.7         55    217.5    61.1
9         23    240.2   143.8         50    230.3   116.0         51    285.8   155.6
10        23    271.0   132.0         51    236.7   121.0         52    314.2    97.7
Figure 1. Mean values of the secondary task performance (reaction time) in Experiment 1 (left) and 2 (right) for each of the three experimental conditions (V−N−B: without phonological information; V−N+B: with background music; V+N+B: with background music and narration).
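The non-significant comparison between V−N−B and V−N+B can be rechecked from the per-learner means listed in Table 1, and the reported effect sizes from the group means and standard deviations given in the text. The sketch below is a plausibility check only. The effect-size formula (mean difference divided by the root mean square of the two standard deviations) is an assumption, since the text does not state how d was computed; the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Per-learner mean reaction times (ms) for Conditions 1 and 2, from Table 1.
no_audio = [217.6, 179.5, 257.5, 141.4, 328.0, 304.1, 192.3, 185.3, 240.2, 271.0]
music_only = [269.4, 181.5, 248.9, 181.5, 283.7, 369.7, 154.6, 199.9, 230.3, 236.7]


def paired_t(a, b):
    """Paired-sample t statistic for b - a, with len(a) - 1 degrees of freedom."""
    diffs = [y - x for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def cohens_d(m1, sd1, m2, sd2):
    """Mean difference over the root mean square of the two SDs (assumed formula)."""
    return (m2 - m1) / sqrt((sd1 ** 2 + sd2 ** 2) / 2)


t_music = paired_t(no_audio, music_only)                # V-N-B vs V-N+B
d_narration = cohens_d(231.69, 59.23, 285.37, 55.39)    # V-N-B vs V+N+B
d_music_narr = cohens_d(235.61, 62.65, 285.37, 55.39)   # V-N+B vs V+N+B
print(round(t_music, 2), round(d_narration, 2), round(d_music_narr, 2))
```

Under these assumptions the three values come out close to the reported t(9) = 0.32, d = 0.94, and d = 0.84, which suggests that the effect sizes were computed from the group-level means and standard deviations rather than from the paired differences.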
Experiment 2
To replicate the results of Experiment 1, we conducted a second investigation. Experiment 2 used the same experimental design, the same secondary task and comparable participants as in Experiment 1, but different learning materials.
Method
Participants and design. Experiment 2 was conducted with 10 female undergraduate (B.A.) students at Erfurt University with a mean age of M = 25.6 years (SD = 4.57). A within-subject design with three points of measurement (experimental conditions V−N−B, V−N+B, and V+N+B), and repeated measures on each condition was implemented. Dependent variables were the pre-test score (assessing domain-specific prior knowledge), post-test score (assessing knowledge acquisition from the learning material) and reaction times (assessing secondary task performance).
Primary task. The primary task used in Experiment 2 was a multimedia tourist guide containing verbal and pictorial information about the historic city of Florence, Italy. This system has been used successfully in a series of prior experiments to study knowledge acquisition as well as in the context of dual-task measurement (Brünken, Steinbacher, Schnotz and Leutner 2001; Brünken et al. 2002). The system was implemented in Asymetrix Toolbook (Asymetrix 1997) and contains 14 screen pages, each including verbal and pictorial information concerning one specific point of interest (e.g., a church, a historic building, or a sculpture). As in the system used in Experiment 1, the verbal information on each screen page was presented either visually as on-screen text or acoustically as narration. The presentation sequence of the pages was, as in Experiment 1, randomized among the participants. Two criterion-referenced parallel tests were constructed (Klauer 1987), each containing 14 multiple-choice items with 4 answer alternatives.
Each item was related to the learning content of one specific screen page, which made it possible to calculate total performance scores as well as separate modality-specific subtest scores for information presented in visual-only and in audiovisual format, respectively.

Secondary task. The secondary task was the same auditory detection task as in Experiment 1, realized by the WinRT program.

Procedure. Experiment 2 was conducted in individual test sessions for each participant. Each session started with the paper-and-pencil pre-test, followed by the learning session, and concluded with the paper-and-pencil post-test. Within the learning session, the learners worked under three treatment conditions: V−N−B, where the learners responded to the secondary task and viewed a visual-only introductory screen page of the learning material without auditory information for exactly 3 minutes; V−N+B, with visual presentation of the relevant verbal and pictorial information (pictures and on-screen text) and background music; and V+N+B, with audiovisual information presentation (pictures and narration) and background music. The sequence of presenting the three conditions was randomized among the participants. The learning session took about 15 minutes; the entire experimental session required about 60 minutes. No time restriction was implemented for the paper-and-pencil tests.

Results

Scoring and analysis. Scoring and analysis were equivalent to those described in Experiment 1. With regard to primary task performance, scores for the pre-test and post-test were calculated by adding the scores for each item, and modality-specific subtest scores (K-VIS and K-AV) were computed by adding the answers on items related to visually (K-VIS) and audiovisually (K-AV) presented information, respectively. Scoring and analysis of the secondary task performance were identical to Experiment 1.

Primary task performance. The mean score in the pre-test was M = 1.4 (SD = 3.66), indicating that the learners had virtually no prior knowledge about the learning material. Moreover, informal exit interviews showed that none of the participants had visited Italy before. The mean total post-test score was M = 8.6 (SD = 5.50). The difference (post–pre) was statistically highly significant (t(9) = 4.63; p < 0.001; d = 1.54), which indicates that the learners indeed acquired knowledge from the learning system. For the post-test subtest scores, the mean K-AV score was M = 5.4 (SD = 2.99), while the mean K-VIS score was M = 3.2 (SD = 3.68).
Paired-samples t-tests showed that the difference between the K-AV and K-VIS scores was marginally significant (t(9) = 1.82; p = 0.051; d = 0.65). As in Experiment 1, this points to a modality effect in primary task performance.

Secondary task performance. The individual secondary task performance values (N, M, SD) for each learner under each of the three experimental conditions are shown in Table 2. As can be seen, the number of repeated measures varies among learners and experimental conditions from 15 to 47, which is the result of individual reaction time differences and the randomization of intervals between measures.
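The effect sizes reported for the primary task can be approximated from the published means and standard deviations alone. The following is a minimal sketch, assuming that d was computed as the mean difference divided by the root mean square of the two standard deviations; the text does not state which formula was used, and the function name `cohens_d` is introduced here only for illustration.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference (assumed formula, not stated in the paper).

    The denominator is the root mean square of the two standard deviations,
    which equals the pooled SD when both samples have the same size.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Descriptive statistics as reported for Experiment 2 (primary task).
d_learning = cohens_d(8.6, 5.50, 1.4, 3.66)  # post-test vs. pre-test
d_modality = cohens_d(5.4, 2.99, 3.2, 3.68)  # K-AV vs. K-VIS

print(f"post-test vs. pre-test: d = {d_learning:.2f}")
print(f"K-AV vs. K-VIS:         d = {d_modality:.2f}")
```

Under this assumption the first value agrees with the reported d = 1.54, and the second comes out at roughly 0.66, close to the reported d = 0.65; the small gap may simply reflect rounding of the published means and standard deviations.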
Table 2. Individual measures in Experiment 2 for the secondary task (reaction time in milliseconds; N, M, SD) for each of the three test conditions. Condition 1 = no auditory information; Condition 2 = background music alone; Condition 3 = background music + narration.

Subject    Condition 1             Condition 2             Condition 3
number     N     M       SD        N     M       SD        N     M       SD
1          21    200.3    63.3     39    330.0   109.8     43    254.4    87.7
2          23    271.5   113.1     40    246.3   101.2     45    275.4    80.6
3          20    452.7   145.0     44    470.8   142.9     37    538.0   173.7
4          24    317.1   115.6     40    283.3   119.9     45    287.9    70.3
5          23    195.0   129.4     44    201.2    79.8     40    233.3   140.2
6          15    437.1   187.3     36    487.6   183.3     40    429.2   147.4
7          24    175.5    60.1     45    162.9    46.6     43    184.3    77.5
8          24    178.1    40.2     39    156.8    66.0     47    184.2    69.9
9          22    178.2    61.3     47    198.4    66.3     40    237.8    69.6
10         24    156.1    73.0     37    217.9    93.5     45    267.8    97.3
The individual mean values were averaged for each condition, and the group means for each condition are presented on the right side of Figure 1. The pattern of results is similar to that in Experiment 1. The mean reaction time is lowest (indicating the best secondary task performance) in condition V−N−B without any additional auditory information (M = 256.16 msec; SD = 110.99), followed by the reaction time in V−N+B with background music (M = 267.95 msec; SD = 118.07). The slowest mean reaction time (the poorest secondary task performance) was obtained in condition V+N+B with background music and narration (M = 296.67 msec; SD = 111.05). This pattern was tested with a repeated-measures ANOVA (RM-ANOVA), which showed a marginally significant main effect of the within factor (F(2,8) = 2.95; p = 0.55). In order to analyze this effect post hoc in more detail, paired-samples t-tests with alpha adjustment were conducted. They showed statistically significant differences between the scores for conditions V−N−B and V+N+B (t(9) = 2.39; p = 0.021; d = 0.36) and between the scores for conditions V−N+B and V+N+B (t(9) = 2.45; p = 0.019; d = 0.25), but, as in Experiment 1, no statistically significant difference between conditions V−N−B and V−N+B (t(9) = 1.06; n.s.).

In other words, Experiment 2 replicated the findings from Experiment 1: performance in an auditory secondary task decreases when the primary task requires the simultaneous processing of narrated verbal information and background music, but it does not decrease when only background music accompanies the secondary task and the verbal information is presented visually. As in Experiment 1, this result is in line with the theoretical predictions derived from CLT and CTML concerning the modality effect (Moreno and Mayer 2000).
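The aggregation step for the secondary task, averaging the individual mean reaction times of a condition into a group mean and standard deviation, can be sketched as follows. This is an illustration only: the ten values are the individual means listed for the condition without auditory information (V−N−B) in Table 2, the variable names are hypothetical, and the use of the sample standard deviation (n − 1 in the denominator) is an assumption, since the text does not specify it.

```python
import statistics

# Individual mean reaction times (msec) for condition V-N-B,
# copied from the first condition column of Table 2 (subjects 1 to 10).
rt_no_audio = [200.3, 271.5, 452.7, 317.1, 195.0,
               437.1, 175.5, 178.1, 178.2, 156.1]

group_mean = statistics.mean(rt_no_audio)
# Assumption: sample standard deviation, i.e. n - 1 in the denominator.
group_sd = statistics.stdev(rt_no_audio)

print(f"V-N-B: M = {group_mean:.2f} msec, SD = {group_sd:.2f} msec")
```

With these inputs the sketch yields values that match the group statistics reported in the text for this condition (M = 256.16; SD = 110.99), which suggests that the group standard deviation describes the spread of the ten individual means and not the trial-level variability within each learner.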
General discussion

The basic question addressed in this paper was whether the limited capacity assumption of CLT and CTML, which has mainly been studied using learning outcome measures, could be validated with a more direct measurement of resource demands in multimedia learning. As in our prior experiments on visual cognitive load, we found strong empirical evidence supporting the limited capacity assumption with respect to the modality effect for the auditory channel. As predicted by CLT and CTML, the auditory presentation of verbal information in a multimedia learning system requires specific cognitive resources, which are therefore not available for processing a simultaneously presented secondary task of the same modality. This effect is consistent with our prior findings (Brünken et al. 2002), where we found that the visual presentation of the same materials as used in this study demands more visual cognitive resources, causing a decrease in the performance of a visual secondary task. Taking both experiments into account, we found solid empirical evidence for the modality effect as an effect of different capacity demands. So far this effect has been inferred mainly from the analysis of learning outcomes. Our results show that the same effect can be demonstrated with a more direct method of measuring the demands on cognitive capacity, which provides convergent validity for the underlying cognitive load hypothesis.

One finding of our study that might lead to an extension of the work on the seductive background music effect conducted by Moreno and Mayer (2000) is that in neither of the two experiments (conducted with different instructional materials) were there differences in secondary task performance between the single-task condition (reaction time task only) and the dual-task condition with pictorial information and background music. In other words, reaction times did not differ beyond chance when background music was presented compared to when it was not.
Only the addition of a relevant narration caused a decrease in secondary task performance. Our findings suggest that the auditory cognitive requirements of background music alone did not differ from the auditory cognitive requirements of the materials without any auditory stimuli. Does this finding contradict the results of Moreno and Mayer, who had argued that the processing of irrelevant background music simultaneously with an animation and a narration demands auditory cognitive resources to
a degree that learning outcomes are negatively affected? Because our study did not compare the same treatment conditions as the Moreno and Mayer study, it cannot provide a direct answer to this question, but we believe that the effects found in our study and in the Moreno and Mayer studies highlight some important basic theoretical implications for CLT. We argue that the reason for the lack of a load effect of seductive background music, presented by itself, in the present study is that the observed load effects are not additive in a linear form. Moreno and Mayer’s (2000) results showed that recall and transfer were lower when both the narration and the background music were included than when only the narration was presented. In the present study, we measured the total load of background music when presented by itself and found that, in an otherwise unused auditory channel, this load was not different from the load experienced when no auditory information at all was presented. Moreno and Mayer, on the other hand, found a load effect when background music was presented in addition to narration, i.e., in a condition where there was already load on the auditory channel. The impact of background music on this channel in a moderate load condition might differ from the impact of the same music in a low load condition. This idea leads to the assumption that the background music alone does not impose load on auditory working memory, perhaps because it contains no relevant information and is therefore not related to the process of knowledge construction. Therefore, the background music alone has no cognitive load implications.² Only in a situation where the auditory channel is already used for information processing, as in the audiovisual presentation condition, does background music have a load effect and, as a result, a negative impact on learning.
This thought may be extended to a more general assumption of an interaction effect between extraneous load imposed by the instructional design and intrinsic load caused by the learning demands: If the intrinsic load of the learning materials is low, then the extraneous load due to the instructional design has no relevant effect, since the total load is below the limits of working memory. Extraneous load only matters in situations where the learner already works at the limits of his or her cognitive capacity. This is not a new insight by itself; what had not previously been shown, however, is how little impact background music has on load when presented by itself.

Another possible explanation might be an attentional adaptation effect. Moreno and Mayer’s (2000) treatments had durations of 180 seconds and 45 seconds, respectively. In contrast, the treatments used in the present study lasted about 15 minutes each. It is conceivable that in such a longer treatment, participants were able to determine that the background music did not contain any instructional content and to ignore it in the processing of auditory information. In a shorter treatment, this metacognitive process might not have had the same noticeable effect due to the initial adjustment period it would require. This would indicate that the cognitive demands of a specific task are not constant during the entire learning phase, but vary over the process of knowledge acquisition. The role of metacognitive control processes has not yet been understood in enough detail to answer this question conclusively (Valcke 2002).

This study also has practical implications for the design of instructional materials. One of the important arguments of CLT is the need to reduce the extraneous cognitive load imposed by the instructional design of the learning materials by eliminating all redundant, seductive, and irrelevant elements. If applied without consideration of the total load of the learning materials, however, this principle could lead to a minimalist concept of instructional design that would result in materials with low extraneous cognitive load, but potentially also with a low level of interest. The present study shows that a simple “less is more” principle does not fit the empirical data adequately. A better approach might be to optimize extraneous load by taking into account the complex interaction between the demands of the learning material, the learning process, and the presentation mode. More research is required to better understand this complex interaction and to show how it can inform the design of interesting learning materials that do not impose too much load on the learner.

² We would like to thank John Sweller for his helpful comments on this issue.
References

Anderson, J.R. (1983) The Architecture of Cognition. Cambridge, MA: Harvard University Press.
Asymetrix (1997) Toolbook Instructor II [Computer Program, PC]. Bellevue, WA: Asymetrix Learning Systems.
Baddeley, A.D. (1986) Working Memory. Oxford: Oxford University Press.
Baddeley, A.D. & Logie, R.H. (1999) Working memory: The multiple-component model. In: A. Miyake & P. Shah (eds), Models of Working Memory. Mechanisms of Active Maintenance and Executive Control, pp. 28–61. Cambridge: Cambridge University Press.
Brünken, R. & Leutner, D. (2001) Aufmerksamkeitsverteilung oder Aufmerksamkeitsfokussierung? Empirische Ergebnisse zur “Split-Attention-Hypothese” beim Lernen mit Multimedia. [Split of attention or focusing of attention? Empirical results on the split-attention hypothesis in multimedia learning], Unterrichtswissenschaft 29: 357–366.
Brünken, R., Plass, J.L. & Leutner, D. (2003) Direct measurement of cognitive load in multimedia learning, Educational Psychologist 38(1): 53–62.
Brünken, R., Steinbacher, S., Plass, J.L. & Leutner, D. (2002) Assessment of cognitive load in multimedia learning using dual-task methodology, Experimental Psychology 49: 1–12.
Brünken, R., Steinbacher, S., Schnotz, W. & Leutner, D. (2001) Mentale Modelle und Effekte der Präsentations- und Abrufkodalität beim Lernen mit Multimedia. [Mental models and the effect of presentation and retrieval codality in multimedia learning], Zeitschrift für Pädagogische Psychologie 15: 15–27.
Chandler, P. & Sweller, J. (1991) Cognitive Load Theory and the format of instruction, Cognition and Instruction 8: 293–332.
Klauer, K.J. (1987) Kriteriumsorientierte Tests [Criterion Referenced Tests]. Göttingen: Hogrefe.
Mayer, R.E. (2001) Multimedia Learning. New York: Cambridge University Press.
Mayer, R.E. & Moreno, R. (1998) A split attention effect in multimedia learning: Evidence for dual processing systems in working memory, Journal of Educational Psychology 90: 312–320.
Moreno, R. & Mayer, R.E. (2000) A coherence effect in multimedia learning: The case for minimizing irrelevant sounds in the design of multimedia instructional messages, Journal of Educational Psychology 92: 117–125.
Mousavi, S.Y., Low, R. & Sweller, J. (1995) Reducing cognitive load by mixing auditory and visual presentation modes, Journal of Educational Psychology 87: 319–334.
Miyake, A. & Shah, P. (eds) (1999) Models of Working Memory. Mechanisms of Active Maintenance and Executive Control. Cambridge: Cambridge University Press.
Paas, F., Renkl, A. & Sweller, J. (2003) Cognitive Load Theory and instructional design: Recent developments, Educational Psychologist 38(1): 1–4.
Paas, F., Tuovinen, J., Tabbers, H. & van Gerven, P. (2003) Mental workload measurement as a means to advance Cognitive Load Theory, Educational Psychologist 38(1): 63–71.
Paivio, A. (1986) Mental Representations: A Dual Coding Approach. New York: Oxford University Press.
Plass, J.L., Chun, D.M., Mayer, R.E. & Leutner, D. (2003) Cognitive load in reading a foreign language text with multimedia aids and the influence of verbal and spatial abilities, Computers in Human Behavior 19: 221–243.
Plass, J.L., Chun, D.M., Mayer, R.E. & Leutner, D. (1998) Supporting visual and verbal learning preferences in a second language multimedia learning environment, Journal of Educational Psychology 90: 25–36.
Schnotz, W. (2001) Wissenserwerb mit Multimedia [Knowledge acquisition with multimedia], Unterrichtswissenschaft 29: 292–318.
Sweller, J. (1999) Instructional Design in Technical Areas. Camberwell, Australia: ACER Press.
Sweller, J., Van Merriënboer, J. & Paas, F. (1998) Cognitive architecture and instructional design, Educational Psychology Review 10: 251–296.
Tabbers, H.K. (2002) The Modality of Text in Multimedia Instructions. Unpublished Doctoral Dissertation, Open University of the Netherlands, Heerlen, NL.
Tindall-Ford, S., Chandler, P. & Sweller, J. (1997) When two sensory modes are better than one, Journal of Experimental Psychology: Applied 3: 257–287.
Valcke, M. (2002) Cognitive load: Updating the theory? Learning and Instruction 12: 147–154.
Wittrock, M.C. (1990) Generative processes of comprehension, Educational Psychologist 24: 345–376.