Human Factors: The Journal of the Human Factors and Ergonomics Society http://hfs.sagepub.com/
Effectiveness of Part-Task Training and Increasing-Difficulty Training Strategies: A Meta-Analysis Approach Christopher D. Wickens, Shaun Hutchins, Thomas Carolan and John Cumming Human Factors: The Journal of the Human Factors and Ergonomics Society published online 20 July 2012 DOI: 10.1177/0018720812451994 The online version of this article can be found at: http://hfs.sagepub.com/content/early/2012/07/20/0018720812451994
Downloaded from hfs.sagepub.com at COLORADO STATE UNIV LIBRARIES on August 31, 2012
Effectiveness of Part-Task Training and Increasing-Difficulty Training Strategies: A Meta-Analysis Approach Christopher D. Wickens, Shaun Hutchins, Thomas Carolan, Alion Science and Technology, Boulder, Colorado, and John Cumming, Colorado State University, Fort Collins, Colorado
Objective: The objective was to conduct meta-analyses that investigated the effects of two training strategies, increasing difficulty (ID) and part-task training (PTT), on transfer of skills and the variables that moderate effectiveness of the strategies. Background: Cognitive load theory (CLT) provides a basis for predicting that training strategies reducing the intrinsic load of a task during training avail more resources to be devoted to learning. Two strategies that accomplish this goal, by dividing tasks into parts or by simplifying tasks in early training trials, have offered only mixed success. Method: A pair of complementary effect size measures was used in the meta-analyses conducted on 37 transfer studies employing the two training strategies: (a) a transfer ratio analysis on the ratio of treatment transfer performance to control transfer performance and (b) a Hedges’ g analysis on the standardized difference between treatment and control group means. Results: PTT generally produced negative transfer when the parts were performed concurrently in the whole transfer task but not when the parts were performed in sequence. Variable-priority training of the whole task was a successful technique. ID training was successful when the increases were implemented adaptively but not when they were increased in fixed steps. Both strategies provided evidence that experienced learners benefited less, or suffered more, from the strategy, consistent with CLT. Conclusion: PTT can be successful if the integrated parts are varied in the priority they are given to the learner. ID training is successful if the increases are adaptive. The fundamental elements of CLT are confirmed.
Keywords: training strategies, transfer of training, cognitive load theory, part-task training, adaptive training, meta-analysis
Address correspondence to Dr. Christopher Dow Wickens, Alion Science, 4949 Pearl East Circle, Suite 300, Boulder, CO 80301; e-mail:
[email protected]. HUMAN FACTORS Vol. XX, No. X, Month XXXX, pp. X-X DOI:10.1177/0018720812451994 Copyright © 2012, Human Factors and Ergonomics Society.
Introduction
Cognitive Load Theory
Learning is effortful. This commonsense statement becomes more complex when placed in the context of cognitive load theory (CLT; Mayer & Moreno, 2003; Paas, Renkl, & Sweller, 2003; Paas & van Gog, 2009; Sweller, 1994; Wickens, Hutchins, Carolan, & Cumming, 2012). Researchers have identified three sources of cognitive load within the learning environment. Intrinsic load is related directly to the target task being trained. For example, the intrinsic load of flying an airplane is greater than that of driving a car because of the greater number of axes of control and axis interactions in the former skill. Key drivers of this intrinsic load are the number of interacting elements in a task (Halford, Wilson, & Phillips, 1998; Kalyuga, 2011) and its working memory demands. Extraneous load is load imposed in the learning or training environment that is irrelevant to the task being learned. For example, when learning procedures on a software system, if it is necessary to go to a manual and look up the meaning of every abbreviation that appears on a display, extraneous load is imposed. Technical difficulties in the learning environment also impose extraneous load (Sitzmann, Ely, Bell, & Bauer, 2010). Extraneous load can also arise from stories or facts contained in a verbal lesson that have entertainment value but little to do with the process being trained (Mayer, Griffith, Jurkowitz, & Rothman, 2008). Finally, and most critically, germane load has been classically defined as those sources of effort requirement or effort investment that are part of the learning process itself. Germane load includes aspects, such as rehearsal, that focus on the meaning of the material (Craik & Lockhart, 1972) or on making active choices (rather than witnessing another agent make those choices;
Slamecka & Graf, 1978). It also includes reciting material or answering questions about the material to be learned (McDaniel, Howard, & Einstein, 2009; Weinstein, McDermott, & Roediger, 2010). Indeed, a major goal of active learning techniques, such as exploratory learning and learner control, is to put the learner “in the loop,” making choices about the material to be learned that increase germane load (Keith & Frese, 2008). All of these aspects depend heavily on working memory (Kalyuga, 2011; Sweller, 2010). Although the distinction between intrinsic and germane load is somewhat fuzzy (Kalyuga, 2011), some convergent techniques can be used to discriminate between them. Underlying CLT is the idea that these three sources of load compete with each other and that high levels of intrinsic load and/or extraneous load can reduce the resources available for germane load, hence hindering the learning process. An important facet of this relationship is the observation that learners who are somewhat experienced in the task being trained enjoy reduced demands for intrinsic load (e.g., the task is partially automatized; Schneider, 1985), so that more resources are available for germane load. These higher-skilled learners therefore benefit less from any load-reducing strategies during the training process than do naive learners (Paas & van Gog, 2009; van Merriënboer, Kester, & Paas, 2006). The CLT trichotomy is highly relevant to understanding the effects of different training strategies (Healy & Bourne, 2012; Lintern, 1989; Wickens, Carolan, Hutchins, & Cumming, 2011) and for understanding those factors that may moderate the effects of such strategies on learning and performance (i.e., transfer). We have already alluded to the advantages of active learning in creating germane load, but those advantages of active learner choice may be offset by the potential of the learner to get “off track” from the instructional goals and by the potential of learner control itself to become a source of extraneous load.
As a consequence, we find that active learning strategies do not enjoy unqualified success (Wickens et al., 2011).
Training Strategies
We examined two closely related training strategies, increasing difficulty (ID) and part-task training (PTT), both designed to initially decrease the intrinsic load of the task and hence avail more resources for germane load. Yet, like the strategies of active learning, we will see that both ID and PTT can have consequences that may offset their benefits. These consequences are, for ID, learning the wrong skill and, for PTT, failing to learn the right skill. Prior research on both strategies has revealed a record of mixed success in terms of transfer to a target task. Wightman and Lintern (1985) reviewed the literature on PTT of psychomotor tasks, comparing transfer between groups in which the target task was separated into parts during training and those in which the whole task was trained from the beginning. In their review, they found a general record of success for PTT so long as the parts were performed sequentially in the whole task, such as the approach and flare portions of an aircraft landing or two sequential phrases in a musical piece. This approach they labeled segmentation. In contrast, when the two (or more) parts were performed concurrently in the whole task, such as controlling altitude and heading in an aircraft or the left and right hands of a piano piece, Wightman and Lintern (1985) found no evidence for a PTT benefit. This approach they labeled fractionation. What is missing in PTT of fractionation tasks is the opportunity to practice time-sharing skills (Damos & Wickens, 1980; Lintern & Wickens, 1991), for example, the necessary skills of scanning, bimanual coordination, or task switching that support multitask fluency. Lacking practice on time-sharing skills during training, PTT groups must take time to acquire them when they start the transfer trials, time that more than offsets any benefits from practicing the lower-intrinsic-load components of the target task one at a time.
A close cousin of PTT is variable-priority training (VPT; Gopher, Weil, & Siegel, 1989), in which the whole task is maintained during training but, in contrast to the whole-task control group, different components are systematically emphasized or deemphasized to allow more attention to be focused on the former while still preserving the necessary element of time sharing.
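To make the contrast with PTT concrete, the emphasis manipulation at the heart of VPT can be sketched in a few lines of Python. This is only an illustrative sketch: the composite scoring rule and the weights in the schedule are hypothetical, not drawn from Gopher, Weil, and Siegel (1989).

```python
def vpt_feedback_score(score_a, score_b, priority_a):
    """Composite feedback score for a two-component whole task.

    priority_a is the emphasis weight (0 to 1) placed on component A
    in the current training block; the remainder goes to component B.
    The whole task is always performed -- only the feedback emphasis shifts.
    """
    return priority_a * score_a + (1 - priority_a) * score_b

# A hypothetical emphasis schedule: attention is shifted between the
# two components across successive training blocks, preserving
# time-sharing practice throughout.
emphasis_schedule = [0.8, 0.2, 0.8, 0.2, 0.5]
```

With equal priorities (0.5) the score reduces to a plain average, which corresponds to the whole-task control condition.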
Wightman and Lintern (1985) also included a third category of PTT that they labeled simplification, which is a technique that we define here as ID. In this technique, parameters of the target whole task are initially set to lower levels, to reduce the intrinsic load during early training trials, and then increased as training progresses, until the difficulty reaches the level of the target task (typically the difficulty at which the control group trains). For example, the long lag in the flight dynamics of heading control in an aircraft might be reduced in early trials (Briggs & Naylor, 1962), or target tasks with high time stress might initially be performed with that time stress greatly reduced (Mane, Adams, & Donchin, 1989). Importantly, as trials proceed, there are two strategies for ID. In a fixed-increase schedule, all learners receive the same schedule of fixed-step increases. In an adaptive schedule, difficulty increases for each learner are based on his or her trial-by-trial performance. Wightman and Lintern found no overall evidence for success with ID training (i.e., simplification); however, they did not systematically contrast adaptive versus fixed schedules. An important contribution of the current meta-analyses is the merging of the simplification PTT literature with the fixed versus adaptive training literature to more fully demonstrate the impacts of increasing task difficulty. As with PTT, the purported benefit of ID training in decreasing intrinsic load during training can sometimes be offset by a cost, in this case, learning the inappropriate skill. For example, when the lag in a tracking task is increased (in fixed steps or adaptively), the skill of tracking for a low-lag task is quite different from that for a high-lag task; the former requires rapid response to perceptual changes, but the latter requires visual and cognitive anticipation.
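The distinction between the two ID schedules can be sketched as follows. This is an illustrative Python sketch only; the 0-to-1 difficulty scale, the step size, and the performance threshold are hypothetical values, not parameters from any study in the meta-analysis.

```python
def fixed_schedule(trial, step=0.1, start=0.3):
    """Fixed-increase schedule: every learner receives the same
    difficulty on a given trial, regardless of performance."""
    return min(1.0, start + step * trial)

def adaptive_schedule(current_difficulty, last_score,
                      threshold=0.8, step=0.1):
    """Adaptive schedule: difficulty rises only when this learner's
    last trial score meets a (hypothetical) performance criterion."""
    if last_score >= threshold:
        return min(1.0, current_difficulty + step)
    return current_difficulty  # learner not ready; hold difficulty
```

Under the fixed schedule, a struggling learner is pushed to the next difficulty step anyway; under the adaptive schedule, difficulty is held until the learner's own performance warrants an increase.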
Negative transfer between the two classes of skills could offset any advantages of lower intrinsic load.
The Meta-Analytic Approach
The purpose of the current research is to examine the relative effectiveness of the two techniques in transfer of training through a meta-analysis of research (Borenstein, Hedges, Higgins, & Rothstein, 2009; Rosenthal, 1991). This article augments the review by Wightman and Lintern (1985), which is more than 25 years old, addressed only psychomotor techniques, and did not include quantitative techniques of effect size estimation. Only one subsequent meta-analysis of PTT techniques was located (Fontana, Mazzardo, Furtado, & Gallagher, 2009). Their analysis concluded that part and whole techniques were equally effective overall. However, the authors examined only motor learning and focused on retention rather than transfer. Furthermore, 16 studies that we included in our current meta-analysis had not been included by those authors. We were unable to locate any meta-analysis addressing the success of ID techniques. The benefit of a meta-analysis (Rosenthal, 1991) is that it allows one not only to focus on the “collective wisdom” of a set of studies regarding a particular technique (e.g., what are the overall benefits or costs of PTT relative to whole-task training) but also to examine the effect of various moderator variables that may have varied within or between studies. For example, one such moderator variable for PTT is whether the parts were created by segmentation or fractionation. For ID, it might be whether difficulty was increased in a fixed or learner-adaptive schedule. For both training strategies, we will be interested in the moderator variable of learner experience to examine one of the specific predictions of CLT described earlier (van Merriënboer et al., 2006). We employ two convergent forms of meta-analysis, each with complementary strengths and offsetting weaknesses: one based on the transfer ratio (TR) or percentage gain and one based on the Hedges’ g estimate of statistical effect size (Borenstein et al., 2009).
In the former, we calculate, for each study, the ratio of performance on the transfer trials for participants in the treatment condition to that in the control condition; so a TR of 1.5 would indicate that the participants receiving the treatment (e.g., PTT) performed 50% better on the transfer trial than did participants in the control (i.e., whole-task training) condition. Individual TRs are then aggregated and subjected to statistical analyses, in much the same way that individual data for
participants are aggregated in conventional t tests or ANOVAs. For Hedges’ g, the effect size for each study is measured in a manner that more closely mimics that used in conventional statistical inference, reflecting not only the difference between the means of the two groups but also estimates of variance and sample size. Each technique has strengths and weaknesses. A strength of the TR is that it expresses a benefit (or cost) in a metric that is directly interpretable to the user (e.g., a ratio of 1.5 means “50% more effective”). Furthermore, many transfer studies, particularly those carried out away from the social sciences (e.g., in engineering domains), do not report the necessary statistical data that would allow extraction of a Hedges’ g measure, and therefore these would be excluded from such a meta-analysis. Use of the TR measure thereby allows our meta-analysis to be more inclusive, increasing the number of studies examined. But of course, a limitation of the TR is that studies of the sort described may contribute high ratios but of unknown statistical significance. The strength of Hedges’ g is both that it is more conventional and that it has a structured way of characterizing and exploiting the statistical power within each individual study to draw inferences regarding an effect size. Thus, if a particular comparison for a moderator variable exists only within a single study, there would be no statistical power to examine its effect in the TR approach, but it would be readily available in the Hedges’ g approach. In this respect, the two approaches offer complementary ways to attain statistical power but by different means: TR by including more studies and Hedges’ g by being less vulnerable to the loss of power when fewer studies are available. Hedges’ g was employed instead of the often-used Cohen’s d because the former corrects for the lower reliability of smaller-N studies, whereas the latter does not.
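For concreteness, the two effect size measures can be computed from a study's summary statistics roughly as follows. This is a Python sketch assuming higher scores indicate better transfer performance; the example pooling function is a simplified sample-size-weighted mean, not a full reimplementation of what packages such as CMA actually compute.

```python
import math

def transfer_ratio(mean_treatment, mean_control):
    """TR: ratio of treatment to control transfer performance.
    A TR of 1.5 means the treatment group performed 50% better."""
    return mean_treatment / mean_control

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardized mean difference with the small-sample correction
    factor that distinguishes Hedges' g from Cohen's d."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / pooled_sd          # Cohen's d
    j = 1 - 3 / (4 * df - 1)           # correction shrinks g for small N
    return j * d

def fixed_effect_pool(effects, sample_sizes):
    """Sample-size-weighted mean of per-study effects, in the spirit
    of a fixed-effect model (weighting simplified for illustration)."""
    total_n = sum(sample_sizes)
    return sum(e * n for e, n in zip(effects, sample_sizes)) / total_n
```

Note that with equal group means and standard deviations, g approaches d as the sample grows, because the correction factor approaches 1.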
Thus, the primary goal of the current research was to evaluate the effectiveness of ID and PTT. Under the simplest interpretation of CLT, both should produce positive transfer. However, within a broader context, the benefits of reduced intrinsic load could be offset by inappropriate or inadequate skill learning fostered by the techniques.
Furthermore, a more refined view of CLT predicts that the intrinsic-load benefits will be reduced with more experienced learners and with simpler tasks, wherein the reduced intrinsic load is less of an advantage. A second goal is to compare the two meta-analytic techniques to examine their degree of complementarity.
Method
Searches
Keyword searches were applied to a total of 76 unique databases, including the Defense Technical Information Center (DTIC), the entire database services of EBSCO (including PsycINFO, PsycARTICLES, and ERIC, among 40 others), the Web of Science, Wiley Interscience, and Science Direct. For each of the training strategies examined, a series of custom search strings was formed, including the appropriate strategy-specific keywords in conjunction with training or learning. The reference sections of high-value papers (e.g., literature reviews, meta-analyses, and significant empirical studies) were also searched for additional relevant papers.
Inclusion Criteria
Following the “judgment calls” of Wanous, Sullivan, and Malinak (1989, p. 261) and the general guidance of the larger meta-analysis literature (Rosenthal, 1991), several criteria were used to guide choices for inclusion and exclusion of literature. The final sources included were only those that (a) contained a control condition with identical training and transfer test circumstances to the treatment condition (both quasi-experimental and experimental designs), (b) provided a broadly defined measure of training transfer (defined as any posttraining procedural, knowledge, immediate-transfer, delayed-transfer, near-transfer, or far-transfer test), and (c) examined normal, healthy adults representative of the military population.
Coding and Data Extraction
We performed a second pass through the remaining articles to (a) assess the article’s level on a set of moderator variables (see Results) and (b) extract the necessary measures
of TRs and/or effect sizes. For each measure, separate data extraction was carried out for different levels of moderator variables available within the experimental design (e.g., high- vs. low-experienced learners) independently by two team members. The team checked moderator coding for consensus and addressed any discrepancies by collectively reviewing the original article to reach agreement on the appropriate code.
Analysis
We analyzed all Hedges’ g data in Comprehensive Meta Analysis (CMA) Version 2 using a fixed-effect model. We analyzed all TR data in SPSS Version 19 using a fixed-effect weighted least squares regression. The fixed-effect models used in both CMA and SPSS weighted study means by group sample size to improve the precision of estimates. Note that a somewhat unconventional k† is used to indicate the number of data points used in the analysis, and multiple effects per study were allowed; conventionally in meta-analysis, k indicates the number of studies used in the analysis, and only one effect size per study is included. Because of the exploratory nature of the analysis, the allowance for multiple effects, and the choice of fixed-effect rather than random-effects modeling, no adjustments were made to account for nesting of effects within studies. It should be noted, however, that even though samples contributing to multiple effects arguably had within-study dependence, they did represent independent samples (e.g., between-subjects designs) within each study.
Results
In the following, we report both the TR, for which a ratio equal to 1.0 indicates no difference between treatment and control conditions, and the Hedges’ g measure, for which a value of 0 indicates no difference. We also report the extent to which any value of each measure is significantly (*p < .05) greater or less than the baseline [1.0, 0] measure of equivalence between treatment and control, inferred from the exclusion of 1.0 or 0 from the 95% confidence intervals of TR and g, respectively. In a few cases, when a significant effect on one measure is accompanied by a near-significant (p < .10) effect on the other, we report the latter as well, to signal the overall consistency of the two measures. Our focus in reporting the results is on the differences between a given level of a moderator variable and the equivalence point (TR = 1.0 or g = 0) rather than on the differences between two or more levels of a moderator variable. This focus is because we believe the greatest value to the consumer of our results lies in determining the conditions in which specific variations of the training strategy in question are effective or not in contrast to the status quo, rather than how its overall benefits or costs are moderated by variation in those conditions. In each of the following sections, we first report the overall benefit or cost of the strategy, then the effects of a set of moderator variables that were common to both strategies (ID and PTT), and finally, the effects of any moderator variables that are defined only for the specific strategy in question. The strength of effects of a moderator variable is reflected in the extent to which both measures show effects of increasing significance.
Results: ID
Overall, there was neither a cost nor a benefit for ID training relative to the constant-difficulty control (TR = 1.22, k† = 30; g = +0.03, k† = 15). However, several moderator variables were significant.
Task complexity. Task complexity showed no significant effect, but there were too few studies in which tasks of different complexity were manipulated across the training strategy to reveal any patterns.
Prior experience. Prior experience was coded only for a subset of studies that clearly identified naive (no-experience) learners. (That is, there were no studies that clearly contrasted low- versus high-experienced learners.) For the low-experienced learners, the results revealed a benefit for ID (TR = 1.1, k† = 6; g = +0.75*, k† = 6), a benefit that of course was not present when the entire data set was examined (see the overall effect previously reported).
Type of transfer. Near-transfer findings were inconclusive, showing a nonsignificant benefit for TR and a nonsignificant cost for g. Although
not explicitly labeled as a form of far transfer, some studies continued to increase the difficulty of the task past the training criterion during transfer. When the transfer test was of equal difficulty to the training criterion, there was a strong benefit for ID (TR = 1.32, k† = 23, p < .10; g = +0.21*, k† = 9). However, when the task was more difficult in transfer than at the training criterion, there was a cost for ID (TR = 0.7, k† = 7; g = –0.56*, k† = 6).
Delivery environment. ID produced a benefit when there was no instructor present (TR = 1.38, k† = 23, p < .10; g = +0.42*, k† = 10) and, generally, a cost when an instructor was present (TR = 0.89, k† = 7; g = –0.48*, k† = 5). With respect to the delivery system, ID revealed a benefit when employed for computer- or web-based training (TR = 1.47, k† = 14, p < .10; g = +0.54*, k† = 8).
Test type. Performance tests showed a cost of ID on the g measure (TR = 1.12; g = –0.52*, k† = 9), whereas question-and-answer tests (whether based on declarative or procedural knowledge) showed a consistent benefit (TR = 1.24, k† = 6, p < .10; g = +0.54*, k† = 6).
Delay. Delayed transfer testing yielded a cost for ID (TR = 0.95, k† = 10; g = –0.36*, k† = 7), whereas immediate transfer yielded a benefit (TR = 1.39, k† = 20, p < .10; g = +0.46*, k† = 8).
Strategy-Specific Moderator
Adaptive difficulty. When difficulty was increased adaptively, there was a benefit relative to the constant-difficulty control (TR = 1.36, k† = 21, p < .10; g = +0.75*, k† = 7), whereas studies that increased difficulty in fixed (learner-independent) steps showed a cost relative to the constant-difficulty control group (TR = 0.77, k† = 6; g = –0.90*, k† = 5).
Discussion: ID
The most important finding from the meta-analysis that was unique to the ID strategy was the clear benefit of adaptive training compared with training schedules that increase difficulty in learner-independent steps. Although the technology required to implement automated adaptive training must be more complex than that for fixed increases, as it involves real-time (or semi-real-time) performance measurement and adaptive decisions on what and how much intrinsic load should be increased, the payoff is substantial: a 36% benefit relative to a constant-difficulty control. This benefit was reversed, becoming a 23% cost, when difficulty was increased in fixed steps.
Results: PTT
Overall, there was a cost for PTT (TR = 0.87*, k† = 65; g = –0.06, k† = 35). This cost was moderated as follows.
Task complexity. When complexity was manipulated within an experiment, tasks of low complexity showed a strong cost for PTT (TR = 0.60*, k† = 5; g = –0.49*, k† = 1), whereas tasks of high complexity did not (TR = 0.83, k† = 9; g = –0.10, k† = 5).
Prior experience. Prior experience showed only a small effect. When experience was high, there was a large PTT cost in one measure (g = –0.93*, k† = 2) but not in the other (TR = 0.84, k† = 2). However, when experience was low, this cost remained significant in the g measure but was reduced in magnitude (g = –0.25*, k† = 11).
Delivery environment. When an instructor was coupled with a group or class, there was a strong cost for PTT (TR = 0.85*, k† = 18; g = –0.43*, k† = 4), and when an instructor was coupled with an individual, there was a cost (TR = 0.80*, k† = 30; g = –0.08, k† = 21). However, when no instructor was involved, a benefit for PTT emerged (TR = 1.11, k† = 17; g = +0.40*, k† = 10). This benefit is consistent with the benefit for PTT that emerged with computer-based instruction (TR = 1.09, k† = 17; g = +0.32*, k† = 9).
Type of transfer. Near transfer showed strong costs (TR = 0.84*, k† = 53; g = –0.07, k† = 27), as did far transfer, even though the samples were small (TR = 0.79, k† = 2; g = –1.12*, k† = 2).
Strategy-Specific Moderators
Concurrence. Segmentation, whereby the parts were created out of sequential elements within the task, showed neither cost nor benefit by either measure. However, fractionation, whereby the two parts were performed concurrently during the whole task, showed a strong cost (TR = 0.71*, k† = 25; g = –0.35*, k† = 10).
VPT. VPT conditions or studies were not included in the PTT meta-analysis. However, when the effectiveness of VPT was evaluated in its own right (compared with a fixed-difficulty control), a substantial benefit was observed (TR = 1.27*, k† = 12; g = +0.74*, k† = 7). This benefit of course stands in stark contrast with the PTT cost described previously, but it is in agreement with the fractionation cost: both point to the importance of time-sharing skill, the intended benefit of VPT.
Discussion: PTT
In the meta-analysis, two prominent results emerge that are unique to the PTT manipulations. First, the overall failure of PTT is localized in those studies in which parts were created by fractionation. That is, components time-shared in the whole task were separated in training. This separation prevented a time-sharing skill from being developed during training, an absence that penalized the treatment group when it transferred to the whole task. This interpretation is supported by the strong cost observed across time-sharing, perceptual, and psychomotor skill types. Second, the importance of time-sharing skill is indirectly supported by the success of VPT (Gopher et al., 1989). Here, the best of both worlds is preserved: The whole task is maintained in some form during training, but the learner is allowed (and in fact encouraged) to focus on one task at a time, reducing the intrinsic load of the concurrent part.
General Discussion
The unifying theme of the two meta-analyses reported was the role of CLT in accounting for the successes or failures of the two strategies. Both PTT and ID were designed to reduce intrinsic task load in the training environment, but neither demonstrated a clear overall pattern of success, and indeed, one, PTT, might be described as failing. In both cases, one can point to counteracting factors, unrelated to cognitive load, that partially or fully offset any benefits of load reduction. As pointed out earlier, for ID, there were potential problems of training inappropriate skills, but the meta-analysis also revealed the challenges of developing a single fixed schedule of difficulty increases for all learners. When this impediment was removed, by the adoption of adaptive logic, the reduced-load benefits predicted by CLT (Sweller, 2010) clearly emerged. For PTT, we assume that for most time-sharing tasks, the importance of the time-sharing skill is great enough that it must be practiced during whole-task training sessions; its absence is seen as a cost of the fractionation PTT technique. Again, when provision for time-sharing skill practice was made in the VPT paradigms, the benefits of reduced intrinsic load (from concentrating on only one component during learning) emerged. The underlying framework of CLT was also supported by two other aspects of the data. First, the theory predicted that the benefits of intrinsic-load reduction strategies would be reduced or eliminated with more experienced learners (Paas & van Gog, 2009; van Merriënboer et al., 2006). In support, we found that the benefits of the ID strategy were realized for the novice learner but not for all learners together; and although PTT showed an overall cost, this cost was substantially reduced for the naive learner when the g measure was examined. Second, if less experienced learners, presumably bearing a higher intrinsic load, benefit more (or are penalized less) from the load-reducing strategies, then we would also expect greater load-reducing benefits with more complex tasks, and indeed, the task complexity effects were consistent with this prediction. Although ID strategies were found to be equally effective for simple and complex tasks, the PTT cost with low-complexity tasks was eliminated for training of high-complexity tasks, in which intrinsic load would have been higher, and hence load-reducing benefits were better exploited.
Instructor Effect
Another common pattern between the two strategies that might reflect a general principle is the substantial “instructor-present cost.” Here, the presence of an instructor in the training environment eliminated (and slightly reversed) the benefits of ID strategies and amplified the costs of PTT. Two mechanisms may be hypothesized, post hoc, to account for this unexpected effect. First, consistent with CLT, it may be that the presence of and interaction with an instructor,
along with whatever technology was used to implement the strategy, added extraneous load to the students' task. Second, and related, there may have been circumstances in which the instructor offered feedback or guidance that was inconsistent with the schedules inherent in the training environment.

Methodological Findings
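The two complementary effect size measures used in these analyses, the transfer ratio (TR) and Hedges' g, can be sketched in code. This is a minimal illustration only; the group means, standard deviations, and sample sizes below are invented, not data from any reviewed study:

```python
import math

def transfer_ratio(treatment_mean, control_mean):
    """Ratio of treatment-group to control-group transfer performance.
    Values above 1.0 indicate a training-strategy benefit when higher
    scores mean better performance."""
    return treatment_mean / control_mean

def hedges_g(m_treat, m_ctrl, sd_treat, sd_ctrl, n_treat, n_ctrl):
    """Standardized difference between treatment and control means:
    Cohen's d scaled by the small-sample bias-correction factor J."""
    df = n_treat + n_ctrl - 2
    # Pooled standard deviation across the two groups
    sp = math.sqrt(((n_treat - 1) * sd_treat ** 2 +
                    (n_ctrl - 1) * sd_ctrl ** 2) / df)
    d = (m_treat - m_ctrl) / sp
    j = 1 - 3 / (4 * df - 1)  # Hedges' small-sample correction
    return d * j

# Invented example scores (not data from any reviewed study)
print(round(transfer_ratio(82.0, 75.0), 3))                # 1.093
print(round(hedges_g(82.0, 75.0, 10.0, 10.0, 12, 12), 3))  # 0.676
```

Because g standardizes the mean difference and corrects for small samples, it remains interpretable for comparisons built from only a few studies, which bears on the power difference reported below.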
The two techniques did indeed provide convergent and complementary information. We analyzed the concordance between the two effect size measures by comparing the percentage of cases in which they agreed or disagreed on direction (cost or benefit) and on significance (p < .05). For the ID analysis, directional concordance was 78%, with 10% of cases agreeing in both direction and significance. There were no cases in which the two measures yielded significant effects in opposite directions, and only 8% (4) of the total differed in direction with one measure indicating a significant effect. For the PTT analysis, directional concordance was 94%, with 41% of cases agreeing in both direction and significance. Again, there were no cases in which the two measures yielded significant effects in opposite directions, and there was only one case in which the two effect size measures differed in direction and one measure indicated a significant effect. Overall, the Hedges' g measure was the more powerful, yielding 35 and 33 significant effects for PTT and ID, respectively, whereas TR yielded 28 and 2 such effects. This latter result signals the benefit of g for contributing results to comparisons built on only a few studies.

Implications for Training
Training-system designers should strive to implement adaptive logic for ID schedules. They should be wary of using PTT techniques for complex multitask skills, either minimizing the number of part-task trials or adopting VPT strategies as an alternative. Designers should also recognize that the selection of a training strategy is not one-size-fits-all: individual factors, such as learner experience, and skill types, such as time sharing, should weigh in those decisions.
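As an illustration of the adaptive-logic recommendation, a minimal performance-contingent difficulty schedule might look like the sketch below. All names, thresholds, and step sizes here are hypothetical placeholders; the reviewed studies employed a variety of adaptive variables and logic schemes:

```python
def adaptive_difficulty(score, level, target=0.80, band=0.05,
                        step=1, min_level=1, max_level=10):
    """One update of a performance-contingent difficulty schedule:
    raise difficulty when the trainee scores above the target band,
    lower it when below, and hold it constant inside the band."""
    if score > target + band:
        return min(level + step, max_level)
    if score < target - band:
        return max(level - step, min_level)
    return level

# A well-performing trainee is stepped up; a struggling one is stepped down
level = 3
level = adaptive_difficulty(0.92, level)  # 0.92 > 0.85, so level becomes 4
level = adaptive_difficulty(0.60, level)  # 0.60 < 0.75, so level drops to 3
print(level)  # 3
```

The contrast with a fixed-step schedule is that difficulty here tracks the individual learner's performance rather than advancing on a preset trial count.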
Limitations of Current Research
All meta-analyses lack total control over potentially confounding moderator variables. When each moderator variable is examined in turn, its levels are likely to contain different proportions of studies at each level of another moderator, in a way that could confound (inflate or deflate) the measured effect size of the first.

Acknowledgments

This work was supported by U.S. Army Research Institute Contract W91WAW-09-C-0081 to Alion Science and Technology, titled "Understanding the Impact of Training on Performance."
Key Points

• Part-task training (PTT) is not an effective training strategy for parts that are time shared in the transfer task but may be effective for parts performed sequentially.
• PTT is effective when the parts can be integrated through variable-priority training.
• Increasing-difficulty training can be effective when difficulty is increased adaptively but not when it is increased in fixed steps.
• Both training strategies benefit learners who are naive to the task but not those who are more experienced.
• The results are consistent with predictions of cognitive load theory.
References

References marked with an asterisk indicate studies included in the meta-analyses.

*Adams, J. A., & Hufford, L. E. (1962). Contributions of a part-task trainer to the learning and relearning of a time-shared flight maneuver. Human Factors, 4, 159–170.
*Bancroft, N. R., & Duva, J. S. (1969). The effects of adaptive stepping criterion on tracking performance: A preliminary investigation (Report No. NAVTRADEVCEN-TN-3). Orlando, FL: Naval Training Device Center.
*Barrett, G. V., Greenawalt, J. P., Thornton, C. L., & Williamson, T. R. (1977). Adaptive training and individual differences in perception. Perceptual and Motor Skills, 44, 875–880.
*Battiste, V. (1987). Part-task vs. whole-task training on a supervisory control task. In Proceedings of the Human Factors Society 31st Annual Meeting (pp. 1365–1369). Santa Monica, CA: Human Factors and Ergonomics Society.
Borenstein, M., Hedges, L., Higgins, J., & Rothstein, H. (2009). Introduction to meta-analysis. New York, NY: Wiley & Sons.
*Briggs, G. E., & Brogden, W. J. (1954). The effect of component practice on performance of a lever-positioning skill. Journal of Experimental Psychology, 48, 375–380.
*Briggs, G. E., & Naylor, J. C. (1962). The relative efficiency of several training methods as a function of transfer task complexity. Journal of Experimental Psychology, 64, 505–512.
*Briggs, G. E., & Waters, L. K. (1958). Training and transfer as a function of component interaction. Journal of Experimental Psychology, 56, 492–500.
*Brydges, R., Carnahan, H., Backstein, D., & Dubrowski, A. (2007). Application of motor learning principles to complex surgical tasks: Searching for the optimal practice schedule. Journal of Motor Behavior, 39, 40–48.
*Clawson, D. M., Healy, A. F., Ericsson, K. A., & Bourne, L. E., Jr. (2001). Retention and transfer of Morse code reception skill by novices: Part-whole training. Journal of Experimental Psychology: Applied, 7, 129–142.
*Cote, D. O., Williges, B. H., & Williges, R. C. (1981). Augmented feedback in adaptive motor skill training. Human Factors, 23, 505–508.
Craik, F. I. M., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11, 671–684.
Damos, D., & Wickens, C. D. (1980). The identification and transfer of time-sharing skills. Acta Psychologica, 46, 15–39.
*Fabiani, M., Buckley, J., Gratton, G., Coles, M., Donchin, E., & Logie, R. (1989). The training of complex task performance. Acta Psychologica, 71, 259–299.
Fontana, F. E., Mazzardo, O., Furtado, O., Jr., & Gallagher, J. D. (2009). Whole and part practice: A meta-analysis. Perceptual and Motor Skills, 109, 517–530.
*Gagné, R. M., & Foster, H. (1949). Transfer of training from practice on components in a motor skill. Journal of Experimental Psychology, 39, 47–68.
*Goettl, B. P., & Shute, V. J. (1996). Analysis of part-task training using the backward-transfer technique. Journal of Experimental Psychology: Applied, 2, 227–249.
Gopher, D., Weil, M., & Siegel, D. (1989). Practice under changing priorities: An approach to the training of complex skills. Acta Psychologica, 71, 147–177.
*Gopher, D., Williges, B. H., Williges, R. C., & Damos, D. L. (1975). Varying the type and number of adaptive variables in continuous tracking. Journal of Motor Behavior, 7, 159–170.
Halford, G. S., Wilson, W. H., & Phillips, S. (1998). Processing capacity defined by relational complexity: Implications for comparative, developmental, and cognitive psychology. Behavioral and Brain Sciences, 21, 803–831.
*Hansen, S., Tremblay, L., & Elliott, D. (2005). Part and whole practice: Chunking and online control in the acquisition of a serial motor task. Research Quarterly for Exercise and Sport, 76, 60–66.
Healy, A., & Bourne, L. (2012). Training cognition: Optimizing efficiency, durability, and generalizability. Boca Raton, FL: CRC Press.
Kalyuga, S. (2011). Cognitive load theory: How many types of load does it really need? Educational Psychology Review, 23, 1–19.
*Kalyuga, S., & Sweller, J. (2005). Rapid dynamic assessment of expertise to improve the efficiency of adaptive e-learning. Educational Technology Research & Development, 53, 83–93.
Keith, N., & Frese, M. (2008). Effectiveness of error management training: A meta-analysis. Journal of Applied Psychology, 93, 59–69.
*Koch, H. L. (1923). A neglected phase of the part-whole problem. Journal of Experimental Psychology, 6, 366–376.
*Lim, J., Reiser, R., & Olina, Z. (2009). The effects of part-task and whole-task instructional approaches on acquisition and transfer of a complex cognitive skill. Educational Technology Research & Development, 57, 61–77.
Lintern, G. (1989). The learning strategies program: Concluding remarks. Acta Psychologica, 71, 301–309.
Lintern, G., & Wickens, C. D. (1991). Issues for acquisition in transfer of timesharing and dual-task skills. In D. Damos (Ed.), Multiple-task performance (pp. 123–138). London, UK: Taylor & Francis.
*Mane, A. M., Adams, J. A., & Donchin, E. (1989). Adaptive and part-whole training in the acquisition of a complex perceptual-motor skill. Acta Psychologica, 71, 179–196.
Mane, A., Coles, M., Wickens, C. D., & Donchin, E. (1983). The use of additive factors methodology in the analysis of a skill. In Proceedings of the Human Factors Society 27th Annual Meeting (pp. 407–411). Santa Monica, CA: Human Factors Society.
*Mattoon, J. S. (1994). Designing instructional simulations: Effects of instructional control and type of training task on developing display-interpretation skills. International Journal of Aviation Psychology, 4, 189–209.
Mayer, R. E., Griffith, I., Jurkowitz, N., & Rothman, D. (2008). Increased interestingness of extraneous details in a multimedia science presentation leads to decreased learning. Journal of Experimental Psychology: Applied, 14, 329–339.
Mayer, R. E., & Moreno, R. (2003). Nine ways to reduce cognitive load in multimedia learning. Educational Psychologist, 38, 45–52.
McDaniel, M. A., Howard, D. C., & Einstein, G. O. (2009). The read-recite-review study strategy: Effective and portable. Psychological Science, 20, 516–522.
*McGuigan, F. J., & MacCaslin, E. F. (1955). Whole and part methods in learning a perceptual motor skill. American Journal of Psychology, 68, 658–661.
*Metzler-Baddeley, C., & Baddeley, R. (2009). Does adaptive training work? Applied Cognitive Psychology, 23, 254–266.
*Mirabella, A., & Lamb, J. C. (1966). Computer based adaptive training applied to symbolic displays. Perceptual and Motor Skills, 23, 647–661.
*Murray, J. F. (1981). Effects of whole versus part method of training on transfer of learning. Perceptual and Motor Skills, 53, 883–889.
*Nadolski, R. J., Kirschner, P. A., Eroen, J. J., & van Merriënboer, J. G. (2005). Optimizing the number of steps in learning tasks for complex skills. British Journal of Educational Psychology, 75, 223–237.
*Naylor, J. C., & Briggs, G. E. (1963). Effects of task complexity and task organization on the relative efficiency of part and whole training methods. Journal of Experimental Psychology, 65, 217–224.
*Norman, D. A. (1972). Adaptive training of manual control: Comparison of three adaptive variables and two logic schemes (Report No. NAVTRADEVCEN-69-C-0156-1). Orlando, FL: Naval Training Device Center.
Paas, F., Renkl, A., & Sweller, J. (2003). Cognitive load theory and instructional design: Recent developments. Educational Psychologist, 38, 1–4.
Paas, F., & van Gog, T. (2009). Principles for designing effective and efficient training for complex cognitive skills. In F. Durso (Ed.), Reviews of human factors and ergonomics (Vol. 5, pp. 166–194). Santa Monica, CA: Human Factors and Ergonomics Society.
*Park, J. H., Wilde, H., & Shea, C. (2004). Part-whole practice of movement sequences. Journal of Motor Behavior, 36, 51–61.
*Peck, A. C., & Detweiler, M. C. (2000). Training concurrent multistep procedural tasks. Human Factors, 42, 379–389.
Rosenthal, R. (1991). Meta-analytic procedures for social research (Rev. ed.). Beverly Hills, CA: Sage.
*Ross, S. M., & Rakow, E. A. (1981). Learner control versus program control as adaptive strategies for selection of instructional support on math rules. Journal of Educational Psychology, 73, 745–753.
*Salden, R. J., Paas, F., Broers, N. J., & van Merriënboer, J. J. (2004). Mental effort and performance as determinants for the dynamic selection of learning tasks in air traffic control training. Instructional Science, 32, 153–172.
Schneider, W. (1985). Training high-performance skills: Fallacies and guidelines. Human Factors, 27, 285–300.
Sitzmann, T., Ely, K., Bell, B. S., & Bauer, K. (2010). The effects of technical difficulties on learning and attrition during online training. Journal of Experimental Psychology: Applied, 16, 281–292.
Slamecka, N. J., & Graf, P. (1978). The generation effect: Delineation of a phenomenon. Journal of Experimental Psychology: Human Learning and Memory, 4, 592–604.
*Stroud, J. B., & Ridgeway, C. W. (1932). The relative efficiency of the whole, part and progressive part methods when trials are massed: A minor experiment. Journal of Educational Psychology, 23, 632–634.
Sweller, J. (1994). Cognitive load theory, learning difficulty and instructional design. Learning and Instruction, 4, 295–312.
Sweller, J. (2010). Element interactivity and intrinsic, extraneous and germane cognitive load. Educational Psychology Review, 22, 123–138.
van Merriënboer, J. J. G., Kester, L., & Paas, F. (2006). Teaching complex rather than simple tasks: Balancing intrinsic and germane load to enhance transfer of learning. Applied Cognitive Psychology, 20, 343–352.
Wanous, J. P., Sullivan, S. E., & Malinak, J. (1989). The role of judgment calls in meta-analysis. Journal of Applied Psychology, 74, 259–264.
Weinstein, Y., McDermott, K. B., & Roediger, H. L. (2010). A comparison of study strategies for passages: Re-reading, answering questions, and generating questions. Journal of Experimental Psychology: Applied, 16, 308–316.
Wickens, C. D., Carolan, T., Hutchins, S., & Cumming, J. (2011). Investigating the impact of training on transfer: A meta-analytic approach. In Proceedings of the Human Factors and Ergonomics Society 55th Annual Meeting (pp. 2138–2142). Santa Monica, CA: Human Factors and Ergonomics Society.
Wickens, C. D., Hutchins, S., Carolan, T., & Cumming, J. (2012). Attention and cognitive resource load in training strategies. In A. F. Healy & L. E. Bourne, Jr. (Eds.), Training cognition: Optimizing efficiency, durability, and generalizability. New York, NY: Psychology Press.
Wightman, D., & Lintern, G. (1985). Part-task training for tracking and manual control. Human Factors, 27, 267–284.
*Wightman, D. C., & Sistrunk, F. (1987). Part-task training strategies in simulated carrier landing final-approach training. Human Factors, 29, 245–254.
*Wood, M. E. (1970). Continuously adaptive versus discrete changes of task difficulty in the training of a complex perceptual motor task (Report No. AFHRL-TR-70-30). Mesa, AZ: Air Force Human Resources Laboratory.
Christopher D. Wickens is a senior scientist at Alion Science and Technology Corporation, Micro Analysis and Design Operation, in Boulder, Colorado, and professor emeritus at the University of Illinois at Urbana-Champaign. He received his PhD in psychology from the University of Michigan in 1974.

Shaun Hutchins is a senior human factors engineer at Alion Science and Technology, Micro Analysis and Design Operation. He received his MA in experimental psychology from New Mexico State University in Las Cruces and is currently a PhD student studying research methodology at Colorado State University.

Thomas Carolan is a senior scientist and program manager with Alion Science and Technology, where he works on research and development efforts related to human performance measurement and training effectiveness. He has a PhD in psychology from the University of Connecticut.

John Cumming, PhD, is a recent graduate of the Research Methodology program within the School of Education at Colorado State University. He received his MA in educational psychology, with an emphasis in research and evaluation methodology, from the University of Colorado at Denver and his BA in psychology from the Metropolitan State College of Denver.

Date received: December 12, 2011
Date accepted: May 21, 2012