Int. J. Human Factors and Ergonomics, Vol. 4, Nos. 3/4, 2016
Designing adaptive systems: selecting an invoking threshold to improve human performance Christina F. Rusnock* Department of Systems Engineering and Management, Air Force Institute of Technology, 2950 Hobson Way, Bldg 640, Room 107B, Wright Patterson Air Force Base, OH 45433, USA Email:
[email protected] *Corresponding author
Christopher D. Geiger Business Analytics and Industrial Engineering, Universal Orlando Resort, 1000 Universal Studios Plaza/B-110, Orlando, FL 32819, USA Email:
[email protected]
Abstract: Previous adaptive automation design research has focussed on the decisions of how to automate, how much to automate, and what to automate. Another important factor that has not been widely considered is when to automate. As adaptive systems become more viable, the design decision of when to automate (i.e. the workload/taskload level that should be used to invoke the adaptive automation) will become increasingly important. This research uses human performance simulation to analyse the impact of adaptive automation thresholds on operator workload and situation awareness. Through an unmanned ground and aerial vehicle case study using human trials and discrete-event simulation, this research reveals that the effectiveness of the adaptive automation requires a deliberate trade-off between performance, workload, and situation awareness goals.
Keywords: adaptive automation; human performance modelling; invoking threshold; mental workload; simulation; situation awareness.
Reference to this paper should be made as follows: Rusnock, C.F. and Geiger, C.D. (2016) ‘Designing adaptive systems: selecting an invoking threshold to improve human performance’, Int. J. Human Factors and Ergonomics, Vol. 4, Nos. 3/4, pp.292–315.
Biographical notes: Major Christina F. Rusnock, PhD, is an Assistant Professor in the Department of Systems Engineering and Management at the Air Force Institute of Technology, Wright Patterson AFB, OH. Dr. Rusnock earned her PhD in Industrial Engineering, Human Factors/Ergonomics Specialisation, from the University of Central Florida, and an MS in Research and Development Management from the Air Force Institute of Technology. Her research interests include human performance modelling, mental workload, trust in automation, and situational awareness, with a focus on applications in human-machine interaction and adaptive automation. She is an active member of IISE and HFES.
Copyright © 2016 Inderscience Enterprises Ltd.
Christopher D. Geiger, PhD, is a Director of Business Analytics and Industrial Engineering at Universal Orlando Resort in Orlando, FL. Prior to joining Universal, Dr. Geiger was a tenured Associate Professor of Industrial Engineering and Management Systems at the University of Central Florida in Orlando, FL. Dr. Geiger earned an MS and a PhD in Industrial Engineering from Purdue University in West Lafayette, IN. His areas of interest include computer modelling and simulation theory and multi-objective optimisation (multi-criteria decision-making) with application in operations scheduling, production planning, inventory management, and transportation scheduling. He is an active member of IISE and INFORMS.
1 Introduction
Automated systems provide numerous benefits, including increased operator performance, system efficiency, and operator safety. However, several human factor issues tend to emerge with automation use, including operator mistrust of the automation, over-reliance on automation, erosion of the human operator’s skills, decreased situation awareness, decreased vigilance, and operator complacency (Bailey and Scerbo, 2007; Lee and See, 2004; Sheridan and Parasuraman, 2006; Sheridan and Verplank, 1978). Adaptive automation has the potential to address these issues, while maintaining the benefits of traditional automation. Adaptive automation dynamically changes the level of automation in real time in order to moderate operator mental workload to meet performance or workload goals. Changes in the level of automation are triggered by the system using real-time information regarding task performance, events, or operator states (Hancock and Chignell, 1988; Rouse, 1988). By automating tasks during periods of high taskload/workload, an adaptive system takes on more tasks from the operator, reducing the operator’s workload to a desired level while allowing increased situation awareness. Furthermore, during times of low taskload/workload, adaptive systems return tasks to the operator, thereby decreasing the likelihood of operator boredom and complacency while preventing the erosion of the human operator’s skills. By keeping the human in the loop during dynamic task allocation, adaptive automation provides a unique opportunity to improve system performance. Adaptive automation might be particularly well-suited for systems where the taskloads vary over time and the decisions involved require human judgement (Moray, Inagaki and Itoh, 2000; Parasuraman, Barnes and Cosenzo, 2007). Numerous studies have demonstrated the benefits of adaptive automation over traditional automation (Bailey et al., 2006; de Visser and Parasuraman, 2011; Hancock, 2007). 
However, much investigation is still needed with respect to designing adaptive systems, especially since handoffs that are poorly implemented or poorly timed can negate the automation’s performance benefits and erode operator trust. de Visser et al. (2008) describe three key decisions in the design of adaptive systems: how much to automate, what to automate, and when to automate. The design space regarding how much to automate is largely captured by levels of automation (LOA) frameworks such as that described by Sheridan and Verplank (1978). The decision of
what to automate examines the specific tasks to automate. This design space has been explored by both the LOA taxonomy proposed by Endsley and Kaber (1999) and the model for types and LOA framework proposed by Parasuraman, Sheridan and Wickens (2000). In both of these frameworks, LOAs are combined with automation activities (e.g. functions or information processing stages) to describe the particular activity to be automated. With respect to when to automate, de Visser et al. (2008) describe three invoking mechanisms: critical-event, measurement-based, and model-based (and also hybrid mechanisms). However, the specific mechanism to invoke automation could be more accurately categorised as how to automate. The question of when to automate is actually a question of timing, and, given the selection of a specific mechanism, this question still remains. Goodman et al. (2016) began to explore the issue of timing by using human performance modelling to predict the impact of invoking timing on team performance and workload when using a critical event-based trigger. With a post-event trigger, automation can invoke
1 upon immediate detection or
2 after a specified delay time from detection, allowing the human an opportunity to intervene.
For a pre-event trigger, the options are
1 immediately or
2 within a specified time remaining until the event occurs, allowing the human to pre-emptively intervene.
Goodman et al. (2016) examined a post-event trigger, finding that as the delay in invoking the automation increased, team performance decreased, operator workload increased, and human involvement (i.e. compensation) increased. In the case of post-event triggers, timing is largely a question of how much delay the automation should have before concluding the operator is not going to react, and thus the automation should take over (e.g. at what point should a car initiate automatic braking). However, delay time is less relevant for at-time-of-event triggers, such as those based on task demands, event occurrences, or neurological/physiological measurements. Current research is increasingly investigating the use of at-time-of-event triggers, such as those based on task demand or predicted workload. As these methods become more refined and established, the design decision of when to automate will refer to the level of taskload/workload that should be used to invoke the adaptive automation. This study seeks to make headway on the implications of this design decision by exploring the impact of adaptive automation threshold levels based on at-time-of-event triggers. This study seeks to understand the relative trade-offs to workload and situation awareness that occur when specifying invoking threshold levels for an adaptive system. Due to the exploratory nature of this work and the current lack of refinement in real-time workload measurement techniques, this paper explores the adaptive automation design decision through human performance simulation of a human-in-the-loop system to be modified in the future to include real-time adaptive automation.
2 Previous related research
2.1 The relationship between mental workload and operator performance
Implementation of adaptive automation consistently centres on injecting automation when operators are perceived to have high levels of workload (Bailey et al., 2006; de Visser and Parasuraman, 2011; Parasuraman, Cosenzo and de Visser, 2009). Workload and performance are presumed to have a non-linear, inverted U-shaped relationship as described by the Hebb-Yerkes-Dodson Law (Teigen, 1994). Through empirical studies, Cassenti et al. (2011) characterise this workload-performance curve (shown in Figure 1), where low operator mental workload corresponds to a medium-high level of operator performance (Segment A). At low to medium levels of workload (Segment B), operator performance tends to be high, reaching a maximum level, and the operator can maintain this level of performance as workload increases. However, as operator workload continues to increase, it begins to negatively impact performance (Segment C) until finally the operator reaches the lowest level of performance (Segment D) (Cassenti et al., 2011). Thus, establishing a mental workload goal for an adaptive system depends upon where an operator is currently located on this workload-performance curve. For instance, if a human operator is operating along Segment C, then that operator’s performance should benefit from a reduction in the workload level. On the other hand, if a human operator is operating along Segment A of the workload-performance curve, then that operator could benefit from an increase in workload.
Figure 1 Performance-workload curve
Source: Adapted from Cassenti et al. (2011)
Thus, for adaptive systems to be truly effective, system designers must recognise the potential for non-linear relationships between workload and performance - and account for this relationship in the system design. Early in the adaptive system design process, the adaptive system designers should identify which portion or portions of this curve account for the relationship between workload and performance for their particular system under the range of potential operating conditions. This approach would be helpful to identify
which phases of operation and workload levels would be candidates for intervention, which phases and workload levels are acceptable, and which phases and workload levels require automation to take over some of the operator’s tasks.
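The segment-based reasoning above can be sketched in a few lines of Python. This is only an illustration: the breakpoints separating Segments A to D are hypothetical placeholders on an arbitrary workload scale, and the actions for Segments B and D are assumptions (the text states only that Segment C benefits from reduced workload and Segment A from increased workload).

```python
from bisect import bisect_right

# Hypothetical breakpoints between Segments A|B, B|C and C|D on an
# arbitrary workload scale; a real system would estimate these empirically.
SEGMENT_BREAKPOINTS = (2.0, 6.0, 12.0)
SEGMENTS = ("A", "B", "C", "D")

# Actions for A and C follow the text; B and D are assumptions.
ACTIONS = {
    "A": "increase workload",
    "B": "maintain",            # assumption: performance is already near maximum
    "C": "reduce workload",
    "D": "reduce workload",     # assumption: same direction as Segment C
}


def classify_segment(workload, breakpoints=SEGMENT_BREAKPOINTS):
    """Return the curve segment (A-D) that a workload value falls into."""
    return SEGMENTS[bisect_right(breakpoints, workload)]


def recommend(workload, breakpoints=SEGMENT_BREAKPOINTS):
    """Return the suggested direction of workload adjustment."""
    return ACTIONS[classify_segment(workload, breakpoints)]
```

For example, `recommend(7.0)` falls in the hypothetical Segment C and suggests reducing workload, whereas `recommend(1.0)` falls in Segment A and suggests increasing it.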
2.2 The relationship between mental workload and situation awareness
Mental workload and situation awareness do not follow a linear, or even quadratic, relationship. As Endsley hypothesised, workload and situation awareness display a complex relationship that is highly situation-dependent, which can either converge or diverge (Endsley, 1993). The particulars of a task, and the operator’s relationship to that task, could result in an operator existing in any region of the workload-situation awareness framework shown in Figure 2. In the ideal state region, situation awareness tends to be high when mental workload is low. In the overload region, operator situation awareness tends to be low when mental workload is high. Furthermore, this framework recognises that human operators can maintain relatively high situation awareness when at fairly high levels of mental workload (the challenged region). On the other hand, both situation awareness and workload can be low (the vigilance region), such as when performing long, uneventful tasks.
Figure 2 Hypothesised workload-situation awareness framework (see online version for colours)
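The four regions of this framework can be expressed as a simple lookup. The sketch below assumes that workload and situation awareness have each been normalised to the range 0 to 1 and that a single cut point separates "low" from "high" on each axis; both the normalisation and the cut points are hypothetical and are not taken from the framework itself.

```python
def classify_region(workload, situation_awareness,
                    workload_cut=0.5, sa_cut=0.5):
    """Map a (workload, situation awareness) pair to a framework region.

    Inputs are assumed to be normalised to [0, 1]; the cut points are
    hypothetical and would need to be set for a specific system.
    """
    high_workload = workload >= workload_cut
    high_sa = situation_awareness >= sa_cut

    if high_sa and not high_workload:
        return "ideal"        # high awareness, low workload
    if high_sa and high_workload:
        return "challenged"   # awareness maintained despite high workload
    if high_workload:
        return "overload"     # low awareness, high workload
    return "vigilance"        # low awareness, low workload
```

A long, uneventful monitoring task might be represented as `classify_region(0.1, 0.2)`, which lands in the vigilance region.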
Establishing the appropriate workload level to invoke the adaptive automation requires an understanding of the relationship between workload and performance, as well as between workload and situation awareness, for the particular human-computer system being designed as well as for the various relevant operating conditions. Furthermore, it is important to note that maximum system performance may not be the only priority, and that there may be workload or situation awareness goals that need to be met, even if achieving these goals results in instances of lower overall system performance. Workload and situation awareness goals may permit some system performance degradation in order to prevent the operator from becoming bored and disengaged (i.e. avoid low workload, avoid low situation awareness) or to prevent the operator from becoming overwhelmed (i.e. avoid high workload). This perspective recognises that accounting for the long-term needs of the operator will result in more stable performance, even if it diminishes short-term performance. For example, complete automation with minimal human involvement
(low workload, low situation awareness) may result in higher short-term performance, but disengaging the operator may limit his/her ability to respond to unexpected situations that require human involvement (Kaber, Onal and Endsley, 2000). This may provide motivation to reduce automation and increase human involvement in mundane system operations, for the purpose of increasing workload and situation awareness, even if human task performance is not as effective or efficient as automation task performance.
2.3 Adaptive automation
Current research in adaptive automation primarily seeks to demonstrate the human-computer system performance benefits of adaptive automation over user-initiated automation, static automation, or manual systems. Studies examining the impact of adaptive automation on mental workload consistently find that adaptive automation results in lower workload than using static automation, random automation, or manual modes (de Visser and Parasuraman, 2011; Dorneich et al., 2006; Parasuraman, Cosenzo and de Visser, 2009; Wilson and Russell, 2004). While most studies also report improvements in system performance using adaptive automation vs. static automation, random automation, and manual modes (Cosenzo et al., 2010; Dorneich et al., 2006; Haarmann, Boucsein and Schaefer, 2009; Kaber et al., 2005; Parasuraman, Cosenzo and de Visser, 2009; Wilson and Russell, 2007), a few studies find that there is no statistical difference in performance between adaptive systems and non-adaptive systems (Arciszewski, De Greef and Van Delft, 2009; Szalma and Taylor, 2011). While studies have sought to address the effectiveness of adaptive automation systems, much work still remains with respect to the design of adaptive systems. Those studies that examine adaptive automation system design decisions focus primarily on what to automate (Clamann, Wright and Kaber, 2002; Steinhauser, Pavlas and Hancock, 2009). Some consideration has also been paid to how much to automate (Arciszewski, De Greef and Van Delft, 2009), interface design (Kaber et al., 2001), invoking methods (de Visser et al., 2008), and invoking triggers (Davidsson and Alm, 2014). However, after these system design decisions are made, the automation invoking timing still needs to be established. For workload-based triggers, this timing specifically refers to the workload threshold at which the automation is triggered.
Given that adaptive automation ought to be implemented in systems with variable workload, it is expected that the invoking threshold used to activate the automation will have a significant impact on the frequency and duration that the automation is active. This, in turn, will impact the level of human operator workload and overall human-machine team performance. Thus, it is important that the threshold value for invoking the adaptive automation be carefully selected, as this value simultaneously influences performance, workload, and situation awareness.
3 Study methodology
3.1 Purpose
This study expanded on previous research regarding adaptive automation design decisions by examining when to automate when using measurement-based invoking
mechanisms. To do so, this study used human performance modelling to explore invoking threshold value setting as a function of an operator’s mental workload in order to balance human-machine team performance and operator situation awareness. This study also sought to understand the relative trade-offs that occur when selecting the threshold level for invoking adaptive automation. Additionally, this study examined how selecting an invoking threshold level may require the consideration of the trade-off between performance, workload, and situation awareness goals.
3.2 Methodology overview
The study begins with the identification of a system that will be redesigned to include adaptive automation. A human-in-the-loop experiment is performed on the baseline system (without adaptive automation) to obtain an understanding of current operator performance and to provide data for use in the human performance model. Next, a baseline simulation model is built and validated using the human-in-the-loop data. Finally, the human performance model is updated to include adaptive automation and is used to perform analysis on the impacts of triggering the adaptive automation based on different workload levels.
3.3 Identify baseline system
This study utilised a military-inspired unmanned intelligence, surveillance, and reconnaissance mission as the baseline system. This system involves an operator who receives and subsequently interprets intelligence outputs from multiple remotely controlled unmanned assets. Then, the operator identifies and reports any potential threats and changes in the environment. The first major task, change detection, involves the operator receiving intelligence from multiple unmanned aerial vehicles performing aerial surveillance of a hostile territory. These unmanned aerial vehicles track known points of interest on a digital map of the territory. As updates are received, the operator is responsible for identifying and reporting the observed changes in the points of interest. In the second major task, threat detection, the operator monitors a live video feed from an unmanned ground vehicle that is patrolling a hostile urban environment. The operator is responsible for identifying enemies (called threats) that appear on the unmanned ground vehicle video feed. This system is appropriate for adaptive automation implementation because operator workload is expected to be variable and some decisions require human judgement.
3.4 Conduct human-in-the-loop experiment
These surveillance tasks were performed in a controlled laboratory setting in which each of the 150 human participants assumed the role of the operator. The participants included 85 male and 65 female university students, ranging in age from 18 to 45 with a mean age of 19.6. The tasks were presented in a virtual environment consisting of an operator control unit (shown in Figure 3) that comprised one computer monitor and a computer mouse. The following sections describe the change detection task and the threat detection task that the human operator (i.e. the study participant) completed. Prior to participating in the study, each participant completed a training session that involved a
detailed briefing of the study expectations and practice on each of the tasks individually and simultaneously as dual tasks.
Figure 3 Operator control unit of the human-in-the-loop study (see online version for colours)
3.4.1 The change detection task scenario
During the change detection task, the participant monitored the situation map for changes in a set of icons that represented points of interest and identified any changes in the icons that occurred. The situation map displayed an average of 24 different coloured and shaped icons at a time (Figure 4). At specific times, icons appeared, disappeared, and moved. If a change was perceived, the participant identified and then indicated the perceived change by using the computer mouse to select the button in the operator control unit that corresponded with the perceived change (i.e. appeared, disappeared, or moved).
Figure 4 The change detection task situation map display on the operator control unit (see online version for colours)
3.4.2 The threat detection task scenario
In the threat detection task, the human operator monitored a live video feed from an unmanned ground vehicle that was patrolling a hostile urban environment, and the operator was responsible for identifying threats that appeared on the video feed. During the threat detection task, the participant viewed the street view located in the topmost portion of the operator control unit computer monitor. The street view displayed urban streets lined with actors that represented threats and non-threats (Figure 5). There were four types of actors: friendly soldiers, friendly civilians, enemy soldiers, and armed civilians (i.e. insurgents). Both enemy soldiers and armed civilians were considered threats.
Figure 5 Street view (see online version for colours)
When a participant perceived a threat in the street view, that participant identified the threat by using the computer mouse to select the THREAT DETECT button in the top-right portion of the operator control unit computer monitor and then clicked on the perceived threat. Since the unmanned ground vehicle driving task was automated, the participant had to identify threats before the vehicle passed them and the threats disappeared from the street view. If there were multiple threats in the street view, the participant could identify them in any order.
3.4.3 The dual task scenarios
For the dual task scenarios, the study participants performed both the change detection and threat detection tasks simultaneously. During the training session, the participants were instructed that both tasks should be considered equally important. There was no change in the instructions between the single-task and dual task scenarios. Also, there was no preferred order in which the participants completed the tasks.
3.4.4 Human-in-the-loop experimental design
The change detection and threat detection tasks were used to create four experimental scenarios. In Scenario 1, a participant performed only the change detection task, with the icon change rate varied between low, medium, and high event rates. In Scenario 3, a participant performed only the threat detection task, and the actor image appearance rate was varied between low, medium, and high event rates. Scenarios 2 and 4 are dual task scenarios in which both the change detection and the threat detection tasks were performed by the human participant simultaneously. In Scenario 2, the icon change rate for the change detection task was varied between low,
medium, and high event rates, while the actor image appearance rate for the threat detection task was held constant at a medium event rate. In Scenario 4, the actor appearance rate for the threat detection task was varied between low, medium, and high event rates, while the icon change rate for the change detection task was held constant at a medium event rate. Table 1 summarises the four scenarios for the 12 experimental design variants and specific event rates used in the human-in-the-loop study. In this study, 150 participants performed all four scenarios at each of the three taskload rates in a randomised order.
Table 1 Summary of the experimental design variants (CD = change detection, TD = threat detection)
| Taskload event rate | Scenario 1 | Scenario 2 | Scenario 3 | Scenario 4 |
|---|---|---|---|---|
| Low | CD: 6 Changes/Min; TD: None | CD: 6 Changes/Min; TD: 28 Actors/Min | CD: None; TD: 14 Actors/Min | CD: 12 Changes/Min; TD: 14 Actors/Min |
| Medium | CD: 12 Changes/Min; TD: None | CD: 12 Changes/Min; TD: 28 Actors/Min | CD: None; TD: 28 Actors/Min | CD: 12 Changes/Min; TD: 28 Actors/Min |
| High | CD: 24 Changes/Min; TD: None | CD: 24 Changes/Min; TD: 28 Actors/Min | CD: None; TD: 56 Actors/Min | CD: 12 Changes/Min; TD: 56 Actors/Min |
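The structure of Table 1 (four scenarios crossed with three taskload levels) can be written out programmatically, which may be convenient when configuring simulation runs. The sketch below is illustrative only; the event rates come from Table 1, while the `Variant` class and `build_variants` function are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional

# Event rates from Table 1 (events per minute).
CD_RATES = {"Low": 6, "Medium": 12, "High": 24}   # icon changes/min
TD_RATES = {"Low": 14, "Medium": 28, "High": 56}  # actors/min


@dataclass(frozen=True)
class Variant:
    scenario: int
    level: str
    cd_rate: Optional[int]  # None when the change detection task is absent
    td_rate: Optional[int]  # None when the threat detection task is absent


def build_variants():
    """Enumerate the 12 experimental design variants of Table 1."""
    variants = []
    for level in ("Low", "Medium", "High"):
        # Scenario 1: change detection only, rate varied.
        variants.append(Variant(1, level, CD_RATES[level], None))
        # Scenario 2: dual task, CD varied, TD held at medium.
        variants.append(Variant(2, level, CD_RATES[level], TD_RATES["Medium"]))
        # Scenario 3: threat detection only, rate varied.
        variants.append(Variant(3, level, None, TD_RATES[level]))
        # Scenario 4: dual task, TD varied, CD held at medium.
        variants.append(Variant(4, level, CD_RATES["Medium"], TD_RATES[level]))
    return variants
```

Calling `build_variants()` yields one entry per cell of Table 1; note that the Scenario 2 and Scenario 4 medium variants share the same event rates.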
3.5 Develop and validate baseline model
To investigate the adaptive automation invoking threshold design decision, this research extends the human-in-the-loop study by employing discrete-event simulation. The discrete-event simulation models were built in the Improved Performance Research Integration Tool (IMPRINT), a human performance modelling software tool specifically designed to model human performance and mental workload through its implementation of the visual, auditory, cognitive, psychomotor (VACP) method (Mitchell, 2000). The simulation provides a continuous, real-time prediction of workload, enabling the exploration of when to automate without being subject to current limitations in neurological/physiological data. Thus, simulation enabled this experiment to isolate the effects of the invoking threshold levels, without complications due to current technological limitations (i.e. imprecise workload measurements). A primary assumption in this study is therefore that the technical challenges surrounding measurement-based invoking mechanisms will be resolved and that it will become feasible in the future to implement adaptive systems based on neuroergonomic measures of operator workload. First, a set of baseline models without adaptive automation was constructed, in order to validate accurate representation of the operator workload experienced while performing the change detection and threat detection tasks. The baseline models consisted of four separate models, corresponding to the four scenarios described in Table 1. The task networks model the human operator’s attention as they monitor for changes/threats and respond to those changes/threats using the process defined by the human-in-the-loop study. The dual task models combined the model networks from the
single task networks with additional variables and release conditions to enforce physical limitations from having to use one mouse to accomplish both tasks. Based on exit questionnaires, participants divided their attention unequally, with almost 80% performing the task they deemed to be the higher priority and switching to the lower priority task during ‘windows of opportunity’. The model logic captured this strategy using IMPRINT’s sequential workload management strategy, in which the operator completed the ongoing task before starting the new task. Figures 6 and 7 present the task networks for the single task change detection and threat detection models, respectively. These networks capture system and operator tasks and logic. Model input data, such as task time probability distributions, probabilistic decision logic, and error rates, were derived from participant data.
Figure 6 Change detection model task network (see online version for colours)
Figure 7 Threat detection model task network (see online version for colours)
The models use an objective analytical measure of mental workload computed using the VACP method (Mitchell, 2000). This analytical tool enables workload prediction for systems that are hypothetical or in early development. VACP has been used extensively
by the US Army (Mitchell and Chen, 2006; Mitchell and Samms, 2010) to make workload-based trade-offs and design decisions early in the development cycle. The VACP method builds upon multiple resource theory by capturing mental workload demands across seven separate resource channels:
1 visual
2 auditory
3 cognitive
4 fine psychomotor
5 gross psychomotor
6 speech
7 tactile.
Each sub-task performed is rated by level of demand on a scale from 0 to 7 for each resource channel, where zero represents no demand and seven represents the highest level of demand. Workload demand values are determined based on descriptive anchors corresponding to numerical values contained in IMPRINT and adapted from the VACP tables contained in earlier works (Bierbaum, Szabo and Aldrich, 1989; McCracken and Aldrich, 1984; Mitchell, 2000). VACP workload values were calculated for each channel and summed within each resource channel and across resource channels. Mental workload values were then augmented with an interference value that represented the potential conflict when more than one resource channel was used simultaneously. The interference values are also contained in IMPRINT and are task- and channel-specific, with the specific values and calculations based on the work of Wickens (2002). Because workload is the amount of attentional resources required for a specific person to perform a specific task in a specific context, workload will vary from operator to operator. Thus, assigning workload values requires a deliberate assessment upon the part of the modeller to account for the specific situation. For example, simple arithmetic may be challenging for a young child but quite easy for an educated adult. The VACP tables provide the flexibility to capture these differences between operators performing the same task. The child would be assigned a value of 7.0 on the cognitive channel (verbal anchor: ‘calculation’), whereas the adult would be assigned a value of 1.0 (verbal anchor: ‘simple association’). Thus, models should be constructed for a specific class of operator with an assumed skillset level. In addition to the default VACP values, IMPRINT also contains options to implement human performance adjustments based on fatigue, environment (cold, heat), and personal protective equipment. 
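The summation described above can be illustrated with a short sketch. This is not IMPRINT's implementation: the channel names mirror the seven channels listed earlier and the 0 to 7 demand scale follows the text, but the interference term is a hypothetical placeholder (a fixed penalty per pair of simultaneously loaded channels) standing in for the task- and channel-specific values held in IMPRINT. The demand values in the usage note are likewise invented for illustration.

```python
CHANNELS = (
    "visual", "auditory", "cognitive", "fine_psychomotor",
    "gross_psychomotor", "speech", "tactile",
)


def channel_totals(active_tasks):
    """Sum demand within each resource channel over concurrent tasks.

    Each task is a dict mapping a channel name to a demand in [0, 7].
    """
    totals = {channel: 0.0 for channel in CHANNELS}
    for demands in active_tasks:
        for channel, value in demands.items():
            if channel not in totals:
                raise KeyError(f"unknown channel: {channel}")
            if not 0.0 <= value <= 7.0:
                raise ValueError(f"demand out of range: {value}")
            totals[channel] += value
    return totals


def total_workload(active_tasks, interference_per_pair=0.0):
    """Sum across channels, plus a placeholder interference penalty.

    interference_per_pair is a hypothetical constant applied to every
    pair of channels in use at the same time; it is NOT the
    Wickens-based calculation used by IMPRINT.
    """
    totals = channel_totals(active_tasks)
    base = sum(totals.values())
    busy = sum(1 for value in totals.values() if value > 0)
    pairs = busy * (busy - 1) // 2
    return base + interference_per_pair * pairs
```

For instance, with two concurrent tasks `{"visual": 5.0, "cognitive": 1.0}` and `{"visual": 4.0, "fine_psychomotor": 2.0}`, `total_workload` returns 12.0 without interference and 13.5 with a penalty of 0.5 per pair, since three channels (three pairs) are in use.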
For the models in this study, the operators are assumed to be proficient in performing the task (based on training), with no fatigue, environment, or equipment stressors. The operator’s total workload value was calculated continuously as tasks are performed. For comparison and interpretation, these values were then transformed into time-weighted averages for each of the 12 variants described in Table 1. These workload values were then validated through a correlation analysis, which compared the model workload values to the subjective workload values collected from the study participants. Correlation analysis was selected as an appropriate statistical method, since all measures of workload analysed differ in their units of measure and scales. The data for the correlation analysis consisted of 12 matched pairs; for each of the 12 experimental design
variants shown in Table 1, the time-weighted mean predicted VACP workload values are matched with the mean subjective workload values from the 150 participants. The subjective workload measures used in the correlation analysis included the mean instantaneous self-assessment (ISA) score (Tattersall and Foord, 1996), mean NASA-TLX (task-load index) mental dimension score, mean NASA-TLX temporal dimension score, and mean NASA-TLX effort dimension score (Hart and Staveland, 1988). Individual NASA-TLX dimensions are used in lieu of the overall NASA-TLX score in order to maintain the focus on mental workload vs. other components of workload. The correlation analysis revealed that the subjective measures and the baseline model’s predicted workload are highly correlated (R > 0.90) with each other. Based on the high correlation with the well-established subjective measures, the workload values predicted by the model were deemed valid. For more details on this validation, see Rusnock and Geiger (2013).

In order to identify adaptive automation workload targets with which to evaluate the results of the different levels of invoking thresholds, participant performance from the human-in-the-loop study, predicted workload from the baseline model, and corresponding average ISA scores were compared for each model variant. While the ISA scores are not part of the model, they provided a frame of reference for how the VACP scores compare with the workload levels reported by the human subjects. The ISA is a single rating on a scale of 1 to 5, as described in Table 2, with lower values corresponding to lower workload and higher values corresponding to higher workload. For those researchers more accustomed to the NASA-TLX, the correlation between the ISA and NASA-TLX mental dimension for this study is 0.983. The ISA scores are presented here because they have specific, non-overlapping verbal anchors, and thus the numerical values have the same interpretation for all users.

Table 2 Instantaneous self-assessment ratings

| Rating | Workload | Description |
|---|---|---|
| 1 | Under-utilised | Nothing to do. Rather boring |
| 2 | Relaxed | More than enough time for all tasks. Active on the task less than 50% of the time |
| 3 | Comfortably busy pace | All tasks well in hand. Busy but stimulating pace. Could keep going continuously at this level |
| 4 | High | Non-essential task suffering. Could not work at this level very long |
| 5 | Excessive | Behind on tasks; losing track of the full picture |

Source: Adapted from Kirwan et al. (1997)
Table 3 summarises the performance, ISA rating values, and model predicted time-weighted average VACP workload values for each of the 12 variants given in Table 1. (Recall that this is for the baseline system, which does not include the adaptive automation.) Table 3 is sorted by the predicted time-weighted average VACP workload scores generated by the models (the fifth column). It can be seen that the highest change detection task performance corresponds to Scenario 1 (i.e. the change detection task only) at the low and medium event rates. Similarly, the highest threat detection task scores occur during Scenario 3 (i.e. the threat detection task only) at the low and medium event rates.
Table 3 Baseline human-computer system performance and workload values

| Scenario_Segment | Average % changes correctly identified (standard deviation) | Average % threats correctly identified (standard deviation) | Average ISA rating | Average predicted VACP workload |
|---|---|---|---|---|
| S3_Low | N/A | 95% (9) | 1.63 | 1.56 |
| S3_Medium | N/A | 93% (19) | 2.08 | 3.46 |
| S1_Low | 63% (16) | N/A | 2.33 | 6.74 |
| S1_Medium | 61% (14) | N/A | 2.59 | 7.24 |
| S1_High | 40% (9) | N/A | 3.26 | 7.70 |
| S3_High | N/A | 94% (9) | 2.69 | 8.15 |
| S4_Low | 46% (13) | 92% (11) | 3.08 | 10.59 |
| S2_Low | 41% (14) | 89% (22) | 3.27 | 13.47 |
| S2_High | 26% (8) | 82% (13) | 3.84 | 13.88 |
| S4_Medium | 42% (13) | 88% (12) | 3.43 | 13.95 |
| S2_Medium | 41% (14) | 89% (17) | 3.40 | 14.27 |
| S4_High | 34% (13) | 86% (9) | 3.94 | 20.55 |
The values summarised in Table 3 also reveal that increased workload corresponds with decreased performance, suggesting that this system is in Segment C of the workload-performance curve described by Cassenti et al. (2011). Therefore, minimising workload is an appropriate goal for the adaptive automation implementation in this system. A more specific goal could also be defined based on mission requirements. For example, the mission may necessitate a threat detection performance goal of at least 90%. From Table 3, it can be seen that model-predicted time-weighted average VACP workload values below 13.47 correspond with average threat detection performance scores of at least 92%, and predicted time-weighted average VACP workload values of 13.47 and above correspond with average threat detection performance of 89% and lower. Thus, in addition to (or in place of) the minimisation goal, this system design includes a goal of achieving time-weighted average VACP workload values less than 13.47. Note that this is a time-weighted average goal and does not necessitate that the maximum instantaneous VACP workload value should be 13.47. The workload goal is captured in time-weighted average VACP as this is the measure that is on the same timescale as the ISA subjective workload scores and the performance scores. If the desired instantaneous VACP workload value were known, then this would be the value of the invoking threshold. However, this value is not known; thus, this study explores setting the invoking threshold at various instantaneous VACP values to identify which will result in acceptable time-weighted average VACP workload values, while also maintaining the operator’s situation awareness. The VACP workload target of a time-weighted average less than 13.47 is specific to this particular scenario. However, the methodology used in this study can be extended to the design of any adaptive system.
In general, the methodology calls for system designers to determine whether increasing mental workload is expected to result in an increase, a decrease, or no effect on system performance. System designers should also establish whether the goal is to minimise, maximise, or achieve a target range of workload. Setting this goal also requires considering whether there is an objective performance goal (e.g.
85% accuracy), as well as whether there are any separate workload considerations, such as the need to maintain mental workload at a minimum level for skill development.
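The way a workload target follows from a performance goal can be illustrated with a short sketch. It is not part of the original study: the helper names `workload_target` and `is_clean_split` and the data layout are assumptions made here for illustration, and the rows are transcribed from Table 3 (only the variants that include the threat detection task). Given a performance goal, the sketch returns the lowest predicted workload at which the goal is missed and checks whether every variant at or above that workload also misses it.

```python
from typing import List, Optional, Tuple

# (variant, average % threats correctly identified, predicted time-weighted
# average VACP workload), transcribed from Table 3. Scenario 1 variants are
# omitted because they do not include the threat detection task.
Row = Tuple[str, float, float]
TABLE_3: List[Row] = [
    ("S3_Low", 95, 1.56),
    ("S3_Medium", 93, 3.46),
    ("S3_High", 94, 8.15),
    ("S4_Low", 92, 10.59),
    ("S2_Low", 89, 13.47),
    ("S2_High", 82, 13.88),
    ("S4_Medium", 88, 13.95),
    ("S2_Medium", 89, 14.27),
    ("S4_High", 86, 20.55),
]


def workload_target(rows: List[Row], goal_pct: float) -> Optional[float]:
    """Hypothetical helper: lowest workload at which the goal is missed.

    Returns None when every variant meets the performance goal.
    """
    missed = [vacp for _, pct, vacp in rows if pct < goal_pct]
    return min(missed) if missed else None


def is_clean_split(rows: List[Row], goal_pct: float) -> bool:
    """True if all variants at or above the target miss the goal."""
    target = workload_target(rows, goal_pct)
    if target is None:
        return True
    return all(pct < goal_pct for _, pct, vacp in rows if vacp >= target)


target = workload_target(TABLE_3, goal_pct=90)
clean = is_clean_split(TABLE_3, goal_pct=90)
```

With the 90% threat detection goal used in the text, `target` is 13.47 and the split is clean, which matches the "less than 13.47" workload goal stated above. A different goal, or a system whose performance does not fall steadily with workload, may not split cleanly, in which case a single cut-off value would not be an adequate target.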
3.6 Implement adaptive automation

Based on the task analysis, observations of participants performing the tasks, and exit surveys completed by the study participants, the change detection task was selected as an appropriate task for incorporating adaptive automation. During the change detection task, when a change event occurred, if the operator did not see the change, or was unsure of which change event had occurred, there was no means for the operator to replay the video feed to verify what had occurred. Furthermore, the operator had to identify the change type before the next change occurred even though he or she did not know when the next change would occur. Thus, the operator was faced with an unknown and limited amount of time to perform the task, creating task time pressure. In the threat detection task, most actors were on the screen for over 30 s, and the appearance of an additional threat did not prevent the operator from clicking on any previously identified threats remaining on the screen. As a result, the operator could respond to the visible threats in any order and could look at the threat as often or for as long as the threat was on the screen. This allowed the operator flexibility in managing his or her time to respond to threats, as well as in assessing whether or not an actor was a threat, alleviating task time pressure. Based on the temporal stress of the change detection task compared to the threat detection task, the preferred task to automate in this research investigation was the change detection task. Furthermore, this task was also more reasonable to automate since determining whether or not a change had occurred was a straightforward perception task, whereas threat identification was more complex, requiring consideration of several characterisation variables and thus was best left to human judgement.
Automating the change detection task in this study aligns with the practice of using automation for less critical and repetitive tasks, while reserving judgement-based pattern recognition tasks for the human (Arciszewski, De Greef and Van Delft, 2009; Fitts et al., 1951). As this was an initial study, the current implementation for the adaptive automation features 100% reliability and the operator is not given the option to disengage or override the automation. Future work will explore the impact of reliability and automation reliance on human-machine team performance.

To incorporate measurement-based adaptive automation into the validated baseline model, the adaptive system took over the process of identifying and reporting changes whenever an operator’s estimated workload was greater than or equal to a specified instantaneous VACP workload threshold value. The simulated adaptive system checked the operator’s VACP workload level at 2-s intervals. The 2-s interval was selected based upon the distribution of time for an operator to perform the identify-and-select sequence of the change detection task. After the automation was invoked, the system alerted the user that the automation had taken control of the change detection task. The system continued to check the operator’s VACP workload levels every 2 s to determine if the VACP workload level had fallen below the invoking threshold. When the workload fell below the threshold value, the automation was revoked and became inactive, notifying the user that it was returning the responsibility of the change detection task to the human operator.
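The invoke/revoke rule described above can be sketched in a few lines. This is a minimal illustration rather than the IMPRINT implementation used in the study: the class `ThresholdInvoker`, the function `run_checks`, and the sample workload trace are hypothetical, and the sketch treats the workload samples as given, whereas in the simulation the operator's workload changes once the automation takes over the task.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

CHECK_INTERVAL_S = 2.0  # the text states that workload is checked every 2 s


@dataclass
class ThresholdInvoker:
    """Hypothetical controller for the invoke/revoke rule in the text."""

    threshold: float      # invoking threshold, in instantaneous VACP units
    active: bool = False  # True while the automation holds the task

    def check(self, workload: float) -> Optional[str]:
        """Apply one check; return 'invoked', 'revoked', or None."""
        if not self.active and workload >= self.threshold:
            self.active = True   # workload at or above threshold: take over
            return "invoked"
        if self.active and workload < self.threshold:
            self.active = False  # workload below threshold: hand back
            return "revoked"
        return None


def run_checks(samples: List[float],
               threshold: float) -> List[Tuple[float, str]]:
    """Return (time in seconds, event) for each change of automation state.

    `samples` holds one instantaneous VACP value per 2-s check.
    """
    invoker = ThresholdInvoker(threshold)
    events: List[Tuple[float, str]] = []
    for step, workload in enumerate(samples):
        event = invoker.check(workload)
        if event is not None:
            events.append((step * CHECK_INTERVAL_S, event))
    return events


# Illustrative trace only; these values are not taken from the study.
events = run_checks([12.0, 24.0, 30.5, 23.9, 10.0, 26.0], threshold=24.0)
```

For this made-up trace the automation is invoked at 2 s (the sample equals the threshold), revoked at 6 s, and invoked again at 10 s. Because the same value is used to invoke and revoke, a workload that hovers around the threshold would toggle the automation at every check; the text does not describe any hysteresis, so none is modelled here.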
Adaptive automation was implemented in the dual task scenarios (i.e. Scenarios 2 and 4 in Table 1), since these scenarios had noticeable performance decrements, and high subjective workload indicated that the system would benefit from automated aiding. The goal for the adaptive system was to reduce average workload levels of the dual task scenarios to a manageable level. In addition to operator workload, operator situation awareness was considered as an additional dependent variable of interest. Since IMPRINT does not explicitly model situation awareness, two measures served as proxies for situation awareness. Both measures were based on the change detection task since this was the task that was enhanced through adaptive automation. The first situation awareness performance measure was the percentage of changes detected, i.e. the number of changes identified by the operator divided by the total number of actual changes, excluding changes identified by the adaptive system. The second situation awareness measure was the percentage of time the operator spent monitoring the situation map, i.e. the number of time units spent performing the monitoring map task divided by the total amount of time for that variant. Monitoring the map excluded any time spent in the adaptive mode as well as time spent identifying changes. Both of these measures are proxies for Level 1 (perception) situation awareness (Endsley, 1995), since they focus on the perception of environmental stimuli. A limitation of this situation awareness metric is that it is restricted to perception of the change detection task, and thus does not encompass higher levels of situation awareness, nor does it encompass integrated situation awareness or potential to increase situation awareness from increased availability of cognitive resources.
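The two situation awareness proxies defined above are simple ratios, which the following sketch makes explicit. The function names and the example numbers are hypothetical and are not taken from the study. The sketch also assumes one reading of the definitions: the numerator of the first measure counts only changes identified by the operator, the denominator is the total number of actual changes, and the monitoring time passed to the second measure already excludes time in adaptive mode and time spent identifying changes.

```python
def pct_changes_detected(operator_identified: int, total_changes: int) -> float:
    """Percentage of actual changes identified by the operator.

    Changes identified by the adaptive system are not counted in
    `operator_identified` (assumed reading of the definition in the text).
    """
    if total_changes <= 0:
        raise ValueError("total_changes must be positive")
    return 100.0 * operator_identified / total_changes


def pct_time_monitoring(monitoring_time: float, total_time: float) -> float:
    """Percentage of the variant's duration spent monitoring the map.

    `monitoring_time` is assumed to exclude time in adaptive mode and
    time spent identifying changes, as stated in the text.
    """
    if total_time <= 0:
        raise ValueError("total_time must be positive")
    return 100.0 * monitoring_time / total_time


# Hypothetical example values, for illustration only.
changes_pct = pct_changes_detected(operator_identified=12, total_changes=40)
monitoring_pct = pct_time_monitoring(monitoring_time=150.0, total_time=600.0)
```

Under this reading, more time spent in adaptive mode lowers both measures by construction, since changes handled by the automation and time spent in adaptive mode are both excluded from the numerators.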
3.7 Evaluate invoking threshold levels

The simulation experiment began by setting the invoking threshold at instantaneous VACP values of 10, 20, 30 and 40. This design served as a screening experiment because these initial thresholds covered a large proportion of the expected design space of workload values. In the dual task scenarios, the minimum instantaneous VACP value was 6.0 and the maximum value was 78.28, with less than 20% of the instantaneous VACP values occurring between 40.0 and 78.28. Threshold levels over 40.0 were thus expected to trigger the adaptive automation too infrequently to produce noticeable changes in workload, situation awareness, or performance. Threshold values below 10 were expected to result in the adaptive automation always being on, and thus no longer representing adaptive automation. Based on the results from this screening experiment, noticeable differentials in workload occurred between thresholds 20 and 30 (an average difference of 2.06 VACP points), unlike the workload differences between 10 and 20 (0.10 VACP points) or 30 and 40 (0.01 VACP points). Thus, additional invoking threshold levels in increments of 1 were explored between 20 and 30. The additional invoking threshold values of 23, 24 and 25 provided the most information regarding workload-situation awareness trade-offs and thus are included in the reported results below. The resulting experiment used a 3 × 2 × 7 design, with three taskload levels, two dual task scenarios, and seven invoking threshold levels.
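As a quick check on the size of the design, the full factorial can be enumerated directly. The sketch below is illustrative only: the variable names are assumptions, while the scenario and taskload labels follow the naming used in Table 3 and the seven thresholds are those reported in the results.

```python
from itertools import product

SCENARIOS = ("S2", "S4")                    # the two dual task scenarios
TASKLOADS = ("Low", "Medium", "High")       # three taskload levels
THRESHOLDS = (10, 20, 23, 24, 25, 30, 40)   # seven invoking thresholds

# One (variant label, invoking threshold) pair per cell of the design.
design = [
    (f"{scenario}_{taskload}", threshold)
    for scenario, taskload, threshold in product(SCENARIOS, TASKLOADS, THRESHOLDS)
]

n_cells = len(design)
```

The enumeration yields 3 × 2 × 7 = 42 cells, one for each simulated variant.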
4 Study results
4.1 Impact on workload of invoking threshold level

Each of the 42 variants was simulated for 10 independent replications, with each replication using a unique random number seed. Based on the low variability within each variant compared to the variability between the variants, it was determined that 10 replications were sufficient (see the baseline model standard deviations in Table 4). Table 4 provides the outcomes from the experiment for each of the six dual task variants. The workload values from the baseline model are included for reference. Table 4 shows that all of the invoking thresholds reduce workload when compared to the baseline model, with lower thresholds generally resulting in lower workload. This is expected since lower thresholds are more likely to invoke the automation.

Table 4 Predicted VACP workload values by threshold

| Variant | Baseline | Baseline standard deviation | Threshold 10 | Threshold 20 | Threshold 23 | Threshold 24 | Threshold 25 | Threshold 30 | Threshold 40 |
|---|---|---|---|---|---|---|---|---|---|
| S2_Low | 13.47 | 0.27 | 10.88 | 10.94 | 11.57 | 11.58 | 12.97 | 13.02 | 13.04 |
| S2_Medium | 14.27 | 0.25 | 11.59 | 11.66 | 11.70 | 11.74 | 13.72 | 13.74 | 13.69 |
| S2_High | 13.88 | 0.31 | 11.48 | 11.66 | 12.16 | 12.00 | 13.68 | 13.78 | 13.76 |
| S4_Low | 10.59 | 0.17 | 9.81 | 9.85 | 9.86 | 9.86 | 10.47 | 10.47 | 10.47 |
| S4_Medium | 13.95 | 0.33 | 11.05 | 11.29 | 11.63 | 11.59 | 13.52 | 13.58 | 13.58 |
| S4_High | 20.55 | 0.40 | 15.96 | 15.97 | 16.53 | 16.63 | 18.75 | 19.16 | 19.16 |
While it is reasonable to assume that incremental decreases in the invoking threshold would result in proportional decreases in predicted workload, this is not the case for this system. Rather, for a given variant, the change in predicted workload due to a change in invoking threshold is negligible for invoking threshold levels between 10 and 24 and for invoking threshold levels between 25 and 40. However, predicted workload drops considerably when the threshold is lowered from 25 to 24. An analysis of variance, combined with Tukey’s pairwise comparisons, confirms that for all variants, there are no significant differences between thresholds 10 and 20, 23 and 24, and 25 through 40. In most cases, there are no significant differences between thresholds 20 and 23. In all cases, there is a significant difference between thresholds 24 and 25. Thus, for this particular system, the choice of invoking threshold reduces to a trade-off between the values 24 and 25. While both values result in reduced predicted workload over the baseline, a threshold of 24 achieves considerably lower workload. Thus, if the goal is to minimise workload, a threshold of 24 would be the preferred value. Furthermore, when using an invoking threshold of 24, five of the six variants achieve workload values below 13.47 (see Table 4), whereas a threshold of 25 achieves this level in only two of the six variants.
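The comparisons in this subsection can be reproduced from the means in Table 4. The sketch below is illustrative and is not the authors' analysis: the helper names are assumptions, and because it works only with the tabulated means it cannot reproduce the analysis of variance or Tukey comparisons, which require the replication-level data.

```python
from typing import Dict, List, Tuple

THRESHOLDS: Tuple[int, ...] = (10, 20, 23, 24, 25, 30, 40)
TARGET = 13.47  # time-weighted average VACP workload goal from the text

# Predicted VACP workload per variant, in THRESHOLDS order (from Table 4).
TABLE_4: Dict[str, Tuple[float, ...]] = {
    "S2_Low":    (10.88, 10.94, 11.57, 11.58, 12.97, 13.02, 13.04),
    "S2_Medium": (11.59, 11.66, 11.70, 11.74, 13.72, 13.74, 13.69),
    "S2_High":   (11.48, 11.66, 12.16, 12.00, 13.68, 13.78, 13.76),
    "S4_Low":    (9.81, 9.85, 9.86, 9.86, 10.47, 10.47, 10.47),
    "S4_Medium": (11.05, 11.29, 11.63, 11.59, 13.52, 13.58, 13.58),
    "S4_High":   (15.96, 15.97, 16.53, 16.63, 18.75, 19.16, 19.16),
}


def mean_difference(lower: int, higher: int) -> float:
    """Mean workload at `higher` minus workload at `lower`, across variants."""
    i, j = THRESHOLDS.index(lower), THRESHOLDS.index(higher)
    diffs = [row[j] - row[i] for row in TABLE_4.values()]
    return sum(diffs) / len(diffs)


def variants_below_target(threshold: int, target: float = TARGET) -> List[str]:
    """Variants whose predicted workload is below the target at a threshold."""
    k = THRESHOLDS.index(threshold)
    return [name for name, row in TABLE_4.items() if row[k] < target]


screening = {pair: round(mean_difference(*pair), 2)
             for pair in ((10, 20), (20, 30), (24, 25))}
meets_24 = variants_below_target(24)
meets_25 = variants_below_target(25)
```

The 10-to-20 and 20-to-30 averages come out at 0.10 and 2.06 VACP points, matching the screening differences reported in Section 3.7, and the gap between thresholds 24 and 25 averages about 1.62 points. Five variants fall below the 13.47 target at threshold 24 (all except S4_High) and two at threshold 25 (S2_Low and S4_Low), in agreement with the counts given above.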
4.2 Workload-situation awareness trade-off

Adaptive automation design decisions can also benefit from incorporating situation awareness into the evaluation. For this case study, simulation was also used to predict situation awareness for the invoking thresholds under consideration. This revealed the predicted situation awareness impacts that occur at different invoking thresholds, thus allowing for trade-offs between performance, workload, and situation awareness goals. Figures 8 and 9 show the relationship between predicted workload and predicted situation awareness for Scenarios 2 and 4, respectively. Each point represents an invoking threshold; thus, each chart contains seven points corresponding to the invoking threshold values 10, 20, 23, 24, 25, 30 and 40. The locations of these points correspond to average workload and average situation awareness values for the respective invoking threshold. Workload is displayed along the x-axis, and situation awareness is displayed along the y-axis. The first column of graphs is for the situation awareness measure “average percentage of changes identified” and the second column is the situation awareness measure “average percentage of time spent monitoring the map”. The first row of charts displays the low taskload variant, the second row displays the medium taskload, and the third row displays the high taskload. The threshold values 10–24 are grouped together, as are threshold values 25–40. Large gaps occur between threshold values 24 and 25. An analysis of variance, combined with Tukey’s pairwise comparisons, confirms that for 10 of the 12 instances, there is a statistically significant difference between thresholds 24 and 25. The exceptions are the average percentage of time monitoring the map for the high taskload condition of Scenario 2 and the percentage of changes identified for the low taskload condition of Scenario 4. In these two cases, none of the thresholds produced statistically different results.

Figure 8 Scenario 2 - variable change detection, predicted workload vs. predicted situation awareness (see online version for colours)

Figure 9 Scenario 4 - variable threat detection, predicted workload vs. predicted situation awareness (see online version for colours)
Figures 8 and 9 show that the general relationship between predicted workload and predicted situation awareness is relatively positive and linear, suggesting there is a direct trade-off between workload and situation awareness for this system. This positive linear relationship is likely because lower workload due to increased automation results in the operator disengaging from the tasks that are being automated and focussing on tasks not being automated, reducing situation awareness of the automated (change detection) task, as is consistent with prior research (Kaber and Endsley, 2004; Kaber, Onal and Endsley, 2000). Thus, for this particular system, an invoking threshold of 24 resulted in lower workload, but also resulted in a lower situation awareness. However, regarding the monitoring of the map situation awareness measure, decreases in workload resulted in relatively small decreases in situation awareness, perhaps suggesting that decreasing workload could be achieved without a significant impact to situation awareness. Note that for other systems, decreasing workload could result in more time and mental resources available to spend in big-picture, monitoring activities, and could thus result in increased situation awareness. As with the workload-performance analysis, the main difference occurs between threshold values 24 and 25; therefore, the trade-space is reduced to selecting one of these two values as the invoking threshold. Since there appears to be a direct trade-off between workload and situation awareness, for this system, the selection of an invoking threshold should consider the relative priorities of workload and situation awareness. If the higher priority is to minimise workload, a threshold of 24 is appropriate. However, if the higher
priority is to maximise situation awareness, a threshold value of 25 should be chosen. In both cases, workload has been reduced compared to the baseline system. By adding the situation awareness dimension to the analysis, a system designer can make a more informed decision about the potential need to balance mental workload and situation awareness.

Establishing an invoking threshold for an adaptive system requires a careful examination of the relative impacts each potential invoking threshold value has on performance, workload, and situation awareness. The relationships among these measures may be linear, thus requiring a direct trade-off between them. However, it is also possible for the relationships to be non-linear, in which case it may be possible to achieve a combination where performance and situation awareness are high and workload is low. If direct trade-offs are required, then the system designer must establish the relative priorities of the three measures as well as identify any objective performance, workload, or situation awareness goals that are necessitated by the task environment. While the adaptive automation workload and situation awareness evaluation performed in this study uses computer simulation, the same analysis of when to automate and the trade-offs involved in making this decision apply to physically instantiated versions of a system as well.
5 Conclusion and future work
Previous research has demonstrated the impact that types and LOA can have on performance, workload, and situation awareness (Endsley and Kaber, 1999; Furukawa, Inagaki, and Niwa, 2000; Kaber and Endsley, 2004; Kaber, Onal and Endsley, 2000; Meyer, Feinshreiber and Parmet, 2003; Onnasch et al., 2014; Taylor et al., 2013). Rather than examining how much to automate through different LOA implementation along the continuum described by Sheridan and Verplank or what to automate as described by Endsley and Kaber, this study focusses on when to automate by examining the impacts from selecting different values for an adaptive automation invoking threshold. This study demonstrates that, for a given automation implementation, selecting an appropriate threshold can impact the success of the adaptive system in meeting performance, workload, and/or situation awareness goals. By introducing deliberate performance or workload goals, system designers can evaluate the degree to which the invoking threshold will enable the system to meet the desired outcomes. When an invoking threshold is selected for an adaptive automation system, a trade-off between performance, workload, and situation awareness inherently occurs. By performing deliberate evaluations of the invoking thresholds under consideration, one can identify the relative impact of different thresholds on the success of the system, allowing for an informed trade-off between competing priorities. Given the nascent state of current neuroergonomic systems, using simulation to model adaptive automation trade-offs provides an effective means to explore expected impacts of invoking threshold levels on workload prior to adaptive system design. As with any simulation, a number of assumptions must be incorporated into the models. 
The primary assumptions for the models include: workload is independent of the order in which the scenarios (Scenario 1: change detection, Scenario 3: threat detection, Scenarios 2 and 4: dual task) are performed; workload is independent of the order in which the segments (low, medium, and high) are performed; there are no fatigue or learning
impacts on workload scores during the performance of the task; and models do not account for individual differences, personal factors (e.g. emotional stress), or environmental factors. The ability to model individual differences, fatigue, and learning would be a valuable enhancement to workload modelling. Another limitation of workload modelling in IMPRINT is the difficulty in quantifying and measuring operator situation awareness. IMPRINT does not provide any features to directly model situation awareness. Instead this outcome must be designed into the model. Since the entity(s) in IMPRINT represent the operator’s attention, only perception tasks (Level 1) can potentially be quantified and measured using the IMPRINT tool. It would be challenging to model the other two levels (comprehension and projection) described in Endsley’s model of situation awareness. However, the ability to model all three levels of situation awareness and their interaction would greatly benefit workload modelling. Future directions of exploration that are the next natural steps of this research include the implementation of adaptive automation into the human-in-the-loop system to corroborate the finding of these simulated experiments with regard to the invoking threshold trade-offs and correlation of workload from adaptive models to subjective and physiological measures. Furthermore, investigations into translating VACP workload values into physiological measurements would enable adaptive system designers to directly implement findings from IMPRINT modelling into their adaptive systems. Finally, additional insights regarding performance, workload, and situation awareness trade-offs from when to automate design decisions can be expected from performing this same analysis with other adaptive systems.
References Arciszewski, H.F.R., De Greef, T.E. and Van Delft, J.H. (2009) `Adaptive automation in a naval combat management system’, IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans, Vol. 39, No. 6, pp.1188–1199, http://doi.org/10.1109/ TSMCA.2009.2026428. Bailey, N.R. and Scerbo, M.W. (2007) ‘Automation-induced complacency for monitoring highly reliable systems: the role of task complexity, system experience, and operator trust’, Theoretical Issues in Ergonomics Science, Vol. 8, No. 4, pp.321–348, http://doi.org/ 10.1080/14639220500535301. Bailey, N.R., Scerbo, M.W., Freeman, F.G., Mikulka, P.J. and Scott, L.A. (2006) ‘Comparison of a brain-based adaptive system and a manual adaptable system for invoking automation’, Human Factors, Vol. 48, No. 4, pp.693–709, http://doi.org/10.1518/001872006779166280. Bierbaum, C.R., Szabo, S.M. and Aldrich, T.B. (1989) Task Analysis of the UH-60 Mission and Decision Rules for Developing a UH-60 Workload Prediction Model: Volume 1, Summary Report, US Army Research Institute. Cassenti, D.N., Kelley, T.D., Colle, H.A. and McGregor, E.A. (2011) ‘Modeling performance measures and self-ratings of workload in a visual scanning task’, Proceedings of the Human Factors and Ergonomics Society, Las Vegas, Nevada, pp.870–874. Clamann, M. P., Wright, M. and Kaber, D.B. (2002) ‘Comparison of performance effects of adaptive automation applied to various stages of human-machine system information processing’, Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Vol. 46, pp.342–346, http://doi.org/10.1177/154193120304700361.
Designing adaptive systems
313
Cosenzo, K., Chen, J., Reinerman-Jones, L., Barnes, M. and Nicholson, D. (2010) ‘Adaptive automation effects on operator performance during a reconnaissance mission with an unmanned ground vehicle’, Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Vol. 54, No. 25, pp.2135–2139, http://doi.org/10.1177/ 154193121005402503. Davidsson, S. and Alm, H. (2014) ‘Context adaptable driver information - or, what do whom need and want when?’, Applied Ergonomics, Vol. 45, No. 4, pp.994–1002, http://doi.org/10.1016/ j.apergo.2013.12.004. de Visser, E.J., LeGoullon, M., Freedy, A., Freedy, E., Weltman, G. and Parasuraman, R. (2008) ‘Designing an adaptive automation system for human supervision of unmanned vehicles: a bridge from theory to practice’, Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Vol. 52, No. 4, pp.221–225, http://doi.org/10.1177/154193120805200405. de Visser, E.J. and Parasuraman, R. (2011) ‘Adaptive aiding of human-robot teaming: effects of imperfect automation on performance, trust, and workload’, Journal of Cognitive Engineering and Decision Making, Vol. 5, No. 2, pp.209–231. Dorneich, M.C., Ververs, P.M., Whitlow, S.D., Mathan, S., Carciofini, J. and Reusser, T. (2006) ‘Neuro-physiologically-driven adaptive automation to improve decision making under stress’, Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Vol. 50, No. 3, pp.410–414, http://doi.org/10.1177/154193120605000342. Endsley, M.R. (1993) ‘Situation awareness and workload: flip sides of the same coin’, Proceedings of the 7th International Symposium on Aviation Psychology, Columbus, OH, pp.906–911. Endsley, M.R. (1995) ‘Toward a theory of situation awareness in dynamic systems’, Human Factors: The Journal of the Human Factors and Ergonomics Society, Vol. 37, No. 1, pp.32–64, http://doi.org/10.1518/001872095779049543. Endsley, M.R. and Kaber, D.B. 
(1999) ‘Level of automation effects on performance, situation awareness and workload in a dynamic control task’, Ergonomics, Vol. 42, No. 3, pp.462–492. Fitts, P., Chapannis, A., Grether, W., Henneman, R., Kappauf, W., Newman, E. and Williams, Jr., A. (1951) Human Engineering for an Effective Air-Navigation and Traffic-Control System, National Research Council, Division of Anthropology and Psychology, Committee on Aviation Psychology, Washington, DC. Furukawa, H., Inagaki, T. and Niwa, Y. (2000) ‘Operator’s situation awareness under different levels of automation; evaluations through probabilistic human cognitive simulations’, Smc 2000 Conference Proceedings. 2000 IEEE International Conference on Systems, Man and Cybernetics. “Cybernetics Evolving to Systems, Humans, Organizations, and Their Complex Interactions” (Cat. no. 0, 2), pp.1319–1324, http://doi.org/10.1109/ICSMC.2000.886036. Goodman, T., Miller, M.E., Rusnock, C.F. and Bindewald, J. (2016) ‘Timing within human-agent interaction and its effects on team performance and human behavior’, 2016 IEEE International Multi-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision Support (CogSIMA), IEEE, San Diego, CA. Haarmann, A., Boucsein, W. and Schaefer, F. (2009) ‘Combining electrodermal responses and cardiovascular measures for probing adaptive automation during simulated flight’, Applied Ergonomics, Vol. 40, No. 6, pp.1026–1040, http://doi.org/10.1016/j.apergo.2009.04.011. Hancock, P.A. (2007) ‘On the process of automation transition in multitask human-machine systems’, IEEE Transactions on Systems Man and Cybernetics Part A Systems and Humans, Vol. 37, No. 4, pp.586–598, Retrieved from http://ieeexplore.ieee.org/lpdocs/epic03/ wrapper.htm?arnumber=4244553. Hancock, P.A. and Chignell, M.H. (1988) `Mental workload dynamics in adaptive interface design’, IEEE Transactions on Systems, Man and Cybernetics, Vol. 18(4), pp.647–658, http://doi.org/10.1109/21.17382. Hart, S.G. 
and Staveland, L.E. (1988) ‘Development of NASA-TLX (task load index): results of empirical and theoretical research’, Human Mental Workload, pp.139–183, http://doi.org/10.1016/S0166-4115(08)62386-9.
314
C.F. Rusnock and C.D. Geiger
Kaber, D.B. and Endsley, M.R. (2004) ‘The effects of level of automation and adaptive automation on human performance, situation awareness and workload in a dynamic control task’, Theoretical Issues in Ergonomics Science, Vol. 5, No. 2, pp.113–153, http://doi.org/10.1080/ 1463922021000054335. Kaber, D.B., Onal, E. and Endsley, M.R. (2000) ‘Design of automation for telerobots and the effect on performance, operator situation awareness, and subjective workload’, Human Factors and Ergonomics in Manufacturing, Vol. 10, No. 4, pp.409–430. Kaber, D.B., Riley, J.M., Tan, K-W. and Endsley, M.R. (2001) ‘On the design of adaptive automation for complex systems’, International Journal of Cognitive Ergonomics, Vol. 5, No. 1, pp.37–57, http://doi.org/10.1207/S15327566IJCE0501_3. Kaber, D.B., Wright, M.C., Prinzel, L.J. and Clamann, M.P. (2005) ‘Adaptive automation of human-machine system information-processing functions’, Human Factors, Vol. 47, No. 4, pp.730–741, http://doi.org/10.1518/001872005775570989. Lee, J.D. and See, K.A. (2004) ‘Trust in automation: designing for appropriate reliance’, Human Factors: The Journal of the Human Factors and Ergonomics Society, Vol. 46, No. 1, pp.50–80. McCracken, J.H. and Aldrich, T.B. (1984) Analyses of Selected LHX Mission Functions: Implications for Operator Workload and System Automation Goals, Technical Note AD-A232-330. Meyer, J., Feinshreiber, L. and Parmet, Y. (2003) ‘Levels of automation in a simulated failure detection task’, SMC’03 Conference Proceedings. 2003 IEEE International Conference on Systems, Man and Cybernetics. Conference Theme - System Security and Assurance (Cat. No.03CH37483), 3, http://doi.org/10.1109/ICSMC.2003.1244194. Mitchell, D.K. (2000) Mental Workload and ARL Workload Modeling Tools, Army Research Laboratory, Report No ARL-TN-161. Mitchell, D.K. and Chen, J.Y.C. 
(2006) ‘Impacting system design with human performance modeling and experiment: another success story’, Proceedings of the Human Factors and Ergonomics Society 50th Annual Meeting, San Francisco, CA, pp.2477–2481. Mitchell, D.K. and Samms, C. (2010) ‘An analytical approach for predicting soldier workload and performance using human performance modeling’, Human-Robot Interactions in Future Military Operations, Ashgate Publishing, Ltd., United Kingdom, pp.125–141. Moray, N., Inagaki, T. and Itoh, M. (2000) ‘Adaptive automation, trust, and self-confidence in fault management of time-critical tasks’, Journal of Experimental Psychology: Applied, Vol. 6, No. 1, pp.44–58, http://doi.org/10.1037/1076-898X.6.1.44. Onnasch, L., Wickens, C.D., Li, H. and Manzey, D. (2014) ‘Human performance consequences of stages and levels of automation: an integrated meta-analysis’, Human Factors: The Journal of the Human Factors and Ergonomics Society, Vol. 56, No. 3, pp.476–488. http://doi.org/ 10.1177/0018720813501549. Parasuraman, R., Barnes, M. and Cosenzo, K. (2007) ‘Adaptive automation for human-robot teaming in future command and control systems’, The International C2 Journal, Vol. 1, No. 2, pp.43–68. Parasuraman, R., Cosenzo, K. and de Visser, E.J. (2009) ‘Adaptive automation for human supervision of multiple uninhabited vehicles: effects on change detection, situation awareness, and mental workload’, Military Psychology, Vol. 21, pp.270–297, http://doi.org/10.1080/ 08995600902768800. Parasuraman, R., Sheridan, T.B. and Wickens, C.D. (2000) ‘A model for types and levels of human interaction with automation’, IEEE Transactions on Systems, Man, and Cybernetics Part A, Systems and Humans: A Publication of the IEEE Systems, Man, and Cybernetics Society, Vol. 30, No. 3, pp.286–297, http://doi.org/10.1109/3468.844354. Rouse, W.B. (1988) ‘Adaptive aiding for human/computer control’, Human Factors, Vol. 30, pp.431–438.
Rusnock, C.F. and Geiger, C.D. (2013) ‘Using discrete-event simulation for cognitive workload modeling and system evaluation’, Proceedings of the 2013 Industrial and Systems Engineering Research Conference, pp.2485–2494, Retrieved from http://search.proquest.com/openview/b77033807ade34134e81d078a4513631/1?pq-origsite=gscholar.
Sheridan, T.B. and Parasuraman, R. (2006) ‘Human-automation interaction’, Reviews of Human Factors, Vol. 1, No. 1, pp.89–129.
Sheridan, T.B. and Verplank, W.L. (1978) ‘Human and computer control of undersea teleoperators’, Man-Machine Systems Lab, Department of Mechanical Engineering, MIT, Grant N0001477C0256.
Steinhauser, N.B., Pavlas, D. and Hancock, P.A. (2009) ‘Design principles for adaptive automation and aiding’, Ergonomics in Design, (Spring), pp.6–10.
Szalma, J.L. and Taylor, G.S. (2011) ‘Individual differences in response to automation: the five factor model of personality’, Journal of Experimental Psychology: Applied, Vol. 17, No. 2, pp.71–96, http://doi.org/10.1037/a0024170.
Tattersall, A.J. and Foord, P.S. (1996) ‘An experimental evaluation of instantaneous self-assessment as a measure of workload’, Ergonomics, Vol. 39, No. 5, pp.740–748.
Taylor, G.S., Reinerman-Jones, L.E., Szalma, J.L., Mouloua, M. and Hancock, P.A. (2013) ‘What to automate: addressing the multidimensionality of cognitive resources through system design’, Journal of Cognitive Engineering and Decision Making, Vol. 7, No. 4, pp.311–329, http://doi.org/10.1177/1555343413495396.
Teigen, K.H. (1994) ‘Yerkes-Dodson: a law for all seasons’, Theory & Psychology, Vol. 4, No. 4, pp.525–547.
Wickens, C.D. (2002) ‘Multiple resources and performance prediction’, Theoretical Issues in Ergonomics Science, Vol. 3, No. 2, pp.159–177, http://doi.org/10.1080/14639220210123806.
Wilson, G.F. and Russell, C.A. (2004) ‘Psychophysiologically determined adaptive aiding in a simulated UCAV task’, Human Performance, Situation Awareness, and Automation: Current Research and Trends, Embry-Riddle Aeronautical University, Daytona Beach, FL, pp.200–204.
Wilson, G.F. and Russell, C.A. (2007) ‘Performance enhancement in an uninhabited air vehicle task using psychophysiologically determined adaptive aiding’, Human Factors, Vol. 49, No. 6, pp.1005–1018, http://doi.org/10.1518/001872007X249875.