Effects of Levels of Automation on Air Traffic Controller Situation Awareness and Performance

by

Arathi Sethumadhavan, M.A.

A Dissertation in EXPERIMENTAL PSYCHOLOGY

Submitted to the Graduate Faculty of Texas Tech University in Partial Fulfillment of the Requirements for the Degree of DOCTOR OF PHILOSOPHY

Approved

Patricia R. DeLucia, Ph.D.

Francis T. Durso, Ph.D.

Roman Taraban, Ph.D.

James L. Smith, Ph.D.

Fred Hartmeister
Dean of the Graduate School

May 2009

Copyright 2009, Arathi Sethumadhavan


ACKNOWLEDGMENTS

First, I would like to thank my advisor, Dr. Francis T. Durso, for having faith in me. I sincerely thank him for his time and encouragement. I would also like to thank my committee members, Dr. Patricia R. DeLucia, Dr. Roman Taraban, and Dr. James L. Smith, for their comments on my dissertation. I also extend my thanks to Mr. Vanchinathan Chandrasekaran for his help in programming the air traffic control simulation. Last, but not least, I would like to thank my family. I owe everything to my parents, Girija and Sethumadhavan. I would also like to express my gratitude to my husband, Anand Tharanathan, for being my pillar of strength, and to my in-laws for their best wishes and prayers.


TABLE OF CONTENTS

ACKNOWLEDGMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES

CHAPTER
1. INTRODUCTION
   Situation Awareness: The Factor Responsible for the OOP Problem
   Levels of Automation
   Purpose of the Present Study
   Overview of the Present Study
2. METHOD
   Participants
   Simulation Software
   The ATC Task
   Secondary Task
   Experimental Design
   Procedure
3. RESULTS AND DISCUSSION
   ATC Performance
   SA
   Meta-SA
   Secondary Task Performance
   Effect of LOA on Subjective Ratings of Workload
   Follow Up
4. CONCLUSIONS
   Theoretical Implications
   Practical Implications
   Limitations and Future Directions
5. REFERENCES

APPENDIX
A. EXTENDED LITERATURE REVIEW
B. EXTENDED RESULTS AND DISCUSSION


C. INFORMED CONSENT FORMS
D. BIOGRAPHICAL QUESTIONNAIRE
E. VIDEO GAME EXPERIENCE QUESTIONNAIRE
F. PARTICIPANT INSTRUCTIONS


ABSTRACT

To meet the increasing demands of air traffic, the Joint Planning and Development Office has proposed initiatives to modernize the U.S. National Airspace System (NAS). That proposal involves the introduction of automated systems that would help air traffic controllers handle the increasing volume of traffic in their airspace (JPDO, 2007). However, a potential consequence of fully automated systems is the out-of-the-loop performance (OOP) problem (e.g., Billings, 1991; Endsley & Kiris, 1995). This refers to the reduced ability of operators working with fully automated systems to perform tasks manually following automation failure, compared to operators who perform the tasks manually. The central factor considered responsible for the OOP problem is a loss of operator situation awareness (SA). One approach to reducing the OOP problem is to use varying levels of automation (LOA; e.g., Parasuraman, Sheridan, & Wickens, 2000). An experiment was conducted to examine the differences in the SA and collision detection performance of individuals when they worked with different levels of a collision detection system to control air traffic. Seventy-two participants controlled air traffic using one of four LOAs: information acquisition, information analysis, decision and action selection, or action implementation automation (Parasuraman et al., 2000). SA was assessed by examining the extent to which individuals monitored various aircraft attributes. The time taken to detect upcoming aircraft collisions was also recorded. When the automation was unreliable, the time taken to detect an aircraft collision in the higher LOAs was significantly longer compared with the information acquisition automation. This poor performance following automation failure for higher LOAs was mediated by SA for altitude and destination, with lower SA yielding poorer performance. Thus, the costs associated with automation failure are greater when automation is applied to higher order stages of information processing. SA is an important cognitive issue that needs to be considered in air traffic control automation design. Results from this experiment have practical implications for automation, SA training programs, and air traffic control display design for the future NAS.


LIST OF TABLES

Table 1. The independent and dependent variables in this study
Table 2. Experiment details
Table 3. Summary of results
Table 4. Relationship between SA and meta-SA
Table 5. Results from the binomial probability calculations
Table 6. Effects of automation failure and LOA on advance notification time, secondary task performance, and total SA


LIST OF FIGURES

Figure 1. A snapshot of the ATST airspace
Figure 2. A snapshot of the ATST airspace in the information acquisition condition
Figure 3. A snapshot of the ATST airspace in the information analysis condition
Figure 4. A snapshot of the ATST airspace in the decision and action selection condition
Figure 5. A snapshot of the ATST airspace in the action implementation condition
Figure 6. Effect of LOA on advance notification time following automation failure
Figure 7. Mediation model of the association between LOA and advance notification time following automation failure as mediated by Total SA
Figure 8. Mediation model of the association between LOA and advance notification time following automation failure as mediated by SA for altitude and destination
Figure 9. Effect of LOA on hit-to-signal ratio


CHAPTER I

INTRODUCTION

To meet the increasing demands of air traffic in the next two decades, the Joint Planning and Development Office has proposed initiatives to modernize the U.S. National Airspace System (JPDO, 2007). That proposal involves the introduction of automated systems that would help air traffic controllers cope with increasing traffic demands. Though fully automated systems offer several benefits, a potential consequence of such systems is the out-of-the-loop performance (OOP) problem (e.g., Billings, 1991; Endsley & Kiris, 1995; Kessel & Wickens, 1982; Moray, 1986; Sarter & Woods, 1995; Wiener & Curry, 1980). This refers to the reduced ability of operators working with fully automated systems to perform tasks manually following automation failure, compared to operators who routinely perform the tasks manually. Therefore, it is imperative that automated systems for the future airspace system be designed in ways that keep air traffic controllers in control and in the loop.

Situation Awareness: The Factor Responsible for the OOP Problem

In order to design systems that keep controllers involved in the decision-making loop, it is important to understand the factor that is responsible for the OOP problem. Some researchers consider loss of operator situation awareness (SA) to be the main factor responsible for the OOP problem (e.g., Endsley & Kiris, 1995; Kaber, Onal, & Endsley, 2000). The factors that contribute to lower SA when operators interact with fully automated systems include overreliance on, and passive monitoring of, automation (e.g., Endsley & Kiris, 1995; Endsley, Bolte, & Jones, 2003).


Overreliance on Automation

Overreliance results in failures in monitoring the automation (Parasuraman & Riley, 1997). That is, overreliance occurs when operators place too much trust in the system, allocate their cognitive resources to other tasks, and consequently fail to monitor the automated system for failures (e.g., Parasuraman, Molloy, & Singh, 1993). This results in a lack of SA of the state of the automated system (e.g., Endsley, 1996). High automation reliability promotes overreliance on automation: when the reliability of the automation is high (though not perfect), operators continue to rely on it to such an extent that even occasional failures do not reduce their trust in the automation (e.g., Parasuraman & Riley, 1997). The workload experienced by operators can also result in overreliance (e.g., Parasuraman et al., 1993). For example, using an information warfare scenario, Biros, Daly, and Gunsch (2004) showed that military decision makers tended to rely on an unreliable automation when their workload was high, despite low perceived trust in the automation. The next section will discuss the loss of SA that occurs due to passive monitoring of automated systems.

Passive Monitoring of Automation

There is a plethora of research demonstrating the advantages of active processing over passive involvement in simple perceptual tasks (e.g., Held & Hein, 1963; Larish & Andersen, 1995). For example, Gibson (1962) found that individuals who actively explored an irregular object by moving their hand over it (i.e., active observers) were more accurate in identifying the shape of the object than those whose hand remained stationary while the object moved over it (i.e., passive observers). The advantages of active processing have been demonstrated in cognitive tasks as well (e.g., Craik & Tulving, 1975; Foos, Mora, & Tkacz, 1994).


Craik & Tulving, 1975; Foos, Mora, & Tkacz, 1994). For example, Koriat, PearlmanAvnion, and Ben-Zur (1998) showed that individuals who enacted phrases (e.g., pour coffee) exhibited better recall for these phrases during testing compared to individuals who merely memorized these phrases. The advantages of active processing in promoting operator SA and performance have been demonstrated in safety-critical domains as well (e.g., Endsley & Kiris, 1995; Wickens & Kessel, 1979). For example, using a tracking task, Young (1969) showed that passive monitors were slower in detecting system malfunctions compared to active participants. The superiority of manual performance was also demonstrated by Gugerty (1997) using a simulated driving task. He found that participants’ had better recall of nearby (potentially hazardous) car locations when drivers actively performed the driving task compared to watching driving scenes in a passenger (passive) mode. The benefits of active operator involvement have also been illustrated in the domain of air traffic control (ATC). For example, Willems and Truitt (1999) found that controllers’ response times to SA queries were longer when they were passively monitoring traffic. More recently, Metzger and Parasuraman (2001) showed that controllers took almost 2 minutes longer to detect conflicts under passive monitoring conditions compared to active control, under high traffic, free-flight conditions. Controllers’ recall of the altitude of aircraft was also lower in the passive monitoring condition. Thus, high traffic and passive monitoring lowered SA and left controllers outof-the-loop. The next section will examine the approach that can be adopted to increase the involvement of human operators when working with automated systems.


Levels of Automation

One approach to improve SA and reduce the OOP problem is to use varying levels of automation (LOA). The central concept of LOA is that automation is not an all-or-none phenomenon, but instead can be implemented at various levels (e.g., Billings, 1991; Parasuraman, 2000; Parasuraman, Sheridan, & Wickens, 2000; Sheridan & Verplank, 1978; Wiener & Curry, 1980). Recently, Parasuraman et al. (2000) developed the Parasuraman-Sheridan-Wickens (PSW) model of automation. Based on the PSW model, automation can be applied to different stages of information processing: information acquisition, information analysis, decision and action selection, and action implementation. Further, different degrees of automation support can be applied to each of these stages. Automation of information acquisition supports human sensory processes and refers to technology that cues, highlights (e.g., Dzindolet, Pierce, Beck, & Dawe, 2002; Fisher & Tan, 1989; Yeh, Wickens, & Seagull, 1999), or filters information. For example, an automated aid that directs an operator's attention to important targets (such as camouflaged enemy objects in a terrain) is an example of information acquisition automation (e.g., Yeh & Wickens, 2001). Automation of information analysis provides support for integrating multiple pieces of information and making inferences and predictions (Parasuraman et al., 2000). For example, the Traffic Alert and Collision Avoidance System (TCAS), which integrates several pieces of information such as altitude, speed, and heading to warn the pilot of an upcoming collision, is an example of information analysis automation (e.g., Lorenz & Parasuraman, 2007).


Automation of decision and action selection involves providing support to the operator in selecting the appropriate course of action. This automation can be implemented at various levels. For example, an automated aid that recommends the best enemy versus friendly engagement option to a shooter in a military command and control environment is an example of high decision and action selection automation, whereas an aid that recommends the top three enemy-friendly engagement options can be considered medium decision and action selection automation (e.g., Rovira, McGarry, & Parasuraman, 2007). Automation of action implementation assists operators in executing actions. This stage typically involves only two levels: manual or automated (Parasuraman et al., 2000). Therefore, when this stage is automated, actions are executed with no assistance from the human operator. An autopilot is an example of action implementation automation. The next section will examine the empirical studies that demonstrate the performance and SA benefits of using low LOAs. Specifically, the role of low LOAs in keeping operators involved in the decision-making loop and promoting faster recovery times following automation failure is examined. The low LOAs described in the following sections can be considered equivalent to the information acquisition automation in the PSW model, while the high LOAs can be considered equivalent to the decision automation and action implementation automation in the PSW model.


Advantages of Low LOAs

Performance Benefits

The utility of low LOAs in promoting faster recovery times following automation failure was demonstrated by Crocoll and Coury (1990). They examined the performance consequences of unreliable status automation (low LOA) and unreliable decision automation (high LOA) using a military decision-making task. When the aids failed, participants who received only status information performed better than those receiving recommendations. Similar results were obtained by Sarter and Schroeder (2001). They examined the effects of status (low LOA) and command displays (high LOA) on pilot performance during in-flight icing encounters. When the displays were unreliable, the performance costs associated with unreliable automation were lower for status displays than for command displays. More recently, using a military command and control task, Rovira et al. (2007) showed that when the reliability of the aid was high, people relied on it to a great extent even though it was not perfect, and the consequences associated with this overreliance were greater when the automation was applied to higher order cognitive functions (e.g., decision making) than to lower order cognitive functions (e.g., information acquisition).

SA Benefits

Low LOAs have also been shown to promote operator SA. For example, using a simulated ATC environment, Kaber, Perry, Segall, McClernon, and Prinzel (2006) showed that individuals were more accurate in answering perception-based queries about aircraft states when adaptive automation was applied to information acquisition compared to information analysis, decision making, and action implementation.


However, Kaber et al. (2006) did not examine the performance of individuals following automation failure. Similarly, Kaber et al. (2000) demonstrated the benefits of low LOAs in a simulated nuclear materials handling task using a telerobot. They found that when the automation failed, participants were faster to respond to the failures under low LOAs than under high LOAs (e.g., full automation), presumably due to higher SA under low LOAs. The research reviewed so far shows that higher LOAs can lower operator SA and presumably hamper performance when automation fails. However, there is also research that demonstrates the advantages of higher LOAs. This view is examined in the next section.

Benefits of Higher LOAs in Improving SA and Performance

Contradicting the research that advocates using low LOAs to promote SA and improve operator performance, there are studies that advocate using higher LOAs to improve operator performance. For example, using a simulated spaceflight micro-world, Lorenz, Di Nocera, Röttger, and Parasuraman (2002) showed that participants working with high LOAs were faster in detecting system faults than those assigned to lower LOAs. This was attributed to their ability to engage in better information sampling, which helped them gain better awareness of the system status. That is, participants working with lower LOAs required more cognitive resources than those working with higher LOAs, and this resulted in the allocation of attention away from critical sources that provided failure-related information. High LOAs can also promote operator SA. For example, through an analysis of over 300 civilian accident reports, Jentsch, Barnett, Bowers, and Salas (1999) concluded that a loss of SA is more likely to occur when a pilot is flying than when the pilot is not flying.


Lower SA during active control was attributed to the demands of engaging in active flight control. Similarly, using a dynamic control task, Endsley and Kaber (1999) showed that SA was higher under high LOAs than under manual control, presumably because higher LOAs freed up cognitive resources. In summary, there are two lines of research that describe how SA is affected when operators work with automated systems. One line of research argues that fully automated systems are undesirable if humans are to be kept in the loop and advocates using low LOAs. The other line of research argues that higher LOAs are more advantageous because they free up operators' cognitive resources, helping operators maintain higher SA.

Purpose of the Present Study

With the nation's air traffic expected to increase two-fold over the next two decades, automated systems are proposed to be an integral part of the future airspace system to help air traffic controllers handle the increasing volumes of air traffic and identify conflicts (JPDO, 2007). Fully automated systems that leave controllers out of the decision-making loop can prove hazardous when the systems fail. Therefore, it is important to determine the right degree of automation: one that will promote SA and facilitate better performance following automation failure. The purposes of this study were several. First, this study aimed to examine the differences in the SA of individuals working with varying levels of a conflict detection aid to complete an ATC task. It is important to understand how SA differs under different LOAs in the ATC context in order to design systems that promote controller SA.


Second, this study aimed to understand the process of SA by examining the extent to which individuals monitored aircraft call sign, location, altitude, heading, and destination. Treating all attributes as equal and aggregating the recall accuracy scores of all the attributes into an aggregate SA score tells us little about how operators come to comprehend ATC situations (e.g., Durso & Sethumadhavan, 2008). Understanding the extent to which individuals monitor the different aircraft attributes under different LOAs will help to better understand the process of SA. Third, this study sought to examine the performance of participants working with different levels of the conflict detection aid following the failure of the automation. Specifically, the difference in the time taken by participants working with the different LOAs to detect an upcoming collision between an aircraft pair was examined. Fourth, this study examined whether the relationship between the LOA and the time taken to detect the upcoming collision following automation failure was mediated by SA under perfectly reliable automation conditions. Though prior studies (e.g., Endsley & Kiris, 1995; Kaber et al., 2000) have attributed the loss of SA to be the central factor responsible for the OOP problem, these studies have not demonstrated whether the loss of SA is the main factor responsible for performance degradation following automation failure. Fifth, this study applied the concept of LOA to the ATC domain. Though prior research has examined the extent to which human performance differs under various LOAs in domains such as the military (e.g., Crocoll & Coury, 1990), robotics (Kaber et al., 2000), and driving (e.g., Endsley & Kiris, 1995), research should be directed towards understanding the effect of LOAs in the ATC context in order to design automated systems for the future airspace system.


Finally, this study applied the PSW model of automation (Parasuraman et al., 2000) to the ATC domain. There have been limited studies (e.g., Kaber et al., 2006; Rovira et al., 2007) evaluating the PSW model of automation, and validating this model requires empirical work.

Overview of the Present Study

The PSW model of automation (Parasuraman et al., 2000) was used in this experiment. Following initial manual training on the ATC task, participants controlled air traffic using information acquisition, information analysis, decision and action selection, or action implementation automation. In the information acquisition condition, an automated aid displayed aircraft that were flying at the same altitude in the same color on the ATC display. Specifically, a moderate LOA (e.g., Parasuraman et al., 2000) that highlights important information (i.e., altitude) was applied to the information acquisition stage of information processing. In the information analysis condition, an automated aid displayed the potential collisions that would occur in the scenario. Specifically, a high LOA (e.g., Parasuraman et al., 2000) that integrates information from multiple sources (i.e., altitude, heading, location, and destination) and projects future events (i.e., collisions) was applied to the information analysis stage of information processing. In the decision and action selection condition, an automated aid recommended the course of action (i.e., altitude separation) to resolve a potential collision. Specifically, level 4 in the Sheridan and Verplank (1978) LOA taxonomy (i.e., 'Computer suggests one alternative') was applied to the decision and action selection stage of information processing. In the action implementation condition, an automated aid detected potential collisions and autonomously resolved them by making an altitude separation.


Specifically, level 7 in the Sheridan and Verplank (1978) LOA taxonomy (i.e., 'Computer executes selected option and informs the human') was applied to the action implementation stage of information processing. Note that the level of involvement of the human operator decreases as we move from the information acquisition condition to the action implementation condition. In the remainder of the paper, information acquisition, information analysis, decision and action selection, and action implementation automation will be referred to as LOAs for simplicity. Further, information analysis, decision and action selection, and action implementation automation will be referred to as high LOAs. In addition to the primary task of controlling air traffic, participants were given an additional task of responding to wind shear warnings that appeared on a separate computer display. There were two main reasons for including the weather monitoring task. First, it would induce higher workload by requiring participants to perform multiple tasks concurrently, thereby being representative of a multi-task environment like ATC. Second, the secondary task would encourage overreliance on the automated aids provided in the ATC task. Participants completed several LOA training scenarios before completing the LOA experimental scenario. The automation did not fail in any of the LOA training scenarios so as to build up participants' trust in the automated aids. This is consistent with the methodology adopted in earlier studies (e.g., Kaber et al., 2000; Rovira et al., 2007). Therefore, for participants in the information acquisition condition, aircraft flying at altitudes 3, 2, and 1 were presented on the ATC display in green, pink, and blue, respectively. The automation did not fail in acquiring the correct altitude of aircraft in any of the training scenarios.


In the information analysis condition, the aid never failed to present the upcoming collisions to the participants in any of the training scenarios. In the decision and action selection condition, the aid never failed to provide recommendations to avoid upcoming collisions, and finally, in the action implementation condition, the aid never failed to automatically prevent upcoming collisions. Following the LOA training scenarios, participants completed an LOA experimental scenario. During this scenario, the SA of participants working under the various LOAs was assessed using a modification of the SAGAT methodology (Endsley, 1995a). SA was assessed while the automated aids were fully reliable. SA assessments were made prior to the occurrence of the automation failure so as to examine the extent to which SA (before the automation failure) mediated the association between LOA and collision detection performance following automation failure. The automated aids failed at the end of the experimental scenario. Specifically, in the information acquisition condition, the aid failed to present the data block of one of the aircraft that would be involved in an upcoming collision in the correct color. In the information analysis condition, the aid failed to detect an upcoming collision and thereby failed to highlight the aircraft pair that would be involved in the collision. In the decision and action selection condition, the aid failed to detect an upcoming collision and thereby failed to provide a recommendation to resolve the collision. In the action implementation condition, the aid failed to detect an upcoming collision and thus failed to automatically resolve it. The time taken by participants to detect the upcoming collision following the failure of the aid was assessed.


I have two competing hypotheses for explaining the differences in SA and performance that would arise when participants work with the four LOAs.

Active-Processing Hypothesis

According to this hypothesis, SA will decrease with increasing LOA, due to reduced involvement of the operators in the task (e.g., Endsley & Kiris, 1995; Kaber et al., 2000; Wickens & Kessel, 1979). The advantages of active involvement have been demonstrated using simple perceptual tasks (e.g., Larish & Andersen, 1995), cognitive tasks (e.g., Foos et al., 1994), and tasks in safety-critical domains (e.g., Endsley & Kiris, 1995). Further, when the reliability of an automated aid is high, people rely on it to a great extent, and the consequences associated with this overreliance are greater when the automation is applied to higher order cognitive functions (e.g., Rovira et al., 2007). This would imply that the SA of participants will decrease with increasing LOA in the ATC task. Corresponding with this decrease in SA with increasing LOA, the time taken to detect the aircraft collision following the failure of the automated aid will also be longer under higher LOAs. That is, participants in the information acquisition condition will take the least time to detect the upcoming collision following the automation failure, compared to the other LOAs.


Free Cognitive Resources Hypothesis

According to this hypothesis, SA will increase with increasing LOA, due to the greater availability of cognitive resources under high LOAs, which helps operators maintain awareness of the system state (e.g., Jentsch et al., 1999; Lorenz et al., 2002). That is, higher LOAs free up operators' cognitive resources, thereby helping them gain better SA about the system state. This would imply that the SA of participants will increase with increasing LOA in the ATC task. Corresponding with this decrease in SA under lower LOAs, the time taken to detect the aircraft collision following the failure of the automated aid will also be longer under lower LOAs. That is, participants in the information acquisition condition will take the longest time to detect the upcoming collision following the automation failure, compared to the other LOAs.

A Closer Look at SA

In this experiment, the SA of participants assigned to all the LOAs was assessed using a modification of the SAGAT methodology (Endsley, 1995a; Mogford, 1997). The simulation was frozen twice during the experimental scenario. The participants were then asked to recall the attributes of the aircraft present in their airspace. Specifically, they were asked to recall the call sign, location, altitude, heading, and destination of all the aircraft in their airspace. Aircraft speed was omitted because all aircraft flew at speed fast at virtually all times. Proportion accuracy in recall was calculated separately for call sign, location, altitude, heading, and destination.
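
The per-attribute scoring just described reduces to a proportion-correct calculation for each attribute at each freeze. The following is a minimal sketch of that computation; the dictionary-based aircraft records and field names are illustrative assumptions, not the ATST's actual data format.

```python
# Per-attribute SA scoring: the proportion of aircraft for which a
# participant correctly recalled each attribute at a SAGAT freeze.
# Record layout and attribute names are illustrative assumptions.

ATTRIBUTES = ["call_sign", "location", "altitude", "heading", "destination"]

def attribute_accuracy(actual, recalled):
    """actual and recalled are lists of dicts keyed by ATTRIBUTES, aligned so
    that actual[i] and recalled[i] describe the same aircraft. Returns the
    proportion correct for each attribute, computed separately."""
    scores = {}
    for attr in ATTRIBUTES:
        correct = sum(a[attr] == r[attr] for a, r in zip(actual, recalled))
        scores[attr] = correct / len(actual)
    return scores

# Example: two aircraft; the participant misrecalls one altitude, so the
# altitude score is 0.5 while every other attribute scores 1.0.
actual = [
    {"call_sign": 5, "location": "AE", "altitude": 2, "heading": "E", "destination": "C"},
    {"call_sign": 3, "location": "CG", "altitude": 3, "heading": "W", "destination": "A"},
]
recalled = [
    {"call_sign": 5, "location": "AE", "altitude": 1, "heading": "E", "destination": "C"},
    {"call_sign": 3, "location": "CG", "altitude": 3, "heading": "W", "destination": "A"},
]
print(attribute_accuracy(actual, recalled))
```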


Differences among the LOAs in the recall of any of the aircraft attributes can be interpreted with the active-processing hypothesis or the free cognitive resources hypothesis. That is, if the recall of the call sign, location, altitude, heading, or destination attributes increases with decreasing LOA, this finding can be explained by the active-processing hypothesis. On the other hand, if the recall of these attributes increases with increasing LOA, this finding can be explained by the free cognitive resources hypothesis. Differences among the LOAs in the accuracy of recall of only the spatial attributes, namely location, altitude, and heading, will be interpreted with the unequal attribute relevance hypothesis.

Unequal Attribute Relevance Hypothesis

According to this hypothesis, some aircraft attributes are considered more relevant and are better monitored by controllers. Specifically, there is evidence that aircraft location, altitude, and heading are the most widely monitored pieces of information on a radar display. For example, Bisseret (1970) showed that controllers recalled altitude and location information more accurately than other aircraft attributes. Similar findings were reported by Means et al. (1988). They found that controllers recalled position, heading, altitude, and call sign with 84%, 80%, 79%, and 24% accuracy, respectively. Even when the controllers were not given explicit instructions to remember altitude and heading, they did so with high accuracy. Mogford (1997) also showed that some aircraft attributes are more important and better remembered than others. He showed that location, heading, and altitude were recalled better by ATC trainees than other aircraft attributes (e.g., speed, call sign). He also found that, of all the attributes, heading and altitude predicted the performance of ATC trainees. According to Mogford (1997), aircraft location, heading, and altitude are monitored to a greater extent by controllers than other aircraft attributes because these attributes help controllers anticipate conflicts between aircraft pairs.


That is, in order to determine whether two aircraft will be in conflict, controllers have to read the altitude of the two aircraft that are in the same general area and then determine whether they are on converging paths. Similar to Mogford (1997), Gronlund, Ohrt, Dougherty, Perry, and Manning (1998) examined the amount of radar screen information that controllers use to maintain their SA. They found that controllers recalled the position of aircraft on the map with remarkable accuracy. After aircraft position, recall accuracy of altitude information was higher than accuracy of the other attributes (i.e., ground speed, destination, departure point, aircraft type). Gronlund et al. (1998) also asked experienced enroute air traffic controllers to rate how important it is to remember the attributes of an aircraft. Aircraft altitude and position were reported as the most important pieces of information. Most controllers responded 'it depends' for aircraft call sign, destination, route, type of aircraft, and speed. The majority of controllers rated remembering an aircraft's computer ID as 'not important'. To summarize, some aircraft attributes, such as location, altitude, and heading, are considered more relevant by controllers and therefore monitored better than others because these attributes help controllers predict aircraft collisions when they manually control traffic. Location, altitude, and heading form part of a controller's spatial awareness. Therefore, if differences among the LOAs arise only in the recall of these spatial attributes, this finding can be interpreted with the unequal attribute relevance hypothesis.


CHAPTER II

METHOD

Participants

Seventy-two individuals (27 men and 45 women) ranging in age from 18 to 30 years (M = 21.11, SD = 3.62) participated in this experiment. All participants reported having normal or corrected-to-normal vision and hearing. They were also screened for red-green color deficiencies. Participants either received course credit or a monetary stipend upon completion of the experiment. Participants were randomly assigned to the experimental conditions.

Simulation Software

Air traffic scenarios were simulated using a modification of the Air Traffic Scenarios Test (ATST), a low-fidelity simulation of the radar display. The ATST is used by the Federal Aviation Administration (FAA) for the selection of air traffic controllers (Broach, 2002). The airspace comprised four sector gates to neighboring sectors (A, B, C, and D) and two airports (x, y). An aircraft could fly at three possible speeds (Fast, Medium, or Slow) and at any of three altitudes (1, 2, or 3). A medium speed aircraft moves at two-thirds the speed of a fast aircraft, whereas a slow aircraft travels at one-third the speed of a fast aircraft. All aircraft were programmed to appear in the airspace at speed fast during the scenarios. Participants were instructed to fly an aircraft at speed fast at all times except when landing the aircraft at an airport. When landing at airports, aircraft were required to be at speed slow and altitude 1. When exiting through sector gates, aircraft were required to be at speed fast and altitude 3.


The ATST simulator was modified to accommodate the research needs. Figure 1 presents a snapshot of the modified ATST airspace. Aircraft were identified by call signs (e.g., 5, 13). Routes were added to the airspace. E, F, G, H, and I were waypoints through which an aircraft had to pass to get to its final destination. Aircraft were programmed to fly the shortest possible routes in the airspace; for example, an aircraft originating from sector gate A with destination C flew the shorter path AEIGC instead of the longer path AEFGC. Finally, in consultation with an FAA Training Academy instructor at Oklahoma City, the speeds (Fast, Medium, or Slow) of the aircraft in the simulator were modified to match the speeds of aircraft in a Terminal Radar Approach Control environment.

Figure 1. A snapshot of the ATST airspace. A, B, C, and D are sector gates through which planes enter or leave the controller’s airspace. x and y are airports. E, F, G, H, and I are waypoints. F2C_5 represents an aircraft flying at speed fast and altitude 2. The destination of this aircraft is sector gate C and its call sign is 5.


The ATC Task

Each scenario started with a certain number of aircraft on the computer screen. These aircraft appeared near the sector gates or on the routes in the airspace. Participants could change the altitude and speed of aircraft present in the airspace. To issue a change, the participant had to click on an aircraft. For example, to change the altitude of a plane to level 2, the participant clicked on the plane and then clicked 2. Similarly, to change the speed of a plane from fast to slow, the participant clicked on the plane and then clicked slow. The participants were instructed to avoid collisions between aircraft, land and exit aircraft at the right speed and altitude, and avoid delays in the airspace (see Appendix F).

Participant Instructions

Avoiding Collisions

The participants were instructed to avoid collisions during a scenario. A collision occurred if two aircraft flying at the same altitude were at the same location at the same time. Participants were informed that aircraft that violated a separation distance of 5 miles (see the "5 miles" distance in Figure 1) were in danger of a potential collision. The participants were told to resolve a collision by an altitude separation. Participants were instructed not to use speed separation to resolve a collision because, in the simulator, this increases the time taken by an aircraft to reach its destination. Participants were also not allowed to resolve collisions by making a heading change. From a practical perspective, if heading changes were permitted, aircraft could fly anywhere in the airspace, which would prevent the occurrence of planned collisions in the airspace. From an operational perspective, resolving conflicts by making a heading change is the last choice adopted by controllers (e.g., Amaldi & Leroux, 1995; Rantanen & Nunes, 2005; Willems, Allen, & Stein, 1999).


Participants were told to press a button labeled 'conflict' on the ATC display when they detected a potential collision. They were also asked to say aloud the call signs of the aircraft involved in the upcoming collision. The advance notification times (Metzger & Parasuraman, 2005) were recorded by the ATST simulator. Advance notification time was defined as the time at which a potential collision would occur in the scenario minus the time at which the participant detected the potential collision by pressing the conflict button. The advance notification time served as the main ATC task performance variable; therefore, participants were instructed to press the conflict button before making an altitude change to resolve an upcoming collision. When participants forgot to press the conflict button, the time at which they clicked on one of the aircraft involved in the upcoming collision to change its altitude was subtracted from the scripted time of the potential collision to obtain the advance notification time.
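
Expressed as a calculation, advance notification time is the scripted collision time minus the detection time, with the conflict-button press preferred and the first altitude-change click on an involved aircraft used as a fallback. The sketch below assumes times are recorded in seconds from scenario start; the function and parameter names are illustrative, not the ATST's internals.

```python
def advance_notification_time(scripted_collision_time,
                              conflict_press_time=None,
                              first_altitude_click_time=None):
    """Advance notification time: the time a potential collision is scripted
    to occur minus the time the participant detected it. The conflict-button
    press is used when available; otherwise the first altitude-change click
    on one of the involved aircraft serves as the detection time. All times
    are assumed to be seconds from scenario start (illustrative names)."""
    detection_time = (conflict_press_time
                      if conflict_press_time is not None
                      else first_altitude_click_time)
    if detection_time is None:
        return None  # the potential collision was never detected
    return scripted_collision_time - detection_time

# A collision scripted at 600 s and detected at 540 s yields 60 s of
# advance notice; larger values therefore reflect earlier detection.
print(advance_notification_time(600.0, conflict_press_time=540.0))  # 60.0
```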


Avoiding Rule Violations

The participants were also instructed to adhere to the rules of the simulation: aircraft were to be landed at airports at speed slow and altitude 1, and aircraft were to exit through sector gates at speed fast and altitude 3. The number of rule violations was recorded by the ATST simulator.

Avoiding Handoff Delay

Each scenario started with a certain number of aircraft on the computer screen, and as the scenario progressed, more aircraft appeared on the screen. The participants were instructed to activate these planes as soon as possible in order to reduce the handoff delay time. If a plane had not been activated after 7 seconds, it was automatically activated by the simulation program, and a handoff delay of 7 seconds was assigned for that plane. The handoff delay score for each participant was recorded by the ATST simulator.
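
Read literally, the handoff delay rule caps each plane's activation latency at the 7-second auto-activation point. A sketch under that reading follows; the names, and the assumption that sub-7-second latencies count toward the score, are illustrative rather than a description of the ATST's actual bookkeeping.

```python
AUTO_ACTIVATION_DELAY = 7.0  # seconds until the simulator activates a plane itself

def handoff_delay(appearance_time, activation_time=None):
    """Delay charged for one plane, in seconds. A plane the participant never
    activated was auto-activated after 7 s, so the full 7 s is charged;
    otherwise the activation latency is charged, capped at 7 s. (An
    illustrative reading of the rule; the simulator's scoring may differ.)"""
    if activation_time is None:
        return AUTO_ACTIVATION_DELAY
    return min(activation_time - appearance_time, AUTO_ACTIVATION_DELAY)

# The handoff delay score for a scenario is the sum over all planes.
planes = [(100.0, 102.5), (180.0, None), (240.0, 241.0)]  # (appeared, activated)
print(sum(handoff_delay(t0, t1) for t0, t1 in planes))    # 2.5 + 7.0 + 1.0 = 10.5
```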


Secondary Task

The participants were told that, in addition to controlling air traffic, they were also required to monitor a weather display for wind shear warnings (see Appendix F). Weather places operational demands on controllers and is a major contributor to aviation accidents (e.g., FAA, 2004; NTSB, 2001); therefore, monitoring the weather display served as a secondary task with some face validity. Participants were told that the secondary task was lower in priority than the primary task of controlling traffic. Consistent with prior research (e.g., Kaber & Endsley, 2004; Metzger & Parasuraman, 2005; Rovira et al., 2007), the purpose of including the secondary task in the scenarios was twofold. First, the secondary task would induce higher task load by requiring participants to perform multiple tasks concurrently, making the scenarios representative of a multi-task environment like ATC. Second, the secondary task should encourage overreliance on the automated aids provided in the ATC task. For example, Parasuraman et al. (1993) showed that individuals exhibited overreliance on an automated gauge monitoring aid only in a multi-task environment. That is, individuals were slower in detecting automation failures when they performed automation monitoring in conjunction with other tasks than when automation monitoring was the only task they had to perform.

The secondary task used in this experiment was a modification of the secondary task used by DeLucia and Betts (2008) in a simulated surgical environment. In the current experiment, the secondary task was presented on a laptop monitor placed adjacent to the ATC display. A series of numbers between 100 and 199 appeared at 3-second intervals. Participants were told that a number less than 130 or greater than 170 represented a wind shear warning. Consequently, they were told to press the spacebar key on the laptop when they saw a number less than 130 or greater than 170. If the number presented on the screen was between 130 and 170 (including 130 and 170), the participants were not required to press the spacebar key. Each number remained on the screen for 3 seconds, after which the next number was presented. No more than 3 consecutive numbers elicited the same response from the participants; that is, no more than 3 adjacent numbers were signals (requiring a key press) or noise (requiring no key press). To encourage the participants to pay attention to the secondary task, auditory feedback was provided: if the participants failed to press the spacebar key in response to a wind shear warning, an auditory beep notified them that they had failed to respond. Participants were told to be as fast and accurate as possible in this task.

If the participants responded to a wind shear warning by pressing the spacebar key, their response was scored as a hit. If the participants failed to press the spacebar key for a wind shear warning, their response was scored as a miss. Pressing the spacebar key for a non-wind shear number was scored as a false alarm. The response times for hits were also recorded. When a SAGAT freeze occurred during the experimental scenario, the weather display was also blanked. Further, if a SAGAT freeze occurred (e.g., 315 seconds into the scenario), the three numbers presented in the weather display before the freeze (that is, the numbers presented at 306, 309, and 312 seconds) were not wind shear alerts (that is, these numbers did not require a spacebar key press). This was done to ensure that participants in all the LOAs were exposed to similar secondary task conditions immediately before the SA assessment.
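
In signal detection terms, each 3-second trial is a signal when the number falls outside the 130–170 band (inclusive at both ends) and noise otherwise, and each response is a hit, miss, false alarm, or correct rejection. A minimal sketch of that classification follows; the function names are illustrative.

```python
def is_wind_shear(number):
    """Numbers from 100 to 199 appeared every 3 s; values below 130 or above
    170 were wind shear warnings (130 and 170 themselves were not)."""
    return number < 130 or number > 170

def score_trial(number, pressed_spacebar):
    """Score one 3-second trial. In the experiment, a miss also triggered an
    auditory beep, and response times were logged for hits."""
    if is_wind_shear(number):
        return "hit" if pressed_spacebar else "miss"
    return "false alarm" if pressed_spacebar else "correct rejection"

trials = [(125, True), (150, False), (171, False), (168, True)]
print([score_trial(n, p) for n, p in trials])
# ['hit', 'correct rejection', 'miss', 'false alarm']
```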

Experimental Design

Independent Variables

The independent variable in this study was the LOA: information acquisition, information analysis, decision and action selection, or action implementation. LOA was the between-participants grouping variable. Participants were randomly assigned to the LOA conditions. The different LOAs are described as follows.

Information Acquisition

Participants assigned to the information acquisition condition were given an automated aid that acquired the altitude of all the aircraft in the airspace and presented all aircraft flying at the same altitude in the same color. Specifically, aircraft flying at altitudes 3, 2, and 1 were presented in the airspace in green, pink, and blue, respectively (see Figure 2). The participants were told that if they saw two aircraft that had the same color and that would violate the 5-mile separation standard, it indicated a potential collision or separation error.
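
Functionally, this aid is a fixed altitude-to-color mapping applied to every data block, with the scripted failure mis-coloring a single aircraft. The mapping below is taken from the text; the failure-injection parameters are illustrative assumptions about how such a trial could be represented.

```python
ALTITUDE_COLORS = {3: "green", 2: "pink", 1: "blue"}  # mapping used in this condition

def data_block_color(true_altitude, failed=False, acquired_altitude=None):
    """Color of an aircraft's data block. On the scripted failure trial the
    aid acquires the wrong altitude and colors the block accordingly, e.g.
    blue (altitude 1) for an aircraft actually flying at altitude 3.
    The failure parameters are illustrative, not the ATST's interface."""
    if failed and acquired_altitude is not None:
        return ALTITUDE_COLORS[acquired_altitude]
    return ALTITUDE_COLORS[true_altitude]

print(data_block_color(3))                                    # 'green'
print(data_block_color(3, failed=True, acquired_altitude=1))  # 'blue' despite altitude 3
```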


Figure 2. A snapshot of the ATST airspace in the information acquisition condition. All aircraft flying at altitude 3 are presented in green, all aircraft flying at altitude 2 are presented in pink, and all aircraft at altitude 1 are presented in blue.

Rationale for color coding altitude. Information acquisition automation refers to technology that supports human sensory processes by cueing or highlighting information (e.g., Dzindolet et al., 2002; Yeh et al., 1999). The LOA used in this study cued the altitude of aircraft by using color and can be considered representative of information acquisition automation. In order to determine whether two aircraft will be in conflict, controllers must determine the location, heading, and altitude of aircraft pairs. Altitude information needs to be acquired by reading the data block, which consists of alphanumeric characters, and reading the data block can be time-consuming (e.g., Johnston, Horlitz, & Edmiston, 1993). Research on visual search tasks provides evidence that perceptual features such as color are preattentively processed (e.g., Treisman, 1986). There is evidence that color coding improves the ability of controllers to detect conflicts in ATC displays by supporting human sensory processes.


Using discriminable colors to represent the altitude of aircraft has been shown to help controllers assess altitude without allocating focused attention to data blocks (e.g., Johnston et al., 1993; Remington, Johnston, Ruthruff, Gold, & Romero, 2000). For example, Johnston et al. (1993) showed that color coding improved the ability of participants to detect aircraft that were within 1000 feet of a target aircraft.

The participants were also informed that the automation was highly reliable, though not 100% reliable. Specifically, they were told that the automation might acquire the wrong altitude information for an aircraft and consequently display its altitude in the wrong color. For example, when the automation fails to acquire the correct altitude information, it may present the data block of an aircraft in blue (representing altitude 1) even though the altitude of the aircraft is 3. The participants were told that even if the automation failed, it was still their responsibility to monitor the aircraft in their airspace and prevent collisions.

Information Analysis

Participants assigned to the information analysis condition were given an automated aid that acquired the altitude, location, heading, and destination of all the aircraft in the airspace and determined whether an aircraft would be within 5 miles of another aircraft in the future. If the 5-mile separation distance would be violated, the aircraft pair was projected to be in conflict. These aircraft pairs were displayed in a conflict alert table on the ATC display (see Figure 3). In addition, to help participants locate the aircraft pairs projected to be in conflict, these aircraft were highlighted in blue in the airspace. The information analysis automation used in this study is similar to the User Request Evaluation Tool (URET; e.g., Brudnicki & McFarland, 1997), which maintains the current flight plans of all aircraft, uses them to determine potential conflicts, notifies the controller 20 minutes prior to a conflict, and displays the location of a conflict graphically.
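
Functionally, the information analysis aid projects each aircraft forward and flags any same-altitude pair predicted to come within 5 miles of each other. The sketch below substitutes straight-line dead reckoning for the simulator's route-following projection; the names, units, and look-ahead horizon are illustrative assumptions rather than ATST internals.

```python
import itertools
import math

SEPARATION_MILES = 5.0

def project(aircraft, t):
    """Dead-reckon an aircraft's position t minutes ahead. Each aircraft is a
    dict with x/y in miles and vx/vy in miles per minute; this straight-line
    projection stands in for the simulator's route-based prediction."""
    return (aircraft["x"] + aircraft["vx"] * t, aircraft["y"] + aircraft["vy"] * t)

def projected_conflicts(traffic, horizon=20.0, step=0.25):
    """Return call-sign pairs predicted to be at the same altitude and within
    5 miles of each other at any sampled time over the look-ahead horizon."""
    conflicts = set()
    for a, b in itertools.combinations(traffic, 2):
        if a["altitude"] != b["altitude"]:
            continue  # vertical separation already exists
        t = 0.0
        while t <= horizon:
            (ax, ay), (bx, by) = project(a, t), project(b, t)
            if math.hypot(ax - bx, ay - by) < SEPARATION_MILES:
                conflicts.add((a["call_sign"], b["call_sign"]))
                break
            t += step
    return conflicts

# Two aircraft converging head-on at altitude 3 are flagged as a conflict.
traffic = [
    {"call_sign": 3, "x": 0.0, "y": 0.0, "vx": 4.0, "vy": 0.0, "altitude": 3},
    {"call_sign": 5, "x": 40.0, "y": 0.0, "vx": -4.0, "vy": 0.0, "altitude": 3},
]
print(projected_conflicts(traffic))  # {(3, 5)}
```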


The participants were also informed that the automation was highly reliable, though not 100% reliable. Specifically, they were told that the automation might fail to detect an upcoming collision and thereby fail to highlight the aircraft pair involved in the collision.

Figure 3. A snapshot of the ATST airspace in the information analysis condition. The table on the left lists the potential collisions and separation errors that would occur in the scenario. In this snapshot, the aircraft with call signs 3 and 5 that are flying at altitude 3 and present in paths CG and AE are expected to collide. This information is presented in the table. These aircraft are also highlighted in blue in the airspace.


Decision and Action Selection

As with the information analysis condition, participants assigned to the decision and action selection condition were given an automated aid that acquired the altitude, location, heading, and destination of all the aircraft in the airspace and determined whether an aircraft would be within 5 miles of another aircraft in the future. If the 5-mile separation distance would be violated, the aircraft pair was projected to be in conflict. A recommendation of an altitude change to resolve the projected conflict was then displayed in a 'recommendations' table on the ATC display (see Figure 4). In addition, the aircraft whose altitude needed to be changed to resolve the potential conflict was highlighted in blue in the airspace. Specifically, level 4 in the Sheridan and Verplank (1978) LOA taxonomy (i.e., 'Computer suggests one alternative') was implemented in this condition. The decision and action selection automation used in this study was similar to the Prediction/Resolution Advisory Tool (PRAT; e.g., Nolan, 1999), which provides conflict resolution advisories to controllers. The participants were also informed that the automation was highly reliable, though not 100% reliable. Specifically, they were told that the automation might fail to detect an upcoming conflict and thereby fail to provide the recommendation to resolve the collision.


Figure 4. A snapshot of the ATST airspace in the decision and action selection condition. The table on the left provides the recommendation to avoid potential collisions or separation errors that would occur in the scenario. In this snapshot, the aid has provided the recommendation to change the altitude of the aircraft with call sign 3 (that is flying at altitude 3 and present in the path GC) to altitude 1 to avoid a potential collision (with the aircraft with call sign 5). This aircraft is also highlighted in blue in the airspace.

Action Implementation

As with the information analysis and decision and action selection conditions, participants assigned to the action implementation condition were given an automated aid that acquired the altitude, location, heading, and destination of all the aircraft in the airspace and determined whether an aircraft would be within 5 miles of another aircraft in the future. If the 5-mile separation distance would be violated, the aircraft pair was projected to be in conflict. In this case, the automated aid resolved the upcoming conflict by executing an altitude change. Specifically, level 7 in Sheridan and Verplank's (1978) LOA taxonomy (i.e., 'Computer executes selected option and informs the human') was implemented in this condition. Therefore, the automated aid autonomously detected and resolved conflicts and then provided feedback to the participants (See Figure 5).

Specifically, the action performed on one of the aircraft to resolve the potential conflict was shown in a table in the ATC display, and that aircraft was highlighted in blue in the airspace. As with the other LOAs, the participants were told that the automation was highly reliable, though not 100% reliable. Specifically, they were told that the automation might fail to detect an upcoming conflict and thereby fail to resolve the conflict.

Figure 5. A snapshot of the ATST airspace in the action implementation condition. The automated aid resolves potential collisions or separation errors that would occur in the scenario and gives feedback to the participant. The action performed by the automation to resolve the potential collision is indicated in the table. In this snapshot, the automated aid has changed the altitude of the aircraft with call sign 3 (that is present in path GC) to altitude 1 (from altitude 3) to avoid a potential collision (with the aircraft with call sign 5). The aircraft on which the action is taken is also highlighted in blue in the airspace.

The underlying functioning of the information analysis, decision and action selection, and action implementation aids was the same. The factor that distinguished these LOAs was the extent to which collision information was made available to the participants. In the information analysis condition, both aircraft of each pair projected to be in conflict were highlighted on the radar display. Note that in the information analysis condition, the decision to be made to resolve the conflict (i.e., determining the altitude to which an aircraft needed to be changed) and the action to be executed to resolve the conflict (i.e., changing the altitude of the aircraft) had to be performed by the participants. In the decision and action selection condition, only one of the aircraft involved in a potential conflict was highlighted on the radar screen, along with a recommendation to resolve the conflict. In this automation condition, the action execution (i.e., making the altitude change to avoid the collision) had to be performed by the participants. In the action implementation condition, the automation resolved potential conflicts that would occur in the airspace and highlighted on the radar display the action it performed to avoid the conflicts. Note that, in this LOA, both making the decision to avoid an upcoming collision and executing the action to avoid it were done by the automation.

Dependent Variables

The dependent variables of interest in this study were SA, meta-SA, ATC task performance, secondary task performance, and subjective rating of mental workload. All the dependent variables are discussed in detail in this section.

SA

Similarities with SAGAT. The main dependent variable of interest in this study was participant SA. SA was assessed using a modification of the SAGAT methodology (Endsley, 1987; Endsley, 1995a). The ATC simulation was frozen during the scenarios and the ATC display screen was blanked. The participants were then asked to turn away from the ATC display screen. They were then presented with a map of the airspace and asked to indicate the location of all the aircraft present in their airspace with an "X" on the map. Once they completed recalling the locations of all the aircraft, they were

given another map. The locations of all the aircraft in the airspace were copied from the ATC display to this map by the experimenter while the participants were completing the location recall. They were then asked to indicate the call sign, altitude, heading, and destination of all the aircraft on the second map (See Appendix F). This methodology of blanking the ATC display and asking controllers to recall the attributes of all the aircraft that are currently present in their airspace has been widely used to measure controller SA (e.g., Endsley, Sollenberger, Nakata, & Stein, 2000; Gronlund et al., 1998; Mogford, 1997). The timing of the freezes followed the guidelines recommended by Endsley (1995a, 2000). Therefore, SAGAT freezes did not occur in the first 3-5 minutes of the ATC scenario, thereby providing participants sufficient time to obtain an understanding of the events occurring in the scenario. In addition, during the 15-minute experimental scenario, SAGAT freezes occurred twice, which is acceptable because up to 3 SAGAT freezes have been shown to have no adverse effects on task performance during 15-minute scenarios (Endsley, 2000). Further, the SAGAT stops did not occur within 1 minute of each other to help ensure that participants could regain their SA following the first SAGAT stop. Participants were given a maximum time limit of 5 minutes to complete the recall. SAGAT stops exceeding 5 minutes are not advocated because long stops may result in memory decay (Endsley, 1995a). Differences from SAGAT. In the traditional SAGAT methodology (Endsley, 1995a), the overall accuracy in recalling all the aircraft attributes is used as a measure of operator SA. This does not tell us the extent to which individuals monitor the different attributes of aircraft. In this experiment, in addition to the total SA score, the extent to which

participants monitored various dimensions of an ATC situation, such as call sign, destination, location, altitude, and heading, was examined separately. Another important difference from the traditional SAGAT methodology concerns the determination of freeze times. In the traditional SAGAT technique, the time of occurrence of the freezes is randomly determined (Endsley, 2000). In this experiment, the times of occurrence of the SAGAT freezes were determined by the experimenter and recorded in a file, so that the simulation programs for all the LOAs could read from the file and present the SAGAT freezes at the same times during a scenario to the participants in all the LOAs. Randomly administered SAGAT freezes would pose the problem of assessing the SA of participants working under varying LOAs at different times during the experimental scenario. Assessing SA. Both SAGAT freezes occurred before the automation failure, so as to examine the extent to which pre-automation failure SA affected the collision detection performance of individuals working with the different LOAs after the automation failed. The responses of the participants during the first and second SAGAT freezes were averaged to obtain an overall proportion accuracy score for call sign, destination, location, altitude, and heading. The proportion accuracy score for call sign was computed as the ratio of the number of aircraft call signs correctly recalled to the total number of aircraft in the airspace. The proportion accuracy scores for destination, heading, and altitude were computed in the same manner.

In order to compute the proportion accuracy score for aircraft location, the distance error (in cm) between each aircraft's reported and actual location was computed. If this distance error exceeded the "5-mile" distance (5 miles = 0.8 cm) in the ATST airspace (See the "5 miles" standard in Figure 1), or the aircraft was positioned in a wrong segment of the airspace (e.g., an aircraft positioned at segment GF when it was actually present at segment GH), then the response was recorded as incorrect. This procedure was repeated for all the aircraft present on the ATC screen at the time of the freeze to obtain an overall proportion accuracy score for aircraft location. In addition to the scores on the various aircraft attributes, the total SA score was also examined. The proportion accuracy scores for call sign, destination, location, altitude, and heading were averaged to obtain the total SA score.

Meta-SA

Meta-SA refers to individuals' subjective assessment of their own SA (Durso, Rawson, & Girotto, 2007). Meta-SA was assessed by asking participants, at the end of the experimental scenario, to indicate their confidence in their ability to correctly recall each of the aircraft attributes during the SAGAT freezes, using a 0 (not at all confident) to 10 (extremely confident) rating scale. This methodology is analogous to the retrospective confidence judgments that are used to assess a learner's metacognition (e.g., Dunlosky, Serra, & Baker, 2007).

ATC Task Performance Variables

In addition to participant SA, performance measures in the ATC task were also collected. The main ATC performance variable of interest was advance notification time (Metzger & Parasuraman, 2005). Participants were asked to press a button labeled 'conflict' on the radar display when they detected an upcoming collision in the scenario and to call out the call signs of the aircraft involved in the collision. Advance notification time was computed as the time the planned collision would occur (or would have occurred) in the scenario minus the time the participant reported the collision by pressing the conflict button. Advance notification time was recorded when the automated aids were 100% reliable as well as following the failure of the automated aids. In addition, the number of rule violations and the handoff delay time were also collected. Rule violations occurred when participants did not adhere to the rules of the simulator and landed or exited aircraft at an incorrect speed or altitude. Handoff delay time (in number of radar sweeps) was computed as the difference between the time an aircraft requiring acceptance of a handoff appeared on the radar screen and the time the controller accepted the handoff. All the ATC performance variables were recorded by the ATST simulation program.

Secondary Task Performance Variables

In addition to the ATC task, participants were also given a secondary task to perform. Participants were instructed to respond to wind shear warnings by pressing the spacebar key, which was recorded as a hit. Failure to press the spacebar key in response to a wind shear warning was scored as a miss. Thus, the performance measures for the weather monitoring task included (a) the hit-to-signal ratio, computed as the number of hits / (number of hits + number of misses), (b) the reaction time for the detection of hits, and (c) the number of false alarms.
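For concreteness, the scoring rules described above (attribute proportion accuracy, the 0.8 cm location criterion, and the total SA average) can be expressed as a short sketch. It is illustrative only; the data structures and all function names are assumptions, not part of the ATST software.

    import math

    FIVE_MILES_CM = 0.8  # the "5-mile" distance on the ATST map, in cm

    def attribute_accuracy(recalled, actual):
        """Proportion accuracy for one attribute (e.g., altitude): the number
        of aircraft whose attribute was recalled correctly divided by the
        total number of aircraft in the airspace."""
        correct = sum(1 for ac, value in recalled.items() if actual.get(ac) == value)
        return correct / len(actual)

    def location_accuracy(reported_xy, actual_xy, reported_seg, actual_seg):
        """A location response is correct only if it falls within 0.8 cm
        (5 miles) of the true position and lies in the correct segment."""
        correct = 0
        for ac, (x, y) in actual_xy.items():
            rx, ry = reported_xy[ac]
            close_enough = math.hypot(rx - x, ry - y) <= FIVE_MILES_CM
            if close_enough and reported_seg[ac] == actual_seg[ac]:
                correct += 1
        return correct / len(actual_xy)

    def total_sa(freeze1_scores, freeze2_scores):
        """Average the five attribute scores, themselves averaged over the
        two SAGAT freezes, to obtain the total SA score."""
        per_attr = [(freeze1_scores[k] + freeze2_scores[k]) / 2 for k in freeze1_scores]
        return sum(per_attr) / len(per_attr)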

Subjective Rating of Mental Workload

Subjective ratings of mental workload were collected using an electronic version of the NASA-TLX (Hart & Staveland, 1988). Participants first rated the perceived level of demand for each of the six workload components (on a scale of 5 to 100): mental demand, physical demand, temporal demand, performance, effort, and frustration. They then made 15 pairwise rankings of the six components in relation to the task to establish weights for each component (e.g., 'Click on the factor that contributes more to your workload: mental demand or physical demand'). To obtain a composite workload score, the individual component ratings were multiplied by the component weights and summed together. Workload was assessed at the end of each SAGAT freeze (See Appendix F).
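As an illustration, the composite NASA-TLX computation described above can be sketched as follows. The weights are taken to be the number of times each component is chosen across the 15 pairwise comparisons; dividing the weighted sum by 15 is the conventional NASA-TLX normalization and is an assumption here, since the text specifies only the multiply-and-sum step. All names and example numbers are hypothetical.

    from collections import Counter

    COMPONENTS = ["mental", "physical", "temporal", "performance", "effort", "frustration"]

    def tlx_composite(ratings, pairwise_choices):
        """ratings: component -> rating on the 5-100 scale.
        pairwise_choices: the component chosen in each of the 15 pairwise
        comparisons. Returns the weighted composite workload score."""
        weights = Counter(pairwise_choices)  # each component ends up weighted 0-5
        weighted_sum = sum(ratings[c] * weights.get(c, 0) for c in COMPONENTS)
        return weighted_sum / 15.0  # conventional normalization (assumption)

    # Hypothetical example: one participant's ratings and 15 choices.
    ratings = {"mental": 70, "physical": 20, "temporal": 55,
               "performance": 40, "effort": 60, "frustration": 35}
    choices = (["mental"] * 5 + ["effort"] * 4 + ["temporal"] * 3 +
               ["performance"] * 2 + ["frustration"] * 1)  # 15 choices in total
    print(tlx_composite(ratings, choices))  # prints 58.0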

The independent and dependent variables in this experiment are provided in Table 1.

Table 1. The independent and dependent variables in this study

Independent variable:
• LOA
  o Information acquisition
  o Information analysis
  o Decision and action selection
  o Action implementation

Dependent variables:
• ATC task performance
  o Advance notification time (following automation failure)
  o Rule violations
  o Handoff delay
• SA
  o Call sign
  o Location
  o Altitude
  o Heading
  o Destination
  o Total SA
• Meta-SA
  o Meta-memory for call sign
  o Meta-memory for location
  o Meta-memory for altitude
  o Meta-memory for heading
  o Meta-memory for destination
• Secondary task performance
  o Hit-to-signal ratio
  o Reaction time to hits
  o Number of false alarms
• Subjective workload

Procedure

This study involved approximately 3 hours of participation. Table 2 presents the details of the experiment.

Table 2. Experiment details

What did the participant do?
• Consent form
• Ishihara Color Blindness Test (screening for red-green color deficiencies)
• Biographical Questionnaire and Video Game Questionnaire
• Instructional video: airspace layout
• Learn the map of the airspace
• Instructional video: rules to control traffic
• Two manual training scenarios
• Secondary task training
• Two manual training scenarios with the secondary task
• SAGAT familiarization scenario
• Manual performance assessment on two scenarios
• LOA instructions
• LOA training scenario
• Three LOA training scenarios with the secondary task
• Performance assessment on a LOA scenario with the secondary task
  o SAGAT administered twice during the scenario (before the automation failure)
  o NASA-TLX collected at the end of each SAGAT stop
  o Meta-SA ratings collected at the end of the scenario

Procedural Details

Airspace Map and ATST Rules

Participants first completed a consent form (See Appendix C). Thereafter, they were screened for red-green color deficiency using the Ishihara color blindness test (Ishihara, 1951). The Ishihara test consisted of 11 pseudoisochromatic plates. Ten plates were made of

colored circles with two-digit numbers inscribed inside the circles. Participants were asked to say aloud the numbers inscribed in the circles. One of the plates consisted of a bluish-green winding line that the participants had to trace. They were allowed 3 s to respond to each plate. An incorrect response to 4 or more plates was considered indicative of defective red-green color vision. Only participants with normal color vision were allowed to participate in the experiment. Two participants were excluded from the experiment due to defective red-green color vision. After completing the color blindness test, participants completed a biographical questionnaire (See Appendix D) and a video game experience questionnaire (See Appendix E). They then watched an instructional video that demonstrated the layout of the airspace (See Appendix F). After watching the instructional video, they were asked to summarize what they learned. Thereafter, they memorized the map of the airspace. They were then given a blank map and asked to indicate the airports, sector gates, waypoints, the relative speeds of fast, medium, and slow planes, and the route that a plane travels in the airspace given its source and destination. If participants did not reconstruct all these parameters accurately on the first trial, they were asked to re-learn the airspace map until 100% accuracy was achieved. This procedure helped participants learn the map of the airspace. Thereafter, they watched an instructional video that demonstrated the rules for controlling traffic in the ATST simulator (See Appendix F). After learning the rules, participants were asked to explain them to the experimenter, with the latter providing feedback. This procedure helped participants learn the rules of controlling traffic.

Manual Training and Assessment

After watching the instructional video, the participants completed an ATC training scenario. The experimenter helped the participant during this scenario. The participants then completed another training scenario, with no assistance from the experimenter. These scenarios lasted 5 minutes and required participants to control a total of 9-10 aircraft, with 4 aircraft present at a time on the ATC display. A total of two collisions were embedded in each of these scenarios. The training on the ATC task was followed by a 5-minute training session on the secondary task (See Appendix F). Following the training on the secondary task, participants completed two training scenarios in which they performed the ATC task in conjunction with the secondary task. These dual-task training scenarios lasted 5 minutes each and involved 8-9 aircraft, with 4 aircraft present at a time on the ATC screen. A total of two collisions were embedded in each of these scenarios. After completing the dual-task training scenarios, participants completed a SAGAT familiarization scenario. Before the start of this scenario, participants were told that the ATC scenario and the weather display would freeze at a random time during the upcoming scenario. They were informed that when a freeze occurred they would be given a map of the airspace and asked to recall the attributes of all the aircraft present in their airspace before the freeze occurred (See Appendix F). They were also informed that if they encountered a freeze at any time during any of the scenarios in the course of the experiment, they would be given a map of the airspace and asked to recall the attributes of all the aircraft present in their airspace at the time of the freeze. In addition, they were

also told that freezes would occur at random times during scenarios and that multiple freezes could occur during a scenario. Following the SAGAT familiarization scenario, participants were assessed on their performance on two scenarios. In these scenarios, participants performed the ATC task in conjunction with the secondary task. These scenarios lasted 10 minutes, with 4 aircraft present at one time on the ATC display. Each of these scenarios involved four potential collisions. Note that before being assigned to an LOA, participants were given manual training, in which they controlled traffic with no assistance from automated aids. It is not clear from the automation literature to what extent manual training was provided to participants before they were assigned to various LOAs (e.g., Endsley & Kiris, 1995). The purpose of providing manual training in this study before assigning participants to the different LOAs was to help them develop appropriate internal models that would help them detect automation failures (e.g., Endsley, 1995b). For example, Kessel and Wickens (1982) showed that participants who were given manual training in a pursuit tracking task were better at detecting system failures under passive monitoring conditions than those who were given no manual training. According to Moray (1986), if operators are solely trained for outer loop control, it may become impossible for them to take over manual control when the automation fails. Therefore, manual training was provided to the participants in this experiment before assigning them to work with the LOAs.

LOA Training

After completing the manual test scenarios, participants were randomly assigned to one of the LOAs. A partial blind procedure was used in order to partly control for experimenter bias (e.g., Gould, 2002); that is, the experimenter was unaware of the automation condition to which a participant had been assigned until manual training was complete. The participants first received instructions on the functionalities of the automated aid (See Appendix F). In addition, participants were informed that, though highly reliable, the aid was not 100% reliable. They were told that in the event of an automation failure, it was still their responsibility to prevent collisions between aircraft. These instructions are consistent with the instructions given to participants in prior studies (e.g., Metzger & Parasuraman, 2005; Rovira et al., 2007). After receiving instructions on the functionalities of the LOA, participants performed a 5-minute training scenario with the assigned LOA. This scenario had a total of five aircraft present at one time on the ATC display. The experimenter helped the participants during this scenario, pointing out the functionality of the automated aid. A total of five collisions were embedded in this scenario to demonstrate the functionality of the various LOAs. This training scenario was followed by three more LOA training scenarios. During these scenarios, participants completed both the ATC task and the secondary task. These scenarios lasted 10 minutes each, with 5 aircraft present on the ATC display at one time. A total of eight collisions were scripted into each of these scenarios. The automation did not fail in any of the LOA training scenarios, so as to build up participants' trust in the automated aids. This is consistent with the methodology adopted

in earlier studies (e.g., Kaber et al., 2000; Rovira et al., 2007). Thus, for participants in the information acquisition condition, aircraft flying at altitudes 3, 2, and 1 were presented on the ATC display in green, pink, and blue, respectively, and the automation did not fail in acquiring the correct altitude of aircraft in any of the training scenarios. In the information analysis condition, the aid never failed to present upcoming collisions to the participants in any of the training scenarios. In the decision and action selection condition, the aid never failed to provide recommendations to avoid upcoming collisions, and in the action implementation condition, the aid never failed to automatically prevent upcoming collisions.

LOA Assessment

After completing the training scenarios, participants were assessed on their performance in a LOA test scenario, in which they controlled traffic and performed the secondary task. This scenario lasted 15 minutes, involved a total of 5 aircraft on the ATC display at one time, and contained a total of 12 scripted collisions. The different LOAs were perfectly reliable in detecting the first 11 upcoming collisions; the automated aids failed to detect the 12th potential collision. That is, for participants in the information acquisition condition, the aid failed to present the data block of one of the aircraft that would be involved in an upcoming collision in the correct color. For example, suppose that the aircraft pair with call signs 26 and 27, flying at altitude 3, were in danger of a potential collision. Instead of presenting the aircraft with call sign 26 in green, the aid presented it in blue (which represents altitude 1). In the information analysis condition, the aid failed to detect the 12th scripted collision and thereby failed to highlight

the data blocks of the aircraft involved in the collision. In the decision and action selection condition, the aid failed to detect the 12th scripted collision and thus failed to provide the recommendation to resolve the collision. Finally, in the action implementation condition, the aid failed to detect the 12th scripted collision and thereby failed to automatically resolve the collision. The time taken by participants working with the different LOAs to detect the 12th scripted collision following the failure of the automation (i.e., advance notification time) was recorded. The scenario ended 8 seconds after the occurrence of the 12th scripted collision. Only one failure was induced in the experimental scenario; that is, the reliability of the automated aids was set at a high level (11/12 = 0.92). Automated systems with reliabilities below 0.7 have been shown to be worse than having no automation support (e.g., Wickens & Dixon, 2005). Further, operators 'disuse' an automated system if the system does not function as expected (Parasuraman & Riley, 1999). SA of the participants was assessed twice during the scenario. Both SA assessments were conducted prior to the occurrence of the automation failure so as to determine the SA of individuals working under perfectly reliable automation conditions. The extent to which the pre-automation failure SA score mediated the relationship between LOA and collision detection performance following automation failure was examined. Participants also provided NASA-TLX ratings at the end of each SAGAT stop. After completing the test scenario, participants were asked to make subjective judgments about their ability to correctly recall the aircraft attributes during the SAGAT freezes.

CHAPTER III
RESULTS AND DISCUSSION

The purpose of this study was to examine the differences in the SA and performance of individuals when they worked with different levels of a conflict detection aid to control traffic. One way Analysis of Variance (ANOVA) with LOA as the between subjects variable was performed on the ATC task performance variables (advance notification time, rule violations, and handoff delay), SA, meta-SA, the secondary task performance variables (hit-to-signal ratio, reaction time to hits, and false alarms), and subjective ratings of workload. The summary of the results is presented in Table 3.

Table 3. Summary of results. There were no differences between the three high LOAs (information analysis, decision and action selection, and action implementation) on any of the dependent variables.

Dependent Variable | Information Acquisition Automation
Advance notification time (following automation failure) | Higher advance notification time than high LOAs (i.e., faster detection of the collision following automation failure)
Rule violations | Not different from high LOAs
Handoff delay | Not different from high LOAs
Total SA | Higher total SA than high LOAs
Meta-SA | Not different from high LOAs
Secondary task performance (hit-to-signal ratio) | Lower hit-to-signal ratio than high LOAs
Subjective ratings of workload | Not different from high LOAs
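The analysis pipeline used throughout this chapter (a one way between subjects ANOVA followed by Tukey's HSD) can be reproduced with standard statistical software. The sketch below uses Python's statsmodels as one possibility; the column names and the synthetic data (18 participants per LOA, matching the reported degrees of freedom) are assumptions for illustration, not the study data.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols
    from statsmodels.stats.multicomp import pairwise_tukeyhsd

    rng = np.random.default_rng(0)
    # Synthetic data for demonstration only: 18 participants per LOA,
    # giving df = (3, 68) as reported throughout this chapter.
    group_means = {"acquisition": 55.0, "analysis": 28.0,
                   "decision": 16.0, "implementation": 14.0}
    df = pd.DataFrame([{"loa": name, "ant": rng.normal(mean, 18.0)}
                       for name, mean in group_means.items()
                       for _ in range(18)])

    model = ols("ant ~ C(loa)", data=df).fit()      # one way between subjects ANOVA
    print(sm.stats.anova_lm(model, typ=2))          # F test for the LOA factor
    print(pairwise_tukeyhsd(df["ant"], df["loa"]))  # Tukey's HSD pairwise comparisons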

ATC Performance

Three one way ANOVAs with LOA as the between subjects variable were performed on advance notification time following automation failure, the number of rule violations, and handoff delay. A Bonferroni adjustment (e.g., Tabachnick & Fidell, 2001) was made to account for multiple testing (adjusted significance level was 0.017). Throughout the results section, effects that reached traditional, uncorrected levels of significance will be reported as marginal.

Effect of LOA on Advance Notification Time

A one way ANOVA with LOA as the between subjects variable was performed on advance notification time following automation failure. Based on the active-processing hypothesis, it was expected that participants assigned to the information acquisition condition would have a higher advance notification time following automation failure than those in the high LOAs. On the other hand, based on the free cognitive resources hypothesis, it was expected that participants in the information acquisition condition would have a lower advance notification time following automation failure than those in the high LOAs. The effect of LOA on advance notification time was significant, F(3, 68) = 16.507, p < .001, ηp2 = 0.421. Tukey's HSD analysis was conducted on the mean advance notification time to identify the LOAs that were significantly different from each other. The results revealed that, following the automation failure, individuals working with information acquisition automation were significantly faster in detecting collisions than individuals working with all the other LOAs (See Table 3). No significant differences in advance notification time were observed between the individuals working with the information analysis automation (95% CI = 18.842, 37.368), decision automation (95% CI = 7.173, 25.699), and action implementation automation (95% CI = 4.907, 23.433). The mean advance notification times for the information acquisition, information analysis, decision and action selection, and action implementation automation were 55.261 s (SD = 15.078), 28.105 s (SD = 25.871), 16.436 s (SD = 20.239), and 14.170 s (SD = 15.662), respectively (See Figure 6).


Figure 6. Effect of LOA on advance notification time following automation failure. The ideal advance notification time is 70 s. Error bars indicate ± 1 standard error of the mean.

In summary, when the automation failed, individuals working with information acquisition automation detected the upcoming collision earlier than individuals working with the other LOAs. Specifically, individuals in the information acquisition condition detected the upcoming collision approximately 27 s, 38 s, and 41 s earlier than those working with the information analysis, decision and action selection, and action implementation automation, respectively. Participants in the information acquisition condition were actively involved in making predictions about potential collisions, generating their own decisions on how to avoid the collisions, and implementing their decisions, contributing to better performance following automation failure compared to the other LOAs. Thus, automation of sensory processing had benefits over automation of higher order cognitive functions such as prediction generation and decision making. This result confirmed the active-processing hypothesis, indicating that decreasing the level of operator involvement can prove hazardous if the automation reliability is not 100%. There is a plethora of research demonstrating the advantages of active processing. For example, Craik and Tulving (1975) showed that deeper levels of processing improve the memorability of information. Similarly, research on student learning has shown that students who generated their own questions after studying a text performed better on an upcoming test than those who received the questions from the experimenter (Foos, Mora, & Tkacz, 1994). The disadvantages of passive involvement have been demonstrated in safety-critical domains as well (e.g., Gugerty, 1997; Young, 1969; Wickens & Kessel, 1979). For example, Endsley and Kiris (1995) showed that individuals working with a fully automated navigation expert system took longer to perform the task manually following an automation failure than those who had performed the task manually all along.

Effect of LOA on Rule Violations

A one way ANOVA with LOA as the between subjects variable was performed on rule violations. Based on the active-processing hypothesis, it was expected that participants assigned to the information acquisition condition would have fewer rule violations than those in the high LOAs. On the other hand, based on the free cognitive resources hypothesis, it was expected that participants in the information acquisition condition would have more rule violations compared to those in the

high LOAs. However, the effect of LOA on the total number of rule violations failed to reach significance, F(3, 68) = 2.628, p > .05, ηp2 = 0.104 (See Table 3). This means that none of the automated aids appeared to have benefited individuals in tasks such as landing and exiting aircraft at the right speed and altitude.

Effect of LOA on Handoff Delay

A one way ANOVA with LOA as the between subjects variable was performed on handoff delay. Based on the active-processing hypothesis, it was expected that participants assigned to the information acquisition condition would have a lower handoff delay than those in the high LOAs. On the other hand, based on the free cognitive resources hypothesis, it was expected that participants in the information acquisition condition would have a higher handoff delay compared to those in the high LOAs. There was a significant effect of LOA on handoff delay, F(3, 68) = 4.372, p < .01, ηp2 = 0.162. The mean handoff delays for the information acquisition, information analysis, decision and action selection, and action implementation automation were 23.722 s (SD = 12.960), 26.833 s (SD = 14.051), 41.222 s (SD = 20.916), and 24.500 s (SD = 17.521), respectively. Tukey's HSD analysis was conducted on the mean handoff delay time. Results indicated that individuals working with information acquisition automation were faster in accepting planes into their airspace compared to those working with the decision automation. The post-hoc analysis also revealed that individuals working with the action implementation automation were faster in accepting aircraft into their airspace than those working with the decision automation. These results were not in full agreement with either the active-processing hypothesis or the free cognitive resources hypothesis. Thus, it would be safe to conclude that none of the automated aids seemed to have helped individuals in accepting planes quickly into their airspace (See Table 3).

SA

The overall proportion accuracy score for call sign was obtained by averaging the proportion accuracy scores for call sign obtained during the two SAGAT freezes. The overall proportion accuracy scores for destination, altitude, and heading were obtained in the same manner. To compute the proportion accuracy score for aircraft location, the distance error (in cm) between each aircraft's reported and actual location was computed. If this distance error exceeded the "5-mile" distance (5 miles = 0.8 cm) in the ATST airspace, or the aircraft was positioned in a wrong segment of the airspace, the response was recorded as incorrect. The proportion accuracy scores for call sign, location, altitude, heading, and destination obtained during the two SAGAT freezes were averaged to compute the total SA score.

Effect of LOA on Total SA

A one way ANOVA with LOA as the between subjects variable was performed on the total SA score. Based on the active-processing hypothesis, it was expected that participants assigned to the information acquisition condition would have higher SA than those in the high LOAs. On the other hand, based on the free cognitive resources hypothesis, it was expected that participants in the information acquisition condition would have lower SA compared to those in the high LOAs. The main effect of LOA on the total SA score was significant, F(3, 68) = 6.304, p < .01, ηp2 = 0.218. Tukey's HSD analysis on the total SA score showed that individuals working with information acquisition

automation had higher overall SA compared to those working with all the other LOAs. The mean proportion accuracy scores for total SA for the information acquisition, information analysis, decision and action selection, and action implementation automation were 0.514 (SD = .101), 0.397 (SD = .085), 0.416 (SD = .085), and 0.383 (SD = .125), respectively. In summary, individuals assigned to the information acquisition condition had higher total SA than those in the other automation conditions (See Table 3). They were also faster in detecting the collision following automation failure compared to individuals in the other LOAs. These results were consistent with the active-processing hypothesis, indicating that decreasing the level of operator involvement by automating higher order stages of information processing, such as inference generation, decision making, and implementation of automation generated decisions, can lower SA and leave the operator out of the loop.

Total SA as a Mediator between LOA and Performance Following Automation Failure

The methods developed by Baron and Kenny (1986) were used to determine whether the relationship between LOA and advance notification time following automation failure was mediated by pre-automation failure total SA. For SA to serve as a mediator in the association between LOA and advance notification time, the following conditions should be met: (1) LOA should be associated with SA, (2) SA should be associated with advance notification time, (3) LOA should be associated with advance notification time, and (4) the association between LOA and advance notification time should decrease when SA is controlled (See Figure 7).

Four linear regression analyses were conducted to test for mediation. First, a regression analysis was conducted with LOA predicting total SA. The results showed that total SA was predicted by LOA (β = -.379, p < .01). Second, a regression analysis was performed with total SA predicting advance notification time. The results revealed that advance notification time was predicted by total SA (β = .488, p < .001). Third, a regression analysis was conducted with LOA predicting advance notification time. Results indicated that advance notification time was predicted by LOA (β = -.600, p < .001). Fourth, a regression analysis was performed with LOA and total SA predicting advance notification time. The results revealed that advance notification time was predicted by both LOA (β = -.484, p < .001) and total SA (β = .304, p < .01). To determine whether total SA mediated the relationship between LOA and advance notification time following automation failure, Sobel's (1982) test was conducted. Software available on the Internet was used to perform this test (Preacher & Leonardelli, 2001). The following statistics were used: the unstandardized regression coefficient for the association between LOA and total SA and its standard error, as well as the unstandardized regression coefficient for the association between total SA and advance notification time, adjusted for LOA, and its standard error. The Sobel test revealed a significant indirect effect of total SA as a mediator between LOA and advance notification time following automation failure (z = -2.287, p < .05). In summary, pre-automation failure total SA mediated the relationship between LOA and advance notification time following automation failure, with higher SA before the automation failure contributing to earlier detection of the collision after the failure. This finding is important to the main purpose of the study. This study

demonstrates that SA predicts performance when operators interact with automated systems, with lower SA resulting in poor performance when the automation fails.
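The Baron and Kenny (1986) steps and the Sobel (1982) test reported above can also be scripted directly rather than computed with a web calculator. A minimal sketch, assuming a participant-level data frame with columns loa (LOA coded numerically), sa (pre-automation failure total SA), and ant (advance notification time); all names are assumptions:

    import numpy as np
    import statsmodels.formula.api as smf

    def sobel_mediation(df):
        """Baron & Kenny regressions plus a Sobel z for the indirect effect of
        LOA on advance notification time (ant) through pre-failure SA (sa)."""
        m1 = smf.ols("sa ~ loa", data=df).fit()        # condition 1: LOA -> SA (path a)
        m2 = smf.ols("ant ~ sa", data=df).fit()        # condition 2: SA -> ANT
        m3 = smf.ols("ant ~ loa", data=df).fit()       # condition 3: LOA -> ANT (total effect)
        m4 = smf.ols("ant ~ loa + sa", data=df).fit()  # condition 4: direct effect + path b

        a, se_a = m1.params["loa"], m1.bse["loa"]      # unstandardized a and its SE
        b, se_b = m4.params["sa"], m4.bse["sa"]        # b adjusted for LOA, and its SE
        z = (a * b) / np.sqrt(b**2 * se_a**2 + a**2 * se_b**2)  # Sobel (1982)
        return m1, m2, m3, m4, z

With LOA coded so that higher values denote higher automation, path a is negative and path b positive, so the Sobel z comes out negative, which is consistent with the z = -2.287 reported above.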

[Figure 7 is a path diagram: LOA → pre-automation failure SA (β = -.379**); pre-automation failure SA → advance notification time (β = .488***); LOA → advance notification time: β = -.600*** without mediation, β = -.484*** with mediation.]

Figure 7. Mediation model of the association between LOA and advance notification time following automation failure, as mediated by total SA. Standardized regression coefficients are presented in the figure. ***p < .001, **p < .01.

Conventionally, the accuracy of recalling all the aircraft attributes is treated as a measure of overall operator SA (e.g., Endsley, 1995b; Kaber et al., 2000). Aggregating the recall accuracy scores of all the attributes to obtain an overall SA score tells us little about how individuals come to comprehend ATC situations. Therefore, the extent to which participants working with the different LOAs monitored the different aircraft attributes was examined. Determining the attributes that are monitored poorly under high LOAs can help in designing SA training programs that help individuals monitor these attributes better.

Effect of LOA on SA Variables

A one-way between subjects multivariate analysis of variance (MANOVA) was performed on five dependent variables: call sign, location, altitude, heading, and destination. The independent variable was the LOA. With the use of Wilks' criterion, the dependent variables were significantly affected by the LOA, F(15, 177.077) = 2.433, p < .01, ηp2 = 0.158. Univariate analyses showed that there was an effect of LOA on altitude recall, F(3, 68) = 4.398, p < .01, ηp2 = .162. Tukey's HSD analysis on the mean altitude recall showed that participants in the information acquisition condition (M = .706, SD = .234) recalled altitude better than those in the information analysis (M = .517, SD = .172) and action implementation (M = .467, SD = .243) conditions. Note that the participants in the information acquisition condition were provided with a color coding aid that cued aircraft altitude using color. Color coding would have helped the participants in this LOA preattentively process altitude without allocating focused attention to the data blocks (e.g., Johnston et al., 1993; Remington et al., 2000; Treisman, 1986), leading to superior recall of this attribute. In summary, individuals working with the information acquisition automation had higher awareness of the altitude of the aircraft in their airspace than individuals working with the other LOAs. This finding provided partial support for the unequal attribute relevance hypothesis. Aircraft altitude helps controllers anticipate collisions between aircraft pairs and is monitored to a greater extent by controllers than other attributes such as call sign and speed (e.g., Bisseret, 1970; Means et al., 1998; Mogford, 1997). This could be why participants in the information acquisition condition, who were actively involved in detecting collisions, making decisions, and resolving collisions with minimal automation support, remembered the altitude information better than those in the other LOAs. Surprisingly, the participants in the various LOAs did not differ in the recall of the other spatial attributes, namely aircraft location and heading. This could be because location and heading are represented graphically on the radar display and hence might be easier to remember irrespective of the LOA to which an individual is

assigned. Aircraft altitude, on the other hand, consists of numeric information, which needs to be monitored to be remembered. Univariate analyses also showed that there was an effect of LOA on destination recall, F(3, 68) = 3.081, p < .05, ηp2 = .120. Tukey's HSD analysis on the mean destination recall revealed that participants in the information acquisition condition (M = .572, SD = .193) recalled the destinations of aircraft better than those in the action implementation condition (M = .383, SD = .238). The individuals working with the different LOAs did not differ in the recall of the call sign attribute. This could be because call signs need to be remembered only on a need-to-know basis (e.g., Gronlund et al., 1998). Thus, the LOAs differed only in the monitoring of two dimensions: altitude and destination. There is evidence that these attributes are monitored the most by controllers. For example, Durso, Batsakes, Crutchfield, Braden, and Manning (2004) found that when controllers were asked to prioritize the strip markings that they made on flight progress strips, they judged altitude and route issued as the most frequent and critical strip markings. The finding that individuals assigned to the different LOAs differed only in the recall of certain aircraft attributes is relevant to the purpose of the study. Considering all the aircraft attributes as equal tells us little about how controllers come to comprehend situations. In order to understand the process of SA in the ATC domain, it is important to understand the extent to which individuals monitor the various aircraft attributes separately.

Adopting the methodology of examining the extent to which individuals recall various aircraft attributes to understand the comprehension of operators in an ATC environment is analogous to work done in reading comprehension. There has been considerable research in reading comprehension aimed at understanding the dimensions of a textual situation that readers monitor (e.g., Magliano, Zwaan, & Graesser, 1999; Zwaan, Magliano, & Graesser, 1995; Zwaan & Radvansky, 1998; Zwaan, Radvansky, Hilliard, & Curiel, 1998). Narratives are indexed along various dimensions that include protagonist, time, intentionality, causality, and space (Zwaan, Langston, & Graesser, 1995). As in reading comprehension, the process of SA in ATC can be better understood by determining the extent to which individuals monitor various aircraft attributes separately.

SA of Altitude and Destination as a Mediator between LOA and Performance

The LOAs differed only in the monitoring of altitude and destination. Therefore, the proportion accuracy scores on these two attributes were combined to obtain an overall score. The extent to which this pre-automation failure SA score mediated the relationship between LOA and ATC performance following automation failure was examined. The methods developed by Baron and Kenny (1986) were used to determine whether the relationship between LOA and advance notification time following an automation failure was mediated by SA for altitude and destination. For SA to serve as a mediator in the association between LOA and advance notification time, the following conditions should be met: (1) LOA should be associated with SA, (2) SA should be associated with advance notification time, (3) LOA should be associated with advance notification time, and (4) the association between LOA and advance notification time should decrease when SA is controlled (See Figure 8).

Four linear regression analyses were conducted to test for mediation. First, a regression analysis was conducted with LOA predicting SA. The results showed that SA was predicted by LOA (β = -.395, p < .01). Second, a regression analysis was performed with SA predicting advance notification time. The results revealed that advance notification time was predicted by SA (β = .444, p < .001). Third, a regression analysis was conducted with LOA predicting advance notification time. Results indicated that advance notification time was predicted by LOA (β = -.600, p < .001). Fourth, a regression analysis was performed with LOA and SA predicting advance notification time. The results revealed that advance notification time was predicted by both LOA (β = -.503, p < .001) and SA (β = .245, p < .05). To determine whether SA for altitude and destination mediated the relationship between LOA and advance notification time following automation failure, Sobel's (1982) test was conducted. The Sobel test revealed a significant indirect effect of SA as a mediator between LOA and advance notification time following automation failure (z = -2.024, p < .05). In summary, SA for altitude and destination mediated the relationship between LOA and performance following automation failure, with lower recall of aircraft altitude and destination resulting in shorter advance notification times (i.e., later detection of the collision). Therefore, automating higher order cognitive functions can hamper the comprehension of important aircraft attributes and leave individuals out of the loop.

[Figure 8 is a path diagram: LOA → pre-automation failure SA for altitude and destination (β = -.395**); SA → advance notification time (β = .444***); LOA → advance notification time: β = -.600*** without mediation, β = -.503*** with mediation.]

Figure 8. Mediation model of the association between LOA and advance notification time following automation failure, as mediated by SA for altitude and destination. Standardized regression coefficients are presented in the figure. ***p < .001, **p < .01.

Meta-SA

At the end of the test scenario, participants were asked to rate their confidence in correctly recalling call sign, destination, altitude, heading, and location during the SAGAT freezes. Note that the participants could assign the same confidence rating to multiple attributes. Altitude was reported as the most accurately recalled aircraft attribute by the participants (55.56%), followed by heading (34.72%), location (20.83%), destination (19.44%), and call sign (2.78%). This means that the majority of the participants were more confident in their ability to recall altitude than the other aircraft attributes; participants were least confident in their ability to recall aircraft call sign. One limitation of the SAGAT methodology was that participants were asked to recall the locations of the aircraft in their airspace before all the other aircraft attributes during the SAGAT freezes. This could have biased them to pay more attention to the location attribute. However, the majority of the participants did not rate location as the most remembered aircraft attribute; location was only rated as the third most recalled attribute.

Effect of LOA on Meta-SA

Five one way ANOVAs with LOA as the between subjects variable were performed on meta-memory for call sign, destination, altitude, heading, and location. A Bonferroni correction was implemented to account for multiple testing (adjusted significance level was 0.01). The effect of LOA on meta-memory for altitude reached marginal significance, F(3, 68) = 3.324, p < .05, ηp2 = 0.128. Tukey's HSD analysis showed that individuals working with information acquisition automation had higher confidence in their ability to correctly recall altitude (M = 7.611, SD = 1.195) than those working with information analysis automation (M = 5.611, SD = 2.893) and action implementation automation (M = 5.667, SD = 2.142). The effect of LOA on meta-memory for the other aircraft attributes failed to reach significance. This means that there were no differences between participants working with the varying LOAs in their confidence in their own SA (See Table 3). Even though participants working with the information acquisition automation had higher SA than participants assigned to the other LOAs, they did not rate their SA as higher in comparison to participants in the other LOAs. It is important that individuals have good SA and correct confidence in their SA so that they can make the right decisions based on their SA (e.g., Endsley et al., 2003).

Relationship between SA and Meta-SA

For each LOA, correlations between actual SA and meta-SA were examined (See Table 4). Virtually all of the significant correlations involved the call sign attribute. Call sign recall was generally poor, and participants were sensitive to how well they recalled the call sign attribute. With the exception of call sign, meta-SA did not mirror actual SA. For participants in the action implementation condition, the correlations between the accuracy of recall of the aircraft attributes and the confidence in the recall of these attributes did not reliably differ from zero.

Table 4. Relationship between SA and meta-SA: correlations between recall accuracy for each attribute and meta-memory for the same attribute. ***p < .001, **p < .01

LOA | Call sign | Altitude | Heading | Location | Destination
Information Acquisition | .613** | -.308 | .406 | .258 | .406
Information Analysis | .694** | .722** | .185 | .079 | .395
Decision and Action Selection | .762*** | .353 | -.127 | -.347 | .322
Action Implementation | .379 | .091 | -.191 | -.031 | .218
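The entries in Table 4 are within-group Pearson correlations between recall accuracy and confidence for the same attribute. For reference, a minimal sketch of that computation (the column naming scheme is an assumption):

    from scipy.stats import pearsonr

    ATTRIBUTES = ("call_sign", "altitude", "heading", "location", "destination")

    def sa_meta_sa_correlations(df):
        """For each LOA group, correlate recall accuracy for each attribute
        with the confidence (meta-memory) rating for the same attribute."""
        out = {}
        for loa, group in df.groupby("loa"):
            out[loa] = {attr: pearsonr(group[attr + "_recall"],
                                       group[attr + "_confidence"])
                        for attr in ATTRIBUTES}
        return out  # maps LOA -> attribute -> (r, p value)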

Overall, regardless of the LOA with which the participants were working, their judgments about their SA were incongruent with their actual SA. That is, irrespective of whether individuals worked with an automated system applied to a lower stage of information processing (i.e., information acquisition) or higher stages of information processing (i.e., information analysis, decision making or action implementation), their

perceptions about their SA did not correlate with their actual SA. Individuals working with information acquisition automation had higher actual SA than those assigned to the other LOAs. Therefore, it was surprising that their judgments about their SA did not correlate with their actual SA. Making correct judgments about one's SA is important for adopting better strategies to improve one's SA. Individuals working with high LOAs had lower SA than those working with the information acquisition automation. Therefore, it is especially important that individuals working with high LOAs be able to make accurate metacognitive judgments about their SA so that they can adopt better monitoring strategies and be equipped to respond to automation failures. The incongruence between meta-SA and actual SA is consistent with work on basic metacognition (e.g., Dunlosky et al., 2007). For example, judgments of how well individuals will be able to recall studied text material on an upcoming test have been found to be uncorrelated with their actual test performance. The evidence so far is in agreement with the active-processing hypothesis. That is, individuals working with high LOAs had lower SA for aircraft attributes, which consequently led to poorer performance in the ATC task following automation failure. If individuals working with the high LOAs were highly reliant on the automated aids provided to them in the ATC task, they should exhibit superior performance in the secondary task. Individuals assigned to the high LOAs should also experience lower workload than those in the lower LOAs because the primary task of controlling traffic is less demanding for them. Therefore, the next section will examine the secondary task

performance and subjective workload ratings of the individuals assigned to the four LOAs.

Secondary Task Performance

Three one way ANOVAs with LOA as the between subjects variable were performed on the hit-to-signal ratio, reaction time to hits, and the number of false alarms. A Bonferroni correction was implemented to account for multiple testing (adjusted significance level was 0.017).

Effect of LOA on Hit-to-signal Ratio

There was a significant effect of LOA on the hit-to-signal ratio, F(3, 68) = 12.087, p < .001, ηp2 = 0.348. Tukey's HSD analysis revealed that individuals working with information analysis automation, decision automation, and action implementation automation had higher hit-to-signal ratios than those working with the information acquisition automation (See Table 3). The mean hit-to-signal ratios for the information acquisition, information analysis, decision and action selection, and action implementation automation were 0.818 (SD = .103), 0.906 (SD = .070), 0.933 (SD = .047), and 0.939 (SD = .032), respectively (See Figure 9).


Figure 9. Effect of LOA on hit-to-signal ratio. Error bars indicate ±1 standard error of the mean.

Effect of LOA on Reaction Time to Hits

The effect of LOA on reaction time to hits was marginally significant under the Bonferroni-adjusted alpha of .017, F(3, 68) = 3.313, p < .05, ηp2 = .128. Tukey's HSD analysis revealed that individuals working with information analysis automation were faster in responding to wind shear alerts than those working with the information acquisition automation. The mean reaction times to hits for information acquisition automation, information analysis automation, decision and action selection automation, and action implementation automation were 1.196 s (SD = .128), 1.071 s (SD = .149), 1.083 s (SD = .112), and 1.123 s (SD = .135), respectively.


In summary, individuals working with information analysis, decision, and action implementation automation were more accurate in responding to wind shear warnings than those working with the information acquisition automation. The superior secondary task performance of individuals working with high LOAs can be attributed to their overreliance on the collision detection aids provided to them for the ATC task. Participants in the action implementation condition were given an automation aid that both detected and resolved upcoming collisions between aircraft. This was in sharp contrast to participants in the information acquisition condition, who had to detect and resolve the collisions that occurred in the scenarios with minimal automated assistance. Therefore, in comparison to the participants in the information acquisition condition, participants working with the other LOAs either had more cognitive resources to assign to the secondary task or chose to allocate more cognitive resources to it. In short, secondary task performance was superior when individuals worked with high LOAs. These results demonstrate the benefits of high LOAs in multi-task environments like ATC, where operators have to perform multiple tasks concurrently.

Effect of LOA on False Alarms

The effect of LOA on the number of false alarms failed to reach significance, F(3, 68) = 1.167, p > .05, ηp2 = .049. Binomial probabilities were also calculated to determine whether the number of false alarms for each LOA was significantly less than chance probability. A response was recorded as a false alarm when a participant pressed the spacebar key when no wind shear warning was presented on the weather display. This analysis helped to establish whether participants in all the LOAs were attending to the
secondary task, rather than merely pressing the spacebar key. As shown in Table 5, the results indicated that the overall percentage of false alarms for each LOA was significantly less than chance probability (p < .000001).

Table 5. Results from the binomial probability calculations

LOA                              Mean proportion of false alarms    Accuracy of response
Information acquisition          .025                               Less than chance probability
Information analysis             .039                               Less than chance probability
Decision and action selection    .030                               Less than chance probability
Action implementation            .027                               Less than chance probability
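Such a comparison can be sketched as an exact binomial test of each group's false-alarm count against a chance response rate. In the sketch below (Python), the counts and the assumed chance rate of .5 are placeholders; the dissertation's exact chance model is not specified in this excerpt:

    # Hedged sketch: exact binomial test that the false-alarm count is
    # significantly below chance. Counts and the chance rate (p = 0.5)
    # are illustrative assumptions.
    from scipy.stats import binomtest  # requires scipy >= 1.7

    false_alarms = 9      # hypothetical false-alarm responses in one group
    opportunities = 360   # hypothetical number of no-warning intervals
    result = binomtest(false_alarms, n=opportunities, p=0.5, alternative="less")
    print(f"p = {result.pvalue:.2e}")  # small p -> responding well below chance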

Results from Analysis of Covariance (ANCOVA)

An ANCOVA was conducted to assess the effect of LOA on performance following automation failure after controlling for performance in the secondary task; that is, the hit-to-signal ratio served as the covariate. Results indicated that even after controlling for secondary task performance, there was a significant effect of LOA on advance notification time following automation failure, F(3, 67) = 8.335, p < .001, ηp2 = .272. The covariate did not contribute significantly to the ANCOVA model, F(1, 67) = 1.455, p > .05, ηp2 = .021. Post-hoc analysis using Fisher's Least Significant Difference (LSD) was performed on the advance notification time to determine which LOAs differed significantly from each other. The adjusted means for the information acquisition, information analysis, decision and action selection, and action implementation automation were 51.852 s (SE = 5.421), 28.387 s (SE = 4.633), 17.882 s (SE = 4.780), and 15.850 s (SE = 4.832), respectively. This means that even after controlling for secondary task performance, participants in the information acquisition condition detected the collision earlier than participants in the information analysis,
decision, and action implementation conditions, following the failure of the automation. This finding is important to the central purpose of this study: it demonstrates the disadvantages of automating higher stages of information processing. Even if the ATC task is the only task individuals have to perform, providing high LOAs can leave them out of the loop, presumably due to a vigilance decrement that occurs when individuals work at low levels of involvement (e.g., Parasuraman, 1987).

Effect of LOA on Subjective Ratings of Workload

The workload ratings collected at the end of the two SAGAT freezes were averaged to obtain an overall total workload score. The effect of LOA on the subjective workload ratings failed to reach significance, F(3, 68) = .723, p > .05, ηp2 = .031 (See Table 3). It was expected that participants assigned to the higher LOAs would rate their workload as lower because the primary task of controlling traffic would be less demanding for them. However, this was not the case. This could be because highly automated systems do not always reduce workload, owing to the shift in the operator's role from active involvement to passive monitoring (e.g., Walker, Stanton, & Young, 2001). It could also be that subjective ratings of workload are insensitive measures of actual workload (e.g., Meshkati, Hancock, Rahimi, & Dawes, 1995).

Follow Up

Purpose

After completing the test scenario, the participants were asked to complete a follow up scenario. The purpose of the follow up scenario was to determine how the SA and performance of individuals working with the different LOAs would differ after experiencing the first automation failure (e.g., Wickens & Xu, 2002). There are two lines
of research that explain how operator performance differs after exposure to an automation failure. One line of research claims that operators' monitoring of automated systems improves following the first automation failure, presumably due to calibration of trust in the automation. For example, using a target detection task, Merlo, Wickens, and Yeh (1999) examined the performance consequences following the failure of an automated cueing aid. In their experiment, participants were instructed to search for targets in a hand-held or helmet-mounted display, classify each target as friend or enemy, and report the azimuth of targets. In some trials, an automated aid cued the location of targets on the display. The automated cueing aid was not fully reliable. Merlo et al. (1999) found that when the automated aid failed for the first time, target detection rates were only 50%. However, target detection rates in subsequent failure trials improved to 91%. Presumably, prior to the failure of the cueing algorithm, participants exhibited overtrust in the aid and were immediately drawn to the search space that was cued by the aid. Following the failure of the aid, participants calibrated their trust in the aid, which reduced their attentional tunneling. Similarly, Rovira et al. (2007) examined how the performance of operators changes after exposure to the failure of different LOAs. Participants were asked to perform a military decision-making task using information automation, low decision automation, medium decision automation, and high decision automation. They found that participants exhibited significant performance improvements in subsequent failure trials (compared to their performance after the first automation failure) for both medium and high decision automation. In summary, individuals tend to be more cautious after they experience an automation failure. Instead
of exhibiting overreliance on the automation, they adopt new cognitive strategies that are reflected in their performance following the failure. However, there is also evidence that operator performance may not always improve after an automation failure (e.g., Parasuraman et al., 1993; Parasuraman & Riley, 1997). For example, Kantowitz et al. (1997) found that drivers performed poorly because they continued to rely on a navigation system that made errors when driving in an unfamiliar city, where their confidence in manual navigation was low. Similarly, Biros et al. (2004) showed that military decision makers continued to rely on an automated aid when their workload was high, despite the low trust ratings they assigned to the automation. Therefore, operator reliance on an automated aid following the first occurrence of an automation failure depends on operator self-confidence as well as workload. No work has examined the effects of the first automation failure on subsequent operator SA and performance when individuals work with automated systems applied to all four stages of information processing. In short, the purpose of the follow up was to determine the effects of the first automation failure on the SA and performance of individuals working with information acquisition, information analysis, decision and action selection, and action implementation automation in the ATC domain.

Overview of Follow Up

A test scenario was designed to determine how the SA and performance of individuals working with the different LOAs would differ after exposure to the automation failure in the first test scenario. This scenario involved a total of 6 scripted collisions. The different LOAs were perfectly reliable in detecting the first 5 collisions.
The automated aids failed to detect the 6th scripted collision. The time taken by participants working with the different LOAs to detect the 6th collision (i.e., advance notification time) following the failure of the automation was recorded. SA of the participants was assessed once during the scenario, before the occurrence of the automation failure.

Examining the First Automation Failure Effect

ATC Performance: Advance Notification Time

A 2 (failure) X 4 (LOA) mixed ANOVA with LOA as the between-subjects variable was conducted on the advance notification time to examine differences in the time taken to respond to the first and second automation failures by individuals assigned to the four LOAs. Failure was the within-subjects variable, with first automation failure and second automation failure as the two treatment levels. Results indicated a significant main effect of LOA on the advance notification time, F(3, 68) = 20.681, p < .001, ηp2 = .477. No other significant main effects or interactions were identified. A Tukey's HSD analysis on advance notification time indicated that participants in the information acquisition condition (M = 54.034 s, SD = 12.473) were faster in detecting upcoming collisions following the failure of the automation than those in the information analysis (M = 30.384 s, SD = 24.267), decision and action selection (M = 22.608 s, SD = 22.209), and action implementation (M = 12.184 s, SD = 16.405) conditions, in both scenarios (See Table 6; an analysis sketch follows the table). Thus, participants working with the information acquisition automation were faster in detecting collisions following both the first and second automation failures than those working with the other LOAs. Even after the first automation failure, participants
working with the high LOAs continued to exhibit overreliance on the automated aids and were slower in responding to the second automation failure, compared to participants in the information acquisition condition.

Table 6. Effects of automation failure and LOA on advance notification time, secondary task performance, and total SA. There were no differences among the three high LOAs (information analysis, decision and action selection, and action implementation) on any of the dependent variables.

Dependent variable                                          Information acquisition automation
Advance notification time (following automation failure)    Higher advance notification time than high LOAs
Secondary task performance (hit-to-signal ratio)            Lower hit-to-signal ratio than high LOAs
Total SA                                                    Higher total SA than high LOAs
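The 2 X 4 mixed ANOVAs reported in this section can be sketched with the pingouin package, which handles designs with one within-subjects and one between-subjects factor. In the sketch below (Python), the long-format data layout and all column names are illustrative assumptions:

    # Hedged sketch: 2 (failure: first vs. second) x 4 (LOA) mixed ANOVA on
    # advance notification time. Requires long-format data with one row per
    # participant per failure; file and column names are illustrative.
    import pandas as pd
    import pingouin as pg

    df = pd.read_csv("failures_long.csv")  # hypothetical: pid, loa, failure, ant_time
    aov = pg.mixed_anova(dv="ant_time", within="failure", between="loa",
                         subject="pid", data=df)
    print(aov[["Source", "F", "p-unc", "np2"]])  # np2 = partial eta-squared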

Secondary Task Performance

The secondary task performance of the participants in the follow up scenario was compared to their performance in the first test scenario in order to assess the effects of the first automation failure on subsequent secondary task performance. A 2 (scenario) X 4 (LOA) mixed ANOVA with LOA as the between-subjects variable was conducted on the hit-to-signal ratio. Scenario (first test scenario, follow up scenario) was the within-subjects variable. Results indicated a significant main effect of LOA on the hit-to-signal ratio, F(3, 68) = 12.202, p < .001, ηp2 = .350. No other significant main effects or interactions were identified. A Tukey's HSD analysis on the hit-to-signal ratio revealed that participants in the information acquisition condition (M = 0.813, SD = 0.108) were less accurate in responding to wind shear warnings than those in the information analysis (M = 0.907, SD = 0.0615), decision and action selection (M = 0.928, SD = 0.0485), and action implementation (M = 0.930, SD = 0.046) conditions (See Table 6). Thus, even after being exposed to the first automation failure, participants working
with the high LOAs continued to attend to the secondary task and performed better on it than those in the information acquisition condition. This corroborates the notion that individuals working with high LOAs continued to rely on the collision detection aids provided to them to complete the ATC task, despite being exposed to the automation failure in the first test scenario.

Total SA

The total SA of the participants after exposure to the first automation failure (i.e., SA in the follow up scenario) was compared to their total SA prior to the first automation failure. A 2 (failure) X 4 (LOA) mixed ANOVA on total SA, with LOA as the between-subjects variable and failure (before first automation failure, after first automation failure) as the within-subjects variable, revealed a significant main effect of LOA, F(3, 68) = 8.852, p < .001, ηp2 = .281. A Tukey's HSD analysis on total SA showed that participants in the information acquisition condition (M = 0.488, SD = 0.119) had higher SA than those in the information analysis (M = 0.381, SD = 0.100), decision and action selection (M = 0.371, SD = 0.101), and action implementation (M = 0.348, SD = 0.106) conditions (See Table 6). In summary, individuals in the information acquisition automation condition had higher SA than those in the other LOA conditions both before and after the automation failure. There was also a significant main effect of failure, F(1, 68) = 17.533, p < .001, ηp2 = .205. Participants had higher SA before the first automation failure (M = 0.428, SD = 0.111) than after it (M = 0.367, SD = 0.127). Lower SA in the follow up scenario may have been due to a vigilance decrement associated with monitoring the ATC display for a long time (e.g., Endsley et al., 2003; Parasuraman, 1987). There is also a
possibility that the difference in SA may have arisen from differences between the first and second test scenarios.

Total SA as a Mediator between LOA and Performance

The purpose of this analysis was to examine whether SA after exposure to the first automation failure mediated the relationship between LOA and collision detection following the second automation failure. The Sobel test failed to reveal a significant p value for the indirect effect of total SA as a mediator between LOA and advance notification time following the second automation failure (z = -1.738, p > .05). SA for altitude, destination, location, heading, and call sign also did not mediate the association between LOA and collision detection performance following the second automation failure. That is, after exposure to the first automation failure, one's understanding of the situation no longer mediated the relationship between LOA and performance following the subsequent automation failure. If SA does not mediate the association between LOA and performance after an automation failure, it becomes important to understand the factors that do. Perhaps individuals' confidence in the automation mediates the relationship between LOA and performance following automation failure. There is evidence that operators continue to rely on automated systems, even ones that fail occasionally, if their confidence in the system exceeds their confidence in manual operation (e.g., Parasuraman & Riley, 1997).
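The Sobel statistic has a simple closed form, so the test is easy to sketch. In the snippet below (Python), a and b stand for the unstandardized coefficients from the two mediation regressions of the Baron and Kenny (1986) procedure, and the numeric values are placeholders rather than the study's estimates:

    # Hedged sketch: Sobel (1982) test of an indirect effect a*b, where
    # a = effect of LOA on total SA and b = effect of total SA on advance
    # notification time controlling for LOA. All inputs are placeholders.
    import math

    def sobel_z(a: float, se_a: float, b: float, se_b: float) -> float:
        """Sobel z statistic for the indirect effect a*b."""
        return (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)

    z = sobel_z(a=-0.04, se_a=0.015, b=55.0, se_b=25.0)  # placeholder inputs
    print(f"z = {z:.3f}")  # |z| < 1.96 -> indirect effect not significant at .05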


There is also a possibility that SA failed to mediate the relationship between LOA and performance due to methodological limitations. First, Endsley (2000) recommends that multiple SAGAT freezes be administered in a scenario in order to obtain a reliable assessment of SA. However, only one SAGAT freeze was administered in the follow up scenario, in order to reduce the duration of the experiment. This could be why the SA score failed to mediate the relationship between LOA and performance following the automation failure. Second, SA scores obtained using the SAGAT methodology may not be a true indicator of operator SA; SAGAT has been criticized for relying heavily on the operator's memory (e.g., Durso & Dattel, 2004). The extent to which SA and operator confidence in the automation mediate the relationship between LOA and performance following automation failures needs attention in future studies.


CHAPTER IV

CONCLUSIONS

This study sought to examine differences in the SA and collision detection performance of individuals working with a collision detection system implemented at various levels to complete an ATC task. The collision detection system was applied to different stages of information processing: information acquisition, information analysis, decision and action selection, and action implementation. High LOAs provided clear benefits. Participants working with the high LOAs exhibited superior performance in the secondary task, indicating that high LOAs can be beneficial in multi-task environments like ATC, where individuals have to perform several concurrent tasks. Though high LOAs offered benefits, the performance costs associated with their failure were high. When the automation failed, individuals working with the information analysis, decision, and action implementation automation took longer to detect an upcoming collision than those working with the information acquisition automation. The automated aids used in this study had high (though not 100%) reliability, which may have contributed to overreliance on them (e.g., Parasuraman & Riley, 1997). The consequences of this overreliance were greater when automation was applied to information analysis, decision and action selection, and action implementation. This finding replicates Rovira et al. (2007), who, using a military decision making task, showed that the costs associated with overreliance on automation were greater with decision automation than with information automation. SA mediated the relationship between LOA and performance following automation failure, with lower SA resulting in longer time to detect the upcoming collision. This
means that high LOAs hamper performance by reducing operator SA. Thus, automating higher order stages of information processing, such as inference generation and decision making, can lower SA and leave the operator out of the loop. Even after experiencing an automation failure, participants working with the high LOAs continued to exhibit overreliance on the automated aids provided to them. That is, even after being exposed to the first automation failure, individuals in the information analysis, decision, and action implementation conditions continued to have lower SA and were slower in detecting the collision following the subsequent automation failure than individuals in the information acquisition condition. Therefore, automating higher order stages of information processing can be extremely detrimental to ATC performance when 100% automation reliability cannot be assured, because participants working with high LOAs continued to 'misuse' their automated aids despite experiencing an automation failure. This study provides recommendations for automation design for the future National Airspace System by applying the PSW model of automation. Based on this model, in order to determine the appropriate degree of automation that will keep the operator involved, the consequences for human performance of interacting with automated systems applied to different stages of information processing have to be examined (Parasuraman et al., 2000). In this study, three main human performance areas were examined: SA, performance, and workload. This study revealed that when 100% reliable automation was not provided, automating sensory processing was beneficial in helping individuals detect an upcoming collision earlier following automation failure, in comparison to automation of information analysis, decision making, and action
implementation, by improving their SA. Workload ratings, however, were not sensitive in differentiating between the four stages of information processing. In summary, applying a moderate LOA (i.e., a cueing aid) to the information acquisition stage can be pursued in the future airspace system as long as automation reliability is high. Although applying high LOAs to the information analysis, decision and action selection, and action implementation stages can provide benefits, such as enhancing performance in the secondary tasks that controllers have to perform, the risk associated with their failure is high.

Theoretical Implications

The results from the present experiment have important theoretical implications. First, this study applied the concept of LOA to the ATC domain. Though prior studies have examined the effect of LOAs in domains such as the military (e.g., Crocoll & Coury, 1990; Rovira et al., 2007), robotics (Kaber et al., 2000), and driving (e.g., Endsley & Kiris, 1995), it is also important to understand the effect of LOAs in the ATC domain. Second, this study applied the PSW model of automation (Parasuraman et al., 2000) to the ATC context. That is, in this study, automation was applied to different stages of information processing: information acquisition, information analysis, decision and action selection, and action implementation. There have been limited studies (e.g., Kaber et al., 2006; Rovira et al., 2007) evaluating the PSW model of automation, and validating this model requires empirical work. Third, this study added to the literature on automation and SA (e.g., Endsley & Kiris, 1995; Kaber et al., 2006; Kaber et al., 2000) by examining how the SA of individuals differs when working with varying LOAs in an ATC task. The results were consistent with the active processing hypothesis, indicating that increasing the level of involvement
improves operator SA and consequently leads to better performance when the automated aid fails. Fourth, this study helped to better understand the process of SA in the ATC domain by examining the extent to which individuals monitor various aircraft attributes (i.e., call sign, location, altitude, heading, and destination) when working with different LOAs. Aggregating the recall accuracy scores of all the aircraft attributes to obtain a total SA score (e.g., Endsley, 1995a) tells us very little about how operators come to comprehend ATC situations. Fifth, this study examined whether the relationship between LOA and ATC performance (i.e., the time taken to detect an upcoming collision) following automation failure was mediated by SA. Though prior studies on LOA and SA (e.g., Endsley & Kiris, 1995; Kaber et al., 2001; Endsley & Kaber, 1999) have identified the loss of SA as the central factor behind performance decrements following automation failure, none of them established the relationship between SA and performance. This study examined whether SA mediates the relationship between LOA and performance following automation failure. Finally, this study added to the literature on the 'first automation failure effect' (e.g., Wickens & Xu, 2002). There are two lines of research that explain how operator performance differs after exposure to an automation failure. While one line of research argues that operators' detection of subsequent automation failures improves following the first automation failure, presumably due to calibration of trust in the automation (e.g., Merlo et al., 1999; Rovira et al., 2007), the other claims that as long as operator workload is high and the operator's confidence in the automation is high, operators continue to rely on it irrespective of occasional automation
failures (e.g., Parasuraman & Riley, 1997). No empirical work has examined how the SA of individuals working with various LOAs differs following an automation failure. This study examined the effects of the first automation failure on subsequent SA and performance when individuals worked with information acquisition, information analysis, decision and action selection, and action implementation automation in the ATC domain.

Practical Implications

The findings from the present experiment have important practical implications for automation design. The nation's air traffic is predicted to double by 2025. To meet the increasing traffic demands, the JPDO has proposed initiatives to modernize the U.S. National Airspace System. That plan involves the introduction of several automated aids that would help air traffic controllers cope with increasing traffic demands (JPDO, 2007). The SA of controllers is an important cognitive issue that needs to be considered when designing automated aids for the Next Generation Air Transportation System. Reduced involvement of the controller may prove hazardous, especially in situations when the automation fails. This experiment helped to better understand how SA varies when automation is applied to different stages of information processing. The results revealed that individuals had higher SA when automation was applied to information acquisition (i.e., automation of sensory processing) than when automation was applied to higher order information processing stages such as inference generation, decision making, and action implementation. In addition to SA, controller performance is also an important factor that needs to be considered when designing automated systems. This study examined the differences in
performance when individuals worked with different LOAs. There were clear benefits to high LOAs when the automated aids were reliable. Individuals working with information analysis, decision, and action implementation automation exhibited superior performance in the secondary task. Therefore, if 100% automation reliability can be guaranteed, high LOAs will be beneficial in enhancing operator performance in multi-task environments like ATC. However, when the automation was unreliable, individuals working with the information acquisition automation demonstrated better performance in the ATC task than the other LOAs. Thus, the cost associated with automation failure was higher when automation was applied to higher order stages of information processing. If 100% automation reliability cannot be assured, it therefore becomes important to consider, during automation design, the pitfalls associated with providing automation support to higher order information processing stages such as information analysis, decision and action selection, and action implementation. Rather than applying high LOAs to the central task of detecting and resolving aircraft collisions, the secondary tasks that controllers have to perform may be better candidates for high LOAs. The findings from this experiment have important implications for controller training and ATC display design as well. First, determining the aircraft attributes of the ATC situation that are monitored poorly under different LOAs will help in designing SA training programs targeted at improving the monitoring of these attributes. The results of the present study showed that SA for aircraft altitude and destination was lower when individuals worked with high LOAs than with information acquisition automation. SA for altitude and destination also mediated the relationship between LOA and performance following automation failure, with lower SA contributing to poor
collision detection performance. Therefore, if high LOAs are adopted in the future airspace system, SA training programs should be developed that direct a trainee's attention to aircraft altitude and destination. In addition to the development of SA training programs, ATC interfaces should be designed to highlight these important aircraft attributes.

Limitations and Future Directions

There are several limitations to this study. First, caution should be exercised in applying the results of this study to the ATC domain. The background of the participants and the fidelity of the ATC simulation environment limit my ability to generalize the findings of this study to the ATC domain. The participants in this study were not professional air traffic controllers and therefore may not have given as much importance to effectively monitoring their airspace for potential collisions and responding appropriately to automation failures. The effects of applying automation to higher stages of information processing need to be investigated with professional controllers. Second, the duration of this experiment was only 3 hours and the participants received only 1.5 hours of manual training. Future work should be directed towards examining the costs associated with automation failure with professional controllers who have had extensive manual experience in the ATC domain. Third, it could be argued that participants working with the high LOAs failed to understand that they were still required to monitor the ATC display when they were given an automated aid. However, before beginning training on the LOA scenarios, the participants were informed that the automated aids they were working with were not 100% reliable and that, in the event of an automation failure, it was still their responsibility to avoid collisions in the airspace. The
LOAs were fully reliable during the training scenarios so as to build up participants' trust in the automated aids. Therefore, after interacting with a fully reliable automated system for 35 minutes, participants may have forgotten the instruction that the automation could fail and begun to overrely on it. Future research should thus examine the effects of information analysis, decision, and action implementation automation on performance following automation failure when participants are told the true reliability of an automated aid (e.g., 80%, 85%). In this experiment, the participants were only told that the aid reliability was high but not 100%. In addition, future work should examine whether explaining the situations under which the automation algorithm can fail helps to reduce the misuse of information analysis, decision, and action implementation automation. Fourth, in this experiment, the individuals working with information acquisition automation were faster in detecting collisions following automation failure than those in the information analysis, decision, and action implementation conditions. A total of five aircraft were present in the ATC display in the test scenarios, and traffic density was not manipulated. Future work should therefore examine whether information acquisition automation would still be beneficial in helping individuals detect collisions earlier following automation failure under high traffic density conditions, compared to the other LOAs. Fifth, the automation malfunctions were not salient in the ATC display. Future work should examine whether the disadvantages of automating higher stages of information processing can be overcome by designing better displays in which the intermediate results of the automation are made available to the operator (e.g., Lee & See,
2004) and automation failures are made more salient (e.g., Metzger & Parasuraman, 2005). Finally, some limitations with respect to the SA assessment should also be noted. First, SA of the participants in this study was measured using the SAGAT methodology, and SA scores obtained using SAGAT may not be a true indicator of operator SA. This methodology has been criticized for being intrusive and for relying heavily on the operator's memory (e.g., Sarter & Woods, 1991). Another criticism of this technique is that if operators know where in the environment to look for a piece of information, they do not have to use their limited working memory resources to remember that information (Durso & Dattel, 2004). SA may have failed to mediate the relationship between LOA and performance following automation failure in the follow up scenario because SA measured using SAGAT was not sufficiently sensitive. Thus, future work should be directed towards studying SA under different LOAs using the SPAM methodology (Durso & Dattel, 2004), which uses response time to answer SA queries as a measure of SA and is more sensitive than accuracy. Second, during the SAGAT recall, participants were asked to recall aircraft location first. After recalling the locations of the aircraft that were on their screen during the SAGAT freeze, participants were given a map with the correct aircraft locations and asked to recall the other aircraft attributes. Asking them to recall aircraft location first might have biased them to pay more attention to the location attribute, which may be why the four LOAs did not differ in the monitoring of the location attribute. Though this methodology is consistent with prior studies, it is a limitation. Third, during the SAGAT recall, the extent to which individuals monitored the speed attribute could not be determined because, in the ATST simulator, all aircraft flew
at the 'fast' speed setting at virtually all times.


CHAPTER V

REFERENCES

Amaldi, P., & Leroux, M. (1995). Selecting relevant information in a complex environment: The case of air traffic control. In L. Norros (Ed.), 5th European Conference on Cognitive Science Approaches in Process Control (pp. 89-98). Espoo, Finland: VTT Automation.

Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182.

Biros, D. P., Daly, M., & Gunsch, G. (2004). The influence of task load and automation trust on deception detection. Group Decision and Negotiation, 13, 173-189.

Billings, C. E. (1991). Human-centered aircraft automation: A concept and guidelines (NASA Tech. Memorandum 103885). Moffett Field, CA: NASA Ames Research Center.

Bisseret, A. (1970). Mémoire opérationelle et structure du travail [Operational memory and structure of work]. Bulletin de Psychologie, 24, 280-294.

Broach, D. (2002). Functional requirements for the Windows®-based ATCS Pre-Training Screen (WinPTS). Oklahoma City, OK: Civil Aerospace Medical Institute Training and Organizational Research Laboratory.

Brudnicki, D. J., & McFarland, A. L. (1997). User request evaluation tool (URET) conflict probe performance and benefits assessment. Retrieved April 30, 2008, from www.mitrecaasd.org/library/documents/mp97w112.pdf


Craik, F. I. M., & Tulving, E. (1975). Depth of processing and the retention of words in episodic memory. Journal of Experimental Psychology: General, 104, 268-294.

Crocoll, W. M., & Coury, B. G. (1990). Status or recommendation: Selecting the type of information for decision aiding. In Proceedings of the Human Factors and Ergonomics Society 34th Annual Meeting (pp. 1524-1528). Orlando, FL: Human Factors and Ergonomics Society.

DeLucia, P. R., & Betts, E. T. (2008). Separated versus integrated displays in minimally invasive surgery. Proceedings of the Human Factors and Ergonomics Society. New York, NY: Human Factors and Ergonomics Society.

Dunlosky, J., Serra, M. J., & Baker, J. M. (2007). Metamemory. In F. T. Durso, R. S. Nickerson, S. T. Dumais, S. Lewandowsky, & T. J. Perfect (Eds.), Handbook of applied cognition (2nd ed., pp. 137-162). Chichester, UK: Wiley.

Durso, F. T., Batsakes, P. J., Crutchfield, J. M., Braden, J. B., & Manning, C. A. (2004). The use of flight progress strips while working live traffic: Frequencies, importance, and perceived benefits. Human Factors, 46, 32-49.

Durso, F. T., & Dattel, A. R. (2004). SPAM: The real-time assessment of SA. In S. Banbury & S. Tremblay (Eds.), A cognitive approach to situation awareness: Theory and application (pp. 137-154). Aldershot, UK: Ashgate.

Durso, F. T., Rawson, K. A., & Girotto, S. (2007). Comprehension and situation awareness. In F. T. Durso, R. S. Nickerson, S. T. Dumais, S. Lewandowsky, & T. J. Perfect (Eds.), Handbook of applied cognition (2nd ed., pp. 163-193). Chichester, UK: Wiley.


Durso, F. T., & Sethumadhavan, A. (2008). Situation awareness: Understanding dynamic environments. Human Factors, 50, 442-448.

Dzindolet, M. T., Pierce, L. G., Beck, H. P., & Dawe, L. A. (2002). The perceived utility of human and automated aids in a visual detection task. Human Factors, 44, 79-94.

Endsley, M. R. (1987). SAGAT: A methodology for the measurement of situation awareness (NOR DOC 87-83). Hawthorne, CA: Northrop Corporation.

Endsley, M. R. (1995a). Measurement of situation awareness in dynamic systems. Human Factors, 37, 65-84.

Endsley, M. R. (1995b). Towards a theory of situation awareness in dynamic environments. Human Factors, 37, 32-64.

Endsley, M. R. (1996). Automation and situation awareness. In R. Parasuraman & M. Mouloua (Eds.), Automation and human performance: Theory and applications (pp. 163-181). Mahwah, NJ: LEA.

Endsley, M. R. (2000). Theoretical underpinnings of situation awareness. In M. R. Endsley & D. J. Garland (Eds.), Situation awareness analysis and measurement (pp. 3-32). Mahwah, NJ: LEA.

Endsley, M. R., Bolte, B., & Jones, D. G. (2003). Designing for situation awareness: An approach to user-centered design. New York: Taylor & Francis.

Endsley, M. R., & Kaber, D. B. (1999). Level of automation effects on performance, situation awareness, and workload in a dynamic control task. Ergonomics, 42, 462-492.


Endsley, M. R., & Kiris, E. O. (1995). The out-of-the-loop performance problem and level of control in automation. Human Factors, 37, 381-394.

Endsley, M. R., Sollenberger, R. L., Nakata, A., & Stein, E. S. (2000). Situation awareness in air traffic control: Enhanced displays for advanced operations (DOT/FAA/CT-TN00/01). Atlantic City, NJ: U.S. Department of Transportation.

FAA (2004). Order 7110.65 Air Traffic Control. Retrieved June 1, 2008, from 128.173.204.63/courses/cee5614/cee5614_pub/FAA_ATC_handbook.pdf

Fisher, D. L., & Tan, K. C. (1989). Visual displays: The highlighting paradox. Human Factors, 31, 17-30.

Foos, P. W., Mora, J. J., & Tkacz, S. (1994). Student study techniques and the generation effect. Journal of Educational Psychology, 86, 567-576.

Gibson, J. J. (1962). Observations on active touch. Psychological Review, 69, 477-491.

Gould, J. E. (2002). Concise handbook of experimental methods for the behavioral and biological sciences. Boca Raton: CRC Press.

Gronlund, S. D., Ohrt, D. D., Dougherty, M. R. P., Perry, J. L., & Manning, C. A. (1998). Role of memory in air traffic control. Journal of Experimental Psychology: Applied, 3, 263-280.

Gugerty, L. J. (1997). Situation awareness during driving: Explicit and implicit knowledge in dynamic spatial memory. Journal of Experimental Psychology: Applied, 1, 42-66.

Hart, S. G., & Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In P. A. Hancock & N. Meshkati
(Eds.), Human mental workload (pp. 139-183). Amsterdam: North-Holland, Elsevier Science.

Held, R., & Hein, A. (1963). Movement-produced stimulation in the development of visually guided behavior. Journal of Comparative and Physiological Psychology, 56, 872-876.

Ishihara, S. (1951). Tests for color blindness. London: Lewis.

Jentsch, F., Barnett, J., Bowers, C. A., & Salas, E. (1999). Who is flying this plane anyway? What mishaps tell us about crew member role assignment and crew situation awareness. Human Factors, 41, 1-14.

Johnston, J. C., Horlitz, K. L., & Edmiston, R. W. (1993). Improving situation awareness displays for air traffic controllers. In R. S. Jensen & D. Neumesiter (Eds.), Proceedings of the 7th International Symposium on Aviation Psychology (pp. 328-334). Columbus, OH: Ohio State University.

JPDO (2007). Concept of operation for the Next Generation Air Transportation System. Retrieved October 29, 2008, from http://www.jpdo.gov/library/nextgen_v2.0.pdf

Kaber, D., & Endsley, M. (2004). The effects of level of automation and adaptive automation on human performance, situation awareness and workload in a dynamic control task. Theoretical Issues in Ergonomic Science, 5, 113-153.

Kaber, D. B., Onal, E., & Endsley, M. R. (2000). Design of automation for telerobots and the effect on performance, operator situation awareness, and subjective workload. Human Factors and Ergonomics in Manufacturing, 10, 409-430.

Kaber, D. B., Perry, C. M., Segall, N., McClernon, C. K., & Prinzel, L. J. (2006). Situation awareness implications of adaptive automation for information processing in an
air traffic control-related task. International Journal of Industrial Ergonomics, 36, 447-462.

Kantowitz, B. H., Hanowski, R. J., & Kantowitz, S. C. (1997). Driver acceptance of unreliable traffic information in familiar and unfamiliar settings. Human Factors, 39, 164-176.

Kessel, C. J., & Wickens, C. D. (1982). The transfer of failure-detection skills between monitoring and controlling dynamic systems. Human Factors, 24, 49-60.

Koriat, A., Pearlman-Avnion, S., & Ben-Zur, H. (1998). The subjective organization of input and output events in memory. Psychological Research, 61, 295-307.

Larish, J. F., & Andersen, G. J. (1995). Active control in interrupted dynamic spatial orientation: The detection of orientation change. Perception and Psychophysics, 57, 533-545.

Lee, J. D., & See, K. A. (2004). Trust in automation: Designing for appropriate reliance. Human Factors, 46, 50-80.

Lorenz, B., Di Nocera, F., Röttger, S., & Parasuraman, R. (2002). Automated flight management in a simulated space flight microworld. Aviation, Space, and Environmental Medicine, 73, 886-897.

Lorenz, B., & Parasuraman, R. (2007). Automated and interactive real-time systems. In F. T. Durso, R. S. Nickerson, S. T. Dumais, S. Lewandowsky, & T. J. Perfect (Eds.), Handbook of applied cognition (2nd ed., pp. 415-441). Chichester: John Wiley and Sons.

Magliano, J. P., Zwaan, R. A., & Graesser, A. C. (1999). The role of situational continuity in narrative understanding. In S. R. Goldman & H. van Oostendorp (Eds.), The construction of mental representations during reading (pp. 219-245). Mahwah, NJ: LEA.


Means, B., Mumaw, R., Roth, C., Schalger, M., McWilliams, E., Gagne, E., Rice, V., Rosenthal, D., & Heon, S. (1988). ATC training analysis study: Design of the next-generation ATC training system (OPM Work Order No. 342-036). Richmond, VA: HumRRO International.

Merlo, J. L., Wickens, C. D., & Yeh, M. (1999). Effect of reliability on cue effectiveness and display signaling (Tech. Report ARL-99-4/FED-LAB-99-3). Savoy: University of Illinois, Aviation Research Lab.

Meshkati, N., Hancock, P. A., Rahimi, M., & Dawes, S. M. (1995). Techniques in mental workload assessment. In J. Wilson & E. N. Corlett (Eds.), Evaluation of human work. London: Taylor and Francis.

Metzger, U., & Parasuraman, R. (2001). The role of the air traffic controller in future air traffic management: An empirical study of active control vs. passive monitoring. Human Factors, 43, 519-528.

Metzger, U., & Parasuraman, R. (2005). Automation in future air traffic management: Effects of decision aid reliability on controller performance and mental workload. Human Factors, 47, 35-49.

Mogford, R. H. (1997). Mental models and situation awareness in air traffic control. International Journal of Aviation Psychology, 7, 331-341.

Moray, N. (1986). Monitoring behavior and supervisory control. In K. Boff (Ed.), Handbook of perception and human performance (pp. 40/1-40/51). New York: Wiley.


National Transportation Safety Board (NTSB). (2001). Annual review of aircraft accident data, U.S. air carrier operations: Calendar year 2001 (Report No. ARC06-01). Washington, DC: Author.

Nolan, M. S. (1999). Fundamentals of air traffic control (3rd ed.). Pacific Grove, CA: Brooks/Cole.

Parasuraman, R. (1987). Human-computer monitoring. Human Factors, 29, 695-706.

Parasuraman, R. (2000). Designing automation for human use: Empirical studies and quantitative models. Ergonomics, 43, 931-951.

Parasuraman, R., Molloy, R., & Singh, I. L. (1993). Performance consequences of automation-induced 'complacency'. International Journal of Aviation Psychology, 3, 1-23.

Parasuraman, R., & Riley, V. (1997). Humans and automation: Use, misuse, disuse, and abuse. Human Factors, 39, 230-253.

Parasuraman, R., Sheridan, T. B., & Wickens, C. D. (2000). A model for types and levels of human interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics, 30, 286-297.

Preacher, K. J., & Leonardelli, G. J. (2001). Calculation for the Sobel test: An interactive calculation tool for mediation tests [Computer software]. Retrieved from http://www.people.ku.edu/~preacher/sobel/sobel.htm

Rantanen, E. M., & Nunes, A. (2005). Hierarchical conflict detection in air traffic control. International Journal of Aviation Psychology, 15, 339-362.


Remington, R. W., Johnston, J. C., Ruthruff, E., Gold, M., & Romera, M. (2000). Visual search in complex displays: Factors affecting conflict detection by air traffic controllers. Human Factors, 42, 349-366.

Rovira, E., McGarry, K., & Parasuraman, R. (2007). Effects of imperfect automation on decision making in a simulated command and control task. Human Factors, 49, 76-87.

Sarter, N. B., & Schroeder, B. (2001). Supporting decision making and action selection under time pressure and uncertainty: The case of in-flight icing. Human Factors, 43, 573-583.

Sarter, N. B., & Woods, D. D. (1991). Situation awareness: A critical but ill-defined phenomenon. International Journal of Aviation Psychology, 1, 45-57.

Sarter, N. B., & Woods, D. D. (1995). How in the world did we ever get into that mode? Mode error and awareness in supervisory control. Human Factors, 37, 5-19.

Sheridan, T. B., & Verplank, W. L. (1978). Human and computer control of undersea teleoperators. Cambridge, MA: MIT Man-Machine Laboratory.

Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equations models. In S. Leinhart (Ed.), Sociological methodology (pp. 290-312). San Francisco: Jossey-Bass.

Tabachnick, B. G., & Fidell, L. S. (2001). Using multivariate statistics (4th ed.). New York: Allyn and Bacon.

Treisman, A. (1986). Features and objects in visual processing. Scientific American, 255, 114-125.


Walker, G., Stanton, N. A., & Young, M. S. (2001). Where is technology driving cars? A technology trajectory of computing in vehicles. International Journal of Human Computer Interaction, 13, 203-299.

Wickens, C. D., & Dixon, S. (2005). Task priorities and imperfect automation (Tech. Report AHFD-05-17/MAAD-05-5). Savoy, IL: University of Illinois, Aviation Human Factors Division.

Wickens, C. D., & Kessel, C. (1979). The effects of participatory mode and task workload on the detection of dynamic system failures. IEEE Transactions on Systems, Man, and Cybernetics, 1, 23-34.

Wiener, E. L., & Curry, R. E. (1980). Flight deck automation: Promises and problems. Ergonomics, 23, 995-1011.

Willems, B., & Truitt, T. R. (1999). Implications of reduced involvement in en route air traffic control (DOT/FAA/CT-TN-99/22). Atlantic City, NJ: Federal Aviation Administration Technical Center.

Willems, B., Allen, R. C., & Stein, E. S. (1999). Air traffic control specialist visual scanning: 2. Task load, visual noise, and intrusions into controller airspace (DOT/FAA/CT-TN99/23). Atlantic City, NJ: Federal Aviation Administration.

Wickens, C. D., & Xu, X. (2002). Automation trust, reliability and attention (Tech. Rep. AHFD-02-14/MAAD-02-2). Savoy: University of Illinois, Aviation Research Lab.

Yeh, M., & Wickens, C. D. (2001). Display signaling in augmented reality: Effects of cue reliability and image realism on attention allocation and trust calibration. Human Factors, 43, 355-365.


Yeh, M., Wickens, C. D., & Seagull, F. J. (1999). Target cueing in visual search: The effects of conformity and display location on the allocation of visual attention. Human Factors, 41, 524-542.

Young, L. R. (1969). On adaptive manual control. IEEE Transactions on Man-Machine Systems, 10, 292-331.

Zwaan, R. A., Langston, M. C., & Graesser, A. C. (1995). The construction of situation models in narrative comprehension: The event-indexing model. Psychological Science, 6, 292-297.

Zwaan, R. A., Magliano, J. P., & Graesser, A. C. (1995). Dimensions of situation model construction in narrative comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 386-397.

Zwaan, R. A., & Radvansky, G. A. (1998). Situation models in language comprehension and memory. Psychological Bulletin, 123, 162-185.

Zwaan, R. A., Radvansky, G. A., Hilliard, A. E., & Curiel, J. M. (1998). Constructing multidimensional situation models during reading. Scientific Studies of Reading, 2, 199-220.


APPENDIX A

EXTENDED LITERATURE REVIEW

Situation Awareness

Definitions of SA

Situation awareness (SA) is a term that denotes one's understanding of the environment in which one is working. Several definitions of SA have been proposed in the literature. Because SA originated in the aviation community, several definitions are specific to the aviation domain. For example, Tolk and Keether (1982) defined SA as the ability to predict the current and future disposition of Red and Blue aircraft. Thus, Tolk and Keether's definition of SA emphasizes two components: understanding of the present and the ability to make predictions about the future. Harwood, Barnett, and Wickens (1988) defined SA in terms of four components: spatial awareness (i.e., the pilot's knowledge of spatial relationships between objects), identity awareness (i.e., the pilot's knowledge of system state variables), responsibility awareness (i.e., the pilot's knowledge of whether she or the automation is in charge), and temporal awareness (i.e., knowledge of the occurrence of events as the mission evolves). There are also definitions that equate SA with operator knowledge. For example, Whitaker and Klein (1988) defined SA as 'the pilot's knowledge about his surroundings in light of his mission's goals' (p. 321). Similarly, Fracker (1988) defined SA as 'the knowledge that results when attention is allocated to a zone of interest at a level of abstraction' (p. 102). Relatedly, according to Regal, Rogers, and Boucek (1988), SA 'means that the pilot has an integrated understanding of factors that will contribute to the safe flying of the aircraft under normal or non-normal conditions. The broader this knowledge is, the greater the degree of situational awareness' (p. 65).


A more general definition of SA, one that can be applied to any domain, was proposed by Endsley (1988, 1995b). Endsley (1988) defined SA as 'the perception of the elements in the environment within a volume of time and space, comprehension of their meaning, and the projection of their status into the future' (p. 3). Therefore, the first step in achieving SA is to perceive the environment through the senses (e.g., visual, auditory, tactile). The second step is to understand what the perceived data mean in relation to operator goals. The third step is to use the understanding of the current situation to make predictions about the future. For example, an air traffic controller needs to perceive all the aircraft in her airspace, comprehend what the perceived information means for her goals, and predict future states of the environment so as to take timely measures to prevent collisions. In Endsley's (1995b) model of SA, time is a vital component. That is, an important part of SA is knowing how much time is available before an action must be taken or an event will occur. In addition to time, space is also an integral part of SA. That is, operators can filter the elements in the environment based on space and pay more attention to those elements that are nearer to them. For example, a driver may pay close attention to nearby cars, which pose more danger than cars that are farther away. Similar to Endsley (1995b), Dominguez (1994) put forth a more global definition of SA. According to Dominguez (1994), SA is the 'continuous extraction of environmental information, the integration of this information with previous knowledge to form a coherent mental picture, and the use of that picture in directing further perception and anticipating future events'. Based on this definition, SA can be considered an overall 'mental picture'.


Though most of the earlier work on SA was conducted in the aviation domain (mainly on pilots), research on SA has also been extended to air traffic control (e.g., Durso et al., 1998), medicine (e.g., Gaba, Howard, & Small, 1995), driving (e.g., Gugerty & Tirre, 2000), the military (e.g., Matthews, Strater, & Endsley, 2004), and robotics (e.g., Yanco & Drury, 2004). The next section will examine the theoretical foundations of SA.

Theoretical Foundations of SA

Theoretical work on SA began more than a decade ago (e.g., Adams, Tenney, & Pew, 1995; Endsley, 1988; Durso & Gronlund, 1999; Sarter & Woods, 1991; Smith & Hancock, 1995) and continues today (e.g., Bedny, Karwowski, & Jeng, 2004; Durso, Rawson, & Girotto, 2007; Durso & Sethumadhavan, 2008; Tenney & Pew, 2006). Broadly, there are two theoretical views of SA (Tenney & Pew, 2006). One view draws on information processing theory (Neisser, 1967), whereas the other is based on the ecological approach to perception (Gibson, 1979). Endsley's (1988; 1995b; Endsley, Bolte, & Jones, 2003) model of SA is based on information processing theory. According to this model, working memory (WM) plays an important role in achieving SA. Integrating perceived information with prior knowledge, as well as making projections about the future, requires WM resources. Because of limits in WM capacity, WM forms a bottleneck for SA. In addition to WM, long-term memory structures such as mental models also help to achieve SA. Mental models help operators understand what is happening without taxing WM. Goals and expectations also play a major role in achieving SA. For example, operators' goals and expectations can influence where they direct attention and how they interpret information. Endsley's (1995b) model

considers SA as a product of cognitive processes. Therefore, methodologies to measure SA focus on recall techniques aimed at deriving the contents of WM and long-term WM. Unlike information processing psychologists, ecological psychologists emphasize the close coupling between perception and action. According to the ecological view, SA arises from the interaction between operators and the environment (e.g., Adams et al., 1995; Flach, 1995; Smith & Hancock, 1995). Thus, instead of focusing on 'what is inside the head', they focus on 'what the head is inside of' (Tenney & Pew, 2006). Therefore, ecological psychologists focus on studying the affordances of objects and events and how adaptation occurs in naturalistic environments (Tenney & Pew, 2006). In summary, SA as a term originated in the aviation community and spread to the research community. Currently, SA is a pervasive term used in several domains. Two main theoretical frameworks of SA have been proposed. While one framework is based on the information processing view and focuses on obtaining the contents of memory to assess SA, the other framework is based on the ecological approach to perception and focuses on understanding operators' SA in close relation to the environment in which they are working. The next section will examine the factors that influence SA.

Factors that Affect SA

Cognitive Abilities

Individual differences in cognitive abilities play an important role in predicting SA. For example, Carretta, Perry, and Ree (1996) showed that when flight experience was controlled, cognitive abilities (i.e., verbal working memory, spatial working memory, spatial reasoning, and divided attention) predicted the SA of F-15 pilots, whereas psychomotor skills and personality variables did not. Similarly, Endsley and Bolstad
(1994) showed that spatial abilities, perceptual skills, attention sharing, and pattern matching predicted the SA of pilots in an air-to-air fighter sweep mission. More recently, Sohn and Doane (2004) showed that spatial WM predicted the SA of novice pilots, whereas spatial long-term WM predicted the SA of expert pilots. The role of WM in SA has also been demonstrated in domains besides aviation, such as driving. For example, using a simulated driving task, Gugerty and Tirre (2000) showed that working memory, perceptual-motor ability, dynamic visual processing ability, and temporal processing ability correlated with SA measures in driving.

Expertise

Experts tend to have better SA than novices. For example, Durso et al. (1995) showed that expert chess players were faster than novices in answering SA queries. There is also evidence that experts are better than novices at generating predictions, which form an integral part of SA. For example, Doane and Sohn (2004) presented expert and novice pilots with a control statement (e.g., 'left pressure on aileron'), a flight situation depicted using cockpit displays, and a change in flight situation. The pilots had to judge whether the change in flight situation was consistent or inconsistent with their expectations. Doane and Sohn (2004) found that experts were more accurate than novices in making projections about the future flight state, owing to superior access to appropriate mental models. In summary, individual differences in cognitive abilities and expertise play an important role in predicting SA. The next section will examine the advantages of studying SA in addition to performance in various domains.

Advantages of Studying SA

It is important to study SA because loss of SA has been implicated in performance failures in several safety-critical domains. For example, Endsley (1990) showed that fighter pilots who were aware of the existence of enemy aircraft were more likely to destroy the enemy targets in a simulated combat mission. Similarly, through an analysis of en route operational errors that occurred in 1993, Durso, Truitt, Hackworth, Crutchfield, and Manning (1998) found that controllers who were unaware that a loss of separation was happening were involved in more severe operational errors. More recently, using a simulated air traffic control environment, Durso, Bleckley, and Dattel (2006) showed that participants who were slower and less accurate in answering SA queries about the past had longer handoff delay times. Likewise, Woodhouse and Woodhouse (1995) attributed controlled-flight-into-terrain accidents to a loss of pilot SA, rather than a loss of pilot skills. Finally, in the domain of driving, of all the components of driving skill, the only one related to involvement in traffic accidents is hazard perception, or SA (Horswill & McKenna, 2004). In summary, SA predicts performance in various domains, with lower SA resulting in poorer operator performance. Therefore, to improve operator performance, it is important to understand the reasons behind lower SA and to develop training programs or design interventions that improve SA. The next section will discuss the methodologies used to measure SA.

Methodologies to Measure SA

There exists a plethora of techniques that can be used to evaluate an operator's SA (see Jeannot, Kelly, & Thompson, 2003, for a review). This section will describe the
commonly used SA measurement techniques. These have been classified into three major categories: objective measures of SA, subjective measures of SA, and implicit performance measures.

Objective Measures

One of the most widely used methods to assess operator SA is the Situation Awareness Global Assessment Technique (SAGAT; Endsley, 1987a, 1995a). The first step in evaluating SA involves identifying the information needs of operators through a goal-directed task analysis (GDTA; Endsley et al., 2003). The GDTA methodology aims at identifying the goals, the decisions that need to be made in order to achieve the goals, and the information requirements for making each of the decisions, through interviews with domain experts. Once the information needs are identified, SAGAT queries are constructed. For example, if one of the information needs of an air traffic controller, identified through the task analysis, is 'aircraft altitude', a relevant SAGAT query would be 'What is the altitude of SW320 (aircraft ID)?'. These queries are then presented to the operator at random times during a scenario, by blanking the air traffic control display. The accuracy in answering the SAGAT queries is used as a measure of SA: the higher the accuracy, the higher the operator's SA. Endsley (2000) described a set of guidelines to be adopted when administering SAGAT. It is recommended that SAGAT freezes not occur in the first 3-5 minutes of a scenario. This ensures that participants have sufficient time to obtain an understanding of the events occurring in the scenario. In addition, the maximum number of SAGAT freezes permissible during a 15-minute experimental scenario is three. Further, it is
recommended that SAGAT stops not occur within 1 minute of each other, so as to ensure that participants can regain their SA following the previous stop. SAGAT has been criticized for several reasons. First, freezing the simulation to collect SAGAT data has been criticized for being intrusive and impacting operator performance (e.g., Sarter & Woods, 1991). Another criticism of this technique is its overreliance on conscious memory (e.g., Durso, Rawson, & Girotto, 2007). SAGAT collects SA data by blanking all the displays, depriving the participants of the task context, and asking them to answer SA queries by relying solely on their memory. A participant's failure to answer a SAGAT query does not necessarily mean that the participant had lower SA while performing the task, because the answer to the query could simply have been forgotten by the time the query was presented (Durso et al., 2007). The Situation Present Assessment Method (SPAM; Durso & Dattel, 2004) is another technique that can be employed to measure SA objectively. In this technique, in addition to the accuracy in answering SA queries, the response time in answering these queries is also recorded. This technique has two main advantages over SAGAT (Durso & Dattel, 2004). First, reaction time is a more sensitive measure than accuracy. For example, two operators may be equally accurate in answering SA queries, though one operator may be faster than the other; reaction time to the queries thus helps to distinguish between the two operators. Second, this technique is based on the notion that SA is best assessed while the situation is present, and it therefore does not deprive operators of the task context. Because the situation remains present when the SA queries are presented, operators do not need to use their limited WM resources to recall all the information in the environment. Based on this technique, an operator who knows
where in the environment to look for a piece of information will be faster than an operator who does not know where to look. This technique also tries to separate workload from SA. Operators are first presented with a warning that an SA query is in the queue. The time from the appearance of the warning to the readiness of the operator to answer the query is taken as an index of operator workload. The time to answer the query itself is taken as an index of operator SA. Apart from objective measures of SA, subjective measures of SA are also widely used. Commonly used subjective SA measures are discussed in the next section.

Subjective Measures

Subjective measures of SA typically involve operators making judgments about their own SA. One of the most popular subjective SA measurement techniques is the Situational Awareness Rating Technique (SART; Taylor, 1990). SART measures SA by asking operators to make ratings on three dimensions: demand on attentional resources, supply of attentional resources, and understanding. An overall SA score is obtained by using the formula: SA = Understanding - (Demand - Supply). The ratings are made using a seven-point Likert scale and are often collected at the end of each simulation exercise. The main advantages of this technique are that it is non-intrusive and easy to administer. The main criticism is that operators are not good at rating their own SA and often confuse SA with performance (Jones, 2000). A further advantage, however, is that the technique helps researchers understand operators' confidence in their SA, or their meta-SA (Durso et al., 2007). Understanding operators' perception of their own SA is important because only if they realize that they have poor SA can they summon help.

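To make these scoring rules concrete, the following minimal sketch works through the SART formula and SPAM's timing decomposition. All ratings, timestamps, and names are made up for illustration and are not drawn from the techniques' original publications.

def sart_score(understanding, demand, supply):
    """SART: SA = Understanding - (Demand - Supply), each rated on a 1-7 scale."""
    for rating in (understanding, demand, supply):
        if not 1 <= rating <= 7:
            raise ValueError("SART ratings use a seven-point scale (1-7)")
    return understanding - (demand - supply)

# Hypothetical operator: good understanding (6), high demand (5), moderate supply (4).
print(sart_score(6, 5, 4))  # 6 - (5 - 4) = 5; possible scores range from -5 to +13

# SPAM timing decomposition (hypothetical timestamps, in seconds):
warning_shown, operator_ready, query_answered = 0.0, 2.5, 6.0
workload_index = operator_ready - warning_shown  # warning -> ready: workload
sa_index = query_answered - operator_ready       # query -> answer: SA
print(workload_index, sa_index)  # 2.5 3.5
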
Another popular subjective SA measurement technique is the Situation Awareness Behaviorally Anchored Rating Scale (SA/BARS; Neal et al., 1998). In this technique, an expert observer rates a participant's performance on a seven-point rating scale: the higher the score on the SA/BARS, the higher the participant's SA. The main disadvantage of this technique is that observers have to rely on the overt actions and verbalizations of participants to rate their SA (Jones, 2000). Implicit measures of SA are described in the next section.

Implicit Performance Measures

Implicit performance measures involve embedding SA-revealing events into scenarios that require participants to exhibit specific behaviors. For example, Andre, Wickens, Boorman, and Boschelli (1991) assessed the SA of participants in a simulated flight task using implicit performance measures. They froze the simulation display at random times, throwing participants into an unpredictable bank and pitch angle. The time taken by pilots to recover from these disorienting events was used as an implicit measure of SA. The advantage of using implicit SA measures is that they avoid the artificiality produced by interrupting participants with queries in the midst of their task. In summary, there exist several techniques to measure operator SA in dynamic environments. These techniques range from self or observer ratings of SA to objective and implicit measures of SA. The next section will examine the distinction made in the literature between the product and process of SA.

Product vs. Process

There has been considerable debate in the SA research community as to whether SA is a product or a process. For example, Endsley (1995b) argues that SA is a product of cognitive processes. Research conducted to understand the product of SA focuses on memory recall techniques, such as SAGAT, that rely on the operator's conscious memory. In this technique, SA queries are presented to participants at random occasions and the accuracy in answering these queries is used as a measure of SA. The problem with this technique is that it does not tell us anything about how SA was acquired (Durso et al., 2007; Durso & Sethumadhavan, 2008). In other words, it does not tell us anything about the process of SA. Other researchers argue that SA is a process. For example, according to Sarter and Woods (1991), SA is a 'variety of cognitive processing activities'. That is, SA refers to the perceptual and cognitive activities involved in constructing and updating the state of awareness (Adams et al., 1995). It is very important to understand the process of SA. A better understanding of how operators come to comprehend situations will help human factors practitioners develop training programs and design interventions that improve operators' SA and, consequently, their performance. The process of SA in dynamic environments can be better understood by borrowing from the more extensively researched domain of reading comprehension. Durso et al. (2007) draw an analogy between reading comprehension and situation comprehension (or SA). Kintsch and van Dijk (1978; Kintsch, 1992) distinguished three levels of reading comprehension: the surface level, the textbase, and situation models. The surface level consists of the linguistic information in a sentence, whereas the textbase consists of the
meaning extracted from the text input. Situation models involve the integration of the text content with the reader's prior knowledge to form a more coherent understanding of the textual situation (Kintsch, 1993). A situation model differs from the textbase in that it requires a deeper level of understanding (Kintsch, 1993). Specifically, situation models 'represent what the text is about, not the text itself' (Glenberg, Meyer, & Lindem, 1987, p. 70). Using the framework of Kintsch and van Dijk (1978), Durso et al. (2007) defined three levels of situation comprehension: the surface level, which consists of the perceptual details from a scene; the eventbase, which consists of information extracted from the perceptual input; and situation models, in which the eventbase is integrated with prior knowledge to form a coherent understanding of the situation. There has been considerable research in reading comprehension to understand the dimensions of situation models (e.g., Magliano, Zwaan, & Graesser, 1999; Zwaan, Magliano, & Graesser, 1995; Zwaan & Radvansky, 1998; Zwaan, Radvansky, Hilliard, & Curiel, 1998). Similar work is possible in safety-critical domains such as air traffic control, robotics, and medicine to understand the dimensions of situation models in those settings. This will help to obtain a better understanding of the process of SA in these domains. The next section will examine the empirical research conducted to understand the dimensions of situation models in reading comprehension.

Situation Models in Reading Comprehension

Readers monitor multiple dimensions of the situation model during narrative comprehension. Zwaan, Langston, and Graesser (1995) proposed the event-indexing model, according to which narratives are indexed along various dimensions that include protagonist, time, intentionality, causality, and space. This means that readers keep track
of 'who' was involved in an event, as well as 'when', 'where', 'why', and 'how' the event occurred. According to this model, if a sentence refers to a spatial setting that is different from the one encountered by readers in the previous sentence, readers will update the space index. Similarly, if a sentence indicates a temporal, protagonist, goal, or causal shift, readers update the temporal, protagonist, goal, or causal index, respectively. Evidence for the event-indexing model was provided by Zwaan, Magliano, and Graesser (1995), who collected sentence reading times. They found that sentence reading times increased with temporal and causal discontinuities, though this was not the case for spatial discontinuities. They concluded that readers monitor multiple dimensions of the situation model during narrative comprehension. Zwaan, Radvansky, Hilliard, and Curiel (1998) extended these findings by including all five dimensions. They found that sentence reading times increased with discontinuities in the protagonist, time, goal, and causality dimensions, indicating that these dimensions are monitored during narrative comprehension. The following sections will examine the empirical findings that demonstrate the extent to which readers monitor protagonist, intentionality, time, space, and causality.

Protagonist

Zwaan and Radvansky (1998) termed the protagonist the 'meat' of situation models. For example, Scott-Rich and Taylor (2000) showed that the coherence ratings (i.e., 'how well integrated the narrative is overall') that readers assigned to narratives were lowest when the narratives involved protagonist shifts, compared to spatial or temporal shifts. They also found that cohesion ratings (i.e., 'how well the sentence fits with the previous sentence') were lowest when sentences involved protagonist shifts.
Scott-Rich and Taylor (2000) concluded that the protagonist is the central feature in narratives. In addition to the protagonists themselves, readers also monitor the traits of the protagonists. For example, Myers, O'Brien, Albrecht, and Mason (1994) showed that readers were aware of the inconsistency between a protagonist action (e.g., ordering a cheeseburger) described in the target sentence and a protagonist trait (e.g., being a vegetarian) described previously, resulting in longer reading times for the target sentence in the inconsistent version of the narrative compared to the control version. There is also evidence that readers monitor the emotional state of the protagonist (e.g., Komeda & Kusumi, 2006). In Komeda and Kusumi's (2006) study, participants read narratives that involved an emotional shift. For example, the protagonist was described as being worried and later described as being relieved. When emotional shifts occurred, reading times for the target sentence increased regardless of whether the participants were told to empathize with the protagonist or to read the narratives naturally. In summary, readers keep track of the protagonists, their traits, and their emotional states. The next section will discuss the empirical findings that demonstrate the extent to which readers monitor the intentions of the protagonist.

Intentionality

In addition to the protagonists, readers also keep track of the goals and motivation of the protagonist (e.g., Fletcher & Bloom, 1988; Linderholm, van den Broek, Robertson, & Sundermier, 2004; Trabasso & Wiley, 2005; van den Broek, Young, Tzeng, & Linderholm, 1999). Zwaan and Radvansky (1998) termed intentionality the 'backbone' that helps readers link events together, thereby improving their understanding of the narrative. This means that information about goals is highly available in a reader's
memory. Evidence for the high availability of goal information was provided by Myers and Duffy (1990), who found that when participants were asked to recall details about a narrative they had read earlier, they recalled the goals of the protagonist better than other information. Not all goal information is maintained at a high level of availability, however. For example, Lutz and Radvansky (1997) showed that failed-goal information is more available to readers than completed-goal information. In their study, participants were presented with a 'completed goal' narrative or a 'failed goal' narrative. In the completed-goal version, the protagonist was described as having a goal, which she later completed. For example, Betty (the protagonist) was described as wanting to give her mother a present. She was then described as having gone to the store, finding a purse, and giving it to her mother. In the failed-goal version, the protagonist failed to achieve the goal. For example, Betty was described as wanting to give her mother a present but being unable to do so because all the gifts in the store were expensive. During the course of reading the narratives, the participants were probed with questions such as 'Did Betty want to buy her mother a present?' Lutz and Radvansky (1997) found that reaction times to the questions were faster in the failed-goal version than in the completed-goal version, indicating that failed-goal information was more available to the readers than completed-goal information. In summary, readers monitor the goals of the protagonist to understand the cause-and-effect relations between events presented in narratives. Further, incomplete goal information is maintained at a higher level of availability than completed goal
information. The next section will discuss the studies conducted to understand the extent to which the temporal dimension is monitored during text comprehension.

Time

Readers monitor the temporal dimension during narrative comprehension. For example, Zwaan (1996) showed that the reading times of sentences involving a time shift (e.g., 'An hour later') were longer than those of sentences that did not involve a time shift (e.g., 'A moment later'). Updating the temporal index requires cognitive resources, as signaled by the increase in reading times. Recent research using eye movement recording has also shown that readers keep track of temporal information while reading narratives (e.g., Rinck et al., 2003). In Rinck et al. (2003), participants read narratives in which the temporal information presented in a later sentence was either consistent or inconsistent with the temporal information presented in an earlier sentence. Rinck et al. (2003) found that participants regressed to the earlier sentence that contained the temporal order information and fixated on it more often when there were temporal inconsistencies. They concluded that temporal inconsistency slowed down the updating of the reader's situation model; readers attempted to resolve the inconsistency by looking back at the earlier information and rereading it. In addition to explicitly stated temporal information, readers also incorporate implicitly stated temporal information into their situation model (e.g., Rinck, Hahnel, & Becker, 2001). In Rinck et al. (2001), participants read narratives with explicit order information (e.g., 'Markus' train arrived 20 minutes before Claudia's train') or implicit order information (e.g., 'Markus' train arrived at 4.10 pm and Claudia's train arrived at 4.30 pm'). They then read a target sentence that was temporally consistent or inconsistent
with the explicit or implicit order information. The reading time of the temporally inconsistent target sentence was longer than that of the temporally consistent target sentence. However, there were no differences in the reading times of the implicit and explicit order versions. This indicates that readers keep track of temporal discontinuities even if the information is implicitly stated. In a related experiment, Rinck et al. (2001) asked participants to read a temporally consistent or inconsistent version of a narrative and report anything strange or wrong in it. They found that reading times for the temporally inconsistent target sentence were longer than for the consistent version, even for participants who were unable to report the inconsistency. That is, temporal inconsistencies increased reading times even when readers could not report the inconsistency in the narrative. Rinck et al. (2001) suggested that nonconscious automatic processes may play a role in the construction and updating of situation models. Finally, there is evidence that readers use temporal information as event boundaries during narrative comprehension. For example, Speer and Zacks (2005) instructed participants to identify places in a narrative where one activity ended and another began. That is, the participants indicated event boundaries by placing a line between two words. They found that participants placed event boundaries before sentences involving a temporal change more often than before other sentences (e.g., those involving an object shift). Participants also marked event boundaries more often preceding sentences that involved a larger time shift (e.g., 'An hour later') than a smaller time shift (e.g., 'A moment later'). In summary, readers are sensitive to changes in the temporal dimension during narrative comprehension, irrespective of whether the temporal information is explicitly or
implicitly stated. The next section will discuss the studies conducted to examine the extent to which readers monitor causality during narrative comprehension.

Causality

Causal inferences. Readers make causal inferences during narrative comprehension. Causal inferences are cued by causal connectives such as 'because', 'therefore', 'consequently', or 'so' (e.g., Caron, Micko, & Thuring, 1988). Causality is also inferred by readers when causal connectives are not present in sentences; in such cases, they make causal inferences using their prior knowledge. For example, Singer, Halldorson, Lee, and Andrusiak (1992) showed that participants were faster in responding to 'Does water extinguish fire?' when presented with 'Mary poured the bucket of water on the fire. The fire went out.' rather than 'Mary placed the bucket of water by the fire. The fire went out.' This indicates that readers used their prior knowledge (i.e., water extinguishes fire) to make causal inferences. In addition to explicitly stated causal information, readers also incorporate implicitly stated causal information into their situation model. For example, Singer and Halldorson (1996) presented participants with a motive-outcome statement that required an inference (e.g., 'Terry was unhappy with his dental health. He phoned the dentist.') or a statement in which the information required for the inference was explicitly stated (e.g., 'Terry was unhappy with his dental health. So he phoned the dentist for an appointment.'). This statement was followed by two intervening sentences, after which the question 'Do dentists require appointments?' was presented and the answer time recorded. Singer and Halldorson (1996) found that there were no differences in the
answer times for the implicit and explicit statements, indicating that causal inferences are made by readers even if the inference is not explicitly stated in the text.

Predictive inferences. In addition to making causal inferences between events, readers also make predictive inferences about the causal consequences of events during narrative comprehension. For example, Klin, Murray, Levine, and Guzman (1999) found that participants' responses to the word 'steal' were faster when the protagonist in the narrative was described as having lost his job and wanting to buy a ring for his wife (high predictability condition) than when the protagonist was described as having received a big raise (low predictability condition). This indicated that the participants in the high predictability condition had activated the concept 'steal', even though it was not explicitly stated in the narrative that the protagonist wanted to steal the ring. Keefe and McDaniel (1993) also showed that readers make predictive inferences during narrative comprehension. In their study, participants read passages containing a sentence that suggested a potential outcome (e.g., 'After standing through the three-hour debate, the tired speaker walked over to his chair'), a sentence that explicitly stated the outcome (e.g., 'After standing through the three-hour debate, the tired speaker walked over to his chair to sit down'), or a control sentence. Following this sentence, a probe word (e.g., 'sat') was presented and the participants were told to read the word aloud. The participants responded faster to the probe word in the predictive condition (as well as in the explicit condition) compared to the control condition. There were no differences between the predictive and explicit conditions, suggesting that the participants made predictions irrespective of whether the information was explicitly stated in the narrative or not.

Individual differences play an important role in the generation of predictive inferences during narrative comprehension. For example, Linderholm (2002) showed that individuals with high WM capacity were faster than individuals with low WM capacity to respond to probe words that summarized the predictive inference after reading narratives. Linderholm (2002) concluded that generating predictive inferences is a demanding process that requires WM resources. Relatedly, Fincher-Kiefer and D'Agostino (2004) showed that predictive inferences require visuospatial resources. In their study, participants were given a visuospatial memory load task (i.e., they were shown a 4 x 4 matrix with 5 dots in random cells) or a verbal memory load task (i.e., they were shown a string of 6 letters). Following this, they read a narrative whose last sentence elicited a predictive inference (e.g., 'The soup spilled'). Then a target word (e.g., 'spill') or a nonword appeared on the screen, and the participants had to indicate whether or not it was a word. The reaction times to the target word showed that the visuospatial memory load disrupted the generation of predictive inferences, whereas the verbal memory load did not. In addition to cognitive resources, reading ability also plays an important role in the generation of predictive inferences. For example, Murray and Burke (2003) showed that highly skilled readers were faster than moderately skilled readers in naming probe words that summarized the predictive inference in the narrative, indicating that predictive inferences are automatically activated by highly skilled readers. In summary, readers make causal inferences using causal connectives present in the narratives or using their prior knowledge. Causal inferences are made even if the causal information is implicit. Readers also make predictive inferences about the causal
consequences of events. This is a demanding task that places a heavy load on the reader's cognitive resources. The next section will examine the extent to which readers monitor the spatial dimension during narrative comprehension.

Space

Spatial information in situation models consists of the spatial locations of the protagonists, their movement paths, and the locations of objects in the scene (e.g., Rinck, 2005). There are two lines of research that describe the extent to which readers monitor the spatial dimension during narrative comprehension. One line of research claims that readers do not keep track of spatial information during narrative comprehension. For example, Zwaan, Radvansky, Hilliard, and Curiel (1998) showed that spatial discontinuities increased reading times only when the participants had memorized a map of the building where the narrative events took place before reading the narratives. Further, Zwaan, Magliano, and Graesser (1995) showed that reading times of sentences involving spatial discontinuities increased only when participants reread the narrative. More recently, Therriault, Rinck, and Zwaan (2006) showed that reading times of sentences involving spatial discontinuities increased only when participants were told to focus explicitly on the spatial dimension. On this view, then, readers monitor space only when they have prior knowledge about the spatial layout, when they reread the narrative, or when they are explicitly instructed to focus on the spatial dimension. However, another line of research holds that readers do keep track of spatial information during narrative comprehension. This view is discussed next. There is a plethora of research showing that readers monitor the spatial dimension during narrative comprehension. For example, Ehrlich and Johnson-Laird (1982) showed
that people took longer to comprehend a sentence when the spatial information in the sentence did not have a clear connection with the spatial information presented in earlier sentences. In their study, participants were presented with descriptions of the spatial arrangement of objects. In the 'continuous description' condition, subsequent sentences referred to objects in the previous sentence (e.g., 'The knife is in front of the pot. The pot is behind the dish.'), whereas in the 'discontinuous' condition, this reference was absent (e.g., 'The knife is in front of the pot. The glass is behind the dish.'). Sentence reading times were longer in the discontinuous condition than in the continuous condition, indicating that readers keep track of the spatial continuities between sentences. There is also neuropsychological evidence that readers monitor the spatial dimension during narrative comprehension. For example, Carpenter, Just, Keller, Eddy, and Thulborn (1999) found that reading statements describing a spatial configuration (e.g., 'The star is above the plus') activated the regions of the brain activated by spatial tasks such as mental rotation. An important piece of spatial information that readers monitor is the spatial surroundings of a moving protagonist (e.g., de Vega, 1995). In de Vega's (1995) study, participants were presented with narratives in which the location of a target object was consistent (e.g., 'Carmen went into the museum. She approached the mummies quietly.') or inconsistent (e.g., 'Carmen went out of the museum. She approached the mummies quietly.') with the location of a moving protagonist. de Vega (1995) found that reading times were longer for sentences in which the object location was inconsistent with that of the protagonist, indicating that readers keep track of the protagonist's location during narrative comprehension. O'Brien and Albrecht (1992) also showed that readers are
sensitive to the location of the protagonist. They presented narratives in which the first sentence contained information about the location of the protagonist (e.g., 'Kim stood inside the health club'). Subsequently, a critical sentence that described the movement of the protagonist was presented, which was either consistent (e.g., 'Kim decided to go outside the health club') or inconsistent (e.g., 'Kim decided to go inside the health club') with the protagonist's starting location. O'Brien and Albrecht (1992) found that reading times for the critical sentence were longer when the movement of the protagonist was inconsistent with the start location than when it was consistent. This was true both when the critical sentence immediately followed the first sentence and when it was separated from the first sentence by three filler sentences. Thus, readers keep track of the spatial location of the protagonist, and when a discrepancy is detected, reading times increase, indicating that readers are trying to resolve the discrepancy. More recently, Levine and Klin (2001) also showed that readers update the protagonist's spatial information in their situation models. They found that a shift in the protagonist's location reduced the accessibility in memory of the protagonist's prior location. Because readers closely monitor the spatial location of a protagonist, the greater the spatial distance between the protagonist and an object, the less available the object will be in the reader's memory (e.g., Glenberg, Meyer, & Lindem, 1987). In Glenberg et al. (1987), participants read stories in which an object was spatially associated with the protagonist (e.g., 'John put on his sweatshirt before going jogging') or dissociated from the protagonist (e.g., 'John took off his sweatshirt before going jogging'). Later in the story, a target sentence was presented that referenced this object (i.e., the sweatshirt). Glenberg et al. (1987) showed that sentence reading times were longer when the object was spatially
dissociated from the protagonist. Similar findings were reported by Rinck, Williams, Bower, and Becker (1996), who showed that increased spatial distance between the reader's focus of attention and an object reduced the accessibility of the object in the reader's situation model. Participants memorized the map of a fictitious building, including the locations of objects in the rooms of the building. They then read a series of motion statements asking them to imagine their own movements through the building from one room to another. Test probes consisting of two objects (e.g., 'plant-sink') were presented at various times while participants read the narratives. When a test probe was presented, participants indicated whether the objects were currently located in the room they had imagined walking into. Reaction times to the probe words were faster when the probes consisted of objects in the location that the participant had just imagined walking into, compared to all other locations (i.e., the room the participant had started from, the room the participant had imagined walking through, or another room somewhere in the building). Rinck, Hahnel, Bower, and Glowalla (1997) also showed that the accessibility of objects in the reader's memory decreased with increasing spatial distance between the object and the protagonist. This spatial distance, however, was not Euclidean but categorical; that is, it referred to the number of rooms between the object and the protagonist. Individual differences can play an important role in how spatial information is updated in readers' situation models. Dutke and Rinck (2006) investigated how updating a spatial situation model depends on verbal ability. In their study, participants memorized the map of a fictitious building, including the objects in its various rooms. They then read narratives in which the protagonist's movement was predictable
(i.e., the protagonist never changed direction) or unpredictable (i.e., the protagonist changed direction in unpredictable ways). Following the sentence that described the protagonist's motion, the participants read a sentence that referred to an object in the source room (i.e., the room the protagonist started from) or the path room (i.e., the room the protagonist had just passed through). Dutke and Rinck (2006) found that, for participants with low verbal ability, predictable motion led to longer reading times (i.e., a larger spatial distance effect) for source-room sentences than for path-room sentences. For participants with high verbal ability, by contrast, the difference between source-room and path-room reading times was larger after unpredictable motion than after predictable motion. Dutke and Rinck (2006) concluded that, in the unpredictable condition, the participants with high verbal ability had to update the spatial information in their situation models, which required cognitive effort and resulted in longer reading times in comparison to the predictable condition. For the participants with low verbal ability, the demands of integrating spatial information into their situation models were high even in the predictable condition, although they were still able to construct a coherent situation model there. In summary, one line of research argues that the spatial dimension is monitored the least during narrative comprehension. The other line claims that readers keep track of spatial information during narrative comprehension, with discontinuities or inconsistencies in the spatial dimension contributing to increases in reading times. The research reviewed so far described how readers monitor each dimension of a situation model separately. The next section will discuss the research conducted to examine the
extent to which readers monitor multiple dimensions concurrently and how these dimensions interact to influence narrative comprehension.

Multidimensionality of Situation Models

Readers monitor multiple dimensions during narrative comprehension (see Therriault & Rinck, 2007, for a review). For example, Zwaan, Magliano, and Graesser (1995) found that sentence reading times increased with temporal and causal discontinuities, indicating that readers update both temporal and causal information in their situation models. Using a verb-clustering task, Zwaan, Langston, and Graesser (1995) showed that readers simultaneously monitor all five dimensions of the situation model. Relatedly, Zwaan, Radvansky, Hilliard, and Curiel (1998) found that sentence reading times increased with protagonist, goal, temporal, and causal discontinuities, indicating that these dimensions are monitored concurrently during narrative comprehension. They also found that reading times increased with the number of discontinuities: discontinuities in two dimensions led to longer sentence reading times than a discontinuity in one dimension, discontinuities in three dimensions led to longer reading times than discontinuities in two dimensions, and so on. Similar results were found by Rinck and Weber (2003), who asked participants to rate the coherence of the narrative instead of collecting sentence reading times. They found that sentences involving protagonist, temporal, and spatial shifts alike received lower coherence ratings than sentences involving no dimension shift. The coherence ratings also decreased as the number of dimension shifts increased. Even when task demands are high, readers monitor multiple dimensions during narrative comprehension. For example, Therriault, Rinck, and Zwaan (2006) showed that
the protagonist and temporal dimensions are monitored by readers irrespective of task demands. In their study, one group of participants was asked to focus on the spatial dimension while reading narratives, a second group was asked to focus on the temporal dimension, and a third group was asked to focus on the protagonist dimension. The participants were told that they would be asked comprehension questions about their assigned dimension (i.e., space, time, or protagonist) at the end of the narrative. Sentence reading times were collected. The results indicated that, irrespective of the group to which participants were assigned, reading times of critical sentences increased with discontinuities in the protagonist and temporal dimensions. Therriault et al. (2006) concluded that the protagonist and temporal dimensions are resistant to task demands and are monitored more globally. The interactions between situation model dimensions also need to be considered to understand how situation models are constructed during narrative comprehension. For example, Rapp and Taylor (2004) showed that the temporal and spatial dimensions interact in the construction of situation models. In their study, participants read stories in which a character was described as moving from a start location to a final location. The distance between the two locations was described explicitly as being short (e.g., 'Emily walked for four blocks') or long (e.g., 'Emily walked for four miles'). After reading the story, the participants were shown a probe word (the start location or the final location of the character's movement) and asked to determine whether this word had been encountered during reading. Response times to the start locations were slower when the character was described as having walked a long distance rather than a short distance. In a related experiment, Rapp and Taylor (2004) found that the response times to the start location probes were
slower when the character was described as being engaged in a long activity (e.g., 'Elizabeth read some articles in linguistic communication') while moving from the start location to the final location rather than a short activity (e.g., 'Elizabeth perused the journal's table of contents'). Rapp and Taylor (2004) concluded that the interactive nature of dimensions influences how situation models are constructed. The interactivity of the temporal and spatial dimensions was also demonstrated by Rinck and Bower (2000), who showed that the accessibility of objects and rooms that were spatially proximal to the protagonist was reduced by large time shifts in the narrative. In their study, participants memorized the map of a fictitious building, including the objects in its various rooms. They then read narratives in which the protagonist moved from a source room to a location room through an unmentioned path room to accomplish a goal. Following this motion sentence, the protagonist was described as performing an activity that lasted either for hours or for a few minutes. The participants were then presented with a probe that paired the path room with an object in it (e.g., 'bed-lounge'). Their task was to indicate whether the object was located in the stated room. Participants took longer to respond to the probe when the protagonist was described as having performed an activity lasting several hours rather than a few minutes. Another study demonstrating the interactivity between situation model dimensions was conducted by Carreiras, Carriedo, Alonso, and Fernandez (1997), who examined how the temporal and protagonist dimensions interact to affect the accessibility of information about the protagonist. They found that participants' recognition of probe words describing the current occupation of the protagonist (e.g., 'Now she works as an
economist.') was faster than their recognition of words describing the past occupation of the protagonist (e.g., 'She worked as an economist.'). In summary, readers monitor multiple dimensions of a situation concurrently. Dimensions such as time and protagonist are monitored globally, irrespective of task demands. Most of the experiments on situation models have examined how readers monitor each dimension in isolation; it is also important to understand how the five dimensions interact to influence the construction and updating of situation models. The research reviewed so far examined how readers monitor the various dimensions of the situation model during narrative comprehension. The next section will examine the extent to which viewers monitor the various dimensions of situation models during film comprehension.

Situation Models in Film Comprehension

Intentionality

Just as in reading comprehension, individuals keep track of the goals of characters during film comprehension. Magliano, Taylor, and Kim (2005) examined the extent to which viewers monitored the goals of characters during film comprehension. In their study, participants viewed a narrative film and made 'situation change' judgments whenever they perceived a change in the situation. The participants were not given explicit instructions on what constitutes a change in situation. Magliano et al. (2005) found that viewers made more situation change judgments when there were shifts in the goals associated with the characters, indicating that they monitored the goals of multiple characters. However, viewers monitored the goals of protagonists and antagonists who were central to the storyline more than the goals of secondary characters.

Time and Space

Temporal and spatial dimensions are also monitored by viewers during film comprehension (e.g., Magliano, Miller, & Zwaan, 2001). In Magliano et al. (2001), participants viewed a narrative film and made 'situation change' judgments when they encountered a change in the situation. They found that shifts in time resulted in more situation change judgments than shifts in the location of characters or shifts in spatial regions. Magliano et al. (2001) concluded that the temporal dimension is dominant in film comprehension and is monitored independently of shifts in the location of characters or shifts in spatial settings.

Predictive Inferences

Just as in reading comprehension, viewers generate predictive inferences about the causal consequences of events during film comprehension. For example, Magliano, Dijkstra, and Zwaan (1996) examined how viewers' ability to make predictions about future events while watching a narrative film varied with the number of information sources (e.g., visual, auditory, discourse) available to them. In their study, participants were asked to make predictions wherever they could while watching a James Bond movie. Whenever they generated a prediction, they paused the video and wrote down their prediction. Magliano et al. (1996) found that the likelihood of making predictions increased with the number of information sources. The next section will examine the extent to which viewers monitor multiple dimensions of situation models concurrently during film comprehension.

Multidimensionality

Viewers monitor multiple dimensions of events while watching narrative films (e.g., Magliano et al., 2001). In Magliano et al. (2001), participants viewed a narrative film and made 'situation change' judgments when they encountered a change in the situation. The situation change judgments increased with the number of discontinuities in the dimensions: situation change ratings were higher for shots containing shifts in three dimensions (i.e., a temporal shift, spatial movement of a character, and a shift in the spatial region) than for shots containing shifts in only two dimensions, which in turn were higher than for shots containing a shift in only one dimension. This indicates that viewers monitor multiple dimensions during film comprehension. In summary, viewers monitor multiple dimensions of a situation during film comprehension. There are striking similarities in the manner in which individuals monitor the dimensions of a situation during film comprehension and text comprehension. In both domains, individuals keep track of the protagonists, their goals, spatial shifts, and temporal shifts. In addition, individuals generate predictions about the causal consequences of events in both domains. The next section will examine the extent to which individuals monitor the various dimensions of a situation in a virtual reality environment.

Situation Models in Virtual Reality Environments

There has been very limited research examining the degree to which individuals monitor the various dimensions of a situation in a virtual environment. Recently, there has been some work demonstrating that individuals monitor space closely in virtual environments. For example, Radvansky and Copeland (2006) showed that spatial shifts in a virtual space made information about associated objects less available. In their study,
participants either walked through a doorway to another room (spatial shift condition) or walked across a large room (no spatial shift condition) in a virtual space. They were probed about objects that they carried (associated objects). Participants took longer to answer probes about associated objects following a spatial shift. Radvansky and Copeland (2006) concluded that spatial shifts required participants to update their situation model of the environment, which required considerable cognitive effort, thereby reducing the availability of associated objects in their memory. Magliano, Radvansky, and Copeland (2007) also described how updating spatial information can affect the processing of other relevant information in a virtual air-ground combat mission. Magliano et al. (2007) found that spatial shifts increased the likelihood that players would be hit by enemies. This showed that spatial shifts required players to update the spatial index in their situation model, placing a demand on their cognitive resources and thereby increasing the chance of being hit by enemies.

Summary

In summary, individuals monitor multiple dimensions of a situation while reading and watching films, as well as during interactions in virtual reality environments. Specifically, understanding in these environments requires monitoring the protagonists, their goals, and their spatial locations. In addition, monitoring time, as well as generating causal and predictive inferences, contributes to comprehension in these environments. Analogous to the work in reading comprehension and film comprehension, work needs to be done in dynamic environments to understand the extent to which operators monitor the dimensions of protagonist, time, space, causality, and intentionality. This will contribute to a better understanding of the process of SA in dynamic environments. The next section
will examine the literature on automation and SA, which is the central focus of this dissertation.

Automation and SA

Automation is defined as 'the execution by a machine agent of a function that was previously carried out by a human' (Parasuraman & Riley, 1997, p. 231). Similarly, Moray, Inagaki, and Itoh (2000) defined automation as 'any sensing, detection, information-processing, decision-making, or control action that could be performed by humans but is actually performed by a machine' (p. 44). Increasingly, automated systems are being used to perform tasks in safety-critical domains such as aviation, air traffic control, medicine, and robotics. The primary reason for the introduction of automated systems is to reduce the human error that is considered responsible for the majority of accidents in safety-critical domains (e.g., Coyne, 1994; Shappell & Wiegmann, 2000). Automated systems offer several benefits. For example, reliable automation has been shown to improve the conflict detection performance of air traffic controllers (e.g., Metzger & Parasuraman, 2005), result in faster and more accurate decision making in military tasks (e.g., Rovira, McGarry, & Parasuraman, 2007), and improve pilot performance during icing conditions (e.g., Sarter & Schroeder, 2001). Although automated systems offer several benefits, such systems are not always desirable. The next section will examine the out-of-the-loop performance problem associated with the use of automated systems.

Problems with Automation: Out-of-the-Loop Performance Problem

A potential consequence of fully automated systems is the out-of-the-loop performance (OOP) problem (e.g., Billings, 1991; Endsley & Kiris, 1995; Kessel & Wickens, 1982; Moray, 1986; Sarter & Woods, 1995; Wiener & Curry, 1980). This refers
This refers to the reduced ability of operators working with fully automated systems to perform tasks manually following a failure of the automation, in comparison to operators who perform the tasks manually. For example, using a simulated navigation task, Endsley and Kiris (1995) showed that participants working with a fully automated navigation expert system took longer to make decisions following the failure of the automation compared to participants in the manual condition. Thus, although fully automated systems are feasible with advances in computer technology, they are not recommended if the operator is to be kept in control and in the loop. The next section will examine an approach that can be adopted to increase the involvement of human operators working with automated systems.

Levels of Automation

Taxonomies of Levels of Automation

One approach to reducing the OOP problem is to use varying levels of automation (LOA). The central concept of LOA is that automation is not an all-or-none phenomenon but can instead be implemented at various levels (e.g., Billings, 1991; Lorenz & Parasuraman, 2007; Parasuraman, 2000; Parasuraman, Sheridan, & Wickens, 2000; Sheridan & Parasuraman, 2006; Wiener & Curry, 1980). The first taxonomy of LOA was proposed by Sheridan and Verplank (1978). Their taxonomy consists of 10 levels, ranging from manual control (Level 1) to full automation (Level 10):

1. Human does the whole job
2. Computer offers options
3. Computer narrows options down to a few
4. Computer suggests an option
5. Computer executes the selected option if the human approves
6. Computer executes the selected option, but the human can veto
7. Computer executes the selected option and informs the human
8. Computer executes the selected option and informs the human only if asked
9. Computer executes the selected option and informs the human only if it decides to
10. Computer does the whole job autonomously

Note that as the level increases (moving from Level 1 to Level 10), the automated system assumes more control. Similarly, Endsley (1987b) developed a LOA taxonomy that is applicable when operators work with expert systems. This taxonomy distinguished five LOA:

1. Manual, in which the operator performs the task manually, with no assistance from the expert system
2. Decision support, in which the expert system provides recommendations
3. Consensual artificial intelligence, in which the expert system implements its decision upon the consensus of the operator
4. Monitored artificial intelligence, in which the expert system implements its decision provided there is no veto from the operator
5. Full automation, in which the system decides and the operator observes

Endsley and Kaber (1999) developed a 10-level taxonomy of LOA that can be applied to cognitive and psychomotor tasks and is therefore relevant in domains such as air traffic control, telerobotics, and manufacturing. More recently, Parasuraman, Sheridan, and Wickens (2000) developed the Parasuraman-Sheridan-Wickens (PSW) model of automation. The PSW model of automation will be used in this dissertation.
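Because all of these taxonomies treat automation as an ordered scale rather than a binary, the Sheridan and Verplank (1978) levels can be captured directly in code. The following is a minimal illustrative sketch in Python; the class and member names are my own labels for the ten levels, not part of the original taxonomy:

    from enum import IntEnum

    class SheridanVerplankLOA(IntEnum):
        """Sheridan and Verplank's (1978) 10-level scale; higher values
        hand more authority to the machine."""
        MANUAL = 1                  # human does the whole job
        OFFERS_OPTIONS = 2          # computer offers options
        NARROWS_OPTIONS = 3         # computer narrows options to a few
        SUGGESTS_OPTION = 4         # computer suggests an option
        EXECUTES_IF_APPROVED = 5    # executes if the human approves
        EXECUTES_UNLESS_VETOED = 6  # executes unless the human vetoes
        EXECUTES_THEN_INFORMS = 7   # executes, then informs the human
        INFORMS_IF_ASKED = 8        # executes; informs only if asked
        INFORMS_IF_IT_DECIDES = 9   # executes; informs if it decides to
        FULL_AUTOMATION = 10        # computer does the whole job

    # The scale is ordinal, so comparisons of machine authority are meaningful:
    assert SheridanVerplankLOA.EXECUTES_UNLESS_VETOED > SheridanVerplankLOA.SUGGESTS_OPTION

Representing the levels as an ordered type makes explicit the assumption, shared by the taxonomies above, that moving 'up' the scale transfers authority from the human to the machine.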


Just like other LOA taxonomies, this model assumes that automation is not an all-or-none phenomenon but can instead be implemented at various levels. Based on the PSW model, automation can be applied to different stages of information processing: information acquisition (Stage 1), information analysis (Stage 2), decision and action selection (Stage 3), and action implementation (Stage 4). Further, different degrees of automation support can be applied to each of these stages. Automation of information acquisition supports human sensory processes and refers to technology that cues, highlights (e.g., Dzindolet, Pierce, Beck, & Dawe, 2002; Fisher & Tan, 1989; Yeh, Wickens, & Seagull, 1999), or filters information. For example, an automated aid that directs an operator's attention to important targets (such as camouflaged enemy objects in a terrain) is an example of information acquisition automation (e.g., Yeh & Wickens, 2001). Automation of information analysis provides support for integrating multiple pieces of information and making inferences and predictions (Parasuraman et al., 2000). For example, the Traffic Alert and Collision Avoidance System (TCAS), which integrates several pieces of information such as altitude, speed, and heading to warn the pilot of a potential collision with another aircraft, is an example of information analysis automation (see Lorenz & Parasuraman, 2007, for a list of examples). Automation of decision and action selection involves providing support to the human operator in selecting an appropriate course of action. An automated aid that recommends the best enemy-friendly engagement option to a shooter in a military command and control environment is an example of decision and action selection automation (e.g., Rovira et al., 2007). A GPS navigation system in a car that tells drivers how to get to their destination is another example.
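A defining feature of the PSW model is that the degree of automation is set separately for each stage. As a rough sketch, a system design can therefore be described as a four-field record; the field names, the 0-10 rating scale, and the example values below are illustrative assumptions, not part of the model:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class PSWDesign:
        """Degree of automation (0 = fully manual, 10 = fully automated)
        applied to each PSW information-processing stage."""
        information_acquisition: int  # Stage 1: sensing, cueing, filtering
        information_analysis: int     # Stage 2: integration, inference, prediction
        decision_selection: int       # Stage 3: choosing a course of action
        action_implementation: int    # Stage 4: executing the chosen action

    # A hypothetical TCAS-like aid: analysis is heavily automated, but the
    # pilot still selects and executes the maneuver.
    tcas_like = PSWDesign(information_acquisition=7, information_analysis=9,
                          decision_selection=3, action_implementation=0)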


Automation of action implementation assists in the execution of actions. This stage typically involves only two levels, manual or automated (Lorenz & Parasuraman, 2007); automation of this stage therefore requires no input from the human operator. An autopilot is an example of action implementation automation. In summary, several taxonomies of LOA have been proposed. The next section will examine empirical studies that demonstrate the advantages of intermediate LOAs in keeping operators involved in the decision-making loop and promoting faster recovery following automation failure.

Empirical Evidence on the Utility of Intermediate LOAs

The utility of intermediate LOAs in promoting faster recovery following automation failure was demonstrated by Endsley and Kiris (1995) using a simulated navigation task. Participants were presented with three routes and were told to choose the route that would get them to their destination as fast as possible while minimizing fuel consumption. Participants were assigned to the manual condition, one of the intermediate LOAs, or the full automation condition. Endsley and Kiris (1995) found that participants working with intermediate LOAs were faster at performing the task manually following the failure of the automation than participants in the fully automated condition. Similarly, Kaber, Onal, and Endsley (2000) demonstrated the benefits of intermediate LOAs in a simulated nuclear materials handling task using a telerobot. Participants were trained to perform the task using each of five LOAs: manual control (with action support), batch processing, decision support, supervisory control, and full automation. Kaber et al. (2000) found that when the automation failed, participants were slower to respond under high LOAs than under intermediate LOAs. In summary, intermediate LOAs keep operators in the decision-making loop, promoting faster recovery following automation failure. The next section will discuss the performance consequences associated with the failure of high LOAs.

Performance Consequences of Higher LOAs

The performance costs associated with automation failure are higher when operators work with higher LOAs than with lower LOAs. For example, using a military decision-making task, Crocoll and Coury (1990) examined the performance consequences of unreliable status automation (a lower LOA) and decision automation (a higher LOA). Participants performed the task manually, received status information (e.g., whether an aircraft was friendly or hostile) from an automated aid, received recommendations (e.g., fire or no fire) from an aid, or received both status and recommendation information. When the aids were perfectly reliable, participants who were given the aids performed better than those who performed the task manually, and there were no significant differences in performance among the different aids. Thus, the higher level decision automation provided no incremental benefit over status automation. However, when the aids failed, participants who received only status information performed better than those receiving recommendations. That is, lower LOAs supported better performance than higher LOAs when the automation failed. Similarly, Sarter and Schroeder (2001) showed that the performance costs associated with the failure of higher LOAs are greater.


They examined the effects of status and command displays on pilot performance during in-flight icing encounters. While status displays provided information about the location of ice (Level 1 automation), command displays provided recommendations such as power and flap settings (Level 3 automation). When the displays were perfectly reliable, performance was higher with the status and command displays than with manual operation. However, when the displays were unreliable (70% reliability), performance was lower with the status and command displays than in the manual condition. Further, the performance costs associated with unreliable automation were higher for command displays than for status displays. More recently, using a military command and control task, Rovira et al. (2007) showed that the failure consequences of a highly (but not perfectly) reliable automated aid are greater when the aid is applied to higher levels of information processing (i.e., decision making) than to lower levels of information processing (i.e., information acquisition). In their study, participants performed a sensor-to-shooter task in a simulated command and control environment. Their task was to choose friendly units to engage the most dangerous enemies. Automated aids of various levels were provided. In the information automation condition, participants were given a list of friendly-enemy engagement options, including all the raw data (e.g., the distances between friendlies). In the low decision automation condition, participants were provided with a prioritized list of friendly-enemy engagement options, including the raw data. In the medium decision automation condition, participants were given the top three friendly-enemy engagement options, including the raw data. Finally, in the high decision automation condition, only the top friendly-enemy engagement option was provided, with no raw data. Participants first performed the task manually. They then worked with each of the automated aids in accomplishing the task. The reliability of the aids was low (60%) in some trials and high (80%) in others. The speed and accuracy of enemy-friendly engagement decisions were recorded. Rovira et al. (2007) found that participants were faster at making enemy-friendly engagement decisions with the reliable automation (i.e., 80% reliability) than in the manual condition, indicating that reliable automation improves performance. The accuracy of enemy-friendly engagement decisions was, however, lower with the unreliable automation (i.e., 60% reliability) than in the manual condition. Finally, when aid reliability was 80%, accuracy was lower in the decision automation conditions than in the information automation condition. Rovira et al. (2007) concluded that when the reliability of an aid is high, people rely on it to a great extent even though it is not perfect, and the consequences of this overreliance are greater when the automation is applied to higher order cognitive functions. In summary, the performance costs associated with automation failure are more severe for higher LOAs than for lower LOAs. That is, higher LOAs can hamper performance when automation fails.
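The overreliance cost that Rovira et al. (2007) describe can be illustrated with simple expected-value arithmetic: if an operator complies with the aid on some fraction of trials and decides manually otherwise, overall accuracy is a weighted mixture of aid reliability and unaided accuracy. The sketch below uses invented compliance and manual-accuracy values purely for illustration; they are not data from the study:

    def expected_accuracy(aid_reliability, p_comply, manual_accuracy):
        """Expected decision accuracy when the operator follows an imperfect
        aid with probability p_comply and otherwise decides unaided."""
        return p_comply * aid_reliability + (1 - p_comply) * manual_accuracy

    # A trusted 80%-reliable aid with near-total compliance: the ~20% of
    # trials on which the aid errs are rarely caught.
    print(expected_accuracy(0.80, p_comply=0.95, manual_accuracy=0.75))  # 0.7975

    # A less trusted 60%-reliable aid invites more manual cross-checking,
    # which cushions the cost of its failures.
    print(expected_accuracy(0.60, p_comply=0.50, manual_accuracy=0.75))  # 0.675

The point of the sketch is that high compliance makes overall accuracy track the aid's reliability, so the aid's failures pass through to performance almost unchecked.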


Thus far, I have discussed the advantages of using automated systems, as well as the main problem associated with fully automated systems, namely the OOP problem: the reduced ability of operators working with fully automated systems to detect automation failures and to perform the task manually once the automation fails. One approach to reducing the OOP problem is to use intermediate LOAs. The central idea of LOA is that automation should not be considered an all-or-none phenomenon; instead, automation can be implemented at varying levels. I have also discussed various studies demonstrating the advantages of intermediate LOAs, in comparison to high LOAs, in promoting faster recovery following the failure of an automated aid. The next section will examine the central factor considered responsible for the OOP problem.

Situation Awareness: The Factor Responsible for the OOP Problem

Some researchers identify the loss of operator SA as the main factor responsible for the OOP problem (e.g., Endsley & Kiris, 1995; Kaber, Onal, & Endsley, 2006). The factors that contribute to lower SA when operators interact with fully automated systems (Endsley, Bolte, & Jones, 2003; Endsley & Kiris, 1995) are poor system design, overreliance on automation, and passive monitoring of automation. These factors are discussed in this section.

Poor System Design

Inappropriate feedback about the system state, reduced salience of critical information under information overload, filtering of relevant information, and computer windows that hide important data on other windows can all reduce operator SA and leave the operator out of the loop (Endsley et al., 2003). A survey of aviation experts showed that inappropriate feedback about system state is among the top five issues responsible for automation incidents and accidents (Funk et al., 1999). More recently, through an examination of military UAV accident and incident data, Williams (2004) found that a significant number of accidents occur due to poor display design. It is therefore imperative to design interfaces that support operator SA. Maintaining mode awareness is also a major problem with highly automated systems (e.g., Sarter & Woods, 1995). Mode awareness problems arise when there is an overabundance of automation modes and no salient indication of a mode change.

The loss of SA due to poor system design can be remedied by adopting better display design principles. Designing an interface that supports operator SA requires knowledge of the operator's dynamic information needs. These dynamic information needs, or SA requirements, can be identified through a goal-directed task analysis (GDTA; Endsley et al., 2000). The GDTA methodology aims to identify the goals of an operator, the decisions that the operator makes to achieve those goals, and the SA requirements for making those decisions. The advantage of this methodology is that it focuses on the information that operators would ideally like to have to accomplish their goals, rather than on what is available on an existing interface. The methodology involves analysis of the domain literature and training manuals and knowledge elicitation from experienced domain operators. The resulting information is represented as a hierarchy of goals, decisions, and SA requirements, which is then validated using a large pool of experts. The output of the GDTA forms the foundation for designing interfaces that support operator SA.
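Since the output of a GDTA is a hierarchy of goals, decisions, and SA requirements, it maps naturally onto a simple recursive data structure. The fragment below is a hypothetical illustration; the example goal, decision, and requirements are invented for the sketch, not drawn from a validated GDTA:

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Decision:
        question: str               # a decision the operator must make
        sa_requirements: List[str]  # information needed to make it

    @dataclass
    class Goal:
        name: str
        decisions: List[Decision] = field(default_factory=list)
        subgoals: List["Goal"] = field(default_factory=list)

    # Hypothetical fragment for an en route controller:
    separation = Goal(
        name="Maintain aircraft separation",
        decisions=[Decision(
            question="Will these two aircraft lose separation?",
            sa_requirements=["positions", "altitudes", "headings",
                             "speeds", "projected trajectories"])])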


Once the SA requirements of the operators are identified, the design principles recommended by Endsley et al. (2000) can be adopted to design SA-oriented displays. For example, one design principle that supports SA is organizing information around operator goals: all the information needed to achieve a given goal should be grouped together so as to reduce information access cost. Other design principles include providing support for comprehension of the situation as well as projection of future events, reducing attentional narrowing by providing a high-level overview of the situation, filtering information not related to SA needs, placing the operator in the loop by employing intermediate LOAs, making system states and modes salient, avoiding a proliferation of automation modes, minimizing task complexity, using information filtering wisely, providing appropriate feedback about system state, and enforcing consistency in control-display design (Endsley et al., 2000). Multimodal feedback has also been shown to improve SA in visually dense environments. For example, Ruff et al. (2000) showed that haptic alerts provided via joystick improved the SA of UAV operators during turbulent conditions. In summary, the loss of SA stemming from poor system design can be remedied by adopting sound SA display design guidelines. The next section will examine the loss of SA arising from operator overreliance on automation.

Overreliance on Automation

Overreliance on, or misuse of, automation occurs when operators rely on the automation inappropriately, resulting in a failure to monitor it (Parasuraman & Riley, 1997). That is, overreliance occurs when the trust that the operator places in the automated system exceeds the capabilities of the system (Lee & See, 2004). Operators of a highly reliable automated system place too much trust in the system, allocate their cognitive resources to other tasks, and consequently fail to monitor the automated system for failures (e.g., Parasuraman, Molloy, & Singh, 1993). This results in a lack of SA of the state of the automated system and of system state parameters (e.g., Endsley, 1996). The main factors contributing to overreliance on automation are the consistency of the automated aid, high confidence in the automation, high automation reliability, and high operator workload (Parasuraman & Riley, 1997).

Consistency of automation. An important factor in the development of overreliance is the consistency of the automation. Parasuraman et al. (1993) showed that operator detection of automation failures was lower when the reliability of the automation was constant than when it was variable. Participants performed a flight simulation task that included tracking, fuel management, and gauge monitoring. While the tracking and fuel management tasks were performed manually, the gauge monitoring task was automated. Under normal conditions, the automation detected and corrected gauge malfunctions; under automation failure conditions, the participants were responsible for detecting the malfunction. The reliability of the automation was either constant (high or low) or variable (alternating from high to low or from low to high). Parasuraman et al. (1993) found that the detection of automation failures was higher in the variable reliability condition than in the constant reliability condition. As long as the reliability of the automation was constant, participants continued to rely on the automation and failed to detect automation failures, even when the automation reliability was low.

Self-confidence vs. confidence in automation. Overreliance also occurs when confidence in the automation exceeds confidence in manual operation (Lee & Moray, 1994; Wiegmann, Rich, & Zhang, 2001). For example, drivers in an unfamiliar city (whose self-confidence was low) relied more heavily on an error-prone vehicle navigation system than drivers in a familiar city (whose self-confidence was high; Kantowitz, Hanowski, & Kantowitz, 1997).

High automation reliability. High automation reliability also results in misuse of automation.


When the reliability of the automation is high (though not perfect), operators continue to rely on it to such an extent that even occasional failures do not reduce their trust in the automation (e.g., Parasuraman & Riley, 1997).

Workload. The workload experienced by operators can also result in overreliance. For example, using an information warfare scenario, Biros, Daly, and Gunsch (2004) showed that military decision makers relied more on an unreliable automation when their workload was high, despite reporting low trust in the automation. Similar findings were reported by Parasuraman et al. (1993): although participants' detection of automation failures was poor in a multitask environment, they exhibited near perfect detection rates when automation monitoring was the only task they had to perform. Therefore, operator workload is a major factor contributing to overreliance on an automated aid.

In summary, highly reliable automation, highly consistent automation, high operator confidence in the automation, and high workload can all lead an operator to misuse the automation, resulting in lower SA. The next section will discuss the loss of SA that occurs due to passive monitoring of automated systems.

Passive Monitoring of Automation

A large body of research demonstrates the advantages of active processing over passive monitoring. The advantages of active involvement have been demonstrated in simple perceptual and cognitive tasks as well as in tasks that operators perform in safety-critical domains. The following sections discuss these advantages.

Advantages of active perception. There is evidence demonstrating the advantages of active perception over passive perception. For example, Gibson (1962) found that individuals who actively explored an irregular object by moving their hand over it (active observers) were more accurate at identifying the shape of the object than those whose hand remained stationary while the object moved over it (passive observers). Gibson (1962) concluded that perception becomes more accurate with active observation. The advantages of active, self-produced movement in visually guided behavior were also demonstrated by Held and Hein (1963), who reared kittens in the dark from birth. The kittens were then paired in a carousel for an hour each day, with an active kitten moving and turning the carousel and a passive kitten carried along with no control over its motion; they were returned to darkness for the rest of the day. Held and Hein (1963) found that even though both kittens received the same amount of visual stimulation, the passive kitten showed deficits on the visual cliff and other visual tasks that required depth perception. This classic experiment demonstrated the advantages of self-produced motion in visual perception and supported Gibson's (1979) view that active perception is important for the development of perceptual processes. Using a simulated flight task, Larish and Andersen (1995) also demonstrated the benefits of active perception: active observers who viewed a compensatory tracking display were more accurate at detecting a change in orientation (e.g., a change in pitch or roll) following a blackout than passive observers. The next section will examine the advantages of active processing in improving memorability and comprehension.

Advantages of active processing in cognitive tasks. The advantages of active processing have also been demonstrated in cognitive tasks. For example, deeper processing of an event is associated with better memory for the event (e.g., Craik & Lockhart, 1972).


Craik and Tulving (1975) investigated how depth of processing affects memory for information. Shallow processing was induced by asking participants whether a presented word was in upper or lower case; intermediate processing by asking whether a presented word rhymed with another word; and deep processing by asking whether a presented word fit in a sentence. Following this encoding phase, participants were given a recall test. Recall was higher for words that had been processed deeply during encoding than for words processed at intermediate or shallow levels. Deeper processing involved elaboration, or meaningful analysis of information, which improved the memorability of the information. There is also evidence that memory is better for self-performed tasks, because performing an action strengthens the memory trace associated with the task (e.g., Gronlund, Carlson, & Tower, 2007). For example, Koriat, Pearlman-Avnion, and Ben-Zur (1998) showed that individuals who enacted phrases (e.g., pour coffee) exhibited better recall for those phrases at test than individuals who merely memorized them. The advantages of self-produced actions have been illustrated in research on student learning as well. For example, Foos, Mora, and Tkacz (1994) showed that students who generated their own questions after studying a text performed better on a subsequent test than those who received questions from the experimenter. The benefits of active processing have also been shown in text comprehension. For example, McNamara, Kintsch, Songer, and Kintsch (1996) showed that high domain knowledge readers were better able to answer inference and problem-solving questions that required a situational understanding of the text when they were given a low-coherence text. A low-coherence text is one in which not all of the information is stated explicitly; readers must instead make inferences. McNamara et al. (1996) concluded that a low-coherence text stimulated active processing in high domain knowledge readers, resulting in the creation of more elaborate situation models. The next section will examine the advantages of active processing over passive monitoring in safety-critical domains, where operators interact with automated systems.

Advantages of active processing in safety-critical domains. Several studies indicate that operators' ability to detect automation failures is inferior when they act as passive monitors of automated systems, presumably due to lower SA under passive monitoring conditions. For example, using a tracking task, Young (1969) showed that passive monitors were slower at detecting system malfunctions than active participants. The benefits of manual control were also demonstrated by Wickens and Kessel (1979) using a pursuit tracking task. Poor performance under passive monitoring, compared to manual control, has been attributed to poor operator SA. For example, using a simulated navigation task, Endsley and Kiris (1995) showed that participants working with a fully automated navigation expert system took longer to perform the task manually following the failure of the automation than participants in the manual condition. The participants in the automated condition had lower Level 2 SA than those in the manual condition prior to the automation failure. Thus Level 2 SA, the comprehension of the situation, was influenced by higher levels of automation. Endsley and Kiris (1995) attributed this loss of SA in the automated group to passive processing of information.


The superiority of manual performance over passive involvement has also been demonstrated by Gugerty (1997) using a simulated driving task. He found that participants had better recall of nearby (potentially hazardous) car locations when they actively performed the driving task than when they watched driving scenes in a passenger (passive) mode. The advantages of active operator involvement have been demonstrated in the domain of air traffic control as well. For example, Willems and Truitt (1999) found that controllers' response times to SA queries were longer when they were passively monitoring traffic. There is also evidence that passive monitoring can lead to poorer recall of aircraft attributes (e.g., Endsley & Rodgers, 1998). In their study, experienced controllers viewed recreations of operational errors that occurred in the Atlanta air route traffic control center. SA was assessed by freezing the simulation and asking the controllers to recall the attributes of all aircraft in the airspace, both two minutes prior to the occurrence of the operational error and when the operational error occurred. Recall of aircraft location and other attributes was fairly low, suggesting that passive monitoring may hamper SA. Endsley and Rodgers (1998) also found that the recall of aircraft attributes by the experienced controllers in their study was lower than the recall accuracy of the air traffic control trainees in Mogford and Tansley's (1991) study. Endsley and Rodgers concluded that passive monitoring can lower SA, leaving controllers out of the decision-making loop.

The reduction in controller SA associated with passive monitoring has also been demonstrated under free flight conditions. Under free flight, pilots are primarily responsible for maintaining aircraft separation. However, controllers remain responsible for monitoring their airspace and intervening when aircraft separation falls below the recommended minima (5 nautical miles laterally and 1,000 feet vertically). Endsley, Mogford, Allendoerfer, Synder, and Stein (1997) showed that controllers' perception of elements of the traffic situation, namely aircraft location and call sign, as well as their comprehension and projection of the traffic situation, were inferior under free flight conditions compared to active control. More recently, Metzger and Parasuraman (2001) showed that passive monitoring of air traffic resulted in longer times to detect potential conflicts compared to active control. In their experiment, experienced air traffic controllers participated in a simulation of a free flight environment under moderate traffic density (an average of 11 aircraft in the airspace at one time) and high traffic density (an average of 17 aircraft at one time), in both active and passive monitoring conditions. In the passive monitoring condition, controllers monitored traffic for potential conflicts and were asked to click a button upon detecting a conflict; in the active control condition, controllers could issue instructions to resolve a conflict. Participants completed four scenarios (moderate traffic, passive control; moderate traffic, active control; high traffic, passive control; high traffic, active control), each lasting 30 minutes and containing six embedded conflicts. At the end of one of the high traffic density scenarios, participants were given a surprise recall test in which they were asked to indicate the parameters of the aircraft present in their sector on a sector map. Metzger and Parasuraman (2001) found no difference in conflict detection performance between the passive and active control conditions under moderate traffic density. Under high traffic, however, controllers took almost 2 minutes longer to detect conflicts when passively monitoring than under active control. Controllers' recall of aircraft altitude was also lower in the passive monitoring condition. Thus, high traffic and passive monitoring hampered controller SA and left controllers out of the loop.
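The separation criterion used in these free flight studies is mechanical enough to state directly in code: separation is lost only when the lateral and vertical minima are violated at the same time. A minimal sketch, in which the coordinate conventions and function name are my own:

    import math

    LATERAL_MIN_NM = 5.0      # lateral minimum, nautical miles
    VERTICAL_MIN_FT = 1000.0  # vertical minimum, feet

    def separation_lost(x1, y1, alt1, x2, y2, alt2):
        """True if two aircraft violate BOTH minima simultaneously.
        x/y positions are in nautical miles; altitudes are in feet."""
        lateral = math.hypot(x2 - x1, y2 - y1)
        vertical = abs(alt2 - alt1)
        return lateral < LATERAL_MIN_NM and vertical < VERTICAL_MIN_FT

    # 3 nm apart but 2,000 ft of vertical spacing: still separated.
    assert not separation_lost(0, 0, 33000, 3, 0, 35000)
    # 3 nm apart at the same altitude: loss of separation.
    assert separation_lost(0, 0, 33000, 3, 0, 33000)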


In a related experiment, Metzger and Parasuraman (2005) showed that even providing a conflict detection aid does not improve controller performance under passive monitoring conditions if the reliability of the aid is not perfect. The conflict detection aid indicated a potential collision in the scenario five minutes in advance. Metzger and Parasuraman (2005) found that when the reliability of the aid was low, conflict detection was better when controllers performed the task manually than when they were assisted by the aid during passive monitoring.

In summary, the main factor considered responsible for the OOP problem is operator SA. Lower SA when operators interact with fully automated systems arises from poor interface design, overreliance on automation, and the passive involvement of operators. The loss of SA arising from poor interface design can be remedied by adopting good design principles; the loss of SA arising from overreliance and passive operator involvement can be overcome by using intermediate LOAs. The next section will examine the role of intermediate LOAs in promoting operator SA and facilitating faster recovery following automation failure.

Benefits of Intermediate LOAs in Improving SA

The first empirical evidence of the utility of intermediate LOAs in improving operator SA was provided by Endsley and Kiris (1995). In their experiment, participants were presented with a simulated navigation task comprising six scenarios. In each scenario, participants were presented with three routes along with the fuel consumption and travel time for each route. Their task was to choose the route that would get them to their destination as fast as possible while minimizing fuel consumption. Participants were assigned to one of five conditions: manual, decision support, consensual AI, monitored AI, or full automation. In the manual condition, participants chose among the three routes themselves. In the decision support condition, an expert system provided recommendations, with probabilities assigned to each route. In the consensual AI condition, the expert system highlighted the route with the highest probability. In the monitored AI condition, the system implemented its decision unless vetoed by the participant within 30 s. Finally, in the full automation condition, the system implemented its decision and advanced to the next scenario. The expert system was perfectly reliable in the first four scenarios; in the fifth and sixth scenarios, a system failure occurred and the participants had to perform the task manually. The SA of the participants prior to the system failure was evaluated using Level 1 and Level 2 SAGAT queries. Endsley and Kiris (1995) found that participants working with the fully automated navigation expert system took longer to perform the task manually following the failure of the automation than participants in the manual condition. The participants in the automated condition also had lower Level 2 SA (comprehension of the situation) than those in the manual condition prior to the automation failure. Participants assigned to the intermediate levels of automation had intermediate SA and exhibited intermediate levels of performance. Endsley and Kiris (1995) concluded that intermediate LOAs kept the operators in the decision-making loop, resulting in better SA compared to operators in the fully automated condition. Therefore, intermediate LOAs helped operators perform the task manually following automation failure.


Similarly, Kaber et al. (2000) demonstrated the benefits of intermediate LOAs in a simulated nuclear materials handling task using a telerobot. Participants were trained to perform the task using each of five LOAs: manual control (with action support), batch processing, decision support, supervisory control, and full automation. The experiment lasted 5 days. On day 1, participants were trained and tested on manual control of the robot; task completion times and the number of errors (i.e., collisions between the robot and task objects) were recorded. On days 2-5, participants were trained on each of the LOAs. Following training, their performance was assessed under normal conditions and under automation failures. Two automation failures occurred at random times during the experimental trials, and participants were informed in advance that the automation might fail. The SA of the participants was assessed using SAGAT queries. Kaber et al. (2000) found that under normal conditions, participants exhibited higher performance (i.e., faster task completion times) under high LOAs (i.e., supervisory control and full automation) than under intermediate LOAs. However, when the automation failed, participants were slower to respond under high LOAs than under intermediate LOAs. This was attributed to the higher SA of the participants under intermediate LOAs: intermediate LOAs kept the participants in the loop, resulting in faster recovery. More recently, Kaber and Endsley (2004) showed that intermediate LOAs facilitated higher SA even in a dual-task environment.

The benefits of intermediate LOAs have been documented in the air traffic control domain as well. For example, Kaber, Perry, Segall, McClernon, and Prinzel (2006) showed that participants had higher SA when adaptive automation (AA) was applied to lower LOAs. In their study, participants completed trials under a manual mode as well as under AA applied to information acquisition, information analysis, decision and action selection, and action implementation. In addition to the air traffic control task, participants performed a gauge monitoring task, which served as the secondary task. When performance on the secondary task fell below a certain threshold, control switched from manual to the automation mode assigned to that trial; likewise, when secondary task performance rose above the threshold, control shifted from automated back to manual. SA was measured by querying the participants about the aircraft considered relevant in the scenario (i.e., the aircraft with the highest priority for clearances). Kaber et al. (2006) found that individuals had higher Level 1 SA (i.e., accuracy on perception-based queries) when AA was applied to information acquisition than when it was applied to information analysis, decision making, or action implementation. However, the performance of the participants following automation failure was not examined in this study.
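The performance-triggered switching rule that Kaber et al. (2006) describe amounts to a simple control loop over the secondary-task score. The sketch below is an illustrative reconstruction; the two-threshold (hysteresis) form and the numeric values are my assumptions, not parameters reported in the study:

    def next_mode(current_mode, secondary_score, low=0.60, high=0.75):
        """Engage the trial's automation mode when secondary-task performance
        (a workload proxy) degrades; hand control back when it recovers."""
        if current_mode == "manual" and secondary_score < low:
            return "automated"
        if current_mode == "automated" and secondary_score > high:
            return "manual"
        return current_mode

    assert next_mode("manual", 0.55) == "automated"     # performance dropped
    assert next_mode("automated", 0.80) == "manual"     # performance recovered
    assert next_mode("automated", 0.70) == "automated"  # within band: no change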


The research reviewed so far shows that higher LOAs can lower operator SA and presumably hamper performance when automation fails. However, there is also research demonstrating the advantages of higher LOAs. This view is examined in the next section.

Benefits of Higher LOAs in Improving SA

Contradicting the research that advocates intermediate LOAs for promoting SA and improving operator performance, some studies advocate using higher LOAs to improve operator performance. For example, Eprath and Curry (1977) showed that individuals were faster and more accurate at detecting system failures during a simulated landing task when they were monitoring the autopilot than when they were in control. The poorer performance during manual control was attributed to the greater cognitive resources that manual control demanded, compared to the fully automated condition, which drew attention away from the critical sources of failure-related information. More recently, using a simulated spaceflight micro-world, Lorenz, Di Nocera, Röttger, and Parasuraman (2002) showed that participants working with high LOAs were faster at detecting system faults. This was attributed to their ability to engage in better information sampling, which helped them gain better awareness of the system status. High LOAs can also promote operator SA. For example, through an analysis of over 300 civilian accident reports, Jentsch, Barnett, Bowers, and Salas (1999) concluded that a loss of SA is more likely to occur when a pilot is flying than when the pilot is not flying; lower SA during active control was attributed to the demand of maintaining SA while simultaneously engaging in active flight control. Similarly, using a dynamic control task, Endsley and Kaber (1999) showed that SA was higher under high LOAs than under manual control, presumably because higher LOAs freed up cognitive resources.

In summary, two lines of research describe how SA is affected when operators work with automated systems. One line argues that fully automated systems are undesirable if humans are to be kept in the loop and advocates intermediate LOAs. The other argues that higher LOAs are more advantageous because they free up the cognitive resources that help operators maintain higher SA.

First Automation Failure Effect: Performance Following the First Automation Failure

The research presented so far has considered the performance consequences associated with the first automation failure. It is also important to examine the 'first automation failure effect', or how the performance of an operator changes following the first occurrence of an automation failure (e.g., Wickens & Xu, 2002). Work on first automation failure effects is examined next. Using a target detection task, Merlo, Wickens, and Yeh (1999) examined the performance consequences following the failure of an automated cueing aid. In their experiment, participants were instructed to search for targets in a hand-held or helmet-mounted display, classify each target as friend or enemy, and report the azimuth of targets. In some trials, an automated aid cued the location of targets on the display; the cueing aid was not fully reliable. Participants also performed a secondary task in conjunction with the primary task of identifying targets. Merlo et al. (1999) found that when the automated aid first failed, the target detection rate was only 50%; in subsequent failure trials, however, detection rates improved to 91%. Merlo et al. (1999) attributed this improvement following the first automation failure to a change in participants' search strategies. Prior to the failure of the cueing algorithm, participants exhibited overtrust in the aid and were immediately drawn to the search space that the aid cued, leading to attentional tunneling. Following the failure of the aid, participants calibrated their trust in the aid, which reduced their attentional tunneling, and they began paying more attention to the surrounding environment.


More recently, Rovira et al. (2007) examined how the performance of participants in a military decision-making task changed after exposure to the failure of information automation, low decision automation, medium decision automation, and high decision automation. They found that participants exhibited significant performance improvements in subsequent failure trials (compared to their performance after the first automation failure) for both medium and high decision automation. In summary, individuals tend to be more cautious after they experience an automation failure: instead of exhibiting overreliance on the automation, they adopt new cognitive strategies, which are reflected in their performance following the failure.

There is also a contradicting line of research suggesting that operator performance may not improve subsequent to an automation failure (e.g., Parasuraman et al., 1993; Parasuraman & Riley, 1997). For example, Kantowitz et al. (1997) found that drivers in an unfamiliar city, where their confidence in manual navigation was low, continued to rely on a navigation system even though it made errors. Similarly, Biros et al. (2004) showed that military decision makers continued to rely on an automated aid when their workload was high, despite the low trust ratings they assigned to the automation. Therefore, operator reliance on an automated aid following the first occurrence of an automation failure depends on operator self-confidence as well as workload.

In summary, there is research suggesting that operators' monitoring of automated systems improves following the first automation failure, presumably due to calibration of their trust in the automation and the adoption of new cognitive strategies. However, there is also research suggesting that operators continue to rely on the automation even when the aid makes errors, if their confidence in the aid exceeds their self-confidence or if their workload is high.


REFERENCES

Adams, M. J., Tenney, Y. J., & Pew, R. W. (1995). Situation awareness and the cognitive management of complex systems. Human Factors, 37, 85-104.

Andre, A. D., Wickens, C. D., Boorman, L., & Boschelli, M. M. (1991). Display formatting techniques for improving situation awareness in the aircraft cockpit. International Journal of Aviation Psychology, 1, 205-218.

Bedny, G. Z., Karwowski, W., & Jeng, O. (2004). The situational reflection of reality in activity theory and the concept of situation awareness in cognitive psychology. Theoretical Issues in Ergonomics Science, 5, 275-296.

Biros, D. P., Daly, M., & Gunsch, G. (2004). The influence of task load and automation trust on deception detection. Group Decision and Negotiation, 13, 173-189.

Billings, C. E. (1991). Human-centered aircraft automation: A concept and guidelines (NASA Tech. Memorandum 103885). Moffett Field, CA: NASA Ames Research Center.

Bisseret, A. (1970). Mémoire opérationelle et structure du travail [Operational memory and structure of work]. Bulletin de Psychologie, 24, 280-294.

Caron, J., Micko, H. C., & Thuring, M. (1988). Conjunctions and recall of composite sentences. Journal of Memory and Language, 29, 309-323.

Carpenter, P. A., Just, M. A., Keller, T. A., Eddy, W. F., & Thulborn, K. R. (1999). Time course of fMRI activation in language and spatial networks during sentence comprehension. NeuroImage, 10, 216-224.

Carreiras, M., Carriedo, N., Alonso, M. A., & Fernandez (1997). The role of verbal tense and verbal aspect in the foregrounding of information in reading. Memory and Cognition, 25, 438-446.

Carretta, T. R., Perry, D. C., & Ree, M. J. (1996). Prediction of situational awareness in F-15 pilots. International Journal of Aviation Psychology, 6, 21-41.

Craik, F. I. M., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 11, 671-684.

Craik, F. I. M., & Tulving, E. (1975). Depth of processing and the retention of words in episodic memory. Journal of Experimental Psychology: General, 104, 268-294.

Crocoll, W. M., & Coury, B. G. (1990). Status or recommendation: Selecting the type of information for decision aiding. In Proceedings of the Human Factors and Ergonomics Society 34th Annual Meeting (pp. 1524-1528). Orlando, FL: Human Factors and Ergonomics Society.

Coyne, P. (1994). Roadcraft: The police driver's handbook. London: HMSO.

de Vega, M. (1995). Backward updating of mental models during continuous reading of narratives. Journal of Experimental Psychology, 21, 373-385.

Doane, S. M., & Sohn, Y. W. (2004). Pilot ability to anticipate the consequences of flight actions as a function of expertise. Human Factors, 46, 92-103.

Dominguez, C. (1994). Can SA be defined? In M. Vidulich, C. Dominguez, E. Vogel, & G. McMillan (Eds.), Situation awareness: Papers and annotated bibliography (pp. 5-15). Wright-Patterson Air Force Base, OH: Air Force Systems Command.


Dutke, S., & Rinck, M. (2006). Predictability of locomotion: Effects on updating of spatial situation models during narrative comprehension. Memory and Cognition, 34, 1193-1205.

Durso, F. T., Bleckley, M. K., & Dattel, A. R. (2006). Does SA add to the predictive validity of cognitive tests? Human Factors, 48, 721-733.

Durso, F. T., & Dattel, A. R. (2004). SPAM: The real-time assessment of SA. In S. Banbury & S. Tremblay (Eds.), A cognitive approach to situation awareness: Theory and application (pp. 137-154). Aldershot, UK: Ashgate.

Durso, F. T., & Gronlund, S. D. (1999). Situation awareness. In F. T. Durso, R. S. Nickerson, R. W. Schvaneveldt, S. T. Dumais, D. S. Lindsay, & M. T. H. Chi (Eds.), Handbook of applied cognition (pp. 283-314). New York: Wiley.

Durso, F. T., Hackworth, C., Truitt, T. R., Crutchfield, J., Nikolic, D., & Manning, C. A. (1998). Situation awareness as a predictor of performance in en route air traffic controllers (DOT/FAA/AM-99/3). Washington, DC: Office of Aerospace Medicine, Federal Aviation Administration.

Durso, F. T., Rawson, K. A., & Girotto, S. (2007). Comprehension and situation awareness. In F. T. Durso, R. S. Nickerson, S. T. Dumais, S. Lewandowsky, & T. J. Perfect (Eds.), Handbook of applied cognition (2nd ed., pp. 163-193). Chichester, UK: Wiley.

Durso, F. T., & Sethumadhavan, A. (2008). Situation awareness: Understanding dynamic environments. Human Factors, 50, 442-448.

Durso, F. T., Truitt, T. R., Hackworth, C. A., Crutchfield, J. M., Nikolic, D., Moertl, P. M., Ohrt, D., & Manning, C. A. (1995). Expertise and chess: A pilot study comparing situation awareness methodologies. In D. J. Garland & M. R. Endsley (Eds.), Experimental analysis and measurement of situation awareness (pp. 295-304). Daytona Beach, FL: Embry-Riddle Aeronautical University Press.

Durso, F. T., Truitt, T. R., Hackworth, C., Crutchfield, J., & Manning, C. A. (1998). En route operational errors and situation awareness. International Journal of Aviation Psychology, 8, 177-194.

Dzindolet, M. T., Pierce, L. G., Beck, H. P., & Dawe, L. A. (2002). The perceived utility of human and automated aids in a visual detection task. Human Factors, 44, 79-94.

Ehrlich, K., & Johnson-Laird, P. N. (1982). Spatial descriptions and referential continuity. Journal of Verbal Learning and Verbal Behavior, 21, 296-306.

Endsley, M. R. (1987a). SAGAT: A methodology for the measurement of situation awareness (NOR DOC 87-83). Hawthorne, CA: Northrop Corporation.

Endsley, M. R. (1987b). The application of human factors to the development of expert systems for advanced cockpits. In Proceedings of the Human Factors and Ergonomics Society 31st Annual Meeting (pp. 1388-1392). Santa Monica, CA: Human Factors and Ergonomics Society.

Endsley, M. R. (1988). Design and evaluation for situation awareness enhancement. In Proceedings of the Human Factors Society 32nd Annual Meeting (pp. 97-101). Santa Monica, CA: Human Factors Society.

Endsley, M. R. (1990). Predictive utility of an objective measure of situation awareness. In Proceedings of the Human Factors Society 34th Annual Meeting (pp. 41-45). Santa Monica, CA: Human Factors Society.


Endsley, M. R. (1995a). Measurement of situation awareness in dynamic systems. Human Factors, 37, 65-84.

Endsley, M. R. (1995b). Toward a theory of situation awareness in dynamic environments. Human Factors, 37, 32-64.

Endsley, M. R. (1996). Automation and situation awareness. In R. Parasuraman & M. Mouloua (Eds.), Automation and human performance: Theory and applications (pp. 163-181). Mahwah, NJ: LEA.

Endsley, M. R. (2000). Direct measurement of situation awareness: Validity and use of SAGAT. In M. R. Endsley & D. J. Garland (Eds.), Situation awareness analysis and measurement (pp. 147-174). Mahwah, NJ: LEA.

Endsley, M. R. (2000). Theoretical underpinnings of situation awareness. In M. R. Endsley & D. J. Garland (Eds.), Situation awareness analysis and measurement (pp. 3-32). Mahwah, NJ: LEA.

Endsley, M. R., & Bolstad, C. A. (1994). Individual differences in pilot situation awareness. International Journal of Aviation Psychology, 4, 241-264.

Endsley, M. R., Bolte, B., & Jones, D. G. (2003). Designing for situation awareness: An approach to user-centered design. New York: Taylor & Francis.

Endsley, M. R., & Kaber, D. B. (1999). Level of automation effects on performance, situation awareness, and workload in a dynamic control task. Ergonomics, 42, 462-492.

Endsley, M. R., & Kiris, E. O. (1995). The out-of-the-loop performance problem and level of control in automation. Human Factors, 37, 381-394.

Endsley, M. R., Mogford, R. H., Allendoerfer, K. R., Synder, M. D., & Stein, E. S. (1997). Effect of free flight conditions on performance, workload, and situation awareness (DOT/FAA/CT-TN97/12). Atlantic City International Airport, NJ: Federal Aviation Administration William J. Hughes Technical Center.

Endsley, M. R., & Rodgers, M. D. (1998). Distribution of attention, situation awareness and workload in a passive air traffic control task: Implications for operational errors and automation. Air Traffic Control Quarterly, 6, 1-86.

Eprath, A. R., & Curry, R. E. (1977). Detection by pilots of system failures during instrument landings. IEEE Transactions on Systems, Man, and Cybernetics, 12, 841-848.

Fincher-Kiefer, R., & D'Agostino, P. R. (2004). The role of visuospatial resources in generating predictive and bridging inferences. Discourse Processes, 37, 205-224.

Fisher, D. L., & Tan, K. C. (1989). Visual displays: The highlighting paradox. Human Factors, 31, 17-30.

Flach, J. M. (1995). Situation awareness: Proceed with caution. Human Factors, 37, 149-157.

Fletcher, C. R., & Bloom, C. P. (1988). Causal reasoning in the comprehension of simple narrative texts. Journal of Memory and Language, 27, 235-244.

Foos, P. W., Mora, J. J., & Tkacz, S. (1994). Student study techniques and the generation effect. Journal of Educational Psychology, 86, 567-576.

Fracker, M. L. (1988). A theory of situation assessment: Implications for measuring situation awareness. In Proceedings of the Human Factors and Ergonomics Society 32nd Annual Meeting (pp. 102-106). Santa Monica, CA: Human Factors Society.


Funk, K., Lyall, B., Wilson, J., Vint, R., Miemcyzyk, M., Suroteguh, C., et al. (1999). Flight deck automation issues. International Journal of Aviation Psychology, 9, 125-138.

Gaba, D. M., Howard, S. K., & Small, S. D. (1995). Situation awareness in anesthesiology. Human Factors, 37, 20-31.

Gibson, J. J. (1962). Observations on active touch. Psychological Review, 69, 477-491.

Gibson, J. J. (1979). The ecological approach to visual perception. Boston, MA: Houghton Mifflin.

Glenberg, A. M., Meyer, M., & Lindem, K. (1987). Mental models contribute to foregrounding during text comprehension. Journal of Memory and Language, 26, 69-83.

Gronlund, S. D., Carlson, C. A., & Tower, D. (2007). Episodic memory. In F. T. Durso, R. S. Nickerson, S. T. Dumais, S. Lewandowsky, & T. J. Perfect (Eds.), Handbook of applied cognition (2nd ed., pp. 111-136). Chichester, UK: Wiley.

Gugerty, L. J. (1997). Situation awareness during driving: Explicit and implicit knowledge in dynamic spatial memory. Journal of Experimental Psychology: Applied, 1, 42-66.

Gugerty, L. J., & Tirre, W. C. (2000). Individual differences in situation awareness. In M. R. Endsley & D. J. Garland (Eds.), Situation awareness analysis and measurement (pp. 249-276). Mahwah, NJ: Erlbaum.

Harwood, K., Barnett, B., & Wickens, C. D. (1988). Situational awareness: A conceptual and methodological framework. In Proceedings of the 11th Symposium of Psychology in the Department of Defense, April.

Held, R., & Hein, A. (1963). Movement-produced stimulation in the development of visually guided behavior. Journal of Comparative and Physiological Psychology, 56, 872-876.

Horswill, M. S., & McKenna, F. P. (2004). Drivers' hazard perception ability: Situation awareness on the road. In S. Banbury & S. Tremblay (Eds.), A cognitive approach to situation awareness: Theory and application (pp. 155-175). Aldershot, UK: Ashgate.

Jeannot, E., Kelly, C., & Thompson, D. (2003). The development of situation awareness measures in ATM systems (HRS/HSP-005-REP-01). Brussels: EUROCONTROL.

Jentsch, F., Barnett, J., Bowers, C. A., & Salas, E. (1999). Who is flying this plane anyway? What mishaps tell us about crew member role assignment and crew situation awareness. Human Factors, 41, 1-14.

Jones, D. G. (2000). Subjective measures of situation awareness. In M. R. Endsley & D. J. Garland (Eds.), Situation awareness analysis and measurement (pp. 113-128). Mahwah, NJ: LEA.

Kaber, D., & Endsley, M. (2004). The effects of level of automation and adaptive automation on human performance, situation awareness and workload in a dynamic control task. Theoretical Issues in Ergonomics Science, 5, 113-153.

Kaber, D. B., Onal, E., & Endsley, M. R. (2000). Design of automation for telerobots and the effect on performance, operator situation awareness, and subjective workload. Human Factors and Ergonomics in Manufacturing, 10, 409-430.

159

Texas Tech University, Arathi Sethumadhavan, May 2009

air traffic control-related task. International Journal of Industrial Ergonomics, 36, 447-462. Kantowitz, B. H., Hanowski, R. J., & Kantowitz, S. C. (1997). Driver acceptance of unreliable traffic information in familiar and unfamiliar settings. Human Factors, 39, 164-176. Keefe, D. E., & McDaniel, M. A. (1993). The time course and durability of predictive inferences. Journal of Memory and Language, 32, 446-463. Kessel, C. J., & Wickens, C. D. (1982). The transfer of failure-detection skills between monitoring and controlling dynamic systems. Human Factors, 24, 49-60. Kintsch, W. (1992). A cogntiive architecture for comprehension. In H. L. Pick, P. Van den Broedk, & D. C. Knill (Eds.), The study of cognition: Conceptual and methodological issues (pp. 143-164). Washington, DC: American Pscyhological Association. Kintsch, W. (1993). Text comprehension, memory, and learning. American Psychologist, 49, 294-303. Kintsch, W. & van Dijk, T. A. (1978). Toward a model of text comprehension and production. Psychological Review, 85, 363-394. Klin, C. M., Murray, J. D., Levine, W. H., & Guzman, A. E. (1999). Forward inferences: From activation to long-term memory. Discourse Processes, 27, 241-260. Komeda, H., & Kusumi, T. (2006). The effect of a protagonist’s emotional shift on situation model construction. Memory and Cognition, 34, 1548-1556. Koriat, A., Pearlman-Avnion, S., & Ben-Zur, H. (1998). The subjective organization of input and output events in memory. Psychological Research, 61, 295-307.


Larish, J. F., & Andersen, G. J. (1995). Active control in interrupted dynamic spatial orientation: The detection of orientation change. Perception and Psychophysics, 57, 533-545.
Lee, J. D., & Moray, N. (1994). Trust, self-confidence, and operators' adaptation to automation. International Journal of Human-Computer Studies, 40, 153-184.
Lee, J. D., & See, K. A. (2004). Trust in automation: Designing for appropriate reliance. Human Factors, 46, 50-80.
Levine, W. H., & Klin, C. M. (2001). Tracking of spatial information in narratives. Memory and Cognition, 29, 327-335.
Linderholm, T. (2002). Predictive inference generation as a function of working memory capacity and causal text constraints. Discourse Processes, 34, 259-280.
Linderholm, T., Gernsbacher, M. A., van den Broek, P., Robertson, R. R. W., & Sundermier, B. (2004). Suppression of story character goals during reading. Discourse Processes, 37, 67-78.
Lorenz, B., Di Nocera, F., Röttger, S., & Parasuraman, R. (2002). Automated flight management in a simulated space flight microworld. Aviation, Space, and Environmental Medicine, 73, 886-897.
Lorenz, B., & Parasuraman, R. (2007). Automated and interactive real-time systems. In F. T. Durso, R. S. Nickerson, S. T. Dumais, S. Lewandowsky, & T. J. Perfect (Eds.), Handbook of applied cognition (2nd ed., pp. 415-441). Chichester: Wiley.
Lutz, M. F., & Radvansky, G. A. (1997). The fate of completed goal information in narrative comprehension. Journal of Memory and Language, 36, 293-310.


Magliano, J. P., Dijkstra, K., & Zwaan, R. A. (1996). Predictive inferences in movies. Discourse Processes, 22, 199-224.
Magliano, J. P., Miller, J., & Zwaan, R. A. (2001). Indexing space and time in film understanding. Applied Cognitive Psychology, 15, 533-545.
Magliano, J. P., Radvansky, G. A., & Copeland, D. E. (2007). Beyond language comprehension: Situation models as a form of autobiographical memory. In F. Schmalhofer & C. A. Perfetti (Eds.), Higher level language processes in the brain: Inference and comprehension processes (pp. 379-391). Mahwah, NJ: Erlbaum.
Magliano, J. P., Taylor, H. A., & Kim, H. J. J. (2005). When goals collide: Monitoring the goals of multiple characters. Memory and Cognition, 33, 1357-1367.
Magliano, J. P., Zwaan, R. A., & Graesser, A. C. (1999). The role of situational continuity in narrative understanding. In S. R. Goldman & H. van Oostendorp (Eds.), The construction of mental representations during reading (pp. 219-245). Mahwah, NJ: Erlbaum.
Matthews, M. D., Strater, L. D., & Endsley, M. R. (2004). Situation awareness requirements for infantry platoon leaders. Military Psychology, 16, 149-161.
McNamara, D. S., Kintsch, E., Songer, N. B., & Kintsch, W. (1996). Are good texts always better? Interactions of text coherence, background knowledge, and levels of understanding in learning from text. Cognition and Instruction, 14, 1-43.
Merlo, J. L., Wickens, C. D., & Yeh, M. (1999). Effect of reliability on cue effectiveness and display signaling (Tech. Report ARL-99-4/FED-LAB-99-3). Savoy: University of Illinois, Aviation Research Lab.


Metzger, U., & Parasuraman, R. (2001). The role of the air traffic controller in future air traffic management: An empirical study of active control vs. passive monitoring. Human Factors, 43, 519-528.
Metzger, U., & Parasuraman, R. (2005). Automation in future air traffic management: Effects of decision aid reliability on controller performance and mental workload. Human Factors, 47, 35-49.
Mogford, R. H., & Tansley, B. W. (1991). The importance of the air traffic controller's mental model. In Proceedings of the 24th Annual Conference of the Human Factors Association of Canada (pp. 135-140). Ontario, Canada: Human Factors Association of Canada.
Moray, N. (1986). Monitoring behavior and supervisory control. In K. Boff (Ed.), Handbook of perception and human performance (pp. 40/1-40/51). New York: Wiley.
Moray, N., Inagaki, T., & Itoh, M. (2000). Adaptive automation, trust, and self-confidence in fault management of time-critical tasks. Journal of Experimental Psychology: Applied, 6, 44-58.
Murray, J. D., & Burke, K. A. (2003). Activation and encoding of predictive inferences: The role of reading skill. Discourse Processes, 35, 81-102.
Myers, J. L., & Duffy, S. A. (1990). Causal inferences and text memory. In A. C. Graesser & G. H. Bower (Eds.), Inferences and text comprehension (pp. 159-173). New York: Academic Press.


Myers, J. L., O'Brien, E. J., Albrecht, J. E., & Mason, R. A. (1994). Maintaining global coherence during reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 20, 876-886.
Neal, A., Griffin, M., Paterson, J., & Bordia, P. (1998). Human factors issues: Performance management transition to a CNS/ATM environment (Final Report: Air Services Australia). Brisbane: University of Queensland.
Neisser, U. (1967). Cognitive psychology. New York: Appleton-Century-Crofts.

O'Brien, E. J., & Albrecht, J. E. (1992). Comprehension strategies in the development of a mental model. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 777-784.
Parasuraman, R. (2000). Designing automation for human use: Empirical studies and quantitative models. Ergonomics, 43, 931-951.
Parasuraman, R., Molloy, R., & Singh, I. L. (1993). Performance consequences of automation-induced 'complacency'. International Journal of Aviation Psychology, 3, 1-23.
Parasuraman, R., & Riley, V. (1997). Humans and automation: Use, misuse, disuse, and abuse. Human Factors, 39, 230-253.
Parasuraman, R., Sheridan, T. B., & Wickens, C. (2000). A model for types and levels of human interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics, 30, 286-297.
Radvansky, G. A., & Copeland, D. E. (2006). Walking through doorways causes forgetting: Situation models and experienced space. Memory and Cognition, 34, 1150-1156.


Rapp, D. N., & Taylor, H. A. (2004). Interactive dimensions in the construction of mental representations for text. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 988-1001.
Regal, D. M., Rogers, W. H., & Boucek, G. P. (1988). Situational awareness in the commercial flight deck: Definition, measurement, and enhancement. In Proceedings of the Seventh Aerospace Behavioral Technology Conference and Exposition (pp. 65-69). Warrendale, PA: Society of Automotive Engineers.
Rinck, M. (2005). Spatial situation models. In P. Shah & A. Miyake (Eds.), The Cambridge handbook of visuospatial thinking (pp. 334-382). New York, NY: Cambridge University Press.
Rinck, M., & Bower, G. (2000). Temporal and spatial distance in situation models. Memory and Cognition, 28, 1310-1320.
Rinck, M., Gamez, E., Diaz, J. M., & de Vega, M. (2003). Processing of temporal information: Evidence from eye movements. Memory and Cognition, 31, 77-86.
Rinck, M., Hahnel, A., & Becker, G. (2001). Using temporal information to construct, update, and retrieve situation models of narratives. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 67-80.
Rinck, M., Hahnel, A., Bower, G. H., & Glowalla, U. (1997). The metrics of spatial situation models. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23, 622-637.
Rinck, M., & Weber, U. (2003). Who when where: An experimental test of the event-indexing model. Memory and Cognition, 31, 1284-1292.


Rinck, M., Williams, P., Bower, G. H., & Becker, E. S. (1996). Spatial situation models and narrative understanding: Some generalizations and extensions. Discourse Processes, 21, 23-55.
Rovira, E., McGarry, K., & Parasuraman, R. (2007). Effects of imperfect automation on decision making in a simulated command and control task. Human Factors, 49, 76-87.
Ruff, H. A., Draper, M. H., Lu, L. G., Poole, M. R., & Repperger, D. W. (2000). Haptic feedback as a supplemental method of alerting UAV operators to the onset of turbulence. In Proceedings of the IEA 2000/HFES 2000 Congress (pp. 41-44).
Sarter, N. B., & Schroeder, B. (2001). Supporting decision making and action selection under time pressure and uncertainty: The case of in-flight icing. Human Factors, 43, 573-583.
Sarter, N. B., & Woods, D. D. (1991). Situation awareness: A critical but ill-defined phenomenon. International Journal of Aviation Psychology, 1, 45-57.
Sarter, N. B., & Woods, D. D. (1995). How in the world did we ever get into that mode? Mode error and awareness in supervisory control. Human Factors, 37, 5-19.
Scott Rich, S., & Taylor, H. A. (2000). Not all narrative shifts function equally. Memory and Cognition, 28, 1257-1266.
Shappell, S. A., & Wiegmann, D. A. (2000). The human factors analysis and classification system - HFACS (DOT/FAA/AM-00/7). Washington, DC: Office of Aerospace Medicine, Federal Aviation Administration.
Sheridan, T. B., & Parasuraman, R. (2006). Human-automation interaction. Reviews of Human Factors and Ergonomics, 1, 89-129.


Sheridan, T. B., & Verplank, W. L. (1978). Human and computer control of undersea teleoperators. Cambridge, MA: MIT Man-Machine Laboratory.
Singer, M., & Halldorson, M. (1996). Constructing and validating motive bridging inferences. Cognitive Psychology, 30, 1-38.
Singer, M., Halldorson, M., Lear, J. C., & Andrusiak, P. (1992). Validation of causal bridging inferences in discourse understanding. Journal of Memory and Language, 31, 507-524.
Smith, K., & Hancock, P. A. (1995). Situation awareness is adaptive, externally directed consciousness. Human Factors, 37, 137-148.
Sohn, Y. W., & Doane, S. M. (2004). Memory processes of flight situation awareness: Interactive roles of working memory capacity, long-term working memory, and expertise. Human Factors, 46, 461-475.
Speer, N. K., & Zacks, J. M. (2005). Temporal changes as event boundaries: Processing and memory consequences of narrative time shifts. Journal of Memory and Language, 53, 125-140.
Taylor, R. M. (1990). Situational awareness rating technique (SART): The development of a tool for aircrew systems design. In Situational Awareness in Aerospace Operations (AGARD-CP-478; pp. 3/1-3/17). Neuilly-sur-Seine, France: NATO-AGARD.
Tenney, Y. J., & Pew, R. W. (2006). Situation awareness catches on: What? So what? Now what? Reviews of Human Factors and Ergonomics, 1, 1-35.
Tolk, J. D., & Keether, G. A. (1982). Advanced medium-range air-to-air missile (AMRAAM) operational evaluation (OUE) final report. Kirtland Air Force Base, NM: Air Force Test and Evaluation Center.


Therriault, D. J., & Rinck, M. (2007). Multidimensional situation models. In F. Schmalhofer & C. A. Perfetti (Eds.), Higher level language processes in the brain: Inference and comprehension processes (pp. 311-327). Mahwah, NJ: Erlbaum.
Therriault, D. J., Rinck, M., & Zwaan, R. A. (2006). Assessing the influence of dimensional focus during situation model construction. Memory and Cognition, 34, 78-89.
Trabasso, T., & Wiley, J. (2005). Goal plans of action and inferences during comprehension of narratives. Discourse Processes, 39, 129-164.
van den Broek, P., Young, M., Tzeng, Y., & Linderholm, T. (1999). The landscape model of reading: Inferences and the on-line construction of a memory representation. In R. F. Lorch Jr. & E. J. O'Brien (Eds.), Sources of coherence in text comprehension (pp. 353-373). Hillsdale, NJ: Erlbaum.
Whitaker, L. A., & Klein, G. A. (1988). Situation awareness in the virtual world: Situation assessment report. In Proceedings of the 11th Symposium of Psychology in the Department of Defense, April.
Wickens, C. D., & Kessel, C. (1979). The effects of participatory mode and task workload on the detection of dynamic system failures. IEEE Transactions on Systems, Man, and Cybernetics, 1, 23-34.
Wickens, C. D., & Xu, X. (2002). Automation trust, reliability and attention (Tech. Rep. AHFD-02-14/MAAD-02-2). Savoy: University of Illinois, Aviation Research Lab.
Wiegmann, D. A., Rich, A., & Zhang, H. (2001). Automated diagnostic aids: The effects of aid reliability on users' trust and reliance. Theoretical Issues in Ergonomics Science, 2, 352-367.


Wiener, E. L., & Curry, R. E. (1980). Flight deck automation: Promises and problems. Ergonomics, 23, 995-1011.
Willems, B., & Truitt, T. R. (1999). Implications of reduced involvement in en route air traffic control (DOT/FAA/CT-TN-99/22). Atlantic City, NJ: Federal Aviation Administration Technical Center.
Williams, K. W. (2004). A summary of unmanned aircraft accident/incident data: Human factors implications (DOT/FAA/AM-04/24). Washington, DC: Office of Aerospace Medicine, Federal Aviation Administration.
Woodhouse, R., & Woodhouse, R. A. (1995). Navigation errors in relation to controlled flight into terrain (CFIT) accidents. 8th International Symposium on Aviation Psychology, Columbus, OH, April.
Yanco, H. A., & Drury, J. (2004). "Where am I?" Acquiring situation awareness using a remote robot platform. In Proceedings of the IEEE International Conference on Systems, Man and Cybernetics (pp. 2835-2840). Arlington, VA.
Yeh, M., & Wickens, C. D. (2001). Display signaling in augmented reality: Effects of cue reliability and image realism on attention allocation and trust calibration. Human Factors, 43, 355-365.
Yeh, M., Wickens, C. D., & Seagull, F. J. (1999). Target cueing in visual search: The effects of conformity and display location on the allocation of visual attention. Human Factors, 41, 524-542.
Young, L. R. (1969). On adaptive manual control. IEEE Transactions on Man-Machine Systems, 10, 292-331.


Zwaan, R. A. (1996). Processing narrative time shifts. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22, 1196-1207.
Zwaan, R. A., Langston, M. C., & Graesser, A. C. (1995). The construction of situation models in narrative comprehension: The event-indexing model. Psychological Science, 6, 292-297.
Zwaan, R. A., Magliano, J. P., & Graesser, A. C. (1995). Dimensions of situation model construction in narrative comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 386-397.
Zwaan, R. A., & Radvansky, G. A. (1998). Situation models in language comprehension and memory. Psychological Bulletin, 123, 162-185.
Zwaan, R. A., Radvansky, G. A., Hilliard, A. E., & Curiel, J. M. (1998). Constructing multidimensional situation models during reading. Scientific Studies of Reading, 2, 199-220.


APPENDIX B

EXTENDED RESULTS AND DISCUSSION

The purpose of the current study was to examine the differences in the SA and performance of individuals when they worked with different levels of a conflict detection aid to control traffic. The demographic characteristics of the participants are presented in Table 1.

Table 1. Demographic characteristics of the participants

                                  Information   Information   Decision and       Action
                                  Acquisition   Analysis      Action Selection   Implementation
Mean age (SD in parentheses)      20 (2.90)     21 (3.83)     22 (3.69)          21 (3.78)
Male participants                 6             10            6                  5
Female participants               12            8             12                 13
Graduate students                 4             3             5                  4
Undergraduate students            13            14            13                 13
Working professionals             1             1             0                  1
Video gaming experience
  Do not currently play           6             3             4                  6
  Play once a year                0             1             2                  1
  Play once in 6 months           1             1             5                  2
  Play once a month               7             9             4                  3
  Play weekly                     3             4             3                  6
  Play daily                      1             0             0                  0

Manual Assessment

Before being assigned to the various LOAs, participants were evaluated on their performance in two manual scenarios. The purpose of this manual assessment was to determine whether the participants were equivalent in their ability to control traffic manually before they were assigned to the LOAs. One-way analyses of variance (ANOVAs) with LOA as the between-subjects variable were performed on the ATC task performance variables (advance notification time, rule violations, and handoff delay) and the secondary task performance variables (hit-to-signal ratio, reaction time to hits) for the two manual test scenarios.


Effect of LOA on ATC Performance

Six one-way ANOVAs with LOA as the between-subjects variable were performed on advance notification time, number of rule violations, and handoff delay for the first and second manual test scenarios. A Bonferroni adjustment (e.g., Tabachnick & Fidell, 2001) was applied to account for multiple testing (adjusted significance level was .008).

Advance Notification Time

Both manual test scenarios involved a total of four planned collisions. Advance notification times were averaged across the four collisions for each scenario. There was no significant effect of LOA on advance notification time for the first manual test scenario, F(3, 68) = 2.358, p > .05, ηp2 = .094. Similarly, there was no significant effect of LOA on advance notification time for the second manual test scenario, F(3, 68) = 4.084, p > .01, ηp2 = .153.

Rule Violations

The effect of LOA on the number of rule violations failed to reach significance for the first manual test scenario, F(3, 68) = .510, p > .05, ηp2 = .022. Likewise, in the second manual test scenario there was no significant effect of LOA on the number of rule violations, F(3, 68) = .536, p > .05, ηp2 = .023.

Handoff Delay

The effect of LOA on handoff delay failed to reach significance for the first manual test scenario, F(3, 68) = 2.191, p > .05, ηp2 = .088. Similarly, in the second manual test scenario there was no significant effect of LOA on handoff delay, F(3, 68) = 2.493, p > .05, ηp2 = .099.


In summary, the participants assigned to the various LOAs did not differ in their performance on the ATC task when they manually controlled traffic.

Effect of LOA on Secondary Task Performance

Hit-to-signal Ratio

The effect of LOA on the hit-to-signal ratio failed to reach significance for the first manual test scenario, F(3, 68) = 1.609, p > .05, ηp2 = .066. Similarly, there was no significant effect of LOA on the hit-to-signal ratio for the second test scenario, F(3, 68) = 2.085, p > .05, ηp2 = .084.

Reaction Time to Hits

There was no significant effect of LOA on the reaction time to hits for the first manual test scenario, F(3, 68) = .390, p > .05, ηp2 = .017. Likewise, there was no significant effect of LOA on the reaction time to hits for the second manual test scenario, F(3, 68) = .183, p > .05, ηp2 = .008.

In summary, there were no differences in secondary task performance between the participants assigned to the various LOAs while they manually controlled traffic. Thus, participants assigned to the various LOAs did not differ in their manual skills on either the ATC task or the secondary task, and any differences in ATC and secondary task performance that arise when the participants work with the different LOAs can be attributed to their overreliance on the automated aids provided to them rather than to a difference in manual skills.
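The manual-assessment check above is straightforward to reproduce. Below is a minimal sketch of the six one-way ANOVAs with a Bonferroni-adjusted alpha; the data file and column names (manual_assessment.csv, loa, scenario) are hypothetical, since the study's raw data are not reproduced here.

```python
# Sketch: one-way ANOVAs across the four LOA groups with a Bonferroni-adjusted
# alpha, as described above. File and column names are hypothetical.
import pandas as pd
from scipy.stats import f_oneway

df = pd.read_csv("manual_assessment.csv")  # one row per participant per scenario
measures = ["advance_notification", "rule_violations", "handoff_delay"]
alpha = 0.05 / 6  # six ANOVAs (three measures x two scenarios) -> .008

for scenario in (1, 2):
    sub = df[df["scenario"] == scenario]
    for measure in measures:
        # Split the measure into one array per LOA group
        groups = [g[measure].to_numpy() for _, g in sub.groupby("loa")]
        f_stat, p_val = f_oneway(*groups)
        verdict = "significant" if p_val < alpha else "ns"
        print(f"scenario {scenario}, {measure}: F = {f_stat:.3f}, {verdict}")
```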


LOA Assessment

The dependent variables of interest in this study were the ATC task performance variables (advance notification time following automation failure, rule violations, and handoff delay), SA, meta-SA, secondary task performance (hit-to-signal ratio, reaction time to hits, and false alarms), and subjective rating of workload. The descriptive statistics are shown in Table 2.


Table 2. Descriptive statistics for the ATC performance variables, SA variables, meta-SA, secondary task performance, and subjective workload for all automation conditions for the test scenario. Means and standard deviations (in parentheses) are shown.

                                          Information      Information      Decision and       Action
                                          Acquisition      Analysis         Action Selection   Implementation
ATC performance variables
  Advance notification time following
    automation failure (in s)             55.261 (15.078)  28.105 (25.870)  16.436 (20.239)    14.170 (15.662)
  Number of rule violations               .500 (.786)      1.111 (1.450)    .167 (.383)        1.111 (1.779)
  Handoff delay (in s)                    23.722 (12.960)  26.833 (14.051)  41.222 (20.916)    24.500 (17.521)
SA (proportion accuracy)
  Call sign recall                        .072 (.141)      .078 (.148)      .056 (.125)        .022 (.042)
  Destination recall                      .572 (.193)      .444 (.161)      .422 (.190)        .383 (.238)
  Location recall                         .472 (.136)      .344 (.146)      .350 (.165)        .367 (.150)
  Heading recall                          .750 (.179)      .600 (.161)      .661 (.120)        .678 (.226)
  Altitude recall                         .706 (.234)      .517 (.172)      .589 (.181)        .467 (.243)
  Total SA                                .514 (.101)      .397 (.085)      .416 (.085)        .383 (.125)
Meta-SA
  Meta-memory for call sign               1.111 (1.811)    1.611 (2.090)    1.889 (2.111)      2.222 (2.579)
  Meta-memory for destination             5.722 (2.444)    5.389 (2.200)    5.278 (2.396)      5.389 (1.720)
  Meta-memory for location                5.056 (1.552)    5.889 (1.967)    5.333 (1.847)      6.056 (2.127)
  Meta-memory for heading                 5.778 (2.102)    5.722 (2.761)    5.889 (2.632)      6.056 (2.014)
  Meta-memory for altitude                7.611 (1.195)    5.611 (2.893)    6.333 (2.086)      5.667 (2.142)
Secondary task performance
  Hit-to-signal ratio                     .818 (.103)      .906 (.070)      .933 (.047)        .939 (.032)
  RT to hits (in s)                       1.197 (.128)     1.071 (.149)     1.083 (.112)       1.123 (.135)
  Proportion of false alarms              .025 (.017)      .039 (.037)      .030 (.020)        .027 (.017)
Subjective workload rating                68.185 (9.488)   64.546 (13.174)  63.861 (14.649)    61.565 (7.096)

Effect of LOA on ATC Performance

Three one-way ANOVAs with LOA as the between-subjects variable were performed on advance notification time following automation failure, number of rule violations, and handoff delay. A Bonferroni adjustment (e.g., Tabachnick & Fidell, 2001) was applied to account for multiple testing (adjusted significance level was .017). Throughout the results section, effects that reached traditional, uncorrected levels of significance are reported as marginal.


Advance Notification Time

A one-way ANOVA with LOA as the between-subjects variable was performed on advance notification time following automation failure. Based on the active-processing hypothesis, it was expected that participants assigned to the information acquisition condition would have a higher advance notification time following automation failure in comparison to those in the high LOAs. On the other hand, based on the free cognitive resources hypothesis, it was expected that participants in the information acquisition condition would have a lower advance notification time following automation failure compared to those in the high LOAs.

The effect of LOA on advance notification time was significant, F(3, 68) = 16.507, p < .001, ηp2 = .421. Tukey's HSD analysis was conducted on the mean advance notification time to identify the LOAs that were significantly different from each other. The results revealed that, following the automation failure, individuals working with information acquisition automation were significantly faster in detecting the collision than individuals working with all the other LOAs. No significant differences in advance notification time were observed among individuals working with the information analysis automation (95% CI = 18.842, 37.368), decision automation (95% CI = 7.173, 25.699), and action implementation automation (95% CI = 4.907, 23.433). The mean advance notification times for information acquisition automation, information analysis automation, decision and action selection automation, and action implementation automation were 55.261 s (SD = 15.078), 28.105 s (SD = 25.871), 16.436 s (SD = 20.239), and 14.170 s (SD = 15.662), respectively (see Figure 1).


Figure 1. Effect of LOA on advance notification time following automation failure. The ideal advance notification time was 70 s. Error bars indicate ±1 standard error of the mean.
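For readers who wish to replicate the post-hoc comparison reported above, the sketch below runs Tukey's HSD on advance notification time using statsmodels; the data frame and column names are assumptions for illustration, not the study's actual analysis scripts.

```python
# Sketch: Tukey's HSD on advance notification time across the four LOA groups.
# Column names (advance_notification, loa) are hypothetical.
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

df = pd.read_csv("test_scenario.csv")  # one row per participant
result = pairwise_tukeyhsd(endog=df["advance_notification"],
                           groups=df["loa"], alpha=0.05)
print(result)  # pairwise mean differences with adjusted p values and 95% CIs
```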

In summary, when the automation failed, individuals working with information acquisition automation detected the upcoming collision earlier than those working with the other LOAs. Specifically, individuals in the information acquisition condition detected the upcoming collision approximately 27 s, 38 s, and 41 s earlier than those in the information analysis, decision and action selection, and action implementation conditions, respectively. Participants in the information acquisition condition were actively involved in making predictions about potential collisions, generating their own decisions on how to avoid the collisions, and implementing their decisions, contributing to better performance following automation failure compared to the other LOAs.


Thus, automation of sensory processing had benefits over automation of prediction generation, decision making, and execution of automation-generated predictions and decisions. This result confirmed the active-processing hypothesis, indicating that decreasing the level of operator involvement can prove hazardous if the automation reliability is not 100%. There is a plethora of research demonstrating the advantages of active processing. For example, Craik and Tulving (1975) showed that deeper levels of processing improve the memorability of information. Similarly, research on student learning has shown that students who generated their own questions after studying a text performed better on an upcoming test than those who received the questions from the experimenter (Foos, Mora, & Tkacz, 1994). The disadvantages of passive involvement have been demonstrated in safety-critical domains as well (e.g., Gugerty, 1997; Young, 1969; Wickens & Kessel, 1979). For example, Endsley and Kiris (1995) showed that individuals working with a fully automated navigation expert system took longer to perform the task manually following the automation failure compared to those who had performed the task manually.

Rule Violations

A one-way ANOVA with LOA as the between-subjects variable was performed on rule violations. Based on the active-processing hypothesis, it was expected that participants assigned to the information acquisition condition would have fewer rule violations in comparison to those in the high LOAs. On the other hand, based on the free cognitive resources hypothesis, it was expected that participants in the information acquisition condition would have more rule violations compared to those in the high LOAs.


However, the effect of LOA on the total number of rule violations failed to reach significance, F(3, 68) = 2.628, p > .05, ηp2 = .104. This means that none of the automated aids appeared to benefit individuals in tasks such as landing and exiting aircraft at the right speed and altitude.

Handoff Delay

A one-way ANOVA with LOA as the between-subjects variable was performed on handoff delay. Based on the active-processing hypothesis, it was expected that participants assigned to the information acquisition condition would have a lower handoff delay in comparison to those in the high LOAs. On the other hand, based on the free cognitive resources hypothesis, it was expected that participants in the information acquisition condition would have a higher handoff delay compared to those in the high LOAs. There was a significant effect of LOA on handoff delay, F(3, 68) = 4.372, p < .01, ηp2 = .162. The mean handoff delays for information acquisition automation, information analysis automation, decision and action selection automation, and action implementation automation were 23.722 s (SD = 12.960), 26.833 s (SD = 14.051), 41.222 s (SD = 20.916), and 24.500 s (SD = 17.521), respectively (see Figure 2). Tukey's HSD analysis was conducted on the mean handoff delay. Results indicated that individuals working with information acquisition automation were faster in accepting planes into their airspace compared to those working with the decision automation. The post-hoc analysis also revealed that individuals working with the action implementation automation were faster in accepting aircraft into their airspace than those working with the decision automation.


These results were not in full agreement with either the active-processing hypothesis or the free cognitive resources hypothesis. Thus, it would be safe to conclude that providing automated aids did not appear to help individuals accept planes more quickly into their airspace.

Figure 2. Effect of LOA on handoff delay (in seconds). Error bars indicate ±1 standard error of the mean.

SA

The overall proportion accuracy score for call sign was obtained by averaging the proportion accuracy scores for call sign obtained during the two SAGAT freezes. The overall proportion accuracy scores for destination, altitude, and heading were obtained in the same manner. To compute the proportion accuracy score for aircraft location, the distance error (in cm) between each aircraft's reported and actual location was computed. If this distance error exceeded the "5-mile" distance (5 miles = 0.8 cm) in the ATST airspace, or the aircraft was positioned in a wrong segment of the airspace, the response was recorded as incorrect. The proportion accuracy scores for call sign, location, altitude, heading, and destination obtained during the two SAGAT freezes were averaged to compute the total SA score.
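The location-scoring rule lends itself to a compact implementation. The sketch below applies the 0.8-cm (5-mile) criterion and the segment check described above; the function and field names are hypothetical.

```python
# Sketch: scoring a SAGAT location response against the 5-mile (0.8 cm)
# criterion and the airspace-segment check described above.
import math

FIVE_MILES_CM = 0.8  # 5 miles rendered as 0.8 cm on the ATST display

def location_correct(reported_xy, actual_xy, reported_seg, actual_seg):
    """Return True if the reported position counts as correct."""
    error_cm = math.hypot(reported_xy[0] - actual_xy[0],
                          reported_xy[1] - actual_xy[1])
    return error_cm <= FIVE_MILES_CM and reported_seg == actual_seg

def location_accuracy(responses):
    """Proportion accuracy over (reported_xy, actual_xy, rep_seg, act_seg) tuples."""
    return sum(location_correct(*r) for r in responses) / len(responses)
```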


Effect of LOA on Total SA

A one-way ANOVA with LOA as the between-subjects variable was performed on the total SA score. Based on the active-processing hypothesis, it was expected that participants assigned to the information acquisition condition would have higher SA than those in the high LOAs. On the other hand, based on the free cognitive resources hypothesis, it was expected that participants in the information acquisition condition would have lower SA compared to those in the high LOAs. The main effect of LOA on the total SA score was significant, F(3, 68) = 6.304, p < .01, ηp2 = .218. Tukey's HSD analysis on the total SA score showed that individuals working with information acquisition automation had higher overall SA compared to those working with all the other LOAs. The mean proportion accuracy scores for total SA for the information acquisition automation, information analysis automation, decision and action selection automation, and action implementation automation were .514 (SD = .101), .397 (SD = .085), .416 (SD = .085), and .383 (SD = .125), respectively (see Figure 3).


Figure 3. Effect of LOA on total SA. Error bars indicate ±1 standard error of the mean.

In summary, individuals assigned to the information acquisition automation condition had higher total SA in comparison to the other automation conditions. They were also faster in detecting the collision following automation failure compared to the individuals in the other LOAs. These results were consistent with the active-processing hypothesis, indicating that decreasing the level of operator involvement by automating higher order stages of information processing, such as generating inferences, making decisions, and executing automation-generated decisions, can lower SA and leave the operator out of the loop.


Total SA as a Mediator between LOA and Performance Following Automation Failure

The methods developed by Baron and Kenny (1986) were used to determine whether the relationship between LOA and advance notification time following automation failure was mediated by total SA. For SA to serve as a mediator in the association between LOA and advance notification time, the following conditions should be met: (1) LOA should be associated with SA, (2) SA should be associated with advance notification time, (3) LOA should be associated with advance notification time, and (4) the association between LOA and advance notification time should decrease when SA is controlled (see Figure 4). Four linear regression analyses were conducted to test for mediation. First, a regression analysis was conducted with LOA predicting total SA. The results showed that total SA was predicted by LOA (β = -.379, p < .01). Second, a regression analysis was performed with total SA predicting advance notification time. The results revealed that advance notification time was predicted by total SA (β = .488, p < .001). Third, a regression analysis was conducted with LOA predicting advance notification time. Results indicated that advance notification time was predicted by LOA (β = -.600, p < .001). Fourth, a regression analysis was performed with LOA and total SA predicting advance notification time. The results revealed that advance notification time was predicted by both LOA (β = -.484, p < .001) and total SA (β = .304, p < .01).

To determine whether total SA mediated the relationship between LOA and advance notification time following automation failure, Sobel's (1982) test was conducted. The following statistics were obtained: the unstandardized regression coefficient for the association between LOA and total SA and its standard error, as well as the unstandardized regression coefficient for the association between total SA and advance notification time, adjusted for LOA, and its standard error.


The Sobel test revealed a significant indirect effect of total SA as a mediator between LOA and advance notification time following automation failure (z = -2.287, p < .05). In summary, pre-automation-failure total SA mediated the relationship between LOA and advance notification time following automation failure, with higher SA contributing to earlier detection of the collision after the automation failed. This finding is important to the main purpose of the study: it demonstrates that SA predicts performance when operators interact with automated systems, with lower SA resulting in poor performance when the automation fails.

Figure 4. Mediation model of the association between LOA and advance notification time following automation failure as mediated by total SA. Standardized regression coefficients: LOA to pre-automation-failure SA, β = -.379**; SA to advance notification time, β = .488***; LOA to advance notification time, β = -.600*** without mediation and β = -.484*** with mediation. ***p < .001, **p < .01.
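The Baron and Kenny (1986) steps and the Sobel (1982) test can be expressed in a few regressions. The sketch below treats LOA as a numeric 1-4 predictor, mirroring the regression analyses above; the file and variable names are hypothetical.

```python
# Sketch: Baron & Kenny mediation steps plus a Sobel z for the indirect effect.
# LOA is coded 1-4; file and column names are hypothetical.
import math
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("test_scenario.csv")  # columns: loa, total_sa, advance_time

step1 = smf.ols("total_sa ~ loa", data=df).fit()           # LOA -> SA (path a)
step2 = smf.ols("advance_time ~ total_sa", data=df).fit()  # SA -> outcome
step3 = smf.ols("advance_time ~ loa", data=df).fit()       # LOA -> outcome (path c)
step4 = smf.ols("advance_time ~ loa + total_sa", data=df).fit()  # paths c' and b

a, se_a = step1.params["loa"], step1.bse["loa"]
b, se_b = step4.params["total_sa"], step4.bse["total_sa"]
sobel_z = (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
print(f"Sobel z = {sobel_z:.3f}")
```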


Aggregating the recall accuracy scores of all the attributes to obtain an overall SA score does not reveal how individuals come to comprehend ATC situations. Therefore, the extent to which participants working with the different LOAs monitored the different aircraft attributes was examined.

Effect of LOA on SA Variables

A one-way between-subjects multivariate analysis of variance (MANOVA) was performed on five dependent variables: call sign, destination, location, altitude, and heading. The independent variable was LOA. With the use of Wilks's criterion, the dependent variables were significantly affected by LOA, F(15, 177.077) = 2.433, p < .01, ηp2 = .158. Univariate analyses showed that there was an effect of LOA on altitude recall, F(3, 68) = 4.398, p < .01, ηp2 = .162. Tukey's HSD analysis on the mean altitude recall showed that participants in the information acquisition condition (M = .706, SD = .234) recalled altitude better than those in the information analysis (M = .517, SD = .172) and action implementation conditions (M = .467, SD = .243; see Figure 5). Note that the participants in the information acquisition condition were provided with a color coding aid that cued aircraft altitude using color. Color coding would have helped the participants in this LOA to preattentively process altitude (e.g., Johnston et al., 1993; Treisman, 1986), leading to superior recall of this attribute.
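A one-way MANOVA of this form can be run directly in statsmodels, which reports Wilks' lambda among its test statistics. The sketch below assumes a data frame with one recall score per attribute; all names are hypothetical.

```python
# Sketch: one-way MANOVA with the five recall scores as dependent variables
# and LOA as the grouping factor. Column names are hypothetical.
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

df = pd.read_csv("sagat_recall.csv")  # one row per participant
manova = MANOVA.from_formula(
    "callsign + destination + location + heading + altitude ~ C(loa)", data=df)
print(manova.mv_test())  # includes Wilks' lambda and its F approximation
```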


Figure 5. Effect of LOA on altitude recall. Error bars indicate ±1 standard error of the mean.

In summary, individuals working with the information acquisition automation had higher awareness of the altitude of the aircraft in their airspace in comparison to individuals working with the other LOAs. This finding provided partial support for the unequal attribute relevance hypothesis. Aircraft altitude helps controllers anticipate collisions between aircraft pairs and is monitored to a greater extent by controllers (e.g., Means et al., 1998; Mogford, 1997). Surprisingly, the participants in the various LOAs did not differ in the recall of the other spatial attributes, namely aircraft location and heading. This could be because location and heading are represented graphically on the screen and hence might be easier to remember irrespective of the LOA to which an individual is assigned.


Aircraft altitude, on the other hand, consists of numeric information, which needs to be monitored to be remembered. This could explain why participants in the information acquisition condition, who were actively involved in detecting collisions, making decisions, and resolving collisions with minimal automation support, remembered this attribute better than those in the other LOAs. Univariate analyses also showed that there was an effect of LOA on destination recall, F(3, 68) = 3.081, p < .05, ηp2 = .120. Tukey's HSD analysis on the mean destination recall revealed that participants in the information acquisition condition (M = .572, SD = .193) recalled aircraft destination better than those in the action implementation condition (M = .383, SD = .238; see Figure 6).


Figure 6. Effect of LOA on destination recall. Error bars indicate ±1 standard error of the mean.

Thus, the LOAs differed only in the monitoring of two dimensions: altitude and destination. There is also evidence that these attributes are monitored the most by controllers. For example, Durso, Batsakes, Crutchfield, Braden, and Manning (2004) found that when controllers were asked to prioritize the strip markings that they made on flight progress strips, they judged altitude and route issued as the most frequent and critical strip markings. There was no difference between the LOAs in the recall of aircraft call sign, location, and heading. The individuals assigned to the different LOAs did not differ in their recall of location and heading, possibly because these attributes are presented graphically on the screen and are hence easier to monitor irrespective of the LOA with which one is working.


The individuals working with the different LOAs also did not differ in the recall of the call sign attribute. This is because call sign needs to be remembered only on a need-to-know basis (e.g., Gronlund et al., 1998). The finding that individuals assigned to the different LOAs differed only in the recall of certain aircraft attributes is relevant to the purpose of the study. Considering all the aircraft attributes as equal reveals nothing about how controllers come to comprehend situations. To understand the process of SA in the ATC domain, it is important to examine separately the extent to which individuals monitor the various aircraft attributes.

Results of binomial probabilities. The LOAs differed in the recall of aircraft altitude. A guessing score was computed for altitude in order to determine whether the participants were guessing the aircraft altitude during the SAGAT freezes. An aircraft could fly at altitude 1, 2, or 3; hence, the guessing score for altitude was .33. Binomial probabilities were calculated to determine whether the proportion accuracy of altitude recall for each LOA was significantly higher than the guessing score. As shown in Table 3, the overall accuracy of altitude recall for each LOA was significantly higher than the guessing score (p < .00001).


Table 3. Results from the binomial probability calculations

LOA                            Proportion accuracy      Accuracy of response
                               for altitude recall
Information acquisition        .706                     Greater than the guessing score
Information analysis           .517                     Greater than the guessing score
Decision and action selection  .589                     Greater than the guessing score
Action implementation          .467                     Greater than the guessing score

The LOAs also differed in the recall of aircraft destination. A guessing score was computed for destination in order to determine whether the participants were guessing the aircraft destination during the SAGAT freezes. An aircraft could fly to six possible destinations. The guessing score for destination for the first test scenario was .242. Binomial probabilities were calculated to determine whether the proportion accuracy of destination recall for each LOA was significantly higher than the guessing score. As shown in Table 4, the overall accuracy of destination recall for each LOA was significantly higher than the guessing score (p < .00001).

Table 4. Results from the binomial probability calculations

LOA                            Proportion accuracy      Accuracy of response
                               for destination recall
Information acquisition        .572                     Greater than the guessing score
Information analysis           .444                     Greater than the guessing score
Decision and action selection  .422                     Greater than the guessing score
Action implementation          .383                     Greater than the guessing score
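The guessing-score checks in Tables 3 and 4 amount to one-sided exact binomial tests. A minimal sketch, with illustrative counts rather than the study's raw response tallies, is shown below.

```python
# Sketch: exact one-sided binomial test of recall accuracy against a guessing
# score (.33 for altitude, .242 for destination). Counts are illustrative.
from scipy.stats import binomtest

def above_chance(n_correct, n_total, guessing_score):
    """p value for accuracy exceeding the guessing score."""
    return binomtest(n_correct, n_total, guessing_score,
                     alternative="greater").pvalue

# e.g., a hypothetical group with 127 correct altitude responses out of 180
print(above_chance(127, 180, 0.33))
```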

SA of Altitude and Destination as a Mediator between LOA and Performance

The LOAs differed only in the monitoring of altitude and destination. Therefore, the proportion accuracy scores on these two attributes were combined to obtain an overall score.


The extent to which this SA score mediated the relationship between LOA and ATC performance following automation failure was examined. The methods developed by Baron and Kenny (1986) were used to determine whether the relationship between LOA and advance notification time following an automation failure was mediated by SA for altitude and destination. For SA to serve as a mediator in the association between LOA and advance notification time, the following conditions should be met: (1) LOA should be associated with SA, (2) SA should be associated with advance notification time, (3) LOA should be associated with advance notification time, and (4) the association between LOA and advance notification time should decrease when SA is controlled (see Figure 7). Four linear regression analyses were conducted to test for mediation. First, a regression analysis was conducted with LOA predicting SA. The results showed that SA was predicted by LOA (β = -.395, p < .01). Second, a regression analysis was performed with SA predicting advance notification time. The results revealed that advance notification time was predicted by SA (β = .444, p < .001). Third, a regression analysis was conducted with LOA predicting advance notification time. Results indicated that advance notification time was predicted by LOA (β = -.600, p < .001). Fourth, a regression analysis was performed with LOA and SA predicting advance notification time. The results revealed that advance notification time was predicted by both LOA (β = -.503, p < .001) and SA (β = .245, p < .05). To determine whether SA for altitude and destination mediated the relationship between LOA and advance notification time following automation failure, Sobel's (1982) test was conducted. The Sobel test revealed a significant indirect effect of SA as a mediator between LOA and advance notification time following automation failure (z = -2.024, p < .05).


In summary, SA for altitude and destination mediated the relationship between LOA and advance notification time following automation failure. When SA for altitude and destination was higher, participants detected collisions earlier. This finding is important to the main purpose of the study: it demonstrates that SA predicts performance when operators interact with automated systems, with lower SA resulting in lower performance when the automation fails.

Figure 7. Mediation model of the association between LOA and advance notification time following automation failure as mediated by SA for altitude and destination. Standardized regression coefficients: LOA to pre-automation-failure SA, β = -.395**; SA to advance notification time, β = .444***; LOA to advance notification time, β = -.600*** without mediation and β = -.503*** with mediation. ***p < .001, **p < .01.

Meta-SA

At the end of the test scenario, participants were asked to rate their confidence in correctly recalling call sign, destination, altitude, heading, and location during the SAGAT freezes. Note that the participants could assign the same confidence ratings to multiple attributes. Altitude was reported as the most accurately recalled aircraft attribute by the participants (55.56%), followed by heading (34.72%), location (20.83%), destination (19.44%), and call sign (2.78%). Asking participants to recall the locations of the aircraft in their airspace before all the other aircraft attributes during the SAGAT freezes could have biased them to pay more attention to the location attribute. However, the majority of the participants did not rate location as the most remembered aircraft attribute.


Effect of LOA on Meta-SA

Five one-way ANOVAs with LOA as the between-subjects variable were performed on meta-memory for call sign, destination, altitude, heading, and location. A Bonferroni correction was implemented to account for multiple testing (adjusted significance level was .01). The effect of LOA on meta-memory for altitude reached marginal significance, F(3, 68) = 3.324, p < .05, ηp2 = .128. Tukey's HSD analysis showed that individuals working with information acquisition automation had higher confidence in their ability to correctly recall altitude (M = 7.611, SD = 1.195) than those working with the information analysis automation (M = 5.611, SD = 2.893) and action implementation automation (M = 5.667, SD = 2.142). The effect of LOA on meta-memory for the other aircraft attributes failed to reach significance. This means that there were no differences between the participants working with the varying LOAs in their confidence in their own SA. Even though participants working with the information acquisition automation had higher SA than participants assigned to the other LOAs, they did not rate their SA as higher in comparison to participants in the other LOAs. It is important that individuals have both good SA and correct confidence in their SA so that they can make the right decisions based on that SA (e.g., Endsley et al., 2003).


Relationship between SA and Meta-SA

For each LOA, correlations between actual SA and meta-SA were examined. Participants in the information acquisition condition (r = .613, p < .01) and the decision automation condition (r = .762, p < .001) correctly judged that they had recalled the call sign attribute poorly when they actually had. For participants in the information analysis condition, judgments of recall of call sign (r = .694, p < .01) and altitude (r = .722, p < .01) correlated significantly with the accuracy of recall of these attributes. Virtually all the significant correlations involved the call sign attribute: call sign recall was generally poor, and participants were sensitive to how poorly they recalled it. With the exception of call sign, meta-SA did not mirror actual SA. For participants in the action implementation condition, the correlations between the accuracy of recall of the aircraft attributes and confidence in the recall of these attributes did not reliably differ from zero. Overall, regardless of the LOA with which the participants were working, their judgments about their SA were incongruent with their actual SA. That is, irrespective of whether individuals worked with an automated system applied to a lower stage of information processing (i.e., information acquisition) or higher stages of information processing (i.e., information analysis, decision making, or action implementation), their perceptions of their SA did not correlate with their actual SA. Individuals working with information acquisition automation had higher actual SA than those assigned to the other LOAs; it was therefore surprising that their judgments about their SA did not correlate with their actual SA.
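The SA/meta-SA comparison reduces to within-group Pearson correlations between recall accuracy and confidence. A minimal sketch, under hypothetical column names, follows.

```python
# Sketch: within each LOA group, correlate recall accuracy with the
# corresponding confidence rating for each aircraft attribute.
import pandas as pd
from scipy.stats import pearsonr

df = pd.read_csv("sa_meta_sa.csv")  # one row per participant
attributes = ["callsign", "destination", "location", "heading", "altitude"]

for loa, group in df.groupby("loa"):
    for attr in attributes:
        r, p = pearsonr(group[f"{attr}_recall"], group[f"{attr}_confidence"])
        print(f"{loa}, {attr}: r = {r:.3f}, p = {p:.3f}")
```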


Making correct judgments about one's SA is important for adopting better strategies to improve one's SA. Individuals working with high LOAs had lower SA than those working with the information acquisition automation. Therefore, it is especially important that individuals working with high LOAs be able to make accurate metacognitive judgments about their SA so that they can adopt better monitoring strategies and be equipped to respond to automation failures. The incongruence between meta-SA and actual SA is consistent with work on basic metacognition (e.g., Dunlosky et al., 2007). For example, judgments of how well individuals will recall studied text material on an upcoming test are often found to be uncorrelated with their actual test performance.

The evidence so far is in agreement with the active-processing hypothesis: individuals working with high LOAs had lower SA for aircraft attributes, which consequently led to poor performance in the ATC task following automation failure. If individuals working with the high LOAs were highly reliant on the automated aids provided to them in the ATC task, they should exhibit superior performance in the secondary task. Individuals assigned to high LOAs should also experience lower workload than those in the lower LOAs, because the primary task of controlling traffic is less demanding for individuals in the high LOAs. Therefore, the next section examines the secondary task performance and subjective workload ratings of the individuals assigned to the four LOAs.

Secondary Task Performance

Three one-way ANOVAs with LOA as the between-subjects variable were performed on hit-to-signal ratio, reaction time to hits, and the number of false alarms. A Bonferroni correction was implemented to account for multiple testing (adjusted significance level was .017).


Effect of LOA on Hit-to-signal Ratio

There was a significant effect of LOA on the hit-to-signal ratio, F(3, 68) = 12.087, p < .001, ηp2 = .348. Tukey's HSD analysis revealed that individuals working with information analysis automation, decision automation, and action implementation automation had higher hit-to-signal ratios compared to those working with the information acquisition automation. The mean hit-to-signal ratios for information acquisition automation, information analysis automation, decision and action selection automation, and action implementation automation were .818 (SD = .103), .906 (SD = .070), .933 (SD = .047), and .939 (SD = .032), respectively (see Figure 8).


Figure 8. Effect of LOA on hit-to-signal ratio. Error bars indicate ±1 standard error of the mean.

Effect of LOA on Reaction Time to Hits

The effect of LOA on the reaction time to hits reached marginal significance, F(3, 68) = 3.313, p < .05, ηp2 = .128. Tukey's HSD analysis revealed that individuals working with information analysis automation were faster in responding to wind shear alerts compared to those working with the information acquisition automation. The mean reaction times to hits for information acquisition automation, information analysis automation, decision and action selection automation, and action implementation automation were 1.196 s (SD = .128), 1.071 s (SD = .149), 1.083 s (SD = .112), and 1.123 s (SD = .135), respectively.


In summary, individuals working with information analysis, decision, and action implementation automation were more accurate in responding to wind shear warnings compared to those working with the information acquisition automation. The superior performance in the secondary task by individuals working with high LOAs can be attributed to the overreliance they exhibited on the collision detection aids provided to them to perform the ATC task. Participants in the action implementation condition were given an automation aid that both detected and resolved upcoming collisions between aircraft. This was in sharp contrast to the participants in the information acquisition condition, who had to detect and resolve the collisions that would occur in the scenarios with minimal automated assistance. Therefore, in comparison to the participants in the information acquisition condition, participants working with the other LOAs had more cognitive resources to assign to the secondary task, or they chose to allocate more cognitive resources to the secondary task. In short, secondary task performance was superior when individuals worked with high LOAs. These results demonstrate the benefits of high LOAs: when reliable, high LOAs helped individuals exhibit superior performance in the secondary task. This is beneficial in multi-task environments like ATC, where operators have to perform multiple tasks concurrently.

Effect of LOA on False Alarms

The effect of LOA on the number of false alarms failed to reach significance, F(3, 68) = 1.167, p > .05, ηp2 = .049. Binomial probabilities were also calculated to determine whether the number of false alarms for each LOA was significantly less than chance probability. A response was recorded as a false alarm when a participant pressed the spacebar key when no wind shear warning was presented on the weather display.

198

Texas Tech University, Arathi Sethumadhavan, May 2009

spacebar key when a wind shear warning was not presented on the weather display. This analysis helped to establish whether participants in all the LOAs were attending to the secondary task, rather than merely pressing the spacebar key. As shown in Table 5, the results indicated that the overall percentage of false alarms for each LOA was significantly less than chance probability (p < .000001). Table 5. Results from the binomial probability calculations LOA

Mean proportion of false alarms

Information acquisition

.025

Information analysis

.039

Less than chance probability

Decision and action selection

.030

Less than chance probability

Action Implementation

.027

Less than chance probability

Accuracy of Response Less than chance probability
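A sketch of this binomial check is shown below. The per-opportunity chance rate (0.5) and the number of response opportunities (200) are assumed placeholders, since those values depend on details of the task's response windows that are not restated here.

```python
# Illustrative sketch: test whether each group's false-alarm count is
# significantly below a chance rate. Chance probability and trial counts
# are assumptions, not values taken from the study.
from scipy.stats import binomtest

chance_p = 0.5         # assumed per-opportunity chance rate
n_opportunities = 200  # assumed number of chances to false-alarm

false_alarm_props = {
    "information acquisition": 0.025,
    "information analysis": 0.039,
    "decision and action selection": 0.030,
    "action implementation": 0.027,
}

for loa, prop in false_alarm_props.items():
    k = round(prop * n_opportunities)  # observed false alarms
    result = binomtest(k, n_opportunities, chance_p, alternative="less")
    print(f"{loa}: k = {k}, p = {result.pvalue:.2e}")
```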

Results from Analysis of Covariance (ANCOVA)

An ANCOVA was conducted to assess the effect of LOA on performance following automation failure after controlling for performance in the secondary task. That is, the hit-to-signal ratio served as the covariate. Results indicated that even after controlling for secondary task performance, there was a significant effect of LOA on advance notification time following automation failure, F (3, 67) = 8.335, p < .001, ηp² = .272. The covariate did not contribute significantly to the ANCOVA model, F (1, 67) = 1.455, p > .05, ηp² = .021. Post-hoc analysis using the Least Significant Difference (LSD) test was performed on the advance notification time to determine which LOAs differed significantly from each other. The adjusted means for the information acquisition, information analysis, decision and action selection, and action implementation automation were 51.852 s (SE = 5.421), 28.387 s (SE = 4.633), 17.882 s (SE = 4.780), and 15.850 s (SE = 4.832), respectively. This means that even after controlling for secondary task performance, participants in the information acquisition condition detected collisions earlier than participants in the information analysis, decision, and action implementation conditions following the failure of the automation. This finding is important to the central purpose of this study: automating higher levels of information processing can leave individuals out of the loop even when automation monitoring is the only task they have to perform.
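The following sketch shows one way such an ANCOVA can be run in Python with statsmodels. The data are simulated and the column names are assumptions; the adjusted means are obtained by predicting at the covariate's grand mean, which is one common convention, not necessarily the exact procedure used in the original software.

```python
# Illustrative sketch: ANCOVA testing the effect of LOA on advance
# notification time with the secondary-task hit-to-signal ratio as covariate.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(1)
groups = ["acquisition", "analysis", "decision", "implementation"]
df = pd.DataFrame({
    "loa": np.repeat(groups, 18),
    "hit_ratio": rng.normal(0.9, 0.07, 72),
    "advance_time": np.concatenate(
        [rng.normal(m, 15, 18) for m in (52, 28, 18, 16)]
    ),
})

# Linear model with the categorical factor and the covariate, then the
# ANCOVA table (Type II sums of squares).
model = smf.ols("advance_time ~ C(loa) + hit_ratio", data=df).fit()
print(anova_lm(model, typ=2))

# Covariate-adjusted group means: predict each group at the grand mean
# of the covariate.
grid = pd.DataFrame({"loa": groups, "hit_ratio": df["hit_ratio"].mean()})
print(grid.assign(adjusted_mean=model.predict(grid)))
```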


Effect of LOA on Subjective Ratings of Workload

The workload ratings collected at the end of the two SAGAT freezes were averaged to obtain an overall total workload score. The effect of LOA on the subjective workload ratings failed to reach significance, F (3, 68) = .723, p > .05, ηp² = 0.031. It was expected that participants assigned to the higher LOAs would rate their workload as lower because the primary task of controlling traffic would be less demanding for them. However, this was not the case. This could be because highly automated systems do not always reduce workload, owing to the shift in the operator's role from active involvement to passive monitoring (e.g., Walker, Stanton, & Young, 2001). It could also be that subjective ratings of workload are insensitive measures of actual workload (e.g., Meshkati, Hancock, Rahimi, & Dawes, 1995).

Follow Up

Purpose

After completing the test scenario, the participants were asked to complete a follow up scenario. The purpose of the follow up was to determine how the SA and performance of individuals working with the different LOAs would differ after experiencing the first automation failure. There are two lines of research that explain how operator performance differs after being exposed to an automation failure. One line of research claims that operators' monitoring of automated systems improves following the first automation failure, presumably due to calibration of trust in the automation. For example, using a target detection task, Merlo, Wickens, and Yeh (1999) examined the performance consequences following the failure of an automated cueing aid. In their experiment, participants were instructed to search for targets in a hand-held or helmet-mounted display, classify each target as friend or enemy, and report the azimuth of targets. In some trials, an automated aid cued the location of targets on the display. The automated cueing aid was not fully reliable. The participants were also given a secondary task to perform in conjunction with the primary task of identifying targets. Merlo et al. found that when the automated aid failed for the first time, the target detection rate was only 50%. However, target detection rates in subsequent failure trials improved to 91%. Presumably, prior to the failure of the cueing algorithm, participants exhibited overtrust in the aid and were immediately drawn to the search space that was cued by the aid. Following the failure of the aid, participants calibrated their trust in the aid, which reduced their attentional tunneling. Similarly, Rovira et al. (2007) also examined how the performance of operators changes after being exposed to the failure of different LOAs. Participants were asked to perform a military decision-making task using information automation, low decision automation, medium decision automation, and high decision automation. They found that participants exhibited significant performance improvements in subsequent failure trials (compared to their performance after the first automation failure) for both medium and high decision automation.


In summary, individuals tend to be more cautious after they experience an automation failure. Instead of exhibiting overreliance on the automation, they adopt new cognitive strategies, which are reflected in their performance following the failure.

However, there is also evidence that operator performance may not always improve subsequent to automation failure (e.g., Parasuraman et al., 1993; Parasuraman & Riley, 1997). For example, Kantowitz et al. (1997) found that drivers exhibited poor performance because they continued to rely on a navigation system that made errors when they were in an unfamiliar city, where their confidence in manual navigation was low. Similarly, Biros et al. (2004) showed that military decision makers continued to rely on an automated aid when their workload was high, despite the low trust ratings they assigned to the automation. Therefore, operator reliance on an automated aid following the first occurrence of an automation failure depends on operator self-confidence as well as workload.

In summary, some research suggests that operators' monitoring of automated systems improves following the first automation failure, presumably due to calibration of trust in the automation. However, other research suggests that operators continue to rely on the automation even when the aid makes errors, if their confidence in the aid exceeds their self-confidence (e.g., Kantowitz et al., 1997) or if their workload is high (e.g., Biros et al., 2004; Parasuraman et al., 1993). No work has examined the effects of the first automation failure on subsequent operator SA and performance when individuals work with automated systems applied to all four stages of information processing.


In short, the purpose of the follow up was to determine the effects of the first automation failure on the SA and performance of individuals working with information acquisition, information analysis, decision and action selection, and action implementation automation in the ATC domain.

Overview of Follow Up

A test scenario was designed to determine how the SA and performance of individuals working with the different LOAs would differ after exposure to the automation failure in the first test scenario. This scenario involved a total of six scripted collisions. The different LOAs were perfectly reliable in detecting the first five collisions; the automated aids failed to detect the sixth planned collision. The time taken by participants working with the different LOAs to detect the sixth collision (i.e., advance notification time) following the failure of the automation was recorded. As in the first test scenario, the number of rule violations and handoff delay times were also recorded. SA of the participants was assessed once during the scenario. Workload ratings were collected at the end of the SAGAT freeze. As in the first test scenario, meta-SA ratings were collected at the end of the scenario. The descriptive statistics are shown in Table 6.


Table 6. Descriptive statistics for the ATC performance variables, SA variables, meta-SA, secondary task performance, and subjective workload for all automation conditions for the follow up scenario. Means and standard deviations (in parentheses) are shown.

Variable                                  Information       Information       Decision and       Action
                                          Acquisition       Analysis          Action Selection   Implementation

ATC performance variables
  Advance notification time following
  automation failure (s)                  52.807 (9.868)    32.663 (22.664)   28.780 (24.179)    10.197 (17.148)
  Number of rule violations               .333 (.594)       .611 (1.145)      .111 (.323)        .278 (.669)
  Handoff delay (s)                       13.611 (10.594)   13.611 (12.118)   19.833 (16.678)    14.000 (10.466)

SA (proportion accuracy)
  Call sign recall                        .056 (.134)       .082 (.174)       .063 (.164)        .056 (.115)
  Destination recall                      .467 (.266)       .318 (.235)       .253 (.187)        .211 (.175)
  Location recall                         .500 (.250)       .435 (.177)       .347 (.161)        .278 (.218)
  Heading recall                          .811 (.145)       .718 (.224)       .663 (.222)        .756 (.146)
  Altitude recall                         .478 (.292)       .282 (.174)       .305 (.244)        .267 (.194)
  Total SA                                .462 (.137)       .367 (.118)       .326 (.114)        .313 (.087)

Meta-SA
  Meta-memory for call sign               1.278 (1.841)     2.118 (2.891)     1.632 (1.950)      2.222 (2.462)
  Meta-memory for destination             4.889 (2.139)     4.353 (2.149)     4.632 (2.290)      4.444 (2.064)
  Meta-memory for location                5.278 (1.934)     5.235 (1.985)     4.632 (2.005)      5.278 (1.601)
  Meta-memory for heading                 5.611 (2.200)     5.000 (2.716)     4.474 (2.366)      5.556 (1.617)
  Meta-memory for altitude                6.722 (1.447)     5.000 (2.915)     5.211 (2.323)      5.333 (2.449)

Secondary task performance
  Hit-to-signal ratio                     .807 (.113)       .908 (.053)       .923 (.052)        .922 (.060)
  RT to hits (s)                          1.195 (.124)      1.112 (.151)      1.140 (.156)       1.108 (.124)
  Proportion of false alarms              .030 (.024)       .040 (.036)       .029 (.020)        .030 (.033)

Subjective workload rating                70.982 (11.202)   67.537 (18.336)   64.352 (18.919)    62.500 (10.221)
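For illustration, the per-condition means and standard deviations that populate a table like Table 6 can be produced with a simple grouped aggregation. The sketch below uses simulated data with assumed column names; only the grouping logic is the point.

```python
# Illustrative sketch: descriptive statistics (mean, SD) by LOA condition,
# the computation behind a table like Table 6. Data are simulated.
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
groups = ["acquisition", "analysis", "decision", "implementation"]
df = pd.DataFrame({
    "loa": np.repeat(groups, 18),
    "advance_time": rng.normal(30, 15, 72),
    "total_sa": rng.beta(4, 8, 72),
    "hit_ratio": rng.beta(18, 2, 72),
    "workload": rng.normal(66, 14, 72),
})

# Mean and SD of every measure, broken out by level of automation
summary = df.groupby("loa").agg(["mean", "std"]).round(3)
print(summary.T)  # transpose so measures form rows, conditions form columns
```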

Effect of LOA on ATC Performance

Three one way ANOVAs with LOA as the between subjects variable were performed on advance notification time following automation failure, number of rule violations, and handoff delay. A Bonferroni adjustment was made to account for multiple testing (adjusted significance level was 0.017).

Advance Notification Time

The effect of LOA on advance notification time following automation failure was significant, F (3, 68) = 14.756, p < .001, ηp² = 0.394. Tukey's HSD analysis on the mean advance notification time revealed that individuals working with information acquisition automation were significantly faster in detecting collisions following the failure of the automation compared to individuals working with all the other LOAs. The mean advance notification times for information acquisition automation, information analysis automation, decision and action selection automation, and action implementation automation were 52.807 s (SD = 9.868), 32.663 s (SD = 22.664), 28.780 s (SD = 24.179), and 10.197 s (SD = 17.148), respectively (See Figure 9).

[Figure 9 appears here: bar chart titled "Effect of Level of Automation on Advance Notification Time (Following Automation Failure)"; y-axis: advance notification time in seconds (0–60); x-axis: level of automation.]

Figure 9. Effect of LOA on advance notification time following automation failure. The ideal advance notification time was 63 s. Error bars indicate ±1 standard error of the mean.

In summary, participants working with the information acquisition automation continued to be faster in detecting collisions following automation failure compared to those working with other LOAs. Thus, exposure to the first automation failure did not seem to have affected reliance on the high LOAs. That is, participants working with high LOAs continued to use the automated aids despite experiencing the first automation failure.

Rule Violations and Handoff Delay

The effect of LOA on the total number of rule violations failed to reach significance, F (3, 68) = 1.404, p > .05, ηp² = 0.058. Similarly, the effect of LOA on handoff delay failed to reach significance, F (3, 68) = 1.037, p > .05, ηp² = 0.044. Therefore, high LOAs did not seem to have helped the participants adhere to the simulator procedures or accept planes into their airspace quickly.

SA

SA was assessed in the follow up scenario in the same manner as in the first test scenario; however, the SAGAT freeze occurred only once during the follow up scenario. Proportion accuracy scores for call sign, destination, altitude, and heading were computed. To compute the proportion accuracy score for aircraft location, the distance error (in cm) between each aircraft's reported and actual location was computed. If this distance error exceeded the "5-mile" distance (5 miles = 0.8 cm) in the ATST airspace, or the aircraft was positioned in a wrong segment of the airspace, the response was recorded as incorrect. The proportion accuracy scores for call sign, location, altitude, heading, and destination were averaged to compute the total SA score.
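The scoring logic just described can be expressed compactly, as in the sketch below. The coordinate values, the wrong-segment check, and the function names are simplified assumptions; the 0.8-cm threshold and the averaging of the five attribute scores come from the text.

```python
# Illustrative sketch of the SA scoring described above: location recall is
# scored against a distance-error threshold (5 miles = 0.8 cm on the display),
# and total SA is the mean of the five per-attribute accuracy scores.
import math

FIVE_MILE_CM = 0.8  # 5 miles rendered as 0.8 cm in the simulated airspace

def location_correct(reported_xy, actual_xy, same_segment=True):
    """Location is correct if within 5 'miles' and in the right segment."""
    dx = reported_xy[0] - actual_xy[0]
    dy = reported_xy[1] - actual_xy[1]
    return same_segment and math.hypot(dx, dy) <= FIVE_MILE_CM

def total_sa(call_sign, destination, location, heading, altitude):
    """Total SA = mean of the five proportion-accuracy scores."""
    scores = [call_sign, destination, location, heading, altitude]
    return sum(scores) / len(scores)

# Example usage with placeholder numbers
print(location_correct((3.1, 2.0), (3.8, 2.6)))    # False: error ≈ 0.92 cm > 0.8
print(round(total_sa(.06, .47, .50, .81, .48), 3))  # 0.464
```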


Effect of LOA on Total SA

The main effect of LOA on total SA score was significant, F (3, 68) = 6.192, p < .01, ηp² = 0.215. Tukey's HSD analysis on the total SA score showed that individuals working with information acquisition automation had higher overall SA compared to those working with decision automation and action implementation automation. The mean overall SA scores for the information acquisition automation, information analysis automation, decision and action selection automation, and action implementation automation were 0.462 (SD = .137), 0.367 (SD = .118), 0.326 (SD = .114), and 0.313 (SD = .087), respectively (See Figure 10). As in the first test scenario, individuals assigned to the information acquisition automation condition continued to have higher total SA in comparison to the other automation conditions. These results were consistent with the active-processing hypothesis, indicating that decreasing the level of operator involvement can lower SA and leave the operator out of the loop.

[Figure 10 appears here: bar chart titled "Effect of Level of Automation on Total SA"; y-axis: proportion accuracy for total SA (0.0–1.0); x-axis: level of automation.]

Figure 10. Effect of LOA on total SA. Error bars indicate ±1 standard error of the mean.

Effect of LOA on SA Variables

A one-way between subjects multivariate analysis of variance (MANOVA) was performed on five dependent variables: call sign, destination, location, altitude, and heading. The independent variable was the LOA. With the use of Wilks' criterion, the dependent variables were significantly affected by the LOA, F (15, 177.077) = 2.311, p < .01, ηp² = 0.152. Univariate analyses showed that there was an effect of LOA on altitude recall, F (3, 68) = 3.200, p < .05, ηp² = .124. Tukey's HSD analysis on the mean altitude recall showed that participants in the information acquisition condition (M = .478, SD = .292) recalled altitude better than those in the action implementation condition (M = .267, SD = .194; See Figure 11).
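A minimal sketch of such a MANOVA in Python is shown below, using statsmodels' multivariate test (which reports Wilks' lambda among other criteria). The data and column names are simulated assumptions.

```python
# Illustrative sketch: one-way MANOVA on the five SA attributes with LOA as
# the grouping factor. Data are simulated placeholders.
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(2)
groups = ["acquisition", "analysis", "decision", "implementation"]
df = pd.DataFrame({
    "loa": np.repeat(groups, 18),
    "call_sign": rng.beta(2, 20, 72),
    "destination": rng.beta(4, 8, 72),
    "location": rng.beta(5, 7, 72),
    "heading": rng.beta(8, 3, 72),
    "altitude": rng.beta(4, 8, 72),
})

mv = MANOVA.from_formula(
    "call_sign + destination + location + heading + altitude ~ C(loa)", data=df
)
print(mv.mv_test())  # includes Wilks' lambda for the C(loa) effect
```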


[Figure 11 appears here: bar chart titled "Effect of Level of Automation on Altitude Recall"; y-axis: proportion accuracy for altitude recall (0.0–1.0); x-axis: level of automation.]

Figure 11. Effect of LOA on altitude recall. Error bars indicate ±1 standard error of the mean.

Univariate analyses also indicated an effect of LOA on location recall, F (3, 68) = 4.105, p < .05, ηp² = .153. Tukey's HSD analysis on the mean location recall revealed that participants in the information acquisition condition (M = .500, SD = .250) recalled aircraft location better than those in the action implementation condition (M = .278, SD = .218; See Figure 12).


[Figure 12 appears here: bar chart titled "Effect of Level of Automation on Location Recall"; y-axis: proportion accuracy for location recall (0.0–1.0); x-axis: level of automation.]

Figure 12. Effect of LOA on location recall. Error bars indicate ±1 standard error of the mean.

Univariate analyses also showed that there was an effect of LOA on destination recall, F (3, 68) = 4.781, p < .01, ηp² = .174. Tukey's HSD analysis on the mean destination recall revealed that participants in the information acquisition condition (M = .467, SD = .266) recalled the destination of aircraft better than those in the decision and action selection condition (M = .253, SD = .187) and the action implementation condition (M = .211, SD = .175; See Figure 13). There was no difference between the LOAs in the recall of aircraft call sign and heading.


[Figure 13 appears here: bar chart titled "Effect of Level of Automation on Destination Recall"; y-axis: proportion accuracy for destination recall (0.0–1.0); x-axis: level of automation.]

Figure 13. Effect of LOA on destination recall. Error bars indicate ±1 standard error of the mean.

Thus, the LOAs differed in the monitoring of three dimensions: altitude, location, and destination. Specifically, the participants assigned to the information acquisition automation had better SA for important aircraft attributes compared to the other LOAs. This finding is consistent with the active processing hypothesis, mirroring the results from the first test scenario.

Results of binomial probabilities. The LOAs differed in the recall of aircraft altitude. A guessing score was computed for altitude in order to determine whether the participants were guessing the aircraft altitude during the SAGAT freezes. An aircraft could fly at altitude 1, 2, or 3; hence, the guessing score for altitude was 0.33. Binomial probabilities were calculated to determine whether the accuracy of recall of altitude for each LOA was significantly higher than the guessing score. As shown in Table 7, the results indicated that the overall accuracy of recall of altitude was significantly higher than the guessing score only for the information acquisition group (p < .01). The accuracy of recall of altitude for the other LOAs did not differ significantly from the guessing score. This means that the participants assigned to higher stages of information processing were not monitoring the ATC display, which was reflected in their altitude recall scores.

Table 7. Results from the binomial probability calculations

LOA                             Proportion accuracy for altitude recall   Accuracy of Response
Information acquisition         .478                                      Greater than the guessing score
Information analysis            .282                                      Not different from the guessing score
Decision and action selection   .305                                      Not different from the guessing score
Action implementation           .267                                      Not different from the guessing score
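The comparison against the guessing score can be sketched as a one-sided binomial test, as below. The guessing score (1/3) follows from the three possible altitudes described in the text; the number of recall trials per group is an assumed placeholder.

```python
# Illustrative sketch: test whether altitude-recall accuracy exceeds the
# guessing score of 1/3 (three equally likely altitudes). Trial counts
# are assumptions, not values from the study.
from scipy.stats import binomtest

guessing_score = 1 / 3  # three possible altitudes
n_trials = 90           # assumed number of altitude-recall opportunities

altitude_accuracy = {
    "information acquisition": 0.478,
    "information analysis": 0.282,
    "decision and action selection": 0.305,
    "action implementation": 0.267,
}

for loa, acc in altitude_accuracy.items():
    k = round(acc * n_trials)  # observed correct recalls
    result = binomtest(k, n_trials, guessing_score, alternative="greater")
    print(f"{loa}: k = {k}, p = {result.pvalue:.4f}")
```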

The LOAs also differed in the recall of aircraft destination. A guessing score was computed for destination in order to determine whether the participants were guessing the aircraft destination during the SAGAT freezes. An aircraft could fly to 6 possible destinations; the guessing score for destination was .217. Binomial probabilities were calculated to determine whether the accuracy of recall of destination for each LOA was significantly higher than the guessing score. As shown in Table 8, the results indicated that the overall accuracy of recall of destination was significantly higher than the guessing score for the information acquisition condition (p < .000001) as well as the information analysis condition (p < .05). The accuracy of recall of destination for the decision automation and action implementation conditions was not significantly different from the guessing score. This also indicates that the participants assigned to higher stages of information processing were not monitoring the ATC display.

Table 8. Results from the binomial probability calculations

LOA                             Proportion accuracy for destination recall   Accuracy of Response
Information acquisition         .467                                         Greater than the guessing score
Information analysis            .318                                         Greater than the guessing score
Decision and action selection   .252                                         Not different from the guessing score
Action implementation           .211                                         Not different from the guessing score

Meta-SA

Altitude was reported as the most accurately recalled aircraft attribute by the participants (50%), followed by location (26.39%), heading (25%), destination (19.44%), and call sign (5.56%). As in the first test scenario, participants did not rate aircraft location as the most remembered attribute.

Effect of LOA on Meta-SA

Five one way ANOVAs with LOA as the between subjects variable were performed on meta-call sign, meta-destination, meta-altitude, meta-heading, and meta-location. A Bonferroni correction was implemented to account for multiple testing (adjusted significance level was 0.01).
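The Bonferroni adjustment used here (0.05 divided by the five tests, giving 0.01) can be sketched as follows; the p-values are placeholders.

```python
# Illustrative sketch: Bonferroni correction for the five meta-SA ANOVAs.
# The raw p-values are placeholders, not the study's actual results.
from statsmodels.stats.multitest import multipletests

raw_p = [0.22, 0.48, 0.09, 0.31, 0.67]  # one p-value per meta-SA variable

# Equivalent to comparing each raw p against alpha / 5 = 0.01
reject, adj_p, _, adj_alpha = multipletests(raw_p, alpha=0.05, method="bonferroni")
print(f"adjusted alpha = {adj_alpha:.3f}")  # 0.010
print(list(zip(adj_p.round(3), reject)))
```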


There was no significant effect of LOA on any of the meta-SA variables. Thus, as in the first test scenario, there were no differences between the individuals working with the various LOAs in their confidence in their own SA.

Relationship between SA and Meta-SA

For each LOA, correlations between actual SA and meta-SA were examined. For participants in the information acquisition condition, confidence in the recall of destination correlated with accuracy in the recall of destination (r = .552, p < .05). Judgments of recall of call sign also correlated with accuracy in call sign recall for the information acquisition group (r = .793, p < .001), information analysis group (r = .576, p < .05), decision and action selection group (r = .737, p < .001), and action implementation group (r = .494, p < .05). For participants in the decision automation condition, confidence in the recall of altitude also correlated significantly with accuracy in the recall of altitude (r = .645, p < .01). As in the first test scenario, the majority of the significant correlations involved the call sign attribute. Call sign recall was generally poor, and participants were sensitive to how well they recalled the call sign attribute. With the exception of call sign, judgments of recall of the other attributes were not generally associated with actual recall of those attributes. Thus, overall, regardless of the LOA with which the participants were working, their judgments about their SA did not reflect their actual SA.
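The per-condition SA/meta-SA correlations can be sketched as below with Pearson's r; the column names and data are simulated assumptions.

```python
# Illustrative sketch: within each LOA condition, correlate confidence in
# recall (meta-SA) with actual recall accuracy using Pearson's r.
import numpy as np
import pandas as pd
from scipy.stats import pearsonr

rng = np.random.default_rng(3)
df = pd.DataFrame({
    "loa": np.repeat(["acquisition", "analysis", "decision", "implementation"], 18),
    "meta_call_sign": rng.integers(0, 11, 72),   # confidence rating (placeholder scale)
    "call_sign_accuracy": rng.beta(2, 20, 72),   # proportion correct
})

for loa, sub in df.groupby("loa"):
    r, p = pearsonr(sub["meta_call_sign"], sub["call_sign_accuracy"])
    print(f"{loa}: r = {r:.3f}, p = {p:.3f}")
```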


Secondary Task Performance

Three one way ANOVAs with LOA as the between subjects variable were performed on the hit-to-signal ratio, the reaction time to hits, and the proportion of false alarms. A Bonferroni correction was implemented to account for multiple testing (adjusted significance level was 0.017).

Effect of LOA on Hit-to-signal Ratio

There was a significant effect of LOA on the hit-to-signal ratio, F (3, 68) = 10.343, p < .001, ηp² = 0.313. Tukey's HSD analysis conducted on the mean hit-to-signal ratio indicated that individuals working with information analysis automation, decision automation, and action implementation automation had higher hit-to-signal ratios compared to those working with the information acquisition automation. The mean hit-to-signal ratios for information acquisition automation, information analysis automation, decision and action selection automation, and action implementation automation were 0.807 (SD = .113), 0.908 (SD = .053), 0.923 (SD = .052), and 0.922 (SD = .060), respectively (See Figure 14).

[Figure 14 appears here: bar chart titled "Effect of Level of Automation on Hit-to-signal Ratio"; y-axis: hit-to-signal ratio (0.0–1.0); x-axis: level of automation.]

Figure 14. Effect of LOA on hit-to-signal ratio. Error bars indicate ±1 standard error of the mean.

Effect of LOA on Reaction Time to Hits

The effect of LOA on reaction time to hits failed to reach significance, F (3, 68) = 1.490, p > .05, ηp² = 0.062.

In short, as in the first test scenario, secondary task performance in the follow up scenario was superior when individuals worked with high LOAs. This means that participants working with high LOAs either had more cognitive resources than those working with the information acquisition automation or chose to allocate their cognitive resources to the secondary task. This helps to corroborate the notion that individuals working with high LOAs continued to rely on the collision detection aids provided to them to complete the ATC task, despite being exposed to the automation failure in the first test scenario.

Effect of LOA on False Alarms

The effect of LOA on the proportion of false alarms failed to reach significance, F (3, 68) = .568, p > .05, ηp² = 0.0242. Binomial probabilities were also calculated to determine whether the number of false alarms in the secondary task for each LOA was significantly less than chance probability. This analysis helped to determine whether participants in the different LOAs were attending to the secondary task, rather than merely pressing the spacebar key. As shown in Table 9, the results indicated that the overall percentage of false alarms for each LOA was significantly less than chance probability (p < .000001).

Table 9. Results from the binomial probability calculations

LOA                             Mean proportion of false alarms   Accuracy of Response
Information acquisition         .030                              Less than chance probability
Information analysis            .040                              Less than chance probability
Decision and action selection   .029                              Less than chance probability
Action implementation           .030                              Less than chance probability

Results of ANCOVA

An ANCOVA was performed to assess the effect of LOA on performance following automation failure after controlling for performance in the secondary task (i.e., the hit-to-signal ratio). Results revealed that even after controlling for secondary task performance, there was a significant effect of LOA on advance notification time following automation failure, F (3, 67) = 13.762, p < .001, ηp² = .381. The covariate did not contribute significantly to the ANCOVA model, F (1, 67) = 1.532, p > .05, ηp² = .022. A Least Significant Difference (LSD) analysis was conducted on the advance notification time to determine which LOAs differed significantly from each other. The adjusted means for the information acquisition, information analysis, decision and action selection, and action implementation automation were 56.064 s (SE = 5.240), 31.962 s (SE = 4.566), 27.491 s (SE = 4.649), and 8.930 s (SE = 4.645), respectively. As in the first test scenario, even after controlling for secondary task performance, participants in the information acquisition condition detected collisions earlier than participants in the information analysis, decision, and action implementation conditions following the failure of the automation. This means that automating high levels of information processing can leave individuals out of the loop.

Effect of LOA on Subjective Ratings of Workload

The effect of LOA on the subjective workload ratings failed to reach significance, F (3, 68) = 1.082, p > .05, ηp² = 0.046.

Examining the First Automation Failure Effect

ATC Performance: Advance Notification Time

A 2 (failure) X 4 (LOA) mixed ANOVA with LOA as the between subjects variable was conducted on the advance notification time to examine differences in the time taken to respond to the first and second automation failures by individuals assigned to the four LOAs. Failure was the within subjects variable, with first automation failure and second automation failure as the two treatment levels. Results indicated a significant main effect of LOA on the advance notification time, F (3, 68) = 20.681, p < .001, ηp² = .477. No other significant main effects or interactions were identified. Tukey's HSD analysis on advance notification time indicated that participants in the information acquisition condition (M = 54.034 s, SD = 12.473) were faster in detecting upcoming collisions following the failure of the automation compared to those in the information analysis (M = 30.384 s, SD = 24.267), decision and action selection (M = 22.608 s, SD = 22.209), and action implementation (M = 12.184 s, SD = 16.405) conditions, in both scenarios. Thus, participants working with the information acquisition automation were faster in detecting collisions following both the first and second automation failures compared to those working with other LOAs. Even after the first automation failure, participants working with the high LOAs continued to exhibit overreliance on the automated aids and were slower in responding to the second automation failure, compared to participants in the information acquisition condition.
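A 2 × 4 mixed ANOVA of this kind can be sketched in Python with pingouin, as below. The long-format data are simulated and the column names are assumptions; only the design (failure as within-subjects, LOA as between-subjects) mirrors the analysis reported here.

```python
# Illustrative sketch: 2 (failure) x 4 (LOA) mixed ANOVA on advance
# notification time, with simulated long-format data (one row per
# participant per failure).
import numpy as np
import pandas as pd
import pingouin as pg

rng = np.random.default_rng(4)
groups = ["acquisition", "analysis", "decision", "implementation"]
records = []
for pid in range(72):
    loa = groups[pid % 4]
    base = {"acquisition": 54, "analysis": 30,
            "decision": 23, "implementation": 12}[loa]
    for failure in ("first", "second"):
        records.append({
            "participant": pid,
            "loa": loa,
            "failure": failure,
            "advance_time": rng.normal(base, 15),
        })
df = pd.DataFrame(records)

aov = pg.mixed_anova(data=df, dv="advance_time", within="failure",
                     subject="participant", between="loa")
print(aov[["Source", "F", "p-unc", "np2"]])
```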


Secondary Task Performance

The secondary task performance of the participants in the follow up scenario was compared to their performance in the first test scenario in order to assess the effects of the first automation failure on subsequent secondary task performance. A 2 (scenario) X 4 (LOA) mixed ANOVA with LOA as the between subjects variable was conducted on the hit-to-signal ratio. Scenario (first test scenario, follow up scenario) was the within subjects variable. Results indicated a significant main effect of LOA on the hit-to-signal ratio, F (3, 68) = 12.202, p < .001, ηp² = .350. No other significant main effects or interactions were identified. Tukey's HSD analysis on the hit-to-signal ratio revealed that participants in the information acquisition condition (M = 0.813, SD = 0.108) were less accurate in responding to wind shear warnings compared to those in the information analysis (M = 0.907, SD = 0.0615), decision and action selection (M = 0.928, SD = 0.0485), and action implementation (M = 0.930, SD = 0.046) conditions. Thus, even after being exposed to the first automation failure, participants working with the high LOAs continued to attend to the secondary task and exhibited superior performance on it in comparison to those in the information acquisition condition. This helps to corroborate the notion that individuals working with high LOAs continued to rely on the collision detection aids provided to them to complete the ATC task, despite being exposed to the automation failure in the first test scenario.

Total SA

The total SA of the participants after being exposed to the first automation failure (i.e., SA in the follow up scenario) was compared to the total SA prior to the first automation failure. A 2 (failure) X 4 (LOA) mixed ANOVA on total SA, with LOA as the between subjects variable and failure (before first automation failure, after first automation failure) as the within subjects variable, revealed a significant main effect of LOA, F (3, 68) = 8.852, p < .001, ηp² = .281. Tukey's HSD analysis on total SA showed that participants in the information acquisition condition (M = 0.488, SD = 0.119) had higher SA compared to those in the information analysis (M = 0.381, SD = 0.100), decision and action selection (M = 0.371, SD = 0.101), and action implementation (M = 0.348, SD = 0.106) conditions. In summary, individuals in the information acquisition automation condition had higher SA than the other LOAs both before and after the automation failure.

There was also a significant main effect of failure, F (1, 68) = 17.533, p < .001, ηp² = .205. Participants had higher SA before the first automation failure (M = 0.428, SD = 0.111) than after the failure (M = 0.367, SD = 0.127). Lower SA in the follow up scenario could be due to a vigilance decrement associated with monitoring the ATC display for a long time (e.g., Endsley et al., 2003; Parasuraman, 1987). There is also a possibility that the difference in SA arose from differences between the first and second test scenarios. In the first test scenario, differences among the LOAs arose in the recall of the altitude and destination attributes; in the follow up, there were differences among the LOAs in the recall of the altitude, destination, and location attributes. Therefore, 2 (failure) X 4 (LOA) mixed ANOVAs were conducted on altitude recall, destination recall, and location recall.

Altitude recall. The SA for altitude of the participants after being exposed to the first automation failure was compared to the SA for altitude prior to the first automation failure. A 2 (failure) X 4 (LOA) mixed ANOVA on altitude recall revealed a significant main effect of LOA, F (3, 68) = 6.470, p < .01, ηp² = .222. Tukey's HSD analysis on altitude recall showed that participants in the information acquisition condition (M = 0.592, SD = 0.263) had higher SA for altitude compared to those in the information analysis (M = 0.408, SD = 0.179), decision and action selection (M = 0.439, SD = 0.210), and action implementation (M = 0.367, SD = 0.233) conditions. In summary, individuals in the information acquisition automation condition had higher SA for altitude than the other LOAs both before and after the automation failure. There was also a significant main effect of failure, F (1, 68) = 45.482, p < .001, ηp² = .401. All the participants had higher SA for altitude before the first automation failure (M = 0.569, SD = 0.224) than after the failure (M = 0.333, SD = 0.242).


Lower SA for altitude for all the LOAs in the follow up scenario could be due to a vigilance decrement or to differences between the first and second scenarios.

Destination recall. The SA for destination after being exposed to the first automation failure was compared to the SA for destination prior to the first automation failure. A 2 (failure) X 4 (LOA) mixed ANOVA on destination recall revealed a significant main effect of LOA, F (3, 68) = 5.980, p < .01, ηp² = .209. Tukey's HSD analysis on destination recall showed that participants in the information acquisition condition (M = 0.519, SD = 0.230) had higher SA for destination compared to those in the decision and action selection (M = 0.339, SD = 0.191) and action implementation (M = 0.297, SD = 0.206) conditions. In summary, individuals in the information acquisition automation condition had higher SA for destination than the other LOAs. There was also a significant main effect of failure, F (1, 68) = 24.523, p < .001, ηp² = .265. All the participants had higher SA before the first automation failure (M = 0.456, SD = 0.206) than after the failure (M = 0.311, SD = 0.235). Lower SA for destination for all the LOAs in the follow up scenario could be due to a vigilance decrement or to differences between the first and second scenarios.

Location recall. The SA for location after being exposed to the first automation failure was compared to the SA for location prior to the first automation failure. A 2 (failure) X 4 (LOA) mixed ANOVA on location recall revealed a significant main effect of LOA, F (3, 68) = 5.651, p < .01, ηp² = .200. Tukey's HSD analysis showed that participants in the information acquisition condition (M = 0.486, SD = 0.193) had higher SA for location compared to those in the decision and action selection (M = 0.353, SD = 0.164) and action implementation (M = 0.322, SD = 0.184) conditions.


In summary, even after exposure to the first automation failure, individuals working with the information acquisition automation had higher SA than those assigned to the high LOAs.

Total SA as a Mediator between LOA and Performance

The purpose of this analysis was to examine whether SA after exposure to the first automation failure mediated the relationship between LOA and collision detection following the second automation failure. Sobel test statistics failed to reveal a significant p value for the indirect effect of total SA as a mediator between LOA and advance notification time following the second automation failure (z = -1.738, p > .05). The SA for altitude, destination, location, heading, and call sign also did not mediate the association between LOA and collision detection performance following the second automation failure. Therefore, SA failed to mediate the relationship between LOA and advance notification time following the second automation failure. That is, after exposure to the first automation failure, one's understanding of the situation no longer mediated the relationship between the LOA and performance following the subsequent automation failure. Perhaps individuals' confidence in the automation might mediate the relationship between LOA and performance following the automation failure; there is evidence that operators continue to rely on automated systems, even if there are occasional failures, when their confidence in the system exceeds their confidence in manual operation (e.g., Parasuraman & Riley, 1997).

There is also a possibility that SA may not have mediated the relationship between LOA and performance due to methodological limitations. First, Endsley (2000) recommends that in order to obtain a reliable assessment of SA, multiple SAGAT freezes must be administered in the scenario. However, only one SAGAT freeze was administered in this scenario in order to reduce the duration of the experiment. This could be why the SA score failed to mediate the relationship between LOA and performance following the automation failure. Second, SA scores obtained using the SAGAT methodology may not be a true indicator of operator SA; SAGAT has been criticized for relying heavily on the operator's memory (e.g., Durso & Dattel, 2004). The extent to which SA mediates operator performance following automation failures needs attention in future studies.
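The Sobel statistic used in this mediation analysis can be sketched directly from its formula, as below. The path estimates and standard errors are placeholders; in practice a and b would come from the two regressions indicated in the comments.

```python
# Illustrative sketch of the Sobel test for mediation: a is the IV->mediator
# path, b is the mediator->DV path (controlling for the IV), and sa, sb are
# their standard errors. All numeric inputs are placeholders.
import math
from scipy.stats import norm

def sobel_z(a, sa, b, sb):
    """Sobel z statistic for the indirect effect a*b."""
    return (a * b) / math.sqrt(b**2 * sa**2 + a**2 * sb**2)

# Placeholder path estimates from two regressions:
# (1) total_sa ~ loa                 -> a, sa
# (2) advance_time ~ total_sa + loa  -> b, sb
a, sa = -0.045, 0.016
b, sb = 25.0, 14.0

z = sobel_z(a, sa, b, sb)
p = 2 * norm.sf(abs(z))  # two-tailed p-value
print(f"z = {z:.3f}, p = {p:.3f}")
```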


APPENDIX C

INFORMED CONSENT FORMS

Informed Consent Form: Participants Receiving Credit

By signing this consent form, you will be agreeing to participate in a study called "Effects of Levels of Automation on Air Traffic Controller Situation Awareness". Dr. Patricia DeLucia is responsible for this research. Her phone number is 742-3711 ext. 259.

The purpose of this research is to examine your ability to avoid collisions between aircraft in an airspace. You will first be trained to control air traffic, and thereafter your ability to prevent aircraft collisions will be assessed. This experiment will be conducted on one day and will last a total of approximately 3 hours. You will be given rest periods when needed. You may not continue to participate in this experiment if you fail to learn the skills needed to proceed with the experiment. However, you will receive course credit for the hours you participated in this experiment.

Dr. Patricia R. DeLucia will answer any questions you have about the study. For questions about your rights as a subject or about injuries caused by this research, contact the Texas Tech University Institutional Review Board for the Protection of Human Subjects, Office of Research Services, Texas Tech University, Lubbock, TX 79409, or call (806) 742-3884.

By participating in this research, you should not experience any risk beyond those encountered during daily life. You will receive 3 credits for your participation. You will also learn about the research. There are also benefits to science and society derived from your participation.

Only Dr. Patricia DeLucia and Arathi Sethumadhavan will have access to your data. Any data you provide will be stored in a locked room. Any use of your data will not mention you by name.

Your participation is voluntary. There is no penalty if you refuse to participate, and you may quit at any time without penalty or loss of benefits.

Signature of Subject:____________________________ Date:______________________ This consent form is not valid after (December, 2009).


Informed Consent Form: Participants Receiving Remuneration

By signing this consent form, you will be agreeing to participate in a study called "Effects of Levels of Automation on Air Traffic Controller Situation Awareness". Dr. Patricia DeLucia is responsible for this research. Her phone number is 742-3711 ext. 259.

The purpose of this research is to examine your ability to avoid collisions between aircraft in an airspace. You will first be trained to control air traffic, and thereafter your ability to prevent aircraft collisions will be assessed. This experiment will be conducted on one day and will last a total of approximately 3 hours. You will be given rest periods when needed. You may not continue to participate in this experiment if you fail to learn the skills needed to proceed with the experiment. However, you will be compensated for the hours you participated in this experiment.

Dr. Patricia R. DeLucia will answer any questions you have about the study. For questions about your rights as a subject or about injuries caused by this research, contact the Texas Tech University Institutional Review Board for the Protection of Human Subjects, Office of Research Services, Texas Tech University, Lubbock, TX 79409, or call (806) 742-3884.

By participating in this research, you should not experience any risk beyond those encountered during daily life. You will be compensated $5 for each hour of participation and will receive a bonus of $15 for the completion of the project. Therefore, if you complete the 3 hours of participation, you will be compensated $30. You will also learn about the research. There are also benefits to science and society derived from your participation.

Only Dr. Patricia DeLucia and Arathi Sethumadhavan will have access to your data. Any data you provide will be stored in a locked room. Any use of your data will not mention you by name.

Your participation is voluntary. There is no penalty if you refuse to participate, and you may quit at any time without penalty or loss of benefits.

Signature of Subject:____________________________ Date:______________________ This consent form is not valid after (December, 2009).


APPENDIX D

BIOGRAPHICAL QUESTIONNAIRE

Participant # --------------

1. What is your age?
2. Gender: Female / Male
3. Do you have normal vision or corrected to normal vision?
4. Do you have normal hearing or corrected to normal hearing?
5. Do you have color blindness?
6. Have you ever piloted an airplane? Yes or No
7. Do you have air traffic control experience? Yes or No

Please indicate your current level of education:
8. If currently in college, please circle how you are classified as of this semester: Freshman, Sophomore, Junior, Senior, Graduate School
9. If currently in college, please list your major:
10. If you are a college student, please indicate your GPA:


APPENDIX E

VIDEO GAME EXPERIENCE QUESTIONNAIRE

Participant # --------------

Please circle the best answer for each of the following questions.

1. Have you ever played video games? Yes / No

2. How interested are you in video games? Circle a number from 1 (not interested) to 5 (very interested).
   1   2   3   4   5

3. Do you currently play video games? Yes / No

If you answered "Yes" to Question 3, please answer the following:

4. How long have you been playing video games?
   a. 6 months   b. 1 year   c. 2-5 years   d. 5-10 years   e. 10 or more years

5. How often (approximately) do you currently play video games?
   a. Daily   b. Weekly   c. Once a month   d. Once in 6 months   e. Once a year

6. How many hours per week do you spend playing video games?
   a. Less than 1 hour   b. 1-3 hours   c. 4-6 hours   d. 7-10 hours   e. 11+ hours

7. How good do you feel you are at playing video games?
   a. Very good   b. Moderately good   c. Not very skilled   d. No skill

8. What are your top 3 (in order) genres, or video game categories, that you enjoy playing? (Choose from the following list or add your own.)


#1.________________________________________________________________ #2.________________________________________________________________ #3.________________________________________________________________

Action; Fighting; First-person shooter; Role-playing; Massively Multiplayer Online Games; Flight; Racing; Sports; Military; Space; Strategy; Real-time tactical and turn-based tactical; God games; Economic simulation games; City-building games; Adventure; Arcade; Educational; Maze; Music; Pinball; Platform; Puzzle; Stealth; Survival horror; Vehicular combat


APPENDIX F

PARTICIPANT INSTRUCTIONS

Thank you for participating in this experiment. This experiment is conducted to study the effects of automation on performance in an air traffic control task. In this experiment, you are an air traffic controller, and we will first learn about the airspace that you will be controlling.

Airspace (Instructional Video)

The airspace consists of airports, sector gates, waypoints, and aircraft. Let us talk about each in detail.

1. Airports. There are two airports, x and y.

2. Sector gates. There are 4 sector gates, namely A, B, C, and D. These are gates through which an airplane enters your airspace from another airspace. These are also exits through which an airplane leaves your airspace and enters another airspace. Once an aircraft leaves your airspace, it is no longer your responsibility.

3. Waypoints. There are 5 waypoints, namely E, F, G, H, and I. An aircraft has to pass through a waypoint to reach its destination. For example, suppose there is an aircraft originating from sector gate A whose destination is airport x. It has to pass through waypoint E to get to airport x. Similarly, an aircraft originating from sector gate C whose destination is sector gate B has to pass through waypoints G and F to get to sector gate B.

4. Aircraft.


• An aircraft has 3 important characteristics, namely altitude, speed, and destination.
• An aircraft can fly at 3 altitude levels (1, 2, and 3).
• The speed of an aircraft is represented by F (fast), M (medium), and S (slow).
• A slow speed aircraft travels at one-third the speed of a fast aircraft, and a medium speed aircraft at two-thirds the speed of a fast aircraft. This means that it takes a slow speed aircraft almost 3 times longer, and a medium speed aircraft almost 1.5 times longer, to reach its destination compared to a fast speed aircraft.
• The destination of an aircraft can be a sector gate or an airport.

[Ask participants to explain what they learned.]

Map Study Instructions

• Now we will learn the map of the airspace. [Give the map to the participant.]
• There are 4 sector gates – A, B, C, D – and 2 airports, x and y.
• There are 5 waypoints – E, F, G, H, and I.
• The landing direction to airport x is westwards and to airport y is southwards.
• Planes are identified by 1- or 2-digit numbers (e.g., 1, 2, etc.).
• Planes can travel at 3 altitudes – 3, 2, 1.
• The destination of a plane can be a sector gate or an airport.
• Planes can travel at 3 speeds – F, M, and S. A slow plane takes 3 times longer to reach its destination compared to a fast plane headed towards the same destination. A medium plane takes 1.5 times longer to reach its destination compared to a fast plane.


• Planes fly the smallest possible routes on the airspace to minimize delay, so you cannot change the route on which a plane travels. The following table shows the route that a plane takes from a given starting point to a given destination. [Give the table to the participant.]

Source & Destination   Route
A to B                 AEFB
A to C                 AEIGC
A to D                 AEHD
A to x                 AEx
A to y                 AEFy
B to A                 BFEA
B to C                 BFGC
B to D                 BFIHD
B to x                 BFEx
B to y                 BFy
C to A                 CGIEA
C to B                 CGFB
C to D                 CGHD
C to x                 CGIEx
C to y                 CGFy
D to A                 DHEA
D to B                 DHIFB
D to C                 DHGC
D to x                 DHEx
D to y                 DHIFy
x to A                 xEA
x to B                 xEFB
x to C                 xEIGC
x to D                 xEHD
x to y                 xEFy
y to A                 yFEA
y to B                 yFB
y to C                 yFGC
y to D                 yFIHD
y to x                 yFEx


• You have 5 minutes to learn the map and the source-destination table.
• Then you will be tested on your knowledge of the map. You will be given 2 minutes to reconstruct the map. [Give participants a blank map.]
• Indicate the following on the map:
  o Sector gates, airports, and waypoints
  o The landing directions to airports x and y
  o The speeds at which a plane can travel
  o The altitudes at which a plane can travel
• Please indicate on the table provided to you the route that a plane will travel when its source and destination are given. [Give participants the source-destination table.] You will be given 5 minutes to complete this table.

Rules to Control Traffic (Instructional Videos)

• Let us now learn the rules for controlling traffic in your airspace.
• Each scenario starts with a certain number of airplanes on the computer screen. As the scenario progresses, more aircraft will keep appearing on the screen. The airplanes will pop up near the sector gates, on the routes, or near airports.
• An aircraft is represented by 4 attributes in the airspace: speed, altitude, destination, and call sign.
  o Speed is represented by F (fast), M (medium), and S (slow).
  o There are three altitude levels (1, 2, and 3).
  o The destination of an aircraft can be a sector gate or an airport.
  o Call signs or aircraft identifiers are represented by 1- or 2-digit numbers.


• An example of an aircraft is F3C_15. This means that the aircraft is moving at speed F and level 3, and its destination is sector gate C. The call sign or aircraft identification number of this aircraft is 15.
  o Another example is F3x_10. This indicates that the aircraft is moving at speed F and level 3, and its destination is airport x. The call sign or aircraft identification number of this aircraft is 10.
• You will be controlling all the planes in the airspace using a computer mouse.
• All aircraft fly the smallest possible routes on the airspace to minimize delay, so you cannot change the route on which an aircraft travels. For example, consider an aircraft starting from sector gate A whose destination is sector gate C. It always travels the shorter path – AEIGC – instead of the longer path AEFGC.
• Although you cannot change the route on which an aircraft travels, you can change its altitude and speed.
• You must click on the arrow of an aircraft to make any changes. When the arrow is first clicked, it will turn yellow. A yellow arrow means that the aircraft has been contacted and is waiting for further instructions from the controller. For example, if you want to change the altitude of an aircraft, you click on the arrow of the aircraft and then click on the altitude (1, 2, or 3). If you want to change the speed of an aircraft, you click on the arrow of the aircraft and then click the speed (F, M, or S).


• A change (i.e., speed or altitude) must be executed before the controller can select another aircraft. Therefore, if you are trying to select an aircraft and nothing is happening, check to see whether another aircraft has been selected but not given a change.
• Only one command can be issued to an aircraft at a time. The cycle is: select aircraft, click on command, select aircraft, click on command. [End of this instructional video. Ask participants to explain what they learned. Start the next instructional video.]

• The objective of the game is to land an aircraft at an airport or exit an aircraft through a sector gate as soon as possible without any errors. Aircraft headed for sector gates should exit at level 3 and at speed fast. For example, consider this aircraft exiting through sector gate D. Aircraft landing at airports should be at level 1 and speed slow when the aircraft is being landed. Aircraft at a different level or speed will land safely, but points will be deducted. It is all right to have a plane at a different speed or level in transit to the airport, but before the final landing try to change the level to 1 and the speed to slow. For example, consider this aircraft landing at airport y. Its speed is slow and its level is 1.
• Do not change the speed of a plane to slow several minutes before landing at an airport. For example, consider this plane. Change its speed to slow only after it is midway on Fy. This is because when you slow down a plane, the time taken to land the plane at its destination increases. That is, the en route delay time increases. Remember, your goal as a controller is to make the planes safely reach their destination as quickly as possible. The more time you take, the more points will be deducted. So change the speed of a plane to slow only midway after it crosses F to land at y. Similarly, change the speed of a plane landing at airport x to slow only after it is midway on Ex.
• Do not change the speed of a plane to slow at any time except when it is being landed, in order to minimize delay in your airspace.
• Once an aircraft has landed at an airport or exited through a sector gate, it is no longer your responsibility. [End of this instructional video. Ask participants to explain what they learned.]

• Look at this plane [point to the plane F3x_10 on the screen]. This plane is flying at speed fast, level 3, and its destination is airport x. It is flying at altitude 3 because the pilot has requested that altitude. So you should change its altitude to 1 only after it crosses E to land at x. Similarly, look at this plane [point to the plane F2B_50 on the screen]. This plane is flying at speed fast, level 2, and its destination is sector gate B. It is flying at altitude 2 because the pilot has requested that altitude. So you should change its altitude to 3 only after it crosses F. In short, change the levels of planes only when they are right about to land or about to exit. [Start the next instructional video.]
• Let us now talk about separation errors and collisions.
• If two planes flying at the same altitude are within 5 miles of each other (this is a 5-mile distance in this airspace), this will create a separation error. Consider this example. A separation error may or may not result in a collision. In this example, the separation error will not result in a collision because the planes are not moving towards each other. Separation errors should be avoided.




Now let us talk about collisions. A collision occurs when two aircraft that are at the same altitude reach the same location at approximately the same time. This red cross here indicates that a collision occurred here. Penalties are recorded when a collision occurs. This is called an aircraft penalty. Your job as a controller is to avoid all possible collisions between aircraft.



You can prevent collisions and separation errors between two aircraft by making an altitude separation.



What is an altitude separation? Suppose two aircraft are at level 2. If these two aircraft arrive at the same location at the same time, then there is a conflict. You can resolve this conflict by changing the altitude of one of the aircraft.



Once you detect an upcoming collision, press the conflict button located on the right side of the screen. For example, consider these two aircraft with call signs 10 and 15. These will collide near I if they keep flying at the same altitude.



As soon as you detect an upcoming collision, press the conflict button. Try to be as fast as possible in pressing the conflict button. The faster you detect an upcoming collision, the better your conflict detection performance.



After pressing the conflict button, say aloud the names of the aircraft that would be involved in the upcoming collision.



Press the conflict button before you make the altitude separation to resolve the collision. But if you think that pressing the conflict button before making the altitude change will result in a collision, please go ahead and avoid the collision and then press the conflict button.



Do not forget to press the conflict button when you detect a collision.




So your sequence of actions should be: press the conflict button and then change the altitude of one of the aircraft to avoid an upcoming collision.



Try to minimize making unnecessary altitude changes to aircraft. You can make altitude changes to avoid collisions and separation errors between aircraft. You can also make altitude changes to land or exit planes at the right altitude. [End of this instructional video. Ask participants to explain what they learned. Start the next instructional video]



Let us now talk about handoff delays.



The scenario will start with a certain number of aircraft activated, and periodically additional aircraft will appear in a grayish color. These planes are waiting to be activated (that is, waiting for you to accept the handoff). It just means that these planes are waiting to enter your airspace and can do so only when you give them permission. To activate an aircraft, click on the arrow. Within 7 seconds, the aircraft will turn green, indicating it has been activated.



It is important to try to activate all gray aircraft that appear on the screen as soon as possible. If you do not activate a gray aircraft within 7 seconds, it will automatically activate itself and you will be assigned a handoff delay penalty.



Therefore, a handoff delay score is kept for how long it takes to activate all the aircraft that appear on the screen. Try to minimize the handoff delay time.
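
The handoff rule reduces to a simple timing check; the sketch below is a hypothetical illustration, with the 7-second limit being the only value taken from the instructions (whether the delay charged for a self-activation is capped at 7 seconds is an assumption):

    HANDOFF_LIMIT_S = 7  # gray aircraft self-activate after this long

    def handoff_outcome(response_time_s):
        # Returns (delay charged, penalty assigned) for one handoff.
        if response_time_s < HANDOFF_LIMIT_S:
            return response_time_s, False   # controller accepted in time
        return HANDOFF_LIMIT_S, True        # self-activated; delay penalty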



Finally, a score of all errors, penalties, and delays is recorded. Try to minimize all errors, penalties, and delays. [End of this instructional video. Ask participants to explain what they learned.]


Secondary Task Instructions (Instructional Video)

In addition to controlling air traffic, you are also required to monitor a weather display for wind shear warnings.



A wind shear is a change in wind speed and/or wind direction with distance along a plane's flight path. Wind shear has been a significant cause of aircraft accidents in the United States. Let me give an example of how a change in wind direction affects the flight of a plane. If the wind is blowing slightly upward under the plane and suddenly switches to blowing downward, the result is a direct force pushing the plane downward. If a wind shear such as this one is violent enough, it can be deadly.



A change in wind speed can also affect the flight of a plane. For example, if an aircraft experiences a sudden decrease in wind speed, it can reduce the lift on its wings to dangerously low values.



Therefore, monitoring wind shear warnings is important.



This task will be presented on the laptop placed adjacent to the air traffic control display.



In this display, you will see a series of numbers presented one after the other.



A value that is less than 130 or greater than 170 represents a wind shear warning.



So if you see a number that is less than 130 or greater than 170, please press the spacebar key on the laptop. Pressing the space bar will send a signal to the planes in your airspace about the wind shear alert.



Please try to be as fast and accurate as possible in this task. The speed and accuracy with which you perform this task will be recorded.




If you see numbers in the range 130-170 (including 130 and 170), do not press any key. If you press the spacebar key for the numbers in the range 130-170, it will be recorded as an error.
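
The response rule for this task amounts to a threshold check; the following sketch is only an illustration of the rule as stated (all names are hypothetical):

    SAFE_LOW, SAFE_HIGH = 130, 170  # inclusive no-response range

    def is_wind_shear_warning(value):
        # Warnings are values below 130 or above 170.
        return value < SAFE_LOW or value > SAFE_HIGH

    def classify_response(value, pressed_spacebar):
        # Maps each trial to its scored outcome.
        if is_wind_shear_warning(value):
            return "hit" if pressed_spacebar else "miss"
        return "error" if pressed_spacebar else "correct rejection"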



This task is secondary to the air traffic control task. However, try to be as fast and accurate as possible in this task.



Let us now practice this task.



[Ask participants to explain what they learned.]



If you do not press the spacebar key when the number on the screen is less than 130 or greater than 170, a beep will sound. This beep is presented to give you feedback that you forgot to press the spacebar key in response to a wind shear warning.

SAGAT Instructions

At a random time during this scenario, the scenario will freeze and the air traffic control screen will be blanked. The weather display will also be frozen. At this point, please turn away from the air traffic control screen. You will then be provided with a map of the airspace. Sector gates, airports, and waypoints will be shown on the map. You are required to indicate, with an X mark, the location of all known aircraft that were in the scenario right before the freeze occurred [Show them an example of how to complete this]. You will not be able to acquire any information from the air traffic control screen to complete this. Then return the map to the experimenter. You will then be presented with a map with the correct locations of all the aircraft. For each aircraft present on the map, indicate the following:

1. Aircraft altitude
2. Aircraft destination
3. Aircraft call sign
4. Aircraft heading direction

Once you complete this, give the map to me. You have a maximum of 5 minutes to complete this. We will then continue with the scenario. At any time during any of the scenarios in this experiment, if you encounter such a freeze, you will be given a map of the airspace and asked to recall the attributes of all the aircraft present in your airspace at that time. Multiple freezes can also occur during a scenario. These freezes will occur at random times. Let us now complete a scenario to familiarize yourself with this recall procedure.

LOA Instructions: Information Acquisition Group (Instructional Video)

From now on, in the rest of the scenarios you will be given an automated aid to help you control traffic.



This automated aid detects the altitude of all the aircraft present in your airspace and will present the aircraft that are flying at the same altitude in the same color.



All the aircraft flying at altitude 3 will be presented in the airspace in green color.



All the aircraft flying at altitude 2 will be presented in the airspace in pink color.



All the aircraft flying at altitude 1 will be presented in the airspace in blue color.



So if you see two aircraft of the same color that are about to violate the 5-mile separation standard, it means that a potential separation error or collision may occur.



Now I will explain how this automated aid works. This aid determines the altitude of each aircraft in your airspace and presents the aircraft in green color if the altitude is 3, pink color if the altitude is 2, and blue color if its altitude is 1.
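
This aid's behavior amounts to a lookup from the sensed altitude to a display color; a minimal sketch follows (hypothetical names, not the simulation's actual code):

    ALTITUDE_COLORS = {3: "green", 2: "pink", 1: "blue"}

    def display_color(sensed_altitude):
        # The aid colors each aircraft by the altitude it sensed; if the
        # sensed altitude is wrong, the displayed color is wrong as well.
        return ALTITUDE_COLORS[sensed_altitude]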




Even though the automated aid helps you detect potential separation errors and collisions more easily by presenting aircraft that are at the same altitude in the same color, it is still your responsibility to avoid the occurrence of separation errors and collisions between aircraft pairs by making an altitude separation.



You have the authority to accept or reject the information presented by the automated aid. That is, you can yourself look for the altitude of each aircraft rather than relying on the color coding provided by the automated aid.



The automated aid is not 100% reliable. That is, the aid might acquire the wrong altitude information for an aircraft. For example, suppose the altitude of an aircraft is 2, in which case it should be displayed in pink color. If the aid acquires the wrong altitude information for this aircraft, it will display this aircraft in green or blue color instead of pink.



Even if the aid fails, it is still your responsibility to effectively monitor the aircraft in your airspace and prevent separation errors and collisions between aircraft.



As before, press the conflict button once you detect an upcoming conflict. Be as fast as possible. Please say aloud the call signs of the aircraft involved in the upcoming collision.



Any questions? So let us do a practice scenario. You can ask questions when we complete the practice scenario. [Ask participants in what color altitudes 3, 2, and 1 are represented.]

LOA Instructions: Information Analysis Group (Instructional Video)

From now on, in the rest of the scenarios you will be given an automated aid to help you control traffic.




This automated aid will indicate the aircraft IDs of those aircraft that will lose separation or collide with each other if they keep flying at the same altitude. This information will be presented in a box titled “Potential collisions and separation errors” on the left side of the screen. For example, it will say “3 (GC) 5 (AE)”. This means that the aircraft with aircraft IDs 3 and 5, at locations GC and AE respectively, are flying at the same altitude and will collide with each other in the future.



If multiple potential conflicts are projected to occur in the airspace, then the aircraft IDs of those aircraft that would be involved in these conflicts will be displayed in the table in separate lines.



In addition, to help you locate the two aircraft, the two aircraft projected to be in conflict will be highlighted in blue color in the airspace.



Even though the automated aid highlights both of the aircraft that will lose separation or collide with each other, it will not prevent the occurrence of the collision or separation error. That is, you have to avoid the occurrence of separation errors and collisions between aircraft pairs by making an altitude separation.



Once you avoid the upcoming collision between the highlighted aircraft by making an altitude separation, then the highlighting will disappear from the screen.



Now I will explain how this automated aid works. This aid determines the speed, location, altitude, heading direction, and destination of each aircraft in your airspace and estimates whether this aircraft will be within 5 miles of any other aircraft in the future. If the 5-mile separation will be violated in the future, the aid will display this information in the table.
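
One plausible way to implement such a projection is a pairwise dead-reckoning check like the sketch below. This is an assumption-laden illustration (constant-velocity projection, a 10-minute look-ahead horizon, hypothetical aircraft attributes), not the simulation's actual algorithm:

    import math
    from itertools import combinations

    def position_at(ac, t):
        # Constant-velocity projection of an aircraft t minutes ahead.
        return ac.x + ac.vx * t, ac.y + ac.vy * t

    def projected_conflicts(aircraft, horizon=10.0, step=0.5):
        # Flags same-altitude pairs projected within 5 miles of each other
        # at any sampled time, producing entries like "3 (GC) 5 (AE)".
        conflicts = []
        for a, b in combinations(aircraft, 2):
            if a.level != b.level:
                continue
            t = 0.0
            while t <= horizon:
                ax, ay = position_at(a, t)
                bx, by = position_at(b, t)
                if math.hypot(ax - bx, ay - by) < 5:
                    conflicts.append(f"{a.ac_id} ({a.segment}) "
                                     f"{b.ac_id} ({b.segment})")
                    break
                t += step
        return conflicts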




You have the authority to accept or reject the information presented by the automated aid. That is, you can yourself look for potential separation errors and collisions instead of using the information provided by the automated aid.



The automated aid is not 100% reliable. That is, the aid might acquire wrong speed, location, altitude, heading, or destination information for an aircraft and might hence fail to detect an upcoming conflict, in which case this information will not be displayed in the table.



Even if the aid fails, it is still your responsibility to effectively monitor the aircraft in your airspace and prevent separation errors and collisions between aircraft.



As before, press the conflict button once you detect an upcoming conflict. Be as fast as possible. Please say aloud the call signs of the aircraft involved in the upcoming collision.



Any questions? So let us do a practice scenario. You can ask questions when we complete the practice scenario. [Ask participants to explain the functionality of the automated aid.]

LOA Instructions: Decision and Action Selection Group (Instructional Video)

From now on, in the rest of the scenarios you will be given an automated aid to help you control traffic.



This automated aid will indicate the aircraft IDs of those aircraft that will lose separation or collide with another aircraft in the future if they keep flying at the same altitude. The aid will also provide a recommendation for resolving the potential conflict. For example, it will say “3 (GC) Shift lvl to 1”. This means that the altitude of the aircraft whose aircraft ID is 3 and which is in segment GC in the airspace needs to be changed to level 1 to prevent an upcoming conflict with another aircraft. This information will be presented in a box titled “Recommendations” on the left side of the screen.

If multiple potential conflicts are projected to occur in the airspace, then the recommendations to avoid these conflicts will be displayed in the table in separate lines.



In addition, to help you locate this aircraft, the aircraft on which the altitude change is recommended will be highlighted in blue color in the airspace.



Even though the automated aid provides you with recommendations to resolve potential separation errors and collisions between aircraft, it is still your responsibility to avoid the occurrence of separation errors and collisions between aircraft pairs by making an altitude separation.



Once you avoid the upcoming collision by making the altitude change on the recommended aircraft, then the highlighting will disappear from the screen.



Now I will explain how this automated aid works. This aid determines the speed, location, altitude, heading direction, and destination of each aircraft in your airspace and checks whether this aircraft will be within 5 miles of other aircraft in the future. If the 5-mile separation will be violated in the future, this aid will provide the recommendation to resolve the potential separation error or collision.
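
The recommendation step adds only a resolution choice on top of the same detection. The sketch below is hypothetical; how the simulation actually picks the new level is not stated, so the "first free level" rule here is an assumption:

    LEVELS = (1, 2, 3)

    def recommend_resolution(ac):
        # Suggest moving one conflicting aircraft to a different level,
        # formatted like the on-screen message "3 (GC) Shift lvl to 1".
        new_level = next(lvl for lvl in LEVELS if lvl != ac.level)
        return f"{ac.ac_id} ({ac.segment}) Shift lvl to {new_level}"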



You have the authority to accept or reject the recommendation of the automated aid. That is, you can yourself look for potential separation errors and collisions and make decisions on how to avoid separation errors and collisions instead of using the recommendations provided by the automated aid.




The automated aid is not 100% reliable. That is, the aid might acquire wrong speed, location, altitude, heading, or destination information for an aircraft and might hence fail to detect an upcoming separation error or collision, in which case the recommendation to avoid the collision or separation error will not be displayed in the table.



Even if the aid fails, it is still your responsibility to effectively monitor the aircraft in your airspace and prevent separation errors and collisions between aircraft.



As before, press the conflict button once you detect an upcoming conflict. Be as fast as possible. Please say aloud the call sign of at least one of the aircraft involved in the upcoming collision.



Any questions? So let us do a practice scenario. You can ask questions when we complete the practice scenario. [Ask participants to explain the functionality of the automated aid.]

LOA Instructions: Action Implementation Group (Instructional Video)

From now on, in the rest of the scenarios you will be given an automated aid to help you control traffic.



This automated aid will detect those aircraft that will lose separation or collide with another aircraft in the airspace in the future if they keep flying at the same altitude. The aid will then prevent the occurrence of the potential separation error or collision by changing the altitude of one of the aircraft.



The aircraft that will lose separation or collide with another aircraft will be highlighted in blue color in the airspace, and its altitude will be changed. This information will also be displayed in a box titled “Action performed by Automation” on the left side of the screen. For example, it will say “3 (GC) Changed lvl to 1”. This means that the altitude of the aircraft whose aircraft ID is 3 and which is in segment GC in the airspace was changed to level 1 to prevent an upcoming separation error or collision with another aircraft.

Now I will explain how this automated aid works. This aid determines the speed, location, altitude, heading direction, and destination of each aircraft in your airspace and checks whether this aircraft will be within 5 miles of other aircraft in the future. If the 5-mile separation will be violated in the future, this aid will resolve the potential separation error or collision by making an altitude change.
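
At this highest level of automation the aid executes the change itself. The sketch below (hypothetical names, same caveats as the earlier sketches) differs from the previous group's aid only in that it changes the aircraft's altitude rather than recommending a change:

    def auto_resolve(ac, levels=(1, 2, 3)):
        # The aid changes the altitude itself and logs the action,
        # e.g. "3 (GC) Changed lvl to 1". If its sensed data are wrong,
        # it may never detect the conflict and thus never act.
        new_level = next(lvl for lvl in levels if lvl != ac.level)
        ac.level = new_level
        return f"{ac.ac_id} ({ac.segment}) Changed lvl to {new_level}"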



The automated aid is not 100% reliable. That is, the aid might acquire wrong speed, location, altitude, heading, or destination information for an aircraft and might hence fail to detect an upcoming separation error or collision, in which case it may fail to avoid the collision or separation error between aircraft.



Even if the aid fails, it is still your responsibility to effectively monitor the aircraft in your airspace and prevent separation errors and collisions between aircraft.



As before, press the conflict button once you detect an upcoming conflict. Be as fast as possible. Please say aloud the call sign of at least one of the aircraft involved in the upcoming collision.



Any questions? So let us do a practice scenario. You can ask questions when we complete the practice scenario. [Ask participants to explain the functionality of the automated aid.]


NASA-TLX Instructions

Please rate the workload that you experienced during this scenario. [Double click on the nasa-tlx icon and enter the subject number followed by scenario name.] Please rate your mental demand, physical demand, temporal demand, performance, effort, and frustration level. Before making the ratings, we will go over the definitions of each of these terms. [Give participants the following table.]

Now, you will just click on the area of the scales that best represents your answer. If you have any questions about the meaning of these terms, please ask.

Now, click on the box that best represents your answer.
