A proactive approach to human error detection and identification in aviation and air traffic control

Tom Kontogiannis, Department of Production Engineering & Management, Technical University of Crete, Chania, Greece
Stathis Malakis, Hellenic Civil Aviation Authority, Rhodes/Diagoras International Airport, Rhodes, Greece

Abstract
In recent years, there has been a realization that the total elimination of human error may be difficult to achieve. A further reduction of accidents will require a better understanding of how practitioners manage their errors in ways that consequences are contained or mitigated. With this goal in mind, the present study sets out to propose a framework of cognitive strategies in error detection that would make human performance resilient to changes in work demands. The literature has regarded error detection as a spontaneous process that occurs either while an action is executed (action-based detection) or after action feedback (outcome-based detection). To help practitioners maintain a state of mindfulness and introspection, this study proposes several cognitive strategies, such as rehearsing tasks for future execution, bringing routine tasks into conscious attention, seeing how trajectories change over time, and cross-checking data for reliability. Two further detection mechanisms are proposed at the situation assessment and planning stages of performance.
Awareness-based detection may include revising a model of the
situation, finding hidden assumptions, and testing the plausibility of assumptions. Planning-based detection addresses issues such as identifying uncertainties in a plan, thinking out possible errors, and deciding when and how often to review task progress. Finally, several attitudinal factors and team factors are presented that affect the processes of error detection and identification. The cognitive strategies in error detection, together with the attitudinal and team factors, constitute a framework for designing the content of error management training.

KEYWORDS: Human reliability, error detection, mindfulness, situation assessment, planning, error management training.
1. Introduction
Research on human error has focused primarily on error classification schemes, error prevention through interface design and error-tolerant systems (Hollnagel, 1996; Shorrock and Kirwan, 2002). With the increasing complexity of technical systems, however, there has been a realization that the total elimination of human error may be difficult to achieve.
There are always bound to be complex situations in which errors creep in due to high workload, decision making under stress, and poor team coordination.
What seems to be more important in these situations is preventing or
containing adverse consequences through the detection and correction of errors rather than prevention of errors in the first place.
This new error management approach, incorporated in the fifth generation of Crew Resource Management (CRM), seeks to provide an understanding of how practitioners manage errors in ways that contain or mitigate their consequences.
Therefore, the emphasis has been on how errors are
detected, how they are identified or explained, and how consequences are controlled to maintain safe system performance (Helmreich and Merritt, 2000). An understanding of the error management process is essential in improving the safety and reliability of operations. Since the early nineties, a growing number of studies have examined error recovery in laboratory tasks and real environments, including aviation (Wioland and Amalberti, 1996; Doireau et al., 1997; Sarter and Alexander, 2000), air traffic control (Bove, 2004), process control industries (Kanse and van der Schaaf, 2001), maritime operations (Seifert and Hutchins, 1992) and human-computer interaction (Zapf et al., 1994). These studies have shown that a considerable number of errors are never detected or are detected too late for effective intervention and recovery.
An
observational study of normal airline operations (Thomas, 2004) has shown that almost half of the errors went undetected by the crew, although only a small number of errors led to undesired aircraft states. Furthermore, the error detection rate seems to be lower for mistakes (i.e., errors in forming intentions and choosing strategies) but higher for slips and lapses (i.e., errors in
executing a strategy). A study of Aviation Safety Reporting System (ASRS) incidents indicated that the person in error managed to detect only 24% of his or her errors; the majority of errors were detected by other crew members and air traffic controllers (Sarter and Alexander, 2000). In addition, most self-detected errors were caught incidentally by a routine check, not because of some expectation-driven strategy on task progress. This finding may indicate that proactive and self-monitoring strategies should become part of error management training. This paper aims at developing a taxonomy of proactive strategies in error detection to enhance operator resilience and provide the content basis of error management training. Error management entails several cognitive processes that must be explored systematically.
Rizzo et al. (1994) and van der Schaaf (1995)
distinguished three processes in error management, namely: (a) error detection - realizing that an error is about to occur or suspecting that an error has occurred, independently of understanding the nature and cause of the error (Zapf et al., 1994); (b) error explanation - explaining why an error occurred; and (c) error correction - modifying an existing plan or developing a new plan to compensate. Error correction is a complex process that takes several forms. Mo and Crouzet (1996) and Kontogiannis (1999) distinguished between three goals in error correction: (1) backward recovery, where the system is brought back to its initial state prior to the occurrence of the failure; (2) forward recovery, where the system is brought into an intermediate state in order for the operators to 'buy time' and find a better solution later; and (3) compensatory recovery, where redundant equipment is activated to bring the system to the final state that was desired.
Kanse (2004) proposed a different
categorization of error correction as follows:
Stabilization of the situation in order to 'freeze' the problem and prevent things from getting out of control (e.g., climbing to a safer altitude so that pilots can get a better estimate of the problem and its consequences)
Mitigation of the problem in order to reduce the amount, scope and impact of error consequences (e.g., extinguishing a fire)
Workaround of the problem in order to find a temporary solution and keep the situation under control (e.g., temporarily stopping a leak from a pump)
Permanent solution to the problem, which usually follows one of the above three correction strategies (e.g., installing a new filter in the pump)
The aim of this study is to contribute toward a better understanding of error management in high risk environments by reviewing the mechanisms and cognitive strategies involved in error detection and identification.
The first part of this paper
focuses on the mechanisms of error detection and identification.
A framework of
cognitive strategies is proposed based on an analysis of near misses, simulation studies, reviews of non-technical skills and models of human cognition.
The third part of this
paper presents several attitudinal factors and team factors that affect the processes of error detection and identification.
Implications for error management training are
discussed in the concluding section.
2. Mechanisms of error detection and identification
2.1 Error detection
A recent review of error detection mechanisms (Blavier et al., 2005) has classified them into four forms on the basis of two criteria: (1) whether the agent who detected the error was the same person that committed it, an observer or a function of the technical system, and (2) whether the error was detected before or after the results of the action appeared on the user interface. Outcome-based detection is the first form of detection and is triggered by a mismatch between 'observed outcomes' and 'expected effects'. Difficulties in attending to actual outcomes and maintaining 'expectations about effects' can be the result of a combination of job factors. The action outcomes, for instance, may not be perceptible because of poor interface design, may be masked by safety logic interventions, or may not be sufficiently attended to because of high workload. On the other hand, 'expectations about effects' may be ill-specified because of unfamiliarity with the work domain, or the observed outcomes may be attributed to other causes. A well-known example of ill-specified expectations can be found in mathematical calculations or mental maths, where it is difficult to know what to expect in terms of effects (Sellen, 1994). In distributed team work, the handling of
emergencies may be misled by attributing the 'observed outcomes' to the wrong expectations. In the Swissair 111 accident over Halifax (1998), for instance, two radar controllers assumed that the flight crew had turned off the electrical systems in order to dump fuel (AAI [1]). This expectation, reinforced by analogies from the past, may explain the cessation of air-to-air communication and secondary radar information. The real cause of the problem was a rapidly spreading fire that had violently terminated the operation of several aircraft systems, including the electrical one. Errors can also be detected in the execution stage, where people notice a mismatch between actions being executed and actions specified in their plans. As argued by Sellen (1994), action-based detection takes place through a perception of some aspect of the erroneous action, either auditorily, visually, or proprioceptively.
A large
percentage of typing errors, for instance, can be detected by skilled typists in the absence of any feedback from the display or keyboard (Rabbitt, 1978). The same concept applies to speech, where many psycholinguists accept the existence of an internal "editor" that constantly monitors and checks speech outcomes before and immediately after their utterance (Fromkin, 1980).
In air traffic control (ATC), lapses and slips of the tongue
can be captured by a comparison
between ‘memory of issued instructions’ and
instructions specified in the communication protocol. Sellen (1994) has also identified another two forms of error detection based on the availability of ‘forcing functions’ and ‘cross-checking by another person’.
Forcing
functions are design constraints that block deviations from the expected course of action. Norman (1988) identified several design elements of text editors that could alert people to their errors by means of warnings, dialogues with the user, and automatic correction of minor errors. In the context of aviation, a forcing function takes the form of a computer algorithm that rejects pilot actions that could endanger the safety envelope of flight. Detection by another person is a form of detection that depends on team communications and cross-monitoring activities.
While a large number of slips and
lapses can be detected by the person who committed the error, the detection of mistakes is more difficult because the same error-producing conditions may hinder error detection. Analyses of simulated scenarios in aviation (Thomas, 2004) and naval command teams (Serfaty and Entin, 1996) have shown that team monitoring can be a valuable source in detecting mistakes of other team members.
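By way of illustration only, the following sketch shows how a forcing function of the kind described above might be realized in software: a guard that rejects commanded values falling outside a permitted envelope, so that the deviation is blocked and made visible rather than silently executed. The parameter names and limits are hypothetical and are not taken from any real flight control system.

```python
# Illustrative 'forcing function': reject commanded inputs outside a permitted
# envelope so the deviation is blocked and flagged. Limits are hypothetical.
from dataclasses import dataclass

@dataclass
class Envelope:
    min_value: float
    max_value: float

# Hypothetical envelope limits; real flight-control envelopes are far richer.
ENVELOPES = {
    "bank_angle_deg": Envelope(-35.0, 35.0),
    "pitch_deg": Envelope(-15.0, 25.0),
}

def apply_command(parameter: str, commanded_value: float) -> float:
    """Return the accepted value, or raise to force the operator to reconsider."""
    env = ENVELOPES[parameter]
    if not (env.min_value <= commanded_value <= env.max_value):
        # The forcing function blocks the action and makes the error visible.
        raise ValueError(
            f"{parameter}={commanded_value} rejected: outside envelope "
            f"[{env.min_value}, {env.max_value}]"
        )
    return commanded_value

if __name__ == "__main__":
    print(apply_command("bank_angle_deg", 20.0))   # accepted
    try:
        apply_command("bank_angle_deg", 50.0)      # rejected and flagged
    except ValueError as err:
        print("Blocked:", err)
```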
2.2 Error identification and localization
Once an error has been detected, people may try to identify the nature of the problem and explain the causes of error. The contribution of the error explanation phase to the error handling process is not so well researched.
Analysis of near-miss reports
(van der Schaaf, 1995) has shown that relatively few recoveries of errors went through the more analytic phase of explaining the causes of error. This could be attributed to the limited time available to compensate or develop a new plan. On the other hand, there might be cases where an explanation of error causes may be necessary in order to compensate.
This reflects some of the research problems about the extent that an
accurate understanding of the problem is necessary for developing a plan of action. Kanse (2004) has argued that error explanation can involve both the identification of the nature of the problem and the explanation of the underlying cause of the problem (error localization). In error identification, for instance, people may sample additional information to understand the nature of the problem in terms of affected functions, available means and system consequences (e.g., the degree of urgency in stabilizing the situation).
Practitioners are able to identify some of their own errors and hidden
assumptions, although the causes that led to these errors may not be known yet. On the other hand, error localization requires a more detailed diagnosis of the conditions and causes that led to the actual error. The boundaries between error identification and localization are not always clear-cut because practitioners are likely to adopt a mixed approach where some assumptions are made about the causes of error without a conscious effort to test them; however, as information accrues, assumptions can be further clarified. Kanse and van der Schaaf (2001) presented a circular recovery process where, after the identification of the error, a 'quick fix' (i.e., immediate correction) is attempted, followed by a more detailed process of error localization in order to find a long-term solution to the problem later on. The implication may be that, although error localization is not always necessary for all participating members to correct the problem, error detection and identification are essential in setting new directions for correcting the problem. The focus of this paper, therefore, has been on the processes of error detection and identification.
3. Cognitive strategies in error detection and identification
Action-based and outcome-based detection regard error detection as a spontaneous process where little preparation has been made by the operators in advance of the error. The processes of 'forcing functions' and 'team monitoring' also reinforce a view of detection as an effortless process that is triggered by the system or another team member. What most previous studies fail to recognize is that experienced operators have to maintain a state of alertness and mindfulness manifested as: rehearsing tasks for future execution, bringing routine tasks into conscious attention, thinking out possible errors that may occur and devising barriers, drawing relationships between data, seeing how trajectories change over time, and cross-checking data for reliability. These cognitive processes involve deliberate planning for the unexpected and self-introspection into task progress that enhances operator resilience. It is important, therefore, that we understand the metacognitive activities of action-based and outcome-based detection. Metacognition has been described by G. Kranz, a former NASA mission flight controller, as a process of "maintaining gimlet-eyed focus on the job at hand while gathering reserves for what lay ahead" (Kranz, 2001, p. 308). He coined the term 'relaxed alertness' to describe this state. This was the state of the mission controllers of Apollo 13 prior to the onset of the events in which a great disaster was barely averted. Another criticism of the literature is that little attention has been paid to the detection of problems at the conceptual and planning stages of performance.
Detection of
mistakes can occur while a plan of action is formulated or an assessment is communicated to other team members. Several error detection strategies may be brought into play such as revising an assessment that appeared plausible at an earlier stage given available data, finding hidden assumptions in an assessment, identifying uncertainties in a plan that should be checked out during execution, thinking out possible errors that may occur, and deciding when and how often to review task progress.
These strategies at the
conceptual stage can be termed awareness-based and planning-based detection. To put into perspective the various cognitive strategies involved in detecting errors at the conceptual and execution stages, Figure 1 presents a simple model of human performance that encompasses four stages: (1) assessment of the situation, (2) formulation of a plan of action, (3) rehearsal, execution and adaptation of the plan and (4) evaluation of the outcome of performance based upon system feedback.
The four stages
are performed in a circular fashion, so that feedback of performance can be used to alter an existing goal or modify a previous assessment of the situation. Insert Figure 1 here
This is a nonlinear model of performance since there is no need for practitioners to complete situation assessment before decision making. Practitioners may live with some uncertainty about the situation and make a decision on how to tackle the problem at an early stage; as more evidence becomes available, the assessment of the situation can be revised. In the same sense, it is not necessary to specify detailed plans in advance for how to tackle a problem; modifications to plans can be made after some feedback is provided. The level of specification, the depth of preparation, and the mode of switching between stages are important aspects of self-monitoring that can provide valuable support in error detection and identification. This paper proposes four stages in error detection, as follows:
Awareness-based detection where an assessment of the situation is revised in order to identify ‘hidden’ and ‘untested’ assumptions, collect missing data and formulate a comprehensive account of problem causes.
Planning-based detection where a plan of action is revised so that new evidence is taken into account, conflicts between goals are balanced, and dependencies between tasks are reduced in order to provide more opportunities for error detection.
Action-based detection where errors are 'caught in the act' by means of proactive strategies such as rehearsing tasks, thinking out possible errors and devising barriers at the execution stage.
Outcome-based detection where mismatches between 'expected outcomes' and 'observed outcomes' are identified with cognitive strategies such as seeing how trajectories change over time, spotting rates of change and cross-checking data. The main argument, put forward in this paper, is that human error should not be seen
as a failure of cognitive resources to make a correct assessment or develop a correct plan in the first place.
Figure 1 presents an incremental view of performance where an
assessment can tolerate certain sources of uncertainty and proceed with a plan of action while remaining vigilant to new evidence. Detecting errors in situation assessment amounts to handling data uncertainty so that the assessment becomes more accurate and
reliable.
Other sources of uncertainty faced by operators at different stages of
performance (as shown in Figure 1) include: incomplete and conflicting data, goal tradeoffs, need to modify plans quickly, acting and monitoring simultaneously, masking effects and automation effects. Table 1 shows a taxonomy of cognitive strategies in error detection and identification that is described in the following sections. The cognitive strategies have been elicited from observations of simulator training exercises at the premises of the Eurocontrol Institute of Air Navigation Services (IANS) and the Maastricht Upper Area Control Centre (MUACC), as well as from several analyses of near misses from the Aviation Safety Reporting System (Antonogiannakis, 2003; Mitsotakis, 2006). Insert Table 1 here
3.1 Strategies in awareness-based detection
Detecting errors and misunderstandings in the way that people construe a problem requires a process of introspection, known as metacognition.
Self-monitoring is
important for improving understanding of situations that involve complex patterns of cues. The process of situation assessment may require more than the pattern-matching abilities of human experts. Recent studies in making sense of complex problems have proposed that people tend to generate stories and explanations to account for problems that do not seem to fit previous experiences (Cohen et al., 1996; Klein et al., 2003). Story building entails constructing a mental model of the situation and making sustained efforts to critique and correct it regularly. Cohen et al. (1996) described the process of story building in terms of a Recognition/Meta-recognition (R/M) framework where experts strive to construct complete and coherent (conflict-free) models of the situation.
This is done by
critiquing their models in terms of completeness, coherence and reliability (see Figure 2). An assessment is incomplete if key elements of a situation model are missing. Further data can be collected or retrieved from memory, which, on some occasions, may result in conflicting models of the situation (i.e., arguments with contradictory conclusions). Under stress, some individuals may explain away evidence in order to maintain a coherent assessment or model of the situation. In contrast, experts resolve conflicts by testing their explanations or assumptions for reliability. This may entail cross-checking
related instruments, waiting for additional data, or inviting colleagues into the assessment; the reliability test usually results in dropping false data and explanations. A reliability test can also be initiated in the absence of conflict; assumptions and explanations can be generated and tested to fill in gaps in an incomplete model of the situation. The R/M model facilitates critique and correction by reducing considerations to a single common currency: the reliability of assumptions. If unreliability is too great, a new cycle of critiquing may trigger efforts to construct a new model of the situation. Other frameworks for studying the process of interrogating a model of the situation or a model of a plan (Klein, 2004) tend to agree that people should develop skills in coping with uncertainty. Handling uncertainty and revising understanding are important elements of cognitive strategies in detecting mistakes before starting the implementation of a course of action. On the basis of the R/M framework, four types of awareness-based detection mechanisms are proposed, as follows. Insert Figure 2 here
3.1.1. Makes an effort to detect missing cues
Practitioners make sense of a situation by trying to build coherent explanations of what has happened. When there is a 'gap' in their understanding, they look for other cues that may be currently missing in the environment. The challenge for practitioners here is to estimate correctly the urgency of the situation and decide whether it is appropriate to spend more time collecting data or make a final assessment with the available evidence. Sometimes it makes sense to wait longer because misunderstandings and errors can be detected when more data are used to interpret the situation.
At other times, waiting
passively for new evidence is not the best way to handle uncertainty.
Experienced
controllers have been observed to 'provoke the system' in order to generate new data. A common source of difficulty in ATC is military aircraft entering a traffic sector without a visual presentation of their route and altitude on the radar screen.
Air traffic controllers
can detect missing cues and fill their gaps in understanding the intentions of the aircraft by making a clear inquiry to the aircraft or a military base in the vicinity.
To detect
errors in assessing a traffic pattern, controllers may have to collect all the relevant
information from the onset of the problem. Gathering information in a piecemeal fashion may create gaps in understanding the problem and leave little latitude for error detection. Cues can also take the form of 'absent indications', that is, an absence of change in system parameters. Chappell (1995) explains that "pilots may probably notice that the engine fire light is not on, but it is harder to notice that the other crew member did not say 'takeoff checklist complete' or that the green arc did not move to reflect the new crossing restriction that they thought they entered correctly". This form of 'negative information' is an expectation that stems from the mental model of practitioners. 'Mental simulations' of how an event can give rise to the current situation would create expectations about the extent to which certain parameters will change; a lack of change could cast doubt on the current assessment of the situation.
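As a purely illustrative sketch of monitoring for such 'absent indications' (the parameter names and values below are invented), a mental simulation can be expressed as a set of expected changes, and any parameter that was expected to change but did not is flagged for a second look:

```python
# Flag parameters that a mental simulation expected to change but that did not;
# the absence of the expected change casts doubt on the current assessment.
def detect_absent_changes(expected_deltas, observed_deltas, tolerance=1e-6):
    """Return parameters that were expected to change but show no change."""
    flagged = []
    for name, expected in expected_deltas.items():
        observed = observed_deltas.get(name, 0.0)
        if abs(expected) > tolerance and abs(observed) <= tolerance:
            flagged.append(name)
    return flagged

# Hypothetical example: the assessment predicts a descent and a speed reduction.
expected = {"altitude_ft": -1500.0, "airspeed_kt": -20.0}
observed = {"altitude_ft": -1480.0, "airspeed_kt": 0.0}
print(detect_absent_changes(expected, observed))   # ['airspeed_kt']
```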
3.1.2. Makes an effort to find hidden assumptions
Under time pressure, experienced operators make assumptions in order to build a coherent explanation of the situation and accept them as true until there is some reason to doubt them. Unfortunately, some assumptions may remain 'hidden' and never get tested as operators may be unaware of them.
Consider, for instance, the case of a long period of silence on ATC radio communications. The pilot may subconsciously assume that there is light radio congestion or simply a temporary break in radio traffic. The real cause of the silence could be improperly hooked-up headsets, a volume turned too low earlier on, or even an improper frequency. Radio silence is a trap because things may be taking place around the aircraft and the pilot could be unaware of them.
Testing an assumption could take several forms
such as checking that the radio volume is up or asking a low-priority question to the controller to confirm that everything is working as expected. Finding and testing hidden assumptions is vital for detecting errors and misunderstandings of the situation. Cohen et al. (1997) proposed a training method that helps practitioners to counteract overconfidence in their assessments.
They are trained to imagine that part of their assessment is wrong and to generate a list of 'opposing arguments' that they may have taken for granted. This list of concerns must be tested for reliability and, in cases where their concerns turn out to be true, practitioners must revise their understanding to account for them.
3.1.3. Does not explain away conflicting evidence
Finding 'hidden assumptions' often reveals inconsistencies or contradictions with data that have been gathered and trusted in the past. Unfortunately, practitioners may explain away inconsistencies, which prevents error detection and results in cognitive tunnel vision (De Keyser and Woods, 1993). In the aviation domain, for instance, delays, time pressure and commitment to continue to the destination may push practitioners into explaining away evidence that contradicts their assessment of the situation.
Research
into plan continuation errors has shown that pilots may not take note of critical information that runs contrary to their early commitments (Muthard and Wickens, 2002). Misleading cues may be combined with unusual indications and create an unforgiving environment that inhibits detection of erroneous hypotheses.
In the Air
Transat 236 incident over Lajes (2001), for instance, the computer issued a fuel imbalance message between the left and right wing tanks followed by an unusual indication that was apparently unrelated to the fuel leak problem
(AAI [2] ).
The unusual indication (oil pressure was high but the recorded temperature was low) caused the pilots to distrust the computer system; consequently, they believed there was a problem only when the engines finally stopped minutes later.
A manual calculation after the
first fuel imbalance message failed to reveal a significant loss of fuel which made error detection difficult.
Also, the consideration of 'absent indicators' (e.g., the absence of a warning system for loss of fuel) may have contributed to their mindset. In other cases, 'absent indications' may be the result of the design of automation modes, which can deprive crews of any opportunity to re-assess the situation. In the US Air 1016 incident over Charlotte, North Carolina (1994), for instance, an aircraft started a 'go around' procedure because the crew encountered a thunderstorm close to the airport and retracted the flaps (AAI [3]). This action inhibited the issue of a windshear warning because of a design weakness of the computer system.
As a result, the crew failed to
recognize the shift of the weather phenomenon to a windshear.
In this sense, the ‘go
around' procedure masked the activation of the windshear warning and impeded the detection of changes in the weather condition. It appears that misleading cues, absent indicators, and unusual cue patterns may set up an environment that impedes error detection. In these conditions, practitioners may feel uncertain in their diagnosis but may test the plausibility of their assumptions
when the opportunities arise. In the previous incident (AAI [2]), the pilots disregarded the computer message about loss of fuel and established a crossfeed line between the two tanks until both ran out of fuel.
A quick test of their assumption would have been to
maintain the crossfeed operation for a limited period and then recalculate the fuel balance. The problem of data interpretation may be exacerbated when people explain away or take immediate action to counteract a symptom and then forget to integrate this information with data that become available later in the search. Klein (1998) suggested that practitioners should keep a mental track of inconsistencies that have been explained away in order to appreciate the effort spent on holding to a fixated assessment.
By
seeing how much has been explained away, practitioners may be prompted to start another evaluation of their assessment once a predefined mental threshold has been reached.
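As a simple illustration of this suggestion, and not a prescription drawn from Klein (1998), the sketch below keeps a running count of inconsistencies that have been explained away and signals a re-evaluation once an assumed threshold is exceeded; the threshold value and the example inconsistencies are arbitrary.

```python
# Keep a tally of explained-away inconsistencies; once a predefined threshold
# is reached, trigger a fresh evaluation of the current assessment.
class ExplainedAwayTracker:
    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.explained_away = []          # inconsistencies set aside so far

    def explain_away(self, inconsistency: str) -> bool:
        """Record an inconsistency; return True if the assessment should be re-evaluated."""
        self.explained_away.append(inconsistency)
        return len(self.explained_away) >= self.threshold

tracker = ExplainedAwayTracker(threshold=2)
print(tracker.explain_away("fuel imbalance message"))        # False: carry on
print(tracker.explain_away("unusual oil pressure reading"))  # True: re-evaluate
```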
3.1.4. Tests the plausibility of assumptions
Critiquing and correcting mental models of the situation rests heavily on testing the trustworthiness or truth of the underlying data and assumptions. This process requires competence in cognitive strategies such as seeing whether a change levels off or worsens over time, cross-checking data, and verifying the functioning of sensors. It is not, however, always possible to test assumptions in this fashion by shifting attention from one to another.
Due to limitations in time and resources, some assumptions may
not be possible to test, but this is not a sufficient reason for rejecting them. Operators have several options to consider when an assessment rests on untested assumptions. They can acknowledge the risks in their current assessment but take corrective actions so that their plans do not depend upon these assumptions. In some cases, for instance, air traffic controllers may have difficulties in understanding whether a traffic conflict can be averted by the activation of the Traffic Collision Avoidance System (TCAS is an aircraft system that issues advisories on vertical separation from other aircraft in conflict). It is hard to test an assumption that the conditions and geometry of an impending traffic conflict will trigger TCAS.
Instead of
spending additional time trying to test the assumption, controllers can issue an instruction to change the aircraft vectoring, since this horizontal separation would not interfere with the vertical separation advised by TCAS.
3.2 Strategies in planning-based detection
Planning is a process that ranges from setting goals and directions to detailed courses of action. This section focuses mainly on detecting errors at the level of setting directions and goals that may be difficult to modify once a decision has been reached. In contrast, task scheduling is more amenable to error detection and modification when operators have already chosen a correct goal and direction for action. Figure 3 shows a threat management model of planning, drawing upon earlier work notably by Helmreich et al. (1999) and Klein (2004). Operators first try to avoid threats by anticipating points of concern and weaknesses in existing plans. Established plans are questioned and revised to address issues of completeness, consistency and reliability. Plans can be adapted by regulating their level of specificity (i.e., complexity) and modularity (i.e., coupling). When changing a plan is not feasible or not practical (e.g., it involves a higher cost than pursuing the same course of action), operators try to minimize the threat or error consequences. This section looks into aspects of anticipatory behaviors, levels of specification of plans and timescales for revising plans so that errors are captured before their actual implementation. Insert Figure 3 here
3.2.1. Anticipates weaknesses in plans and identifies information needs
Experienced operators are concerned with weaknesses of their plans and make efforts to anticipate adverse events that threaten the viability of their decisions (Amalberti, 1992; Klein, 1998).
Concerns about changes in the environment that may threaten an ongoing
plan may include adverse weather, possible unavailability of resources and failure of safeguards. In the ATC context, controllers often employ a 'threat acknowledgement' strategy (see anticipation box in Figure 3) for certain types of flights (e.g., training flights, military formations) and types of events (e.g., weather, degraded surveillance system performance) and subsequently provide special handling instructions to pilots (e.g., extra attention, greater than normal separation minima).
Experienced controllers may also
adapt their plans (e.g., switch to vertical separations instead of horizontal ones) to prevent memory lapses likely to occur in the event of computer failures (e.g., having to remember too many aircraft headings if the screen is turned off accidentally).
Other ‘points of concern’ may focus on ways of coping with conflicting goals and the riskiness of one’s own actions (e.g., negative consequences and errors). Anticipation is useful in making preparations for when and how to detect adverse events or errors as well as prevent them from occurring in the first place.
In the aviation domain,
anticipation involves 'staying ahead' of the aircraft in order to prepare for and cope with potential problems - e.g., adverse weather conditions, dense traffic at the destination airport, and crosswinds at landing (Amalberti and Deblon, 1992). Pilots can mentally simulate landings under adverse conditions and anticipate possible problem areas that should be examined in advance of the actual landing.
3.2.2. Considers a timescale for evaluating progress and questioning plans
Complex and dynamic environments make it difficult to think out a detailed plan that would work first time around.
In most cases, a plan should be revised to take into account new developments in the situation and any problem areas with the plan itself. The need for establishing a timescale for questioning and revising a plan is more pronounced when error detection becomes an issue. Considering a timescale for revising a plan can prevent operators from becoming absorbed by the situation and the performance of tasks, ensures that other colleagues are available in time to assist, and helps operators live with uncertainty in planning (see Figure 3). Several accidents have happened because the flying crew, under time pressure to complete a mission, accepted a revision of the flight plan without re-assessing how several factors could impact upon flight safety.
In the American Airlines 965 accident over Cali (1995), for instance, the crew accepted a last-minute proposal by ATC to land on a different runway without evaluating the timescale for making all the necessary changes (AAI [4]).
Landing on the new runway reduced the time available to carry out tasks,
increased workload and communication demands, imposed new tasks because the aircraft was coming in too high and too fast for the new runway, and required additional preparations (e.g., using charts to become familiar with the runway).
The crew did not
consider several factors that would reduce their opportunities for error detection and correction – e.g., inoperative radar, ambiguous ATC communications, too many new tasks to attend to, and poor visibility.
When they felt confused, they did not choose to
discontinue the landing approach and forgot to retract the speed brakes. This prevented them from climbing faster when they saw the obstacle and tried to change their route. Evaluating task progress is a control feedback mechanism that anticipates the need for change and for making decisions during execution.
Much of project
management consists of using milestones to let managers see whether they need to update their understanding of how the project is progressing. Another way to recognize when it is about time to start interrogating a plan is to keep track of events that should not be happening and wait for a predetermined time only. Klein (2004) refers to these alarming events as ‘tripwires’ that indicate that the plan may have some weaknesses or errors that need to be addressed.
This may imply that error detection can be initiated or another
plan option should be adopted without the need to know what went wrong.
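Purely as an illustration of this idea (the event and timing values are invented, not taken from Klein, 2004), a 'tripwire' can be sketched as a monitor that tracks an event that should not be happening and flags the plan for review once that event has persisted beyond a predetermined time:

```python
# A 'tripwire' monitor: an unwanted event that persists beyond a predetermined
# time flags the plan for review, without yet knowing what went wrong.
import time

class Tripwire:
    def __init__(self, description: str, max_duration_s: float):
        self.description = description
        self.max_duration_s = max_duration_s
        self.first_seen = None            # when the unwanted event first appeared

    def observe(self, event_present: bool, now: float) -> bool:
        """Return True when the plan should be questioned."""
        if not event_present:
            self.first_seen = None        # condition cleared; reset the clock
            return False
        if self.first_seen is None:
            self.first_seen = now
        return (now - self.first_seen) >= self.max_duration_s

wire = Tripwire("no position report from flight XY123", max_duration_s=120.0)
t0 = time.time()
print(wire.observe(event_present=True, now=t0))          # False: just started
print(wire.observe(event_present=True, now=t0 + 150.0))  # True: review the plan
```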
3.2.3. Regulates plan complexity and avoids pre-occupation with details
In many emergency scenarios, practitioners are required to consult emergency procedures prior to making a decision. Although procedures may seem to contain all the information and responses needed to counteract problems, practitioners often feel overwhelmed by the large amount of information, which prevents them from getting a handle on the overall strategy.
Pre-occupation with details, therefore, tends to lock practitioners into following a complex set of actions, which may hinder detection of changes in the environment (e.g., monitoring traffic communications, sequencing with other aircraft on the final approach, and engine monitoring). On the other hand, when procedures provide broad instructions, a greater latitude for judgement is left for operators to decide how to adapt to the unexpected. Regulating plan complexity implies putting together a plan that fits the circumstances (Klein, 2004). For instance, high levels of expertise and high chances of unanticipated events would require a simple plan, where broad instructions are provided to team members allowing greater latitude for adaptation to the unexpected. On the other hand, complex plans may be required for less experienced operators facing similar problems that have been designed into the procedures. A related issue is the degree of specialization, since people who have a wider range of expertise can fill in for each other, understand each other and detect problems and errors when they occur.
A flexible approach to planning entails thinking out alternative plans in broad terms and retaining them in memory for a comparative evaluation. Simplifying a plan in this manner – that is, more 'points of concern' but fewer details – makes it easier to put the plan into action and improvise as the task progresses. A comparison between broad plans can also become easier because 'mental simulation' of how plans may play out in the future becomes easier and possible weaknesses or errors can be detected before their actual implementation.
3.2.4. Uses loosely coupled plans to gain flexibility
Another way to detect problems with a plan of action is to make the plan more modular so that tasks can stand on their own (Klein, 2004).
A ‘modular’ plan allows
practitioners to make changes in one part without worrying how these changes may affect the other parts. On the opposite side, 'integrative plans' may be efficient in optimizing resources and costs but may increase the coupling or dependencies between tasks. Integrative plans are more efficient than modular plans but are more brittle because changes in one part propagate to the other parts. Making modular plans may sacrifice efficiency in favor of error detection and flexibility. Drawing an analogy with tightly and loosely coupled systems (Perrow, 1984), it is possible to specify some features of modular plans.
In this respect, error detection and recovery may be supported when plans take
account of the following:
Use redundant human resources to ensure that more people are available for cross checking and error detection.
Build barriers between tasks so that errors do not propagate to the next task, thus making the symptom easier to detect and attribute to the failed task.
Delay performance of the second task until feedback from the first task becomes available.
Identify alternative means of executing tasks and select those that do not affect performance of the following task.
Provide mechanisms for coordination between colleagues but let them regulate the coordination as necessary.
The context of work will influence the degree of coupling to incorporate in a plan of action. Situations with scarce resources, for instance, will require an integrated plan whilst situations with unanticipated events will make modular planning more appropriate as they would require a greater margin for error detection and recovery.
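To make the notion of loose coupling more concrete, the sketch below represents a plan as tasks that explicitly declare the feedback they need before they may start, so that a later task is delayed until feedback from an earlier one becomes available (one of the features listed above). The task names and feedback items are illustrative only.

```python
# Loosely coupled plan: each task declares the feedback it needs before it may
# start, so errors in one task do not silently propagate to the next.
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    needs_feedback_from: set = field(default_factory=set)

def ready_tasks(plan, feedback_received):
    """Tasks whose required feedback has arrived and which can safely start."""
    return [t.name for t in plan if t.needs_feedback_from <= feedback_received]

plan = [
    Task("issue descent clearance"),
    Task("hand over to approach", needs_feedback_from={"descent confirmed"}),
]
print(ready_tasks(plan, feedback_received=set()))                  # first task only
print(ready_tasks(plan, feedback_received={"descent confirmed"}))  # both tasks
```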
3.3 Strategies in action-based detection
Self-monitoring can also play an important role in assessing task progress during the implementation of a plan. Studies of operator performance in simulated emergencies have shown that many errors are related to the loss of mental track in the execution of steps (Roth et al., 1994). Inadequate self-monitoring can give rise to omissions, failure to detect problems caused by previous efforts, forgetting steps that have been interrupted or deferred, and failure to detect errors of others. Simulations of production planning exercises in a hot strip mill (Bagnara et al., 1987) showed that errors were discovered largely as a result of standard check behavior. Self-monitoring is a proactive strategy that takes three forms. The first one is a general work habit where a routine check is made on previous actions, current ones, and actions that have been suspended or deferred. The second form assumes a rehearsal or preview of future actions that may be carried out later on under time pressure. The third form concerns the creation of reminders, task triggers and barriers in order to prevent errors or ‘catch errors in the act’.
The reasons why and when an action sequence is
checked depend on the constraints of the task, the context of work and the operator's idiosyncratic attitudes. Although the actual detection of errors may take place in the execution stage, some type of proactive risk analysis may be required at the planning and conceptual stage.
Klein et al. (2005) argued that experienced operators tend to evaluate the riskiness
of their actions, that is, look for problems that are not inherent in the situation itself, but are located within their own actions.
The risk potential of actions may include the
possibility of side-effects (e.g., firefighters moving to a safe area might be out of radio contact) and negative consequences. This analysis is carried out in advance of the actual execution of the plan so that operators are alert to these risks and adopt specific methods to detect and correct them (i.e., create reminders or embed barriers). Proactive behaviours may include identifying adverse events and conditions that may increase error potential (e.g., a safeguard that does not materialize or a resource that suddenly becomes unavailable when needed) and matching familiar error forms to certain actions (e.g., recalling certain types of errors that have occurred in the past to another colleague). These arguments may suggest that some form of proactive risk analysis may be required for certain types of cognitive strategies, grouped here as standard checks on routine tasks, reminders and task barriers. Three types of cognitive strategies for action-based detection are presented below.
3.3.1. Carries out pre-action and post-action checks on routine tasks
Operating in a familiar and repetitive environment may result in several lapses and slips, as human experts have given many routine tasks to 'mental automation'.
Running a
conscious check on highly automated tasks and making it a work habit implies (1) momentarily bringing into conscious attention routine tasks that have been automated and (2) engaging in mental processes such as retrieving the intent of tasks, recalling withheld or unsuccessful tasks and rehearsing the sequence of future steps.
Although such mental
processes may tax attentional resources and increase workload, they are a valuable form of error detection. Thomas and Petrilli (2004) reported that several pre-action checks on routine actions have already been embedded into procedures, such as cross-checking the Flight Management Computer inputs prior to execution. An informal version of pre-action checks has been embedded in the mnemonic "Identify-Confirm-Select", which was created to force pilots to question the conditions of application and the order of execution rather than merely carry out a series of routine tasks automatically. Another example involves running post-action checks in order to review whether tasks have been executed, interrupted or postponed. In this sense, a mental counter operates that checks off several actions as they are considered in a sequence (Reason, 1990).
A standard check can also take the form of checking-off items on an
external source such as a checklist or procedure. It is assumed that the operators do not wait for the actual outcome of their actions and carry out this checking-off routine during the execution stage.
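A minimal sketch of such a check-off routine, externalized in software rather than held as a mental counter, is given below; the action names and statuses are illustrative and are not drawn from any actual procedure.

```python
# Check-off routine: mark each action in a sequence as executed, interrupted or
# postponed, and report anything left incomplete in a post-action check.
from enum import Enum

class Status(Enum):
    PENDING = "pending"
    EXECUTED = "executed"
    INTERRUPTED = "interrupted"
    POSTPONED = "postponed"

class ActionChecklist:
    def __init__(self, actions):
        self.status = {a: Status.PENDING for a in actions}

    def mark(self, action, status: Status):
        self.status[action] = status

    def post_action_check(self):
        """Return actions that still need attention after the sequence."""
        return [a for a, s in self.status.items() if s is not Status.EXECUTED]

checklist = ActionChecklist(["set flaps", "arm spoilers", "complete landing checklist"])
checklist.mark("set flaps", Status.EXECUTED)
checklist.mark("complete landing checklist", Status.INTERRUPTED)
print(checklist.post_action_check())   # ['arm spoilers', 'complete landing checklist']
```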
3.3.2. Rehearses tasks that may be carried out later under time pressure
Research in error detection by Blavier et al. (2005) has highlighted the facilitating role of prospective memory, that is, memory for tasks that should be executed in the future and for the retrieval of intentions. In this sense, rehearsing a series of steps that may be carried out later under time pressure is a good strategy for preventing slips. Mentally rehearsing tasks can also work for error detection, since intentions leave a stronger trace in memory during this process or tasks can establish connections with environmental cues. Hence, maintaining an active trace of an intention in memory may assist in detecting cases where the wrong intention was selected (i.e., making tea instead of coffee) or the intention was not implemented at all.
In aviation, it is common practice for pilots to mentally rehearse a series of actions that they may have to implement later under time pressure or in response to unexpected events (e.g., a 'touch and go' at landing).
3.3.3. Creates reminders, task triggers and error barriers
Another way of supporting action-based detection is to create reminders and task triggers. Interruptions and high workload can divert attention from the course of action and may result in omissions or delays in performing tasks.
Reminders can help
operators detect omissions, particularly in cases where tasks are independent of each other. Pilots, for instance, tend to develop informal reminders, such as turning the checklist upside-down on the yoke-clip when it is interrupted, as a physical cue that it has not been completed. Other pilots have the technique of selecting the audio for the outer marker when they have been instructed to contact the tower at the outer marker way out on the approach; this gives them a reminder that they do not have to look at during busy times in flight (Chappell, 1995). Experienced operators acquire complex habit structures which enable them to perform all tasks skillfully. There are, however, times when these habits can get in the way of safety. If a task must be performed in a different way than it is normally done, habit patterns may take over without operators realizing it. The best way to combat this natural tendency is to create a barrier, so that errors are stopped from having an adverse consequence and hence are brought into conscious attention.
For instance,
operators may configure the computerized control system in ways that block the execution of certain tasks.
This form of proactive behaviour requires operators to be
aware of possible error patterns and take measures to prevent them or at least catch them in the act.
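As an illustration of reminders and task triggers in software form (the cues and task names below are invented), a deferred task can be registered against a later trigger cue and surfaced when that cue occurs, much as pilots tie a deferred checklist to the outer marker:

```python
# Reminders tied to trigger cues: a deferred task is surfaced when its cue occurs.
class ReminderBoard:
    def __init__(self):
        self.reminders = {}               # trigger cue -> deferred tasks

    def defer(self, task: str, trigger_cue: str):
        self.reminders.setdefault(trigger_cue, []).append(task)

    def on_cue(self, cue: str):
        """Return (and clear) tasks that should be resumed when the cue occurs."""
        return self.reminders.pop(cue, [])

board = ReminderBoard()
board.defer("finish approach checklist", trigger_cue="outer marker")
print(board.on_cue("top of descent"))   # []
print(board.on_cue("outer marker"))     # ['finish approach checklist']
```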
3.4 Strategies in outcome-based detection
Outcome-based detection relies on observing mismatches between 'actual outcomes' and 'expected outcomes' of actions.
Detection of mismatches can be difficult in modern
technical systems for reasons related to the presentation of information and the capacity of operators to form accurate and timely expectations. In complex systems, the actual outcome may entail monitoring an enormous amount of data that may change over time. On the other hand, expectations of outcomes depend on the operators having a good mental model of the system which becomes more difficult to achieve as the complexity and coupling of the system increases. Observing and understanding the significance of these mismatches is very important for promoting outcome-based detection.
Previous studies on how operators
cope with the problem of ‘data overload’ (Woods, 1995) have pointed out the role of the context in which data appear and the role of goals and expectations of the observer.
A
particular piece of information gains significance or meaning mainly from its relationship to the context in which it occurs (i.e., relationships to other data, the trajectory followed over time, the onset of changes, and any actions taken by other colleagues or the automation). This section examines potential cognitive strategies that would detect interesting changes and link them to expectations of action outcomes. Three cognitive strategies for outcome-based detection are discussed below.
3.4.1. Examines relational and temporal patterns of change
An important mechanism of perception and cognition that enables people to focus on the relevant 'data subset' is perceptual organisation.
Although modern systems sometimes
overwhelm people with data, experienced operators are able to see meaningful relationships that point to the semantic properties of the task. Perceptual organisation refers to the ability of operators to group together several data that signify certain task properties.
Another source of difficulty in detecting problems is the way that changes of
data evolve over time.
Many industrial systems have long response times and hence action consequences take longer to produce any feedback. A long timespan of a change
may go unnoticed by operators.
The trajectory or trend of a change is also very
important for problem detection. The difference between a safe and unsafe trajectory is fairly clear at the end but by then valuable time may have been lost.
Christoffersen et al. (2002) give the example of a monitor showing a trend that may signify that 'pressure is falling'. To get an accurate estimate, operators should consider both the past and the future portions of the trajectory, since this may signify that the trend of a change has leveled off, turned around or continued to fall. Knowing how long to monitor a trend in the past or in the future is difficult and depends on the nature of the problem.
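As a purely illustrative sketch of examining the past and future portions of a trajectory (the samples and tolerance below are invented), the recent history of a falling parameter can be split into an earlier and a later segment, and the later segment used to judge whether the trend has leveled off, turned around or is continuing:

```python
# Classify the recent trajectory of a changing parameter by comparing the
# earlier and later halves of the sampled history.
def classify_trend(samples, flat_tolerance=0.5):
    half = len(samples) // 2
    early_slope = samples[half - 1] - samples[0]
    late_slope = samples[-1] - samples[half]
    if abs(late_slope) <= flat_tolerance:
        return "leveled off"
    if late_slope * early_slope < 0:
        return "turned around"
    return "continuing in the same direction"

pressure = [102.0, 99.0, 96.0, 95.8, 95.9, 96.0]   # e.g., 'pressure is falling'
print(classify_trend(pressure))                     # leveled off
```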
3.4.2. Considers a model of influences and interventions
Many technical systems have long response times and masking effects that make the problem difficult to detect or interpret (Kontogiannis, 1999).
In cases of multiple faults,
it is likely that the symptom of the first problem may be confounded by the symptom of the second problem if it is not detected early.
Masking effects due to multiple faults or
influences may also cause operators to fixate on the first influence so that another, more damaging one can go unnoticed (Woods et al., 1994). When the system response to an operator action is long, it is likely that other operators or the supervisory system may take additional actions whose outcomes confound the consequences of the first action. Masking effects of subsequent interventions make it difficult to separate the results of one's own actions from the actions of other agents. The accident of British Midland 092 in Kegworth (1989) is an example of an erroneous assessment of masking effects due to the interventions of the safety logic, which masked the actions of the crew (AAI [5]). In that accident, switching off the right-hand engine (the healthy one) seemed to have cured the symptom temporarily (i.e., engine vibrations). In actual fact, the vibrations stopped because this action of the crew caused the auto-throttle to stop feeding the damaged left-hand engine with more fuel. In this sense, the mode change of the auto-throttle masked the effects of the actions of the crew, who kept believing that the problem was in the right-hand engine.
Coping with
masking effects requires a good knowledge of the interventions of other agents, such as safety logic and team members, whose actions are not directly observable. Hence, it is necessary that operators have a model of influences and interventions that may be associated with masking effects.
3.4.3. Verifies the accuracy and reliability of information
The patterns of change created on the interface are produced through the use of sensors that may vary along several dimensions (e.g., sensitivity and reliability).
Operators
should always check the accuracy and reliability of data provided by the sensors. Some information may be inaccurate if the update rate of the sensor is much slower than the speed of change of the problem. Other information may be false if the sensor is not functioning properly (e.g., because of poor maintenance or placement close to high temperatures).
In
some cases, the reliability of a sensor can be cross-checked against the reading of another sensor when the two parameters are functionally related (e.g., high oil temperature and low oil pressure may indicate a lubrication problem).
As an example, a
trainee pilot did not carry out this confirmation check and shut down the engine; the post-accident inspection found that the oil pressure gauge was inoperative. The implication is that operators should adopt a proactive strategy of cross-checking the reliability of information before drawing any inferences about the nature of the problem.
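As a simple illustration of such a cross-check (the thresholds below are hypothetical, not real engine limits), a low oil pressure reading can be corroborated against the functionally related oil temperature reading before any inference is drawn:

```python
# Cross-check two functionally related readings before drawing an inference:
# a low oil pressure reading is only treated as a lubrication problem if the
# oil temperature reading corroborates it; otherwise suspect a faulty gauge.
def corroborated_lubrication_problem(oil_pressure_psi: float,
                                     oil_temperature_c: float,
                                     pressure_limit: float = 25.0,
                                     temperature_limit: float = 120.0) -> str:
    low_pressure = oil_pressure_psi < pressure_limit
    high_temperature = oil_temperature_c > temperature_limit
    if low_pressure and high_temperature:
        return "lubrication problem corroborated by both readings"
    if low_pressure and not high_temperature:
        return "uncorroborated: check the oil pressure sensor before shutting down"
    return "no lubrication problem indicated"

print(corroborated_lubrication_problem(oil_pressure_psi=10.0, oil_temperature_c=90.0))
```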
4. Factors affecting error detection
This section examines several attitudinal and team factors that affect the detection and identification of errors. Attitudinal factors refer to the orientation the person has to the situation, the feelings and stance towards other colleagues and the level of arousal or anxiety.
Team factors refer to the communication and coordination aspects of
performance in small and large groups in the loop (e.g., ATC, other aircraft in the vicinity and company offices).
Detection by internal and external observers can be a valuable
source of support in error detection. Attitudinal factors seem to have a global effect on error detection that makes it difficult to pinpoint their influence on the detection of slips, lapses and mistakes.
Team factors, on the other hand, seem to have more discernible
effects on error detection. Cross-checking and communicating intent, for instance, are useful for recognizing the intent of others and detecting mistakes.
4.1 Attitudinal factors
The stance of a person plays an important role in establishing a work environment in which errors can be detected. For instance, retaining alertness and being 'pre-occupied with failure' is a stance that facilitates detection, whilst complacency deprives operators of opportunities for detecting problems at early stages when these are more manageable. Also, coping with frustration from errors and avoiding dwelling on self-criticism are important attitudes that determine how people deal with conflicting evidence and change their initial conclusions. From the literature review, four attitudinal factors were identified that affect the error detection process (see Table 2).
Insert Table 2 here
4.1.1. Vigilance and alertness
People try to comprehend an unfamiliar situation by assuming that this is an unusual example of something already known to them (i.e., drawing analogies to their past experience). The reverse process – 'making the familiar strange' – is equally useful for maintaining vigilance for subtle changes in the work environment.
Remaining vigilant
counteracts the syndrome of ‘complacency’, that is, a sense of self-satisfaction accompanied by unawareness of actual dangers or equipment deficiencies.
Accident
statistics of the US Forest Service (Jensen, 1989), for instance, showed that most aircraft accidents occur after the fire season is over and they usually occur during what most pilots consider routine, point-to-point flights (e.g., running out of fuel, wheels-up landings, etc.).
Task-induced complacency occurs during routine tasks where pilot
expectations reduce vigilance to novel stimuli. Familiarity breeds conformity. ‘Making the familiar strange’ is an attitude that counteracts complacency. Error suspicion and curiosity can also help operators prepare for contingencies. Landing on a short runway is a procedure well-known to many pilots. Suspicious and curious pilots may decide in advance that failure to touch down before a certain point on the runway will call for an immediate go-around. In this sense, pilots may prepare in advance corrective plans for errors that may occur when landing on a short runway. Suspicion and curiosity are considered to be real virtues in aviation. Collins (1992) argued that “morbid as it sounds, the pilot who feels that the skies of the world are full of lurking hazards is likely to be the safest pilot”. Jensen (1995) discussed several job design and attitude-change programs that encourage error-suspicion and curiosity behaviours to counteract complacency.
4.1.2. Awareness of vulnerability to errors
The expectation that errors occur as a natural part of everyday practice is an attitude that has been identified by many pilots as an important precursor to error detection and error management (Thomas and Petrilli, 2004). Acknowledging and accepting the possibility of errors can inform a critical mindset that prepares pilots for the occurrence of errors. Awareness of vulnerability to errors makes people recognize that even though they think they understand the system and the ways in which it can fail, surprises are still possible. Weick and Sutcliffe (2001) emphasized the traps involved in preoccupation with short-term success and false optimism.
Success narrows perceptions, makes people less
tolerant of conflicting evidence and breeds overconfidence in the current plan of action. A healthy skepticism about a model of the situation and awareness of the possibility for failure can increase, not only the potential for error detection, but also learning and recovery from errors.
4.1.3. Awareness of degradation and disengagement
The process of self-monitoring also involves awareness of one’s own degradation in performance or mental state.
Symptoms of degradation and disengagement may
include: staying behind the situation, suffering a constant distraction, feeling surprised by little events and feeling tired.
Experienced pilots refer to this mental state as drifting
‘out of the loop’ or disengaging from the problem.
Remaining conscious of these
symptoms requires active monitoring of one’s own mental state.
It is important,
therefore, that sufficient cognitive resources should remain available for this metacognitive activity. Thomas and Petrilli (2004) argued that periods of high workload not only increase the likelihood of performance degradation but also decrease the cognitive resources available for this type of self-monitoring.
The challenge for future
research would be to tackle this ‘double-edged’ problem.
4.1.4. Coping with frustrations from errors
When error consequences are significant, the detection of errors and their subsequent attribution of blame can cause stress and frustration.
A ‘blame’ culture in the
organisation may exacerbate frustration and induce some form of cover-up of errors. In this respect, when errors are detected, further action may be taken to cover up the problem rather than recover from it.
Furthermore, inability to cope with frustrations may reinforce the
syndrome of ‘group think’ where people tend to suppress their own arguments if these are not consistent with existing beliefs in the team.
Nevertheless, coping with frustrations
is an important skill that can be supported by appropriate training programs (e.g., stress inoculation, Epstein 1983).
When exposed to controlled exercises, operators can master
how to cope with frustrations from their own errors.
Positive experiences of stress
control in the past can make human performance more robust.
Other skills that facilitate
error management include avoiding self-criticism and not dwelling on the history of the error.
4.2 Team factors
Interactions with team members and supervisors can provide ample opportunities for the detection of problems and errors (Fischer and Orasanu, 1999). Team communication is probably one of the most important mechanisms for detecting and correcting mistakes even at the stages of situation assessment and planning.
Analyses of incidents and
simulated emergencies have shown that team communication can be a major source of detection of erroneous diagnoses and plans (Doireau et al., 1997; Sarter and Alexander, 2000; Thomas, 2004). An excellent account of barriers to teams in detecting problematic situations has been presented by Klein (2006). This section is narrower in scope and addresses some team factors affecting the detection of errors rather than problems in general (see Table 3).
Insert Table 3 here
Assertiveness and cross-checking seem to have a global effect on error detection whilst ‘adopting multiple perspectives’ and ‘communicating intent’ are useful in the detection of mistakes where the recognition of the intention of others becomes of primary importance.
4.2.1. Assertiveness
Adopting an assertive style and voicing concerns is essential to error detection but there are several difficulties in doing so.
Monitoring the actions of others or eliciting the
intentions of crew members should be done with ‘tact and consideration’, without conveying a sense of criticism or questioning their abilities.
In some cases, a simple
prompt is all that is required to trigger an action that another person has omitted. In other circumstances, a more deliberate process may be required in order to alert the other team members that an error occurred.
A delicate balance between assertiveness and tact
should be achieved in order to establish an appropriate communication environment in a team.
The role of assertiveness in teams has long been recognized in the aviation sector
(Chidester and Foushee, 1988) and several training courses in crew resource management (CRM) have addressed assertiveness in commercial airlines (Helmreich et al., 1999).
4.2.2. Cross-checking others and monitoring for signs of fatigue
Monitoring and cross-checking of team members proved to be a valuable resource in detecting errors in diagnosis and planning (Serfaty and Entin, 1996; Sarter and Alexander, 2000).
Error detection requires that team members have access to the
performance of their colleagues. When operators are confined to separate workstations that cannot be monitored by their colleagues then opportunities for error detection are reduced.
Another requirement for error detection is that the detector must happen to
attend to the aspect of the task that could result in error; high workload can divert attention to other parts of the job.
It is important, therefore, that operators utilize low
tempo periods of work to rehearse or plan future activities so that some cognitive resources are still available for cross-checking colleagues during actual performance. Monitoring the mental state of colleagues is also important for error detection since they may not be entirely cognizant of the situation (e.g., ‘left behind the situation’ or ‘kept out of the loop’) due to fatigue, distraction or over-concern with home and family matters. Several pilots have argued that they can read the ‘body language’ of their colleagues and understand that others are not ‘in the loop’ or not aware that something is coming up (Thomas and Petrilli, 2004). Noticing cues about a degrading mental state of others makes people more alert to potential errors that may creep up and hence they can increase their scanning of others.
4.2.3. The ability to adopt multiple perspectives
Another requirement for error detection is that the detector must have a perspective on the possible goals associated with the observed behaviour. This last point is very important since an action could seem very sensible to an outside observer but attain a different goal from that sought by the actual performer. Knowledge of possible goals associated with a task is very important in detecting mistakes. Supervisors and team members, who used to do the same job in the past, can act as detectors because they adopt multiple perspectives to figure out the intentions of others. This brings us to the question of job design and the overlapping of knowledge or roles between team members. Multi-skill training can achieve a broadening of task perspectives and contribute to error detection by other team members.
Seifert and
Hutchins (1992) have quoted the example of the bearing-taker (i.e., the person who assumes the perspective of “compass reader”) who does not know how this information is used by the plotter. Broadening the perspective of the bearing-taker to think about the bearing in terms of the co-ordinate space of the plot-chart (e.g., the physical location perspective) could help him notice reading errors. Adopting multiple perspectives can support error detection in other ways too. For instance, familiarity with the working style of colleagues and with existing job stressors can increase awareness of the mental state of other colleagues.
4.2.4. Communication of intent
A more direct approach to figuring out the intentions of others is the explicit communication of intent.
Klein (1998) argued that ‘communication of intent’ may enable team
members to ‘catch errors in advance’ before the implementation stage. Communicating intentions and assessments can help the team to obtain an overall picture of the situation, a kind of shared awareness that enhances detection of mistakes. When the intent behind orders or coordinating actions is not clarified, it is more difficult for people to detect erroneous assumptions and accurately predict future actions. A common example from ATC is the situation where traffic-handling intentions between adjacent sector teams are not clarified (through the handover procedure) and result in loss of aircraft separation. Communication of intent can also be given at the level of plan implementation by commanders who brief their subordinates on the intentions behind plans and orders. In this way, subordinates are more likely to detect any wrong assumptions that led to the formulation of the current plan during the stage of plan rehearsal and implementation. Hence, wrong assumptions made at the stages of situation assessment and planning can be detected later on at the implementation stage by subordinates who have been briefed
about the overall intention of the plan.
Military commanders have been observed to
communicate their plans at the level of intention rather than detailed actions to their subordinates (Klein, 1998). Knowing the intent behind a plan enables front-line members to adapt plans as the situation may change in different ways.
In this sense,
communication of intent can support recovery of errors through the adaptation of plans.
5. Discussion and implications for error management training
Error detection is a mental process that requires a lot of effort because people are asked to maintain a state of alertness and self-introspection in addition to maintaining progress with their assigned task.
In this sense, people are asked to look out for potential threats,
rehearse plans to find weak areas, see things that were missed earlier, find hidden assumptions, keep track of what has been explained away and always be prepared to revise an assessment or an old plan.
Building ‘self-introspection’ into a task can
increase workload because ‘automated’ elements of the task are brought into conscious attention and more conscious space is needed for running these cognitive strategies. This requirement becomes even more difficult to achieve in periods of high workload, or at the end of a long shift, because the available cognitive resources for this type of self-monitoring are reduced. Awareness-based detection is a process that calls for a state of mindfulness and ambivalence that is difficult to attain.
In complex systems, operators are faced with
situations that are partly familiar and partly novel. For these cases, Weick and Sutcliffe (2001) argued that ‘people should retain a model of situation created by their past experience but also watch for unfamiliar and novel cues in the interest of building a comprehensive story or account of events’.
Engaging in simultaneous belief and doubt
is admittedly a difficult exercise but this stance of ambivalence may be required in order to exploit the valuable experience of practitioners and, at the same time, leave more opportunities for improvisation and error detection.
Maintaining ambivalence will
facilitate the process of story-building and increase information intake, all while practitioners can continue to do something familiar that at least stabilizes the situation. Planning-based detection is also difficult to attain because it requires practitioners to forgo standard procedures and rules of thumb in favor of what amounts to ‘reinventing the wheel’ every time that a plan of action is called for. Seeing old things in new ways,
staying ahead of the situation, reducing the coupling between tasks, and setting milestones for revisions are the last things people want to do every time they have to develop a course of action. Reliance on old practices and standard procedures is a lot more effortless and makes for efficient performance but may act against the detection of weak areas in a plan of action.
‘Reinventing the wheel’ may involve all these cognitive
strategies in planning, yet new experiences are gained that may enable people to develop a new understanding of the situation and detect problems at an early stage.
Ultimately,
these cognitive strategies in error detection and identification would make operator performance resilient to changes in work demands when the situation takes unexpected turns or when earlier human interventions have been unsuccessful. A number of error recovery studies in process control (Woods et al., 1994; Kanse, 2004) have shown that error detection and identification have an ad hoc nature and rest heavily on self-monitoring strategies. Clearly, there is a need for developing appropriate forms of operator support that would reduce the “cognitive burden” of running a detection strategy in parallel with the performance of the task.
A case in point is error
management training where error detection is mastered in the context of technical skills that are practiced in simulator training.
Recent studies in the aviation domain (Naikar
and Sauders, 2003; Thomas and Petrilli, 2004) have taken a similar approach where trainees are required to master technical skills in a more varied context of practice that allows errors to occur but provides opportunities for error detection.
The context of
practice may include: missing cues, masking effects and delayed or poorly integrated symptoms in order to provide opportunities for error detection strategies. Increasing data uncertainty and goal conflicts may shift the emphasis from mastering technical skills to building up a repertoire of problem solving strategies.
The traditional training of
operators to become flawless in technical skills can increase preoccupation with success and may deprive them of opportunities to practice error detection strategies.
Naikar
& Sauders (2003) have proposed a powerful learning mechanism in military training which involved a process of controlled transgression through the flight safety envelope. Error management training should also address the attitudinal factors that may affect error detection such as, remaining vigilant to counteract complacency, accepting the possibility of error, monitoring for signs of fatigue and ‘out of the loop’ behaviours, and coping with stress and frustration from errors. Some progress has already been made in the aviation domain on how to instill alertness and error acceptance in training programs.
Jensen (1995), for instance, presented several training
methods that can be used to counteract complacency. Most of these methods focus on developing appropriate attitudes, such as critiquing oneself, verbalizing routine actions under high workload, repeating back consciously the comments of other team members, and asking “what if” questions.
Another form of training relies on changing the
composition of the team during simulated exercises (Kontogiannis, 1999). That is, working with a new person requires communication of expectations and this could make people think about why certain expectations were set in the first place.
As a result,
errors based on expectations can be reduced or detected in a timely fashion. Cross-checking of team members has proven to be one of the most important factors in error detection (Sarter and Alexander, 2000; Thomas, 2004). Team members can play a significant role in monitoring the actions of their colleagues and figuring out possible intentions behind certain actions.
The implication for error management
training is that operators should become familiar with the jobs and the challenges of their colleagues.
Detection of mistakes may be very difficult if team members cannot
understand the perspectives of other job positions. Klein (2006) has highlighted many of the difficulties that teams and organizational units can face in problem detection. Error management training should examine how multiple perspectives can be built into teams of operators without incurring excessive costs in team communication. Previous studies in error management training have looked into isolated elements of error detection. A need was identified to consider a whole repertoire of error detection strategies that covers a variety of tasks and application domains.
The current study has taken a step forward by proposing a framework of
cognitive strategies and attitudinal – team factors that affect the detection and identification of errors.
Future research should look into how to create learning
environments that cultivate a repertoire of detection strategies mastered in the context of the target technical skills. Several conditions of practice should be explored to examine how self-introspection can be built into task performance so that the overall workload remains manageable.
References
Air Accident Investigation [1]. In-flight fire leading to collision with water near Peggy’s Cove, Nova Scotia on 2 September 1998. Transportation Safety Board of Canada, Quebec, Canada. Report No. A98H0003. Air Accident Investigation [2]. All engines out landing due to fuel exhaustion; Air Transat Airbus A330-243 C-GITS, Lajes, Azores, Portugal, 24 August 2001. Portuguese Aviation Accidents Prevention and Investigation Department. Accident investigation final report. Air Accident Investigation [3]. Flight into terrain during missed approach USAir 1016, DC 9-31, N954VJ Charlotte/Douglas International Airport, North Carolina, 2 July, 1994. National Transportation Safety Board. Air Accident Investigation [4]. Controlled flight into terrain American Airlines flight 965 Boeing 757-233, N651AA near Cali, Colombia, December 20, 1995. Aeronautica Civil of the Republic of Colombia. Air Accident Investigation [5]. Accident to Boeing 737-400 G-OBME near Kegworth, Leicestershire on 8th January 1989. Air Accidents Investigation Branch, Department of Transport. Report no. 4190, HMSO, London. Amalberti, R., 1992. Safety in process control: an operator-centered point of view. Reliability Engineering and System Safety, 38, 99-108. Amalberti, R., Deblon, F., 1992. Cognitive modelling of fighter aircraft process control: A step towards an intelligent on-board assistance system. International Journal of Man Machine Studies, 36, 639-671. Antonogianakis, A., 2003. Pilot decision making and error management in general aviation (in Greek). MSc Thesis submitted to the Department of Production Engineering & Management, Technical University of Crete, Greece. Bagnara, S., Stablum, F., Rizzo, A., Fontana, A., Ruo, M., 1987. Error detection and correction: a study on human-computer interaction in a hot strip mill production planning and control system. In: Proceedings of the First European Meeting on Cognitive Science Approaches to Process Control, October 1987, Marcoussis, France.
Blavier, A., Rouy, E., Nyssen, A.S., de Keyser, V., 2005. Prospective issues for error detection. Ergonomics, 48, 758-781. Bove, T., 2004. Development and validation of a human error management taxonomy in air traffic control. Unpublished PhD Thesis, Riso National Laboratory, Roskilde, Denmark. Chappell, S., 1995. Managing situation awareness on the flight deck or the next best thing to a crystal ball. CRM Developers Group. Website: caar.db.erau.edu/crm/resources/paper/chapell.html. Chidester, T.R., Foushee, H.C., 1988. Leader personality and crew effectiveness: A full mission simulation experiment. NASA Ames Research Center, Moffett Field, CA. Christofferssen, K., Woods, D.D., Blike, G.T., 2002. Making sense of change: extracting events from dynamic process data. Technical Report ERGO-CSEL01 TR02, Institute of Ergonomics, Ohio State University. Cohen, M., Freeman, J., Thompson, B.B., 1997. Training the naturalistic decision maker. In: C.E. Zsambok and G. Klein (Eds.), Naturalistic Decision Making, pp. 257-268, Lawrence Erlbaum Associates, Hillsdale, NJ. Cohen, M., Freeman, J., Wolf, S., 1996. Meta-recognition in time stressed decision making: recognizing, critiquing and correcting. Human Factors, 38, 206-219. Collins, R.L., 1992. Air Crashes. Thomasson-Grant, Virginia. Doireau, P., Wioland, L., Amalberti, R., 1997. La detection des erreurs humaines par des operateurs exterieurs a l’action: le cas du pilotage d’avion. Le Travail Humain, 60, 131-153. Epstein, S., 1983. Natural healing process of the mind: Graded stress inoculation as an inherent mechanism.
In: D. Meichenbaum and M.E. Jaremko (Eds.), Stress
Reduction and Prevention, pp. 39-66, Plenum, New York. Fischer, U., Orasanu, J., 1999. Say it again, Sam! Effective communication strategies to mitigate pilot error. In: R.S. Jensen (Ed.), Proceedings of the 10th International Symposium on Aviation Psychology, Ohio State University, Columbus. Fromkin, V.A. (Ed.), 1980. Errors in Linguistic Performance: Slips of the Tongue, Ear, Pen, and Hand. Academic Press, New York.
Helmreich, R.L., Merritt, A.C., 2000. Safety and error management: the role of Crew Resource Management. In: B.J. Hayward and A.L. Lowe (Eds.), Aviation Resource Management, pp. 107-119, Ashgate, Aldershot, UK. Helmreich, R.L., Klinect, J.R., Wilhelm, J.A., 1999. Models of threat, error, and CRM in flight operations. In: Proceedings of the Tenth International Symposium on Aviation Psychology, pp. 677-682, Ohio State University, Columbus. Hollnagel, E., 1996. Cognitive Reliability Assessment Methodology. Academic Press, London. Jensen, R.S., 1989. Aviation Psychology. Gower Press, Brookfield, VT. Jensen, R.S., 1995. Pilot Judgment and Crew Resource Management. Ashgate Publishing, Aldershot. Kanse, L., 2004. Recovery uncovered: how people in the chemical process industry recover from failures. Unpublished PhD Thesis, Technical University of Eindhoven, Netherlands. Kanse, L., van der Schaaf, T., 2001. Recovery from failures in the chemical process industry. International Journal of Cognitive Ergonomics, 5, 199-211. de Keyser, V., Woods, D.D., 1993.
Fixation errors:
failures to revise situation
assessment in dynamic and risky systems. In: A.G. Colombo and Saiz de Bistamente (Eds.), Advanced Systems in Reliability Modeling, Kluwer, Norwell, MA. Klein, G.A., 1998. Sources of Power: How People Make Decisions. MIT Press, Cambridge, MA. Klein, G.A., 2004. The Power of Intuition. A Currency Book/Doubleday, New York. Klein, G.A., 2006.
The strengths and limitations of teams for detecting problems.
Cognition, Technology and Work, 8, 227-236. Klein, G., Pliske, R., Crandall, B., Woods, D.D., 2005. Problem detection. Cognition, Technology and Work, 7, 14-28. Klein, G., Philips, J.K., Rall, E.L., Peluso, D.A., 2003. A data frame theory of sense making. In: Proceedings of the Sixth International Conference on Naturalistic Decision Making, Florida, May 15-17, 2003. Kontogiannis, T., 1999. User strategies in recovering from errors in man machine systems. Safety Science, 32, 49-68. Kranz, G., 2001. Failure is not an Option: Mission Control from Mercury to Apollo 13 and Beyond. Berkley Books, New York.
Mitsotakis, A., 2006. Decision making under stress, pilot error, and strategies in error detection and correction in commercial aviation (in Greek). MSc Thesis submitted to the Department of Production Engineering & Management, Technical University of Crete, Greece. Mo, J., Crouzet, Y., 1996. Human error tolerant design for air-traffic control systems. In: P.C. Cacciabue and I.A. Papazoglou (Eds.), Proceedings of the Third Probability Safety Assessment and Management, PSAM-III, June 1996, Crete, Greece. Muthard, E.K., Wickens, C.D., 2002. Factors that mediate flight path monitoring and errors in plan revision: an examination of planning under automated conditions. Technical Report AFHD-02-11/NASA-02-8. NASA Ames Research Center, Moffett Field, CA. Naikar, N., Sauders, A., 2003. Crossing the boundaries of safe operation: an approach for training technical skills in error management. Cognition, Technology and Work, 5, 171-180. Norman, D.A., 1988. The Psychology of Everyday Things. Basic Books, New York. Perrow, C., 1984. Normal Accidents: Living with High Risk Technologies. Basic Books, New York. Rabbitt, P., 1978. Detection of errors by skilled typists. Ergonomics, 21, 945-958. Reason, J.T., 1990. Human Error. Cambridge University Press, Cambridge. Rizzo, A., Ferrante, D., Bagnara, S., 1994. Handling human error. In: J.M. Hoc, P.C. Cacciabue and E. Hollnagel (Eds.), Expertise and Technology: Cognition & Human Computer Interaction, pp. 195-212, Lawrence Erlbaum Associates, New Jersey. Roth, E.M., Mumaw, R.J., Lewis, P.M., 1994. An empirical investigation of operator performance in cognitively demanding simulated emergencies. NUREG/CR-6208, U.S. Nuclear Regulatory Commission, Washington, DC. Sarter, N.B., Alexander, H.M., 2000. Error types and related error detection mechanisms in the aviation domain: an analysis of aviation safety reporting system incident reports. The International Journal of Aviation Psychology, 10, 189-206. van der Schaaf, T.W., 1995. Human recovery of errors in man-machine systems. In: Proceedings of the
Sixth IFAC/IFIP/IFORS/IEA Symposium on the Analysis,
Design and Evaluation of Man-Machine Systems, June 1995, Cambridge, MA.
Seifert, C.M., Hutchins, E.L., 1992. Error as opportunity: learning in a cooperative task. Human Computer Interaction, 7, 409-435. Sellen, A.J., 1994. Detection of everyday errors. Applied Psychology: An International Review, 43, 475-498. Serfaty, D., Entin, E.E., 1996. Team adaptation and coordination training. In: R. Flin, E. Salas, M. Strub, and L. Martin (Eds.), Decision Making Under Stress: Emerging Themes and Applications, pp. 170-18, Ashgate, Aldershot. Shorrock, S.T., Kirwan, B., 2002. Development and application of a human error identification tool for air traffic control. Applied Ergonomics, 33, 319-336. Thomas, M.J.W., 2004. Predictors of threat and error management: identification of core non-technical skills and implications for training systems design. International Journal of Aviation Psychology, 14, 207-231. Thomas, M.J.W., Petrilli, R., 2004. Error Management Training: an investigation of expert pilots’ error management strategies during normal operations and flight crew training. ATSB Aviation Safety Research Grant Scheme Project 2004/0050, Centre for Applied Behavioural Science, University of South Australia. Weick, K., Sutcliffe, K.M., 2001. Managing the Unexpected. Jossey Bass, San Francisco. Wioland, L., Amalberti, R., 1996. When errors serve safety: towards a model of ecological safety. In: Proceedings of the First Conference on Cognitive Systems Engineering in Process Control, November 1996, Kyoto University, Japan. Woods, D.D., Johannesen, L.J., Cook, R.I., Sarter, N.B., 1994. Behind human error: Cognitive systems, computers and hindsight. Crew Systems Ergonomics Information Analysis Center, Wright-Patterson AFB, Ohio. Woods, D.D., 1995. Towards a theoretical base for representation design in the computer medium: ecological perception and aiding human cognition. In: J. Flach, P. Hancock, J. Caird, and K. Vicente (Eds.), Global perspectives on the ecology of human machine systems,
pp. 157-188, Vol 1, Lawrence Erlbaum Associates, Hillsdale.
Zapf, D., Maier, G.W., Rappensperger, G., Irmer, C., 1994. Error detection, task characteristics, and some consequences for software design. Applied Psychology: An International Review, 43, 499-520.
Acknowledgements
The authors would like to acknowledge the support from all operational and student controllers in the simulator training exercises held at EUROCONTROL premises of Institute of Air Navigation Services (IANS) and Maastricht Upper Area Control Centre (MUACC).
Dr. Barry Kirwan (EUROCONTROL Experimental Centre, Bretigny /
Orge, France) provided access to the experimental facilities. Special thanks to Angelos Antonogianakis and Adamadios Mitsotakis (115 Hellenic Air Force Base, Souda, Crete) for the analysis of a large sample of near misses from the Aviation Safety Reporting System (ASRS).
CAPTIONS OF FIGURES AND TABLES
Figure 1. A simple model of human performance with four stages of error detection
Figure 2. A critiquing model of situation awareness (adapted from Cohen et al., 1996)
Figure 3. A threat management model of planning
Table 1. Cognitive strategies in error detection and identification
Table 2. Attitudinal factors affecting error detection
Table 3. Team factors affecting error detection
Fig. 1. A simple model of human performance with four stages of error detection
[Figure 2 diagram labels: incomplete explanations; inconsistent explanations; unreliable assumptions; fill in gaps with more data; fill in gaps with more assumptions; explain conflicts by adopting assumptions; drop unreliable and conflicting assumptions]
Fig. 2. A critiquing model of situation awareness (adapted from Cohen et al., 1996)
[Figure 3 diagram labels: environment (threats); feedback; anticipate points of concern (avoid threats); question plan (test for completeness, consistency, reliability); regulate complexity & coupling; change plan; preserve plan; mitigate consequences]
Fig. 3. A threat management model of planning
Table 1. Cognitive strategies in error detection and identification
Awareness-based detection
Makes an effort to detect ‘missing cues’
Makes an effort to find ‘hidden assumptions’
Does not ‘explain away’ conflicting evidence
Tests the plausibility of assumptions

Planning-based detection
Anticipates weaknesses in plans and identifies information needs
Considers a timescale for questioning plans
Regulates plan complexity and avoids pre-occupation with details
Uses loosely coupled plans to gain flexibility

Action-based detection
Carries out pre-action and post-action checks on routine tasks
Rehearses tasks that may be carried out later under time pressure
Creates reminders, task triggers and error barriers

Outcome-based detection
Examines relational and temporal patterns of changes
Considers a model of influences and interventions
Verifies the accuracy and reliability of sensors
Table 2. Attitudinal factors affecting error detection
Vigilance and alertness
Awareness of vulnerability to errors
Awareness of degradation and disengagement
Coping with frustrations from errors
Table 3. Team factors affecting error detection
Assertiveness
Cross-checking others and monitoring for signs of fatigue
The ability to adopt multiple perspectives
Communication of intent