In Proceedings of the 3rd Annual Symposium on Human Interaction with Complex Systems , Wright State University, Dayton, OH, August 1996 (pp. 276-283).

Human Interaction with Lights-out Automation: A Field Study*

David M. Brann, David A. Thurman & Christine M. Mitchell
Center for Human-Machine Systems Research
School of Industrial & Systems Engineering
Georgia Institute of Technology
Atlanta, GA 30332-0205
(404) 894-4321, (404) 894-2301 (fax)
[dmbrann, dave, cm]@chmsr.gatech.edu

Abstract

This paper describes a field study in which a recently implemented ‘intelligent’ control system for NASA satellites was observed. The autonomous control system, called Genie, is intended to replace the current two-person operations team with which NASA staffs the control rooms for each scientific satellite. As these control rooms are typically staffed twenty-four hours a day, seven days a week, there is a good deal of interest in the use of increased automation to improve the efficiency with which personnel are used and to decrease costs. Genie is a good example of a class of emerging control automation technology that is intended to replace human operators responsible for system control; such technology has been called ‘lights-out automation.’ The field study attempts to assess the extent to which the design and structure of Genie allow such automation to function effectively and to determine the nature of problems which require human intervention either to resume manual control of the system itself or to repair Genie.** The results suggest that in order for lights-out automation to be effective, it must be designed with the human operator who will occasionally troubleshoot, maintain, and repair it in mind--somewhat similar to the manufacturing concept of designing for maintainability. The results of this study suggest that even automation intended to function autonomously will occasionally require operator intervention. Though the operator will move from supervisory controller to manager-by-exception, the automation must be human-centered in design. That is, the design must facilitate rapid human inspection, comprehension, intervention, repair, and maintenance.

* This research was supported in part by NASA Goddard Space Flight Center, Grant NAG 5-2227 (GT-E24-X12), A. William Stoffel, Technical Monitor.

** It is important to note that this case study is not intended to criticize Genie or its designers. Such a system is a necessary first step and a necessary pre-condition for a study such as this and its associated research. Our research is intended to enhance systems such as Genie by specifying functions and characteristics that provide more effective human-automation interaction.

Introduction

As more sophisticated technologies are developed, ‘intelligent’ automated control systems are being proposed for use in a wide range of complex dynamic systems. The intention is to use automation technologies to replace operators during most nominal and anticipated situations. The operator’s role changes from a supervisory controller interacting (e.g., monitoring and intervening) continuously with the system to a manager-by-exception who is not necessarily even physically present in the control room (Thurman & Mitchell, 1995). The intended paradigm, like that in some manufacturing systems, is ‘lights-out automation.’

‘Lights-out automation’ is a term coined in the 1980s in the area of intelligent manufacturing (e.g., Jaikumar, 1986). Initially, with the introduction of inexpensive computer technology, robotics, and automation, manufacturing designers envisioned a lights-out manufacturing facility--a fully automated manufacturing system in which robots, intelligent work cells, automated material handling systems, and computer-based distributed controllers produce goods without human supervision or intervention (e.g., Warnecke and Steinhilper, 1985; Shaiken, 1985). Inevitably, however, experience showed that lights-out production was critically dependent on human expertise, both human operators and engineers (e.g., Jaikumar, 1986; Shaiken, 1985). ‘Islands of automation’ existed in which operators were the bridges enabling the systems to function reliably and flexibly (Jaikumar, 1986). Moreover, in manufacturing systems, it was widely noted that the introduction of automated technologies and control automation, while reducing the number of required personnel, often increased the required skill levels of the persons needed to ensure a productive system (Adler, 1986).

Role of Operators in ‘Lights-Out’ System Control

This paper assumes that operators will have a very different function in system control as the level of automation increases. Operators will be managers, not manual controllers or even supervisory controllers. As such, they will not be present in the control room under normal conditions. They will resume system control when the control automation encounters situations with which it cannot cope. Operators, when the automation fails, will have a range of functions, including resumption of manual (supervisory) control of the system itself and repair of the control automation (possibly in real-time). In addition, as domain experts, operators will also be responsible for maintaining and extending the knowledge base of an ‘intelligent’ control system. Unexpected events may require small repairs or changes to the system (e.g., parameter values or set points). Larger changes in the behavior of the controlled system or its function may require extensions to the knowledge base, e.g., additions or modifications. Operators, rather than software designers, are likely to remain essential elements of complex dynamic systems. Their expertise is essential for the maintenance and growth of the knowledge base. Knowledge-based systems whose design fails to take into account the need for domain specialists to refine, maintain, and extend system knowledge are examples of brittle or clumsy automation (Woods, Johannesen, Cook, & Sarter, 1994). Currently, however, automation is often designed based on the assumption that human intervention is rarely, if ever, needed, and thus little or no consideration is given to the operator-automation interface. When operator intervention is required, inevitable problems arise.

Wiener (Wiener, 1989; Wiener, Chidester, Kanki, Palmer, Curry, & Gregorich, 1991) provides a classic characterization of this situation by describing pilots interacting with the flight management computer on the flight decks of most modern commercial airplanes: pilots ask of the automation, “What is it doing now? Why is it doing that? What will it do next?” The discussion above and Wiener’s characterization of current problems with operator-automation interfaces suggest several properties that automation which must be periodically monitored, repaired, and maintained by human operators should have. A human-centered design (Billings, 1991) of an intelligent control system is inspectable (What is it doing? Why is it doing that?) and predictable, i.e., it supports the operator’s need to anticipate the behavior of the control system (What will it do next?). When the operator must resume control, the system must be repairable, allowing operators to assume control of the system itself, fix the control automation, or both. Finally, in an off-line mode, the automation must be maintainable and extensible.

The fundamental issue in assessing automation is the extent to which automation can function in a lights-out manner. Designers assume that with proper debugging, validation, and verification, automation software will not fail. Human-machine engineers counter with the proposition that in the dynamic and evolving world of control of complex systems, it is impossible, both theoretically and practically, to guarantee that all states and circumstances have been anticipated and adequately managed. Roth, Bennett & Woods (1987) suggest that “unanticipated variability” will always arise and that design needs to take this into account. This field study was conducted in order to explore these issues in a realistic control system. Genie, an ‘intelligent’ control system for NASA satellite ground control, was recently developed and partially fielded. This study assessed the operation of Genie in actual control situations. The study attempts to measure the extent to which Genie could function autonomously and, when human intervention was required, the types of problems that were encountered. In this study, the system was measured against the criteria of being inspectable (What is it doing? Why is it doing that?), predictable (What will it do next?), repairable (Don’t do this, do that), extensible (Do this new activity in the future), and intelligent (Do not make the same mistakes twice).

Field Study Environment

The field study was performed in the domain of satellite ground control at NASA's Goddard Space Flight Center in Greenbelt, Maryland. At this facility, human operators periodically establish a communications link (known as a ‘pass’) with an unmanned scientific satellite to transmit scientific data from the satellite's instruments to the ground, assess the health and safety of the satellite, uplink commands which enable it to function until the next period of communication, and monitor the communications links between the control room and the satellite. Data from the satellite are transmitted through a ground station and arrive at a control room in which operators monitor and control the satellite via a network of computer resources (Figure 1). This study took place in the control room of the Solar, Anomalous and Magnetospheric Particle Explorer (SAMPEX) science satellite. SAMPEX passes are between 10 and 12 minutes in duration, and the pass activities require two operators to be present in the control room.

2

[Figure 1 - Satellite Ground Control Architecture: the Ground Station relays Real-time & Stored Commands and Science & Engineering Data between the satellite and the SAMPEX Control Room, which houses the Genie & Operator Workstations and the SAMPEX Control System.]

The Generic Inferential Executor (Genie) is an expert system developed at NASA Goddard to conduct real-time satellite ground control operations. It is one of the first examples of an intelligent control system developed with a view toward completely autonomous operations, replacing a team of two human operators. Its first implementation for real-time operations is in the mission operations environment for the SAMPEX mission. A software team, in conjunction with an experienced SAMPEX operator, designed the Genie knowledge base. The implementation that was tested was intended for human-assisted operations, though it was capable of completely autonomous operations. Genie is a rule-based system whose knowledge base is organized into pass scripts. Three different pass scripts are available for Genie operations, corresponding to the three types of passes that comprise normal operations. Pass scripts are comprised of various tasks, in turn consisting of

sets of rules, that are executed in a procedural, sequential fashion. Each task can have preconditions, expected results, various actions taken depending on whether or not conditions are met, and timing constraints that dictate when a given task should be performed. Genie has two basic modes of operation: advisory and automatic. In advisory mode, Genie pauses and obtains permission from an operator before issuing commands to the control system or satellite. In automatic mode it operates without operator intervention as long as it does not encounter problems. Genie provides operators with an interface for monitoring and guiding its execution. The Genie control panel (Figure 2) displays information about the status of the pass script (running or paused), the currently executing task, the task state (precondition, action, results-check), and whether or not another external procedure is executing at the time. The Genie interface has a flow chart depicting the pass script, with nodes representing individual tasks. Color-coding shows task state. The control panel allows operators to send commands directly to the spacecraft, to change Genie’s mode, and to control either the pass script as a whole or individual tasks that comprise it. If an individual task fails or its execution pauses due to an anomaly, the operators can direct Genie to retry an individual task or to skip the task and resume execution.

[Figure 2 - Genie Control Panel]
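The task structure described above (preconditions, actions, results-checks, timing constraints, and advisory vs. automatic modes) can be sketched roughly as follows. This is a hypothetical illustration of the concepts, not Genie's actual implementation; all names are invented for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical sketch of a Genie-style pass-script task; names and
# structure are illustrative, not Genie's actual implementation.
@dataclass
class Task:
    name: str
    precondition: Callable[[], bool] = lambda: True   # must hold before acting
    action: Callable[[], None] = lambda: None          # e.g., issue a command
    results_check: Callable[[], bool] = lambda: True   # verify expected results
    deadline_s: Optional[float] = None                 # hard timing envelope

def run_pass_script(tasks: List[Task], mode: str = "automatic",
                    confirm: Callable[[Task], bool] = lambda t: True,
                    elapsed: Callable[[], float] = lambda: 0.0) -> List[str]:
    """Execute tasks sequentially; pause (stop) on the first failure.

    In 'advisory' mode the operator must confirm each task before its
    command is issued; in 'automatic' mode the script runs unattended
    until a task fails.
    """
    log = []
    for task in tasks:
        if task.deadline_s is not None and elapsed() > task.deadline_s:
            log.append(f"{task.name}: TIMING VIOLATION - paused")
            break
        if not task.precondition():
            log.append(f"{task.name}: PRECONDITION FAILED - paused")
            break
        if mode == "advisory" and not confirm(task):
            log.append(f"{task.name}: operator withheld permission - paused")
            break
        task.action()
        if not task.results_check():
            log.append(f"{task.name}: RESULTS CHECK FAILED - paused")
            break
        log.append(f"{task.name}: completed")
    return log
```

Note how a single failed precondition halts the whole sequential script, matching the pause-and-wait-for-the-operator behavior observed in the field study.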

Method

In early 1996, nineteen SAMPEX passes were observed over the course of nine days. The goal was to observe how Genie performed autonomously and how operators interacted with Genie to carry out pass activities. Since this was a field test, two SAMPEX operators responsible for pass activities were present during all passes. In addition, a third SAMPEX operator was often present observing Genie execution. This operator was also responsible for repairing Genie in real-time, either helping Genie complete individual tasks or, in cases of failure, determining what Genie was attempting to accomplish and either assuming manual control or returning control to the operations team. During the passes the researcher took detailed notes on Genie execution and the actions (if any) of the SAMPEX operators. In addition to the notes, Genie produces a log of its actions, and the SAMPEX control system keeps a detailed event log as well. The primary measures of Genie’s effectiveness were the number of passes that Genie completed successfully without any repair by the operator and the types of problems Genie encountered. Problems with the completion of the pass activities can be grouped into six classes:

• problems caused by the ground system (computer and communication resources outside the control room)
• problems caused by the control system hardware
• problems that involved the initial configuration of Genie
• problems caused by failing to synchronize operations with the SAMPEX operators when Genie executed in an advisory mode
• problems due to logic errors within the Genie pass scripts
• problems with timing in the Genie pass scripts

Results

Genie was utilized for fifteen of the nineteen passes observed in this study. In four of these passes, Genie executed in an automatic mode; in the remaining eleven, it executed in advisory mode (Figure 3).

[Figure 3 - Summary of Passes: successful advisory passes 5%, unsuccessful advisory passes 53%, successful automated passes 0%, unsuccessful automated passes 21%, Genie not utilized 21%.]

Passes in which Genie was not Utilized. Genie was not utilized in four passes because the SAMPEX operators knew Genie could not perform effectively, due either to abnormal circumstances or to specific activities planned for the passes that existing pass scripts did not include. Two passes involved special activities that needed to be performed with the ground station at the end of the pass; thus the regular pass activities needed to be completed as quickly as possible at the beginning of the pass. This involved performing some tasks in parallel that are normally executed sequentially. The current Genie pass scripts are not capable of executing those tasks in parallel. Genie was not utilized for a third pass because problems from an earlier pass required additional tasks outside the scope of Genie’s abilities. Finally, Genie was not utilized in a fourth pass because the ground station staff knew in advance that there were likely to be problems maintaining the communications link to the satellite. Genie continually monitors the communications link and pauses when communications problems are encountered; resumption of Genie activities requires manual operator intervention. Thus, knowing frequent manual intervention would be required, the operators decided to forego the use of Genie.

Automated Passes. Each of the four passes during which Genie was utilized in automatic mode had problems that prohibited Genie from completing the pass script completely autonomously. The results of these passes are summarized in Table 1. In the first pass, Genie completed the required pass activities, but the system on which Genie was executing failed shortly before the end of the pass. The second pass failed due to problems with establishing a communications link to the spacecraft from the ground station.


[Table 1 - Summary of Automatic Pass Execution: problem-category marks (System, Hardware, Init., Logic, Timing) for each of passes Auto-1 through Auto-4.]

The third pass failed due to initialization problems (the operator entered incorrect information regarding the ground station) and timing problems with task execution. These problems were resolved by the operator, who instructed Genie to skip the ground station initialization tasks. Genie continued execution without further problems for the remainder of this pass. The fourth pass had minor problems with the ground station as well, again causing timing problems for Genie. The operator instructed Genie to resume operation after each pause. Once the ground station established communications with the satellite, no further problems were encountered.

Advisory Passes. Eleven passes were observed with Genie in advisory mode. The results of these passes are summarized in Table 2. Problems included system problems, in which the ground station could not maintain communications (System); initialization problems, in which operators incorrectly entered required pass parameters (Initialization); synchronization problems in trying to have Genie ‘shadow’ and parallel the activities of the operations team (Synchronization); logic problems within Genie’s pass scripts (Logic); and timing problems, in which Genie task timing constraints were inconsistent with the timing of events in the operational environment (Timing). It is interesting to note that initialization problems occurred in seven of eleven passes. Though these are technically ‘human’ errors, it is important to note that Genie’s design requires this human input in preparation for every pass. Because Genie did not have command capability, its operation was synchronized to the SAMPEX operators’ commanding activities. Synchronization proved difficult given the timing constraints in the pass script and the fact that the normal activities of operators are not nearly as consistent or sequential as the task structure specified in the Genie pass scripts. Timing problems, occurring in three of the eleven passes, are a function of the often ‘brittle’ design of rule-based systems in which an exact envelope is specified, e.g., a task must be successfully accomplished within 30 seconds. Even small violations that would probably be acceptable to operators, e.g., 31 seconds, cause system failure. This underscores an interesting difficulty in trying to evaluate automated systems. The methods used by human operators often allow discretion over task timing, sequencing, etc. Automatic control systems, such as Genie, rarely model multiple methods, implementing instead one ‘best’ strategy. Thus, attempting to test or evaluate the automated system by having it “shadow” a human operator is problematic due to the differences in operational style. In addition, these differences can cause problems beyond evaluating the system. Pilots often complain that their flight management systems do not fly the plane in the same manner that they would. This limits the ability of the pilot/operator to understand or predict what the system is doing.
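The brittleness of exact timing envelopes described in this section can be illustrated with a small sketch. The 30-second figure comes from the example in the text; the tolerance mechanism is a hypothetical alternative for discussion, not a feature Genie provides.

```python
def hard_envelope_ok(elapsed_s: float, limit_s: float = 30.0) -> bool:
    """Exact envelope typical of rule-based systems: 31 s is a failure."""
    return elapsed_s <= limit_s

def soft_envelope_ok(elapsed_s: float, limit_s: float = 30.0,
                     tolerance_s: float = 5.0) -> bool:
    """Hypothetical alternative: accept small overruns an operator would
    tolerate, while still enforcing a hard outer bound."""
    return elapsed_s <= limit_s + tolerance_s
```

Under the hard envelope a 31-second completion fails outright, even though an operator would likely accept it; the soft envelope accepts it while still rejecting large overruns.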

Summary. Figure 4 summarizes Genie’s performance in all nineteen observed passes. There was only one pass that Genie completed without any manual intervention. Ignoring the ground system problems that are beyond Genie’s control, two of fifteen passes were completed successfully. As shown in Figure 5, initialization and timing problems occurred most frequently, each more than twenty-five percent of the time.

[Table 2 - Summary of Advisory Pass Execution: problem-category marks (System, Hardware, Init., Synch., Logic, Timing) for each of passes Adv-1 through Adv-11.]

The frequency of the initialization problems (37%) exemplifies the “islands of automation” problem inherent in many automated control systems. Almost all of the initialization information resides in a sophisticated planning and scheduling system on another computer in the control room. The planning system, however, cannot ‘talk’ to the control system. Instead, the operator must obtain printed information from the planning system and re-enter it into Genie.

Discussion

The results of this field study highlight several important issues. First, unexpected variability was encountered most of the time. Thus, Genie, a system intended to function in a lights-out manner, required manual intervention by operators with great regularity. As a result, supporting operators in management-by-exception functions is indeed an important issue.

Inspectable (What is it doing? Why is it doing that?)

Genie provides a high-level view of task flow and status. Individual tasks are not, however, inspectable. When Genie fails, or pauses due to unanticipated circumstances, there is no way for the operator to determine what control actions were already executed, what values were checked, or what timing constraints were violated.

Predictable

Due to the lack of detailed task knowledge, the operator cannot anticipate either the actions Genie will next execute or the conditions on those actions. The rule-based implementation of tasks in the pass script allows inspection by a knowledge engineer or programmer, but is indecipherable by most domain experts. In the run-time environment, neither programmers nor operators can view the contents of tasks, and thus prediction is not possible.

Repairable

Once a system such as Genie fails or pauses, requiring manual intervention, operators must assume control of the system manually or repair Genie and restart it. Given the limitations on inspection and anticipation, the operator has great difficulty in determining a course of repair. Furthermore, Genie does not fail gracefully or allow operator intervention opportunistically. In cases where the operator identifies a need to intervene prior to a complete failure by Genie, the operator must wait for Genie to fail completely and pause its execution before the operator can assume control. Through the course of the study it became apparent that Genie was not a very forgiving system, particularly for initialization problems. In most of the cases where there were problems with the initialization, Genie was given incorrect information by the operator. There is no facility for the operator to correct the initialization information, and it is often necessary to shut down the Genie system and re-initialize it with the correct information.

Maintainable/Extensible

Genie successfully completed just seven percent of the passes for which it was utilized; there were four additional passes for which it was not used because of known limitations. Obviously, a need for modifications and extensions exists. Currently, only three scripts exist. Any modification to a task common to all three scripts must be made manually to each script separately. The lack of an integrated knowledge base is error-prone and invites human error. Extension of Genie’s capabilities, adding, for example, regular but infrequently executed tasks, requires the creation of a new script. Adding a new task in the task flow is supported by drag-and-drop in the Genie interface.

[Figure 4 - Problem Frequency: bar chart showing the frequency of each problem class (System, Hardware, Initialization, Synchronization, Genie Logic, Genie Timing, Not Utilized) across the observed passes.]

Defining the semantics within a task, however, requires skill in the definition of a knowledge base implemented as a set of rules. The current interface provides minimal support for task definition by operators, who are engineers or other technically trained individuals, but not programmers. If an automated system is difficult to modify and extend, it may not be readily used. As has been observed in a number of studies of expert systems, when the effort required to utilize and maintain them is greater than that required to perform tasks manually, systems go unused (Jones & Mitchell, 1995; Roth, Bennett, & Woods, 1987). At NASA Goddard, for example, operators of the Hubble Space Telescope refused to use an expert system for monitoring because interaction was so difficult (Roalstad, 1996).

Brittle Automation

An additional issue, time criticality, was brought to light by the percentage of passes that featured timing problems (33%). An intelligent control system operating in a time-constrained complex system must have timing constraints integrated with its execution. These time constraints must be rather strict in many cases in order to ensure that activities are completed on time. It is problematic, however, to have a 33% failure rate due to timing constraints. One solution may be to give the system greater flexibility with respect to its timing; this flexibility must be matched, however, by constraints that ensure the ‘hard’ timing constraints of pass communications are met.

Intelligent Automation (Don’t Make the Same Mistake Again!)

The ability to learn from its mistakes seems like a reasonable requirement for an intelligent system. In the current configuration, even after an operator corrects a malfunction, the autonomous system has no capability to learn from its mistakes. It will make the same mistake, under the same circumstances, unless the knowledge base is manually modified.

Implications for the Design of Human-Centered Control Automation

This study provides additional evidence that operator-automation interaction is critical to the ultimate success of sophisticated autonomous control systems. Although lights-out automation is the goal, and eventually may be met, the transition from manual or supervisory control to full automation has a long period during which operators play a critical role--manager-for-exceptions. For effective overall system operation, the design of the automation must support operator functions such as inspection, prediction, repair, and maintenance.

To address the characteristics of inspectability and repairability, we propose that the interface and knowledge base of the system be structured around a model of the operator’s nominal goals and activities in the system. Such modeling methodologies have been shown to be effective in supporting the design of interfaces for complex monitoring and control systems (Mitchell & Saisi, 1987; Thurman & Mitchell, 1994). By structuring the knowledge base of the intelligent system around a model of the operator’s activities, the operator responsible for supervising and guiding the system will be better able to understand the activities the system is performing and why (Woods, 1991). This helps to ensure that operators will have the information needed to understand and repair either the controlled system or the control system. The characteristics of flexibility, maintainability, and extensibility can be met by structuring the knowledge base of the system hierarchically, using the same model that is used to develop the interface. Rather than having monolithic pass scripts, a pass script can be assembled from component activities that correspond to the units of activity that operators use, much the way Schank’s scripts are composed of memory organization packets (MOPs) (Schank, 1982). In addition, if control scenarios are assembled from lower-level activities into pass scripts as required, the problem of propagating changes through individual pass scripts is avoided. Furthermore, a control system dynamically assembled from smaller parts is more flexible. If the environment changes during the pass, the system has the ability to change its path of operation by dynamically reconfiguring the pass script with alternative activities. Repair and extension are facilitated since the system’s capabilities can be modified by adding a new activity or slightly changing an existing activity.

Finally, an ‘intelligent’ interface for real-time operator repair might record the system state and operator actions as another case, thereby allowing the control system to ‘learn.’ The study of Genie provides an exciting opportunity to begin to understand the interaction of humans with increasingly sophisticated control automation. This study helps to provide initial data to formulate a robust theory of human-centered design for autonomous control systems. Continuing research, using satellite ground control as a test-bed, will further refine the theory, implement its tenets in alternative architectures, and evaluate the results.
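The component-based pass-script structure proposed in this section can be sketched roughly as follows. The activity library and all names in it are hypothetical illustrations for discussion, not Genie's actual design.

```python
# Hypothetical sketch: pass scripts assembled at run time from a shared
# library of activity components, rather than maintained as monolithic
# scripts. A fix to one component propagates to every script built from it.
ACTIVITY_LIBRARY = {
    "acquire-signal":  ["configure ground station", "lock on carrier"],
    "health-check":    ["dump telemetry", "compare limits"],
    "uplink-commands": ["load command table", "transmit", "verify receipt"],
    "dump-science":    ["start recorder playback", "verify frame count"],
}

def assemble_pass_script(activity_names):
    """Build a flat task list from reusable activity components."""
    script = []
    for name in activity_names:
        script.extend(f"{name}: {step}" for step in ACTIVITY_LIBRARY[name])
    return script

# Different pass types share components instead of duplicating them.
nominal_pass = assemble_pass_script(["acquire-signal", "health-check",
                                     "uplink-commands", "dump-science"])
short_pass = assemble_pass_script(["acquire-signal", "health-check"])
```

Because each pass type is assembled from the same library, correcting a step inside one activity updates every script that uses it, avoiding the change-propagation problem of maintaining three monolithic scripts by hand.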

References

Adler, P. (1986). New technologies, new skills. California Management Review, 29, 9-28.

Billings, C. E. (1991). Human-Centered Aircraft Automation: A Concept and Guidelines (Technical Memorandum No. 103885). NASA Ames Research Center.

Jaikumar, R. (1986). Postindustrial manufacturing. Harvard Business Review, Nov-Dec, 69-76.

Jones, P. M. & Mitchell, C. M. (1995). Human-computer cooperative problem solving: Theory, design, and evaluation of an intelligent associate system. IEEE Transactions on Systems, Man, and Cybernetics, SMC-25(7), 1039-1053.

Martin, T., Ulich, E., & Warnecke, H. J. (1990). Appropriate automation for flexible manufacturing. Automatica, 29, 611-616.

Mitchell, C. M. & Saisi, D. L. (1987). Use of model-based qualitative icons and adaptive windows in workstations for supervisory control systems. IEEE Transactions on Systems, Man, and Cybernetics, SMC-17(4), 573-593.

Roalstad, A. S. (1996). Personal communication.

Roth, E. M., Bennett, K. B., & Woods, D. D. (1987). Human interaction with an "intelligent" machine. International Journal of Man-Machine Studies, 27, 479-526.

Schank, R. C. (1982). Dynamic Memory: A Theory of Reminding and Learning in Computers and People. Cambridge: Cambridge University Press.

Shaiken, H. (1985). The automated factory: The view from the shop floor. Technology Review, Jan, 17-25.

Thurman, D. A. & Mitchell, C. M. (1994). A methodology for the design of interactive monitoring interfaces. In Proceedings of the 1994 IEEE International Conference on Systems, Man, and Cybernetics (pp. 1738-1744). San Antonio, TX.

Thurman, D. A. & Mitchell, C. M. (1995). Multi-system management: The next step after supervisory control? In Proceedings of the 1995 IEEE International Conference on Systems, Man, and Cybernetics (pp. 4207-4212). Vancouver, BC.

Warnecke, H. J. & Steinhilper, R. (1985). Flexible Manufacturing. UK: IFS (Publications) Ltd.

Wiener, E. L. (1989). Human Factors of Advanced Technology ("Glass Cockpit") Transport Aircraft (Tech. Rep. 117528). Moffett Field, CA: NASA Ames Research Center.

Wiener, E. L., Chidester, T. R., Kanki, B. G., Palmer, E. A., Curry, R. E., & Gregorich, S. E. (1991). The Impact of Cockpit Automation on Crew Coordination and Communication: I. Overview, LOFT Evaluations, Error Severity, and Questionnaire Data (NASA Contractor Report 177587). Moffett Field, CA: NASA Ames Research Center.

Woods, D. D. (1991). The cognitive engineering of problem representations. In G. R. S. Weir & J. L. Alty (Eds.), Human-Computer Interaction and Complex Systems (pp. 169-188). London: Academic Press.

Woods, D. D., Johannesen, L. J., Cook, R. I., & Sarter, N. B. (1994). Behind Human Error: Cognitive Systems, Computers, and Hindsight. Wright-Patterson AFB, OH: Crew Systems Ergonomics Information Analysis Center.

