A COMPARISON OF VARYING LEVELS OF AUTOMATION ON THE SUPERVISORY CONTROL OF MULTIPLE UASs

Lisa Fern* and R. Jay Shively†

Unmanned aerial systems (UASs) have proliferated throughout the modern battlefield and by all predictions will continue to grow. UAS flights have increased to more than 500,000 hours in the air, largely in Iraq. New concepts of operations and emerging techniques, tactics and procedures will require a single operator to simultaneously control multiple UASs. This additional tasking will increase workload and decrease situation awareness (SA) unless new human-machine interfaces are instantiated. The current experiment investigated the utility of a new interface technique called Playbook. In Playbook, an operator calls a play in which all of the players (i.e., UASs) know their roles and have fully or semi-autonomous capability to carry out the tasks necessary to perform them, a situation analogous to a quarterback calling a play in football. Operators controlled three UASs in a coordinated mission to evaluate interfaces with varying levels of automation: multiple-UAS automation (plays), single-UAS automation (scripts), and no automation (tools). Operators were evaluated on their accuracy and speed in acquiring and prosecuting targets and in identifying civilians. Playbook was found to support higher performance at lower workload.

INTRODUCTION

Unmanned Aerial Vehicles (UAVs), now more appropriately termed Unmanned Aerial Systems (UASs), have proliferated throughout the modern battlefield and by all predictions will continue to grow. US military UAS operations have easily eclipsed half a million flight hours in Iraq and Afghanistan [1]. The Army alone has flown over 350 unmanned aircraft in Iraq (Shadows, Hunters and Ravens), logging more than 250,000 hours in the early months of 2007. In addition, the Air Force has flown the Predator over 3,000 hours for each month it has been present in Afghanistan. These UAS usage rates are predicted to grow over the next 25 years, not just in military service, but also in homeland security, border patrol, and domestic commercial operations.

* Senior Research Associate, San Jose State University Research Foundation/Aeroflightdynamics Directorate, MS T12B-2, NASA Ames Research Center, Moffett Field, CA 94041, [email protected]
† Human Systems Integration Group Leader, Aeroflightdynamics Directorate, US Army Research Development and Engineering Command (AMRDEC), MS T12B-2, NASA Ames Research Center, Moffett Field, CA 94041, [email protected]


One limiting factor in the widespread use of UASs may be the availability of trained operators. Current UAS ground stations vary somewhat between services and platforms, but nominal operations consist of an Air Vehicle Operator (AVO) and a Payload Operator (PO) for each vehicle: an operator-to-vehicle ratio of 2:1. However, new concepts of operations, and emerging techniques, tactics and procedures (TTPs), will require a single operator to simultaneously control multiple (possibly heterogeneous) UASs.

Achieving this requires that the operator-to-vehicle ratio reverse to 1:2 or even 1:4. The operational requirement for multiple-vehicle control by a single operator is compounded by the lack of trained operators and the time required for training. User interfaces with automation capabilities are needed to address the issues posed by multiple UAS control.

Current solutions to the challenges imposed by multiple UAS control generally fall into one of two system categories: agent-based or supervisory control. Agent-based systems support fully autonomous behavior through independent sensing, reasoning, decision-making and action capabilities [2]. These approaches range in complexity from simple feedback control systems to those based on models of human cognition and decision making. Swarm intelligence, an example of a simplistic agent-based system, encompasses the collective intelligence of large groups in which single-agent behavior tends to be more reflexive than cognitive. Agents in swarming groups share information with each other in order to organize into flight configurations that optimize their shared sensing capabilities. However, they are typically 'unaware' of their position or role within the group, applying local, rather than global, rules to control their behavior [3]. More complex systems based on current models of human behavior, for example those developed in the military command and control literature, instantiate more global knowledge and awareness, as well as independent capabilities, into single agents. Given the need for humans to make mission-critical decisions during certain operations, fully autonomous UASs may be both unrealistic and unacceptable for many military operations.

Alternatively, several studies have examined multi-agent systems (MAS) that utilize human-automation coordination between the human operator and semi-autonomous intelligent agents. These multi-agent systems are typically based on a supervisory control structure whereby the intelligent agents are given tasks by the human operator, who then monitors task progress. Supervisory control thereby releases operators from the manual control tasks of one or more vehicles so that their cognitive resources can focus on overall mission management; however, the effectiveness of a supervisory control structure depends on determining what levels of automation and decision support are needed to manage operator workload and performance [4]. The level of automation chosen may also depend on task specification, which can vary from high-level mission goals to more specific low-level behaviors. In response to different levels of task specificity, intelligent agents will require varying levels of decision-making ability or authority. Ruff, Narayanan, and Draper examined the effects of three different levels of automation (LOAs), as defined by Rouse and Rouse, on multiple UAV control: manual control, management-by-consent, and management-by-exception [5, 6]. Management-by-consent and management-by-exception enabled the MAS to propose actions to the operator and either wait for consent to act (the former) or act independently after informing the operator, who can override the action (the latter). Compared to manual control, where the operator is responsible for deciding when and how to act, these systems are 'aware' of system and situation status, can alert the operator to changes that require responses, and can propose actions.
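To make the distinction concrete, the sketch below contrasts the two interaction patterns. It is our illustration only, with hypothetical class and method names; it is not drawn from Ruff et al. or from any fielded system.

```python
import time

# Hypothetical illustration of the two supervisory-control interaction
# patterns described above; names and interfaces are assumptions.

class Agent:
    def __init__(self, name):
        self.name = name

    def execute(self, proposal):
        print(f"{self.name} executing: {proposal}")

class Operator:
    """Stub operator whose responses are scripted for the example."""
    def __init__(self, consents=True, vetoes=False):
        self.consents = consents
        self.vetoes = vetoes

    def notify(self, message):
        print(f"[alert] {message}")

    def wait_for_consent(self, proposal):
        return self.consents          # stands in for a blocking UI prompt

    def has_vetoed(self, proposal):
        return self.vetoes            # stands in for polling a veto control

def execute_by_consent(agent, proposal, operator):
    """Management-by-consent: the agent may not act until the operator approves."""
    operator.notify(f"{agent.name} proposes: {proposal}")
    if operator.wait_for_consent(proposal):
        agent.execute(proposal)

def execute_by_exception(agent, proposal, operator, veto_window_s=2.0):
    """Management-by-exception: the agent acts unless the operator vetoes in time."""
    operator.notify(f"{agent.name} will execute: {proposal}")
    deadline = time.monotonic() + veto_window_s
    while time.monotonic() < deadline:
        if operator.has_vetoed(proposal):
            return                    # operator override: action cancelled
        time.sleep(0.1)
    agent.execute(proposal)           # window elapsed with no veto

execute_by_consent(Agent("UAS-1"), "retask to NAI 2", Operator(consents=True))
execute_by_exception(Agent("UAS-2"), "climb to 1,000 m", Operator(vetoes=False))
```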

One technique for supervisory control utilizes delegation of authority. An operator can predetermine goals and/or tasks to be delegated to automation or other intelligent entities; which tasks are delegated, and under what conditions authority is passed, can be set in advance. An example of the application of this technique is Playbook®*, developed by Miller, Goldman, Funk, Wu, and Pate [7]. This system is based on the analogy of a coach calling a play in sports and utilizes automation for cooperative UASs. When a quarterback calls a play, the players (as intelligent entities) know their roles and do not need specific instructions. Applying this to the UAS environment, an operator (quarterback) calls a play in which each UAS (intelligent entity) knows its part in the play and can execute tasks autonomously and cooperatively. For example, a UAS operator might call an "Overwatch Tango" play, where "Tango" is a known waypoint. A Shadow UAS knows to go into a loiter at 1,000 meters to watch ingress/egress routes, while a Firescout goes into a hover 500 meters in front of Tango to get a more detailed view. Both of these UASs are tasked by the single command given by the operator. Delegation control based on the play framework can get much more complex than this and may need to be modified for the specific situation, e.g., specifying a north approach to a waypoint (due to sun angle). Using this technique, a single operator can control multiple UASs on a coordinated mission.
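As a purely illustrative sketch of the play abstraction (our own hypothetical names and parameters, not the actual Playbook® implementation), a play can be thought of as a named bundle of per-vehicle tasks that a single operator command expands into coordinated taskings:

```python
from dataclasses import dataclass

# Illustrative sketch of the play abstraction; vehicle roles follow the
# "Overwatch Tango" example above, but the Firescout hover altitude and
# all names/structures are assumptions for illustration only.

@dataclass
class Task:
    vehicle: str       # which UAS performs the task
    action: str        # e.g., "loiter", "hover"
    offset_m: float    # position relative to the play's anchor waypoint
    altitude_m: float

# One operator command ("call a play") expands into a coordinated
# tasking for every participating vehicle.
PLAYS = {
    "overwatch": [
        Task(vehicle="Shadow",    action="loiter", offset_m=0.0,   altitude_m=1000.0),
        Task(vehicle="Firescout", action="hover",  offset_m=500.0, altitude_m=150.0),
    ],
}

def call_play(play_name: str, waypoint: str) -> None:
    """Expand a single play call into per-vehicle taskings (stub dispatch)."""
    for task in PLAYS[play_name]:
        print(f"{task.vehicle}: {task.action} near {waypoint} "
              f"(offset {task.offset_m} m, altitude {task.altitude_m} m)")

call_play("overwatch", waypoint="Tango")   # one command tasks both UASs
```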

The present experiment examines potential performance advantages of this type of delegation control in contrast to scripts and manual control. For this experiment, plays are defined as delegation to multiple cooperative UASs, scripts are defined as single-UAS automation, and manual control is analogous to current operations.

METHOD

Participants

Twelve volunteer pilots were recruited to participate in the study. All 12 participants were male, and their ages ranged from 21 to 34 years (M = 27.92; SD = 3.80). Participants were required to hold, at minimum, an active Private Pilot License (PPL). Two of the participants held Airline Transport Pilot (ATP) licenses and the remaining ten held commercial pilot licenses.

* DISCLAIMER: Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not constitute or imply its endorsement, recommendation, or favoring by the United States Government. The views and opinions expressed herein are strictly those of the authors and do not represent or reflect those of the United States Government. The viewing of the presentation by the Government shall not be used by San Jose State University Research Foundation or Smart Information Flow Technologies as a basis of advertising.


Total flight hours ranged from 530 to 4,000 hours (M = 1442.50; SD = 1218.36), and one pilot had military experience. Eligibility was limited to participants who were right-handed, had normal or corrected-to-normal vision, and were under 35 years of age.

Participants were required for no longer than five hours each, were given a one-hour lunch break halfway through testing, and were compensated $124 for their time. None of the participants had previous UAS control experience.

Multiple UAS Simulator (MUSIM) Ground Control Station

Hardware. The simulation was generated with a quad-core CPU using an NVidia GeForce 8800 GTX video card and 2 GB RAM. The monitor used was a 30" Apple Cinema Display with a display resolution of 2560 x 1600 and 24-bit color.

Software. The experiment was run on the SUSE Linux 10.3 operating system. MUSIM has the following software dependencies: 1) OpenSceneGraph for graphics, and 2) FLTK for the graphical user interface. The mission scenarios were generated by MAK Technologies' VR-Forces.

Terrain Database. A visual database was created using Creator Terrain Studio 2.0.2 and Creator 2.5.1. Terrain imagery was obtained from US Geological Survey satellite photography. The simulation utilized 30-meter elevation data, with 45-meter texture data in the lower-resolution areas and 0.7-meter texture data in the high-resolution areas. Three designated areas within the database were utilized for this experiment. These areas can be characterized as dense urban terrain or medium-density industrial terrain.

UAS Flight Model. A generic flight control model emulated three notional, tactical fixed-wing UASs for this simulation. Airspeed and altitude were fixed for all UASs in all conditions.

Operator Interface. The simulation utilized a 1:3 operator/vehicle ratio user interface consisting of three sensor views, a 2D top-down map view with a manual waypoint-editing GUI, and an AV control panel. An optical mouse was used for navigation of operator control panels in the operator interface.

Payload and Cursor Controllers. The simulated sensor payload on each UAS was electro-optical only, with three degrees of freedom (DOF) of sensing capability, including 360-degree pan capability and +45/-110-degree pitch limits. Zoom (y-axis) capabilities supported a progressive change in FOV from 2 to 16 degrees (1x to 8x). Sensor slew rate, when used, was set at 60 degrees/second. The gimbaled sensor was operator-controlled via a 6-DOF 3Dconnexion SpaceExplorer input device. The SpaceExplorer utilizes 6-DOF sensing technology (x-axis, y-axis, z-axis, pitch, heading, and roll) in a pressure-sensitive device that requires right/left and up/down twisting motions to control the starepoint of the payload.
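A minimal sketch of how payload limits like these might be enforced in software is shown below. This is our illustration with assumed parameter names, not MUSIM code: pan wraps freely through 360 degrees, while pitch and zoom commands are clamped to the stated limits.

```python
# Illustrative clamping of sensor commands to the limits stated above;
# function and constant names are assumptions, not MUSIM internals.

PITCH_MIN_DEG, PITCH_MAX_DEG = -110.0, 45.0   # stated pitch limits
FOV_MIN_DEG, FOV_MAX_DEG = 2.0, 16.0          # stated zoom (FOV) range

def clamp(value, low, high):
    return max(low, min(high, value))

def apply_sensor_command(pan_deg, pitch_deg, fov_deg):
    """Return a (pan, pitch, fov) triple respecting the payload limits."""
    pan = pan_deg % 360.0                      # full 360-degree pan capability
    pitch = clamp(pitch_deg, PITCH_MIN_DEG, PITCH_MAX_DEG)
    fov = clamp(fov_deg, FOV_MIN_DEG, FOV_MAX_DEG)
    return pan, pitch, fov

print(apply_sensor_command(370.0, -130.0, 1.0))   # -> (10.0, -110.0, 2.0)
```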

Experimental Design

The present study utilized a within-subjects, repeated-measures design to examine performance and workload measures while conducting UAS surveillance missions in a MOUT (Military Operations on Urban Terrain) setting across three different modes of control.

Missions. One practice and three experimental missions were created for this experiment. All missions were 15 minutes in length and included three Named Areas of Interest (NAIs), three enemy target vehicles (camouflage-colored Hum-Vs), and 48 civilian vehicles (multi-colored cars, trucks, and vans). The NAIs were particular buildings in the simulated environment and were identified in both the map and sensor views by small green squares. The NAIs were the same for all missions. Enemy target and civilian vehicles and their paths were created using MAK Technologies' VR-Forces software. All vehicles exited, one at a time, from one of the three NAIs at five- to 20-second intervals during each mission. The four missions differed in the order and timing of exiting vehicles in order to reduce predictability between missions. Enemy target paths also varied between missions to reduce the predictability of their behavior.

Mission Objectives. Participants were given four mission objectives: 1) monitor NAIs for enemy targets, 2) identify and track enemy targets, 3) prosecute high-value targets (HVTs), and 4) identify and mark civilian vehicles. The NAI monitoring task required participants to use the payload sensors of the UASs to "stare" at each of the three NAIs. When a participant saw an enemy target exiting an NAI, he was tasked with engaging the auto-track function in order to track the target to its destination, while simultaneously continuing to monitor the three NAIs for civilian vehicles (only one target was presented at a time during a mission; however, civilian vehicles could exit NAIs at any time). One target in Mission 1 and two targets in Missions 2 and 3 became HVTs by 'picking up' a large weapon, visible in the open back of the Hum-V. Once a participant identified an HVT, he was required to prosecute it. While engaged in these primary tasks, the participant was also responsible for identifying and marking civilian vehicles as 'friendlies'.

Control Modes. The three control modes examined in this experiment were plays, scripts, and tools. Tools, or manual control, represented the no-automation level of control. This mode required participants to manually control the payload sensor and flight paths of each of the three UASs in order to complete the mission tasks. The scripts mode provided single-UAS automation that allowed participants to select a 'script' for one UAS at a time that would control that UAS's behavior, including setting automatic flight paths and slewing the camera sensor to "stare" at one or two NAIs. The plays mode provided multi-UAS automation: participants could select a 'play' that would control the behavior of all three UASs at one time, including setting automatic flight paths and slewing sensors. In all modes of control, manual manipulation of the payload sensor was required to track targets and mark civilian vehicles. Table 1 outlines the operator steps required in each control mode to execute the prosecute-target task.

5

Table 1. Operator steps associated with each control mode for the prosecute-target task.

Control Mode    Steps
Tools           1. Turn on lase
                2. Lase target
                3. Send coordinates to weaponized UAS
                4. Switch primary UAS
                5. Turn on weapons
                6. Fire weapon
Scripts         1. Select and execute 'lase' script from the script interface
                2. Switch primary UAS
                3. Turn on weapons
                4. Fire weapon
Plays           1. Select and execute 'prosecute' play from the play interface
                2. Fire weapon

Procedures

Training Session. After filling out the informed consent form and a demographic survey, participants began a training session to familiarize themselves with MUSIM. The entire training session took, on average, one hour to complete and included four pre-recorded videos and four short 'mini-missions' in MUSIM. (These missions consisted of multiple civilian and target vehicles, but were not structured like the practice and experimental missions used during the testing sessions.) The four pre-recorded videos consisted of an introduction to MUSIM and its main components, and three videos demonstrating how to perform the mission objectives in each of the three control modes (one video per control mode). After watching each video, the participant was given a 9-minute mission in MUSIM to practice the skills demonstrated in the video. The missions were repeated if a participant felt he needed more time to practice. Once a participant had completed the training session and felt comfortable with MUSIM and the mission objectives, he began the experimental sessions.

Experimental Sessions. Each participant completed three experimental sessions, one for each control mode. Each experimental session began with the 15-minute practice mission, followed by the three 15-minute experimental missions during which data were collected. The order in which control modes and missions were presented to participants was counterbalanced according to a Latin square.
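For illustration, a balanced ordering of three conditions can be generated by cyclic rotation, as sketched below. This is a generic sketch of the counterbalancing idea, not the authors' exact assignment of participants to orders.

```python
# Sketch of a 3x3 Latin square for counterbalancing condition order;
# each row is a presentation order assigned to a subset of participants.
conditions = ["plays", "scripts", "tools"]

latin_square = [conditions[i:] + conditions[:i] for i in range(len(conditions))]
for row in latin_square:
    print(row)
# ['plays', 'scripts', 'tools']
# ['scripts', 'tools', 'plays']
# ['tools', 'plays', 'scripts']
```

Each condition appears exactly once in every row and every column, so each control mode is seen equally often in each serial position across participants.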


After completing each experimental session, participants filled out a NASA-TLX workload rating form for that session's control mode.

Measures

Primary Task Performance. The two primary tasks for this experiment were 1) track enemy vehicles, and 2) prosecute high-value targets. Accuracy (%) and response time (in seconds) were collected for both tasks. Accuracy is defined as the number of targets that were correctly tracked or prosecuted out of the total number possible. Response time is defined as the average time it took the participant to perform the respective task (i.e., tracking or prosecuting) from the time that the target or high-value target appeared in the mission.

Secondary Task Performance. The secondary task for this experiment was to mark civilian vehicles. Accuracy and response time were also collected for this task. Accuracy and response time are defined, respectively, as the number of vehicles that were correctly marked out of the total possible, and the average time it took the participant to mark each vehicle from the time that it appeared in the mission.

Workload. The NASA-TLX was administered to participants after each experimental session. In addition, participants were asked to rate their own performance after each experimental session.

RESULTS

The data were analyzed using a 3 (control mode: plays, scripts, tools) x 3 (mission: 1, 2, 3) repeated-measures ANOVA. The results are organized by task.

Primary Tasks

Track Enemy Targets. There was no main effect of control mode on participant accuracy in tracking enemy targets, F(2, 22) = 2.140, p = .141. The percentage of targets tracked did not differ between the plays (M = 97.2, SD = 1.5), scripts (M = 91.7, SD = 3.9), and tools (M = 90.7, SD = 2.3) control modes. There was, however, a main effect of control mode on response time, F(2, 22) = 5.520, p < .05. Post hoc comparisons revealed that response time was significantly faster in the plays control mode (M = 16.296, SD = .969) than in both the scripts (M = 20.720, SD = 1.825) (p < .05) and tools (M = 24.858, SD = 3.344) (p < .05) control modes. There was no main effect of mission number on accuracy or response time for tracking enemy targets, nor were there any significant interactions of control mode and mission.

Prosecute HVTs. There was no main effect of control mode on accuracy, F(2, 22) = 3.043, p = .068. The percentage of HVTs prosecuted did not significantly differ between the plays (M = 86.1, SD = 3.5), scripts (M = 73.6, SD = 4.8), and tools (M = 72.2, SD = 6.3) control modes. However, there was a significant main effect of control mode on response time, F(2, 22) = 19.534, p < .001. Response time for prosecuting HVTs was significantly faster in the plays control mode (M = 20.484, SD = 1.848) than in the scripts control mode (M = 44.544, SD = 4.916) (p < .001) and the tools control mode (M = 50.555, SD = 4.797) (p < .05). There was a main effect of mission number on accuracy for prosecuting HVTs; however, there were no significant interactions of mission and control mode. This suggests that while accuracy differed between missions, the difference was consistent across the three control modes.
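For readers who wish to reproduce this style of analysis, a minimal sketch using the statsmodels AnovaRM class is shown below. The data frame is filled with random placeholder values, not the study's data; column names are our own.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Sketch of a 3 (control mode) x 3 (mission) repeated-measures ANOVA.
# The response-time values below are random placeholders, NOT study data.
rng = np.random.default_rng(0)
rows = [
    {"pid": pid, "control": c, "mission": m,
     "response_time": rng.normal(20, 5)}
    for pid in range(12)                     # 12 participants
    for c in ["plays", "scripts", "tools"]   # within-subjects factor 1
    for m in [1, 2, 3]                       # within-subjects factor 2
]
df = pd.DataFrame(rows)

# AnovaRM requires exactly one observation per subject per cell.
result = AnovaRM(data=df, depvar="response_time",
                 subject="pid", within=["control", "mission"]).fit()
print(result)   # F and p values for both main effects and the interaction
```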

Figure 1. Response times to complete the primary tasks in each of the control modes.

Secondary Task

The main effect of control mode on the percentage of civilian vehicles marked was significant, F(2, 22) = 12.336, p < .001. A significantly higher percentage of vehicles was marked in the plays control mode (M = 71.4, SD = 4.3) than in both the scripts (M = 58.2, SD = 4.1) (p < .01) and tools (M = 56.6, SD = 3.5) (p < .01) control modes. The main effect of control mode on response time was not significant, F(2, 22) = 1.642, p > .05. This result is likely due to the large number (48) of vehicles that could be marked in each mission. There was no significant main effect of mission number on accuracy or response time, and there were no significant interactions of control mode and mission.

Figure 2. Accuracy rates for the secondary task in each of the control modes.


NASA-TLX

A repeated-measures ANOVA was used to analyze the difference in NASA-TLX ratings between the three control modes. This difference was significant, F(2, 22) = 4.289, p < .05. Post hoc analysis revealed that workload ratings were significantly lower in the plays control mode (M = 2.467, SD = .194) than in the tools control mode (M = 3.292, SD = .294) (p < .05), but not significantly different from workload ratings in the scripts control mode (M = 2.808, SD = .243) (p > .05).

Figure 3. NASA-TLX ratings for each of the control modes.

DISCUSSION

The results of this experiment clearly demonstrate performance and workload advantages for plays over scripts or tools. This finding was evident in both primary and secondary task measures. Further, the performance advantage was coupled with reduced workload in the plays condition. Both of the primary task measures (track enemy targets and prosecute high-value targets) showed a response-time advantage when operators were controlling the UASs via plays. The secondary task measure (mark civilian vehicles) exhibited higher accuracy in the plays condition, in which operators were able to correctly mark more vehicles. It is interesting to note the sensitivity of the performance measures to the different tasks, which is likely a result of the difference between a small number of opportunities to engage targets (more sensitive to differences in response time) and a large number of civilian vehicles (more sensitive to differences in accuracy). Perceived workload, as reflected by the NASA-TLX, was lower in the plays condition. Taken together, these findings provide strong evidence for the advantages of delegation in supporting the supervisory control of multiple UASs.

In addition to the empirical efforts on delegation control, the Human Systems Integration group at the Army's Aeroflightdynamics Directorate is conducting a flight test of this concept.


Delegation control will be used to control four UASs (two live vehicles and two virtual). The flight demonstration will control a Yamaha RMAX rotor-wing UAV and a ground robot (ROVER), developed by Carnegie Mellon University. In addition, two virtual Shadow UASs will fly a coordinated mission in a virtual database of the live test site. This demonstration is scheduled to take place in April 2009 at the Ft. Ord MOUT site.

While the experiment and flight demonstration provide strong evidence for the utility of delegation control for the supervisory control of UASs, more work is required. The conditions of this experiment lent themselves to good performance by delegation control: the conditions were always valid for the available plays (i.e., they were a good 'fit'), the plays always had the assets they needed to be performed, and the conditions did not change after a play was called. Further experiments are needed to demonstrate the utility of this technique in more complex conditions. Follow-up experiments in the HSI laboratory will focus on issues such as ambiguous conditions, dynamic environments, failure modes, and other situations in which plays might find it more difficult to succeed.

ACKNOWLEDGMENTS

The authors would like to acknowledge Tom Marlow and Terry Welsh for their extensive contributions and technical skills in developing the MUSIM environment used in this study.


REFERENCES

1. Associated Press. (2008, January 2). Rise of the machines: UAV use soars. Retrieved September 15, 2008, from http://www.military.com/NewsContent/0,13319,159220,00.html.

2. Karim, S., Heinze, C., & Dunn, S. (2004). Agent-based mission management for a UAV. Proceedings of the International Conference on Intelligent Sensors, Sensor Networks & Information Processing (pp. 481-486). Los Alamitos, CA: IEEE Press.

3. Gaudiano, P., Shargel, B., Bonabeau, E., & Clough, B. T. (2003). Swarm intelligence: A new C2 paradigm with an application to the control of swarms of UAVs. 8th International Command and Control Research and Technology Symposium, USA, 1-13.

4. Cummings, M. L., Bruni, S., Mercier, S., & Mitchell, P. J. (2007). Automation architecture for single operator, multiple UAV command and control. The International C2 Journal, 1(2), 1-24.

5. Ruff, H. A., Narayanan, S., & Draper, M. H. (2002). Human interaction with levels of automation and decision-aid fidelity in the supervisory control of multiple simulated unmanned air vehicles. Presence, 11(4), 335-351.

6. Rouse, W. B., & Rouse, S. H. (1983). A framework for research on adaptive decision aids (Technical Report AFAMRL-TR-83-082). WPAFB, OH: Air Force Aerospace Medical Research Laboratory.

7. Miller, C., Goldman, R., Funk, H., Wu, P., & Pate, B. (2004). A playbook approach to variable autonomy control: Application for control of multiple, heterogeneous unmanned air vehicles. Proceedings of Forum 60, the Annual Meeting of the American Helicopter Society, Baltimore, MD.

