Automation Trust in Conditional Automated Driving Systems: Approaches to Operationalization and Design

Dissertation submitted to the Human- und Sozialwissenschaftliche Fakultät of Technische Universität Chemnitz in fulfillment of the requirements for the academic degree of doctor rerum naturalium (Dr. rer. nat.)

Submitted by: Sebastian Hergeth, born 23.06.1986 in München
Submitted on: 02.05.2016
Defense on: 16.09.2016
First reviewer: Prof. Dr. Josef F. Krems, Technische Universität Chemnitz
Second reviewer: Prof. Dr. Martin Baumann, Universität Ulm

Acknowledgments

This dissertation would not have been possible without the support of a great number of people. I would like to take this opportunity to express at least some of my gratitude to them: I thank Prof. Dr. Josef F. Krems for the academic supervision, guidance and review of this work; Prof. Dr. Martin Baumann for reviewing this work; Dr. Nora Broy and Dr. Lutz Lorenz for supervising and supporting this work at the BMW Group; Dr. Roman Vilimek for his support as a mentor and far beyond; Dr. Thomas Franke, Christian Gold, Nicole Kühnpast, Dr. Frederik Platten, Lisa Precht and Dr. Josef Schumann for their critical opinions and suggestions for improving the scientific articles produced as part of this work; Philipp Kerschbaum, Dr. Matthias Beggiato and Shih-Yi Chien for our scientific exchange; my team leader Ronee Chadowitz, my department heads Hans-Joachim Faulstroh and Michael Heimrath as well as Dr. Andreas Keinath for making this work possible; the entire FaSi team and everyone involved in building 25; Benedikt Zierer for implementing our display concepts; Lars Toenert and Anson An for conducting our joint study; Jeanette Bohr for her help in communicating our research; Andrea Neumüller and Nicole Reitmeier for their support within the BMW ProMotion program; Arianna Tacconi for the diversion and amusement; Stefan Fuhrmann and Daniel Häring for their decades of friendship (and for proofreading this work!); Katharina Löschl for her support and backing; the participants of the studies conducted as part of this work; and the anonymous reviewers of the resulting scientific articles. Beyond that, I would like to thank, and at the same time apologize to, the certainly numerous people I have left out of this list. Last but not least, my thanks go to all the students I had the pleasure of working with during this dissertation: Nicole Biebel, Marc-Kevin Doeker, Carolin Dungs, Timo Fischer, Fabian Gruss, Oliver Jarosch, Nicole Murgg, Charlotte Pfeiffer, Philipp Prokop and Fabian Ries. I would also like to thank all my colleagues and friends from the bottom of my heart, without whom working on this dissertation would have been far less fun. The greatest thanks, without a doubt, go to my parents Anneliese and Heinz and to my grandparents, who have supported me unconditionally through all the highs and lows of my life.

Summary

Automated driving systems make it possible to transfer vehicle control from the driver to the vehicle to a certain degree. Because the driver can be assisted, relieved or even replaced in this way, automated driving systems are associated with great potential for improving road traffic safety, driving comfort and efficiency, provided that these systems are used appropriately. In this context, conditionally automated driving systems pose a particular challenge for human-machine interaction: At this level of automation the driver is, for the first time, no longer required to permanently monitor the system and can thus potentially use the driving time for non-driving-related tasks. However, the driver is still expected to be able to take over vehicle control when needed, following an appropriate prior take-over request. Appropriate automation trust is therefore a central component of successful cooperation between drivers and conditionally automated driving systems and should be taken into account when designing such systems. Earlier findings already indicate, for example, that providing different information about automated systems could be one possible approach for actively shaping drivers' automation trust. Considering automation trust as a variable in the design of vehicle technology, however, first requires being able to measure automation trust adequately. Against this background, the objective of this work was, on the one hand, to investigate different methods for measuring drivers' automation trust and, on the other hand, to identify, prototypically implement and evaluate potential approaches for designing automation trust in the context of conditionally automated driving systems. For these purposes, three driving simulator studies with a total of N = 280 participants were conducted. The present results indicate that (i) both self-report measures and behavioral measures can in principle be used to operationalize drivers' automation trust in conditionally automated driving systems, (ii) prior familiarization with functional limits of conditionally automated driving systems can have a lasting effect on drivers' automation trust in the system, and (iii) in particular, information about how conditionally automated driving systems work can improve drivers' automation trust in such systems. The present work thus provides valuable approaches to the operationalization of automation trust as well as guidance for its design in the context of conditionally automated driving. Moreover, the findings of this work can to some extent also be transferred to other types of vehicle automation as well as to other domains and applications of automation.

Abstract

Automated driving systems allow a certain degree of vehicle control to be transferred from the driver to the vehicle. By assisting, augmenting or even replacing the driver, automated driving systems have been associated with enormous potential for improving driving safety, comfort, and efficiency — provided that they are used appropriately. Among those systems, conditional automated driving systems are particularly challenging for human–automation interaction: While the driver is no longer required to permanently monitor conditional automated driving systems, he / she is still expected to provide fallback performance of the dynamic driving task after adequate prior notification. Therefore, facilitating appropriate automation trust is a key component for enabling successful cooperation between drivers and conditional automated driving systems. Earlier work indicates that providing drivers with proper information about conditional automated driving systems might be one promising approach to do this. Considering the role of automation trust as a variable in the design of vehicle technology, however, also requires that drivers' automation trust can be viably measured in the first place. Accordingly, the objectives of this thesis were, on the one hand, to explore different methods for measuring drivers' automation trust in the context of conditional automated driving and, on the other hand, to identify, implement and evaluate possible approaches for designing drivers' automation trust in conditional automated driving systems. For these purposes, three driving simulator studies with N = 280 participants were conducted. The results indicate that (i) both self-report measures and behavioral measures can be used to assess drivers' automation trust in conditional automated driving systems, (ii) prior familiarization with system limitations can have a lasting effect on drivers' automation trust in conditional automated driving systems and (iii) particularly information about the processes of conditional automated driving systems might promote drivers' automation trust in these systems. Thus, the present research contributes much-needed approaches to both measuring and designing automation trust in the context of conditional automated driving. In addition, the current findings might also be transferred to higher levels of driving automation as well as other domains and applications of automation.

Contents

Abbreviations
1 Introduction
2 Theoretical Background
  2.1 Taxonomies of Automated Driving Systems
  2.2 Lee and See's (2004) Dynamic Model of Trust and Reliance on Automation
  2.3 Hoff and Bashir's (2015) Three-Layered Automation Trust Model
  2.4 Related Work
  2.5 Problem Statement
3 Overall Method
  3.1 Operationalization of Automation Trust
  3.2 Designing for Appropriate Automation Trust
4 Aims and Objectives
5 Effects of Take-Over Requests and Cultural Background on Automation Trust in Conditional Automated Driving Systems
  5.1 Introduction
  5.2 Method
  5.3 Results
  5.4 Discussion
6 Gaze Behavior as a Measure of Automation Trust During Conditional Automated Driving
  6.1 Introduction
  6.2 Method
  6.3 Results
  6.4 Discussion
7 Effects of Prior Familiarization with Take-Over Requests during Conditional Automated Driving on Take-Over Performance and Automation Trust
  7.1 Introduction
  7.2 Method
  7.3 Results
  7.4 Discussion
8 Effects of Purpose, Process and Performance Information about Conditional Automated Driving Systems on Automation Trust and Perceived Usability
  8.1 Introduction
  8.2 Method
  8.3 Results
  8.4 Discussion
9 General Discussion
  9.1 Overall Findings
  9.2 Theoretical Implications
  9.3 Practical Application
  9.4 Methodological Considerations
  9.5 Further Research
10 Conclusion
Bibliography
Appendix

List of Figures

2.1 Appropriateness of automation trust as a function of calibration, resolution, and automation capability.
2.2 Levels of driving automation as defined by the Society of Automotive Engineers (2014).
2.3 Lee and See's (2004) conceptual model of the dynamic process that governs trust and its effect on reliance.
2.4 Hoff and Bashir's (2015) three-layered model of factors that influence trust in automation.
2.5 Schematic time course of automation operation and its effects on operator trust and reliance.
5.1 Static driving simulator setup.
5.2 Driving simulator interior view.
5.3 Single-item trust ratings during experimental session.
5.4 Automation trust scale ratings before and after the experimental session.
6.1 Definition of areas of interest (AOIs) used to investigate participants' glance behavior.
6.2 Schematic sequence of events during the experimental session.
6.3 Correlation between dispositional self-reported automation trust and monitoring frequency averaged for each participant over non-driving-related tasks.
6.4 Correlations between participants' self-reported automation trust and monitoring frequency of the automation during each non-driving-related task.
6.5 Mean self-reported automation trust (A, showing standard error bars), median monitoring frequency (B), and median monitoring ratio (C) during non-driving-related tasks.
7.1 Average take-over time (± SE) as a function of familiarization condition and take-over situation.
7.2 Average minimum time to collision (± SE) as a function of familiarization condition and take-over situation. ttc = time to collision.
7.3 Average maximum resulting acceleration (± SE) as a function of familiarization condition and take-over situation, based on 20% trimmed means.
7.4 Average subjective criticality (± SE) as a function of familiarization condition and take-over situation.
7.5 Average automation trust (± SE) before and after the experimental session for each familiarization condition.
8.1 The (a) baseline, (b) performance, and (c) process display concept implemented in the current (third) study.
8.2 Average manipulation check scores (± SE) on the safety system item and comfort system item broken down by the two conditions, based on 20% trimmed means.
8.3 Average automation trust questionnaire scores (± SE) broken down by display concepts and conditions.
8.4 Average SUS scores (± SE) for the three display concepts across both conditions, based on 20% trimmed means.
8.5 Correlations between automation trust questionnaire ratings and SUS scores, broken down by display concepts and conditions.

List of Tables

5.1 Composition of the Overall Sample of the First Study, Broken Down by Nationality.
6.1 Correlations Between Self-Reported Automation Trust, Monitoring Frequency and Monitoring Ratio During Non-Driving-Related Tasks.
8.1 Composition of the Sample of the Third Study, Broken Down by Condition.
A.1 Automation Trust Questionnaire Items.

Abbreviations

ACC Active Cruise Control
ADAS Advanced Driver Assistance System
BASt Bundesanstalt für Straßenwesen
NHTSA National Highway Traffic Safety Administration
NTSB National Transportation Safety Board
SAE Society of Automotive Engineers
SuRT Surrogate Reference Task
SUS System Usability Scale
TOR Take-Over Request
TTC Time to Collision
VAS Visual Analogue Scale

1 Introduction

Automated driving has received a lot of public attention lately, manifesting itself for example in a lead article published in Time Magazine (Vella & Steinmetz, 2016). However, only a few people to date have actually had the chance to experience such systems, which are largely still under development. To still get an idea of the concept of automated driving, a little thought experiment might provide some perspective: Imagine you are driving down the highway in your shiny new car and switch on its automated driving system for the first time. Your car now drives completely on its own, and does not even need to be monitored. As long as you are not notified by the system, you could use the freed up driving time to attend to non-driving-related tasks such as using your car's onboard infotainment system, making a phone call, or simply relaxing, but the question is — would you?

At the time of this writing, the introduction of SAE Level 3 (conditional) automated driving systems such as the one just described is expected to be only a few years away (Verband der Automobilindustrie e.V., 2015). To what extent drivers will trust these systems, however, has been much less clear. When this thesis was conceived, there was already a large body of evidence indicating that trust in automation (subsequently automation trust) is one of the most important variables for enabling successful human-automation collaboration (for an overview, see Lee & See, 2004; Hoff & Bashir, 2015). At the same time, there was no published research investigating the role of drivers' automation trust in the — very peculiar — context of conditional automated driving. Motivated by this gap in the literature, the current thesis had two main objectives: Ultimately, the present research was intended to identify, implement and prototypically evaluate potential approaches to promote appropriate automation trust in conditional automated driving systems. This, in turn, also required viable methods for measuring drivers' automation trust in the context of conditional automated driving. For these purposes, three driving simulator studies were conducted. Using this database, the current thesis has produced two published and two submitted peer-reviewed manuscripts that investigate the operationalization and design of automation trust in the context of conditional automated driving:

Manuscript 1 – Hergeth, S., Lorenz, L., Krems, J. F., & Toenert, L. (2015). Effects of take-over requests and cultural background on automation trust in highly automated driving. Proceedings of the 8th International Driving Symposium on Human Factors in Driver Assessment, Training and Vehicle Design, 331–337.

Manuscript 2 – Hergeth, S., Lorenz, L., Vilimek, R., & Krems, J. F. (2016). Keep your scanners peeled: Gaze behavior as a measure of automation trust during highly automated driving. Human Factors, 58, 509–519. doi:10.1177/0018720815625744

Manuscript 3 – Hergeth, S., Lorenz, L., & Krems, J. F. (2015). What did you expect? Effects of prior familiarization with take-over requests during conditional automated driving on take-over performance and automation trust. Manuscript submitted for publication.

Manuscript 4 – Hergeth, S., Lorenz, L., Krems, J. F., & Broy, N. (2016). Effects of purpose, process and performance information about conditional automated driving systems on automation trust and perceived usability. Manuscript submitted for publication.

These four manuscripts are the foundation for the following dissertation, which is intended to integrate the present research into the broader literature. This includes an overview of the theoretical background (see Chapter 2) as well as the overall method of this work (see Chapter 3), its general aims and objectives (see Chapter 4), and a general discussion (see Chapter 9). The specific motivation, method and results of each study are discussed separately in the respective chapters pertaining to each of the four manuscripts (see Chapter 5 – Chapter 8).

2 Theoretical Background

In a broad sense, automated driving systems can be defined as "the hardware and software that is collectively capable of performing all aspects of the dynamic driving task" necessary to operate a vehicle in road traffic (Society of Automotive Engineers, 2014, p. 5). Accordingly, automated driving systems allow a certain degree of vehicle control to be transferred from the driver to the vehicle (Trimble, Bishop, Morgan, & Blanco, 2014). By assisting, augmenting or even replacing the driver, this shift has been associated with enormous potential for benefits such as improvement of driving safety (Hummel, Kuehn, Bende, & Lang, 2011; Meyer & Deix, 2014; Trimble et al., 2014) and driving comfort (Payre, Cestac, & Delhomme, 2014; Strand, Nilsson, Karlsson, & Nilsson, 2014). Automated driving has also been linked with additional societal benefits like improved productivity, reduced infrastructure needs and enhanced energy efficiency (Beggiato et al., 2015; KPMG International and the Center for Automotive Research, 2012).

However, such prospective benefits could be compromised if drivers trust automated driving systems inappropriately: People tend to rely on automation they trust and reject automation they do not trust. Automation trust can best be understood as "the attitude that an agent will help achieve an individual's goals in a situation characterized by uncertainty and vulnerability" (Lee & See, 2004, p. 51). As automation exceeds the understanding of the operator, and particularly in complex and unpredictable situations, trust plays a key role for successful human-automation interaction (Lee & See, 2004). Under such conditions, automation trust has been identified as one of the most important factors shaping users' willingness to rely on automation (Lee & See, 2004, p. 51): "By guiding reliance, trust helps to overcome the cognitive complexity people face in managing increasingly sophisticated automation". Accordingly, drivers might tend to reject a reliable automated driving system if they think it is not trustworthy or rely excessively on an automated driving system without recognizing its limitations (Lee & See, 2004; see Figure 2.1). Parasuraman and Riley (1997) described these respective states as disuse and misuse, both of which can be detrimental to the usefulness of automation.


For example, train operators were found to continue disabling automated speed warnings even after a fatal accident occurred near Baltimore in 1987, whereas a 1994 plane crash near Columbus, Ohio, was attributed to the pilot’s heavy reliance on an autopilot system that led him to neglect monitoring his aircraft’s airspeed (Parasuraman & Riley, 1997).

Figure 2.1. Appropriateness of automation trust as a function of calibration, resolution, and automation capability (trustworthiness). Adapted from Lee and See (2004). The figure contrasts calibrated trust (trust matches system capabilities, leading to appropriate use) with overtrust (trust exceeds system capabilities, leading to misuse) and distrust (trust falls short of system capabilities, leading to disuse), and good resolution (a range of system capability maps onto the same range of trust) with poor resolution (a large range of system capability maps onto a small range of trust).

Therefore, facilitating appropriate automation trust (i.e., trust matching the true capabilities of the automation) is essential to prevent misuse and disuse of automated driving systems and exploit their full potential (Lee & See, 2004; see Figure 2.1). However, the practical importance of automation trust for successful collaboration between drivers and automated driving systems — just as for other kinds of human-automation interaction (Lee & See, 2004; Hoff & Bashir, 2015; see also Chapter 9) — also depends on the respective automation level of a particular automated driving system.

2.1 Taxonomies of Automated Driving Systems

In addition to general automation taxonomies (e.g. Parasuraman, Sheridan, & Wickens, 2000; Riley, 1989; Sheridan & Verplank, 1978), several domain-specific taxonomies of automated driving have been proposed (for a comprehensive overview, see Trimble et al., 2014). Among specific taxonomies of automated driving, the most prominent ones were devised by the German Bundesanstalt für Straßenwesen (BASt; Gasser et al., 2012), the American National Highway Traffic Safety Administration (NHTSA; National Highway Traffic Safety Administration, 2013), and the Society of Automotive Engineers (SAE; Society of Automotive Engineers, 2014).

Figure 2.2. Levels of driving automation as defined by the Society of Automotive Engineers (Society of Automotive Engineers, 2014). Copyright 2014 SAE International. The figure reproduces SAE International's summary of the J3016 levels of driving automation for on-road vehicles (issued January 2014), which provides a common taxonomy and definitions for automated driving systems in order to simplify communication and facilitate collaboration within technical and policy domains. The six levels span from no automation to full automation; a key distinction is between Level 2, where the human driver performs part of the dynamic driving task, and Level 3, where the automated driving system performs the entire dynamic driving task. The levels are descriptive rather than normative and technical rather than legal, imply no particular order of market introduction, and indicate minimum rather than maximum system capabilities for each level. In brief:

Level 0 (No Automation): the full-time performance by the human driver of all aspects of the dynamic driving task, even when enhanced by warning or intervention systems (execution, monitoring and fallback: human driver).
Level 1 (Driver Assistance): the driving mode-specific execution by a driver assistance system of either steering or acceleration/deceleration, with the expectation that the human driver performs all remaining aspects of the dynamic driving task, monitors the driving environment and provides the fallback.
Level 2 (Partial Automation): the driving mode-specific execution by one or more driver assistance systems of both steering and acceleration/deceleration, with the human driver still monitoring the driving environment and providing the fallback.
Level 3 (Conditional Automation): the driving mode-specific performance by an automated driving system of all aspects of the dynamic driving task, including monitoring of the driving environment, with the expectation that the human driver will respond appropriately to a request to intervene.
Level 4 (High Automation): the driving mode-specific performance by an automated driving system of all aspects of the dynamic driving task, even if a human driver does not respond appropriately to a request to intervene.
Level 5 (Full Automation): the full-time performance by an automated driving system of all aspects of the dynamic driving task under all roadway and environmental conditions that can be managed by a human driver.

Key definitions in J3016 include, among others, the dynamic driving task (the operational and tactical, but not strategic, aspects of driving), the driving mode (a type of driving scenario with characteristic dynamic driving task requirements), and the request to intervene (the notification by the automated driving system that the human driver should promptly begin or resume performance of the dynamic driving task).

Building on earlier work, the SAE Surface Vehicle Information Report J3016 (Society of Automotive Engineers, 2014) provides the most recent and extensive taxonomy among those three. Focusing on higher levels of driving automation, it details the range of automation levels in on-road motor vehicles and contains comprehensive definitions of automated driving systems as well as related terms (see Figure 2.2). In addition, this taxonomy also includes a detailed breakdown of the roles of the human driver and system at different levels of driving automation (see Figure 2.2). These levels are descriptive and not normative (Society of Automotive Engineers, 2014), but can help greatly to devise a common language.1

Figure 2.2 shows that starting with SAE Level 3 (conditional) automated driving, the driver is no longer required — and thus, cannot be expected (Society of Automotive Engineers, 2014) — to monitor the system during the time it executes the dynamic driving task. However, when using conditional automated driving systems the driver is still relied upon to provide fallback performance of the driving task after being prompted by the system with sufficient lead time (Society of Automotive Engineers, 2014). In contrast, the system must be able to provide fallback performance of the dynamic driving task on its own during higher level automated driving (see SAE Level 4 and 5; Figure 2.2). From a human factors perspective, these boundary conditions make conditional automated driving particularly challenging: On the one hand, conditional automated driving systems usher in a paradigm shift that is connected with enormous potential benefits for driving safety, comfort and efficiency (see Chapter 1). On the other hand, inappropriate automation trust leading to disuse and misuse might compromise the usefulness of such systems (see Chapter 2). Accordingly, fostering appropriate automation trust in conditional automated driving systems should be one of the key concerns for designing such systems. Considering automation trust as a variable in the design of vehicle technology, however, also requires that automation trust can be viably measured (Walker, Stanton, & Salmon, 2016). Unfortunately, these demands are paralleled by a lack of dedicated research both on how automation trust in conditional automated driving systems could be measured and how it could be facilitated. Against this background, the research focus of the present work was further narrowed down by theoretical considerations drawn primarily from two established models of automation trust. These models were Lee and See's (2004) dynamic model of trust and reliance on automation and Hoff and Bashir's (2015) three-layered trust model.

1 The previously published versions of Manuscript 1 and Manuscript 2 were based on the BASt taxonomy (Gasser et al., 2012), which was and still is frequently used in the German language. According to the Society of Automotive Engineers (2014), the SAE levels definitely correspond to BASt levels (Gasser et al., 2012) and approximately to NHTSA levels (National Highway Traffic Safety Administration, 2013). For the aforementioned reasons, SAE terminology was used in the submitted versions of Manuscript 3 and Manuscript 4 and is used without exception throughout this thesis.
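To make the role allocation summarized in Figure 2.2 concrete, the following minimal sketch (in Python; an illustration added for this summary, not part of the SAE report or of the original studies) encodes, for each SAE level, whether the driver or the system executes the dynamic driving task, monitors the driving environment, and provides the fallback. It makes the step at Level 3 easy to see: the monitoring duty moves to the system while the fallback duty stays with the driver.

# Illustrative sketch of the SAE J3016 role allocation condensed from Figure 2.2.
SAE_LEVELS = {
    0: ("No Automation",          "driver",        "driver", "driver"),
    1: ("Driver Assistance",      "driver+system", "driver", "driver"),
    2: ("Partial Automation",     "system",        "driver", "driver"),
    3: ("Conditional Automation", "system",        "system", "driver"),
    4: ("High Automation",        "system",        "system", "system"),
    5: ("Full Automation",        "system",        "system", "system"),
}

def driver_must_monitor(level):
    # True if the human driver still has to monitor the driving environment.
    name, execution, monitoring, fallback = SAE_LEVELS[level]
    return monitoring == "driver"

def driver_is_fallback(level):
    # True if the human driver is expected to respond to a request to intervene.
    name, execution, monitoring, fallback = SAE_LEVELS[level]
    return fallback == "driver"

# Level 3 (conditional automation): no monitoring duty, but the driver remains the fallback.
assert not driver_must_monitor(3) and driver_is_fallback(3)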


2.2 Lee and See's (2004) Dynamic Model of Trust and Reliance on Automation

Based on an extensive literature review, Lee and See (2004) proposed a dynamic model of automation trust and reliance that describes the factors influencing automation trust and how automation trust mediates reliance on automation (see Figure 2.3). Of particular importance, this model illustrates that there is a dynamic interaction between the automation, interface, operator and context guiding trust and its effect on behavior (Lee & See, 2004). According to the model, trust in combination with other attitudes generates an intention to rely on automation. Whether this intention to rely then translates to actual reliance on the automation additionally depends on context factors such as operator workload or time constraints (see Figure 2.3). In other words, automation trust guides reliance, but does not determine it. Importantly, automation trust is formed not only through analytic and analogical processes, but in large part through affective processes (see also Chapter 6). Lee and See (2004) emphasize three critical elements of their framework:

The closed-loop dynamics of trust and reliance. Automation trust and the influence of automation trust on automation reliance are part of a closed-loop feedback process (see Figure 2.3). This means that automation trust is guided by interaction with the automation, and interaction with the automation is in turn guided by automation trust.

The importance of context on trust and on mediating the effect of trust on reliance. Reliance on automation is guided by automation trust, but depends on many additional factors (see Figure 2.3). In addition, automation trust itself — just as the capability of automation — is also influenced by the context in which the automation is used.

The role of information display on developing appropriate trust. Automation trust develops through interpretations of information about automation. When direct observation of automation is difficult, information about the automation is usually relayed by a display. Accordingly, automation trust calibration strongly depends on the content and format of such displays (see Figure 2.3).


Figure 2.3. Lee and See's (2004) conceptual model of the dynamic process that governs trust and its effect on reliance. The model traces a closed loop from information assimilation and belief formation through trust evolution and intention formation to reliance action, embedded in individual, organizational, cultural and environmental context factors (e.g. reputation, predisposition to trust, workload, perceived risk, self confidence, time constraints). The display relays information about the automation at different levels of attributional abstraction (purpose, process and performance) and levels of detail (system, function, sub-function, mode), which shapes the appropriateness of trust (calibration, resolution, temporal/functional specificity).

Transfer to Conditional Automated Driving Systems

For this thesis, Lee and See's (2004) model provided the basic framework to derive approaches for both measuring and designing automation trust in the context of conditional automated driving systems. On the one hand, the clear conceptual differentiation between automation trust as an attitude and automation reliance as a behavior suggests that automation trust can either be assessed directly with self-report measures or inferred indirectly from users' behavior. At the same time, the model illustrates that the strength of the connection between self-report measures of automation trust and behavioral indicators of automation trust will vary depending on a multitude of intervening factors (see Figure 2.3). Above that, considering automation trust and automation reliance as part of a closed-loop dynamic process shows that automation trust calibration can best be investigated with longitudinal study designs. Finally, the central role displaying information about automation plays in the model provided a starting point for the design of drivers' automation trust. In particular, this encompassed investigating the role of the three different components of attributional abstraction identified in the framework (see Figure 2.3 and Chapter 8).

2.3 Hoff and Bashir's (2015) Three-Layered Automation Trust Model

Building on the work of Lee and See (2004), Hoff and Bashir (2015) integrated more recent research on automation trust published between 2002 and 2013. Their review of 127 studies provided the basis for a three-layered model of factors affecting trust in automation that complements Lee and See's (2004) framework (see Figure 2.4).

Figure 2.4. Hoff and Bashir's (2015) three-layered model of factors that influence trust in automation. The model distinguishes factors operating prior to interaction (dispositional trust, shaped by culture, age, gender and personality traits; situational trust, shaped by internal and external variability; and initial learned trust, based on preexisting knowledge and an initial reliance strategy) from factors operating during interaction (dynamic learned trust, shaped by system performance and design features), which together with situational factors not related to trust determine reliance on the system.

A central component of this model is the identification of three different layers of automation trust (Hoff & Bashir, 2015): Dispositional trust, situational trust, and learned trust. Dispositional trust describes an individual's enduring overall tendency to trust automation. The context-dependent component of automation trust, on the other hand, is manifested in situational trust. Finally, learned trust represents users' past experiences with automation (see Figure 2.4). To illustrate the interactive nature of human-automation interaction, the authors further distinguish between initial learned trust and dynamic learned trust. Whereas dynamic learned trust reflects the variable performance of automation, initial learned trust describes automation trust in advance of interaction with a system (see Figure 2.4). Similar to Lee and See's (2004) model, automation trust and reliance are conceptualized as part of a closed-loop process involving a multitude of mediating and moderating factors (see Figure 2.4).

Transfer to Conditional Automated Driving Systems

For the present thesis, the main input derived from this model was the subdivision of automation trust into the three postulated layers. Complementary to Lee and See's (2004) model, this allowed more detailed hypotheses about the relationships between automation trust as an attitude and actual automation reliance in the form of driver behavior during the use of conditional automated driving systems. This distinction was particularly valuable for investigating the connection between self-report measures and behavioral measures of automation trust that formed the main topic of Manuscript 2 (see Chapter 6).

2.4 Related Work

With the basic framework for the current thesis laid out in these two models, earlier research and related work on automation trust provided another important source of inspiration for how the theoretical factors and processes postulated in the models could be applied to the context of conditional automated driving.

Earlier Research on Automation Trust

Apart from the automotive domain, substantial research on human-automation interaction has been conducted in the areas of aviation and unmanned vehicles, rail, and process control systems (for an extensive overview, see Trimble et al., 2014). Summarizing earlier findings, Wickens and Xu (2002) schematically illustrated the time course of automation use and the development of automation trust and automation reliance (see Figure 2.5): During initial system use, operators' automation trust typically increases until an automation failure occurs (Wickens, Hollands, Banbury, & Parasuraman, 2013; see Figure 2.5). This first failure of automation can have particularly grave consequences and often leads to reduced automation trust and reliance, which is why this effect has a special importance for investigations of human-automation interaction (Lee & Moray, 1992; Merlo, Wickens, & Yeh, 1999). Subsequently, automation trust and reliance tend to recover over time, even when further automation failures occur. In the long run, automation trust and reliance should settle at a steady state that more or less reflects the actual reliability of the automation. Importantly, the effects of faults on automation trust and automation reliance as well as the recovery of automation trust and automation reliance are not instantaneous, but are spread over time (Lee & Moray, 1992; Lee & See, 2004, see Figure 2.5).
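As a purely illustrative aid (a toy model with arbitrary, assumed rate constants; it is not part of the cited work or of this thesis), the qualitative pattern in Figure 2.5 can be sketched with a simple update rule: trust drifts toward a steady state that reflects the automation's reliability, drops abruptly when a failure is experienced, and then recovers gradually over further experience.

# Toy sketch of the trust time course in Figure 2.5 (assumed parameters, illustrative only).
def simulate_trust(n_steps=100, failures=(30, 60, 80), reliability=0.9,
                   prior=0.5, gain=0.08, failure_penalty=0.35):
    trust = prior                                       # prior beliefs and expectations
    trace = []
    for step in range(n_steps):
        if step in failures:
            trust = max(0.0, trust - failure_penalty)   # abrupt drop at a failure
        else:
            trust += gain * (reliability - trust)       # gradual drift toward a steady state
        trace.append(trust)
    return trace

trace = simulate_trust()
print("after first failure: %.2f, near steady state: %.2f" % (trace[30], trace[-1]))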

Figure 2.5. Schematic time course of automation operation and its effects on operator trust and reliance. F = failure of automation. Adapted from Wickens and Xu (2002). Starting from prior beliefs and expectations, trust and reliance rise during failure-free operation, drop markedly at the first failure, and gradually approach a steady state over further experience despite additional failures.

In principle, this pattern might also transfer to drivers' automation trust calibration in the context of conditional automated driving systems. However, there are some fundamental differences in operational characteristics that limit the applicability of earlier findings on human-automation interaction to conditional automated driving systems (Trimble et al., 2014). On the one hand, research in other fields such as aviation, rail, and process control typically involves highly trained professionals that perform supervision tasks in a strictly regulated work environment (Lewandowsky, Mundy, & Tan, 2000). In contrast, drivers can vary more widely in skill level and training, just as the context in which automated driving systems are used might range anywhere from recreation to commercial applications. For example, people might drive a car just for the sake of driving pleasure, but are far less likely to perform prolonged process control tasks for fun. Above that, drivers, unlike pilots or train operators, are not required and therefore cannot be expected to supervise their vehicle during conditional automated driving (Society of Automotive Engineers, 2014; see Figure 2.2). On the other hand, earlier studies on automation trust in the automotive domain have mostly assessed variations of active cruise control (ACC; e.g. Beggiato & Krems, 2013; Cahour & Forzy, 2009; Kazi, Stanton, Walker, & Young, 2007; Rajaonah, Anceaux, & Vienne, 2006; Rudin-Brown & Parker, 2004; Verberne, Ham, & Midden, 2012), collision warning systems (e.g. Abe & Richardson, 2006; Bliss & Acton, 2003; Koustanai, Cavallo, Delhomme, & Mas, 2012; Lees & Lee, 2007), or other lower level automation such as partially automated driving systems (SAE Level 2; e.g. Helldin, Falkman, Riveiro, & Davidsson, 2013; Koo et al., 2014; Llaneras, Salinger, & Green, 2013; Naujoks, Purucker, Neukum, Wolter, & Steiger, 2015). Although these studies might seem closely related on the surface, such lower level automation operates under completely different premises than conditional automated driving systems. Most importantly, the requirement to constantly monitor such systems again creates incomparable boundary conditions that necessitate new methods and designs to investigate the measurement and design of drivers' trust in automated driving systems.

Research on Automation Trust in the Context of Conditional Automated Driving

Accordingly, only very little experimental research so far has specifically addressed the role of automation trust in the context of conditional automated driving. In the following, some studies are described that were published during the work on this thesis and are of particular importance for the discussion of the present results.

In a combined focus group and driving simulator study, Beggiato et al. (2015) found evidence indicating that, compared to manual driving, drivers expressed different information needs in the context of partial and conditional automated driving. The results also showed that variance in participants' information needs could mainly be explained by trust in automation. The authors concluded that with respect to automated driving, drivers should be provided with information "primarily focused on the status, transparency and comprehensibility of system action in contrast to driving-task related information during manual driving" (Beggiato et al., 2015, p. 1). This work is taken up in Chapter 8 to discuss the effects of providing drivers with different kinds of information about automated driving systems.

Gold, Körber, Hohenberger, Lechner, and Bengler (2015) reported the results of a driving simulator study designed to investigate whether experiencing an automated driving system would alter drivers' automation trust and evaluation of the system. Comparing participants' automation trust before and after experiencing take-over scenarios during a period of conditional automated driving revealed that experiencing the system led to significantly increased automation trust. The results also showed that, contrary to the authors' initial expectations, participants' automation trust could not be inferred from their horizontal gaze behavior. This research is primarily related to the investigation of drivers' gaze behavior as a measure of automation trust during conditional automated driving reported in Chapter 6 and put into context in Chapter 9.

Similar to the line of thought described in Chapter 7, Payre, Cestac, and Delhomme (2015) investigated the effects of practice, automation trust, and prior interaction with a conditional automated driving system on drivers' take-over times after system-initiated take-over requests. The results of a driving simulator study revealed a positive correlation between automation trust and reaction times when participants received simple practice on the system, but not when they received elaborate practice prior to the experimental session. Thus, Payre et al. (2015) suggested training drivers on automated driving systems before initial use in order to avoid inadvertent negative effects of conditional automated driving systems. The importance of their findings in light of the current thesis is addressed in Chapter 9.

2.5 Problem Statement

Taken together, strong theoretical and empirical evidence suggests that automation trust will be a key component for successful cooperation between drivers and automated driving systems. Among those systems, conditional automated driving systems are — from a psychological perspective — particularly challenging: While the driver is not required to monitor conditional automated driving systems, he / she is still expected to provide fallback performance of the dynamic driving task after prior notification. Thus, beneficial use of conditional automated driving systems can be facilitated by appropriate automation trust calibration, which is best understood as a dynamic process that fluctuates during interaction with a system. Drawing on earlier work that can only be transferred with limitations to automated driving, one promising approach to facilitate appropriate automation trust might be the kind of information about conditional automated driving systems drivers are provided with. Considering the role of automation trust as a variable in the design of vehicle technology, however, also requires that drivers' automation trust can be viably measured.


3 Overall Method

Overall, three studies forming the data set for this thesis were conducted using fixed-base (first study and second study) and moving-base (third study) driving simulators. In total, N = 280 people participated in these studies. Consistent with the dynamic processes postulated in the underlying theoretical models (see Figure 2.3 and Figure 2.4), all studies were designed as mixed between-within designs (Field & Hole, 2002). Methodological details of each study are given in the corresponding chapters (see Chapter 5 – Chapter 8). An overall discussion of the methodology is provided in Chapter 9.4.

Manuscript 1 (see Chapter 5) was based on the overall sample of the first study, whereas only a subsample from the first study was considered for Manuscript 2 (see Chapter 6). The overall sample in the first study was composed in equal shares of German BMW Group employees and Chinese BMW Group employees. While this sample was adequate for a cross-cultural investigation of self-reported automation trust and driving behavior, two major obstacles precluded a combined evaluation of glance behavior with a joint sample of German and Chinese drivers: Most importantly, the driving scene in the study comprised a standard German three-lane highway. This seems to have had a major impact on Chinese participants' gaze behavior, as they were familiar neither with German highways and landscape nor with German traffic conditions. Above that, although German and Chinese participants did not differ in any major demographic variables such as age, gender, or education that would render them incomparable per se (see Chapter 5), additional analyses revealed significant differences between participants in the number of years they had held a driver's license as well as in driving experience. Both of these variables have been shown to have a substantial impact on gaze behavior during driving (e.g. Chapman & Underwood, 1998; Mourant & Rockwell, 1972). Thus, it was decided to limit the analyses of glance behavior to the German subsample, for which the driving scenario was originally designed.

Manuscript 3 (see Chapter 7) and Manuscript 4 (see Chapter 8) were based on the second study and third study conducted within this thesis, respectively. The approaches employed for measuring and designing trust in conditional automated driving systems across the three studies are outlined in the next two subsections.
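To illustrate what such a mixed between-within design implies for the analysis, the sketch below (with hypothetical data and group labels; it is not the analysis code used in the studies) compares automation trust ratings collected before and after an experimental session (within-subject factor) across two experimental groups (between-subject factor), using paired tests within each group and a test on the pre-post difference scores as a check of the interaction.

# Schematic analysis of a mixed between-within design (hypothetical data, illustrative only).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 40
# Trust rated before and after the session in two hypothetical between-subject groups.
pre_a, post_a = rng.normal(4.0, 0.8, n), rng.normal(4.6, 0.8, n)
pre_b, post_b = rng.normal(4.0, 0.8, n), rng.normal(4.2, 0.8, n)

print(stats.ttest_rel(post_a, pre_a))                     # within-subject change, group A
print(stats.ttest_rel(post_b, pre_b))                     # within-subject change, group B
print(stats.ttest_ind(post_a - pre_a, post_b - pre_b))    # group difference in the change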

3.1 Operationalization of Automation Trust

In their literature review, Hoff and Bashir (2015) distinguished between trusting behaviors and self-report measures as indicators of automation trust. Of the 127 studies they reviewed, 62% employed both self-report measures and behavioral measures of automation trust. Similarly, a multimethod approach was chosen to investigate the operationalization of drivers' automation trust in conditional automated driving systems that included variations of self-report measures as well as behavioral measures.

Self-Report Measures of Automation Trust

Automation trust is usually considered a purely psychological construct that can only be assessed directly with self-report measures (Wickens et al., 2013; Wickens & Xu, 2002). Such self-report measures can be considered valid indicators of users' automation trust if they are properly constructed (Moray & Inagaki, 1999). Therefore, self-report measures of participants' automation trust in the form of questionnaires (see Chapter 5, Chapter 7 and Chapter 8) were employed in all three studies and served as the point of reference for participants' automation trust. This approach was further diversified by using two different questionnaires: Whereas a frequently used (Hoff & Bashir, 2015) automation trust questionnaire proposed by Jian, Bisantz, and Drury (2000) was employed in the first study (see Chapter 5), a more recent questionnaire developed on the basis of earlier instruments by Chien, Semnani-Azad, Lewis, and Sycara (2014) was employed in the second (see Chapter 7) and third study (see Chapter 8). The respective strengths and weaknesses of these two questionnaires observed in the current studies are discussed in Chapter 9.4. A complete version of Chien, Semnani-Azad, et al.'s (2014) questionnaire is reproduced in the appendix of this thesis (see Appendix).

In addition to these questionnaires, single-item automation trust ratings (i.e. individual items directly asking for participants' automation trust) were used to assess participants' self-reported automation trust (see Chapter 5 and Chapter 6). Earlier research (e.g. Brown & Noy, 2004; Master et al., 2005) indicated that single-item automation trust ratings may provide a valid and economical alternative to more extensive automation trust questionnaires. In addition, single-item automation trust ratings seemed particularly valuable for capturing short-term and temporary changes in users' automation trust. For such applications, a continuous measurement of automation trust with questionnaires is neither practical nor desirable. For the present research, these single-item automation trust ratings were therefore used to complement automation trust questionnaires with a more continuous assessment of automation trust during conditional automated driving. The implementation of these single-item automation trust ratings is discussed in Chapter 9.
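As a concrete illustration of how such a questionnaire can be turned into a single score, the sketch below averages Likert-type item responses after reverse-coding negatively worded (distrust) items. The item count, scale range and reverse-coded item numbers are assumptions for the example, not the exact scoring rules of the instruments used in these studies.

# Hypothetical scoring of a self-report automation trust questionnaire (illustrative only).
def trust_score(responses, reverse_items, scale_min=1, scale_max=7):
    # responses: mapping of item number to Likert rating; distrust items are reverse-coded.
    total = 0.0
    for item, value in responses.items():
        if item in reverse_items:
            value = scale_min + scale_max - value
        total += value
    return total / len(responses)

ratings = [2, 3, 1, 2, 4, 6, 5, 6, 7, 5, 6, 6]                # assumed 12 items, rated 1-7
responses = {i + 1: r for i, r in enumerate(ratings)}
print(trust_score(responses, reverse_items={1, 2, 3, 4, 5}))  # items 1-5 assumed distrust-worded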

Behavioral Measures of Automation Trust

Alternatively, users' automation trust can also be derived from observable behavior (Wickens et al., 2013). This assumption is reflected in the connection between automation trust and automation reliance in the models of both Lee and See (2004) and Hoff and Bashir (2015). By their very nature, behavioral measures of automation trust are less intrusive than self-report measures and, accordingly, would be better suited for real-world applications. Above that, behavioral measures of automation trust can in principle be collected continually and are thus probably better suited than questionnaires to capture momentary changes in users' automation trust. However, behavioral automation trust measures that require active interaction with automation (see Chapter 5 for a discussion of this problem) may not be universally applicable to the domain of automated driving systems. Following these considerations, behavioral measures of automation trust investigated in the current studies included drivers' reactions to take-over requests as well as their usage behavior (see Chapter 5), gaze behavior (see Chapter 6) and take-over performance (see Chapter 7).
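To make the gaze-based measures referred to above more tangible, the following sketch (with assumed area-of-interest names and made-up numbers; it does not reproduce the actual processing pipeline of these studies) derives a monitoring frequency (monitoring glances per minute) and a monitoring ratio (share of task time spent on monitoring-related areas of interest) from a list of glance events recorded during a non-driving-related task, and relates monitoring frequency to self-reported trust with a rank correlation.

# Illustrative derivation of gaze-based monitoring measures (assumed AOIs and data).
import numpy as np
from scipy.stats import spearmanr

MONITORING_AOIS = {"road", "instrument_cluster"}            # assumed monitoring-related AOIs

def monitoring_measures(glances, task_duration_s):
    # glances: list of (aoi_name, glance_duration_s) recorded during one task.
    durations = [d for aoi, d in glances if aoi in MONITORING_AOIS]
    frequency = len(durations) / (task_duration_s / 60.0)   # monitoring glances per minute
    ratio = sum(durations) / task_duration_s                # share of task time spent monitoring
    return frequency, ratio

# Hypothetical per-participant data: higher trust tends to go with less monitoring.
trust = np.array([3.5, 4.0, 4.5, 5.0, 5.5, 6.0])
monitoring_frequency = np.array([12.0, 10.5, 9.5, 7.5, 8.0, 4.0])
rho, p = spearmanr(trust, monitoring_frequency)
print("Spearman rho = %.2f, p = %.3f" % (rho, p))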

Relationship between Self-Report Measures and Behavioral Measures of Automation Trust

As illustrated above, both earlier research and theoretical considerations suggest a connection between self-report measures and behavioral measures of automation trust (see Chapter 2). The relationships between different self-report measures and behavioral measures of automation trust investigated in this thesis are discussed in detail in Chapter 5 and Chapter 6. An overall discussion of the automation trust measures investigated in this thesis is provided in Chapter 9.


3.2 Designing for Appropriate Automation Trust Based on the central role displaying information about automation plays for users’ automation trust in both Lee and See’s (2004; see Chapter 2.2 and Figure 2.3) as well as Hoff and Bashir’s (2015; see Chapter 2.3 and Figure 2.4) model, displaying information about conditional automated driving systems was investigated as an exemplary approach to design drivers’ automation trust. Building on Lee and Moray’s (1992) categorization, Lee and See (2004) posited that automation trust can be derived from information about the purpose, process and performance of automation (see Figure 2.3). With this in mind and the theoretical background to investigate automation trust in the context of conditional automated driving laid out in the first and second study, two prototypical display concepts meant to promote drivers’ automation trust in conditional automated driving systems and a baseline concept were devised, implemented and tested against each other in the third study of this thesis (see Chapter 8). The content and format of the two experimental displays was designed in reference to ecological interface design (EID), which is a comparably new approach for designing interfaces (Burns & Hajdukiewicz, 2004). According to Burns and Hajdukiewicz (2004), EID can be particularly helpful if: • Asking users does not work • Users are supposed to become experts • Unexpectedness has to be handled EID is focused primarily on the design of complex systems such as medical equipment (Burns & Hajdukiewicz, 2004). Accordingly, it also seemed a promising approach for the interface design of conditional automated driving systems. Building on the work of (Salmon, Regan, Lenne, Stanton, & Young, 2007), the displays developed in this thesis aimed to simplify the complex algorithms underlying conditional automated driving systems. In addition, the display concepts were inspired by scarce earlier research conducted on interfaces for automated driving systems (see Chapter 8). Similar concepts have been proposed by various parties involved in the design of automated driving systems (e.g. Helldin et al., 2013; Beller, Heesen, & Vollrath, 2013; for an overview see Manca, Happee, & de Winter, 2015). A different approach focusing on prior familiarization with system limitations of conditional automated driving systems was investigated in the second study (see Chapter


In contrast to displaying information about the conditional automated driving system online, this approach was chosen to investigate the effects of providing drivers with a priori information and training on TORs and to evaluate different approaches that might possibly complement each other.


4 Aims and Objectives
Flowing from these considerations, the first objective of the present thesis was to identify and evaluate different methods for operationalizing automation trust in the context of conditional automated driving. Building on that, the second objective of this thesis was to derive, implement and evaluate potential approaches to promote appropriate automation trust in conditional automated driving systems. For these purposes, a total of three driving simulator studies were conducted:

First Study
The first study was designed to (a) investigate the effects of TORs during conditional automated driving on drivers' automation trust, (b) assess the validity of different self-report and behavioral measures of drivers' automation trust and (c) establish an experimental framework that could be used for the investigation of automation trust in the context of conditional automated driving. In addition, it also served to explore the influence of national differences on drivers' trust in the context of conditional automated driving. This study provided the basis for Manuscript 1 (see Chapter 5) and Manuscript 2 (see Chapter 6).

Second Study
The objective of the second study was to investigate the effects of prior familiarization with TORs during conditional automated driving on drivers' initial take-over performance and automation trust. In addition, this study was used to verify the framework established in the first experiment. This study was the foundation for Manuscript 3 (see Chapter 7).

Third Study
The scope of the third study was to compare the effects of theoretically distinct kinds of information about conditional automated driving systems on drivers' automation trust and subjective system usability, as well as potential interdependencies between these constructs. In this regard, three different display concepts were developed and compared. This study was discussed in Manuscript 4 (see Chapter 8).


5 Effects of Take-Over Requests and Cultural Background on Automation Trust in Conditional Automated Driving Systems

Abstract: Appropriate automation trust is a prerequisite for safe, comfortable and efficient use of conditional automated driving systems. Earlier research indicates that a driver's nationality and TORs due to imperfect system reliability might affect trust, but this has never been investigated in the context of conditional automated driving. A driving simulator study (N = 80) showed that TORs only temporarily lowered trust in conditional automated driving systems, and revealed similarities in trust formation between German and Chinese drivers. Trust was significantly higher after experiencing the system than before, both for German and Chinese participants. However, Chinese drivers reported significantly higher automation mistrust than German drivers. Self-report measures of automation trust were not connected to behavioral measures. The results support a distinction between automation trust and mistrust as separate constructs, short- and long-term effects of TORs on automation trust, and cultural differences in automation trust.

5.1 Introduction
Conditional automated driving systems can increase driving comfort, safety, and efficiency, but only if drivers trust them appropriately. They provide longitudinal and lateral vehicle control in certain conditions and for a limited amount of time (Trimble et al., 2014).

This chapter is based on a previous publication: Hergeth, S., Lorenz, L., Krems, J. F., & Toenert, L. (2015). Effects of take-over requests and cultural background on automation trust in highly automated driving. Proceedings of the 8th International Driving Symposium on Human Factors in Driver Assessment, Training and Vehicle Design, 331–337.


During that time, the driver does not have to monitor the vehicle and can fully engage in non-driving-related tasks. However, the driver is required to take over vehicle control if requested by the system. These TORs are necessary because perfect reliability of conditional automated driving systems is virtually impossible, for example due to the environmental complexity involved or sensor failures. But how are TORs perceived by the driver, and how does that affect trust in conditional automated driving systems?

TORs indicate that the system cannot continue to provide complete vehicle control. On the one hand, they could be interpreted as automation failures. Trust is best understood as a dynamic process, and automation failures typically decrease trust (Lee & See, 2004). On the other hand, drivers might perceive TORs as warnings. This makes an important difference: Lees and Lee (2007) proposed a differentiation of warning types along purpose, process and performance dimensions. In a study of automotive collision warning systems, they showed that intended, comprehensible and useful warnings fostered trust. Studies on TORs (e.g. Gold, Damböck, Lorenz, & Bengler, 2013) have modeled TORs exactly as such accurate warnings. People tend to rely on automation they trust, and disregard automation they do not trust (Chien, Lewis, Semnani-Azad, & Sycara, 2014). So far, the effects of TORs on automation trust are unclear.

Among other variables that might affect automation trust, cultural differences have been frequently cited (Chien, Lewis, et al., 2014). For example, Sherman, Helmreich, and Merritt (1997) found significant cultural differences amongst pilots in attitudes regarding flight deck automation. They hypothesized that pilots from more hierarchical national cultures (e.g. Asian countries) may be more favorable towards automation than pilots from more individualistic societies (e.g. Western countries). This emphasizes that findings regarding trust in automation should be validated when they are transferred from one culture to another (Lee & See, 2004), especially when comparing Western drivers such as Germans to Asian drivers such as Chinese.

The goal of this study was to investigate the effects of TORs and cultural background on automation trust in conditional automated driving systems. We hypothesized (i) that TORs would affect automation trust and (ii) that Chinese drivers would be more inclined to trust conditional automated driving systems than German drivers.


5.2 Method
Participants
Ninety-one employees of the BMW Group voluntarily participated in the study. Due to simulation errors during the experiment, 11 German participants were excluded from the analysis and replaced, resulting in the planned sample size of N = 80 (see Table 5.1).

Table 5.1
Sample Characteristics.

           n participated   n considered   Men   Women   Age M (SD)     ADAS Experience M (SD)   Driving Experience M (SD)
German     51               40             21    19      27.83 (8.66)   3.28 (1.72)              10.31 (8.51)
Chinese    40               40             25    15      30.65 (4.05)   2.73 (1.78)              6.24 (3.57)

Note. ADAS = Advanced driver assistance systems, six-point scale (1 = not at all; 6 = very often).

National subsamples were compared along several potential confounding factors (Beggiato & Krems, 2013). A chi-square test for independence indicated no significant association between nationality and gender balance, χ²(1) = 0.82, p = .366, odds ratio = .66. There was no significant age difference between the German and Chinese participants, t(55.31) = -1.87, p = .067, r = .24. There was also no significant difference in experience with advanced driver assistance systems between German and Chinese participants, t(77.91) = 1.40, p = .165, r = .16. German participants did report significantly more driving experience (years of holding a driver's license) than Chinese participants, t(52.28) = 2.79, p = .007, r = .36. In practical terms, however, the absolute difference was rather unsubstantial (4.07 years).

Experimental Design
The study employed a two-factor mixed between-within design, with cultural background as the between-subjects factor (Chinese; German). Each participant's trust was measured repeatedly over the course of the experiment (before and after the experimental session, and eight times during the experimental session), with time of measurement forming the within-subjects factor. As a self-report measure of automation trust, the automation trust scale developed by Jian et al. (2000) was used. This 12-item questionnaire has been adopted successfully in studies with German (Beggiato & Krems, 2013)


and Chinese (Ritz, 2004) participants. Additionally, we used single-item automation trust ratings (e.g. Brown & Noy, 2004) to assess drivers' trust during the experimental session. As behavioral measures of automation trust, resumption lags and take-over times were recorded. Rajaonah et al. (2006) found that trust in an active cruise control was positively correlated with the time spent using the device. Conversely, low trust in a conditional automated driving system might be related to the time not spent using the device. Borrowing a term from multitasking, resumption lags were defined as the time between a TOR and the re-activation of the conditional automated driving system by the driver. Regarding take-over times, studies with rear-end collision warning systems indicate that reaction times to warnings extend with increasing trust (Abe, Itoh, & Tanaka, 2002). Take-over time was defined as the first manual braking or steering input after a TOR exceeding 2° steering wheel angle or 10% braking pedal position.
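To make these operationalizations more concrete, the following minimal Python sketch shows how resumption lag and take-over time could be derived from a logged simulator trace. Only the 2° steering and 10% brake criteria are taken from the definitions above; the column names, units and data layout are hypothetical and serve purely as an illustration.

import pandas as pd

STEER_THRESH_DEG = 2.0    # steering wheel angle criterion from the text
BRAKE_THRESH_PCT = 10.0   # brake pedal position criterion from the text

def take_over_time(log: pd.DataFrame, tor_time: float) -> float:
    """Time from TOR to the first steering or braking input exceeding the criteria.

    `log` is assumed to hold columns 'time' (s), 'steer_deg' and 'brake_pct'.
    """
    after = log[log["time"] >= tor_time]
    manual = after[(after["steer_deg"].abs() > STEER_THRESH_DEG) |
                   (after["brake_pct"] > BRAKE_THRESH_PCT)]
    if manual.empty:
        return float("nan")   # no take-over recorded in this trace
    return manual["time"].iloc[0] - tor_time

def resumption_lag(log: pd.DataFrame, tor_time: float) -> float:
    """Time from TOR to re-activation of the conditional automated driving system.

    Assumes a boolean column 'automation_on' that switches back to True
    when the driver re-engages the system.
    """
    after = log[log["time"] >= tor_time]
    reactivated = after[after["automation_on"]]
    if reactivated.empty:
        return float("nan")
    return reactivated["time"].iloc[0] - tor_time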

Apparatus
The study was conducted in BMW Group laboratories in Germany and China. The static driving simulator had six visual channels including rear visibility (see Figure 5.1). The three forward channel plasma monitors, each at a resolution of 1920 × 1080 and with 127 cm screen size, were rendered at 60 frames/s and provided a horizontal field of view of 78°. A display with the same specifications right behind the vehicle's rear seats provided an image for the rearview mirror. The two side mirror rear channels accommodated 800 × 600 TFT displays.

Figure 5.1. Simulator setup.

Figure 5.2. Driving simulator interior view.

Procedure
Data were collected in single ninety-minute experiments. The virtual driving scenario for all sessions was a standard three-lane freeway with a hard shoulder. At the beginning of each experiment, participants were briefed on the driving simulator and the conditional automated driving system. The conditional automated driving system provided lateral and longitudinal control, including lane changes and overtaking. Participants were told that the system would not require monitoring during conditional automated driving and that any necessary driver intervention would be announced with sufficient time to react by a TOR (combined sinusoidal sound and visual icon). Manual braking or steering shut off the automation. To create a real-world situation with drivers engaging in non-driving-related tasks during conditional automated driving, a modified version of the Surrogate Reference Task (SuRT; International Organization for Standardization, 2012) was presented on a tablet mounted in the center console (see Figure 5.2). At the beginning of each non-driving-related task, the experimenter asked participants to rate their single-item automation trust in the conditional automated driving system ("On a scale from 0% to 100%, how much do you trust the system?"). In a training session, participants were familiarized with the simulator, the conditional automated driving system, the non-driving-related task, and the TOR. Participants then completed an a priori version of the automation trust scale questionnaire. In the following experimental session, the first single-item automation trust rating was collected after 2 minutes of conditional automated driving. Immediately afterwards, the non-driving-related task was presented for 45 seconds. This process was repeated intermittently every 2.5 minutes, resulting in a total of eight times of measurement (see Figure 5.3). During the second presentation of the non-driving-related task, approximately 5.5 minutes into driving, a suddenly occurring accident in the participant's own lane triggered the first TOR (time to collision = 7 s). After clearing the accident, participants reactivated the conditional automated driving system. If they failed to do so, the experimenter prompted them after 45 seconds. A second TOR occurred during the fourth presentation of the non-driving-related task. The situation was identical in both TORs and varied only in traffic density, counterbalanced between participants. During the subsequent last four non-driving-related tasks, no TORs occurred. The sequence of events of TORs and non-driving-related tasks is displayed in Figure 5.3. After the experimental session, participants filled out the automation trust scale for a second time, some scales that are outside the scope of this report, and a demographic questionnaire. The experiment concluded with an open interview to collect additional information.


5.3 Results
Self-Report Measures
Single-item automation trust ratings. A mixed between-within subjects analysis of variance was conducted to assess the effect of cultural background on single-item automation trust ratings across time of measurement (see Figure 5.3). Mauchly's test indicated that the assumption of sphericity had been violated for the main effect of time of measurement, χ²(27) = 288.56, p < .001. Therefore, degrees of freedom were corrected using Greenhouse-Geisser estimates of sphericity (ε = .49). Results showed a significant main effect of time of measurement on single-item automation trust ratings, F(3.40, 264.91) = 9.23, p < .001, r = .19, indicating that ratings changed during the experimental session. Contrasts revealed that both the first TOR, F(1, 78) = 4.73, p = .033, r = .24, and the second TOR, F(1, 78) = 6.74, p = .011, r = .28, significantly lowered single-item automation trust ratings. In addition, single-item automation trust ratings at the last time of measurement were significantly higher than at the first time of measurement, F(1, 78) = 36.01, r = .56, p < .001. This means that automation trust increased during the experimental session, and decreased only temporarily after TORs. There was no significant main effect of cultural background on single-item automation trust ratings, F(1, 78) = 0.13, p = .719, r = .00. There was also no significant interaction effect between time of measurement and cultural background on single-item automation trust, F(3.40, 264.91) = 0.36, p = .812, r = .04.

Automation trust scale. To evaluate the impact of cultural background on participants' automation trust scale ratings before and after the experimental session, mixed between-within subjects analyses of variance were performed. In line with empirical evidence (Spain, Bustamante, & Bliss, 2008), trust as measured by the automation trust scale was treated as a multi-dimensional construct. Separate scores were calculated for the subscales automation trust and automation mistrust. Results showed a significant main effect for time of measurement on automation trust, with trust being significantly higher after the experimental session than before, F(1, 78) = 7.08, p = .009, r = .29. There was no significant main effect of cultural background on automation trust ratings, F(1, 78) = 2.66, p = .107, r = .18. There was also no significant interaction effect between the time of measurement and cultural background on automation trust, F(1, 78) = 0.73, p = .397, r = .07.


For mistrust ratings, there was a significant main effect of cultural background, F (1, 78) = 15.51, p < .001, r = .41. Automation mistrust ratings of Chinese participants were significantly higher than those of German participants, both before and after the experimental session (see Figure 5.4). Results showed no significant main effect of the time of measurement on automation mistrust, F (1,78) = 0.01, p = .921, r = .01. There was also no significant interaction effect between the time of measurement and cultural background on automation mistrust, F (1, 78) = 2.01, p = .160, r = .16.
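For readers who want to retrace this kind of analysis, the sketch below shows how a comparable mixed between-within ANOVA with Mauchly's test and a Greenhouse-Geisser correction could be run in Python using the pingouin package. The input file and column names are assumptions, and pingouin was not necessarily the software used for the original analyses.

import pandas as pd
import pingouin as pg

# Long-format data assumed: one row per participant and time of measurement,
# with hypothetical columns 'participant', 'nationality', 'time', 'trust'.
df = pd.read_csv("single_item_trust_long.csv")   # hypothetical file

# Mauchly's test of sphericity for the within-subjects factor
print(pg.sphericity(data=df, dv="trust", subject="participant", within="time"))

# Mixed between-within ANOVA; the Greenhouse-Geisser corrected p-values are
# reported for the within-subjects effects when correction is requested.
aov = pg.mixed_anova(data=df, dv="trust", within="time",
                     subject="participant", between="nationality",
                     correction=True)
print(aov.round(3))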

Figure 5.3. Single-item trust ratings during the experimental session.

Figure 5.4. Automation trust scale ratings before and after the experimental session.

Behavioral Measures
Take-over times. Single-item automation trust ratings did not significantly predict take-over times for Chinese, b = 2.93, t(78) = -.18, p = .855, or German participants, b = 4.24, t(78) = -1.25, p = .215. Single-item automation trust ratings could not explain a significant proportion of variance in take-over times for the Chinese, R² = .03, F(1, 78) = .00, p = .855, or German participants, R² = .02, F(1, 78) = 1.56, p = .215.
Resumption lags. Resumption lags longer than 45 seconds were excluded from analysis (China = 25, Germany = 28), because participants were prompted to reactivate the system at that time after a TOR. Single-item automation trust ratings preceding TORs could not significantly predict subsequent resumption lags for the Chinese, b = 18.62, t(63) = -0.06, p = .952, or German participants, b = 23.62, t(60) = 1.04, p = .303.


Single-item automation trust did not explain a significant proportion of variance in resumption lags after TORs for the Chinese participants, R² = .00, F(1, 63) = .00, p = .952, or German participants, R² = .02, F(1, 60) = 1.08, p = .303.
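The simple regressions reported above could be reproduced roughly as follows; the input file, column names and grouping variable are hypothetical and only illustrate the type of analysis, not the original scripts.

import pandas as pd
from scipy import stats

# Hypothetical file: one row per take-over event, with the single-item trust
# rating (0-100) preceding the TOR and the observed take-over time in seconds.
events = pd.read_csv("take_over_events.csv")

for nationality, group in events.groupby("nationality"):
    res = stats.linregress(group["trust_rating"], group["take_over_time"])
    print(f"{nationality}: b = {res.slope:.2f}, "
          f"R^2 = {res.rvalue ** 2:.2f}, p = {res.pvalue:.3f}")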

5.4 Discussion
The objective of this study was to investigate the effects of TORs and cultural background on trust in conditional automated driving systems. The results show that regardless of cultural background, TORs temporarily lowered drivers' automation trust (see Figure 5.3). This indicates that in line with earlier research and theoretical considerations (Lee & See, 2004), TORs were indeed perceived as automation failures. On the other hand, both single-item automation trust ratings and automation trust scale ratings were significantly higher at the end of the experimental session than at the beginning. These seemingly contradictory findings might be explained if short- and long-term effects of TORs are differentiated. TORs may have undermined trust temporarily because they illustrated that the system was not perfectly reliable. In the long run, however, TORs might have helped participants to understand the system, thereby increasing trust. According to Lee and See (2004), trust can be fostered, among other factors, by observing system behavior (performance dimension) and understanding the underlying mechanisms (process dimension). Another interpretation is provided by Beggiato and Krems (2013), who suggest that automation failures do not decrease trust if they are known in advance. Participants in the current study were instructed beforehand about possible automation failures. Additionally, TORs in the current study were exclusively accurate warnings, which presumably foster trust within the framework of Lees and Lee (2007). Spain et al. (2008) suggested that when using the system trust scale, trust should be considered as a two-dimensional construct. The current results support this distinction. Although not significant, there seems to be a difference in trust formation between German and Chinese participants: While German participants' mistrust slightly increased during the experimental session, Chinese participants' mistrust decreased. Additionally, Chinese participants' mistrust was significantly higher both before and after the experimental session (see Figure 5.4). A possible explanation is that the driving strategy of the automation (e.g. headway distances or cooperation with other road users) was evaluated differently. Single-item trust ratings, however, were again remarkably similar


between cultural backgrounds (see Figure 5.4). It should be kept in mind that since participants were all BMW Group employees, they possibly shared an organizational culture that might have been as influential as their cultural background. The lack of a connection between automation trust and resumption lags (cf. Rajaonah et al., 2006) could be explained by limited variability, as drivers in the current study were required to reactivate the system within a given timeframe. Since there was a tight time window for driver intervention after a TOR to avoid an accident, the same explanation could apply to take-over reactions. In addition, Itoh, Abe, and Tanaka (1999) hypothesized that decisions are made quickly both if an operator strongly trusts or strongly distrusts an automation. These effects might have cancelled each other out and blurred the relation between self-report measures of trust and take-over times. The current study has extended earlier findings on automation trust to conditional automated driving systems. Beyond that, it has contributed to the understanding of cultural differences in automation trust, and investigated select behavioral measures of automation trust. The findings suggest a differentiation between automation trust and mistrust, short- and long-term effects of TORs on automation trust, and cultural differences in automation trust. Future studies should investigate how other types of TORs, for example incomprehensible ones, affect trust in conditional automated driving systems. It is also unclear how drivers react to TORs if they are not informed about possible automation failures beforehand. In addition, there remains a lack of reliable and valid behavioral measures of trust in conditional automated driving. Resumption lags and take-over reaction times should be examined in situations when drivers can choose if and how they act, for example after uncritical TORs. Finally, no study so far has investigated the long-term development of trust in conditional automated driving systems, and whether single-item ratings are also suitable to assess automation mistrust.


6 Gaze Behavior as a Measure of Automation Trust During Conditional Automated Driving

Abstract: Earlier research from other domains indicates that drivers' automation trust might be inferred from gaze behavior, such as monitoring frequency. Building on this, the feasibility of measuring drivers' automation trust via gaze behavior during conditional automated driving was assessed with eye tracking and validated with self-reported automation trust in a driving simulator study. The gaze behavior and self-reported automation trust of 35 participants attending to a visually demanding non-driving-related task during conditional automated driving were evaluated. The relationships of dispositional, situational, and learned automation trust with gaze behavior were compared. Overall, there was a consistent relationship between drivers' automation trust and gaze behavior. Participants reporting higher automation trust tended to monitor the automation less frequently. Further analyses revealed that higher automation trust was associated with lower monitoring frequency of the automation during non-driving-related tasks, and an increase in trust over the experimental session was connected with a decrease in monitoring frequency. We suggest that (a) the current results indicate a negative relationship between drivers' self-reported automation trust and monitoring frequency, (b) gaze behavior provides a more direct measure of automation trust than other behavioral measures, and (c) with further refinement, drivers' automation trust during conditional automated driving might be inferred from gaze behavior. Potential applications of this research include the estimation of drivers' automation trust and reliance during conditional automated driving.


This chapter is based on a previous publication: Hergeth, S., Lorenz, L., Vilimek, R., & Krems, J. F. (2016). Keep your scanners peeled: Gaze behavior as a measure of automation trust during highly automated driving. Human Factors, 58, 509–519. doi:10.1177/0018720815625744


6.1 Introduction On July 6, 2013, Asiana Airlines flight 214 struck a seawall on approach to San Francisco International Airport and caught fire, resulting in the death of three passengers. An investigation by the National Transportation Safety Board (NTSB) determined that — among other factors — the pilots’ overreliance on automation led to inadequate monitoring of airspeed during the landing sequence, and causally contributed to the accident (National Transportation Safety Board, 2014). This is only one of many dramatic examples that demonstrate the importance of understanding automation trust and monitoring behavior in human–machine interaction. And although findings from aviation should be transferred to the automotive domain with care, it might illustrate some of the challenges connected with the introduction of conditional automated driving systems. During conditional automated driving, a car single-handedly performs longitudinal and lateral vehicle control in certain conditions and for a limited amount of time (Trimble et al., 2014). In contrast to lower levels of automation, the driver is not expected to monitor his vehicle during this time and might engage in non-driving-related tasks such as using on-board entertainment systems. However, conditions exceeding system boundaries can require the driver to take over manual vehicle control. For example, lane markings and road networks may change and render stored maps inaccurate (Aeberhard et al., 2015). Any time the system determines it is no longer able to support conditional automated driving, it must issue a TOR with sufficient time for the driver to reengage in the driving task. Earlier research indicates that excessive automation trust might lead to misuse and even abuse of conditional automated driving systems (e.g., Lee & See, 2004; Parasuraman & Riley, 1997). It might also compromise readiness to take over manual control (Moray & Inagaki, 1999). Carlson, Desai, Drury, Kwak, and Yanco (2014, p. 21) argued that if ”people trust systems too much, such as when challenging environmental conditions cause the systems to operate at the edge of their capabilities, users are unlikely to monitor them to the degree necessary.” In a test track study, Llaneras et al. (2013) found that when drivers were provided with a rudimentary system capable of automated steering and ACC, they were more likely to engage in risky non-driving-related tasks than when they were provided only with ACC. Participants also tended to increase their allocation of visual attention away from the forward roadway. Similarly, Abe et al. (2002) found in a study with rear-end collision warning systems that reaction times to warnings extended with increasing automation trust.


However, some minimum level of automation trust is necessary so that drivers are willing to activate conditional automated driving systems and eventually attend to non-driving-related tasks until a takeover is required. When users have low trust in an automated system, they tend to disuse it and are less likely to take full advantage of its capabilities (Carlson et al., 2014; Lee & See, 2004; Sheridan & Parasuraman, 2005). As Brown and Noy (2004, p. 38) put it, the "extent that an individual driver will allow a device to take over functions will depend on the amount of trust that s/he feels toward it." Importantly, there is some evidence that even though automation reliability plays a major role in this process (Wickens & Xu, 2002), systems do not have to be flawless to be trustworthy. For example, studies investigating ACC systems found that drivers developed trust even in imperfect systems (Beggiato & Krems, 2013) and despite simulated failures (Rudin-Brown & Parker, 2004) after using the system. Accordingly, appropriate automation trust closely reflecting actual reliability will support safe as well as comfortable use of conditional automated driving systems. Unfortunately, though, automation trust is notoriously hard to measure. Automation trust is a purely psychological construct, and the only way to assess it directly is via self-report measures (Wickens & Xu, 2002). Typically, subjective ratings such as questionnaires (e.g., Jian et al., 2000; Madhani, Khasawneh, Kaewkuekool, Gramopadhye, & Melloy, 2002) or rating scales (Brown & Noy, 2004; Master et al., 2005) have been used. When they are properly constructed, subjective ratings can be reliable and valid indicators of automation trust (Moray & Inagaki, 1999). Self-report measures are intrusive, however, and not viable in applied settings. Unless collected frequently, they are also not capable of capturing temporary changes in automation trust. Alternatively, automation trust might be inferred from observable indicators of automation reliance. Automation trust and automation reliance are usually connected: If we trust an agent, whether a machine or a human, we will tend to rely on that agent (Wickens et al., 2013; Wickens & Xu, 2002). For example, Muir and Moray (1996) found that operators tend to use automation they trust and reject automation they do not trust. However, interaction-based measures of automation trust imply action of the user, which may not be observable during phases of normal system operation. For instance, measures that have been used to assess behavioral adaptation to ACC systems such as lane-keeping quality (Rudin-Brown & Parker, 2004) are not meaningful during conditional automated driving. In addition, Parasuraman and Manzey (2010) point out that overt behavioral indicators such as monitoring behavior might be connected more closely to subjective automation trust than other objective measures. It has been argued that


imperfect automation is monitored at a frequency determined by the operator's automation trust — the more a driver trusts an automated system, the less he or she will monitor it (Brown & Noy, 2004). In their aforementioned study, Muir and Moray (1996) found a significant inverse relationship between operators' trust ratings and monitoring frequency. This predicts that with increasing automation trust, operators monitor a system less (Moray & Inagaki, 1999). Similarly, in a study of human monitoring of an automated system using a multitask flight simulation considering both sampling behavior and subjective reports of trust, Bagheri and Jamieson (2004) found that automation trust was significantly related to the sampling frequency of a monitoring area. But is the relationship between self-reported automation trust and monitoring frequency during conditional automated driving sufficiently strong to use eye-tracking metrics as a real-time measure of drivers' automation trust? Hoff and Bashir (2015) identified three primary layers of variability in automation trust: dispositional trust, representing an individual's overall tendency to trust automation; situational trust, depending on the external environment and context-dependent characteristics of the operator; and learned trust, representing knowledge of a system drawn from past experience or interaction. In their model, automation reliance is directly affected by automation trust. If there really is a negative connection between self-reported automation trust and monitoring behavior, it should persist at every layer. The objective of this study was to examine the association between self-reported automation trust and drivers' gaze behavior during conditional automated driving. We hypothesized that there would be a negative relationship between self-reported automation trust and monitoring frequency. We expected that (a) participants with higher automation trust would monitor the system less frequently, (b) higher self-reported automation trust at the beginning of a non-driving-related task would be connected to lower monitoring frequency during the subsequent non-driving-related task, and (c) with accumulating system experience, self-reported automation trust would increase just as monitoring frequency would decrease.

6.2 Method
Participants
Fifty-one German BMW Group employees recruited via a mailing list voluntarily participated in the study. None of them had previous experience with conditional automated


driving. Of those, n = 11 participants were excluded from analysis due to simulation errors during the experiment. Another n = 5 participants were excluded from analysis because eye-tracking data could not be recorded completely or with sufficient quality (i.e., with a detectable pupil in at least 90% of frames), resulting in a sample size of N = 35. The 22 male and 13 female participants considered for analysis were between the ages of 18 and 55 years (M = 27.94, SD = 8.97). On average, they owned a driver’s license for 10.47 years (SD = 8.81).

Apparatus
The study was conducted in a static driving simulator with six visual channels, including rear visibility (see Figure 5.1). The three forward channel plasma monitors, each at a resolution of 1920 × 1080 pixels and with a 127-cm screen size, were rendered at 60 Hz and provided a combined horizontal field of view of 78°. A display with the same specifications positioned behind the vehicle mock-up provided an image for the rearview mirror. The two side mirrors accommodated 800 × 600 TFT display rear channels. A modified version of the SuRT (International Organization for Standardization, 2012) was presented on an Apple iPad 3 tablet mounted in the center console. The SuRT required participants to identify a target item within an array of distractors by pressing on it and served to mimic a real-world situation with drivers engaged in a visually demanding non-driving-related task during conditional automated driving. All questionnaires were collected online with another handheld tablet. Eye-tracking data were recorded using head-mounted Dikablis Essential glasses and D-Lab software version 2.5 with a tracking frequency of 50 Hz and glance direction accuracy between 0.3° and 0.35° visual angle (Ergoneers, 2013).

Design
The study employed a one-factor within-subjects design. During the experimental session, participants' automation trust was measured at the beginning of every non-driving-related task (eight times in total), with time of measurement forming the within-subjects factor. The experimental session incorporated two TORs. It has been argued that automation trust should be investigated not only under normal operating conditions (Madhani et al., 2002) but also when the system encounters limitations. Earlier research indicates that exposure to automation limitations might facilitate calibration to a system's true reliability (Parasuraman & Manzey, 2010), and it has been hypothesized


that trust may develop and calibrate appropriately only if operators have the opportunity to observe the automation handle situations where it could potentially fail (Moray & Inagaki, 1999). In reference to earlier studies that assessed automation trust with single-item trust ratings (e.g., Brown & Noy, 2004; Madhani et al., 2002) and perceived risk in ACC (Saffarian, Happee, & Winter, 2012), the experimenter asked the participants to report their automation trust at the beginning of every non-driving-related task ("On a scale from 0% to 100%, how much do you trust the system?"). Participants were told to refer their ratings to the conditional automated driving system.

Figure 6.1. Definition of areas of interest (AOIs). Driving scene incorporates windshield, side windows, and mirrors.

Drivers' monitoring frequency served as a behavioral measure of automation trust. Monitoring glances were defined as any fixations on the driving scene, including the windshield, mirrors, and side windows, or the instrument cluster during non-driving-related tasks (see Figure 6.1). Two or more glances to the same area of interest (AOI) separated by blinks less than 120 ms were combined, and glances shorter than 120 ms when passing through AOIs were eliminated from analysis (Ergoneers, 2013; Inhoff & Radach, 1998; Jacob & Karn, 2003). To quantify the monitoring frequency, defined as how often drivers cross-checked the conditional automated driving system, the raw number of monitoring glances had to be put into perspective: The second and fourth non-driving-related tasks were interrupted by TORs and therefore were shorter than the other non-driving-related tasks (see Figure 6.2). Correspondingly, the monitoring frequency was


defined as the number of monitoring glances during a particular non-driving-related task scaled to the duration of that non-driving-related task:

$$\text{Monitoring frequency} = \frac{n_{\text{monitoring glances}}}{t_{\text{non-driving-related task}}}$$

In addition, the ratio of time drivers spent on monitoring during each non-driving-related task was assessed. Fitts, Jones, and Milton (1950) proposed fixation frequency as a measure of a display's importance, and fixation duration a measure of difficulty of information extraction or interpretation. These conclusions are still useful today (Jacob & Karn, 2003), and we argue that in the context of conditional automated driving as well, fixation frequency provides a more direct measure of automation trust and reliance than fixation durations. The monitoring ratio thus served primarily as a control variable to rule out that drivers compensate changes in monitoring frequency by adjusting the time spent on monitoring. The monitoring ratio was also scaled to the duration of non-driving-related tasks to allow for differences in absolute non-driving-related task time:

$$\text{Monitoring ratio} = \frac{\sum_{i=1}^{n} t_{\text{fixation}_i}}{t_{\text{non-driving-related task}}} \times 100\%$$
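To make the gaze metrics concrete, the sketch below shows one way the 120-ms glance filtering and the two formulas above could be implemented for a single non-driving-related task. The column names, AOI labels and data layout are assumptions; only the filtering rules and the two formulas follow the text.

import pandas as pd

MIN_GLANCE_S = 0.120  # glances shorter than 120 ms are discarded
MAX_GAP_S = 0.120     # glances to the same AOI separated by blinks < 120 ms are merged

def clean_glances(glances: pd.DataFrame) -> pd.DataFrame:
    """Merge glances interrupted by short blinks and drop very short glances.

    `glances` is assumed to hold one row per glance with columns
    'aoi', 'start' and 'end' (both in seconds).
    """
    merged = []
    for _, g in glances.sort_values("start").iterrows():
        if (merged and g["aoi"] == merged[-1]["aoi"]
                and g["start"] - merged[-1]["end"] < MAX_GAP_S):
            merged[-1]["end"] = g["end"]          # bridge the short blink
        else:
            merged.append({"aoi": g["aoi"], "start": g["start"], "end": g["end"]})
    if not merged:
        return glances
    out = pd.DataFrame(merged)
    return out[(out["end"] - out["start"]) >= MIN_GLANCE_S]

def monitoring_metrics(glances: pd.DataFrame, task_start: float, task_end: float):
    """Return monitoring frequency (1/s) and monitoring ratio (%) for one task."""
    monitoring_aois = {"driving_scene", "instrument_cluster"}   # assumed labels
    g = clean_glances(glances)
    g = g[(g["start"] >= task_start) & (g["end"] <= task_end)
          & g["aoi"].isin(monitoring_aois)]
    duration = task_end - task_start
    frequency = len(g) / duration
    ratio = (g["end"] - g["start"]).sum() / duration * 100.0
    return frequency, ratio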

Procedure
Data were collected in single 90-minute experiments. The virtual driving scenario in all sessions was a standard three-lane highway with a hard shoulder. At the beginning of each experiment, participants were familiarized with the simulator, the conditional automated driving system, the non-driving-related task, and TORs. The conditional automated driving system provided lateral and longitudinal control, including lane changes and overtaking. Any manual braking or steering shut off the automation. The experimenter advised the participants that they would not have to monitor their vehicle during conditional automated driving. Participants were told to attend to the non-driving-related task whenever it was presented and that TORs would announce any need for driver intervention with sufficient time to react. In consideration of the design guidelines for warnings suggested by Wogalter, Laughery, and Mayhorn (2012), TORs were composed of a 75-dB sinusoidal sound (2010 Hz, duration 0.47 s) and a 25 × 17-mm red icon shown in the upper middle area of the instrument cluster (see Figure 6.1 and Figure 6.2). After a training session, participants were equipped with eye-tracking glasses. During the following experimental session, the first self-reported automation trust rating was collected after approximately 2 minutes of driving. Via intercom, the experimenter asked the participants to rate their trust in the conditional automated driving


system. Immediately afterwards, participants attended to the non-driving-related task for approximately 80 seconds until the tablet screen was switched off. This pattern was repeated intermittently every 2.5 minutes for a total of eight times. During the second non-driving-related task, a suddenly occurring accident in the driver's own lane with a time to collision (ttc) of 7 seconds triggered a first TOR. Research on TORs suggests that drivers who are not monitoring their vehicle can regain safe manual control within this time frame (Gold et al., 2013). After passing the accident, participants reactivated the conditional automated driving system by themselves or were prompted by the experimenter after 45 seconds. A second TOR occurred during the fourth non-driving-related task. The situation was identical in both TORs and varied only in traffic density, counterbalanced between participants. There were no TORs during the last four non-driving-related tasks. The sequence of TORs and non-driving-related tasks is displayed in Figure 6.2. The experiment concluded with a demographic questionnaire and interview after the experimental session.

Figure 6.2. Schematic sequence of events during the experimental session.

6.3 Results
A significance level of .05 was used for all statistical tests, unless stated otherwise. Effect sizes are interpreted in reference to Cohen (1992). Monitoring frequencies and monitoring ratios were not normally distributed. Therefore, nonparametric methods were used for all statistics involving these measures.

Dispositional Automation Trust
To analyze whether variability in dispositional self-reported automation trust would translate to variability in monitoring frequency between participants, mean self-reported automation trust and monitoring frequency were calculated for each participant by


averaging individual values over all eight non-driving-related tasks. There was a significant negative correlation between dispositional self-reported automation trust and average individual monitoring frequency, τ = -.25, p = .02, one-tailed, corresponding to a medium effect size (see Figure 6.3).

Figure 6.3. Correlation between dispositional self-reported automation trust and monitoring frequency averaged for each participant over non-driving-related tasks.

Situational Automation Trust
The association between situational self-reported automation trust and monitoring frequency during each non-driving-related task was investigated using separate correlations. The false discovery rate was controlled for by correcting the level of significance with Benjamini and Hochberg's (1995) method. During every non-driving-related task, there was a small to medium negative correlation between self-reported automation trust and monitoring frequency (see Figure 6.4). Except for the second and eighth non-driving-related tasks, all these correlations between self-reported automation trust and monitoring frequency were significant (see Table 6.1). In addition, there were significant, high positive correlations between monitoring frequency and monitoring ratio during each non-driving-related task (see Table 6.1). Higher levels of monitoring frequency were associated with higher levels of monitoring ratio, indicating that participants did not compensate changes in monitoring frequency by adapting their monitoring ratio. Consequently, there were also medium to large negative correlations between self-reported


automation trust and monitoring ratio, all significant except for the second non-driving-related task (see Table 6.1).

Figure 6.4. Correlations between participants' self-reported automation trust and monitoring frequency of the automation during each non-driving-related task.

Table 6.1
Correlations (τ) Between Self-Reported Automation Trust, Monitoring Frequency and Monitoring Ratio During Non-Driving-Related Tasks.

                                                          Non-Driving-Related Task
Measures                                                    1       2       3       4       5       6       7       8
Self-reported Automation Trust and Monitoring Frequency   -.28*   -.18    -.32**  -.24*   -.22*   -.21*   -.27*   -.21
Self-reported Automation Trust and Monitoring Ratio       -.29*   -.19    -.47**  -.34*   -.31*   -.31*   -.39**  -.30*
Monitoring Frequency and Monitoring Ratio                   .64***  .53***  .48***  .48***  .61***  .65***  .52***  .52***

Note. False discovery rate controlled for by correcting the level of significance with Benjamini and Hochberg’s (1995) method. All tests were one-tailed. * p < .05 ** p < .01 *** p < .001.
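The per-task correlations in Table 6.1 and the false discovery rate correction could be retraced roughly as follows; the wide-format data file and column names are hypothetical, and the one-tailed conversion shown is one possible way to implement the directional tests rather than the original procedure.

import pandas as pd
from scipy.stats import kendalltau
from statsmodels.stats.multitest import multipletests

# Hypothetical wide-format data: one row per participant with columns
# trust_1 ... trust_8 and freq_1 ... freq_8 for the eight tasks.
df = pd.read_csv("situational_trust_gaze.csv")

taus, pvals = [], []
for task in range(1, 9):
    tau, p_two_sided = kendalltau(df[f"trust_{task}"], df[f"freq_{task}"])
    taus.append(tau)
    # one-tailed p-value for the hypothesized negative direction
    pvals.append(p_two_sided / 2 if tau < 0 else 1 - p_two_sided / 2)

# Benjamini-Hochberg false discovery rate correction across the eight tests
reject, p_adjusted, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
for task, (tau, p_adj, sig) in enumerate(zip(taus, p_adjusted, reject), start=1):
    print(f"Task {task}: tau = {tau:.2f}, adjusted p = {p_adj:.3f}, significant: {sig}")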

Learned Automation Trust
To determine if experiencing the system during the experimental session had an effect on learned automation trust that would be reflected in monitoring frequency, self-reported automation trust and monitoring frequency during the first and last non-driving-related task were compared. The false discovery rate was controlled for by correcting the level of significance with Benjamini and Hochberg's (1995) method. Results indicate that average self-reported automation trust during the last non-driving-related task was significantly higher (M = 82.08, SD = 14.80) than during the first non-driving-related


task (M = 74.43, SD = 17.56), t(30) = -3, p = .002, r = .51, one-tailed (see Figure 6.5). A Wilcoxon signed-rank test indicated that monitoring frequency during the last non-driving-related task (Mdn = 0.097) was significantly lower than during the first non-driving-related task (Mdn = 0.13) (p = .03, r = -0.36, one-tailed) (see Figure 6.5). Overall, 60% of participants reported increased automation trust, and 54% showed a decrease in monitoring frequency.
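A minimal sketch of these first-versus-last-task comparisons is shown below, assuming per-participant data with hypothetical column names; the exact software and one-tailed handling used in the original analysis may have differed.

import pandas as pd
from scipy import stats

# Hypothetical per-participant data: trust ratings and monitoring frequencies
# during the first and the last non-driving-related task.
df = pd.read_csv("learned_trust.csv")

# Paired t-test on self-reported automation trust (first vs. last task)
t, p = stats.ttest_rel(df["trust_first"], df["trust_last"])
print(f"t = {t:.2f}, one-tailed p = {p / 2:.3f}")

# Wilcoxon signed-rank test on the non-normally distributed monitoring frequency,
# testing whether frequency during the first task exceeds that during the last task
w, p = stats.wilcoxon(df["freq_first"], df["freq_last"], alternative="greater")
print(f"W = {w:.1f}, p = {p:.3f}")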

Figure 6.5. Mean self-reported automation trust (A, showing standard error bars), median monitoring frequency (B), and median monitoring ratio (C) during non-driving-related tasks.

6.4 Discussion
The current study was designed to investigate the relationship between drivers' self-reported automation trust and gaze behavior during conditional automated driving. The results support the hypothesis that there is a negative relationship between self-reported automation trust and monitoring frequency. As expected, there was a significant negative correlation between participants' dispositional automation trust and monitoring frequency. Participants with greater automation trust were less likely to monitor the conditional automated driving system (see Figure 6.3). Apparently, there were some individual differences in self-reported


automation trust between participants, which were mirrored by individual differences in monitoring behavior. The results also suggest a negative relationship between situational automation trust and monitoring frequency: During each non-driving-related task, higher self-reported automation trust ratings correlated with lower monitoring frequency (see Figure 6.4). Even though absolute levels of self-reported automation trust and monitoring frequency fluctuated between non-driving-related tasks (see Figure 6.5), it seems that there was an enduring connection between self-reported automation trust and monitoring frequency over situations. Considering the entire experimental session, driver’s self-reported automation trust increased significantly from the first to the last time of measurement, while monitoring frequency and monitoring ratio decreased (see Figure 6.5). This indicates that the hypothesized change in learned automation trust after experiencing the system was reflected in monitoring frequency as well. During the experimental session, participants experienced the conditional automated driving system handle various situations like overtaking, lane changing, and TORs, thus getting to know its capabilities and limitations. Participants apparently gained trust by observing and interacting with the system, which was also reflected in their monitoring frequency. Earlier research has also shown that automation trust increases during early experience with a system (Beggiato & Krems, 2013; Moray & Inagaki, 1999), and Parasuraman and Manzey (2010) even hypothesized that repeated exposure to automation imperfections may facilitate calibration to a system’s true reliability. Taken together, all three layers of automation trust showed a consistent, negative association with monitoring frequency. Thus, the current results support Brown and Noy (2004), who argued that if drivers trust a device, they will monitor it less. With higher automation trust, participants attended more closely to the non-driving-related task and were less likely to monitor the driving situation (see Figure 6.3, Figure 6.4 and Figure 6.5). This suggests that participants’ willingness to rely on the conditional automated driving system translated to actual reliance and corroborates the assumption that automation trust as an attitude might be inferred from monitoring frequency during conditional automated driving. The present findings also complement earlier eye-tracking studies showing a significant negative correlation between operators’ selfreported automation trust and monitoring of an automated pump (Muir & Moray, 1996) or sampling of a monitoring area (Bagheri & Jamieson, 2004). In addition, the high positive correlations between monitoring frequency and monitoring ratio (see Table 6.1) show that participants did not compensate changes in monitoring frequency by adapting


monitoring ratio. If participants monitored the conditional automated driving system less frequently, they also spent less time monitoring it. The relationship between self-reported automation trust and monitoring frequency was not perfect, however. Although there was a consistent, mostly significant negative relationship between automation trust and monitoring frequency at each of the three layers, automation trust could only explain a moderate amount of variance in monitoring frequency. One of the reasons for this may be the limited variability in self-reported automation trust ratings and monitoring frequencies: In particular, participants tended to report high automation trust, while exhibiting rather low monitoring frequencies (see Figure 6.3, Figure 6.4 and Figure 6.5). Above that, implicit and intuitive processes that are not captured by self-report measures might have mediated the relationship between automation trust and monitoring frequency. According to Lee and See (2004, p. 63), rational processes ”do not account for the affective aspects of trust, which represent the core influence of trust on behavior (Kramer, 1999). Emotional responses are critical because people not only think about trust, they also feel it (Fine & Holyfield, 1996).” Earlier research has also shown that the relationship between automation trust and reliance decreases under higher workload, because people sometimes have little choice but to rely on a system to deal with high workload (Hoff & Bashir, 2015). Considering the high attentional demand of the SuRT, participants may have focused closely on the non-driving-related task to keep up with task demands, independent of their automation trust and at the expense of monitoring their vehicle (Biros, Daly, & Gunsch, 2004). Another explanation is provided by Lee and See (2004), who suggested that the scope of automation trust is not limited to complete systems but can also refer to particular automatic controllers. As evident from the wording of the question, participants’ selfreported automation trust referred to the conditional automated driving system as a whole (see ”Design” subsection of this chapter). Participants’ monitoring frequency, however, might have referred more specifically to the notification component of the conditional automated driving system. Psychological assessment instruments are supposed to be valid and reliable, with a particular emphasis placed on construct validity (Clark & Watson, 1995). According to Cronbach and Meehl (1955), at least three steps are involved in evaluating the construct validity of an instrument: First, a set of theoretical concepts and the relations between them has to be identified. Second, methods for measuring the proposed constructs have to be derived. Third, empirical investigation of the expected relationships between constructs and their observable manifestations is required (Clark & Watson,


1995). The current study has initiated this process by articulating a connection between drivers’ automation trust and gaze behavior during conditional automated driving, as well as proposing a measurement method using eye-tracking metrics. By investigating the hypothesized relationship between drivers’ self-reported automation trust and the frequency of monitoring glances at the dispositional, situational, and learned layer, the current findings also provide the first empirical evidence for the proposed association between automation trust and gaze behavior. However, neither its reliability nor construct validity can be construed from a single set of observations — this will require a series of further investigations.

Limitations
When generalizing the results of the current study, it should be kept in mind that participants were instructed to attend to the non-driving-related task whenever it was presented. This was necessary to create comparable conditions for all participants, but future studies should try to replicate the present findings under less constrained conditions. If drivers are given more decisional freedom, the relationship between automation trust and reliance should be even stronger (Hoff & Bashir, 2015). Even so, eye trackers are sensitive instruments and will not always provide usable data, for example if there is interference with eyewear (Poole & Ball, 2005). Further research is needed before gaze behavior might be considered a valid measure of automation trust and should focus on standardizing which eye movement metrics are used and how they can be interpreted.

Conclusion
By investigating the relationship between self-reported automation trust and monitoring frequency during conditional automated driving, the current study provides a sound basis upon which an objective, noninvasive measure of automation trust could be developed. This is also the first time that automation trust during conditional automated driving has been measured online and with a high temporal resolution. Apart from laboratory experiments, this is especially valuable for real-world settings where the assessment of automation trust via subjective measures is not feasible. For example, drivers with low automation trust who frequently monitor their vehicle during conditional automated driving could be provided with additional information about the system. Conversely, TORs could be modified to facilitate control transitions if monitoring frequencies indicate


exceptionally high automation trust. However, the current results also illustrate that a sizeable proportion of variance in monitoring behavior might not (yet) be explained by automation trust and that more research is necessary before eye-tracking metrics might be used for the real-time assessment of automation trust. Further research is planned to investigate the relationship between automation trust, automation reliance, and takeover readiness in a high fidelity dynamic driving simulator. These studies will proceed to evaluate the criterion validity of gaze behavior during conditional automated driving and explore how self-reported automation trust and monitoring frequency are connected to driver interaction with the system, such as the ratio of time conditional automated driving systems are activated or the frequency of driver-initiated transitions of control.


7 Effects of Prior Familiarization with Take-Over Requests during Conditional Automated Driving on Take-Over Performance and Automation Trust

Abstract: System-initiated TORs are one of the biggest concerns for conditional automated driving, and have been studied extensively in the past. Most, but not all, of these studies have included training sessions to familiarize participants with TORs. This makes them hard to compare, and might obscure first failure-like effects on take-over performance and automation trust formation. The objective of this study was to investigate the effects of prior familiarization with TORs during conditional automated driving on initial take-over performance and automation trust. A driving simulator study compared drivers' take-over performance in two take-over situations across four prior familiarization conditions (no familiarization; description; experience; description and experience), and automation trust before and after experiencing the system. As hypothesized, prior familiarization with TORs had a more positive effect on take-over performance in the first take-over situation than in a subsequent take-over situation. In all conditions, automation trust increased after participants experienced the system. Participants who were given no prior familiarization with TORs reported the highest automation trust both before and after experiencing the system. The current results extend earlier findings suggesting that prior familiarization with TORs during conditional automated driving will be most relevant for take-over performance in the first take-over situation, and lowers drivers' automation trust.

This chapter is based on a manuscript submitted for publication: Hergeth, S., Lorenz, L., & Krems, J. F. (2015). What did you expect? Effects of prior familiarization with take-over requests during conditional automated driving on take-over performance and automation trust. Manuscript submitted for publication.


Potential applications of this research include different approaches to familiarize users with automated driving systems, better integration of earlier findings, and sophistication of experimental designs.

7.1 Introduction

Today's drivers are more or less skilled at handling a variety of traffic situations, but there is one thing they are not — yet — familiar with: Resuming manual vehicle control after a period of conditional automated driving. During conditional automated driving, an automated driving system completely performs the dynamic driving task. The driver does not have to monitor the automated driving system in the process, but is expected to take over the driving task "within a reasonable amount of time after being prompted by the automatic driving system with a request to intervene" (Society of Automotive Engineers, 2014). These TORs are one of the biggest challenges for conditional automated driving, and have been the subject of intense research efforts in the past (for an overview, see Radlmayr & Bengler, 2015; Trimble et al., 2014).

Research on Take-Over Requests during Conditional Automated Driving

In order to assess take-over performance in the context of conditional automated driving, human factors studies have typically investigated driver reactions to TORs under varying conditions. In most of these studies, participants were familiarized with TORs in some sort of training session before the first relevant take-over situation occurred (e.g. Gold et al., 2013; Hergeth, Lorenz, Krems, & Toenert, 2015; Lorenz, Kerschbaum, & Schumann, 2014; Merat & Jamson, 2009; Radlmayr, Gold, Lorenz, Farid, & Bengler, 2014; Strand et al., 2014). Other studies have used uncritical TORs to familiarize participants with system boundaries (Zeeb, Buchner, & Schrauf, 2015), or assumed that participants were already familiar with the automated driving system from practical experience in earlier experiments (Merat, Jamson, Lai, Daly, & Carsten, 2014). Still other studies have not specified explicitly to what extent participants were familiarized with TORs beforehand (Damböck, Farid, Toenert, & Bengler, 2012; Louw, Merat, & Jamson, 2015). Only one study we found (Petermann-Stock, Hackenberg, Muhr, & Mergl, 2013) did not provide participants with any prior information about possible system limitations and TORs.


Effects of Prior Familiarization with System Boundaries on Driver Behavior

This dissimilarity makes studies on conditional automated driving hard to compare. On the one hand, there is some evidence that human performance during first failures of automation differs from performance during subsequent failures (Sanchez, Rogers, Fisk, & Rovira, 2014). User responses to the first automation failure are fundamentally important for human-automation interaction, and have repeatedly contributed to automation-based accidents (Wickens et al., 2013). On the other hand, first failure costs are sometimes small or cannot be observed at all (Wickens & Xu, 2002). Wickens et al. (2013) argued that whether first failure costs occur in experiments depends critically on the extent to which participants are informed about automation imperfections beforehand. They suggest that one possibility to overcome the first failure effect is with training or practice before real-time use is undertaken. In doing so, actually experiencing automation failures seems to be more effective than merely informing the user about them (Skitka, Mosier, Burdick, & Rosenblatt, 2000). Koustanai, Mas, Cavallo, and Delhomme (2010) found that familiarization with critical use cases made interactions with a forward collision warning system more effective and safer, and that these improvements were greater if drivers had experienced critical situations in advance than if they had read a written description of them. What does that mean for the evaluation of take-over performance during conditional automated driving? Technically, TORs during automated driving are not considered automation failures, but intentional notifications or warnings of imminent system limitations (Society of Automotive Engineers, 2014). Nonetheless, drivers' responses to — and perceptions of — first TORs will most probably differ from responses to subsequent TORs. There is, for example, evidence that drivers using ACC systems are largely unaware of system limitations, and can at best be expected to read half of their car owner's manual (Beggiato & Krems, 2013). In contrast to most studies on conditional automated driving, it is unclear to what extent real-world drivers will be familiar with TORs when they use conditional automated driving for the first time.

Effects of Prior Familiarization with System Boundaries on Automation Trust

Beyond that, prior expectations and attitudes can also influence automation trust formation and user behavior (Hoff & Bashir, 2015). Drivers' automation trust, in turn, is
crucial to how automated driving systems are both used and misused in the real world (Marinik et al., 2014). Earlier studies on TORs during conditional automated driving (Hergeth et al., 2015; Gold, Körber, et al., 2015) found that initial trust increased after participants had experienced the system, even if critical situations occurred. In both of these studies, participants were familiarized with TORs before the experimental session. Similarly, Beggiato and Krems (2013) demonstrated that initial trust in ACC increased if participants were given either a correct system description or an incorrect description including nonexistent problems. However, automation failures led to a constant decrease in automation trust if participants were given an idealized and incomplete description that ignored potential problems. In addition, the more potential critical situations had been presented to participants before the experiment, the lower their initial automation trust was. Participants in the idealized condition reported the highest automation trust, and participants in the over-informed condition the lowest. The authors concluded that prior information about potential system limitations decreases initial automation trust, but prevents subsequent system breakdowns from resulting in a decrease in automation trust.

Objective and Hypotheses

The objective of this study was to investigate the effects of prior familiarization with TORs during conditional automated driving on initial take-over performance and automation trust. We hypothesized that prior familiarization with TORs would interact with take-over situations such that prior familiarization with TORs would have a more positive effect on take-over performance in a first take-over situation than in a second take-over situation. We further expected that prior familiarization with TORs would reduce initial automation trust, but lead to increased trust after experiencing the automated driving system.

7.2 Method

Participants

One hundred sixteen BMW Group employees recruited via a mailing list voluntarily participated in the study. Only employees who had neither expert knowledge about nor prior experience with conditional automated driving were eligible to participate. Six participants were excluded from analysis due to technical errors during the experiment,
resulting in a sample size of N = 110. The 81 male and 29 female participants were between the ages of 20 and 59 years (M = 29.59, SD = 6.87). On average, they held a driver’s license for 11.87 years (SD = 8.53).

Apparatus

The study was conducted in a static driving simulator with six visual channels, including rear visibility (see Figure 5.1). The three forward channel monitors, each at a resolution of 1920 × 1080 pixels and with 127 cm screen diagonal, were rendered at 60 Hz and provided a combined horizontal field of view of 78°. A display behind the vehicle mockup with the same specifications provided an image for the rearview mirror. The two side mirrors accommodated 800 × 600 pixel rear-channel displays. A modified version of the SuRT (International Organization for Standardization, 2012) was presented on an Apple iPad 3 tablet mounted in the center console. The SuRT required participants to identify a target item within an array of distractors by pressing on it, and served to mimic a real-world situation with drivers engaged in a visually demanding non-driving-related task during automated driving.

Design

The study employed a two-factor mixed between-within design, with prior familiarization condition as the between-subjects factor (no familiarization, n = 25; description, n = 28; experience, n = 29; description and experience, n = 28). Participants were presented with two TORs during the experimental session, forming the within-subjects factor (first take-over situation; second take-over situation). The description and experience condition represents the level of prior familiarization employed in most earlier studies (see "Introduction" subsection of this chapter), thus serving as the reference condition.

Take-over performance

The appropriateness of driver responses to TORs entails "the timely, safe, and correct performance of the dynamic driving task for the prevailing circumstances" (Society of Automotive Engineers, 2014), subsequently divided into metrics of take-over time and take-over quality (Gold, Lorenz, & Bengler, 2014).

Take-over time

Take-over time was defined as the time between a TOR and the first manual input by the driver, specified as a steering wheel input greater than ±2° or a depression of the brake pedal of more than 0.036%.
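As a minimal sketch of how this take-over time criterion could be computed from logged simulator signals (the function name, signal names, and the assumption of synchronized, evenly sampled channels are illustrative and not taken from the original study), in Python:

import numpy as np

def take_over_time(t, steering_deg, brake_pct, t_tor,
                   steer_thresh=2.0, brake_thresh=0.036):
    """Time (s) between a TOR and the first manual driver input.

    t            -- sample timestamps in seconds
    steering_deg -- steering wheel angle in degrees
    brake_pct    -- brake pedal depression in percent
    t_tor        -- timestamp of the take-over request
    Thresholds mirror the definition above: |steering| > 2 deg or
    brake pedal depression > 0.036%.
    """
    t = np.asarray(t, dtype=float)
    steering_deg = np.asarray(steering_deg, dtype=float)
    brake_pct = np.asarray(brake_pct, dtype=float)
    manual = (np.abs(steering_deg) > steer_thresh) | (brake_pct > brake_thresh)
    idx = np.flatnonzero((t >= t_tor) & manual)
    return t[idx[0]] - t_tor if idx.size else np.nan  # NaN: no reaction logged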


Take-over quality

Take-over quality was assessed via minimum time to collision (ttc) and maximum resulting acceleration after a TOR, which have both been used successfully to evaluate take-over quality (e.g. Gold et al., 2013). The ttc is defined as the time after which two objects will collide if they maintain their present speed and same trajectories (Hayward, 1972), and has been suggested as a safety measure of traffic situation criticality (Vogel, 2003). Lower ttc values indicate higher criticality, which is why the minimum ttc after a TOR represents the highest criticality. Maximum resulting acceleration, measured in m/s², was defined as:

maximum acceleration_resulting = max √(acceleration_longitudinal² + acceleration_lateral²)

Higher accelerations indicate an approach to the limits of driving dynamics and thereby less safe reactions to TORs.

Subjective criticality

In addition, subjective criticality ratings of both take-over situations were collected on a ten-point scale proposed and validated by Neukum, Lübbeke, Krüger, Mayser, and Steinle (2008). Among other applications, this instrument has been used for the evaluation of TORs during partial automated driving (Naujoks et al., 2015).

Automation Trust

Automation trust was assessed with an 18-item version of the empirically derived scale proposed by Chien, Semnani-Azad, et al. (2014; see Appendix). Items were rated on a seven-point scale and averaged, with higher scores indicating higher automation trust. Participants were instructed to refer their ratings to the automated driving system.
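Returning to the two take-over quality metrics defined above, a minimal computational sketch follows. The array names and the simplification to a same-lane approach at constant speeds (as in the ttc definition) are assumptions for illustration rather than details of the original analysis:

import numpy as np

def min_ttc(gap_m, ego_speed, obstacle_speed):
    """Minimum time to collision (s) over the interval after a TOR.

    gap_m          -- distance to the obstacle in m, per sample
    ego_speed      -- ego vehicle speed in m/s, per sample
    obstacle_speed -- obstacle speed in m/s (0 for a stationary accident)
    ttc is only defined while the ego vehicle is closing in on the obstacle.
    """
    gap = np.asarray(gap_m, dtype=float)
    closing = np.asarray(ego_speed, dtype=float) - np.asarray(obstacle_speed, dtype=float)
    ttc = np.where(closing > 0, gap / np.maximum(closing, 1e-9), np.inf)
    return float(ttc.min())

def max_resulting_acceleration(a_long, a_lat):
    """Maximum resulting acceleration (m/s²) over the interval after a TOR."""
    a_res = np.sqrt(np.asarray(a_long, dtype=float) ** 2 +
                    np.asarray(a_lat, dtype=float) ** 2)
    return float(a_res.max())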

Procedure

Data were collected in one-hour experiments. The driving scenario in all sessions was a standard three-lane highway with a hard shoulder. The conditional automated driving system provided complete lateral and longitudinal control, including lane changes and overtaking. At the beginning of every experiment, the experimenter read out a basic description of the automated driving system. Participants were told that they would not have to monitor their vehicle during automated driving, and to attend to the SuRT whenever it was presented. In the description condition and description and experience condition, the experimenter read out additional information about limitations of the automated driving system, as well as the specification and course of events during TORs.
Participants were also shown a sample picture of the TOR icon. The description was derived from Gasser et al.’s (2012) definition of highly automated driving, which corresponds to conditional automated driving in the SAE International (2014) terminology. In a subsequent training session, participants were familiarized with the simulator, the automated driving system, and the SuRT. In the experience condition and description and experience condition, participants were also presented with two TORs during the training session. After prior notification, the TORs were triggered manually on an open stretch of road, without an actual need for intervention. Participants were instructed to take over the dynamic driving task once by braking and once by steering. The experimenter did not give any additional information. Participants then filled out an a priori automation trust questionnaire. In the no familiarization condition, participants were given neither a description nor demonstration of TORs. During the following experimental session, participants activated the automated driving system immediately upon entering the highway. After about one minute of driving, they had to attend to the SuRT for the first time. Approximately 90 seconds later, the tablet screen was switched off, signaling the end of the SuRT. The SuRT was then presented every other minute, for a total of ten times. During the fourth and eighth SuRT, a suddenly occurring accident ahead of the participant’s car in the middle lane triggered a TOR with a ttc of seven seconds. Earlier research on TORs suggests that drivers who are not monitoring their vehicle — and were provided with prior description and experience of TORs beforehand — can regain safe manual control within this time frame (Gold et al., 2013). TORs were composed of a 75 dB sinusoidal sound (2010 Hz, duration 0.47 s) and a 25 × 17 mm red icon shown in the upper middle area of the instrument cluster. After passing the accident, participants reactivated the automated driving system themselves or were prompted by the experimenter after 45 seconds. The conditions in both TORs were completely identical. The experiment concluded with a post-hoc automation trust questionnaire.

7.3 Results

A significance level of .05 was used for all statistical tests, unless stated otherwise. All tests were two-tailed.


Take-Over Performance

Take-over time

Data from one participant were excluded from this analysis because brake pedal values were recorded incorrectly, rendering the corresponding take-over times uninterpretable (n reported = 109). There were significant main effects of take-over situation, χ2 (1) = 64.13, p < .001, and condition, χ2 (3) = 25.07, p < .001, on take-over time. Take-over time in the first take-over situation was significantly longer than in the second take-over situation, t(108) = 9.3, p < .001, r = .67. Contrasts revealed that take-over times in the no familiarization condition, b = 1.17, t(105) = 6.68, p < .001, r = .55, and the experience condition, b = 0.58, t(105) = 3.39, p = .001, r = .31, were significantly longer than in the description and experience condition. There was also a significant ordinal interaction effect between condition and take-over situation, χ2 (3) = 29.49, p < .001 (see Figure 7.1). This indicates that the effect of condition on take-over time described above was different in the first and second take-over situation. To break down this interaction, take-over times in each condition were compared separately across both take-over situations. These contrasts showed that take-over time for the no familiarization condition, b = -0.95, t(105) = -5.07, p < .001, r = .44, and the experience condition, b = -0.4, t(105) = -2.22, p = .029, r = .21, in the first take-over situation differed significantly more from the description and experience condition than in the second take-over situation.

Figure 7.1. Average take-over time (± SE ) as a function of familiarization condition and take-over situation.


Take-over quality

There was a significant main effect of take-over situation on minimum ttc, χ2 (1) = 17.10, p < .001. Minimum ttc in the first take-over situation was significantly shorter than in the second take-over situation, t(109) = -4.28, p < .001, r = .38. There was no significant main effect of condition on minimum ttc, χ2 (3) = 3.75, p = .29, but a significant ordinal interaction effect between condition and take-over situation, χ2 (3) = 14.84, p = .002 (see Figure 7.2). This indicates that the effect of condition on minimum ttc was different in the first and second take-over situation. Contrasts revealed that compared to the description and experience condition, no familiarization, b = 0.92, t(106) = 3.22, p = .002, r = .3, and experience, b = 0.67, t(106) = 2.45, p = .016, r = .23, reduced minimum ttc significantly more in the first than the second take-over situation.

Figure 7.2. Average minimum time to collision (± SE ) as a function of familiarization condition and take-over situation. ttc = time to collision.

Maximum resulting accelerations were not normally distributed and were investigated with robust methods based on trimmed means (Wilcox, 2012). The level of trimming was set at 20% (Wilcox & Keselman, 2003). There was no significant main effect of condition on maximum resulting acceleration, Q = 1.36, p = 0.258, but a significant main effect of take-over situation on maximum resulting acceleration, Q = 32.07, p < 0.001. The maximum resulting acceleration in the first take-over situation was significantly higher than in the second take-over situation, M = -2.16 (SE = 0.38), ty (65) = 5.68, p < .001, 95% CI [-2.92, -1.40]. Finally, there was a significant interaction effect between condition and take-over situation, Q = 3.5, p = 0.017 (see Figure 7.3). Multiple comparisons using
Wilcox' (2005) sequentially restrictive method for controlling familywise error showed that compared to the description and experience condition, the decrease in maximum resulting acceleration in the second take-over situation was significantly stronger in the description condition, D = -1.44, critical significance level = 0.017, p = 0.015, ξ = .51.
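The robust descriptives underlying these comparisons are 20% trimmed means, which can be obtained with standard tooling. A minimal sketch in Python with invented acceleration values (not data from the study):

from scipy.stats import trim_mean

# 20% trimmed means of maximum resulting accelerations (m/s²) in one
# familiarization condition, separately for both take-over situations.
# The values below are made up purely to illustrate the computation.
first_tor  = [3.1, 4.8, 2.9, 6.2, 3.7, 5.0, 4.1, 2.5, 7.3, 3.9]
second_tor = [2.2, 3.0, 1.9, 2.8, 2.5, 3.3, 2.1, 2.7, 4.0, 2.4]

print(trim_mean(first_tor, proportiontocut=0.2))   # trims 20% in each tail
print(trim_mean(second_tor, proportiontocut=0.2))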

Figure 7.3. Average maximum resulting acceleration (± SE ) as a function of familiarization condition and take-over situation, based on 20% trimmed means.

Subjective criticality

There was a significant main effect of take-over situation, χ2 (1) = 22.19, p < .001, on criticality ratings. Criticality ratings in the first take-over situation were significantly higher than in the second take-over situation, t(109) = 4.94, p < .001, r = .43 (see Figure 7.4). There was no significant main effect of condition on subjective criticality, χ2 (3) = 4.42, p = .22, nor a significant interaction effect between prior familiarization condition and take-over situation, χ2 (3) = 6.76, p = .08.


Figure 7.4. Average subjective criticality (± SE ) as a function of familiarization condition and take-over situation.

Automation trust

The automation trust ratings of one participant who reported perfect automation trust both before and after the experiment were found to be univariate outliers and excluded from analysis (n reported = 109). There was a significant main effect of time of measurement on trust scale ratings, χ2 (3) = 9.64, p = .022. Trust scale ratings before the experimental session were significantly lower than after, t(108) = -6.65, p < .001, r = .54 (see Figure 7.5). There was a significant main effect of condition on automation trust, χ2 (3) = 9.64, p = .022. Contrasts revealed that automation trust in the no familiarization condition was significantly higher than in the description and experience condition, b = 0.50, t(105) = 2.66, p = .009, r = .25. There was no significant interaction effect between condition and time of measurement, χ2 (3) = 1.13, p = .77.


Figure 7.5. Average automation trust (± SE ) before and after the experimental session for each familiarization condition.

7.4 Discussion

This study was designed to investigate the effects of prior familiarization with TORs during conditional automated driving on take-over performance and automation trust. The results support our hypotheses that prior familiarization with TORs would have a more positive effect on take-over performance in the first take-over situation than in a subsequent take-over situation, and decrease initial automation trust. Contrary to our expectations, experiencing the automated driving system increased automation trust regardless of prior familiarization with TORs.

Effects of Prior Familiarization with TORs on Take-Over Performance

The current findings extend earlier research that highlighted the particular importance of first failure effects in human-automation interaction (Wickens et al., 2013) to automated driving. Although TORs during conditional automated driving are technically not automation failures, there was an interaction effect between prior familiarization with TORs and take-over situations: In the description and experience condition, take-over performance in the first and second take-over situation was very similar. When participants had not been given a combined description and demonstration of TORs
prior to the experimental session, however, take-over performance in the first take-over situation was inferior compared to the second take-over situation. This corroborates Wickens and Xu (2002), who argued that whether first failure costs are observed in experiments depends critically on the extent to which participants are given prior information that the automation may be imperfect. In the current study, prior description of TORs had a more positive effect on initial take-over performance than prior experience (see Figure 7.2 and Figure 7.1). In contrast, earlier research suggested that experiencing critical situations in practice trials is more effective at preventing first failure costs than simply informing users about them (Koustanai et al., 2010; Skitka et al., 2000; Wickens & Xu, 2002; Wickens et al., 2013). This might be attributed to the detailed explanation of TORs participants received beforehand. Participants in the description condition were informed about possible take-over situations, and how they might react in such situations. When required to resume the dynamic driving task after a TOR, they probably relied on their stored knowledge to handle these situations. There is already strong evidence that if users are made aware of system limitations, they can better prepare for incorrect actions of automation (Wickens & Xu, 2002). These results have broad implications for the assessment of take-over performance during conditional automated driving. On the one hand, drivers' responses to the first take-over situation should be considered separately from responses to subsequent take-over situations. This means that general take-over performance during conditional automated driving cannot be inferred from drivers' responses in a first take-over situation, and vice versa. On the other hand, results of studies on take-over performance during conditional automated driving always need to be interpreted with the level of prior familiarization provided to participants in mind. This implies that studies that have investigated take-over performance with and without prior familiarization complement each other, but should not be mixed together. Remarkably, there also seems to be a difference between prior description and prior experience of TORs. While both might affect take-over performance and perceptions of take-over situations, they appear to take effect in different ways. Finally, findings on take-over performance under certain conditions should only be generalized for the respective level of expertise and take-over situation under investigation.


Effects of Prior Familiarization with TORs on Subjective Criticality

Subjective criticality ratings mirrored the effects of prior familiarization with TORs on take-over performance. Participants in the description and experience condition rated both TORs as least critical, and across conditions, participants perceived the second take-over situation as significantly less critical than the first one (see Figure 7.4). Although it failed to reach statistical significance, this effect was most pronounced when participants were not given any prior familiarization with TORs (see Figure 7.4). This indicates that familiarity with TORs not only leads to better take-over performance, but is also connected with less critical evaluations of take-over situations. Participants in the no familiarization condition might have been particularly surprised by the first TOR, which is why in comparison, the second TOR appeared much less critical to them.

Effects of Prior Familiarization with TORs on Automation Trust

Beggiato and Krems (2013) found that if drivers were given an incomplete and idealized account of ACC that omitted potential problems, automation failures led to a constant decrease in trust without recovery. In the current study, however, automation trust increased even if potential problems were omitted. Across all conditions, participants reported significantly higher automation trust after experiencing the system. This complements earlier studies on conditional automated driving, which found that drivers familiarized with TORs beforehand gained automation trust during initial system use even when take-over situations occurred (Hergeth et al., 2015; Gold, Berisha, & Bengler, 2015). Unlike failures of ACC systems, TORs are technically not automation failures, but notifications or warnings (Society of Automotive Engineers, 2014). In addition, although potential system limitations (i.e. TORs) were left out in the no familiarization condition, participants were not given an idealized description either. Participants probably built trust by observing and interacting with the system during the experimental session, thus getting to know its capabilities and limitations. In a test track study in which participants were not informed about possible system failures beforehand (Rudin-Brown & Parker, 2004), drivers' automation trust increased significantly after using the system despite a simulated ACC failure. It has been argued that just as observing automation in situations where it could potentially fail might support trust formation
(Moray & Inagaki, 1999), repeated exposure to automation limitations might facilitate calibration to a systems true reliability (Parasuraman & Manzey, 2010). Beggiato and Krems (2013) also found that the more initial information participants were given about potential problems of an ACC system, the lower their automation trust was after reading a description of the system. In the current study, participants in the no familiarization condition reported significantly higher automation trust than participants in the description and experience condition, while automation trust ratings in the description condition as well as experience condition centered in between — both before and after the experimental session. There seems to be an almost additive effect that the less information participants are given about possible system limitations, the higher their automation trust is (see Figure 7.5). Likewise, Cahour and Forzy (2009) found that giving participants instructions which included negative aspects before they used a cruise control system for the first time undermined trust. Hoff and Bashir (2015) argued that already before interacting with an automated system, preexisting knowledge can modify the trust formation process. They stress the particular importance of first impressions with automation, and highlight that this heuristic might be especially relevant for unfamiliar automation.

Application

Rupp and King (2010) speculated that future automated driving systems might require specialized training to earn certificates or licenses. The current results suggest that prior descriptions of TORs could elicit behavior in critical situations similar to that achieved with more exhaustive training sessions. This opens up a new spectrum of methods to train novice users on boundaries of conditional automated driving systems. For example, first-time users might work through a tutorial before conditional automated driving can be engaged. Alternatively, buyers could be given an introduction to such systems during vehicle delivery. Further, potentially more challenging, methods might include adaptive TORs tailored to drivers' expertise with automated driving. From a more theoretical perspective, the current findings show that prior familiarization of participants with critical situations such as TORs affects the inferences that can be drawn from studies on automated driving. This should be kept in mind when earlier studies are interpreted and new ones are conceived. At the most extreme, take-over performance could be investigated either by using experimental designs without any prior familiarization, or after exposing participants to TORs until there is no further improvement in take-over performance.


Limitations

To avoid inadvertent side effects, both take-over situations in the current study were identical and relatively easy to handle. Future studies should investigate whether the positive effects of prior familiarization with TORs also transfer to novel and more complex driving situations. Beyond that, it remains unclear which elements of prior familiarization with TORs had a positive effect on take-over performance. To support the driver during take-over situations, further research is required to determine whether the observed differences in take-over performance occurred during perception, mental processing, or motor execution (Ruscio, Ciceri, & Biassoni, 2015).

Conclusion

The current results indicate that prior familiarization with TORs significantly affects take-over performance and automation trust formation in the context of conditional automated driving. This helps to integrate earlier, sometimes conflicting findings that investigated take-over situations with different levels of prior familiarization. By systematically manipulating prior familiarization with TORs, the current results also extend earlier research that has demonstrated the importance of familiarity with system limitations for human-automation performance. In particular, there seem to be important differences between drivers' reactions to first and subsequent take-over situations, and strong effects of prior familiarization with TORs on take-over performance as well as perceptions of automated driving systems. We should keep this in mind when we study and design conditional automated driving systems, so that future users will be able to resume manual control when they need to.


8 Effects of Purpose, Process and Performance Information about Conditional Automated Driving Systems on Automation Trust and Perceived Usability5

Abstract: Appropriate automation trust supports safe, comfortable and efficient use of conditional automated driving systems and will depend on how well the capabilities of the automation are conveyed to the driver. In theory, providing drivers with information about the purpose, performance, and process of conditional automated driving systems in a user-friendly way could promote appropriate automation trust. Based on these information dimensions, a motion-based driving simulator study compared drivers' automation trust and subjective system usability of a conditional automated driving system across two system purpose conditions (comfort system; safety system) and three display concepts (baseline display; performance display; process display). When provided with the process display, participants in both conditions reported significantly higher automation trust and rated subjective system usability better than when they were provided with the baseline display or the performance display. Purpose information had no significant effect on automation trust or subjective system usability. Across conditions and display concepts, there was a positive correlation between subjective system usability and automation trust. The current results suggest that providing drivers with process information about conditional automated driving systems can increase drivers' automation trust and subjective system usability, and indicate a connection between subjective system usability and automation trust. Potential applications of this research include design recommendations that could help to create trustable automated driving systems and facilitate appropriate automation trust.

5 This chapter is based on a manuscript submitted for publication: Hergeth, S., Lorenz, L., Krems, J. F., & Broy, N. (2016). Effects of purpose, process and performance information about conditional automated driving systems on automation trust and perceived usability. Manuscript submitted for publication.

8.1 Introduction

For most people, automated driving systems are — quite literally — black boxes. Particularly as long as human drivers are expected to provide fallback performance of the driving task, however, their automation trust should match the capabilities of such systems as closely as possible. This can be supported by providing drivers with adequate information about automated driving systems. During conditional automated driving, an automated driving system completely performs the dynamic driving task. In contrast to lower-level automation, the driver does not need to monitor the system, but is still expected to take over the driving task within sufficient lead time upon request by the system (Society of Automotive Engineers, 2014). This transfer of control has been associated with potential benefits such as increased road safety, driving comfort, and efficiency (Trimble et al., 2014). Whether these promises can be realized, however, will also depend on how drivers trust conditional automated driving systems. Especially when complexity and uncertainty make a thorough understanding of automation difficult and operating conditions require adaptive behavior, trust guides reliance on automation (Hoff & Bashir, 2015; Lee & See, 2004): People tend to disuse automation they do not trust, whereas high automation trust often leads to misuse of automation. Both conditions can be detrimental for human-automation cooperation (Lee & See, 2004). Therefore, drivers' automation trust should closely match the actual capabilities of conditional automated driving systems.

Theoretical Background

According to Lee and See's (2004) dynamic model of trust and reliance on automation, "Attributions of trust can be derived from a direct observation of system behavior (performance), an understanding of the underlying mechanisms (process), or from the intended use of the system (purpose)" (p. 67). Lees and Lee (2007) suggested that the purpose of automation could be manipulated by giving users different instructions about what an automation is designed to do. For example, driver assistance systems are often classified as either comfort-oriented systems
such as ACC or safety-oriented systems such as electronic stability control (Engeln & Vratil, 2008; Richardson et al., 1997; Wallentowitz & Reif, 2010; Winner, Danner, & Steinle, 2009). These labels might also be applied to conditional automated driving systems, and could guide drivers' initial expectations and interpretation of such systems (Walker et al., 2006). Providing drivers with information about the performance and process of conditional automated driving systems is less clear-cut, since complex automation often eludes direct observation and requires relaying information with a display (Lee & See, 2004). In a driving simulator study, Beller, Heesen, and Vollrath (2013) found that communicating uncertainty of an automated driving system with a visual display increased automation trust and improved human-automation interaction. Similarly, Helldin, Falkman, Riveiro, and Davidsson (2013) reported the results of a driving simulator study which showed that visualizing uncertainty of an automated driving system led to more appropriate automation trust. These findings indicate that providing drivers with information about the performance of conditional automated driving systems might be a feasible way to facilitate appropriate automation trust. On the other hand, Beggiato et al. (2015) investigated drivers' information needs during automated driving in a combined focus group and driving simulator study and found that participants primarily requested information pertaining to the state, transparency, and comprehensibility of system actions. In either case, no studies so far have examined the effects of providing drivers with information about the process of conditional automated driving systems on automation trust. Thus, it remains unclear how information about the process and performance of conditional automated driving systems compares in shaping automation trust, and how it interacts with information about the purpose of conditional automated driving systems. Another important consideration is the way information about automation is presented: Lee and See (2004) emphasize that "the mere availability of information will not ensure appropriate trust. The information must be presented in a manner that is consistent with the cognitive processes underlying the development of trust" (p. 61). There is evidence that automation trust might depend on interface features of automation not directly related to its actual capabilities, such as appearance and ease of use (for an overview, see Lee & See, 2004; Hoff & Bashir, 2015). For example, Atoyan, Duquet, and Robert (2006) found that users' automation trust in an intelligent data fusion system increased as the usability of the system was enhanced (Hoff & Bashir, 2015).


Aims and Objectives

This study investigates the effects of providing drivers with information about the purpose, performance, and process of conditional automated driving systems on automation trust and subjective system usability. It further examines the relationship between automation trust and subjective system usability. We hypothesized that higher subjective system usability would be connected with higher automation trust.

8.2 Method

Experimental Design

The study employed a two-factor mixed between-within design, with system purpose condition as the between-subjects factor (comfort system; safety system). Participants assessed automation trust and subjective system usability for three display concepts (baseline display; performance display; process display) during separate experimental drives, with display concept forming the within-subjects factor. The order of display concepts was counterbalanced across participants. Participants were randomly assigned to conditions.

Participants

Sixty-eight BMW Group employees recruited via a mailing list voluntarily participated in the study. Only employees who had no previous experience with conditional automated driving systems were eligible. Due to simulation errors during the experiment, n = 2 participants were excluded from analysis. This resulted in a final sample of N = 66 participants between the ages of 20 and 59 years. Both conditions were compared along potential confounding factors (Hoff & Bashir, 2015; Lee & See, 2004). There was no significant association between condition and gender balance, χ2 (1) = 0.83, p = .363, odds ratio = 1.62. There were also no significant differences between conditions in age, ty (39.94) = 0.74, p = .463, ξ = .13, experience with advanced driver assistance systems, ty (35.97) = 0.84, p = .406, ξ = .12, or driving experience measured in years of owning a driver's license, ty (39.88) = 0.65, p = .520, ξ = .11 (see Table 8.1).


Table 8.1
Composition of the Sample Broken Down by Condition.

                          Gender         Age            ADAS Experience    Years Owning Driver's License
Condition          n    men  women     M      SD         M       SD          M        SD
Comfort System    34     22    12    31.03   8.91       2.76    1.16       12.88     8.46
Safety System     32     24     8    32.75   9.02       2.66    1.18       14.72     8.86

Note. ADAS = Advanced Driver Assistance System, five-point Likert scale (higher scores indicate higher ADAS experience).

Apparatus

The study was conducted in a dynamic driving simulator. Seven 1080p projectors provided a 240° horizontal × 45° vertical frontal field of view at 60 frames per second. Rear view was provided using two additional projectors with the same specifications for the outside mirrors, and one LCD screen positioned behind the back seats inside the vehicle mockup for the rear mirror. The vehicle mockup used in this study was a fully instrumented BMW 5 series. The motion system consisted of a hydraulic hexapod with six degrees of freedom, capable of up to 7 m/s² translational acceleration and 4.9 m/s² continuous acceleration.

Independent Variables

Purpose information

To evaluate the effects of purpose information, the system purpose was manipulated by giving participants different information about its purpose in written form. Apart from general information about the automated driving system's functionality, this written information contained a definition of the distinction between comfort systems and safety systems. The written information ended with an explicit statement about the automated driving system's purpose as either a comfort system ("Conditional automated driving is a comfort system, intended to increase driving comfort") or safety system ("Conditional automated driving is a safety system, intended to increase road safety").

Performance and process information

To investigate the effects of performance and process information, three display concepts were implemented and presented on the car's 10.2-inch central information display (see Figure 8.1). Additionally, minimum essential information about the system state (Available / Activated / Deactivated) was always shown in the instrument cluster.


The baseline display consisted only of an infotainment menu screen, giving no supplementary information about the automated driving system (see Figure 8.1a). This served as the reference condition.

Figure 8.1. The (a) baseline, (b) performance, and (c) process display concepts implemented in the current (third) study.

The performance display was modeled after the uncertainty representation proposed by Helldin et al. (2013) and indicated automation performance on a seven-point scale, with higher values representing higher automation performance (see Figure 8.1b). A red mark between the second and third level depicted the threshold of automation uncertainty. Participants were told that below this level, the automation was on the cusp of requesting driver intervention. During the experimental drive, automation performance fluctuated between levels three and seven, for example when the weather worsened or another car cut out in front of the participants' car. The process display concept was designed in reference to EID considerations (Burns & Hajdukiewicz, 2004; Salmon et al., 2007). It depicted a schematic, digital representation of the driving environment augmented with additional information about the conditional automated driving system such as recognized lanes, planned maneuvers, and other objects surrounding the driver's vehicle (see Figure 8.1c).


Dependent Variables

Manipulation check

Participants were given a manipulation check to establish whether the purpose information had an effect on participants' perception of the system purpose. The manipulation check consisted of a ten-item Likert scale that contained statements derived from the written information about the automated driving system and its functionality. For the present investigation, the two items that literally reproduced the written information about the system's purpose were examined: The safety system item read "Conditional automated driving is a safety system", and the comfort system item read "Conditional automated driving is a comfort system". Items were rated on a seven-point scale, with higher scores indicating higher agreement with the statement.

Automation trust

Automation trust was assessed with an adapted version of the empirically derived scale to measure trust in automation described by Chien, Semnani-Azad, et al. (2014). The automation trust questionnaire consisted of 18 items (e.g., "I am confident about the performance of the system"; see Appendix). Items were rated on a seven-point Likert scale and averaged, with higher scores indicating higher automation trust. Participants were instructed to refer their ratings to the automated driving system.

System usability

System usability was measured with the System Usability Scale (SUS; Brooke, 1996). The SUS is a ten-item questionnaire giving a global view of users' subjective assessments of system usability. Items are rated on a five-point Likert scale, rescaled, and summed. This results in a SUS score ranging from 0 to 100, with higher scores indicating higher overall usability of the system being studied. The SUS is one of the most popular questionnaires for the assessment of subjective system usability (Pina, Donmez, & Cummings, 2008). Among other applications, this scale has been used successfully for usability assessments of advanced driver assistance systems (e.g. Adell, 2010) and displays specifically designed for electric vehicles (Cocron et al., 2011).
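A minimal sketch of the standard SUS scoring procedure (Brooke, 1996) referenced here; the example responses are invented for illustration:

def sus_score(responses):
    """SUS score (0-100) from ten item responses on a 1-5 scale.

    Odd-numbered items contribute (response - 1), even-numbered items
    contribute (5 - response); the summed contributions are multiplied by 2.5.
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly ten item responses")
    contributions = [(r - 1) if i % 2 == 0 else (5 - r)   # i = 0, 2, ... are items 1, 3, ...
                     for i, r in enumerate(responses)]
    return 2.5 * sum(contributions)

print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))  # -> 75.0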

Procedure

Ahead of the experiment, participants were e-mailed a demographic questionnaire. Experimental data were collected in ninety-minute sessions.


At the beginning of every experiment, the experimenter briefed the participants about the driving simulator and the automated driving system. The conditional automated driving system provided complete vehicle control, including overtaking and lane changes. Any manual steering or braking shut off the automation. The experimenter advised the participants that they would not have to monitor their vehicle during conditional automated driving. To mimic a real-world situation with drivers engaged in a visually demanding non-driving-related task during automated driving, a modified version of the SuRT (International Organization for Standardization, 2012) was presented intermittently during the experimental drive on an Apple iPad 3 tablet mounted in the center console. Participants were told to attend to the SuRT whenever it was presented, and that TORs would announce any need for driver intervention with sufficient time to react. Subsequently, the experimenter handed the participants the written information about the automated driving system's purpose. After participants had read the written information, the experimenter asked them whether they had understood it and, if necessary, answered any questions by repeating the statement about the automated driving system's purpose from the written instruction. The participants then performed three ten-minute experimental drives, one with each display concept. Before each experimental drive, the experimenter explained the respective display concept under investigation. All experimental drives were identical, and consisted of light highway traffic including some non-critical events such as overtaking maneuvers, speed limits, and changes in weather conditions. Participants were instructed to activate the automated driving system immediately upon entering the highway. During the experimental drives, the SuRT was presented after 2, 5, and 8 minutes for 60 seconds. Following each experimental drive, participants completed the automation trust questionnaire and SUS. After the third experimental drive, participants also filled out the manipulation check. The manipulation check was collected in paper-and-pencil format; all other self-report measures were collected online with a hand-held tablet. The experiment concluded with a semi-structured interview.

8.3 Results

A significance level of .05 was used for all statistical tests, unless stated otherwise. All tests were two-tailed.


Manipulation Check

Manipulation check scores were not normally distributed and were therefore investigated with the Yuen-Welch method (Yuen, 1974) for comparing trimmed means. The level of trimming was set at 20% (Wilcox, 1998). Familywise error rate was controlled for with Bonferroni correction (α / 2). As intended, participants in the comfort system condition rated the conditional automated driving system more as a comfort system, ty (39.6) = 1.58, p = .246, ξ = .29, and less as a safety system than participants in the safety system condition, ty (39) = 1.49, p = .146, ξ = .27 (see Figure 8.2). These differences represent medium-sized effects (Wilcox, 2012) and indicate that the experimental manipulation was successful (Krantz, 1999).
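A minimal sketch of the Yuen-Welch comparison of 20% trimmed means referenced here, written from the Yuen (1974) formulas; the helper is an illustrative reimplementation, not the analysis code used in the study:

import numpy as np
from math import floor
from scipy.stats import t as t_dist, trim_mean

def yuen_test(x, y, trim=0.2):
    """Yuen-Welch test for two independent trimmed means (Yuen, 1974).

    Returns the test statistic t_y, Welch-type degrees of freedom, and
    the two-sided p value; 20% trimming by default, as in the study.
    """
    d, h = [], []
    for sample in (np.sort(np.asarray(x, dtype=float)),
                   np.sort(np.asarray(y, dtype=float))):
        n = sample.size
        g = floor(trim * n)
        win = sample.copy()             # winsorize: pull in the trimmed tails
        win[:g] = sample[g]
        win[n - g:] = sample[n - g - 1]
        s2_w = np.var(win, ddof=1)      # winsorized variance
        h.append(n - 2 * g)             # effective sample size after trimming
        d.append((n - 1) * s2_w / (h[-1] * (h[-1] - 1)))
    t_y = (trim_mean(x, trim) - trim_mean(y, trim)) / np.sqrt(d[0] + d[1])
    df = (d[0] + d[1]) ** 2 / (d[0] ** 2 / (h[0] - 1) + d[1] ** 2 / (h[1] - 1))
    return t_y, df, 2 * t_dist.sf(abs(t_y), df)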

Figure 8.2. Average manipulation check scores (± SE ) on the safety system item and comfort system item broken down by the two conditions, based on 20% trimmed means.

Automation Trust

There was no significant main effect of system purpose condition on automation trust ratings, χ2 (1) = 0.05, p = .826, but a significant main effect of display concepts on automation trust scale ratings, χ2 (2) = 21.80, p < .001. Bonferroni-corrected post hoc tests (α / 3) revealed that automation trust scale ratings for the process display were significantly higher than for the baseline display, t(65) = -4.21, p < .001, r = .46, as well as for the performance display, t(65) = -3.52, p = .002, r = .4, representing a
medium to large effect (Cohen, 1992). There was no significant difference in automation trust scale ratings for the baseline display and the performance display, t(65) = -1.69, p = .29, r = 0.2, representing a small to medium effect. Finally, there was no significant interaction effect between condition and display concepts, χ2 (2) = 0.75, p = .687 (see Figure 8.3).

Figure 8.3. Average automation trust questionnaire scores (± SE ) broken down by display concepts and conditions.

System Usability

SUS scores were also not normally distributed and were investigated with robust methods based on trimmed means, as described above for the manipulation check. There was no significant main effect of condition on SUS scores, Q = 0.02, p = .898, and no significant main effect of display concepts, Q = 1.58, p = .213. There was also no significant interaction effect between condition and display concept, Q = 0.06, p = .941 (see Figure 8.4).


Figure 8.4. Average SUS scores (± SE ) for the three display concepts across both conditions, based on 20% trimmed means. SUS = System Usability Scale.

Relationship between Automation Trust and Subjective System Usability

There were mostly large, significant correlations between automation trust questionnaire ratings and SUS scores, with higher levels of automation trust associated with higher levels of perceived system usability (see Figure 8.5). Familywise error rate was controlled for with Bonferroni correction (α / 6).


Figure 8.5. Correlations between automation trust questionnaire ratings and SUS scores, broken down by display concepts and conditions. Familywise error rate was controlled for with Bonferroni correction (α / 6). SUS = System Usability Scale. *p < .05. **p < .01. ***p < .001.

8.4 Discussion

The objective of this study was to examine the effects of information about the purpose, process, and performance of conditional automated driving systems on drivers' automation trust and subjective system usability, as well as the relationship between subjective system usability and automation trust. The results indicate that purpose information and performance information had no significant effect on participants' automation trust and subjective system usability. Providing participants with process information, however, had a positive effect on both automation trust and subjective system usability. Higher automation trust was consistently connected with higher subjective system usability.


Effects of Information about Conditional Automated Driving Systems on Automation Trust and Subjective System Usability

Purpose information

The current results show that different information about the purpose of the conditional automated driving system had a lasting effect on participants' perception of the system, but did not affect automation trust (see Figure 8.3) or subjective system usability (see Figure 8.4). As intended, participants in the comfort system condition rated the conditional automated driving system more as a comfort system and less as a safety system than participants in the safety system condition (see Figure 8.2). Together, this indicates that framing conditional automated driving systems as comfort or safety systems does not guide drivers' automation trust and perceived usability. The willingness of participants to trust conditional automated driving systems, regardless of how they are described in detail, might be attributed to the extensive capabilities and high sophistication of such systems. According to Lee and See (2004), the purpose basis of automation trust corresponds to faith and benevolence and reflects whether the user attributes a positive orientation to the automation. This, in turn, frequently depends on whether the automation designers' intent is communicated to the user. Since both comfort systems and safety systems are essentially designed to assist the driver, either classification warrants trust in the conditional automated driving system. There also was no ambiguity about the automation's intent, which was clearly communicated in either condition through the written description of the system. Beyond that, automation trust in the current study was assessed after participants had already experienced the automated driving system. However, purpose information is thought to influence trust primarily during initial use when there is little history of performance (Lee & See, 2004). Experiencing the conditional automated driving system might have influenced participants' automation trust over and above information about the purpose of the system, and interfered with participants' initial expectations.

Performance information

Providing participants with performance information also had no significant effect on automation trust and subjective system usability. Compared to the baseline condition, participants reported only slightly higher automation trust and better system usability when using the performance display (see Figure 8.3 and 8.4). Similarly, Beller et al. (2013) found that an automated driving system received higher trust ratings and increased acceptance when it provided uncertainty information, compared to the same system when it provided no uncertainty information. Helldin et
al. (2013), however, found that visualizing automation uncertainty during automated driving decreased drivers' automation trust. This disparity might be explained by the observation that the effect of automation failures on automation trust depends on the negative consequences associated with the error and the context in which it occurs (Sanchez, 2006). For example, critical automation failures while performing an easy task during initial system use might undermine trust much more than harmless failures under difficult conditions after prolonged system experience (Hoff & Bashir, 2015). Accordingly, even similar failures of automation might have very different effects on automation trust and reliance, depending on their timing and criticality. In addition, participants were not notified when the automated driving systems exceeded limitations in Beller et al.'s (2013) and Helldin et al.'s (2013) studies. Providing drivers with information about the performance of automated driving systems, but not notifying them about the need to intervene when performance drops below some threshold level, might invoke a sense of false security that could have unpredictable consequences.

Process information

When using the process display, participants in both conditions reported significantly higher automation trust than when they were provided with the baseline display or performance display. The process display was also associated with the best subjective system usability. This corroborates earlier research, which suggests that providing drivers with process information could promote automation trust and enhance perceived system usability (Hoff & Bashir, 2015; Lee & See, 2004). According to Wickens and Xu (2002), "automation will be better trusted (or at least relied upon with better calibration) to the extent that operators understand the algorithms underlying automation use" (p. 14). Similarly, Parasuraman and Riley (1997) argued that better knowledge about the design philosophy and functionality of automation may facilitate appropriate system use. The information contained in the process display probably helped participants to observe and understand the conditional automated driving system, thus increasing trust and perceived usability. Beggiato et al. (2015) also suggested that "independent from specific scenarios, information should provide transparency, comprehensibility, and predictability of system actions" (p. 1). The process display provided all this information.


Relationship between subjective system usability and automation trust

As hypothesized, the current results revealed a positive correlation between automation trust and subjective system usability. Regardless of condition and display concept, higher levels of subjective system usability were associated with higher levels of automation trust (see Figure 8.3). This coincides with earlier research, which indicates that automation trust might depend on surface features of a system’s interface that have no direct link to the actual capabilities of the system (for an overview, see Lee & See, 2004). Hoff and Bashir (2015) argued that design features such as the appearance of automation and the ease of use of a system are an important consideration because they can influence trust in automation. In their literature review, they identified appearance as one of five design features of automation that influence trust and concluded that interfaces should be arranged with care. For example, visually attractive websites are trusted more than less aesthetic websites, and there is a positive association between website usability and user trust (for an overview, see Hoff & Bashir, 2015). Consequently, interfaces that are meant to promote automation trust in automated driving systems should not only contain adequate information, but must also be thoughtfully arranged (Lee & See, 2004).
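
As a minimal sketch of how such a usability–trust relationship can be quantified, the following snippet computes a Pearson correlation between hypothetical SUS scores and trust ratings; the numbers are invented for illustration and are not the data collected in the current studies.

# Minimal sketch of quantifying the usability-trust relationship with a Pearson
# correlation. The SUS scores (0-100) and trust ratings (1-7) below are invented
# for illustration and are not the data reported in this thesis.
import math

def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

sus_scores = [62.5, 70.0, 85.0, 77.5, 90.0, 55.0, 80.0, 72.5]   # subjective system usability
trust_ratings = [4.2, 4.8, 6.1, 5.5, 6.4, 3.9, 5.8, 5.0]        # self-reported automation trust
print(round(pearson_r(sus_scores, trust_ratings), 2))           # a positive r reflects the reported association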

General Discussion

Taken together, only process information had a significant positive effect on drivers’ automation trust and subjective system usability. Purpose and performance information, on the other hand, had no substantial effect. Considering the flawless reliability of the conditional automated driving system during the experimental drives, higher trust in the system was also the more appropriate response. Thus, providing drivers with process information seems to be a promising way to promote appropriate automation trust and reliance. Nonetheless, performance information might alter trust and reliance if it is displayed in another way. Lee and See (2004) argue that simply providing information about automation is not sufficient; it also needs to be properly formatted so that appropriate trust can develop. For example, displaying automation reliability has been found to facilitate appropriate reliance on a combat identification aid (Wang, Jamieson, & Hollands, 2009), but providing information with different framing (Lacson, Wiegmann, & Madhavan, 2005) or different display configurations (Neyedli, Hollands, & Jamieson, 2011) can itself affect automation trust and reliance (Hoff & Bashir, 2015).


In addition, providing drivers with performance or process information about conditional automated driving systems is by no means mutually exclusive. For example, it has been shown that performance information conveyed through feature degradation of image quality in a target cueing task (MacMillan, Entin, & Serfaty, 1994) or of luminance in a signal detection task (Montgomery & Sorkin, 1996) can facilitate appropriate automation trust (Lee & See, 2004). This could easily be incorporated into interfaces displaying the process of conditional automated driving systems. Lastly, manipulating the purpose of the automation did not affect how drivers interpreted information about the performance and process of the conditional automated driving system, in contrast to Lees and Lee’s (2007) assumption. Importantly, there was also no interaction effect with information about automation performance or process. This indicates that framing conditional automated driving systems in a certain way due to, for example, legal or marketing considerations will not necessarily influence how drivers use such systems.
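
As a purely illustrative sketch of how such graded performance information could be folded into a process display, the following function maps a hypothetical system confidence estimate (0 to 1) onto the luminance of a display element; the linear mapping, the minimum luminance floor and the grey-scale output are assumptions for illustration, not part of the display concepts evaluated in this thesis.

# Purely illustrative sketch: mapping a hypothetical system confidence estimate onto the
# luminance of a process-display element, so that degraded performance becomes visible
# as a dimmer icon. The 0-1 confidence input, the luminance floor and the grey-scale RGB
# output are assumptions for illustration only.

def confidence_to_rgb(confidence, min_luminance=0.25):
    """Map confidence in [0, 1] to an 8-bit grey value; low confidence dims the element."""
    confidence = max(0.0, min(1.0, confidence))                   # clamp implausible inputs
    luminance = min_luminance + (1.0 - min_luminance) * confidence
    value = int(round(255 * luminance))
    return (value, value, value)                                  # grey-scale RGB triple

for c in (1.0, 0.6, 0.2):
    print(c, confidence_to_rgb(c))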

Limitations and future research

Interaction with system limitations

In the current study, both the conditional automated driving system and its displays performed flawlessly. This was necessary to create comparable circumstances across the two experimental conditions and the three display concepts, and to establish the principal effects of each of the three information dimensions. However, trust and human-automation cooperation should be investigated not only under normal operating conditions, but also when the automation fails to function as intended (Madhani et al., 2002). Future studies are planned to investigate the effects of information about the purpose, performance and process of conditional automated driving systems on drivers’ automation trust and reliance under less ideal circumstances, for example when the driver is required to intervene or display contents diverge from real-world conditions. This research might also investigate the effects of not communicating the automation designers’ intent to the user at all (Lee & See, 2004).

Effects over time

In addition, the current study focused on automation trust calibration and subjective system usability during initial system use. Drivers’ information needs, and with them the benefits of different kinds of information about the automation, will probably change over time. For example, Beggiato et al. (2015) suggest that due to familiarization with and higher trust in the automation, drivers will require less information with accumulating system experience. There is evidence that drivers require
at least two weeks of continuous ACC use before they exhibit a stable usage pattern and understand how to operate the system (Weinberger, Winner, & Bubb, 2001), which suggests that observations made during initial system use might not completely transfer to later system use (Winner, Danner, & Steinle, 2009).

Conclusion

Previous research has identified different approaches to promote appropriate automation trust when using conditional automated driving systems, but it has been unclear which variant would be most promising. By pitting the effects of information about the purpose, process, and performance of conditional automated driving systems on drivers’ automation trust and subjective system usability against each other, the current study contributes to both the theory and practice of human factors: As proposed in Lee and See’s (2004) automation trust framework, the current results show that the three proposed dimensions are indeed discernible constructs. In direct comparison with information about the purpose and performance of conditional automated driving systems, providing drivers with process information seems to offer the most promising approach to enable successful human-automation collaboration. From a practical point of view, the current results provide designers of conditional automated driving systems with much-needed design recommendations. Beyond that, the intimate connection between automation trust and subjective usability indicates that designers should consider not only the content, but also the surface features of interfaces if they want drivers to develop appropriate automation trust during conditional automated driving. After all, conditional automated driving systems should not remain black boxes for drivers.


9 General Discussion

The objectives of this thesis were to explore viable methods for measuring drivers’ automation trust in the context of conditional automated driving, and to identify, implement and evaluate possible approaches for designing drivers’ automation trust in conditional automated driving systems. The present results indicate that (i) both self-report measures and behavioral measures can be used to assess drivers’ automation trust in conditional automated driving systems, (ii) prior familiarization with limitations of conditional automated driving systems can have a lasting effect on drivers’ automation trust in conditional automated driving systems and (iii) displaying information about the process of conditional automated driving systems can promote automation trust in these systems. The individual findings on the operationalization of automation trust in the context of conditional automated driving investigated in the present thesis and the interconnections between self-report measures and behavioral measures of automation trust are reported in detail in Chapter 5, Chapter 6 and Chapter 7. The results concerning the design of drivers’ automation trust in conditional automated driving systems by means of prior familiarization with system limitations and displaying different kinds of information about conditional automated driving systems are presented in Chapter 7 and Chapter 8, respectively. Below, these results are briefly summarized (see Section 9.1) to provide a basis for discussing the theoretical implications (see Section 9.2) and practical application (see Section 9.3) of this thesis, as well as methodological aspects (see Section 9.4) and the need for further research (see Section 9.5).

9.1 Overall Findings

The present studies indicate that experiencing conditional automated driving systems can increase drivers’ self-reported automation trust in conditional automated driving systems. After experiencing the conditional automated driving system, participants in the current studies reported significantly higher automation trust than before (see Chapter 5 and Chapter 7).
The occurrence of TORs during conditional automated driving, on the other hand, seems to decrease drivers’ self-reported automation trust only temporarily (see Chapter 5). These effects appear to be mirrored by antithetic changes in drivers’ monitoring behavior, supporting assumptions of a positive relationship between self-reported automation trust and glance behavior as a behavioral measure of automation trust. Higher automation trust tended to be connected with fewer and shorter monitoring glances (see Chapter 7). However, the current research provides no indication of significant connections between drivers’ self-reported automation trust and other behavioral measures of automation trust such as reaction times or system use (see Chapter 5).

The results also show that compared to no prior familiarization, prior familiarization of participants with TORs during conditional automated driving can lower drivers’ self-reported automation trust both before and after experiencing a conditional automated driving system (see Chapter 7). At the same time, prior familiarization with TORs during conditional automated driving might facilitate take-over performance in a first take-over situation (see Chapter 7).

Regarding the design of drivers’ automation trust in conditional automated driving systems by means of providing information about the system, the current findings indicated that only information about the process of the system has a significant positive effect on participants’ self-reported automation trust (see Chapter 8). There was no indication that providing information about the purpose and performance of conditional automated driving systems in the way it was implemented in the current research had a significant effect on participants’ self-reported automation trust (see Chapter 8). These findings have both theoretical and practical implications.

9.2 Theoretical Implications

From a theoretical point of view, the main contributions of the present thesis are its extension of earlier findings on the dynamic calibration of users’ automation trust to the domain of conditional automated driving, as well as the application of theoretical models of automation trust to the context of automated driving in general.


Automation Trust Calibration in the Context of Conditional Automated Driving

Similar to the negative effects of automation failures on automation trust observed in earlier research (Wickens & Xu, 2002; see Figure 2.4), the current results show that TORs during conditional automated driving can temporarily decrease drivers’ automation trust (see Chapter 5, Figure 5.1) and automation reliance (see Chapter 6, Figure 6.5). The present findings also suggest the existence of a first-failure-like effect of the first TOR drivers experience during conditional automated driving: While a subsequent TOR during conditional automated driving also decreased participants’ automation trust, the effect of the first TOR on participants’ automation trust was much more pronounced (see Chapter 5, Figure 5.1). This finding is even more interesting because, as described above (see Chapter 7), TORs cannot be considered automation failures in a technical sense, but should rather be regarded as intended notifications or warnings of impending system limitations. In a broader sense, this indicates that a purely technical classification of system states (as, for instance, in signal detection theory; Green & Swets, 1966) is not sufficient for a complete investigation of automation trust in the context of conditional automated driving. Instead, an evaluation of system limitations should also consider the perception of the driver, as proposed for example by Lees and Lee (2007). Supporting this argument, participants’ automation trust (see Chapter 5 and Chapter 7) and automation reliance (see Chapter 6) globally increased with continued system use. This agrees with the results of Gold, Berisha, and Bengler (2015), who also found that experiencing a conditional automated driving system led to increased automation trust. The fact that TORs do not lastingly destroy automation trust could be interpreted as an indication that drivers might be able to distinguish between different types of system limitations. Reflecting this conceptual difference, TORs could rather be understood as failures of automation to behave as expected, as suggested by one of the anonymous reviewers of Manuscript 3.

Taken together, the dynamic process of automation trust calibration in conditional automated driving systems seems to be quite similar to the general schematic time course proposed by Wickens and Xu (2002; see Figure 2.5). However, the present results also suggest that the duration of the experimental sessions of less than 60 minutes each might not have been long enough to completely equalize any initial differences in drivers’ automation trust (see Chapter 7, Figure 8.3) and to develop a stable level of automation trust. Visual inspection suggests that there might be a further increase in automation trust with continuing system use (see Chapter 5, Figure 5.1; Chapter 7, Figure 8.3).

Apart from that, participants tended to report rather high a priori automation trust in conditional automated driving systems across all three studies. Considering both the two different automation trust questionnaires (see Chapter 5, Chapter 7 and Chapter 8) and the single-item automation trust ratings (see Chapter 5), participants’ self-reported automation trust exhibited a ceiling effect, ranging towards the upper end of the scales. On the other hand, participants’ monitoring frequency of the conditional automated driving system as a behavioral measure of automation trust showed a floor effect over time (see Chapter 6). These tendencies might be explained by the inevitably high complexity and technical capability of conditional automated driving systems, which may lead users to attribute a high degree of technical sophistication to them. Apparently, participants were willing to grant these systems a certain leap of faith. It should be noted, however, that due to the peculiarities of the underlying sample this particular observation might transfer to the broader population only to a limited extent (see Section 9.4). Nonetheless, this finding is consistent with related work that equally indicates a generally positive attitude towards conditional automated driving systems (e.g. Gold, Berisha, & Bengler, 2015; Payre et al., 2014). Similarly, research in other domains has also found evidence for such a positivity bias towards novel automation (e.g. Dzindolet, Peterson, Pomranky, Pierce, & Beck, 2003), which Hoff and Bashir (2015) attribute to the expectation of near-perfect system functionality.

Application of Automation Trust Models to the Context of Conditional Automated Driving

In addition to extending earlier findings, the present work demonstrates that the theoretical automation trust models drawn on for this thesis can be applied to the investigation of drivers’ automation trust in conditional automated driving systems.

Lee and See’s (2004) Dynamic Model of Trust and Reliance on Automation

As posited in Lee and See’s (2004) model, the current results indicate the existence of a dynamic closed-loop process of automation trust and automation reliance (see Figure 2.3) during the use of conditional automated driving systems. This is illustrated by the increase in participants’ automation trust (see Chapter 5 and Chapter 7) and automation reliance (see Chapter 6) connected with accumulating system experience, as well as the temporary changes in both self-reported automation trust and participants’ monitoring behavior after TORs observed in the first study (see Figure 6).
It should be noted, however, that throughout all studies participants were instructed to use the conditional automated driving system whenever possible. Therefore, it cannot be excluded that the decreases in participants’ automation trust associated with TORs found in the current studies would normally lead drivers to (temporarily) disuse the automation, thus changing subsequent automation trust formation under real-world conditions. Accordingly, the increase in automation trust observed in the current studies over time might be slower in real life.

Beyond that, the current results also show that drivers’ automation trust itself as well as the strength of the relationship between automation trust and automation reliance in conditional automated driving systems depend on context, just as illustrated in Lee and See’s (2004) model (see Figure 2.3). On the one hand, both participants’ automation trust and automation reliance fluctuated depending on the operational capability of the system, manifested for instance in the momentary decrease in participants’ automation trust after the occurrence of TORs (see Figure 5.1). On the other hand, the strength of the relationship between participants’ automation trust and automation reliance was mediated by context factors: Probably due to the constrained time budget, there was no relationship between self-reported automation trust and take-over times or usage behavior (see Chapter 5). In addition, even though there was a substantial connection between participants’ self-reported automation trust and monitoring behavior, self-reported automation trust could explain only a fraction of the variance in participants’ monitoring behavior. As noted in Chapter 5 and Chapter 6, it is expected that these relationships will be stronger under less restrained, real-world conditions.

Finally, the current results support the central role that displaying information about automation plays in Lee and See’s (2004) model for facilitating automation trust calibration. Specifically, the results reported in Chapter 8 suggest that by providing drivers with information about conditional automated driving systems in a user-friendly way, vehicle designers can indeed support automation trust.

Hoff and Bashir’s (2015) Three-Layered Automation Trust Model

In addition, the current results also indicate that the three layers of variability in automation trust described in Hoff and Bashir’s (2015) model can contribute to a better understanding of human-machine interaction in the context of conditional automated driving systems.
In particular, the subdivision into distinct layers can help to investigate the relationship between automation trust and automation reliance in more detail than would be possible using Lee and See’s (2004) model on its own (see Chapter 6). Apart from that, the current results support the central role of system performance (see Chapter 5), design features (see Chapter 8) and situational factors not related to automation trust (see Chapter 5) that, as posited in the model, influence users’ automation reliance both prior to and during interaction with automation (see Figure 2.3). Last but not least, the current results provide initial evidence of the influence of preexisting knowledge (see Chapter 5 and Chapter 8) and culture (see Chapter 5) on automation trust in the context of conditional automated driving systems, which are both high-level factors in Hoff and Bashir’s (2015) model (see Figure 2.3).

9.3 Practical Application

Importantly, the current results also contribute to practical applications of human-machine interaction. This encompasses approaches to the measurement of automation trust on the one hand and accompanying guidelines for the design of vehicle technology on the other hand, with a corresponding focus on conditional automated driving systems.

Measurement of Drivers’ Automation Trust in Conditional Automated Driving Systems

With regard to the operationalization of automation trust, the present work provides examples of self-report measures as well as behavioral measures that can be used by researchers and practitioners alike to assess drivers’ automation trust in conditional automated driving systems.

Self-Report Measures of Automation Trust

Concerning self-report measures of automation trust, both automation trust questionnaires and single-item automation trust ratings proved to be promising approaches with distinct strengths. In a direct comparison of the two different automation trust questionnaires employed across the three studies, only the questionnaire developed by Jian et al. (2000) offers the possibility to assess automation mistrust in addition to automation trust. Although this hypothetical component of automation trust (Spain et al., 2008) has received little attention until now (which is also reflected in the fact that this construct is included neither in the model of Lee and See, 2004, nor in that of Hoff and Bashir, 2015), investigating users’ automation mistrust might prove useful over and above considering automation trust alone (see Chapter 5).
On the other hand, the questionnaire proposed by Chien, Semnani-Azad, et al. (2014) seems more adequate for applied and real-world settings, since its items can be more easily adapted to fit the particular object of investigation at hand (see Appendix). This might make it possible to draw more specific and externally valid conclusions from experimental results. In addition, Chien, Semnani-Azad, et al.’s (2014) questionnaire offers the possibility to form subscales based on the dimensions of attributional abstraction posited in Lee and See’s (2004) model (for an example, see Chien, Lewis, Hergeth, Semnani-Azad, & Sycara, 2015) as well as the option to collect both people’s general and specific automation trust (Chien, Semnani-Azad, et al., 2014).

Beyond that, the current findings indicate that there seems to be an acceptable agreement between automation trust questionnaires and single-item ratings of automation trust (see Chapter 5, Figure 5.1 and Figure 5.2). Accordingly, single-item automation trust ratings might provide an ecological extension of or even alternative to more exhaustive automation trust questionnaires for applications where a fast, uncomplicated measurement of trust is more important than an exhaustive one. Although it is somewhat surprising that people are able to assess such a complex, multifaceted construct as automation trust so well, this finding supports Lee and See’s (2004) assumption that automation trust is not only guided by rational processes but also has a strong affective component that people can ”feel”.

A special aspect of single-item trust ratings concerns the method with which these ratings are collected: In the corresponding study of this thesis, single-item trust ratings were collected orally by the experimenter (see Chapter 5). As a result, many participants tended to give single-item trust ratings in increments of 5% and 10%. A possible solution for this problem might be to use (digital) visual analogue scales (VAS) as suggested, for example, by Reips and Funke (2008), or other more continuous measures such as handset controls (Scherer et al., 2015). Particularly when collected digitally, VAS are reliable measurement tools and can provide diverse advantages over other collection methods, such as higher accuracy and less need for explanation (Reips & Funke, 2008). In this respect, it seems advisable to use VAS instead of oral collection of single-item automation trust ratings in future investigations whenever the circumstances permit it.
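
As a minimal sketch of what such a digital VAS could look like, the following snippet opens a simple slider without a visible numeric value, so that ratings are less likely to cluster at round numbers; the item wording, the 0-100 range and the anchor labels are illustrative assumptions and not the materials used in the studies reported here.

# Minimal sketch of a digital visual analogue scale (VAS) for a single-item trust
# rating, using only the Python standard library (tkinter). The item wording, the
# 0-100 range and the anchor labels are illustrative assumptions, not taken from
# the studies reported in this thesis.
import tkinter as tk

def collect_trust_rating(prompt="How much do you currently trust the automated driving system?"):
    """Open a VAS window and return the selected rating (0-100) once confirmed."""
    result = {"value": None}
    root = tk.Tk()
    root.title("Trust rating")
    tk.Label(root, text=prompt, wraplength=400).pack(padx=10, pady=10)
    scale = tk.Scale(root, from_=0, to=100, orient="horizontal", length=400,
                     showvalue=False)          # hiding the numeric value discourages 5%/10% increments
    scale.set(50)
    scale.pack(padx=10)
    anchors = tk.Frame(root)
    anchors.pack(fill="x", padx=10)
    tk.Label(anchors, text="no trust at all").pack(side="left")
    tk.Label(anchors, text="complete trust").pack(side="right")
    def confirm():
        result["value"] = scale.get()
        root.destroy()
    tk.Button(root, text="Confirm", command=confirm).pack(pady=10)
    root.mainloop()
    return result["value"]

if __name__ == "__main__":
    print("Trust rating:", collect_trust_rating())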


Behavioral Measures of Automation Trust

Regarding behavioral measures of automation trust, drivers’ glance behavior seems to be a promising approach to quantifying drivers’ automation trust in conditional automated driving systems (see Chapter 6). On the one hand, glance behavior constitutes a direct measure of automation reliance that can also be assessed continuously. On the other hand, the measurement of glance behavior does not require interaction with an automation, in contrast to interaction-based measures such as activation or deactivation of automated driving systems (see Chapter 6). Finally and most importantly, the comparably strong connection between participants’ self-reported automation trust and glance behavior indicates that it can be a viable measure of automation trust (see Chapter 6), whereas no such connection could be found for participants’ reactions to TORs and usage of the conditional automated driving system (see Chapter 5).

In their study on drivers’ automation trust in conditional automated driving systems, Gold, Körber, et al. (2015) also investigated the possibility of making inferences about drivers’ automation trust based on their glance behavior. An analysis of participants’ horizontal gaze behavior showed, however, that there was no statistically or practically significant connection with self-reported automation trust. Therefore, using the frequency and duration of drivers’ monitoring glances during conditional automated driving as a behavioral indicator of their automation trust at present seems preferable to using horizontal gaze behavior.

In a broader context, drivers’ glance behavior is already used today to detect drowsiness and make drivers aware of it (Williamson & Chamberlain, 2005). With further development of eye-tracking technology, a detailed measurement of drivers’ gaze behavior similar to the one described here seems equally possible in a real-world environment. In contrast to self-report measures of drivers’ automation trust, which are hardly imaginable in applied settings, the measurement of drivers’ glance behavior can be employed without impairing the driver in any way. For experimental settings, this provides the additional advantage that the immersion of the driver in, for example, driving simulation experiments is much less disturbed than when self-report measures are used.
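
The following minimal sketch illustrates how monitoring-glance frequency and duration could be derived from AOI-coded gaze samples; the sample format, the AOI label "road" and the example values are assumptions for illustration and do not reproduce the actual analysis pipeline used in the studies reported here.

# Minimal sketch (not the thesis' actual analysis pipeline) of deriving monitoring-glance
# frequency and duration from AOI-coded gaze samples. The sample format (timestamp in
# seconds, AOI label) and the AOI name "road" are assumptions for illustration only.

def monitoring_glances(samples, monitor_aoi="road"):
    """Return (start, end) intervals during which gaze rested on the monitoring AOI."""
    glances = []
    start = None
    for t, aoi in samples:
        if aoi == monitor_aoi and start is None:
            start = t                      # glance towards the roadway begins
        elif aoi != monitor_aoi and start is not None:
            glances.append((start, t))     # glance ends when gaze leaves the AOI
            start = None
    if start is not None:                  # glance still ongoing at end of recording
        glances.append((start, samples[-1][0]))
    return glances

def summarize(samples, drive_duration_s, monitor_aoi="road"):
    glances = monitoring_glances(samples, monitor_aoi)
    durations = [end - start for start, end in glances]
    return {
        "glance_count": len(glances),
        "glances_per_minute": len(glances) / (drive_duration_s / 60.0),
        "mean_glance_duration_s": sum(durations) / len(durations) if durations else 0.0,
        "percent_time_monitoring": 100.0 * sum(durations) / drive_duration_s,
    }

# Example: gaze coded every 0.5 s while a non-driving related task is shown on a tablet
samples = [(0.0, "tablet"), (0.5, "road"), (1.0, "road"), (1.5, "tablet"),
           (2.0, "tablet"), (2.5, "road"), (3.0, "tablet"), (3.5, "tablet")]
print(summarize(samples, drive_duration_s=4.0))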

Designing Drivers’ Automation Trust in Conditional Automated Driving Systems

As to the design of drivers’ automation trust in conditional automated driving systems, the fundamental finding of this thesis is that automation trust can be guided both before and during the use of such systems by communicating different kinds of information about the system to the driver.
In agreement with Beggiato et al.’s (2015) findings, the present results indicate that information about the process of automated driving systems can indeed help the driver to gain automation trust (see Chapter 8). Apart from that, information about the limitations of conditional automated driving seems to have a moderating effect on drivers’ automation trust, in the sense that knowledge about system limitations tends to decrease drivers’ automation trust. This indicates that apart from providing drivers with information about conditional automated driving systems during actual system use, training drivers might have a lasting effect on automation trust and usage behavior. This extends Payre et al.’s (2015) findings that training can affect drivers’ automation trust and take-over capability with more specific predictions about the minimum kind of training that would be required to achieve a beneficial effect (i.e., a prior theoretical familiarization with system limitations; see Chapter 7). Interestingly, simply framing automated driving systems as either comfort-oriented or safety-oriented systems did not have an effect on drivers’ automation trust. Apparently, potential users already have certain expectations about conditional automated driving systems that influence their perception of the system over and above a general classification (see Chapter 8), but not over and above specific information about system limitations (see Chapter 7).

Walker et al. (2016) argued that the pinnacle of designing vehicle technology should be reaching an agreement between the mental model of the designer and that of the user, and identified the information about a system provided to the driver as the key variable for enabling this agreement. To this end, a system should provide sufficient information that the driver can observe its behavior and, at least on a functional level, also understand it. Accordingly, the current results should not be interpreted as evidence that information about the purpose and performance of conditional automated driving systems does not influence drivers’ automation trust. On the one hand, both purpose and performance information might affect drivers’ automation trust if they are presented with different content or in a different format (see Chapter 8). On the other hand, it is entirely possible that purpose and performance information had an effect on drivers’ automation trust that simply could not be captured by rational processes, but might have affected their automation reliance.


9.4 Methodological Considerations

For the interpretation of the present findings, some peculiarities of the studies reported in this thesis should be kept in mind. Apart from the individual limitations of the three studies discussed in the respective chapters (see Chapter 5 – Chapter 8), the samples underlying this work warrant a more extensive discussion. The reviewers of both Manuscript 1 (see Chapter 5) and Manuscript 2 (see Chapter 6) raised the concern that the sample for this research consisted exclusively of BMW Group employees. All participants were contacted via mailing lists. This sample was not drawn out of convenience, but with the intent of providing a representative sample of the target population to which these studies pertain (i.e., potential users of conditional automated driving systems), while satisfying indispensable confidentiality requirements.

First of all, only participants who had no prior experience with or expert knowledge about conditional automated driving systems were eligible to participate in the studies reported here. Beyond that, although BMW Group employees might differ from the general population in some aspects, this does not necessarily limit the external validity of the current findings: On the one hand, the main focus of the current research was not participants’ general attitude towards automation, but the specific relationship between drivers’ automation trust and reliance in conditional automated driving systems. As mentioned above, participants were required to have no prior experience with conditional automated driving in any form and were not allowed to participate in the study if they had expert knowledge of or were working with similar technologies. Thus, study participants did not differ from the general population with regard to their pre-existing knowledge, which has been identified by Hoff and Bashir (2015) as the most important antecedent of initial learned trust. In addition, the conclusions drawn from the current findings are not supposed to be taken as universal, but need to be interpreted as relative differences between conditions and times of measurement compared against each other (in this regard it should be noted that the observation pertaining to the rather high level of participants’ automation trust across the current studies discussed in Section 9.1 might indeed be less generalizable to the broader population for this very reason, and should be taken with a grain of salt). On the other hand, the current samples consisted of people with diverse backgrounds. Among others, the samples included participants working for suppliers, business partners, and interns, who differed in demographic variables (e.g. age) as well as educational background (e.g. economists, psychologists, computer scientists, to name but a few). In this regard, it might be argued that the samples could have been even
more representative of the population in question than, say, a sample drawn from college students, who have been shown to differ substantially from the population at large and ”are among the least representative populations one could find for generalizing about humans” (Henrich, Heine, & Norenzayan, 2010, p. 1). Taken together, inferences drawn from the samples investigated in the current thesis should in principle also generalize to the population of drivers using conditional automated driving systems in the future.

9.5 Further Research

The present thesis contributes preliminary approaches to the operationalization and design of drivers’ automation trust in the context of conditional automated driving systems, but cannot cover all aspects of this emerging field of research. Apart from the recommendations for further research given in each chapter discussing the three studies (see Chapter 5 – Chapter 8), there are some overarching topics that seem particularly promising for future investigations: While the driving simulator studies reported here provide a good balance of experimental control and external validity, they are not capable of reproducing the full range of behavioral degrees of freedom as well as the complexity of real-world system use. Accordingly, future research could try to investigate to what extent the present findings can be replicated in less restrained settings such as field studies. In a similar vein, the current research focused on the initial use of conditional automated driving systems. However, results from a study conducted in real traffic by Winner, Hakuli, and Wolf (2009) suggest that drivers using ACC required a minimum of two weeks to learn the operation of the system. In this regard, an investigation of the medium- and long-term calibration of drivers’ automation trust in conditional automated driving systems would be a desirable and valuable extension of this thesis. This could be realized, for example, by repeatedly inviting the same participants to experiments on conditional automated driving systems or by providing them with appropriately equipped vehicles for an extended amount of time.

Apart from that, providing drivers with information about conditional automated driving systems as investigated in the current thesis constitutes only one of many possible approaches that could be used to design drivers’ automation trust in conditional automated driving systems. Among other things, future studies could try to vary the content as well as the format of displays that provide drivers with information about conditional automated driving systems. For example, it can be inferred from Lee and See’s (2004)
model that information about automation need not necessarily refer to the system as a whole, but could also be related to subsystems or individual functions (see Figure 2.3). Accordingly, displaying information about the process and performance of automation might also be related to subsidiary functions of conditional automated driving systems, such as longitudinal or lateral control. In addition, it might also be explored if and how the temporary detrimental effects of TORs on drivers’ automation trust observed in the current studies might be compensated for by providing adequate information before, during or after the occurrence of take-over situations. For example, retroactive information about system limitations after TORs occur could complement approaches involving information prior to and during system use as investigated in this thesis. Alternatively, the sequence and design of TORs could be modified to attenuate their negative effects.

In the long run, it will also be necessary to investigate the influence of different kinds of TORs on drivers’ automation trust. In the present studies (as in the majority of research on human-machine interaction in the context of conditional automated driving; see Chapter 7), TORs represented accurate warnings. This means that TORs were always intended by the system designer, useful for the driver (because they allowed him or her to avoid an impending collision) and understandable (because there was always a stimulus present that justified a TOR, depicted in the current studies by an accident in the participant’s own lane). Lees and Lee (2007) stress that an exclusively technical classification disregards such psychological processes in the evaluation of warnings, which can massively influence the appraisal and consequences of warnings. For example, predictable warnings are supposed to be evaluated differently by the driver than unpredictable warnings, irrespective of their usefulness or their intentionality (Lees & Lee, 2007).


10 Conclusion

By providing some much-needed preliminary approaches to measuring and designing drivers’ automation trust in conditional automated driving systems, the present thesis adds to our understanding of the general framework that will be required for the safe, comfortable and overall beneficial introduction of conditional automated driving systems on the market. The current results indicate that (i) both self-report measures and behavioral measures can be used to assess drivers’ automation trust in conditional automated driving systems, (ii) prior familiarization with system limitations can have a lasting effect on drivers’ automation trust in conditional automated driving systems and (iii) information about the processes of conditional automated driving systems might promote drivers’ automation trust in these systems.

The scientific contribution of this thesis is also reflected by its reception in the literature: For instance, the manuscript that formed the basis for Chapter 5 has already been cited by Beggiato et al. (2015) as an example of a study on automation trust calibration and by Louw, Kountouriotis, Carsten, and Merat (2015) as an example of a study investigating the effects of TORs during automated driving. However, other approaches to measuring and designing drivers’ automation trust, such as neural correlates of trust (e.g. Krueger et al., 2007) or anthropomorphization of automated driving systems (e.g. Waytz, Heafner, & Epley, 2014), remain largely unexplored. Future studies could try to extend the present findings to real-world use of conditional automated driving systems and tie in with such research.

Lee and See (2004) noted that there has been an intimate connection between the availability of automation and the research interest in automation trust in the past. Similarly, the research interest in automation trust will likely grow and gain even further momentum, just as the current trend towards increasing automation in the automotive domain will probably continue in the future (Verband der Automobilindustrie e.V., 2015). Of particular importance, the advent of conditional automated driving systems represents the first time that large parts of the population will have access to such a highly sophisticated kind of automation. Accordingly, the present research can possibly help to fulfill these prospective research demands.


Even though the focus of the present work was on conditional automated driving systems, the current findings might also be transferred to higher levels of driving automation as well as to other domains and applications of automation. For example, automation trust will remain an important component of successful human-automation collaboration in the context of highly and fully automated driving systems (see Figure 2.2). Simultaneously, however, these levels of automation may require a reconsideration of what constitutes appropriate automation trust: Ultimately, it might be argued that when an automated driving system is always able to perform the dynamic driving task even if a human driver fails to respond appropriately to a request to intervene (SAE Level 4), or may not even involve a human driver at all (SAE Level 5 and to some extent SAE Level 4; see Figure 2.2), the consequences of not using such systems are more dire than any abstract problems that could arise from excessive automation trust in such systems. In any case, it is still a long, winding road until we may one day realize the vision zero of no fatalities or serious injuries in road traffic. Be it with the help of conditional automated driving systems or in any other way, I hope that the current thesis can make a small contribution to this effort.


Bibliography Abe, G., Itoh, M., & Tanaka, K. (2002). Dynamics of drivers’ trust in warning systems. IFAC Proceedings Volumes, 35 (1), 363–368. doi: 10.3182/20020721-6-ES-1901 .01614 Abe, G., & Richardson, J. (2006). Alarm timing, trust and driver expectation for forward collision warning systems. Applied Ergonomics, 37 (5), 577–86. doi: 10.1016/ j.apergo.2005.11.001 Adell, E. (2010). Acceptance of driver support systems. Proceedings of the European Conference on Human Centred Design for Intelligent Transport Systems, 2 , 475– 486. Aeberhard, M., Rauch, S., Bahram, M., Tanzmeister, G., Thomas, J., Pilat, Y., . . . Kaempchen, N. (2015). Experience, results and lessons learned from automated driving on germany’s highways. IEEE Intelligent Transportation Systems Magazine, 7 (1), 42–57. doi: 10.1109/MITS.2014.2360306 Bagheri, N., & Jamieson, G. A. (2004). Considering subjective trust and monitoring behavior in assessing automation-induced ”complacency”. In D. Vicenzi, M. Mouloua, & P. Hancock (Eds.), Human Performance, Situation Awareness and Automation: Current Research and Trends (Vol. 2, pp. 54–59). Mahwah, NJ: Erlbaum. Beggiato, M., Hartwich, F., Schleinitz, K., Krems, J. F., Othersen, I., & PetermannStock, I. (2015, November). What would drivers like to know during automated driving? Information needs at different levels of automation. Paper presented at 7. Tagung Fahrerassistenz. Munich, Germany. Beggiato, M., & Krems, J. F. (2013). The evolution of mental model, trust and acceptance of adaptive cruise control in relation to initial information. Transportation Research Part F , 18 , 47–57. doi: 10.1016/j.trf.2012.12.006 Beller, J., Heesen, M., & Vollrath, M. (2013). Improving the driver-automation interaction: An approach using automation uncertainty. Human Factors, 55 , 1130–1141. doi: 10.1177/0018720813482327


Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B , 57 , 289–300. doi: 10.2307/2346101 Biros, D. P., Daly, M., & Gunsch, G. (2004). The influence of task load and automation trust on deception detection. Group Decision and Negotiation, 13 , 173–189. doi: 10.1023/B:GRUP.0000021840.85686.57 Bliss, J. P., & Acton, S. A. (2003). Alarm mistrust in automobiles: How collision alarm reliability affects driving. Applied Ergonomics, 34 (6), 499–509. doi: 10.1016/ j.apergo.2003.07.003 Brooke, J. (1996). SUS: A ”quick and dirty” usability scale. In P. W. Jordan, B. Thomas, B. A. Weerdmeester, & I. L. McClelland (Eds.), Usability Evaluation in Industry (pp. 189–194). London, United Kingdom: Taylor & Francis. Brown, C. M., & Noy, Y. I. (2004). Behavioural adaptation to in-vehicle safety measures: Past ideas and future directions. In T. Rothengatter & R. D. Huguenin (Eds.), Traffic & Transport Psychology: Proceedings of the ICTTP 2000 (pp. 25–46). Oxford, United Kingdom: Elsevier. Burns, C. M., & Hajdukiewicz, J. (2004). Ecological interface design. Boca Raton, FL: CRC Press. Cahour, B., & Forzy, J.-F. (2009). Does projection into use improve trust and exploration? An example with a cruise control system. Safety Science, 47 (9), 1260– 1270. doi: 10.1016/j.ssci.2009.03.015 Carlson, M. S., Desai, M., Drury, J. L., Kwak, H., & Yanco, H. A. (2014). Identifying factors that influence trust in automated cars and medical diagnosis systems. The Intersection of Robust Intelligence and Trust in Autonomous Systems: Papers from the 2014 AAAI Spring Symposium, 20–27. Chapman, P. R., & Underwood, G. (1998). Visual search of driving situations: Danger and experience. Perception, 27 , 951–964. doi: 10.1068/p270951 Chien, S.-Y., Lewis, M., Hergeth, S., Semnani-Azad, Z., & Sycara, K. (2015). Crosscountry validation of a cultural scale in measuring trust in automation. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 59 , 686–690. doi: 10.1177/1541931215591149 Chien, S.-Y., Lewis, M., Semnani-Azad, Z., & Sycara, K. (2014). An empirical model of cultural factors on trust in automation. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 58 , 859–863. doi: 10.1177/1541931214581181 Chien, S.-Y., Semnani-Azad, Z., Lewis, M., & Sycara, K. (2014). Towards the develop-
ment of an inter-cultural scale to measure trust in automation. In P.-L. P. Rau (Ed.), Lecture Notes in Computer Science (Vol. 8528, pp. 35–46). Cham, Switzerland: Springer. doi: 10.1007/978-3-319-07308-8 4 Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7 , 309–319. doi: 10.1037/1040-3590.7.3 .309 Cocron, P., B¨ uhler, F., Neumann, I., Franke, T., Krems, J., Schwalm, M., & Keinath, A. (2011). Methods of evaluating electric vehicles from a user’s perspective – the MINI E field trial in Berlin. IET Intelligent Transport Systems, 5 , 127-133. doi: 10.1049/iet-its.2010.0126 Cohen, J. (1992). A power primer. Psychological Bulletin, 112 , 155–159. doi: 10.1037/ 0033-2909.112.1.155 Cronbach, L. J., & Meehl, P. E. (1955). Construct validity in psychological tests. Psychological Bulletin, 52 , 281–302. doi: 10.1037/h0040957 ¨ Damb¨ock, D., Farid, M., Toenert, L., & Bengler, K. (2012, May). Ubernahmezeiten beim hochautomatisierten Fahren. Paper presented at 5. Tagung Fahrerassistenz. Munich, Germany. Dzindolet, M. T., Peterson, S. A., Pomranky, R. A., Pierce, L. G., & Beck, H. P. (2003, jun). The role of trust in automation reliance. International Journal of HumanComputer Studies, 58 (6), 697–718. doi: 10.1016/S1071-5819(03)00038-7 Ergoneers. (2013). D-Lab & D-Lab Control (Version 2.5) [Eye tracking system]. Manching, Germany: Ergoneers. Field, A., & Hole, G. (2002). How to design and report experiments. London, United Kingdom: Sage. Fine, G. A., & Holyfield, L. (1996). Secrecy, trust, and dangerous leisure: Generating group cohesion in voluntary organizations. Social Psychology Quarterly, 59 , 22. doi: 10.2307/2787117 Fitts, P. M., Jones, R. E., & Milton, J. L. (1950). Eye movements of aircraft pilots during instrument-landing approaches. Aeronautical Engineering Review , 9 (2), 1–6. Gasser, T. M., Arzt, C., Ayoubi, M., Bartels, A., Eier, J., Flemisch, F., . . . Vogt, W. (2012). Rechtsfolgen zunehmender Fahrzeugautomatisierung (Berichte der Bundesanstalt f¨ ur Straßenwesen Heft F 83). Bergisch Gladbach, Germany: Bundesanstalt f¨ ur Straßenwesen. Gold, C., Berisha, I., & Bengler, K. (2015). Utilization of drivetime – performing non-
driving related tasks while driving highly automated. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 59 , 1666–1670. Gold, C., Damb¨ock, D., Lorenz, L., & Bengler, K. (2013). ”Take over!” How long does it take to get the driver back into the loop? Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 57 , 1938–1942. doi: 10.1177/ 1541931213571433 Gold, C., K¨orber, M., Hohenberger, C., Lechner, D., & Bengler, K. (2015). Trust in automation – before and after the experience of take-over scenarios in a highly automated vehicle. Procedia Manufacturing, 3 (1), 3025–3032. doi: 10.1016/j .promfg.2015.07.847 Gold, C., Lorenz, L., & Bengler, K. (2014, June). Influence of automated brake application on take-over situations in highly automated driving scenarios. Paper presented at the FISITA 2014 World Automotive Congress. Maastricht, The Netherlands. Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. Oxford, England: Wiley. Hayward, J. (1972). Near miss determination through use of a scale of danger (Highway Research Record No. 384). Washington D.C.: Highway Research Board. Helldin, T., Falkman, G., Riveiro, M., & Davidsson, S. (2013). Presenting system uncertainty in automotive UIs for supporting trust calibration in autonomous driving. In Proceedings of the 5th international conference on automotive user interfaces and interactive vehicular applications (pp. 210–217). New York, USA: ACM. doi: 10.1145/2516540.2516554 Henrich, J., Heine, S. J., & Norenzayan, A. (2010). The weirdest people in the world? Behavioral and Brain Sciences, 33 (2-3), 61–83. doi: 10.1017/S0140525X0999152X Hergeth, S., Lorenz, L., Krems, J. F., & Toenert, L. (2015). Effects of take-over requests and cultural background on automation trust in highly automated driving. Proceedings of the 8th International Driving Symposium on Human Factors in Driver Assessment, Training and Vehicle Design, 330–336. Hoff, K. A., & Bashir, M. (2015). Trust in automation: Integrating empirical evidence on factors that influence trust. Human Factors, 57 , 407–434. doi: 10.1177/ 0018720814547570 Hummel, M., Kuehn, M., Bende, J., & Lang, A. (2011). Fahrerassistenzsysteme. Ermittlung des Sicherheitspotenzials auf Basis des Schadengeschehens der Deutschen Versicherer (Forschungsbericht FS 03). Berlin, Germany: Gesamtverband der Deutschen Versicherungswirtschaft e. V.


Inhoff, A. W., & Radach, R. (1998). Definition and computation of oculomotor measures in the study of cognitive processes. In G. Underwood (Ed.), Eye guidance in reading and scene perception (pp. 29–53). Oxford, United Kingdom: Elsevier. doi: 10.1016/B978-008043361-5/50003-1 International Organization for Standardization. (2012). Road vehicles – ergonomic aspects of transport information and control systems – calibration tasks for methods which assess driver demand due to the use of in-vehicle systems (ISO/TS 14198:2012). Geneva, Switzerland: International Organization for Standardization. Itoh, M., Abe, G., & Tanaka, K. (1999). Trust in and use of automation: Their dependence on occurrence patterns of malfunctions. In IEEE SMC’99 Conference Proceedings (Vol. 3, pp. 715–720). IEEE. doi: 10.1109/ICSMC.1999.823316 Jacob, R. J. K., & Karn, K. S. (2003). Eye tracking in human–computer interaction and usability research: Ready to deliver the promises. In J. Hy¨on¨a, R. Radach, & H. Deubel (Eds.), The mind’s eye: Cognitive and applied aspects of eye movement research (1st ed., pp. 573–605). Amsterdam, The Netherlands: Elsevier. Jian, J., Bisantz, A., & Drury, C. (2000). Foundations for an empirically determined scale of trust in automated systems. International Journal of Cognitive Ergonomics, 4 , 53–71. doi: 10.1207/S15327566IJCE0401 04 Kazi, T. A., Stanton, N. A., Walker, G. H., & Young, M. S. (2007). Designer driving: Drivers’ conceptual models and level of trust in adaptive cruise control. International Journal of Vehicle Design, 45 , 339–360. doi: 10.1504/IJVD.2007.014909 Koo, J., Kwac, J., Ju, W., Steinert, M., Leifer, L., & Nass, C. (2014). Why did my car just do that? Explaining semi-autonomous driving actions to improve driver understanding, trust, and performance. International Journal on Interactive Design and Manufacturing, 9 , 269–275. doi: 10.1007/s12008-014-0227-2 Koustanai, A., Cavallo, V., Delhomme, P., & Mas, A. (2012). Simulator training with a forward collision warning system: Effects on driver-system interactions and driver trust. Human Factors, 54 , 709–721. doi: 10.1177/0018720812441796 Koustanai, A., Mas, A., Cavallo, V., & Delhomme, P. (2010). Familiarization with a forward collision warning on driving simulator: Cost and benefit on driver-system interactions and trust. Driving Simulator Conference Europe, 169–179. KPMG International and the Center for Automotive Research. (2012). Self-driving cars : The next revolution. Retrieved from http://www.kpmg.com/US/en/ IssuesAndInsights/ArticlesPublications/Documents/self-driving-cars
-next-revolution.pdf. Kramer, R. M. (1999). Trust and distrust in organizations: Emerging perspectives, enduring questions. Annual Review of Psychology, 50 (1), 569–598. doi: 10.1146/ annurev.psych.50.1.569 Krantz, D. H. (1999). The null hypothesis testing controversy in psychology. Journal of the American Statistical Association, 94 (448), 1372–1381. doi: 10.1080/01621459 .1999.10473888 Krueger, F., McCabe, K., Moll, J., Kriegeskorte, N., Zahn, R., Strenziok, M., . . . Grafman, J. (2007). Neural correlates of trust. Proceedings of the National Academy of Sciences of the United States of America, 104 , 20084–9. doi: 10.1073/pnas.0710103104 Lacson, F. C., Wiegmann, D. A., & Madhavan, P. (2005). Effects of attribute and goal framing on automation reliance and compliance. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 49 , 482–486. doi: 10.1177/ 154193120504900357 Lee, J. D., & Moray, N. (1992). Trust, control strategies and allocation of function in human-machine systems. Ergonomics, 35 , 1243–1270. doi: 10.1080/ 00140139208967392 Lee, J. D., & See, K. (2004). Trust in automation: Designing for appropriate reliance. Human Factors, 46 , 50–80. doi: 10.1518/hfes.46.1.50 Lees, M. N., & Lee, J. D. (2007). The influence of distraction and driving context on driver response to imperfect collision warning systems. Ergonomics, 50 (8), 1264–1286. doi: 10.1080/00140130701318749 Lewandowsky, S., Mundy, M., & Tan, G. P. A. (2000). The dynamics of trust: Comparing humans to automation. Journal of Experimental Psychology: Applied , 6 , 104–123. doi: 10.1037/1076-898X.6.2.104 Llaneras, R. E., Salinger, J., & Green, C. A. (2013). Human factors issues associated with limited ability autonomous driving systems: Drivers’ allocation of visual attention to the forward roadway. Proceedings of the Seventh International Driving Symposium on Human Factors in Driver Assessment, Training, and Vehicle Design, 92–98. Lorenz, L., Kerschbaum, P., & Schumann, J. (2014). Designing take over scenarios for automated driving: How does augmented reality support the driver to get back into the loop? Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 58 , 1681–1685. doi: 10.1177/1541931214581351


Louw, T., Kountouriotis, G., Carsten, O., & Merat, N. (2015). Driver inattention during vehicle automation: How does driver engagement affect resumption of control? 4th International Conference on Driver Distraction and Inattention, 1–12. doi: 10.13140/RG.2.1.2017.0089 Louw, T., Merat, N., & Jamson, H. (2015). Engaging with highly automated driving: to be or not to be in the loop? Proceedings of the Eighth International Driving Symposium on Human Factors in Driver Assessment, Training, and Vehicle Design, 189–195. MacMillan, J., Entin, E. B., & Serfaty, D. (1994). Operator reliance on automated support for target recognition. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 38 , 1285–1289. doi: 10.1177/154193129403801908 Madhani, K., Khasawneh, M. T., Kaewkuekool, S., Gramopadhye, A. K., & Melloy, B. J. (2002). Measurement of human trust in a hybrid inspection for varying error patterns. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 46 , 418–422. doi: 10.1177/154193120204600343 Manca, L., Happee, R., & de Winter, J. C. (2015). Visual displays for automated driving: a survey. In Automotive ui 2015 - workshop on adaptive ambient in-vehicle displays and interactions (waadi). Nottingham, United Kingdom. Marinik, A., Bishop, R., Fitchett, V., Morgan, J. F., Trimble, T. E., & Blanco, M. (2014). Human factors evaluation of level 2 and level 3 automated driving concepts: Concepts of operation. (Report No. DOT HS 812 044). Washington, DC: National Highway Traffic Safety Administration. Master, R., Jiang, X., Khasawneh, M. T., Bowling, S. R., Grimes, L., Gramopadhye, A. K., & Melloy, B. J. (2005). Measurement of trust over time in hybrid inspection systems. Human Factors and Ergonomics in Manufacturing, 15 , 177–196. doi: 10.1002/hfm.20021 Merat, N., Jamson, A. H., Lai, F. C. H., Daly, M., & Carsten, O. M. J. (2014). Transition to manual: Driver behaviour when resuming control from a highly automated vehicle. Transportation Research Part F , 26 , 1–9. doi: 10.1016/j.trf.2014.05.006 Merat, N., & Jamson, H. A. (2009). How do drivers behave in a highly automated car? Proceedings of the Fifth International Driving Symposium on Human Factors in Driver Assessment, Training and Vehicle Design, 514–521. Merlo, J. L., Wickens, C. D., & Yeh, M. (1999). Effect of reliability on cue effectiveness and display signaling (Tech. Rep.). Meyer, G., & Deix, S. (2014). Research and innovation for automated driving in germany

98

and europe. In G. Meyer & S. Beiker (Eds.), Road vehicle automation (pp. 71–81). Cham, Switzerland: Springer. doi: 10.1007/978-3-319-05990-7 7 Montgomery, D. A., & Sorkin, R. D. (1996). Observer sensitivity to element reliability in a multielement visual display. Human Factors, 38 , 484–494. doi: 10.1518/ 001872096778702024 Moray, N., & Inagaki, T. (1999). Laboratory studies of trust between humans and machines in automated systems. Transactions of the Institute of Measurement and Control , 21 , 203–211. doi: 10.1177/014233129902100408 Mourant, R. R., & Rockwell, T. H. (1972). Strategies of visual search by novice and experienced drivers. Human Factors, 14 , 325–335. doi: 10.1177/001872087201400405 Muir, B. M., & Moray, N. (1996). Trust in automation. Part II. Experimental studies of trust and human intervention in a process control simulation. Ergonomics, 39 , 429–460. doi: 10.1080/00140139608964474 National Highway Traffic Safety Administration. (2013). Preliminary statement of policy concerning automated vehicles (NHTSA 14-13). Washington, DC: U.S. Department of Transportation. National Transportation Safety Board. (2014). Descent below visual glidepath and impact with seawall, Asiana Airlines Flight 214, Boeing 777-200ER, HL7742, San Francisco, California July 6, 2013 (Aircraft Accident Report NTSB/AAR-14/01). Washington, DC.: National Transportation Safety Board. Naujoks, F., Purucker, C., Neukum, A., Wolter, S., & Steiger, R. (2015). Controllability of partially automated driving functions – does it matter whether drivers are allowed to take their hands of the steering wheel? Transportation Research Part F , 35 , 185–198. doi: 10.1016/j.trf.2015.10.022 Neukum, A., L¨ ubbeke, T., Kr¨ uger, H.-P., Mayser, C., & Steinle, J. (2008). ACC-Stop & Go: Fahrerverhalten an funktionalen Systemgrenzen. 5. Workshop Fahrerassistenzsysteme, 2008 , 141–150. Neyedli, H. F., Hollands, J. G., & Jamieson, G. A. (2011). Beyond identity: Incorporating system reliability information into an automated combat identification system. Human Factors, 53 , 338–355. doi: 10.1177/0018720811413767 Parasuraman, R., & Manzey, D. H. (2010). Complacency and bias in human use of automation: An attentional integration. Human Factors, 52 , 381–410. doi: 10.1177/0018720810376055 Parasuraman, R., & Riley, V. (1997). Humans and automation: Use, misuse, disuse, abuse. Human Factors, 39 , 230–253. doi: 10.1518/001872097778543886

99

Parasuraman, R., Sheridan, T., & Wickens, C. (2000). A model for types and levels of human interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, 30 , 286–297. doi: 10.1109/ 3468.844354 Payre, W., Cestac, J., & Delhomme, P. (2014). Intention to use a fully automated car: Attitudes and a priori acceptability. Transportation Research Part F , 27 , 252–263. doi: 10.1016/j.trf.2014.04.009 Payre, W., Cestac, J., & Delhomme, P. (2015). Fully automated driving: Impact of trust and practice on manual control recovery. Human Factors, 58 , 229–241. doi: 10.1177/0018720815612319 Petermann-Stock, I., Hackenberg, L., Muhr, T., & Mergl, C. (2013, November). Wie ¨ lange braucht der Fahrer? Eine Eine Analyse zu Ubernahmezeiten aus verschiedenen Nebent¨atigkeiten w¨ahrend einer hochautomatisierten Staufahrt. Paper presented at 6. Tagung Fahrerassistenz. Munich, Germany. Pina, P., Donmez, B., & Cummings, M. L. (2008). Selecting metrics to evaluate human supervisory control applications (HAL Report HAL2008-04). Cambrige, MA: MIT Humans and Automation Laboratory. Poole, A., & Ball, L. J. (2005). Eye tracking in human-computer interaction and usability research: Current status and future prospects. In C. Ghaoui (Ed.), Encyclopedia of human-computer interaction (Vol. 2, pp. 211–219). Hershey, PA: Idea Group. doi: 10.4018/978-1-59140-562-7 Radlmayr, J., & Bengler, K. (2015). Literaturanalyse und Methodenauswahl zur Gestaltung von Systemen zum hochautomatisierten Fahren (FAT-Schriftenreihe Band 276). Berlin, Germany: Forschungsvereinigung Automobiltechnik e. V. Radlmayr, J., Gold, C., Lorenz, L., Farid, M., & Bengler, K. (2014). How traffic situations and non-driving related tasks affect the take-over quality in highly automated driving. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 58 , 2063–2067. doi: 10.1177/1541931214581434 Rajaonah, B., Anceaux, F., & Vienne, F. (2006). Study of driver trust during cooperation with adaptive cruise control. Le Travail Humain, 2 , 99–127. Reips, U.-D., & Funke, F. (2008). Interval-level measurement with visual analogue scales in internet-based research: VAS generator. Behavior Research Methods, 40 , 699–704. doi: 10.3758/BRM.40.3.699 Riley, V. (1989). A general model of mixed-initiative human-machine systems. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 33 , 124–128.

100

doi: 10.1177/154193128903300227 Ritz, F. (2004). Einflussfaktoren auf die Teamleistung im interkulturellen Vergleich (Doctoral Dissertation, Technische Universit¨at Berlin, Germany). Retrieved from http://dx.doi.org/10.14279/depositonce-907 Rudin-Brown, C. M., & Parker, H. A. (2004). Behavioural adaptation to adaptive cruise control (ACC): Implications for preventive strategies. Transportation Research Part F , 7 , 59–76. doi: 10.1016/j.trf.2004.02.001 Rupp, J. D., & King, A. G. (2010). Autonomous driving – a practical roadmap (SAE Technical Paper 2010-01-2335). Warrendale, PA: SAE International. doi: 10.4271/ 2010-01-2335 Ruscio, D., Ciceri, M. R., & Biassoni, F. (2015). How does a collision warning system shape driver’s brake response time? The influence of expectancy and automation complacency on real-life emergency braking. Accident Analysis & Prevention, 77 , 72–81. doi: 10.1016/j.aap.2015.01.018 Saffarian, M., Happee, R., & Winter, J. D. (2012). Why do drivers maintain short headways in fog? A driving-simulator study evaluating feeling of risk and lateral control during automated and manual car following. Ergonomics, 55 , 971–985. doi: 10.1080/00140139.2012.691993 Salmon, P. M., Regan, M., Lenne, M. G., Stanton, N. A., & Young, K. (2007). Work domain analysis and intelligent transport systems: Implications for vehicle design. International Journal of Vehicle Design, 45 , 426. doi: 10.1504/IJVD.2007.014914 Sanchez, J. (2006). Factors that affect trust and reliance on an automated aid (Doctoral dissertation, Georgia Institute of Technology). Retrieved from http:// hdl.handle.net/1853/10485 Sanchez, J., Rogers, W. A., Fisk, A. D., & Rovira, E. (2014). Understanding reliance on automation: Effects of error type, error distribution, age and experience. Theoretical Issues in Ergonomics Science, 15 , 134–160. doi: 10.1080/ 1463922X.2011.611269 Scherer, S., Dettmann, A., Hartwich, F., Pech, T., Bullinger, A. C., Krems, J. F., & Wanielik, G. (2015, November). How the driver wants to be driven - modelling driving styles in highly automated driving. Paper presented at 7. Tagung Fahrerassistenz. Munich, Germany. Sheridan, T. B., & Parasuraman, R. (2005). Human-automation interaction. Reviews of Human Factors and Ergonomics, 1 (1), 89–129. doi: 10.1518/155723405783703082 Sheridan, T. B., & Verplank, W. L. (1978). Human and computer control of under-

101

sea teleoperators (MIT Grant N0001477C0256). Cambridge, MA: Massachusetts Institute of Technology. Sherman, P. J., Helmreich, R. L., & Merritt, A. C. (1997). National culture and flight deck automation: Results of a multination survey. The International Journal of Aviation Psychology, 7 , 311–329. doi: 10.1207/s15327108ijap0704 Skitka, L. J., Mosier, K. L., Burdick, M., & Rosenblatt, B. (2000). Automation bias and errors: Are crews better than individuals? The International Journal of Aviation Psychology, 10 , 85–97. doi: 10.1207/S15327108IJAP1001 5 Society of Automotive Engineers. (2014). Taxonomy and definitions for terms related to on-road motor vehicle automated driving systems (SAE Standard J3016 201401). Warrendale, PA: SAE International. Spain, R. D., Bustamante, E. A., & Bliss, J. P. (2008). Towards an empirically developed scale for system trust: Take two. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 52 , 1335–1339. doi: 10.1177/ 154193120805201907 Strand, N., Nilsson, J., Karlsson, I. C. M., & Nilsson, L. (2014). Semi-automated versus highly automated driving in critical situations caused by automation failures. Transportation Research Part F , 27 , 218–228. doi: 10.1016/j.trf.2014.04.005 Trimble, T. E., Bishop, R., Morgan, J. F., & Blanco, M. (2014). Human factors evaluation of level 2 and level 3 automated driving concepts: Past research, state of automation technology, and emerging system concepts (Report No. DOT HS 812 043). Washington, DC: National Highway Traffic Safety Administration. Vella, M., & Steinmetz, K. (2016). Why you shouldn’t be allowed to drive. Time Europe, 187 (8), 52–57. Verband der Automobilindustrie e.V. (2015). Automation – from driver assistance systems to automated driving (VDA-Brochure). Berlin, Germany: Verband der Automobilindustrie e.V. Retrieved from https://www.vda.de/dam/vda/ publications/2015/automation.pdf Verberne, F. M. F., Ham, J., & Midden, C. J. H. (2012). Trust in smart systems sharing driving goals and giving information to increase trustworthiness and acceptability of smart systems in cars. Human Factors, 54 , 799–810. doi: 10.1177/0018720812443825 Vogel, K. (2003). A comparison of headway and time to collision as safety indicators. Accident Analysis & Prevention, 35 , 427–433. doi: 10.1016/S0001-4575(02)00022 -2

102

Walker, G. H., Stanton, N. A., & Salmon, P. (2016). Trust in vehicle technology. International Journal of Vehicle Design, 70 , 157. doi: 10.1504/IJVD.2016.074419 Wang, L., Jamieson, G. A., & Hollands, J. G. (2009). Trust and reliance on an automated combat identification system. Human Factors, 51 , 281–291. doi: 10.1177/0018720809338842 Waytz, A., Heafner, J., & Epley, N. (2014). The mind in the machine: Anthropomorphism increases trust in an autonomous vehicle. Journal of Experimental Social Psychology, 52 , 113–117. doi: 10.1016/j.jesp.2014.01.005 Weinberger, M., Winner, H., & Bubb, H. (2001). Adaptive cruise control field operational test – the learning phase. JSAE Review , 22 , 487–494. doi: 10.1016/S0389-4304(01) 00142-4 Wickens, C. D., Hollands, J. G., Banbury, S., & Parasuraman, R. (2013). Engineering psychology and human performance (4th ed.). London, United Kingdom: Pearson. Wickens, C. D., & Xu, X. (2002). Automation trust, reliability and attention (Technical Report No. AHFD-02-14/MAAD-02-2). Savoy, IL: University of Illinois at UrbanaChampaign. Wilcox, R. R. (1998). How many discoveries have been lost by ignoring modern statistical methods? American Psychologist, 53 , 300–314. doi: 10.1037/0003-066X.53.3.300 Wilcox, R. R. (2012). Introduction to robust estimation and hypothesis testing (1st ed.). Waltham, MA: Elsevier. Wilcox, R. R., & Keselman, H. J. (2003). Modern robust data analysis methods: Measures of central tendency. Psychological Methods, 8 , 254–274. doi: 10.1037/ 1082-989X.8.3.254 Winner, H., Danner, B., & Steinle, J. (2009). Adaptive cruise control. In H. Winner, S. Hakuli, & G. Wolf (Eds.), Handbuch Fahrerassistenzsysteme (1st ed., pp. 478– 521). Wiesbaden, Germany: Vieweg+Teubner. doi: 10.1007/978-3-8348-9977-4 33 Winner, H., Hakuli, S., & Wolf, G. (Eds.). (2009). Handbuch Fahrerassistenzsysteme (1st ed.). Wiesbaden, Germany: Vieweg+Teubner. doi: 10.1007/978-3-8348-9977-4 Yuen, K. K. (1974). The two-sample trimmed t for unequal population variances. Biometrika, 61 , 165–170. doi: 10.1093/biomet/61.1.165 Zeeb, K., Buchner, A., & Schrauf, M. (2015). What determines the take-over time? An integrated model approach of driver take-over after automated driving. Accident Analysis & Prevention, 78 , 212–221. doi: 10.1016/j.aap.2015.02.023

103

Appendix


Table A.1
Automation Trust Questionnaire Items (each original version item is followed by its adapted German version)

1. GPS has the functionality I need
   Das System hat die Funktionalität, die ich benötige
2. GPS improves my performance
   Das System verbessert meine Leistung
3. GPS enables me to accomplish tasks more quickly
   Das System versetzt mich in die Lage, Aufgaben rascher zu erledigen
4. GPS has the features required to help me complete my tasks
   Das System hat die technischen Merkmale, die erforderlich sind, um meine Aufgaben zu erledigen
5. The advice GPS provided is as good as that which a highly competent person could produce
   Die vom System bereitgestellten Informationen sind so gut wie diejenigen, die eine hochkompetente Person liefern kann
6. GPS correctly uses the information I provided
   Das System nutzt die von mir bereitgestellten Informationen korrekt
7. GPS provides competent guidance
   Das System ist eine kompetente Orientierungshilfe
8. My interaction with GPS is clear and understandable
   Meine Interaktion mit dem System ist klar und nachvollziehbar
9. The GPS is user-friendly to interact with
   Das System ist in Bezug auf die Interaktion benutzerfreundlich
10. GPS uses appropriate methods to reach decisions
    Das System nutzt angemessene Verfahren, um zu Entscheidungen zu kommen
11. GPS provides me with timely information
    Das System stellt mir zeitnahe Informationen bereit
12. The advice GPS provided is based on what is important to me
    Die vom System bereitgestellten Informationen basieren auf dem, was für mich wichtig ist
13. I feel secure in using the information from GPS which is backed by a certified system, such as Google Maps
    Bei der Verwendung der Informationen vom System, die sich auf ein zertifiziertes System, wie zum Beispiel Google Maps stützen, fühle ich mich sicher
14. I am confident about the performance of GPS
    Ich habe Vertrauen in die Leistung des Systems
15. When an emergent issue or problem arises, I would feel comfortable depending on the information provided by GPS
    Wenn ein kritisches Problem auftaucht, kann ich mich auf die vom System bereitgestellten Informationen verlassen
16. I can always rely on GPS to ensure my performance
    Ich kann mich immer auf das System verlassen, um meine Leistung sicherzustellen
17. GPS generates advice based on my needs
    Das System generiert Informationen, die auf meinen Bedürfnissen basieren

Note. Instructions read "Bitte markieren Sie auf der folgenden Skala jeweils die Ausprägung, die Ihr Gefühl oder Ihren Eindruck bezüglich der Automatisierung am besten beschreibt. Als 'Automatisierung' ist hier das System zum hochautomatisierten Fahren zu verstehen", which translates to "On the following scale, please mark the point which best describes your feeling or your impression of the automation. 'Automation' in this context stands for the conditional automated driving system". Items were rated on a 7-point scale ("1 = stimme überhaupt nicht zu, 2 = stimme nicht zu, 3 = stimme eher nicht zu, 4 = stimme weder zu noch nicht zu, 5 = stimme eher zu, 6 = stimme zu, 7 = stimme vollkommen zu", which translates to "1 = strongly disagree, 2 = disagree, 3 = slightly disagree, 4 = neither agree nor disagree, 5 = slightly agree, 6 = agree, 7 = strongly agree") and averaged, with higher scores indicating higher automation trust.
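To make the scoring rule in the note concrete, the following minimal Python sketch computes the questionnaire score as the mean of the 17 item ratings. The function name, the validation checks, and the example ratings are illustrative assumptions and are not part of the original questionnaire materials.

from statistics import mean

def automation_trust_score(item_ratings):
    """Return the automation trust score as the mean of the 17 item ratings.

    item_ratings: ratings from 1 (strongly disagree) to 7 (strongly agree),
    one per questionnaire item; higher scores indicate higher automation trust.
    """
    ratings = list(item_ratings)
    if len(ratings) != 17:
        raise ValueError("Expected ratings for all 17 questionnaire items.")
    if any(not 1 <= r <= 7 for r in ratings):
        raise ValueError("Each rating must lie on the 7-point scale (1-7).")
    return mean(ratings)

# Example: a participant who mostly agrees with the items (hypothetical data)
example_ratings = [6, 5, 6, 5, 5, 6, 6, 5, 6, 5, 6, 5, 4, 6, 5, 5, 6]
print(automation_trust_score(example_ratings))  # approximately 5.41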


Sebastian Hergeth

Curriculum Vitae

Personal Information
Name: Sebastian Hergeth
Date of birth: 23.06.1986
Place of birth: Munich

Education
Since 04/2014: Doctoral studies, BMW Group, Munich; Technische Universität Chemnitz, Chemnitz
10/2011 to 09/2013: Master's program in Economic, Organizational, and Social Psychology, Ludwig-Maximilians-Universität, Munich
09/2008 to 07/2011: Bachelor's program in Psychology, Paris-Lodron-Universität, Salzburg
09/2006 to 08/2008: Diploma program in Political Science, Hochschule für Politik, Munich (not completed)

Civilian Service
09/2005 to 06/2006: MOP Integrativer Jugendtreff 27 e.V., Munich; lunchtime and homework supervision as well as recreational activities for adolescents with and without disabilities

School Education
09/1992 to 06/2005: General higher education entrance qualification (Abitur), Oskar-von-Miller-Gymnasium, Munich; Grundschule an der Simmernstraße, Munich

Internships, Working Student Position, and Final Thesis
03/2013 to 08/2013: Master's thesis, BMW Group Forschung und Technik GmbH
11/2012 to 03/2013: Working student position, Bertrandt Ingenieurbüro GmbH
07/2012 to 10/2012: Internship, BMW Group Forschung und Technik GmbH
07/2010 to 10/2010: Internship, Paris-Lodron-Universität, Salzburg

Other Activities
05/2014 to date: Private pilot license PPL-A / JAR-FCL
06/2012 to date: Member of the European Association for Aviation Psychology (EAAP)
08/2010 to date: Member of Fliegerverein München e.V.
09/2016 to date: Member of Verein gegen betrügerisches Einschenken e.V.

Publications
Chien, S.-Y., Lewis, M., Hergeth, S., Semnani-Azad, Z., & Sycara, K. (2015). Cross-country validation of a cultural scale in measuring trust in automation. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 59, 686–690. doi:10.1177/1541931215591149
Hergeth, S., Lorenz, L., & Krems, J. F. (2015). What did you expect? Effects of prior familiarization with take-over requests during conditional automated driving on take-over performance and automation trust. Manuscript submitted for publication.
Hergeth, S., Lorenz, L., Krems, J. F., & Broy, N. (2016). Effects of purpose, process and performance information about conditional automated driving systems on automation trust and perceived usability. Manuscript submitted for publication.
Hergeth, S., Lorenz, L., Krems, J. F., & Toenert, L. (2015). Effects of take-over requests and cultural background on automation trust in highly automated driving. Proceedings of the 8th International Driving Symposium on Human Factors in Driver Assessment, Training and Vehicle Design, 331–337.
Hergeth, S., Lorenz, L., Vilimek, R., & Krems, J. F. (2016). Keep your scanners peeled: Gaze behavior as a measure of automation trust during highly automated driving. Human Factors, 58, 509–519. doi:10.1177/0018720815625744
Kerschbaum, P., Lorenz, L., Hergeth, S., & Bengler, K. (2015). Designing the human-machine interface for highly automated cars: Challenges, exemplary concepts and studies. 2015 IEEE International Workshop on Advanced Robotics and its Social Impacts (ARSO), 1–6. doi:10.1109/ARSO.2015.7428223
Lorenz, L., & Hergeth, S. (2015). Einfluss der Nebenaufgabe auf die Überwachungsleistung beim teilautomatisierten Fahren [Influence of the secondary task on monitoring performance during partially automated driving]. Paper presented at 8. VDI-Tagung "Der Fahrer im 21. Jahrhundert", Braunschweig, Germany.

Reviewer
2016: 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN 2016)
2015: IEEE Transactions on Affective Computing (TAC)
2015: 7th International Conference on Automotive User Interfaces and Interactive Vehicular Applications (AutomotiveUI '15)

Munich, 11.09.2016
Sebastian Hergeth

Statutory Declaration
I hereby declare that I have written this thesis independently, that it has not been submitted elsewhere for examination purposes, and that I have used no aids other than those indicated. All knowingly used text excerpts, quotations, or content from other authors have been explicitly marked as such.
Munich, 01.05.2016

Sebastian Hergeth