
Proceedings of the Human Factors and Ergonomics Society 58th Annual Meeting - 2014


Designing Unambiguous Auditory Crash Warning Systems

Bridget A. Lewis, Jesse L. Eisert, Daniel M. Roberts, Carryl L. Baldwin
George Mason University

A series of three studies examined the acoustic characteristics that contribute to a sound being unambiguously perceived as an urgent alarm within a vehicle context. In Experiment 1, participants sorted a variety of sounds modeled after sounds currently in use in driver-vehicle interfaces (DVIs) into categories indicating highly critical warnings and alerts (or “alarms”), vehicle status sounds, or in-vehicle social notifications. Results indicated that four criteria (peak-to-total time ratio, interburst interval, number of harmonics, and base frequency) explained 61% of the variance in categorization. From these criteria, cutoffs were determined and manipulated to create stimuli for an initial validation study. Experiment 2 results indicated that these criteria remained robust even when examined in a larger stimulus set and with different participants. Finally, Experiment 3 investigated rapid categorization under divided attention. Participants categorized alerts while driving in a desktop driving simulator and completing a secondary distracting task. Results indicate that the previously defined parameter criteria and cutoffs remain applicable in a higher-context setting and under load. Furthermore, sounds that met all criteria were responded to more quickly than those that met only some or no criteria, indicating that these criteria can be used to create sounds which are unambiguous and intuitive in an in-vehicle driving context.

Copyright 2014 Human Factors and Ergonomics Society. DOI 10.1177/1541931214581437

INTRODUCTION

As we move towards a future in which our cars will be able to convey information to us as readily as our phones or computers, it becomes imperative that we focus our research efforts on the design of usable, intuitive, and unambiguous driver-vehicle interfaces (DVIs). The auditory modality has long been used to convey anything from warnings (collision or deviation warnings both inside and outside of the vehicle) to status messages (door open or lights on) to more recent social notifications (email or phone chimes from Bluetooth-connected devices), and is an ideal modality in which to present a wide array of alerts, due mainly to its flexibility. Auditory alerts can vary along a plethora of parameters, including frequency, rhythm, duration, intensity, timbre, and number of harmonics. Most of these parameters are easily detected by the human auditory system and may already have inherent connections to situations of varying urgency, making them easy to quickly identify and react to in an appropriate manner. However, it is also possible that sounds which are ambiguous may increase workload, making the driving task itself harder or more dangerous, and may be responded to inappropriately (Wiese & Lee, 2001).

Urgency mapping has long been used as a method of ensuring appropriate matching between situational urgency and the urgency inherent in a given alert (Hellier & Edworthy, 1999). Urgency mapping has been applied across many domains, though it has increasingly been utilized for automotive alert design (Baldwin et al., 2012; Lewis, Penaranda, Roberts, & Baldwin, 2013; Marshall, Lee, & Austria, 2007; D. Marshall, Lee, & Austria, 2001; McKeown & Isherwood, 2007; Politis, Brewster, & Pollick, 2013). In almost all cases, urgency mapping has been used to evaluate the urgency of specific alerts using various scaling methods. These methods are particularly effective for the evaluation of the effect of systematic changes in sound parameters on perceived urgency. However, rarely have currently in-use alerts been evaluated for intuitiveness.

The current experiments examined various alerting sounds currently in use or in development, and assessed the specific acoustic components which made them more or less likely to be categorized as either alarming sounds, vehicle status sounds or social media notifications. It was particularly important to assess those components which conveyed high urgency to a large portion of the population.

EXPERIMENT 1 - PERCEPTUAL SORT TASK

Downloaded from pro.sagepub.com by guest on August 26, 2015

METHODS

Participants

Participants in this experiment were 21 graduate and undergraduate students (7 male, average age = 21.5 years) recruited through the university psychological research system or via word of mouth.

Stimuli

Stimuli for the preliminary perceptual sort task included 29 sounds created using Adobe Audition CS5.5 and Audacity software to closely resemble sounds that may currently be found in various DVIs. Stimuli from a variety of categories including forward collision warning, lane departure warning, park-assist, back-up assist, door open, seatbelt reminder and fatigue alerts were included. Stimuli were equated for loudness in Adobe Audition and were presented at 84 dB sound pressure level (SPL). Stimuli were embedded in the task screen in the same order for all participants, but all participants were given the option to explore stimuli in any order they chose.

Apparatus and Procedure

Stimuli were presented to subjects embedded in a Microsoft PowerPoint slide. Each stimulus was represented by a numbered button and could be played by double-clicking that button. Buttons could then be dragged into any of four categories: “Warnings”, “Alerts”, “Status Notifications” and “Social Notifications”. Each category also had a box for a “Prototype” and two boxes for participants to indicate why or by what characteristics they had classified each category, and to indicate a “Best Urgency Level”.

Upon arrival, subjects gave written informed consent and completed a short demographic survey. Subjects were given a short practice task using sounds not included in the experimental session. Subjects were asked to play each sound and then to categorize it into one of the four categories. Subjects were told that “Warnings” should indicate time-critical, collision warning sounds, “Alerts” should indicate non-critical alerting sounds including things like a lane departure, “Status Notifications” should include indications of the car’s status including low tire pressure or windshield wiper fluid, and “Social Notifications” should include sounds played by the car’s media system and indicate social media updates including Facebook or email updates. Subjects were allowed to play each sound as many times as they liked. After sorting, subjects were asked to choose a sound prototypical of each category, to indicate why they grouped sounds the way that they did, and to give a number between 1 and 100 that would indicate the best urgency level for each category. The experiment took around half an hour.

RESULTS

Results were analyzed in terms of the percentage of times each sound was placed in a given category. Preliminary analysis of categorization and explanation data indicated that subjects did not reliably or consistently distinguish between warnings and alerts, and instead conceptualized both as alarming sounds. We therefore combined the warning and alert categories into one “Alarm” category.

Table 1. Criteria determined by regression and their respective cutoff values

Criteria                          Cutoff
1. Peak to Total Time Ratio       ≥ .70
2. Interburst Interval (IBI)      ≤ 125 ms
3. Number of Harmonics            ≥ 3
4. Base Frequency                 ≥ 1000 Hz

Using a combination of backwards and stepwise regression analyses predicting warning group membership, four main criteria (explaining 61% of the variance, R = .802, adj. R² = .612) were developed. These criteria and their respective cutoff values, which represent the thresholds at which sounds are categorized as alarming, are presented in Table 1.
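One way to read the cutoffs in Table 1 is as a simple checklist applied to a sound's measured parameters. The sketch below, in Python, is an illustration rather than the authors' analysis code: the `Sound` container, the function names, and the assumption that total time equals onset plus peak plus offset are all hypothetical. It also assumes that "criteria met in order" means counting consecutive criteria starting from criterion 1, which the paper does not spell out.

```python
from dataclasses import dataclass

# Cutoffs copied from Table 1 of the paper.
MIN_PEAK_TO_TOTAL_RATIO = 0.70
MAX_INTERBURST_INTERVAL_MS = 125.0
MIN_HARMONICS = 3
MIN_BASE_FREQUENCY_HZ = 1000.0

# Group labels as they appear on the x-axis of the paper's figures.
GROUP_LABELS = [
    "No Criteria",
    "Criteria 1",
    "Criteria 1 & 2",
    "Criteria 1, 2, & 3",
    "Meets All Criteria",
]


@dataclass
class Sound:
    """Hypothetical container for one alert's measured parameters."""
    peak_to_total_ratio: float
    interburst_interval_ms: float
    n_harmonics: int
    base_frequency_hz: float


def peak_to_total_ratio(onset_ms, peak_ms, offset_ms):
    """Peak time divided by total time (assumed: onset + peak + offset)."""
    total_ms = onset_ms + peak_ms + offset_ms
    if total_ms <= 0:
        raise ValueError("total duration must be positive")
    return peak_ms / total_ms


def criteria_flags(sound):
    """Return one boolean per criterion, in Table 1 order."""
    return [
        sound.peak_to_total_ratio >= MIN_PEAK_TO_TOTAL_RATIO,
        sound.interburst_interval_ms <= MAX_INTERBURST_INTERVAL_MS,
        sound.n_harmonics >= MIN_HARMONICS,
        sound.base_frequency_hz >= MIN_BASE_FREQUENCY_HZ,
    ]


def criteria_met_in_order(sound):
    """Count criteria met consecutively, starting from criterion 1."""
    count = 0
    for met in criteria_flags(sound):
        if not met:
            break
        count += 1
    return count


def group_label(sound):
    """Map a sound to the figure group it would fall into (assumed rule)."""
    return GROUP_LABELS[criteria_met_in_order(sound)]
```

For example, a sound with 10 ms onset and offset ramps around 180 ms at full intensity has a ratio of 0.9; with a 100 ms interburst interval, four harmonics, and a 1500 Hz base frequency, it would fall in the "Meets All Criteria" group under these assumptions.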

Figure 1. Percentage alarm, status and social categorization by criteria met in order; error bars reflect standard error of the mean. [Bar chart. Y axis: Percentage Categorization (0% to 100%). X axis: No Criteria; Criteria 1; Criteria 1 & 2; Criteria 1, 2, & 3; Meets All Criteria. Series: Alarm, Status, Social.]
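The percentages plotted in this figure are described as the share of times each sound was placed in each category, with warnings and alerts pooled into a single alarm category. A minimal Python sketch of that bookkeeping follows; the function name and the input format (one original category name per participant) are assumptions made for illustration, not the authors' procedure.

```python
from collections import Counter

# Warnings and alerts are pooled into "Alarm", as in the paper's analysis.
MERGED = {
    "Warnings": "Alarm",
    "Alerts": "Alarm",
    "Status Notifications": "Status",
    "Social Notifications": "Social",
}


def categorization_percentages(assignments):
    """Percentage of sorts placing one sound in each merged category.

    `assignments` holds one original category name per participant
    (a hypothetical input format).
    """
    if not assignments:
        raise ValueError("need at least one assignment")
    counts = Counter(MERGED[name] for name in assignments)
    total = len(assignments)
    return {
        category: 100.0 * counts.get(category, 0) / total
        for category in ("Alarm", "Status", "Social")
    }
```

With four made-up sorts of one sound, `["Warnings", "Alerts", "Alerts", "Status Notifications"]`, the function reports 75% alarm, 25% status, and 0% social.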


Criteria are listed in order of decreasing importance, as determined by regression parameters. Peak-to-total time ratio refers to a property derived in our lab to represent (particularly) longer onset or offset times. It is defined as the peak time, the length of time the sound is at its full intensity, divided by the total time, the length of time the sound is playing at any intensity level. Under this definition, a sound with no onset or offset time would have a ratio of 1, while a sound with an onset or offset time that spans the entire sound length would have a ratio approaching or equal to 0. Figure 1 shows the extent to which these criteria and cutoffs predict category membership.

DISCUSSION

Experiment 1 provided preliminary data on which aspects of a sound are important for classification as time-critical, alarming sounds. However, criteria were based on the same data from which the classifications had been made, and very few sounds met all or even three of our four cutoff criteria. Therefore, we performed a small validation test in order to further examine the ability of our criteria to adequately predict categorization with a larger sample of sounds and new participants.

EXPERIMENT 2 - PERCEPTUAL SORT EXPANSION AND VALIDATION

METHODS

Participants

Participants in Experiment 2 were 15 individuals recruited through the university research pool or via word of mouth. As this was a short validation, demographic data were not collected for this sample.

Stimuli

Stimuli for Experiment 2 were 52 sounds. Sounds included those used in Experiment 1 (though now equated to a total duration of 1500 ms) and additional stimuli including some with added harmonics.

Apparatus and Procedure

Apparatus and procedure were identical to those used in Experiment 1.

RESULTS

Results were analyzed identically to Experiment 1. Using the same criteria, a standard linear regression was able to account for 66% of the variance (R = .810, adj. R² = .656) for predicting alarm categorization. Figure 2 shows categorization by criteria met.

Figure 2. Percentage alarm, status and social categorization by criteria met in order; error bars reflect standard error of the mean. [Bar chart. Y axis: Percentage Categorization (0% to 100%). X axis: No Criteria; Criteria 1; Criteria 1 & 2; Criteria 1, 2, & 3; Meets All Criteria. Series: Alarm, Status, Social.]

DISCUSSION

As indicated by our results, our criteria and cutoffs are relatively effective at predicting alarm categorization, hold across subjects, and are more robust when more sounds are included in the set. However, these experiments were performed with relatively little context and no distraction, and subjects were allowed to hear and recategorize sounds as often as they desired within each session. As such, it was important that we test our criteria with participants who were driving and completing a secondary task, requiring them to categorize under divided attention.

EXPERIMENT 3 - RAPID CATEGORIZATION UNDER DIVIDED ATTENTION


METHODS

Participants

Participants in Experiment 3 were 22 undergraduate and graduate students (13 male, average age = 25) recruited through the university psychological research pool or via word of mouth.

Stimuli

Stimuli included a subset of 26 sounds from Experiment 2 that had shown relatively consistent categorization across subjects and included a range of alarm, status and social sounds.

Apparatus and Procedure

Subjects drove a simulated course using a RealTime Technologies desktop driving simulator equipped with a Logitech Driving Force GT steering wheel. Ambient sound was presented via speakers at 60 dB SPL and stimuli were presented via a second set of speakers at 75 dB SPL. While driving, participants were required to complete a visual n-back task shown on a small touchscreen monitor slightly to their right, responding positively or negatively to each stimulus by pressing buttons on the right side of the steering wheel.

Figure 3. Experimental Setup

At varying intervals, sound stimuli were presented instead of n-back stimuli, and subjects were asked to respond to the sound by categorizing it as an alarm by pressing the brake pedal, as a status notification by pressing the left bumper (L3) on the steering wheel, or as a social notification by pressing the plus button on the steering wheel (see Figure 3). All stimuli were presented at least twice, and average categorization and response times are reported. Participants were instructed to respond as quickly and accurately as possible while maintaining vehicle control.

RESULTS

Categorization data were analyzed identically to Experiments 1 & 2. Figure 4 shows percentage categorizations (including no response percentage).

Figure 4. Percentage alarm, status and social categorization by criteria met in order. Note: all sounds which met the harmonics criterion (criterion 3) also met the base frequency criterion (criterion 4). [Bar chart. Y axis: Percentage Categorization (0% to 100%). X axis: No Criteria; Criteria 1; Criteria 1 & 2; Meets All Criteria. Series: Alarm, Status, Social, No Response.]

Previously defined criteria and cutoffs predicted 66% of alarm categorization during driving (R = .845, adj. R² = .659).

Figure 5. Response time by criteria met. [Bar chart. Y axis: Response Time (s), 0.0 to 2.5. X axis: Meets No Criteria; Meets Some Criteria; Meets All Criteria.]

Importantly, results indicate that the number of criteria met was also predictive of response times, such that sounds which met all criteria were categorized faster than those which only met some criteria (t(21) = 2.96, p = 0.001) (see Figure 5).
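The paper does not name the test behind this comparison, but 21 degrees of freedom with 22 participants is consistent with a paired-samples t-test on each participant's mean response time in the two conditions. Under that assumption, the statistic could be computed as in the Python sketch below. The function name and argument layout are hypothetical, and no values from the study are reproduced here.

```python
import math
import statistics


def paired_t(fast, slow):
    """Return (t, df) for a paired-samples t-test on slow - fast.

    `fast` and `slow` hold one mean response time per participant,
    in the same participant order (hypothetical input format).
    """
    if len(fast) != len(slow):
        raise ValueError("each participant needs a value in both conditions")
    diffs = [s - f for f, s in zip(fast, slow)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two participants")
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("differences have zero variance")
    t = statistics.mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1
```

With 22 participants the function returns 21 degrees of freedom, matching the form of the reported statistic.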


DISCUSSION


The results of Experiment 3 indicate that our cutoff criteria adequately predict alarm categorization across both low and medium driving contexts and fidelities. More interestingly, response time results indicate that sounds that meet all criteria are categorized most quickly. This may indicate that sounds that meet some but not all criteria are confusing to participants, requiring more time to choose a categorization.

GENERAL DISCUSSION

The primary aim of this series of experiments was to identify those acoustic components which have the greatest effect on categorization of alerts. In the first experiment, it was determined that peak-to-total time ratio (the amount of time a sound was at its full intensity as opposed to the amount of time it was in onset or offset), interburst interval, number of harmonics, and base frequency were the acoustic components which had the greatest effect on alarm categorization for our participants. Using these parameters, cutoffs were determined at or beyond which the likelihood of alarm categorization by a given subject was high.

The second experiment further confirmed that those specific parameters could be manipulated to increase or decrease the likelihood that a given alert would be classified as an alarm by changing their values to meet or not meet any of the four cutoff criteria. This manipulation confirmed our cutoff criteria: as an alert met more criteria, it was more likely to be categorized as an alarming sound.

The third and final experiment attempted to validate our criteria in a higher-fidelity driving experiment. Even under divided attention, our criteria managed to accurately predict categorization for a large proportion of alerts. What may be more important is that the number of criteria met directly affected response times, such that sounds that met all criteria were responded to significantly faster than those meeting only some criteria. These results indicate that sounds which have some acoustic components associated with highly critical sounds and some acoustic components associated with less critical sounds may be ambiguous to drivers, causing them to respond more slowly and possibly to react incorrectly.


Overall, our results indicate that it is possible to define acoustic characteristics of sounds in an in-vehicle context in order to facilitate the design of intuitive, unambiguous alerting systems. Limitations to the current research include the small number of systems currently in use and the huge array of possible acoustic parameters which may be manipulated. Additionally, the current set of experiments examined alerts in only low to medium driving contexts, using a categorization task as opposed to responses to actual events. Future research that includes responses in a more ecologically valid driving context is needed, particularly research investigating responses to actual collision events. However, these three experiments provide strong evidence for the potential of this method to yield strong, valid criteria for the creation of intuitive, unambiguous alerts for in-vehicle information systems.

ACKNOWLEDGMENT

We would like to thank Christian Gonzalez and Nick Penaranda for their support in executing this research. Additionally, we would like to acknowledge funding support from the National Highway Traffic Safety Administration (under Westat-led Task Order No: DTNH22-11D-00237/0001).

REFERENCES

Baldwin, C. L., Eisert, J. L., Garcia, A., Lewis, B., Pratt, S. M., & Gonzalez, C. (2012). Multimodal urgency coding: auditory, visual, and tactile parameters and their impact on perceived urgency. Work: A Journal of Prevention, Assessment and Rehabilitation, 41(0), 3586–3591.

Hellier, E., & Edworthy, J. (1999). On using psychophysical techniques to achieve urgency mapping in auditory warnings. Applied Ergonomics, 30(2), 167–171.

Lewis, B. A., Penaranda, B. N., Roberts, D. M., & Baldwin, C. L. (2013). Effectiveness of bimodal versus unimodal alerts for distracted drivers. In Proceedings of the 7th International Driving Symposium on Human Factors in Driver Assessment, Training, and Vehicle Design. Bolton Landing, NY.

Marshall, D. C., Lee, J. D., & Austria, P. A. (2007). Alerts for In-Vehicle Information Systems: Annoyance, Urgency, and Appropriateness. Human Factors, 49(1), 145–157.

Marshall, D., Lee, J. D., & Austria, P. A. (2001). Annoyance and Urgency of Auditory Alerts for In-Vehicle Information Systems. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 45(23), 1627–1631.

McKeown, D., & Isherwood, S. (2007). Mapping candidate within-vehicle auditory displays to their referents. Human Factors, 49(3), 417–428.

Politis, I., Brewster, S., & Pollick, F. (2013). Evaluating multimodal driver displays of varying urgency. In Proceedings of the 5th International Conference on Automotive User Interfaces and Interactive Vehicular Applications (pp. 92–99). New York, NY, USA: ACM.

Wiese, E., & Lee, J. D. (2001). Effects of Multiple Auditory Alerts for In-Vehicle Information Systems on Driver Attitudes and Performance. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 45(23), 1632–1636.
