FEBRUARY 2015

HEINSELMAN ET AL.

57

Tornado Warning Decisions Using Phased-Array Radar Data PAMELA HEINSELMAN NOAA/National Severe Storms Laboratory, Norman, Oklahoma

DAPHNE LADUE Center for Analysis and Prediction of Storms, University of Oklahoma, Norman, Oklahoma

DARREL M. KINGFIELD NOAA/National Severe Storms Laboratory, and Cooperative Institute for Mesoscale Meteorological Studies, University of Oklahoma, Norman, Oklahoma

ROBERT HOFFMAN Institute for Human and Machine Cognition, Pensacola, Florida (Manuscript received 11 April 2014, in final form 9 September 2014) ABSTRACT The 2012 Phased Array Radar Innovative Sensing Experiment identified how rapidly scanned full-volumetric data captured known mesoscale processes and impacted tornado-warning lead time. Twelve forecasters from nine National Weather Service forecast offices used this rapid-scan phased-array radar (PAR) data to issue tornado warnings on two low-end tornadic and two nontornadic supercell cases. Verification of the tornadic cases revealed that forecasters’ use of PAR data provided a median tornado-warning lead time (TLT) of 20 min. This 20-min TLT exceeded participants’ forecast office and region median spring-season, low-end TLTs (2008–13) by 6.5 and 9 min, respectively. Furthermore, polygon-based probability of detection ranged from 0.75 to 1.0, and probability of false alarm for all four cases ranged from 0.0 to 0.5. Similar performance was observed regardless of prior warning experience. Use of a cognitive task analysis method called the recent case walk-through showed that this performance was due to forecasters’ use of rapid volumetric updates. Warning decisions were based upon the intensity, persistence, and important changes in features aloft that are precursors to tornadogenesis. Precursors that triggered forecasters’ decisions to warn occurred within one or two typical Weather Surveillance Radar-1988 Doppler (WSR-88D) scans, indicating PAR’s temporal sampling better matches the time scale at which these precursors evolve.

1. Introduction During the past decade, National Weather Service (NWS) forecasters have gained experience using a few techniques that reduce the update time of volumetric radar data from the Weather Surveillance Radar-1988 Doppler (WSR-88D) system. The first became available in 2004 with the operational release of volume coverage pattern (VCP) 12 (Lee and Steadham 2004; Brown et al. 2005) by the Radar Operations Center (ROC). The faster update time of VCP 12, combined with compact spacing

Corresponding author address: Pam Heinselman, 120 David L. Boren Blvd., Norman, OK 73072. E-mail: [email protected] DOI: 10.1175/WAF-D-14-00042.1 © 2015 American Meteorological Society

between its lower-elevation angles, provides forecasters with the ability to detect mesocyclones earlier and farther from the radar. However, this VCP came with a reduction in the accuracy of the reflectivity and velocity estimates (Brown et al. 2000, 2005). An approach for further lowering a VCP’s scan time is to scan only those elevations containing weather considered significant by the forecaster. This approach was used successfully in radar experiments conducted in the 1970s (Saffle 1976) and 1980s (Greene et al. 1983). It was reintroduced in 2000 by Brown et al. (2000), who termed their method flexible VCPs, as it allowed an algorithm to terminate and then restart a scanning pattern after an elevation that did not contain significant weather had been detected. This method was operationally applied


WEATHER AND FORECASTING

to the WSR-88D network in 2012 within the Automated Volume Scan Evaluation and Termination (AVSET) algorithm (Chrisman 2009). While AVSET is most beneficial when shallow storms are located far from the radar, an alternate approach is needed to provide more frequent volumetric updates of deep convection closer to the radar. One approach is to reduce the update time at a specified elevation of interest, such as 0.5°. The need for prioritized low-altitude scanning on the WSR-88D was highlighted following the rapid intensification of a tornado rated as a category 5 event on the enhanced Fujita scale (EF5) that occurred on 22 May 2011 and that devastated Joplin, Missouri (NOAA 2011; Marshall et al. 2012; Karstens et al. 2013). In operations, more rapid updates at 0.5° have previously been accomplished by manually restarting the VCP. Available now is an algorithm known as the Supplemental Adaptive Intravolume Low-Level Scan (SAILS; Crum et al. 2013). SAILS inserts a "split cut" 0.5° scan into the middle of either VCP 12 (Brown et al. 2005) or VCP 212 (Office of the Federal Coordinator for Meteorological Services and Supporting Research 2013). The implementation of SAILS provides NWS forecasters with the ability to attain more frequent 0.5° scans when tornadoes or other quickly evolving low-altitude weather events are possible, at the cost of longer volume scan times. Each of these options provides the ability to receive faster updates; however, the extent to which they may or may not have aided forecasters’ warning decision processes has not been studied. What is known is that the extent to which these approaches can reduce update time is limited by the WSR-88Ds’ mechanically scanning antennas. Electronically scanning phased-array weather radars (e.g., Zrnic et al. 2007; Bluestein et al. 2010; Wurman et al. 2012) and imaging weather radars (e.g., Isom et al. 2013) provide greater scanning flexibility.
The National Oceanic and Atmospheric Administration/National Severe Storms Laboratory (NOAA/NSSL) is developing and demonstrating techniques for the rapid sampling of storms using an S-band phased-array radar (PAR) system at the National Weather Radar Testbed in Norman, Oklahoma (e.g., Yu et al. 2007; Heinselman and Torres 2011; Curtis and Torres 2011). PAR technology is a logical choice for reducing scan times for several reasons. First, the capability of a four-panel PAR to scan four 90° sectors (e.g., Zrnic et al. 2007; Brown and Wood 2012) would reduce the scan time relative to a traditional WSR-88D VCP by at least 75% (e.g., from 4.2 to 1 min), without impacting the accuracy of the reflectivity or velocity. Second, electronic scanning provides the flexibility to scan the atmosphere and storms adaptively at each azimuth position (Heinselman and Torres 2011). As described in

VOLUME 30

Heinselman and Torres (2011), rather than determining the maximum elevation angle to scan for the entire volume, as done by AVSET, the Adaptive Digital Signal Processing Algorithm for PAR Timely Scans (ADAPTS) determines the maximum elevation angle to scan at each azimuth. Third, focused sampling like that accomplished by SAILS can be applied not only at a given elevation but to a specified storm within a sector (Priegnitz and Heinselman 2013). Specifically, the radar operator may manually choose to focus scanning only on azimuths containing the storm for a scheduled period (e.g., 1 min) and then intermittently scan the full sector. The frequency at which a storm’s reflectivity and velocity structure is updated impacts the ability of users to cognitively process the data. Our expectation has been that the techniques applied on the PAR will enhance forecasters’ making sense of the situation during events that tend to evolve rapidly and, as a result, extend tornado-warning lead times. This expectation was first examined in the 2010 Phased Array Radar Innovative Sensing Experiment (PARISE) with respect to tornado-warning lead time (Heinselman et al. 2012). To assess this expectation, six NWS forecaster pairs worked the 19 August 2007 tornadic tropical supercell event: three pairs received 43-s volumetric PAR data, and three pairs received 4.5-min volumetric PAR data to approximate the WSR-88D coverage. The event included two supercells; the north storm produced an EF1 tornado, whereas the south storm produced no tornado. The use of rapid-scan PAR data did result in longer tornado-warning lead times, which ranged from 11 to 18 min, versus −1.6 to 6 min for the pairs using the 4.5-min data. However, two of the forecaster pairs using the 43-s data issued a tornado warning on the south storm (null event), while pairs using the 4.5-min data did not. Due to the small sample size, the 2010 PARISE results are not generalizable.
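The per-azimuth elevation selection attributed to ADAPTS above can be illustrated with a short sketch. This is not NSSL's implementation; the 35-dBZ significance threshold, the one-tilt buffer, and the toy data volume are illustrative assumptions.

```python
# Sketch of per-azimuth scan-depth selection in the spirit of ADAPTS:
# for each azimuth, find the highest elevation index still containing
# "significant" weather, then scan one elevation above it as a buffer.
# Threshold and buffer are illustrative assumptions, not NSSL's settings.

def max_elevation_per_azimuth(reflectivity, threshold_dbz=35.0, buffer_tilts=1):
    """reflectivity: one row per azimuth; each row holds the maximum dBZ
    observed at each elevation index for that azimuth. Returns, per
    azimuth, the highest elevation index worth scanning."""
    tops = []
    for row in reflectivity:
        top = 0  # always scan at least the lowest elevation
        for elev_idx, dbz in enumerate(row):
            if dbz >= threshold_dbz:
                top = elev_idx
        tops.append(min(top + buffer_tilts, len(row) - 1))
    return tops

# A 4-azimuth, 5-elevation toy volume: deep storm in azimuth 1,
# shallower echo in azimuth 2, near-clear air elsewhere.
volume = [
    [10, 5, 0, 0, 0],
    [55, 50, 45, 40, 20],
    [40, 38, 10, 5, 0],
    [5, 0, 0, 0, 0],
]
print(max_elevation_per_azimuth(volume))  # → [1, 4, 2, 1]
```

Unlike AVSET, which would pick a single termination elevation for the whole volume (here, the deepest storm forces index 4 everywhere), the per-azimuth result skips high tilts wherever no storm is present.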
A primary goal of this study, termed 2012 PARISE, was to examine the external validity of the 2010 PARISE findings. To increase the sample size, 12 forecasters worked four supercell cases using only rapid-scan PAR data. Half of the cases were nontornadic. In each case, the forecasters’ goal was to determine whether or not to issue a tornado warning. Performance was quantified using NWS-defined warning accuracy measures and lead times. The use of video-cued recall (described in section 3) allowed for a more detailed documentation of forecasters’ decision-making processes than the design used by Heinselman et al. (2012).

2. Radar data and visualization Case selection criteria included sufficient longevity and temporal continuity in the data prior to tornadogenesis


(for tornadic cases), minimal velocity aliasing, and storm location within a 120-km range of the PAR. The application of these criteria resulted in the selection of four supercell cases with durations of 19–52 min (Table 1; Figs. 1 and 2). During the two tornadic supercell cases (11 May 2010 and 22 May 2011) EF0 and/or EF1 tornadoes occurred, whereas during the nontornadic supercell cases (14 April and 22 April 2011) no tornadoes occurred. However, the nontornadic cases did form in environments that later produced tornadoes at least 45 min after case end time. In real-time operations, an NWS forecaster issued an unverified warning during the 14 April 2011 nontornadic case. Therefore, we anticipated that this null case would be particularly challenging to forecasters.

TABLE 1. Case dates and times (UTC), radar updates (s), approximate heights of rotation at 0.5° within primary storms (kft), and tornado occurrence during the event.

Date          Duration (UTC)   Scan strategy updates (s)                            Approx heights of rotation at 0.5° (kft)   EF rating and duration (UTC)
11 May 2010   0035–0111        Vol scan: 59; interleaved 4 low-elev scan: 22        4.0–7.0                                    EF0, 0105–0109
14 Apr 2011   2055–2120        70                                                   3.9–4.1                                    None
22 Apr 2011   2339–2358        54                                                   1.5–1.8                                    None
22 May 2011   0050–0142        56                                                   2.4–3.4                                    EF0, 0118–0120; EF0, 0129–0133; EF1, 0141–0147

Data were collected using interleaved and conventional scanning strategies (Table 1). By using interleaved scanning during data collection on 11 May 2010, the lowest four elevations were sampled twice between volumetric (22 elevation) scans to prioritize sampling nearest to the ground, where tornadoes develop. Besides occurring between volumetric scans, this approach differed from SAILS (Chrisman 2009) in that multiple low-elevation angles were sampled. The durations of the full volumetric and prioritized low-elevation scans were 59 and 22 s, respectively. During the 2011 cases, storms were sampled with conventional scan strategies with volumetric updates near 60 s. Based on storm coverage and range, these updates were reduced during data collection by running ADAPTS (Heinselman and Torres 2011). Base data display was handled using the Advanced Weather Interactive Processing System-2 (AWIPS-2), which is currently replacing AWIPS-1 as the baseline forecasting platform at Weather Forecast Offices (WFOs) across the country. Utilizing the AWIPS framework provided forecasters with access to PAR data within a familiar display and warning environment. This approach allowed the research to focus on the evaluation of rapid radar data updates rather than on impacts of software retraining. For ease in data management and display, the four PAR cases were preprocessed using the Common Operations and Development Environment (CODE) Radar Product Generator software (Johnson et al. 1999). Utilizing CODE, AWIPS-readable reflectivity, velocity, and spectrum width products were generated without data quality degradation.
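The low-level revisit benefit of the 11 May 2010 interleaved strategy can be tallied with a short sketch. The assumption that each cycle consisted of one 59-s full volume followed by two 22-s prioritized low-elevation scans, each delivering fresh low-level data at its end, is an illustrative reading of the text, not a documented scan schedule.

```python
# Low-level revisit intervals for an assumed interleaved cycle: one full
# 22-elevation volume (59 s) followed by two prioritized low-elevation
# scans (22 s each). Every scan in the cycle samples the lowest tilts,
# so low-level data refresh at the end of each scan. The exact ordering
# within the cycle is an assumption for illustration.

FULL_VOLUME_S = 59
LOW_ELEV_S = 22

def low_level_revisit_intervals(n_cycles):
    """Return the successive low-level update intervals (s) over n cycles."""
    cycle = [FULL_VOLUME_S, LOW_ELEV_S, LOW_ELEV_S]
    return cycle * n_cycles

intervals = low_level_revisit_intervals(1)
mean_interval = sum(intervals) / len(intervals)
print(intervals, round(mean_interval, 1))  # → [59, 22, 22] 34.3
```

Under this assumed cycle, the lowest tilts refresh roughly every 34 s on average, versus about every 60 s with the conventional 2011 strategies.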

3. Experiment design Fundamental to this study was a method for capturing forecasters’ reasoning and sense making. Human factors methods for doing this are referred to as cognitive task analyses (Crandall et al. 2006). One specific method is referred to as think-aloud problem solving (see Ericsson et al. 2006), in which the participant verbalizes their thinking while conducting the problem-solving task. Rather than having our forecasters provide a concurrent verbalization of their reasoning, as was done in 2010 PARISE (Heinselman et al. 2012), we (PH and DLD) chose to use the retrospective method called the recent case walk-through (RCW; Militello and Hutton 1998; Hoffman 2005), which has a track record of success in the field of human factors (see Omodei and McLennan 1994; Crandall et al. 2006). Using the RCW method, the participant first works the problem and then is immediately asked to retrospect about it. This method reflects the forecaster’s familiar task of analyzing data to form a conceptual model and then giving a briefing on the analysis and predictions. In the present use of the RCW method, two participants each experiment week were asked to independently work potentially tornadic events in simulated real time and then retrospect about their thinking in three ‘‘sweeps,’’ in which more detail is progressively added. In the first sweep of the recalled event, which typically took 1.5 times the length of the case, PH and DLD drafted a timeline of each forecaster’s actions and thought processes while they reviewed a video replay of their desktop. During the second sweep, the video was reviewed again and the descriptions developed in the first sweep were refined, corrected, and/or enriched by the forecaster, with the researcher prompting for explanation where needed. In the third sweep, forecasters were asked to identify key judgments and the information used to make them.


FIG. 1. The 0.5° reflectivity and velocity near the beginning of each tornadic case: (a) 0035 UTC 11 May 2010 and (b) 0050 UTC 22 May 2011. Reflectivity and base velocity legends for color shading are shown at the top. Range rings are labeled (km).

These judgments included decisions both to warn and not to warn. Forecasters were then asked additional questions to deepen our understanding of their judgments, decision making, conceptual models, knowledge base, and work procedures. The researchers then switched places for the second case reported here. This method was pilot tested with two volunteers from the Warning Decision Training Branch. (Example probe questions and an excerpt from the data collected are presented in Table 2.)

a. Participant selection Participant selection focused on the Central and Southern Regions of the NWS, where tornadoes are climatologically most prevalent (e.g., Brooks et al. 2003). After the Scientific Services Division chiefs received an overview of the experiment, they coordinated with local offices to identify potential participants. To ensure that the cases chosen had not been worked by potential participants, the Norman, Oklahoma, office was excluded. The participant


FIG. 2. As in Fig. 1, but for each null case: (a) 2056 UTC 14 Apr and (b) 2339 UTC 22 Apr 2011.

list provided was used to select 12 forecasters based on office location, availability during the 6-week experiment, and varying years of experience. Each forecaster was individually contacted and provided the opportunity to consent to participate. In all reporting from this experiment, pseudonyms are used. The experiment included participants from nine NWS offices, with a slight majority of forecasters selected from the Southern Region. They ranged in experience from 1.3 to 19 yr. Six participants, whom we call Mike, Maggie, Randy, Bridget, Elmer, and Dirk, had five or fewer years’ experience. Mike had taken the Distance

Learning Operations Course (DLOC) within the past year. DLOC provides initial training to NWS forecasters on the use of the WSR-88D in the forecast and warning decision-making process. The other six, whom we call Bob, Jay, Avery, Pat, Brad, and Ben, had a minimum of nine years’ experience. Participants had worked during 5–25 live severe events in the previous year, but had not necessarily issued the warnings.

b. Complete experiment procedure The study was conducted during six weeks in June–August 2012. During each experiment week, two


TABLE 2. (a) Probe questions: Query participant using prompts below as a guide. (b) Excerpt showing part of the timeline produced by the RCW during the 4 min preceding Randy’s decision to warn while working the 22 May 2011 case. His responses to probe questions during the third sweep are also included.

(a) Information (use their timeline, identifying key judgments)
I1 What were the key judgments you made during this event?
I2 What information did you need or use to make this judgment?
I3 What did you do with this information?
I4 Did you miss having environmental data during the case itself? If so, at what point? What did you want to know?
I5 Was there any point at which geography, population, or similar information would have made a difference? Tell me about that.
I6 At what point(s) would you have sought storm reports?

Decision making
D1 What let you know that (issuing/not issuing a warning, changing mode of operation, etc.) was the right thing to do at this point in the case?
D2 How much time pressure did you feel in making this decision?
D3 How long did it take to actually make this decision?
D4 How challenging was this case? How so or not so?
D5 What was your warning philosophy for this case?

Mental modeling
M1 As you went through the process of understanding this situation, did you build a conceptual model of the problem scenario? What was it?
M2 Was there anything about that that was easier to see in the PAR dataset versus what you would expect to see in the WSR-88D?
M3 Did you try to imagine the important causal events and principles? What were those?
M4 Can you draw me a picture of what it looks like?

Knowledge base
K1 Where did this case fall in the spectrum of your experience (e.g., how typical was it)?

Legacy procedures
L1 What normal work habits did you follow in working this case?
L2 Did you do anything that would be atypical of normal work habits?

(b)
Time (UTC)  Retrospections
0108  Here’s new one, notice right away that outbound velocities are much stronger. Indicates to me that there may be some kind of RFD surge going on. Checking, because haven’t in a while, how storm is trending in reflectivities aloft. It was very strong: 60 dBZ at 40 kft. Not too concerned about tornado yet because of broad mesocyclone, but will be watching that inflow region.
0109  Taking closer look at RFD, or at what I think is RFD. Warning comes soon after this. Was looking for circulation aloft, was seeing a little better.
0110  I’ve got a pretty significant RFD surge, closer to inflow region, so getting tighter interface between these. When see this brightening up, intensification, better go ahead and get warning out. Everything is trending toward getting a tornado.
0111  Waited one more scan to see consistency in RFD surge, and saw it, and also looking at the classic nature of reflectivity, added toward issuing.
0112  By now circulation has intensified enough to issue warning. Pulled up WarnGen and issued. RFD is closer to the inflow, getting adjacent/closer to each other.

Judgment key (use their timeline, identifying key judgments)
I1 What were the key judgments you made during this event?
I2 What information did you need or use to make this judgment?
I3 What did you do with this information?

Judgment 1, 0100 UTC, responses
I1 Decided not to warn.
I2 Wasn’t sure that it was real. Overall meso was still broad. Figured it was.
I3 Did not warn, and because meso broad, it was the main area to watch.

Judgment 2, 0112 UTC, responses
I1 Decided to issue tornado warning.
I2 Saw consistency of RFD surge in previous 3–4 min. And it was getting closer to the stronger inflow. Those two were closer together. Had convergence.
I3 Forecasting that there would be a tornado.

I4 Did you miss having environmental data during the case itself? If so, at what point? What did you want to know?
Not necessarily. Maybe missed having current helicity values and the SPC [Storm Prediction Center] mesoanalysis. And maybe low-level jet, guessing it increased closer to dark. That would have been more confirmation that occurred. It would have enhanced helicities. Would have liked this especially leading up to first warning, conditions improving.


TABLE 2. (Continued)

(b)
I5 Was there any point at which geography, population, or similar information would have made a difference? Tell me about that.
Yes. As far as the warning, where would have placed polygon, because clipping about six counties, could have trimmed two of them out. Done some polygon-ology. Threshold might be lower in a highly populated area. Can afford waiting a little, maybe, in the rural areas. But if heading toward Norman, or someplace like that, would have put stress level higher, more worried, possibly more trigger happy.
I6 At what point(s) would you have sought storm reports?
When had the warning out. Soon after, when babysitting, would have used the time to pull up live camera (storm chaser), or look at TV, or maybe call reliable source.

Decision making
D1 What let you know that (issuing/not issuing a warning, changing mode of operation, etc.) was the right thing to do at this point in the case?
Did not issue: trying to catch a quick spinup...that is difficult. By time you’d get warning out, it would have dissipated. Let this one slide. Not totally sure should not have warned.
Warning #1: About 7 min after put it out, was probably producing. Thought that in data, even before phone call.
Warning #2: Velocity confirmed that storm still had a tornado or high chance of producing one.
D2 How much time pressure did you feel in making this decision?
Not warn: very pressured.
First warning: felt moderate amount.
Second warning: lesser, because had confirmation, and had seen evolution expected when issuing first warning.
D3 How long did it take to actually make this decision?
Not warn: 5–6 min. First warning: 4 min. Second warning: 3 min.
D4 How challenging was this case? How so or not so?
It was challenging. Because wasn’t totally sure about data quality, and had to determine when first warning needed, when it would start producing.

Mental modeling
M1 As you went through the process of understanding this situation, did you build a conceptual model of the problem scenario? Yes.
M2 Was there anything about your conceptual model that was easier to see in PAR dataset versus what you would expect to see in 88D?
RFD surge. And low-level convergence where low-level inflow met RFD.
M3 Did you try to imagine the important causal events and principles? Yes.
M4 Did you make a spatial picture in your head? Yes.
M5 Can you draw me an example of what it looks like?
RFD surge coming but separated from inflow. Those met up. Both got stronger at the same time.

Knowledge base
K1 Where did this case fall in the spectrum of your experience (e.g., how typical)?
There was a part of it that was difficult. But the overall scheme, one storm and one classic storm. Probably easier because it was classic, met conceptual model so well. Don’t have many classic events in my area, but it was a classic event. Get more QLCS [quasilinear convective system] tornado events.

Legacy procedures
L1 What normal work habits did you follow in working this case?
Interrogating with all tilts, stepping up and down to see trend of mesocyclone aloft. Seeing how strong updraft was. How high 50 dBZ was. Loading more frames to see how evolving over time, and overall motion: was deviating to right?
L2 Did you do anything that would be atypical of normal work habits?
Loaded up the four-panel with the lowest four tilts because was able to see almost real time, updating of each elevation. That was a big help in overall monitoring of the strength of circulation. Would not necessarily use this before a warning. The four-panel allowed more hands-off. All tilts is more hands-on, as you go up–down, through time. Felt like could hold off on warning. Because had more data to back up decisions. And more data to see the evolution.

participants traveled to Norman. On the first morning, they heard an overview of the characteristics, capabilities, and data collection strategies of the PAR. The motivation behind and objectives for 2012 PARISE were then explained. Thereafter, participants received a review of AWIPS-2 and spent about an hour on a workstation to practice loading archived PAR data

and drawing polygons using the Warning Generation (WarnGen) tool. That afternoon and over the next 1.5 days, participants individually worked the four cases (Table 1; Figs. 1 and 2) in simulated real time, as if they were responsible for issuing real warnings. Case order varied each experiment week so that signatures seen in one


case would not systematically affect analyses of subsequent cases. They were told that their job was to decide whether a tornado warning was merited. Prior to each case, participants viewed a prerecorded weather briefing provided by a member (J. LaDue) of the NWS Warning Decision Training Branch. The weather briefing attempted to provide each participant with a shared understanding of the environment and expectations for the event. Although participants had not worked these events, prior knowledge of them was possible. Case dates were deemphasized by using Greek letters to name each case (e.g., alpha, beta, delta, and zeta), and time for recall was minimized by starting the case directly following the weather briefing. During each case, mock phone calls were timed to coincide with spotter reports received during real-time operations. While working each event, RecordMyDesktop software (http://recordmydesktop.sourceforge.net) recorded participant interactions with the AWIPS-2 software, and products issued were saved to a database. About 10 min thereafter, the forecaster–researcher pairs participated in the RCW described above.

4. Tornado verification results As in Heinselman et al. (2012), this study focused on EF0- and EF1-rated tornadoes produced by supercells. Although these tornadoes are less destructive to life and property than higher-rated tornadoes, they are also the most unwarned (e.g., Brotzge and Erickson 2010). For example, based on the NWS Performance Management System, from 1 January 2008 to 31 January 2014, 27.66% of EF0 and EF1 tornadoes were unwarned nationwide, compared to 10.21% of tornadoes rated EF2 or higher. As this study focused on springtime supercells and engaged forecasters from the NWS’s Central and Southern Regions, of greater relevance are tornado statistics during the ‘‘spring’’ seasons (1 March–30 June) from 2008 to 2013, both within these regions and within the forecasters’ offices. Over this period 24.76% of EF0 and EF1 tornadoes were unwarned in the NWS’s Central and Southern Regions, compared to 8.5% of tornadoes rated EF2 or higher. With regard to participants’ combined office statistics, 22.06% of EF0 and EF1 tornadoes were unwarned in the NWS offices, compared to 1.5% of tornadoes rated EF2 or higher. Hence, less damaging tornadoes are also the most unwarned at these seasonal and spatial scales. The corresponding median polygon probability of detection values (section 4a) for EF0 and EF1 tornadoes computed for these spring seasons were 0.72 for the Central and Southern Regions combined, and 0.74


for the NWS offices, while median probability of false alarm values (section 4a) for these groups were 0.71 and 0.72, respectively. Finally, the corresponding median tornado-warning lead times (section 4a) for EF0 and EF1 tornadoes were 11 min for the Central and Southern Regions combined, and 13.5 min for the NWS offices. Although tornado statistics associated with each participant and the Norman WFO for the cases studied were also considered for comparison, they are not included for important reasons. First, statistics were unavailable for all forecasters because some participants do not sign their warnings, or had used different signatures at offices where they worked previously but did not remember them. Second, actual warning lead times from the Norman WFO were not used because they were not comparable: participants did not have radar data available to them from storm initiation or over the same region (90° versus 360°).

a. Storm-based tornado verification The implementation of storm-based tornado verification by the NWS in October 2007 (Sutter and Erickson 2010) introduced two new measures: polygon probability of detection (PPOD) and mean tornado-warning lead time (TLT). These statistics are computed along the entire tornado path, rather than at only the initial tornado location and time. Their computation requires the creation of a path-relative 2 × 1 contingency table. The two terms computed in this table are XP and YP, the number of warned and unwarned points along the tornado path, respectively. Points along the path were defined at 1-min intervals by assuming that the tornado traveled in a straight line and at a constant speed (Fig. 3). For the 2012 PARISE cases, tornado reports were obtained from the NWS Performance Management System (https://verification.nws.noaa.gov/), which uses the Storm Data publication as ground truth. Although there are limitations to the use of Storm Data in tornado verification (Witt et al. 1998), the timing and location of tornadoes reported therein were reasonably consistent with circulation signatures seen in the PAR radial velocity data. Once the time-incremented tornado path had been determined, every 1-min point was examined to assess whether a warning was valid at the time the event occurred (i.e., compute XP and YP). If the point was located within two or more polygons, those polygons were combined spatially into a single polygon. These point locations and the times associated with them were then used to compute the PPOD and TLT. The PPOD is defined as the average percentage of events (i.e., tornado track) that were warned across all events:


FIG. 3. The 1-min-interval tornado tracks (white boxes with letters inside) and warnings (red line polygons) associated with an EF0-rated tornado near Millcreek, OK, on 11 May 2010. Track and warnings are overlaid on the (top) 0.5° reflectivity and (bottom) base velocity. The velocity couplet is located just north of the reported track.
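The 1-min path discretization described in the text (straight line, constant speed) and the warned/unwarned classification of each point can be sketched as follows. The flat x-y coordinates and the ray-casting containment test are simplifications for illustration; an actual verification would work in geographic coordinates against the official warning polygons.

```python
# Discretize a tornado path into 1-min points (straight line, constant
# speed, per the text) and test each point against a warning polygon.
# Coordinates are treated as flat x-y pairs for simplicity.

def path_points(start_xy, end_xy, start_min, end_min):
    """1-min points from tornado start to end, inclusive."""
    (x0, y0), (x1, y1) = start_xy, end_xy
    n = end_min - start_min
    return [
        (x0 + (x1 - x0) * k / n, y0 + (y1 - y0) * k / n)
        for k in range(n + 1)
    ]

def point_in_polygon(pt, poly):
    """Ray-casting containment test for a simple polygon."""
    x, y = pt
    inside = False
    for (x0, y0), (x1, y1) in zip(poly, poly[1:] + poly[:1]):
        if (y0 > y) != (y1 > y):
            if x < x0 + (x1 - x0) * (y - y0) / (y1 - y0):
                inside = not inside
    return inside

# Toy example: a 4-min tornado track that exits a square warning polygon.
track = path_points((0.0, 0.0), (4.0, 0.0), start_min=0, end_min=4)
warning = [(-1.0, -1.0), (2.5, -1.0), (2.5, 1.0), (-1.0, 1.0)]
warned = [point_in_polygon(p, warning) for p in track]
print(warned)  # → [True, True, True, False, False]
```

Here XP = 3 and YP = 2 for this toy event: three of the five 1-min points fall inside a valid warning.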

\mathrm{PPOD} = \frac{\sum_{1}^{n} \dfrac{XP(n)}{XP(n) + YP(n)}}{X + Y} , \qquad (1)

where X is the number of verified warnings of events (hits) and Y is the number of unverified warnings of events (misses). The numerator of PPOD is the proportion of events warned, summed over the number (n) of tornado events, and the denominator is the number of observed tornado events. The TLT is defined as the mean tornado-warning lead time through the event’s duration:

\mathrm{TLT} = \frac{\sum_{1}^{p} LT(p)}{XP + YP} , \qquad (2)

where LT( p) is the difference between the time at a given point and the time the warning was issued. If multiple warnings were valid when the event occurred, LT( p) was computed using the first warning issued (e.g., Fig. 3). At points without a valid warning, LT( p) was set to zero. For the case shown in Fig. 3, the PPOD is 1 and TLT is 28 min. An advantage of PPOD [Eq. (1)] and TLT [Eq. (2)] is that these derived measures take into consideration the spatiotemporal extent to which warnings cover the full tornado path. A traditional measure computed in this study was the probability of false alarm [POFA; also known as the false alarm ratio (Barnes et al. 2009)]. POFA is defined as POFA 5

$$\mathrm{POFA} = \frac{Z}{X+Z}\,,\qquad (3)$$

66          WEATHER AND FORECASTING          VOLUME 30

FIG. 4. (a) Distribution of tornado-warning lead times (min) computed for the 11 May 2010 and 22 May 2011 events: EF0-rated tornado on 11 May 2010 (A) and three tornadoes (EF0 and EF1) on 22 May 2011 (B, C, D), listed in chronological order. Horizontal blue lines denote the mean national tornado-warning lead time computed from 1 Jan 2008 to 31 Jan 2014 for EF0 and EF1 tornadoes (12.5 min) and EF2 and higher rated tornadoes (18 min). (b) Distribution of PPOD (red circles) and POFA (black squares) computed for 11 May 2010 and 22 May 2011 events. The horizontal line at 0.5 indicates the POFA attainable by chance. Forecaster pseudonyms are given along the x axis.

where Z denotes the number of events warned but not observed (false alarms) and X denotes the number of events warned and observed (hits).
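The verification procedure described above (1-min path points under a straight-line, constant-speed assumption, a point-in-warning check, and then Eqs. (1)-(3)) can be sketched in Python. This is a simplified illustration rather than the study's actual code: all function names are hypothetical, the polygon test is a generic ray-casting routine, and overlapping warnings are handled by taking the first-issued valid warning at each point instead of spatially merging polygons as described above.

```python
# Illustrative sketch of storm-based tornado verification, Eqs. (1)-(3).
# Names and geometry handling are hypothetical simplifications.
from datetime import datetime, timedelta

def path_points(start_xy, end_xy, start_t, end_t):
    """1-min points along a tornado track, assuming straight-line,
    constant-speed motion (cf. Fig. 3)."""
    minutes = int((end_t - start_t).total_seconds() // 60)
    pts = []
    for k in range(minutes + 1):
        f = k / minutes if minutes else 0.0
        x = start_xy[0] + f * (end_xy[0] - start_xy[0])
        y = start_xy[1] + f * (end_xy[1] - start_xy[1])
        pts.append((x, y, start_t + timedelta(minutes=k)))
    return pts

def in_polygon(x, y, poly):
    """Generic ray-casting point-in-polygon test for a warning polygon."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y) and x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
            inside = not inside
    return inside

def verify_event(points, warnings):
    """Count warned (XP) and unwarned (YP) 1-min points and collect LT(p).
    LT(p) uses the first-issued warning valid at each point and is set to
    zero where no warning is valid."""
    xp = yp = 0
    lead_times = []
    for x, y, t in points:
        valid = [w for w in warnings
                 if w["issued"] <= t <= w["expires"] and in_polygon(x, y, w["poly"])]
        if valid:
            xp += 1
            first = min(valid, key=lambda w: w["issued"])
            lead_times.append((t - first["issued"]).total_seconds() / 60.0)
        else:
            yp += 1
            lead_times.append(0.0)
    return xp, yp, lead_times

def ppod(per_event_counts, x_hits, y_misses):
    """Eq. (1): per-event warned fractions summed over the n events,
    divided by the number of observed tornado events (X + Y)."""
    return sum(xp / (xp + yp) for xp, yp in per_event_counts) / (x_hits + y_misses)

def tlt(lead_times):
    """Eq. (2): mean lead time over all XP + YP path points."""
    return sum(lead_times) / len(lead_times)

def pofa(z_false, x_hits):
    """Eq. (3): false alarms over all warned events."""
    return z_false / (x_hits + z_false)
```

For a single event whose track is fully covered by one warning issued 10 min before tornadogenesis, this sketch yields a PPOD of 1.0 and a POFA of 0.0, consistent with the fully verified case illustrated in Fig. 3.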

b. Tornado-warning lead times

The TLTs (N = 48) resulting from forecasters' use of rapid-scan PAR data ranged from 0 to 39 min (Fig. 4a). The median TLT for all events was 20 min, which exceeded both the regional- (11 min) and office-based (13.5 min) median TLTs for weak tornado events by 9 and 6.5 min, respectively (Fig. 4). With regard to each forecaster, median TLTs for all events ranged from 8.5 to 29.5 min (Fig. 5a). An examination of these TLTs with respect to years of experience reveals similar results: the majority made warning decisions that resulted in median TLTs that met or exceeded the median regional- and office-based TLTs for low-end tornadoes. The two highest mean TLTs were associated with decisions made by two of the less-experienced forecasters, Bridget and Elmer. These two forecasters were among those who voiced the most up-to-date conceptual models.

FIG. 5. (a) Mean TLT and (b) POFA across the four cases, plotted with respect to years of experience of each forecaster.

A comparison of TLTs by case indicates that timelier tornado warning decisions were made for the EF0 event on 11 May 2010 (A in Fig. 4a) than for the first tornado event on 22 May 2011 (B in Fig. 4a). Although we had not intended for forecasters to make warning decisions during the first few minutes of any case, most forecasters working 11 May 2010 (N = 10) decided to issue a tornado warning within the first 2 min (Fig. 6a). As a result, the mean TLT for A was 22.75 min. The mean TLT for tornado B was 13.83 min, which reflected a broader range of TLTs.

FIG. 6. Start and end times of tornado warnings issued by participants on (a) 11 May 2010, (b) 22 May 2011, (c) 14 Apr 2011, and (d) 22 Apr 2011. False alarms are represented by dashed lines. The first, second, and third tornado warnings are denoted by a cross, filled circle, and filled triangle, respectively. Duration of PAR data is shaded gray and tornado duration is shown by vertical gray lines.

Seven of the TLTs ranged from 18 to 23 min, whereas five were 10 min or lower, two of which were 0 min. An examination of the 0-min TLTs revealed that, in one case, the tornado occurred a few minutes prior to the warning start time. In the other case, the tornado occurred just outside of the west edge of the warning polygon. For 8 of the 12 forecasters working the 22 May 2011 case, the TLTs for subsequent tornadoes (C and D) exceeded those of tornado B (Fig. 4a). For tornado C, individual TLTs ranged from 9 to 34 min (Fig. 4a) and the mean TLT was 18 min. Eleven of the 12 TLTs for tornado D exceeded 18 min (Fig. 4a). The average TLT for tornado D was 24 min. As one might expect, the broad range of TLTs for tornadoes C and D can be attributed to variations in warning issue times and durations of the participants' initial and subsequent warnings.

c. PPOD and POFA

The four tornado paths were fully verified by nine of the forecasters' warnings (Fig. 4b), resulting in PPODs of 1. PPOD values less than 1 (N = 3; 0.75 or 0.9) were due to the absence of verified points along tornado tracks. These PPODs tended to exceed both the median seasonal region- and office-based PPODs, which were 0.72 and 0.74, respectively. The majority of participants' POFA values ranged from 0.2 to 0.4 (Fig. 4b). Exceptions were Avery's and Randy's POFAs, which were 0.5 and 0.0, respectively. Given that half of the cases were null events, POFAs lower than 0.5 indicate performance superior to random chance. The participants' median POFA of 0.33 was also superior to the median seasonal region- and office-based POFAs of 0.71 and 0.72, respectively. During each null case, 4 of the 12 forecasters decided not to warn (Figs. 6c,d). Although an examination of POFA values with respect to each forecaster's years of experience revealed similar performance, Maggie and Randy, two of the relatively less-experienced forecasters, achieved POFAs of 0.2 or less (Fig. 5b). Randy was the only forecaster who did not issue warnings during both null cases. From a verification perspective, false alarms were a result of one of the following: 1) one or more warnings issued during one of the null cases (N = 3; Figs. 6c,d), 2) warnings issued during both of the null cases (N = 5; Figs. 6c,d), and 3) one warning issued during one null case and an unverified warning issued during a tornadic case (N = 3). We conclude from the median 20-min TLT and strong NWS-based performance statistics that forecasters' use of PAR data was beneficial to their decision making. However, because the dataset is relatively small, these results are not generalizable. Forecasters' timelines, though, provide the opportunity to understand these statistics within the context of their warning decision processes. One null and one tornadic case were analyzed: 14 April and 22 May 2011, respectively. Of the two null cases, 14 April 2011 was the one that appeared more challenging in real operations and, in simulated real time, produced the broadest spread in warning times.
Of the two tornadic cases, 22 May 2011 was chosen because the spread in TLTs for the first tornado warnings indicated it was the more challenging case. Given the apparent challenge of these cases to forecasters, we thought their analysis would result in the most meaningful insights.

5. 14 April 2011 nontornadic event

Forecasters classified this event as supercells occurring in an environment conducive to tornado development (Fig. 1c). Within the first few minutes, all forecasters determined that a right-moving storm was the supercell on which to focus, due to its organization, strong updraft, and midlevel mesocyclone. None thought the data initially supported a warning. By the end of the case, eight forecasters decided that the storm merited a tornado warning (Fig. 6c). One forecaster also issued a tornado warning on the left mover (Pat's second warning; Fig. 6c), and most issued second warnings on the right mover. This study focuses on first decisions because they are not influenced by the timing of prior decisions. Two main conceptual models drove forecaster decision making. Also key was the ability to assess anticipated and unanticipated storm evolution as the rapid updates came in, including the representativeness of weak, small circulation patterns that appeared in the data. These findings are further discussed in the subsequent subsections.

FIG. 7. The tornadogenesis conceptual model articulated by Elmer and Bridget: vorticity (labeled in the diagram) generated within the inflow along the forward flank is brought around the future tornado location by an increased surge of outflow in the rear flank. (Drawing by Bridget. All conceptual model drawings were digitized to help protect anonymity.)

a. Conceptual models and supercell evolution driving first decisions to warn

Warning decisions were driven by two main conceptual models. The first model (Fig. 7) was clearly articulated by just two forecasters: if streamwise vorticity generated within strong inflow winds transitions to crosswise vorticity in a surging outflow, the process leads to tornadogenesis. For this model, Bridget and Elmer sought radar indications of strong inflow winds being brought around into the reflectivity hook through an increased outflow surge. The other forecasters all looked for low-altitude inflow balanced by outflow, but most described the second conceptual model: that of a tightening bounded weak-echo region (BWER) and descending mesocyclone. The focus for most participants was watching for midlevel rotation to tighten and begin to descend to lower altitudes (Fig. 8); Elmer and Bridget also looked for this process while ultimately


citing the first model as behind their decision to warn. All forecasters continually interrogated low altitudes for indications of increasing inflow and outflow driven by the descending circulation. Implications of these two conceptual models, along with a forecaster's personal threshold of either strength or persistence of a low-altitude couplet, were the reasons that warning decisions settled into three clusters of time: early, mid-case, and no warning. The two early case decisions were by the forecasters who revealed during post-case questioning that they held the low-level vorticity conceptual model. Their decisions were made early, at 2102 and 2103 UTC, when they believed that tornadogenesis was just beginning. It was the "[indications] of the storm coming out of a cycling stage" (reformation of a hook) and "strong enough outbounds making it to the surface" to enhance low-level vorticity that drove Bridget and Elmer to make these two early decisions. The conceptual model of a tightening BWER and descending mesocyclone drove mid-case decisions made between 2105 and 2109 UTC.

FIG. 8. Conceptual model of a BWER aloft that tightens, accompanied by a descending mesocyclone, leading to increased rotation in low levels that is indicated by the inflow–outflow couplet that focuses on the tip of the reflectivity hook. (Drawing by Avery.)

At 2058 UTC these forecasters (Avery, Ben, Jay, and Mike) saw only weak rotation at low levels, below a strong updraft and mesocyclone. In the next three scans (2059–2101 UTC), this group all identified some rotation beginning to show at the lowest tilt. This group of four was focused mainly aloft, noting that the BWER was starting to fill in and lower, and the mesocyclone was tightening and starting to descend (2102–2103 UTC). At 2104 UTC, they all focused on the tightening midlevel rotation. As Mike explained after being prompted during the RCW, "the rotation might be becoming more concentrated. . .so getting close to tornadogenesis." The fourth of this group was worried about possible dealiasing issues, and wanted to be sure the feature was not transient. For Mike, the changes in the BWER and persistence of rotation at the surface were sufficient to make his warning decision at 2108 UTC. Four forecasters did not issue a warning (Brad, Bob, Maggie, and Randy). Three came close to warning, though one (Brad) was dubious about the quality of the few low-altitude pixels indicating rotation during the case. The two more experienced forecasters each held


one of the conceptual models as primary: Bob, the descending rotation (Fig. 9), and Brad, the rear-flank downdraft (RFD) push leading to low-level intensification (Fig. 10).

FIG. 9. Depiction of descent and stretching circulation patterns, but failure of low-level circulation to tighten sufficiently to indicate tornadogenesis. (Drawing by Bob.)

Bob ultimately did not warn because the low-altitude mesocyclone did not tighten sufficiently. The two less-experienced forecasters, Maggie and Randy, exhibited consideration of both conceptual models, with slight focus on low-level features. Brad was very aware of the possibility of bad velocity data where reflectivities are low; several forecasters did not trust the small couplet signature. Maggie thought those few-pixel signatures were bad data at least until 2112 UTC, when she said they were "where you'd want them." Maggie and Randy provide an interesting contrast to Pat. These two young forecasters did not warn during the case. Pat, however, made the earliest warning decision of any participant, at a point in the case when there was insufficient evidence for any other forecaster. He explained, "A lot of the time in our forecast area if you have a supercell you'd better have a warning out... [you may not be] 100% sure if or when it's going to produce... [but] can I let this go all the way across my CWA [county warning area] without issuing if the conditions are favorable? This [storm] had a lot of the signs." Importantly, all forecasters who made warning decisions in this case were making decisions at early indications of tornadogenesis. For the early group, it was primarily the strong inflow and first indications of a surge in outflow. For the mid-case group, it was the filling–collapse of the BWER, low LCLs, and first indications of descending rotation that prompted the decisions made at 2105–2106 UTC. Some of those who did not warn were tempted to do so on more than one occasion. Ultimately, these forecasters did not issue warnings because they did not observe the low-level evolution they wanted to see. In particular, their decisions not to warn were tied to the lack of longevity and/or intensity of rotation observed at lower elevation angles (e.g., 0.5° and 0.8°).

b. Impacts of scan time on warning decisions and work processes

The decision of the majority of forecasters to warn on this null event indicates that the evolution that they saw in the PAR data coincided with their conceptual models for potential supercell tornadogenesis. As mentioned above, although two primary conceptual models were identified, variations in the timing of decisions to warn were driven by differences in personal thresholds of forecasters' warning decision triggers. The time intervals over which these "trigger thresholds" developed and the eight forecasters decided to warn ranged from 2 to 5 min, or within the time frame of a single WSR-88D scan (i.e., 4–6 min). In this case, these warning decision triggers included rapidly evolving storm features such as early indications of descent of an intensifying mesocyclone aloft, descent of the RFD, intensification of rotation in the lowest four elevations, and BWER collapse coincident with tightening low-level rotation. Decisions not to warn relied on scan-to-scan determinations as to whether or not forecasters' rotation

FIG. 10. Depiction of the RFD push that did not lead to low-level intensification. (Drawing by Brad.)


FIG. 11. The occlusion process, poorly depicted in WSR-88D data, was clearly evident to forecasters using PAR data. (Drawing by Bridget.)

and intensity thresholds had been met. For example, Maggie's and Randy's decisions not to warn were driven by information gained from a single scan. In both instances, these forecasters had opened WarnGen in preparation to warn should trends observed in the velocity data persist. In Maggie's case, she was monitoring for persistence in a gate-to-gate signature at 0.5°. As the next scan came in at 2109 UTC, she blurted, "Pixels seen gate-to-gate are gone!" And she decided not to warn. Near the same time, a gate-to-gate velocity signature seen by Randy at 1.1° and 1.5° led him to start WarnGen, and wait one scan to see if the signature developed at 0.5° before deciding to warn. During the next scan (2110 UTC), he decided not to warn because the anticipated couplet at 0.5° had not formed. Two of the younger forecasters, Elmer and Bridget, reported that interpreting the data was sometimes challenging. This case was the first worked by Elmer and the second worked by Bridget. Elmer explained that the temporal evolution allowed him to "jump on things quicker, but at the same time throws you for a loop because [you] see so much more than before." While working this as his first case, he found that he was speeding up his interrogation of the data to keep up with each scan as it came in, as he did not want to skip anything. After working a second case, Elmer reported having used normal processes for data display. According to Bridget, she "didn't feel solid in her conceptual model," unsure whether she was seeing "regular variability or signs that the storm was changing." Furthermore, Bridget noted that the more frequent scans confused her sense of time, such that she wanted to issue warning updates faster than usual. While working her last two cases, Bridget continued to note uncertainty in her conceptual model, but did not mention issues with her sense of time.
Also feeling rushed by the update time, Dirk recalled using "more frequent last-frame clicks" to keep up with the most recent data; this was his fourth case. However, rather than working harder,

Avery, Mike, and Randy reported working smarter by leveraging four-panel displays of base data to keep up with the higher-temporal-resolution data. Avery made this change in process while working his first case. Having gained experience using this display while working the last three cases, Randy explained that the four-panel display of reflectivity and velocity allowed him "to passively monitor data while it comes in without hav[ing] to control anything (less window switching, fewer keyboard clicks)." He "did this when [he was] confident he didn't need to issue [a warning]." Of these three, Randy decided not to warn.

6. The 22 May 2011 tornado event

As in the null event, forecasters immediately identified the storm of interest to be a supercell with the potential to produce tornadoes (Fig. 1b). The issue times of their first tornado warnings ranged from 0054 to 0121 UTC (Fig. 6b). Forecasters' attention was focused on the evolution of velocity signatures and their collocation with key reflectivity signatures. They monitored downward movement of midlevel rotation and tightening of those mesocyclone features, strength of the inflow, and a repeating occlusion process as evidenced by a tight couplet that moved toward the back of the storm (Fig. 11). They studied how these velocity features corresponded to both hook-echo evolution and the inflow reflectivity gradient. Fourteen minutes into the case, forecasters experienced a 4-min data gap (0104–0108 UTC), due to an issue during PAR data collection. This data gap provided the opportunity to see how they would respond to a change in update time.

a. Conceptual models and supercell evolution driving first warning decisions

Brad, Dirk, Elmer, Bob, Ben, Pat, and Bridget issued warnings between 0054 and 0101 UTC (Fig. 6b), resulting in tornado lead times of 18 min or higher


(Fig. 4a). Six of the first seven (Dirk, Elmer, Bob, Ben, Pat, and Bridget) had similar reasons for their first warning decisions. Avery's reasoning also fit with this group: he was "getting ready to pull the trigger" at 0104 UTC, but waited 5 min more; Brad's reasoning differed (more on this later). These early warning decisions were made when early indications of the tightening and lowering of a mesocyclone seen at mid- or low levels, coincident with interaction of the RFD with the inflow, suggested tornadogenesis would likely ensue. Other important features were the characteristics of the BWER aloft, the reflectivity gradient along the forward flank, spectrum width trends, or an environment supportive of the storm's ability to "organize quickly." All forecasters identified this storm as a classic supercell. Brad made the earliest warning decision of any forecaster, at 0054 UTC, but it was because a circulation high in the hook (occlusion) had tightened sufficiently to meet his personal threshold. Just 1 min before, he had declared that "the tornado threat was low because the pendant was located behind the gust front." No other forecaster warned on that feature, although most of them mentioned seeing it. Ben had an additional reason for his early 0055 UTC decision: an associated increase in spectrum width values indicated tornado development was possible. He stated that the trends in reflectivity and velocity were more important than this feature in his decision, and noted that he "hadn't looked at [spectrum width] in an operational sense in 2–3 years" but did in this case due to the "closeness [of the storm] to the radar." By 0103–0104 UTC, all 12 forecasters saw signals indicating tornadogenesis. Ten of them identified features important to their conceptual model, such as an RFD surge, tightening circulation aloft, persistent updraft, descending circulation, and better collocation of low-level features.
Two of these 10 expressed confidence that a tornado might be imminent as early as 0103 UTC; for five more, their "growing" or "really high [confidence]" occurred around 0112–0113 UTC. The other three were either in the process of a warning decision or monitoring important evolution without making a statement of confidence. Of the two who did not express confidence, Maggie said "nothing screams 'issue a warning'," though she identified a "wonderful" hook. The velocities simply were not tight or strong enough. Maggie did not make her decision until 0114 UTC. Elmer was watching the RFD continue to progress southward, but said the inflow was "relatively weak." He was waiting for another circulation pattern to form. Elmer became confident by 0108 UTC that a tornado would develop. He did not need to take any actions at that point because he had already issued his warning. Maggie made her first warning decision at 0114 UTC, deciding to "go with [her] gut." She had seen


a tightening circulation and increasing outflow, but it was "not great." Interestingly, four other forecasters who were actively assessing the storm at this time expressed "[really high] confidence" they would "see something in a few minutes": a tornado was "occurring or about to occur." This judgment was one of only a few notably different judgments during the case. The others were less consequential. Randy and Maggie made their first warning decisions 6 and 4 min prior to the first verified tornado, respectively. Randy made his decision just prior to when Maggie had made hers, at 0112 UTC. Coming out of the data gap from 0104 to 0108 UTC, he was "unconcerned about a tornado [being imminent] yet," but in the next 2 min saw a "pretty significant RFD surge, closer to the inflow, [leading to a] tighter interface between [them]." This meant that "everything was trending toward a tornado." By 0112 UTC the circulation had "intensified enough to issue." At this point he was still "forecasting a tornado." The timing of Jay's and Mike's decisions to warn (0119 UTC) coincided with the first reported tornado, resulting in 0-min tornado lead times (Fig. 4a). Their conceptual models for warning included seeing the development of tightening circulations within mid- to low levels, and persistence of a velocity couplet at 0.5°. These two forecasters were among those at 0103–0104 UTC who said they were either "anticipating that the circulation will descend" or "getting close to maybe issuing a tor[nado warning]." They both held off, however, stating that the circulation was "too loose" at 0108 UTC, and remaining concerned about the circulation in lower levels either not being collocated with the tip of the hook, or being too broad to indicate a tight circulation pattern.
They exhibited one of the few differences in judgment at 0113 UTC, with Jay observing that the "tightening of circulation [was now] collocated with the tip of the appendage" whereas Mike said "nothing [was] really organized." The tightening Jay observed at 0113 UTC loosened 2 min later. At 0116–0117 UTC both were seeing a better-defined, tighter circulation pattern just aloft, and as the tornado apparently formed they finally identified a "good" or "better" couplet, in the right place. Just before a phone call report, Mike said he was "starting to think that [the radar] was starting to sample a tornado." Nearly all forecasters remained highly confident in the tornado continuing through 0121–0128 UTC, though there was a break in the verification data during that time period.

b. Impacts of scan time on warning decisions and work processes

Although forecasters shared similar conceptual models, variations in tornado-warning lead times were


driven by differences in personal thresholds of forecasters' warning decision triggers. The time intervals over which these "trigger thresholds" developed and forecasters decided to warn ranged from 1 to 8 min; 11 of the 12 were 4 min or less. Hence, while working this event, forecasters were observing rapidly evolving storm features, indicative of tornado development and significant enough to warn on, at time scales shorter than a traditional WSR-88D scan (i.e., 4–6 min). Most forecasters shared the conceptual model that tornadogenesis would likely ensue following the tightening and lowering of a mesocyclone seen at mid- or low levels, coincident with interaction of the RFD with the inflow. Variations in the timing of their decisions to warn depended mostly on the location and intensity of the mesocyclone. Dirk, Ben, Elmer, and Bob decided to warn the earliest (0054–0058 UTC), after observing tightening of the mesocyclone within midlevels of the storm, coincident with increases in the intensity of the RFD (Fig. 12). Within the next 3 min (0059–0101 UTC), it was the additional observation of rotation beginning to develop below midlevels that triggered Pat and Bridget to warn. Furthermore, Bridget noted that she could not recall having seen either the occlusion process or RFD descent in WSR-88D data, both of which she had observed by this time. Finally, it was the intensification of low-level circulations that triggered the last three verified warning decisions, by Avery, Randy, and Maggie (0109–0114 UTC). It is possible that 1-min updates could lead some forecasters to postpone their warning decisions to wait for "just one more scan." The consideration of waiting longer to warn when using the PAR data was noted by two forecasters in this case. According to Pat, who warned early, seeing the details in storm evolution meant that he did not "see as much sudden change that you need to act on really fast," which reduced stress in the warning process.
For Randy, who warned later in the event, knowing that he was not missing anything meant that he "could hold off on warning because he had more data to back up decisions." In these cases, the use of rapid-scan data helped to reduce stress and improved knowledge of the storm's evolution. Its use becomes problematic when the data improve one's confidence, but not the decision, as was the case with the youngest forecaster, Mike, who warned too late. Regardless, he reported that "[the data] being quicker did make me feel better—except when there were data gaps—that I wasn't missing anything." In context, this statement makes sense, as Mike had just taken the DLOC course for new forecasters a few months before the experiment, and had been the primary warning forecaster on only one event involving a possible tornado. To better view trends in the PAR data and to reduce workload, some forecasters changed their work process.


FIG. 12. The first warning decisions on the 22 May 2011 case were made after seeing a classic reflectivity structure aloft; tornadogenesis was anticipated after observing a tightening of the mesocyclone coincident with an increase in intensity of the RFD. (Drawing by Bob.)

These changes were similar to those reported in the 14 April 2011 case. For example, Mike commented that while working this first case, increasing the number of volume scan frames he used to loop the data helped him attain a better sense of the storm motion. Although this was the first case worked by Bridget and Randy, they both added use of four-panel displays to their usual use of the "all tilts" display. Randy particularly liked the four-panel view of the lowest elevations because he was "able to see almost real-time updating of each elevation." For him, "That was a big help in overall monitoring of the strength of circulations." Randy also found use of the four-panel presentation "more hands-off," hence requiring less work. Both forecasters chose to apply this change in process to subsequent cases. In contrast, although Ben found the rapid updates to be helpful to his decision-making process, he also found that they increased his workload. He noted that it


"required more effort to view more data." He did not report attempting to change his work process in response to the increased workload.

FIG. 13. Forecasters using rapid-temporal, full-volumetric PAR data had multiple scans upon which to confidently identify the persistence of features aloft that preceded tornadogenesis. Differences in warning decisions were then driven by their personal thresholds for location and strength of key features.

7. Discussion

These analyses help us to better understand variation in forecaster performance, and how new radar technologies may impact warning decision making. The two cases included in this paper were the ones in which forecasters exhibited the widest range in lead time. Our analyses point to three important explanations for these differences: 1) a forecaster's personal threshold for persistence and intensity of a feature, developed through the use of 4–6-min updates of the WSR-88D scans; 2) the forecaster's understanding of the importance of the vertical location of those features within the storm; and 3) the idiosyncratic nature of each forecaster's experience. Figure 13 illustrates the clustering of decisions for classic supercell tornadogenesis in the 22 May 2011 case, and how those decisions clearly fall within the scan interval of the WSR-88D. These precursors do not always lead to tornadogenesis, however, as illustrated by the 14 April 2011 case; for this reason, additional research from field projects and modeling studies is critically important to improving the accuracy of our field's conceptual model of tornadogenesis. The new scanning strategy, SAILS, which

provides more frequent updates at 0.5° but slower updates at all other elevations, will likely aid tornado detection but not help forecasters identify feature strength and persistence aloft, which were key to early warning decisions in this study. A related concern is whether the SAILS approach to scanning would be beneficial to the Warn-on-Forecast program (Stensrud et al. 2009). An observing system simulation experiment using synthetic rapid-scan radar data from a supercell (Yussouf and Stensrud 2010) and an assimilation experiment using real PAR data (Wicker et al. 2014) have demonstrated improved analyses and ensemble forecasts compared to control experiments using radar data collected with a conventional WSR-88D scan strategy. A question often asked about rapid-scan radar data is whether forecasters will be able to keep up with a "fire hose" of data. By providing only PAR data to forecasters, the changes they made to their work process help to inform effective practices for viewing such data. In response to the faster updates, all forecasters increased the number of frames they usually allocated to loop radar data. At times, the maximum number of frames available in AWIPS-2 was inadequate for observing the evolution of a radar attribute or determining a representative storm motion. While a few forecasters felt they had to work harder to keep up with every scan (e.g.,


Elmer), others (e.g., Avery, Bridget, Randy, and Mike) changed their modus operandi. Specifically, they found that adding four-panel displays containing information at multiple levels to their all-tilts work method improved their ability to keep up with incoming scans and focus their attention on changes in storm attributes important to their decision-making processes. This work method implies that not all incoming radar data need to be diagnosed all of the time, which is what we saw most forecasters do in practice. As a case began, forecasters examined the full data volume. Once they understood the storm's state, they focused their interrogation on areas of the storm where they anticipated evolution that would best inform their tornado-warning decisions. They then intermittently reexamined the full data volume for changes in vertical storm structure. When a warning was issued, the forecaster monitored the storm for radar-based evidence of tornado development. Therefore, with more radar data available to peruse, future display systems need to facilitate forecasters' ability to maintain awareness of storms where and when that awareness is needed. Example display characteristics could include a rapid response to user requests, the ability to simultaneously display multiple loops over an extensive period, and robust methods for displaying the full radar volume (e.g., three-dimensional volumes). Furthermore, robust storm-attribute algorithms could aid forecasters in managing their workloads by identifying and tracking storm intensity indicators and severe storm precursors. Although we observed forecasters making changes to their work methods, acclimating to the pace of rapid-scan PAR, and performing fairly well while using the PAR data, of interest is how forecasters' data processing and performance might change over time, and in a real operational setting.
In operations, forecasters have many other environmental and model-based datasets available and challenges that were purposely not represented in 2012 PARISE. These differences could be addressed through the integration of rapid-scan PAR data into a WFO test site. Such integration, though, presents its own challenges, such as the ability of an operational AWIPS-2 to display adaptively scanned radar data, where preset VCPs, azimuthal sectors, and update times are not the norm. Another option could be the implementation of a longitudinal study, where the same forecasters intermittently work PAR cases in simulated real time for an extended period. Such studies would reveal changes in work processes as forecasters adapt to the data, support the development of best practices for data interrogation, and further shed light on human factors, training issues, and how best to integrate these data into operations.

8. Conclusions

In 2012 PARISE, 12 forecasters worked two tornadic and two nontornadic supercell events in simulated real time. Forecasters’ use of rapid-scan PAR data resulted in strong statistical performance in terms of TLT, PPOD, and POFA. The resulting 20-min median TLT exceeds both the participants’ regional- (NWS Central and Southern Regions) and WFO-based median TLTs of 11 and 13.5 min for similarly rated tornadoes within the spring seasons of 2008–13. This 20-min median TLT also exceeds the highest tornado-warning lead time achieved in 2010 PARISE [18 min; Heinselman et al. (2012)]. Their 20-min median TLT was accompanied by a median PPOD of 0.95 and a median POFA of 0.33; both values indicate performance superior to the associated median regional and WFO PPODs (0.72 and 0.74, respectively) and POFAs (0.71 and 0.72, respectively). Examination of median TLTs and POFAs for each forecaster showed similar performance regardless of the level of experience (Fig. 6). While the increase in sample size from 2010 PARISE to 2012 PARISE and the evaluation of different severe weather events with different forecasters enhance the external validity of both studies, the sample size is still too small for the results to be generalized. Future studies that increase the sample size and the variety of storm types and intensities are needed to further improve external validity. Given, though, that this was the forecasters’ first exposure to rapid-scan PAR data, with no time or training given to acclimate to the increased data load, these statistical results may represent the low end of the potential for rapid-scan PAR data to aid forecaster performance. To put these results in context, cognitive task analysis was applied to the two cases with the greatest spread of warning decision times: 14 April 2011 (nontornadic) and 22 May 2011 (tornadic).
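The verification scores above reduce to standard 2 × 2 contingency arithmetic: PPOD = hits/(hits + misses), POFA = false alarms/(hits + false alarms) (the false-alarm ratio of Barnes et al. 2009), and TLT is the time from warning issuance to tornado occurrence. A minimal sketch with hypothetical counts chosen only to illustrate the calculation (not the experiment’s actual verification records):

```python
from statistics import median

def ppod(hits: int, misses: int) -> float:
    """Polygon-based probability of detection."""
    return hits / (hits + misses)

def pofa(hits: int, false_alarms: int) -> float:
    """Probability of false alarm: fraction of warnings that did not verify."""
    return false_alarms / (hits + false_alarms)

def tornado_lead_times(issue_min, event_min):
    """Lead time = tornado occurrence time minus warning issuance time (minutes)."""
    return [e - i for i, e in zip(issue_min, event_min)]

# Hypothetical example: 2 verified warnings, 0 missed tornadoes, 1 false alarm.
print(ppod(2, 0))                                     # 1.0
print(round(pofa(2, 1), 2))                           # 0.33
print(median(tornado_lead_times([0, 10], [18, 32])))  # 20.0
```

With these invented counts the scores happen to match the study’s headline values, which shows how few events are behind such statistics and why the authors caution against generalizing from the sample size.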
The RCW protocol (Hoffman 2005) adapted for this experiment was an effective method for gaining understanding of forecasters’ decision-making processes, conceptual models, and trigger points, as well as gaining insight into the relative importance of rapid-scan data. We recommend the use of this method to others working in testbed environments and seeking to understand users’ perspectives on new products or technologies. Forecasters’ ability to match the attributes and evolution seen in the PAR data to their conceptual models resulted in successful performance statistics overall (Fig. 4). As found in this study, Andra et al. (2002) reported that such cognitive activity is important to successful NWS warning operations. Forecasters in this case benefited from the rapid-update, full-volumetric data in gaining confidence in their early warning decisions. They
were able to see the intensity, persistence, and important changes in features aloft that are precursors to tornadogenesis.

Acknowledgments. We thank the 12 NWS forecasters for their participation in this study; the Southern and Central Region SSD chiefs, MICs, and SOOs for aiding recruitment; and Les Lemon and Steve Martinaitis for their participation in our experiment test run. Interactions with Brent W. MacAloney II and Rick Smith about storm-based NWS verification aided our quantitative analysis; thank you both! We thank Vicki Farmer for skillfully reproducing forecasters’ conceptual model drawings into production-quality figures. We also thank the following colleagues for their contributions to this research: Experimental Warning Program leads Greg Stumpf, Travis Smith, and David Andra; A/V specialist James Murnan; GIS expert Ami Arthur; WDSS-II expert Kiel Ortega; simulation expert Dale Morris; and software experts Eddie Forren and Hoyt Burcham. Thanks also to Jimmy Correia for helping with data collection, Jim LaDue for preparing the weather briefings, and Harold Brooks for thought-provoking discussions on this research. Finally, thanks to those who reviewed earlier versions of this manuscript, including Katie Bowden, Rodger Brown, Bill Bunting, Susan Cobb, Kurt Hondl, Charles Kuster, David Priegnitz, and Lans Rothfusz, and to the three anonymous reviewers who provided substantive comments that improved the paper. Funding was provided by NOAA/Office of Oceanic and Atmospheric Research under NOAA–University of Oklahoma Cooperative Agreement NA11OAR4320072, U.S. Department of Commerce.

REFERENCES

Andra, D. L., E. M. Quoetone, and W. F. Bunting, 2002: Warning decision making: The relative roles of conceptual models, technology, strategy, and forecaster expertise on 3 May 1999. Wea. Forecasting, 17, 559–566, doi:10.1175/1520-0434(2002)017<0559:WDMTRR>2.0.CO;2.
Barnes, L. R., D. M. Schultz, E. C. Gruntfest, M. H. Hayden, and C. C. Benight, 2009: False alarm rate or false alarm ratio? Wea. Forecasting, 24, 1452–1454, doi:10.1175/2009WAF2222300.1.
Bluestein, H. B., M. M. French, I. PopStefanija, R. T. Bluth, and J. B. Knorr, 2010: A mobile, phased-array Doppler radar for the study of severe convective storms. Bull. Amer. Meteor. Soc., 91, 579–600, doi:10.1175/2009BAMS2914.1.
Brooks, H. E., C. A. Doswell III, and M. P. Kay, 2003: Climatological estimates of local daily tornado probability for the United States. Wea. Forecasting, 18, 626–640, doi:10.1175/1520-0434(2003)018<0626:CEOLDT>2.0.CO;2.
Brotzge, J., and S. Erickson, 2010: Tornadoes without NWS warning. Wea. Forecasting, 25, 159–172, doi:10.1175/2009WAF2222270.1.
Brown, R. A., and V. T. Wood, 2012: Simulated vortex detection using a four-face phased-array Doppler radar. Wea. Forecasting, 27, 1598–1603, doi:10.1175/WAF-D-12-00059.1.

——, ——, and D. Sirmans, 2000: Improved WSR-88D scanning strategies for convective storms. Wea. Forecasting, 15, 208–220, doi:10.1175/1520-0434(2000)015<0208:IWSSFC>2.0.CO;2.
——, ——, R. M. Steadham, R. R. Lee, B. A. Flickinger, and D. Sirmans, 2005: New WSR-88D volume coverage pattern 12: Results of field tests. Wea. Forecasting, 20, 385–393, doi:10.1175/WAF848.1.
Chrisman, J. N., 2009: Automated volume scan evaluation and termination (AVSET): A simple technique to achieve faster volume scan updates for the WSR-88D. Preprints, 34th Conf. on Radar Meteorology, Williamsburg, VA, Amer. Meteor. Soc., P4.4. [Available online at http://ams.confex.com/ams/pdfpapers/155324.pdf.]
Crandall, B., G. Klein, and R. R. Hoffman, 2006: Working Minds: A Practitioner’s Guide to Cognitive Task Analysis. The MIT Press, 332 pp.
Crum, T., S. D. Smith, J. N. Chrisman, R. E. Saffle, R. W. Hall, and R. J. Vogt, 2013: WSR-88D radar projects: 2013 update. Proc. 29th Conf. on Environmental Information Processing Technologies, Austin, TX, Amer. Meteor. Soc., 8.1. [Available online at https://ams.confex.com/ams/93Annual/webprogram/Manuscript/Paper221461/2013EIPT_WSR88D_Radar_Projects_2013Update_Final3.pdf.]
Curtis, C. D., and S. M. Torres, 2011: Adaptive range oversampling to achieve faster scanning on the National Weather Radar Testbed phased-array radar. J. Atmos. Oceanic Technol., 28, 1581–1597, doi:10.1175/JTECH-D-10-05042.1.
Ericsson, K. A., N. Charness, P. J. Feltovich, and R. R. Hoffman, Eds., 2006: Cambridge Handbook of Expertise and Expert Performance. Cambridge University Press, 901 pp.
Greene, D. R., J. D. Nilsen, R. E. Saffle, D. W. Holmes, M. D. Hudlow, and P. R. Ahnert, 1983: RADAP II, an interim radar data processor. Preprints, 21st Conf. on Radar Meteorology, Edmonton, AB, Canada, Amer. Meteor. Soc., 404–408.
Heinselman, P. L., and S. M. Torres, 2011: High-temporal resolution capabilities of the National Weather Radar Testbed Phased-Array Radar. J. Appl. Meteor. Climatol., 50, 579–593, doi:10.1175/2010JAMC2588.1.
——, D. S. LaDue, and H. Lazrus, 2012: Exploring impacts of rapid-scan radar data on NWS warning decisions. Wea. Forecasting, 27, 1031–1044, doi:10.1175/WAF-D-11-00145.1.
Hoffman, R. R., 2005: Protocols for cognitive task analysis. Advanced Decision Architectures Collaborative Technology Alliance, 108 pp. [Available online at http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA475456.]
Isom, B., and Coauthors, 2013: The atmospheric imaging radar: Simultaneous volumetric observations using a phased array weather radar. J. Atmos. Oceanic Technol., 30, 655–675, doi:10.1175/JTECH-D-12-00063.1.
Johnson, J. T., M. D. Eilts, A. White, W. Armstrong, T. J. Ganger, and M. Istock, 1999: The common operations and development environment: An environment for development and testing hydrometeorological applications. Preprints, 29th Conf. on Radar Meteorology, Montreal, QC, Canada, Amer. Meteor. Soc., 65–68.
Karstens, C. D., W. A. Gallus Jr., B. D. Lee, and C. A. Finley, 2013: Analysis of tornado-induced tree fall using aerial photography from the Joplin, Missouri, and Tuscaloosa–Birmingham, Alabama, tornadoes of 2011. J. Appl. Meteor. Climatol., 52, 1049–1068, doi:10.1175/JAMC-D-12-0206.1.
Lee, R. R., and R. M. Steadham, 2004: WSR-88D algorithm comparisons of VCP 11 and new VCP 12. Preprints, 20th Conf. on Interactive Information and Processing Systems (IIPS) for Meteorology, Oceanography, and Hydrology, Seattle, WA,
Amer. Meteor. Soc., 12.7. [Available online at https://ams.confex.com/ams/pdfpapers/69402.pdf.]
Marshall, T. P., W. Davis, and S. Runnels, 2012: Damage survey of the Joplin tornado. Proc. 26th Conf. on Severe Local Storms, Nashville, TN, Amer. Meteor. Soc., 6.1. [Available online at https://ams.confex.com/ams/26SLS/webprogram/Paper211662.html.]
Militello, L. G., and R. J. B. Hutton, 1998: Applied cognitive task analysis (ACTA): A practitioner’s toolkit for understanding cognitive task demands. Ergonomics, 41, 1618–1641, doi:10.1080/001401398186108.
NOAA, 2011: Joplin, Missouri, Tornado—May 22, 2011. NWS Central Region Service Assessment, 40 pp. [Available online at http://www.nws.noaa.gov/os/assessments/pdfs/Joplin_tornado.pdf.]
Office of the Federal Coordinator for Meteorological Services and Supporting Research, 2013: Part A: System concepts, responsibilities, and procedures. Doppler Radar Meteorological Observations, Federal Meteorological Handbook 11, FCM-H11A-2013, 49 pp. [Available online at http://www.ofcm.gov/fmh11/fmh11.htm.]
Omodei, M. M., and J. McLennan, 1994: Studying complex decision making in natural settings: Using a head-mounted video camera to study competitive orienteering. Percept. Mot. Skills, 79, 1411–1425, doi:10.2466/pms.1994.79.3f.1411.
Priegnitz, D. L., and P. Heinselman, 2013: Detection and adaptive scheduling on the NWRT phased-array radar. Proc. 36th Conf. on Radar Meteorology, Breckenridge, CO, Amer. Meteor. Soc., P.147. [Available online at https://ams.confex.com/ams/36Radar/webprogram/Paper228570.html.]
Saffle, R. E., 1976: D/RADEX products and field operation. Preprints, 17th Conf. on Radar Meteorology, Seattle, WA, Amer. Meteor. Soc., 555–559.

Stensrud, D. J., and Coauthors, 2009: Convective-scale warn-on-forecast system. Bull. Amer. Meteor. Soc., 90, 1487–1499, doi:10.1175/2009BAMS2795.1.
Sutter, D., and S. Erickson, 2010: The time cost of tornado warnings and savings with storm-based warnings. Wea. Climate Soc., 2, 103–112, doi:10.1175/2009WCAS1011.1.
Wicker, L. J., C. K. Potvin, T. E. Thompson, D. J. Stensrud, and P. L. Heinselman, 2014: Improved convective scale prediction from the assimilation of rapid-scan phased array radar data. Preprint, 22nd Conf. on Numerical Weather Prediction, Atlanta, GA, Amer. Meteor. Soc., 7.7. [Recorded presentation available online at https://ams.confex.com/ams/94Annual/webprogram/Paper240008.html.]
Witt, A., M. D. Eilts, G. J. Stumpf, E. D. Mitchell, J. T. Johnson, and K. W. Thomas, 1998: Evaluating the performance of WSR-88D severe storm detection algorithms. Wea. Forecasting, 13, 513–518, doi:10.1175/1520-0434(1998)013<0513:ETPOWS>2.0.CO;2.
Wurman, J., D. Dowell, Y. Richardson, P. Markowski, E. Rasmussen, D. Burgess, L. Wicker, and H. B. Bluestein, 2012: The Second Verification of the Origins of Rotation in Tornadoes Experiment: VORTEX2. Bull. Amer. Meteor. Soc., 93, 1147–1170, doi:10.1175/BAMS-D-11-00010.1.
Yu, T.-Y., M. B. Orescanin, C. D. Curtis, D. S. Zrnic, and D. E. Forsyth, 2007: Beam multiplexing using the phased-array weather radar. J. Atmos. Oceanic Technol., 24, 616–626, doi:10.1175/JTECH2052.1.
Yussouf, N., and D. J. Stensrud, 2010: Impact of phased-array radar observations over a short assimilation period: Observing system simulation experiments using an ensemble Kalman filter. Mon. Wea. Rev., 138, 517–538, doi:10.1175/2009MWR2925.1.
Zrnic, D. S., and Coauthors, 2007: Agile-beam phased array radar for weather observations. Bull. Amer. Meteor. Soc., 88, 1753–1766, doi:10.1175/BAMS-88-11-1753.