JOURNAL OF HYDROMETEOROLOGY
VOLUME 13
Evaluation of Radar Precipitation Estimates from the National Mosaic and Multisensor Quantitative Precipitation Estimation System and the WSR-88D Precipitation Processing System over the Conterminous United States
WANRU WU AND DAVID KITZMILLER
NOAA/NWS/Office of Hydrologic Development, Silver Spring, Maryland
SHAORONG WU
NOAA/NWS/Office of Hydrologic Development, Silver Spring, Maryland, and TCAssociates, Inc., Springfield, Virginia
(Manuscript received 25 May 2011, in final form 10 November 2011)
ABSTRACT
This study evaluated 24-, 6-, and 1-h radar precipitation estimates from the National Mosaic and Multisensor Quantitative Precipitation Estimation System (NMQ) and the Weather Surveillance Radar-1988 Doppler (WSR-88D) Precipitation Processing System (PPS) over the conterminous United States (CONUS) for the warm season April–September 2009 and the cool season October 2009–March 2010. Precipitation gauge observations from the Automated Surface Observing System (ASOS) were used as the ground truth. Gridded StageIV multisensor precipitation estimates were applied for supplementary verification. The comparison of the two systems consisted of a series of analyses including the linear correlation coefficient (CC) and the root-mean-square error (RMSE) between the radar precipitation estimates and the gauge observations, categorical scores for detection of large precipitation amounts, and the reliability of the precipitation amount distribution. Data stratified for the 12 CONUS River Forecast Centers (RFCs) and for cold rain events with bright-band effects were analyzed additionally. Major results are 1) the linear CC of NMQ versus ASOS is generally higher than that of PPS versus ASOS over CONUS, while the spatial variations stratified by the RFCs may switch with seasons; 2) compared to the precipitation distribution of ASOS, NMQ shows less deviation than PPS; 3) for cold rain events verified against ASOS, NMQ has higher CC, while PPS has lower RMSE for 6-h and higher RMSE for 1-h accumulations; and 4) for the precipitation detection categorical scores, either NMQ or PPS can be superior, depending on the time interval and season. The verification against StageIV gridded precipitation estimates showed that NMQ consistently had higher correlations and lower biases than did PPS.
1. Introduction Radar precipitation estimates are used extensively in National Oceanic and Atmospheric Administration (NOAA) National Weather Service (NWS) operations (Maddox et al. 2002). The National Mosaic and Multisensor Quantitative Precipitation Estimation (QPE) System (NMQ; Seo et al. 2005; Vasiloff et al. 2007; Langston et al. 2007; Zhang et al. 2005, 2011) is a joint effort among several NOAA and non-NOAA offices [e.g., National
Corresponding author address: Dr. Wanru Wu, Hydrology Laboratory, Office of Hydrologic Development, National Weather Service, NOAA, 1325 East-West Highway, Silver Spring, MD 20910. E-mail:
[email protected] DOI: 10.1175/JHM-D-11-064.1 © 2012 American Meteorological Society
Severe Storms Laboratory (NSSL), National Centers for Environmental Prediction (NCEP), Office of Hydrologic Development (OHD), Federal Aviation Administration (FAA), etc.] intended to test and demonstrate QPE algorithms that are not currently implemented in the Next Generation Weather Radar (NEXRAD) Precipitation Processing System (PPS; Fulton 1998; Fulton et al. 1998). Following favorable test results and positive feedback from field users, the NWS is investigating options for implementing NMQ operationally. Toward that end, we have undertaken a broad-scale evaluation and comparison of NMQ and currently operational PPS QPE products by using real-time data from most of the conterminous United States (CONUS). This study is conducted to explore and better understand when and where NMQ and PPS radar-only products differ significantly. This
JUNE 2012
WU ET AL.
effort provides information to field personnel, who have access to both NMQ and PPS products and who must select and blend input from the two sources in operations. Evaluation of QPE over a wide range of space–time scales is a challenge because of the large space–time variability of precipitation and the need to address different operational uses of QPE. For instance, the 1-h time scale is presently applied in some distributed and lumped operational hydrologic models, and we can anticipate that such uses will expand in the near future. The 6-h interval matches the time step currently used in most operational lumped river models, and is a standard for gauge-only analysis in much of the western United States. The 24-h interval corresponds to daily observations, which are collected at many more points than are subdaily observations and which are routinely used in some hydrologic operations. Hence, we collected 24-, 6-, and 1-h precipitation estimates spanning April 2009–March 2010 from 1) the real-time NMQ prototype system developed and maintained by NSSL and 2) the mosaicked Weather Surveillance Radar-1988 Doppler (WSR-88D) PPS products prepared by the National Climatic Data Center (NCDC) for the warm season of April–September 2009 and NCEP (Lin and Mitchell 2005) for the cool season of October 2009–March 2010. It is generally accepted that the most successful operational technique for improving the radar precipitation estimates has been to ‘‘calibrate’’ the radar with rain gauges (Wilson and Brandes 1979). The primary verification data were from the Automated Surface Observing System (ASOS) rain gauge observations, which are considered the ground truth. 
Comparisons also were made relative to the best available blended gridded product, the NCEP StageIV data, which were composited from grids generated operationally at River Forecast Centers (RFCs) with rain gauge input and manual quality control, and were considered among the best 4-km gridded precipitation estimates available in near–real time (Lin and Mitchell 2005). The NMQ and PPS radar-only QPEs were evaluated against ASOS observations in terms of linear correlation coefficient (CC) and root-mean-square error (RMSE) for different precipitation thresholds. Treatment of snow and melting snow as distinct hydrometeor types is done in NMQ but not in the PPS. We know that significant PPS errors are caused by some overly simple assumptions. We wished to assess the actual success of the refinement by evaluating cold rain events, which were selected based on an ASOS-reported temperature range of 2°–5°C at ASOS sites in the cool season. Categorical (yes–no) scores such as the probability of detection (POD), false alarm ratio (FAR), and critical success index (CSI) were calculated with respect to ASOS reports for certain amount
thresholds. Since hydrologic models implicitly rely on a realistic depiction of the spectrum of precipitation intensities, a quantile analysis was also computed to show if the remote sensing systems were capable of retrieving high-intensity precipitation as well as producing unbiased estimates over a variety of events. The rank correlation and pattern correlation, relative to StageIV, were calculated as a supplementary evaluation, primarily to assess automated quality control contained in the two radar algorithm suites. In this paper, we review further details of the data used and describe the methods applied for the study in section 2, and present and discuss the results of the radar–rain gauge verification in section 3. The complementary radar–StageIV verification results are described in section 4. Section 5 summarizes the study.
2. Data and methods The radar-only precipitation estimates from the NMQ system and the PPS products were applied to perform statistical analysis against ASOS rain gauge station observations and NCEP StageIV Multisensor Precipitation Estimator (MPE; Fulton 2005; Kitzmiller et al. 2012) gridded products. The two sets of statistics were compared to assess the performance of the two QPE systems. The data collection and statistical analysis are detailed below.
a. Data collection This study analyzed 24-, 6-, and 1-h radar-only precipitation estimates from the NMQ and PPS products and verification data from ASOS precipitation gauge observations and from StageIV gridded products. The data spanned April 2009 to March 2010 and covered warm and cool seasons. The 24-h precipitation periods end at 1200 coordinated universal time (UTC), and the 1- and 6-h precipitation periods end at 0000, 0600, 1200, and 1800 UTC (i.e., the 24- and 6-h accumulations cover the entire day while the 1-h samples cover only 4 h of each day). The NMQ mosaic domain includes eight computational tiles over CONUS (see details at http://www.nssl.noaa.gov/projects/q2/tutorial/3dmosaic.php). The original 0.01° NMQ mesh data were interpolated to the 4-km Hydrologic Rainfall Analysis Project (HRAP; Schaake 1989; Reed and Maidment 1999) grid, which is a polar stereographic grid coordinate system with 1121 × 881 grid points used within the NWS. The HRAP grid values are the mean of the NMQ grid values falling within each grid box. This remapping of the original NMQ data corresponds to the products delivered to and used at RFCs. Because of data availability, the mosaicked PPS products from NCDC were applied for the warm season (April–September 2009) and the NCEP StageII radar-only precipitation estimates were applied for the cool season (October 2009–March 2010).
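As an illustration only, the remapping rule stated above (each coarse grid value is the mean of the fine-mesh values falling within its box) can be sketched in Python. The sketch assumes that the coarse-box index of every fine cell has already been computed; the function name `box_mean`, the toy indices, and the treatment of negative values as invalid are hypothetical, and the HRAP polar stereographic projection itself is not reproduced.

```python
from collections import defaultdict

def box_mean(values, box_index):
    """Average fine-mesh values that fall in the same coarse grid box.

    values    : fine-mesh precipitation amounts (mm); None marks missing data
    box_index : coarse-box identifier for each fine cell (hypothetical here;
                in practice it would come from the map projection)
    Returns a dict {box id: mean of the valid fine values in that box}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, box in zip(values, box_index):
        if value is None or value < 0:   # assumption: skip missing/invalid cells
            continue
        sums[box] += value
        counts[box] += 1
    return {box: sums[box] / counts[box] for box in sums}

# Toy example: six fine cells mapped onto three coarse boxes.
fine = [1.0, 3.0, 2.0, 6.0, None, 4.0]
boxes = [0, 0, 1, 1, 1, 2]
coarse = box_mean(fine, boxes)
```

Averaging, rather than nearest-neighbor sampling, keeps the area-mean amount of the fine mesh in each coarse box.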
The corresponding 24-, 6-, and 1-h ASOS gauge precipitation amounts were from 375 sites where observations were manually augmented. The StageIV multisensor gridded precipitation products were applied as verification for NMQ and PPS grids to assess automated radar quality control and spatial correlation of precipitation patterns in general. To test the performance of the radar algorithms in bright-band situations, the cold rain events were selected from ASOS, NMQ, and PPS based on an ASOS-reported temperature range of 2°–5°C at ASOS sites in the cool season. Note that the NMQ and PPS products were derived solely from radar data. During the course of the investigation, it was discovered that NCEP StageII data designated “radar only” had in fact been adjusted with rain gauge–radar bias information prior to 1 October 2009 (Y. Lin and S. Vasiloff 2009, personal communication). This necessitated the use of data from an alternative archive, prepared by NCDC, for the preceding warm season. Henceforth, we refer to either the NCDC or StageII mosaics as “PPS” products. ASOS rain gauge and temperature observations, including only those reports with valid data (precipitation ≥ 0, no missing data flags), were extracted from NWS Meteorological Development Laboratory archives. The NMQ and PPS estimates at the corresponding latitude–longitude points were extracted from their respective grids. Hereafter we will employ the following naming conventions: 1) NMQ: the radar-only precipitation estimates from NMQ system, 2) PPS: the mosaicked PPS products from NCDC in the warm season and/or NCEP StageII radar-only products in cool season, 3) ASOS: the ASOS rain gauge precipitation, and 4) StageIV: the StageIV multisensor precipitation estimation products.
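The two selection rules described above (valid reports only, and cold rain events defined by a surface temperature of 2°–5°C) can be sketched as a simple filter. This is a minimal illustration, not the processing actually used in the study; the record layout with keys `precip_mm` and `temp_c`, and the use of `None` for a missing-data flag, are assumptions.

```python
def select_cold_rain(reports, t_min=2.0, t_max=5.0):
    """Keep valid reports whose temperature lies in [t_min, t_max] deg C.

    Each report is a dict with the hypothetical keys 'precip_mm' and
    'temp_c'; None stands in for a missing-data flag.
    """
    selected = []
    for report in reports:
        precip = report.get("precip_mm")
        temp = report.get("temp_c")
        # Validity rule from the text: precipitation >= 0 and nothing missing.
        if precip is None or temp is None or precip < 0:
            continue
        if t_min <= temp <= t_max:
            selected.append(report)
    return selected

reports = [
    {"precip_mm": 1.2, "temp_c": 3.0},    # valid, inside the range
    {"precip_mm": 0.0, "temp_c": 5.0},    # valid, on the upper bound
    {"precip_mm": 4.0, "temp_c": 9.5},    # valid, but too warm
    {"precip_mm": None, "temp_c": 2.5},   # missing precipitation
    {"precip_mm": -9.0, "temp_c": 4.0},   # invalid (negative) amount
]
cold_rain = select_cold_rain(reports)
```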
It should be noted that we generally refer to quantities as precipitation rather than rainfall; no effort was made to identify rainfall versus frozen precipitation cases except in our experiment designed to focus on cold rain situations. The ASOS and StageIV verification data are liquid precipitation amounts. The NMQ estimates are for liquid precipitation rate, specifically formulated to estimate the liquid equivalent precipitation when temperature data indicate the radar is detecting snow. The PPS assumes all radar detections are of rainfall (Fulton et al. 1998). This is a limitation in some weather situations.
b. Major algorithm differences between NMQ and PPS There are some functional differences between NMQ and PPS that may contribute to the comparison results, such as the quality control (QC) procedure, the Z–R relationship, the freezing-level identification, etc. The PPS QC concentrated on ground clutter, anomalous
propagation, and fixed ground targets such as buildings. The NMQ QC treats these phenomena and further attempts to identify and remove more categories of nonprecipitation reflectivity, including biota and sun strobes. In the NMQ system the Z–R relationships are specified dynamically, while in the PPS system they are set by the radar meteorologist and applied over the entire umbrella. The NMQ system incorporates freezing-level identification and precipitation phase based on data from the Rapid Update Cycle 2 (RUC2) model (Benjamin et al. 2004). The PPS system is being supplanted by newer dual-polarization-based algorithms. The NMQ system is still evolving. A vertical reflectivity profile correction was implemented in the NMQ system just after our study period (Zhang et al. 2011). Further studies of the effects of this correction, particularly in mountain regions, are ongoing.
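To make the role of the Z–R relationship concrete, the sketch below inverts a power law of the form Z = aR^b to obtain a rain rate from reflectivity. The coefficients are illustrative assumptions, not values taken from this study: a = 300, b = 1.4 is a commonly quoted WSR-88D default and a = 200, b = 1.6 is the classic Marshall–Palmer form. The function name is hypothetical.

```python
def rain_rate_from_dbz(dbz, a=300.0, b=1.4):
    """Invert Z = a * R**b for the rain rate R (mm/h).

    dbz  : reflectivity in dBZ, i.e. 10 * log10(Z) with Z in mm^6 m^-3
    a, b : Z-R coefficients (illustrative defaults, not from the paper)
    """
    z_linear = 10.0 ** (dbz / 10.0)      # convert dBZ to linear units
    return (z_linear / a) ** (1.0 / b)

# The same 40-dBZ echo gives different rates under different Z-R choices,
# which is one way the two systems can diverge for identical radar input.
rate_default = rain_rate_from_dbz(40.0)                  # a=300, b=1.4
rate_marshall_palmer = rain_rate_from_dbz(40.0, 200.0, 1.6)
```

A system that switches coefficients by precipitation regime would call such a function with different (a, b) pairs per grid cell, whereas a fixed-coefficient system would use one pair for the whole radar umbrella.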
c. Quality of verification data Gauge reports are commonly treated as ground truth and used for operational bias correction in NWS operations. Their readings are not subject to the numerous errors affecting radar estimates. This study relied primarily on ASOS rain gauge reports since these observations are of generally high quality, and utilize a weighing mechanism less subject to mechanical error than the tipping-bucket mechanism commonly used at automated reporting sites. Some details about local random errors in tipping-bucket rain gauge measurements are described in Ciach (2003). The ASOS reports used in this study were from 375 sites at which the observations were manually augmented, indicating the real-time attendance by observers. While the gauge measurements themselves are subject to errors, the errors are considerably smaller than those from radar estimates. Therefore, gauge reports are critical to operational precipitation estimation, and they are used to correct biases in radar estimates. Our analysis of StageIV data is potentially complicated by the fact that both NMQ and PPS products generally enter RFC operational analyses, which are in turn composited for StageIV. However, since StageIV is generally prepared from both radar and rain gauge information in addition to receiving manual quality control, our aim in comparing NMQ and PPS radar-only products with StageIV was to determine which one generally comes the closest to the best gridded precipitation estimates currently available in near–real time, and if the general results are consistent with that verified with ASOS. No specific quality control was applied to either the ASOS gauge reports or the StageIV gridded data. Our intent was not to assess the absolute accuracy of the radar estimates, but rather their relative ability to approximate two generally reliable reference datasets. As reported in
TABLE 1. Statistical summary for 24-, 6-, and 1-h precipitation verification against ASOS in warm and cool seasons, including valid data number (case), CC, RMSE (mm), and CONUS AVG (mm) with the numbers 1 denoting ASOS, 2 denoting NMQ, and 3 denoting PPS. Precipitation amount stratification is based on the ASOS-reported amount.
Time  Season  Range (mm)  Case    CC12   CC13   RMSE12  RMSE13  AVG1  AVG2  AVG3
24 h  Warm    ≥0          45205   0.871  0.843  4.5     5.3     3.1   3.4   3.8
24 h  Warm    ≥2.5        13942   0.826  0.785  8.2     9.5     9.8   10.6  11.9
24 h  Warm    ≥12.5       5134    0.720  0.662  12.8    14.8    21.0  22.1  23.9
24 h  Warm    ≥25         2253    0.616  0.535  17.5    20.2    32.1  33.6  36.2
24 h  Cool    ≥0          64039   0.846  0.783  4.5     4.7     2.3   2.6   2.0
24 h  Cool    ≥2.5        15431   0.789  0.699  9.0     9.6     9.2   10.3  8.1
24 h  Cool    ≥12.5       5154    0.672  0.522  14.8    15.8    21.0  22.8  17.3
24 h  Cool    ≥25         2230    0.553  0.356  20.3    21.7    32.4  35.0  25.9
6 h   Warm    ≥0          181408  0.855  0.828  2.1     2.4     0.8   0.9   1.0
6 h   Warm    ≥2.5        18715   0.778  0.738  6.6     7.5     7.4   8.0   8.7
6 h   Warm    ≥12.5       4981    0.612  0.552  11.7    13.0    18.1  19.2  20.5
6 h   Warm    ≥25         1823    0.476  0.383  16.6    19.0    28.2  29.7  32.0
6 h   Cool    ≥0          232297  0.825  0.770  2.0     2.0     0.7   0.7   0.5
6 h   Cool    ≥2.5        21454   0.730  0.635  6.4     6.5     6.6   7.0   5.3
6 h   Cool    ≥12.5       4470    0.530  0.367  12.5    13.0    17.2  18.8  13.2
6 h   Cool    ≥25         1266    0.394  0.165  19.5    20.0    27.2  32.2  21.4
1 h   Warm    ≥0          194885  0.781  0.759  0.7     0.8     0.1   0.1   0.2
1 h   Warm    ≥2.5        3898    0.624  0.590  5.2     5.6     5.2   5.3   5.6
1 h   Warm    ≥5          1952    0.530  0.488  7.1     7.7     8.0   8.2   8.7
1 h   Cool    ≥0          241052  0.493  0.645  1.1     0.6     0.1   0.1   0.1
1 h   Cool    ≥2.5        3984    0.228  0.298  8.1     4.1     3.9   4.5   3.0
1 h   Cool    ≥5          1547    0.095  0.144  12.8    6.2     6.1   7.8   4.7
later sections, the radar–gauge correlations at individual sites, for both NMQ and PPS, were generally >0.7, a characteristic of correlations with specially maintained gauge networks. The results of this evaluation should be fundamentally consistent even in the event that additional, differing verification data become available.
d. Statistical analysis approaches
For the ground truth taken from ASOS precipitation reports, NMQ–ASOS and PPS–ASOS correlation coefficients and corresponding root-mean-square errors are calculated for precipitation amounts larger than 2.5, 12.5, and 25 mm in warm season, cool season, and selected cold rain events. The categorical scores such as POD, FAR, and CSI are calculated for various precipitation amount detection based on the ASOS reports. The quantile analyses are conducted to determine NMQ, PPS, and ASOS rain gauge distribution percentiles and corresponding typical values. For the best available blended gridded product StageIV MPE data, the spatial pattern correlations of NMQ–StageIV and PPS–StageIV are calculated to evaluate independently as a supplement to the ASOS verification. Since the sizes of data samples are large in this study, the significance z test of the correlation coefficients can be simplified (see z-test details in Sprinthall 2003). The z score can be estimated on the upper bound by $z \approx r\sqrt{N}$ for a correlation coefficient r with sample number N. Thus, by substituting critical z value of 1.64, 1.96, or 2.58 for 90%, 95%, or 99% significance level, the corresponding minimum significant correlation coefficient $r_{\min}$ can be estimated using $r_{\min} \approx z/\sqrt{N}$. That is, the correlation coefficient r is considered statistically significant when $r \ge r_{\min}$.
3. Verification against ASOS rain gauge observations
In this section, we first present the results of NMQ and PPS as verified against ASOS in terms of linear correlation and RMSE, then the spatial variation of linear correlation over ASOS sites and stratification according to RFC coverage area. The gridded radar estimates were scored in categorical (yes–no) terms relative to specific precipitation amounts. The statistics were also calculated for cold rain events, which represented a challenge to radar estimation because of vertical profile of reflectivity effects. We conclude with quantile analysis results to assess the reliability of the precipitation distribution.
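The linear correlation, RMSE, and large-sample significance threshold of section 2d can be sketched in a few lines of Python. This is an illustrative implementation under the stated large-N approximation (r_min ≈ z/√N), not the code used for the study; the function names and the toy radar and gauge values are hypothetical.

```python
import math

def linear_cc(x, y):
    """Pearson linear correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rmse(estimates, observations):
    """Root-mean-square error of estimates against observations."""
    n = len(observations)
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(estimates, observations)) / n)

def min_significant_cc(n, z_crit=1.96):
    """Large-sample threshold r_min ~ z / sqrt(N); 1.96 is the 95% level."""
    return z_crit / math.sqrt(n)

# Toy radar/gauge pairs (mm), invented for illustration.
gauge = [0.0, 1.0, 2.5, 4.0, 8.0]
radar = [0.2, 0.8, 3.0, 3.5, 9.5]
cc = linear_cc(radar, gauge)
err = rmse(radar, gauge)

# Threshold for the smallest subsample in Table 1 (N = 1547).
r_min = min_significant_cc(1547)
```

For N = 1547, the smallest case count in Table 1, the 95% threshold is about 0.05, so with samples of this size even small correlations differ significantly from zero.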
a. Radar–rain gauge correlation The linear correlation between radar QPE and rain gauge observations was calculated from valid data at 375 manually augmented ASOS sites across the CONUS to determine the relative skill of NMQ and PPS with respect to ASOS. The statistics for the entire CONUS are summarized in Table 1 for warm and cool seasons, with four
TABLE 2. Rank CC of NMQ vs ASOS (denoted as NMQ) and PPS vs ASOS (denoted as PPS) and corresponding case number (case) for 24-, 6-, and 1-h precipitation in warm and cool seasons.
         24 h                   6 h                     1 h
Rank CC  NMQ    PPS    Case     NMQ    PPS    Case      NMQ    PPS    Case
Warm     0.818  0.748  45205    0.807  0.712  181408    0.751  0.637  194885
Cool     0.752  0.690  64039    0.788  0.702  232297    0.749  0.661  241052
precipitation amount regimes listed as “range” and the sample size shown in the “case” column for the significance level reference described in section 2d, “CC” and “RMSE” defined the same as before, “AVG” the mean value of corresponding cases, and the numbers 1–3 denoting ASOS “1,” NMQ “2,” and PPS “3,” respectively. The results show that NMQ generally had better skill scores (higher CC and lower RMSE) than PPS, especially for larger precipitation categories (≥12.5 and ≥25 mm). One exception was the 1-h precipitation in the cool season, in which PPS had higher skill compared to NMQ (however, the differences in CC values are as small as the noise level and not above the significance level). It appears that both systems had better skill in the warm season than in the cool season. Compared to ASOS mean precipitation values, NMQ values were generally biased high, while PPS values were biased high in the warm season and low in the cool season. A primary difference between NMQ and PPS is that NMQ is designed to identify different hydrometeor types and rainfall regimes and applies different Z–R relationships to each, while PPS uses a single Z–R relationship for the entire umbrella. We note that in most of the subsamples of the data, the NMQ had smaller bias than did the PPS estimates. The finding that NMQ produces larger amounts than PPS during the cool season could be due to its explicit identification and treatment of snow aloft. Since snow is less reflective than raindrops with equivalent
liquid mass, NMQ applies a Z–R relationship that compensates by nearly doubling the precipitation rate relative to a Z–R relationship based on liquid. The finding that the then-current version of NMQ produced generally lower precipitation than PPS during the warm season is more difficult to explain. Possibilities include differing assumptions about Z–R relationships, which are adjusted often in time and space within NMQ but are generally fixed for long periods in the PPS products contributing to StageII, and also the possibility that anomalous accumulations from biota are added to real precipitation more often in the PPS products. Some of the assumptions about the utility of the linear correlation coefficient depend on the variables being bivariate normal. Precipitation distributions are not normal. Therefore, we also analyzed the data according to the nonparametric rank correlation. The Spearman’s rank correlation coefficients were calculated based on the entire data sample, as shown in Table 2. In this analysis, the NMQ had a consistently higher correlation with ASOS than did the PPS. To further explore the cause of different results obtained from the linear CC and the rank CC for NMQ 1-h precipitation in cool season, we calculated for each of the ASOS sites the linear CC and rank CC of NMQ and PPS with ASOS. The differences in linear and rank correlations for the two systems (i.e., CC of NMQ versus ASOS minus CC of PPS versus ASOS) are plotted in Fig. 1, in
FIG. 1. The rank correlation differences (solid line) and the linear correlation differences (dashed line) between NMQ vs ASOS and PPS vs ASOS for 1-h precipitation in the cool season (October 2009–March 2010).
FIG. 2. Linear correlation coefficients of time series of (a) NMQ vs ASOS and (b) PPS vs ASOS at valid ASOS sites, (c) averaged correlation coefficients for 12 RFCs, and (d) the map showing RFCs with higher skills for NMQ (blue), PPS (red), and no significant difference (yellow) compared to each other for 24-h precipitation in the warm season (April–September 2009).
which the linear CC difference is the dashed line and the rank CC difference is the solid line. It is evident that at most of the ASOS sites, the correlation coefficients of NMQ versus ASOS were higher than that of PPS versus ASOS. At 12 sites, most in the north–central portion of the CONUS, the linear correlations for NMQ were much lower (shown as the significantly negative dashed lines in Fig. 1)—a finding that can be traced to a number of unreasonably large NMQ precipitation estimates at those points. Because of the limitations of the data, the causes of the large estimates cannot be identified in this study. Since the 1-h data samples were collected only at 0000, 0600, 1200, and 1800 UTC (the same as 6-h data), the results only represent these four measurements of a day, and it is possible for some apparent inconsistencies to arise between the 1-h and the 6- and 24-h NMQ–gauge correlations—the 6- and 24-h results being calculated from data covering entire days.
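The contrast between linear and rank correlation discussed above can be illustrated with a small sketch. It computes Spearman's rank correlation as the Pearson correlation of average ranks and shows how a single unreasonably large radar estimate depresses the linear CC far more than the rank CC. The data values are invented for illustration and the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0     # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented example: one spuriously large radar value at a light-rain report.
gauge = [0.2, 0.5, 1.0, 2.0, 3.0, 4.0]
radar = [30.0, 0.4, 1.1, 1.8, 3.2, 4.1]
linear = pearson(radar, gauge)
rank = spearman(radar, gauge)
```

In this toy case the outlier drives the linear CC negative while the rank CC stays positive, which is the kind of divergence the per-site comparison in Fig. 1 is designed to expose.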
b. Spatial variation of skill, and stratification by RFC area To identify any geographical dependency in the verification statistics, the spatial variations of time series correlation coefficients between 24-h NMQ–PPS and ASOS were plotted as shown in Figs. 2 and 3 for warm season and cool season, respectively, with regional averaged values displayed for 12 RFCs over CONUS as well, which
are Northwest (NW), Missouri Basin (MB), North Central (NC), Ohio (OH), Northeast (NE), Middle Atlantic (MA), California Nevada (CN), Colorado Basin (CB), Arkansas-Red Basin (AB), West Gulf (WG), Lower Mississippi (LM), and Southeast (SE). The spatial pattern of correlation in Fig. 2 shows some features that could be anticipated, such as generally lower values west of the 110° meridian, where lighter precipitation, a sparse radar network, and some beam blockage zones compromise radar estimates. In Fig. 3, a general decrease in correlations over the northern and western United States is evident, as might be expected in areas subject to much snow and mixed precipitation. According to Figs. 2c and 3c, the largest differences of correlation coefficients between NMQ versus ASOS and PPS versus ASOS exist in CN and CB for the warm season and in NW and WG for the cool season. While in general NMQ showed the higher skill over the entire CONUS, the results vary for individual RFCs. Compared to NMQ, PPS had higher skill over CN, AB, and WG and similar skill (no significant difference) in NE and SE in the warm season, while in the cool season PPS showed higher skill in NC, OH, and NE and similar skill in MB, AB, and MA, based on Figs. 2d and 3d. The comparison of the radar–rain gauge linear correlation analysis and RFC stratification between NMQ and PPS implies that NMQ had overall higher skill over the
FIG. 3. As in Fig. 2 but for the cool season (October 2009–March 2010).
entire CONUS, though the PPS can have significantly higher skills than NMQ in some RFC areas or over individual ASOS sites. For both cool and warm seasons, the spatial extent of areas with higher NMQ correlation was larger than that for areas with higher PPS correlation.
c. Categorical scores For some decision-making processes, a prime consideration is whether or not a certain precipitation amount threshold is exceeded. The POD, CSI, and FAR categorical scores are defined as $\mathrm{POD} = H/(H + M)$, $\mathrm{FAR} = F/(F + H)$, and $\mathrm{CSI} = H/(H + M + F)$, where H is the number of hits (events predicted and occurred), F is the number of false alarms (events predicted but did not occur), and M is the number of misses (events not predicted but occurred). Thus, for a perfect estimate, POD = 1 (no misses), FAR = 0 (no false alarms), and CSI = 1 (no misses and no false alarms). The categorical scores were calculated for both NMQ and PPS precipitation events exceeding 2.5, 12.5, and 25 mm for 24- and 6-h precipitation accumulations based on ASOS precipitation reports. For 1-h accumulations, in which 25-mm observations were very rare, the thresholds of 2.5, 5, and 7.5 mm were used for the calculation. The results are summarized in Table 3. It shows that, generally, in the warm season, NMQ had higher CSI and lower FAR but lower POD, and in the cool season NMQ had higher POD and CSI but also higher FAR compared to PPS. POD and CSI increase with longer averaging time (i.e., 24 h > 6 h > 1 h). The
TABLE 3. Categorical scores for precipitation detection on 24-, 6-, and 1-h accumulations in warm and cool seasons. The precipitation detection ranges are 2.5, 12.5, and 25 mm for 24- and 6-h accumulations and 2.5, 5, and 7.5 mm for 1-h accumulations.
                  Warm season                          Cool season
                  NMQ               PPS                NMQ               PPS
Time  Range (mm)  POD   FAR   CSI   POD   FAR   CSI    POD   FAR   CSI   POD   FAR   CSI
24 h  ≥2.5        0.90  0.22  0.72  0.90  0.33  0.62   0.82  0.28  0.63  0.76  0.28  0.59
24 h  ≥12.5       0.79  0.28  0.60  0.80  0.33  0.58   0.77  0.31  0.57  0.62  0.28  0.50
24 h  ≥25         0.67  0.37  0.48  0.70  0.41  0.47   0.67  0.38  0.48  0.43  0.34  0.35
6 h   ≥2.5        0.83  0.24  0.66  0.83  0.32  0.60   0.75  0.27  0.58  0.68  0.27  0.54
6 h   ≥12.5       0.71  0.35  0.52  0.72  0.38  0.50   0.66  0.38  0.47  0.43  0.33  0.35
6 h   ≥25         0.59  0.45  0.40  0.60  0.51  0.37   0.59  0.53  0.35  0.26  0.50  0.21
1 h   ≥2.5        0.70  0.33  0.52  0.71  0.35  0.51   0.63  0.36  0.47  0.45  0.35  0.36
1 h   ≥5          0.63  0.40  0.44  0.64  0.43  0.43   0.56  0.49  0.37  0.29  0.50  0.23
1 h   ≥7.5        0.58  0.45  0.39  0.58  0.50  0.37   0.52  0.64  0.27  0.21  0.64  0.15
FIG. 4. The linear correlation coefficients of NMQ vs ASOS (dark) and PPS vs ASOS (pattern) for (a) 6- and (b) 1-h cold rain events, selected based on ASOS temperature range 2°–5°C during October 2009–March 2010.
POD dropped sharply for PPS in cool season for the larger precipitation amounts. We noted that for all conditions, the CSI, which is effectively the fraction of correct forecasts in the “critical” subset where the event occurred or was forecasted, was at least 0.01 higher for the NMQ than for PPS. The differences in CSI were largest (≥0.05) for lower amount thresholds in the warm season and in the cool season.
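The categorical scores defined in this section follow directly from counts of hits, misses, and false alarms at a given threshold. The sketch below is a minimal illustration with invented radar and gauge values; the function name and the choice to return NaN when a denominator is zero are assumptions.

```python
def categorical_scores(estimates, observations, threshold):
    """Return (POD, FAR, CSI) for events defined as amount >= threshold.

    H = hits, M = misses, F = false alarms, so that
    POD = H/(H+M), FAR = F/(F+H), CSI = H/(H+M+F).
    """
    hits = misses = false_alarms = 0
    for est, obs in zip(estimates, observations):
        estimated = est >= threshold
        observed = obs >= threshold
        if estimated and observed:
            hits += 1
        elif observed:
            misses += 1
        elif estimated:
            false_alarms += 1
    nan = float("nan")   # assumption: undefined scores are reported as NaN
    pod = hits / (hits + misses) if hits + misses else nan
    far = false_alarms / (false_alarms + hits) if false_alarms + hits else nan
    csi = hits / (hits + misses + false_alarms) if hits + misses + false_alarms else nan
    return pod, far, csi

# Invented 6-h amounts (mm) at six gauge sites.
radar = [3.0, 0.0, 5.0, 1.0, 2.6, 0.0]
gauge = [2.5, 3.0, 4.0, 0.0, 1.0, 0.0]
pod, far, csi = categorical_scores(radar, gauge, threshold=2.5)
```

Here two events are hit, one is missed, and one is a false alarm, giving POD = 2/3, FAR = 1/3, and CSI = 1/2. Correct negatives do not enter any of the three scores.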
d. Cold rain (2°–5°C) events A particularly challenging situation for radar QPE is with a low freezing level when bright-band effects and uncertainty about hydrometeor phase can render basic assumptions incorrect. While neither PPS nor NMQ during this time period had reflectivity profile corrections, NMQ did incorporate logic that accounts for hydrometeor phase, including tracking of the freezing level. The cold rain data were selected based on ASOS site temperatures ranging from 2° to 5°C for 6- and 1-h rain gauge measurements during October 2009–March 2010.
While some of these cases might have included wet snow at the surface, such conditions generally produce rain. Since estimates of the location of the melting layer based on radar or numerical analyses are subject to spatial errors, surface temperature was used to identify appropriate cases. The 24-h events were not considered in the cold rain dataset because it was difficult to locate many cases in which the temperature criteria held throughout the event. The evaluation results of linear CC, RMSE, and average values are displayed in Figs. 4–6, with the number of cases plotted below the precipitation regime labeled for the x axis. Within this sample of cases, NMQ estimates had the higher correlation coefficients for both 6- and 1-h precipitation accumulations (Fig. 4). NMQ produced lower RMSE than did the PPS for 1-h values but higher RMSE for 6-h values (Fig. 5). The higher RMSE might have been due to a high bias for larger precipitation events (≥12.5 mm for 6-h precipitation and ≥2.5 mm for 1-h precipitation) as shown in Fig. 6. Overall,
FIG. 5. The RMSEs of NMQ vs ASOS (dark) and PPS (Stage II) vs ASOS (pattern) for (a) 6- and (b) 1-h cold rain events, selected based on ASOS temperature range 2°–5°C during October 2009–March 2010.
FIG. 6. The average values (mm) of ASOS (dark), NMQ (string pattern), and PPS (check pattern) for (a) 6- and (b) 1-h cold rain events, selected based on ASOS temperature range 2°–5°C during October 2009–March 2010.
however, NMQ–PPS differences in RMSE were modest except in the small fraction of heavier precipitation events. Recent improvements in NMQ, such as vertical reflectivity profile corrections (Zhang et al. 2011), might reduce such errors.
e. Precipitation amount distributions The NMQ and PPS data used in this study were interpolated to the HRAP grid, which is about 4 × 4 km²
resolution—the smallest scale treated in NWS hydrologic operations at present—and roughly the size of the smallest basins considered in flash flood monitoring. An important consideration for hydrologic applications is the ability of a remote sensing system to deliver not only unbiased but realistically distributed QPE values. This constraint is necessary to ensure proper streamflow response to the precipitation estimates, particularly for flooding in small basins. To assess the reliability of the
FIG. 7. The 24-h precipitation histogram calculated based on 0.5-mm-unit precipitation from ASOS (solid), NMQ (dashed), and PPS (dotted) data for the warm season (April–September 2009): (a) the smoothed histogram for small precipitation of probabilities >5%, (b) the smoothed histogram for large precipitation of probabilities <5%, (c) the accumulative histogram for top 10% large precipitation, and (d) the accumulative histogram for up to 90% of precipitation.
FIG. 8. As in Fig. 7, but for the cool season (October 2009–March 2010).
precipitation probability density function based on the small spatial HRAP grid data, a quantile analysis was performed to obtain precipitation histograms of NMQ and PPS. The HRAP grid NMQ and PPS data were interrogated over the ASOS sites. The results are shown in Figs. 7 and 8 for warm and cool seasons, respectively. The histograms were discretized based on 0.5-unit precipitation (i.e., 0.5 mm); thus, the minimum precipitation that can be displayed is 0.25 mm. As marked in the plots, the dark lines show ASOS, the red lines show NMQ, and the blue lines show PPS. In the warm season, for light precipitation less than 1 or 0.5 mm, the probability for any one amount could vary up to 10% between NMQ and ASOS and up to 20% between PPS and ASOS. The NMQ and PPS precipitation amounts corresponding to cumulative probability above 75% differed from ASOS by only a few millimeters. For
the overall distribution, NMQ agreed better with ASOS than did PPS. In the cool season, the differences between PPS and ASOS were more pronounced than those between NMQ and ASOS. Similar analyses were applied to 6- and 1-h precipitation, and some typical numbers are summarized in Table 4. In general, the NMQ distribution for higher cumulative probabilities (and amounts) more closely matched the ASOS distribution than did the PPS values.
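The binning and percentile steps described above can be sketched in a few lines of standard-library Python. This is an illustrative reading of the text, not the procedure actually used: it assumes that amounts are rounded to the nearest multiple of 0.5 mm (so that values below 0.25 mm fall outside the first bin and are dropped) and that percentiles follow the nearest-rank definition. The function names and the sample values are hypothetical.

```python
import math
from collections import Counter

BIN_MM = 0.5  # bin width stated in the text


def binned_probabilities(amounts_mm):
    """Relative frequency of each 0.5-mm bin, keyed by bin center (mm).

    Assumption: each amount is rounded to the nearest multiple of BIN_MM,
    and amounts below BIN_MM / 2 (0.25 mm) are not displayed.
    """
    kept = [a for a in amounts_mm if a >= BIN_MM / 2]
    if not kept:
        return {}
    counts = Counter(math.floor(a / BIN_MM + 0.5) * BIN_MM for a in kept)
    return {center: counts[center] / len(kept) for center in sorted(counts)}


def percentile_amount(amounts_mm, pct):
    """Amount at the given percentile, using the nearest-rank definition."""
    if not amounts_mm:
        raise ValueError("no data")
    ordered = sorted(amounts_mm)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical 24-h amounts (mm) sampled at gauge sites.
sample = [0.1, 0.3, 0.6, 0.7, 1.2, 5.0]
print(binned_probabilities(sample))
print(percentile_amount(sample, 75))
```

Applying the same two functions to the gauge series and to each radar series sampled at the gauge sites would give per-bin probabilities and percentile amounts that can be compared product by product, in the manner of Figs. 7 and 8 and Table 4.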
4. Verification relative to StageIV gridded MPE
The statistical verification against StageIV was performed to give a supplementary evaluation of the two radar QPE systems in terms of pattern correlation for precipitation thresholds ≥1, 2.5, 5, 10, 12.5, and 25 mm, respectively. Our aim was to assess the degree to which
TABLE 4. Precipitation distribution amount (mm) corresponding to typical percentiles 75%, 95%, and 99% of 24-, 6-, and 1-h accumulations in the warm and cool seasons.
                    24 h                  6 h                   1 h
Season   %      ASOS   NMQ   PPS      ASOS   NMQ   PPS      ASOS   NMQ   PPS
Warm     75%     5.6   6.7   7.3       2.7   3.1   3.6       0.8   0.9   1.0
Warm     95%    26    27    28        15    16    17         5.5   5.7   6.1
Warm     99%    54    54    57        34    35    37        16    16    17
Cool     75%     4.1   4.9   4.1       2.5   2.6   2.2       0.9   0.7   0.6
Cool     95%    24    25    19        13    13    10         3.6   3.8   2.8
Cool     99%    49    54    39        27    31    21         8.3   9.7   6.4
NMQ and PPS approximated the final best-estimate precipitation pattern on a day-to-day basis. It is true that both PPS and NMQ input are used in the StageIV analysis process over much of the United States; however, the introduction of quality control and rain gauge data has major effects in altering the original radar input. Over the western United States, the 6- and 24-h StageIV precipitation amounts are strongly influenced by rain gauge data, which is the primary driver for operational analyses there.
Figure 9 illustrates the differences among radar-only StageII (PPS), NMQ, and StageIV precipitation. The effects of biota, in this case migrating birds, are apparent in these images showing 1-h accumulations from 0600 UTC 2 October 2010. The PPS precipitation (Fig. 9a) makes only minimal discrimination among aerial targets, and most biota return is accepted as light precipitation, appearing as circular areas of accumulation around each radar site. The NMQ quality control removes substantial areas of these echoes (Fig. 9b), resulting in a much closer match with the final manually edited estimates in the StageIV product (Fig. 9c).
Figure 10 displays the pattern correlation coefficients and root-mean-square errors of PPS and NMQ against StageIV for 24-, 6-, and 1-h precipitation in warm and cool seasons. In this analysis, the pattern correlation was computed for each day’s national grid after the NMQ, PPS, and StageIV grids were reduced to 0–1 binary values based on the threshold amounts mentioned above. The traces in the plots represent seasonal averages for each threshold. Compared to PPS, NMQ consistently had a higher pattern CC with StageIV in both warm and cool seasons, a smaller RMSE in the warm season, and a slightly larger RMSE in the cool season. It appears that the differences between the NMQ and PPS pattern CCs were larger in the cool season than in the warm season, and the CC values decreased sharply for larger amounts.
Though radar input strongly controls the general precipitation pattern in StageIV, rain gauge input also has an appreciable influence in the location and timing of heavier rain amounts. The 24-h precipitation average values of StageIV, NMQ, and PPS are plotted in Fig. 11 to illustrate the bias. In the warm season, both NMQ and PPS tend to overestimate the overall precipitation compared to StageIV; while in the cool season, NMQ tends to overestimate and PPS underestimate. NMQ’s precipitation estimation appears closer to the verification StageIV in both warm and cool seasons. The results are consistent with the verification against ASOS.
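The thresholded pattern correlation used in this section can be illustrated with a short standard-library Python sketch. It assumes that the "pattern correlation" is the ordinary linear correlation between two grids after each has been reduced to 0–1 values at a threshold, that both grids share the same cells with no missing data, and that a day with no variation in either binary field yields an undefined (NaN) value. The function names and the 2 × 2 grids are hypothetical.

```python
import math


def to_binary(grid, threshold_mm):
    """Flatten a 2D grid to a list of 0/1 flags (1 where value >= threshold)."""
    return [1 if value >= threshold_mm else 0 for row in grid for value in row]


def pattern_correlation(grid_a, grid_b, threshold_mm):
    """Linear correlation between two grids after 0-1 thresholding."""
    a = to_binary(grid_a, threshold_mm)
    b = to_binary(grid_b, threshold_mm)
    if len(a) != len(b) or not a:
        raise ValueError("grids must be non-empty and the same size")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    # Undefined when a field is entirely below or entirely above threshold.
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


def seasonal_mean(daily_values):
    """Average the daily correlations, skipping undefined (NaN) days."""
    valid = [v for v in daily_values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else float("nan")


# Hypothetical 24-h amounts (mm) on a 2 x 2 grid for one day.
reference = [[0.0, 3.0], [6.0, 12.0]]
radar = [[0.0, 6.0], [6.0, 0.0]]
print(pattern_correlation(reference, radar, 5.0))
```

Because the grids are reduced to exceedance flags, this measure responds only to where each product places precipitation above the threshold, not to how much it reports there. Amount errors are captured separately by the RMSE and the average values.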
5. Concluding remarks
This study evaluated the radar precipitation estimates from the NMQ and PPS products, with verification data from the ASOS precipitation gauge reports and NCEP
FIG. 9. The 1-h precipitation accumulations from 0600 UTC 2 Oct 2010: (a) radar-only StageII, (b) NMQ, and (c) StageIV.
StageIV gridded MPE products in terms of the following metrics: linear correlation and pattern correlation, spatial and temporal variation, root-mean-square error, precipitation distribution, categorical scores, and cold rain
FIG. 10. (a)–(c) Pattern CCs and (d)–(f) RMSEs of NMQ vs StageIV (solid lines) and PPS vs StageIV (dashed lines) for (top to bottom) 24-, 6-, and 1-h precipitation. Red color for the warm season (April–September 2009) and blue color for the cool season (October 2009–March 2010).
events. The results indicated that NMQ radar precipitation estimates generally had higher correlation and smaller errors relative to rain gauge reports when compared with PPS over the entire CONUS. Over individual regions such as RFC coverage domains or different seasons, either NMQ or PPS might exhibit the higher skill level. The PPS skills were appreciably lower for higher-precipitation categories (≥12.5 and ≥25 mm) than for smaller amount categories. The NMQ precipitation amount probability distribution agreed with ASOS more closely than did those of the PPS, especially during the cool season. The categorical scores for detection of larger amounts varied by season, but generally NMQ yielded better values than PPS, particularly in the cool season. For cold rain events, NMQ had higher CC and PPS had
lower RMSE for 6-h and higher RMSE for 1-h cold rains, while on average NMQ overestimated and PPS underestimated precipitation compared to ASOS. It appears that random error in 1-h NMQ estimates was greater than for the corresponding PPS estimates, as shown in the Table 1 results. The overall skill at 1 h for either radar estimate was much lower in the cool season than in the warm season. For the much longer 6- and 24-h intervals, these random errors appear to cancel out, resulting in improved statistics for the cool season overall. Note that the “cool season” set had much larger coverage (>230 000 cases), while the “cold rain” set was much smaller, though still large (>35 000 cases); hence, results could be different. The spatial pattern correlation between NMQ and StageIV gauge–radar estimates
FIG. 11. The 24-h precipitation average values (mm) of StageIV (dark), NMQ (string pattern), and PPS (check pattern) for (a) the warm season (April–September 2009) and (b) the cool season (October 2009–March 2010), with thresholds 0, 2.5, 12.5, and 25 mm.
was consistently higher than that of the PPS, suggesting that algorithm elements such as more sophisticated quality control make a significant contribution to the overall quality of the NMQ.
As in any verification study, the nature of “ground truth” is debatable. In this study, we used the ASOS gauge data and StageIV data to represent the best-quality station observations and the best-quality gridded precipitation products that are generally available in operations. We used the correlation coefficient and root-mean-square error to evaluate and compare the two radar-only QPE systems. Thus, the uncertainties that may be caused by the systematic errors of the rain gauge measurements are largely avoided. Some results may differ if other verification data are applied; nevertheless, the differences should not be fundamental. On balance, the presented results confirm those noted in other studies (e.g., Kitzmiller et al. 2011; Zhang et al. 2011) and reported subjectively by operational users: that NMQ has certain consistent advantages over the PPS. Note that NMQ is still evolving; several new features, including vertical reflectivity profile correction, have been added since the completion of this study (K. Howard and J. Zhang 2011, personal communication; Vasiloff et al. 2007).
While this study focused on radar-only QPE, the data-integration capabilities of NMQ provide possibilities for more efficient implementation of multisensor algorithm enhancements than is currently feasible. Historically, the MPE design was constrained by the need to ingest radar precipitation products generated elsewhere, followed by merging with rain gauge and satellite data, and expert human input. While human expertise in quality control and data blending remains crucial in creating high-quality QPE products, the NMQ approach enhances the capability for multisensor QPE development and research-to-operation transitions, thus facilitating improvements to
NOAA’s water resource services. The NMQ products from the real-time prototype generating system are routinely incorporated in RFC operations and QPE products (Zhang et al. 2011). While operational WSR-88D units will soon be upgraded for dual-polarimetric measurements (Giangrande and Ryzhkov 2008), improvements to horizontal polarization QPE, such as NMQ, remain important for utilization of other operational networks such as Terminal Doppler Weather Radar and for interpretation of the large archive of WSR-88D data collected prior to 2011, which represents our best estimates of precipitation at high spatial–temporal resolution during that period.
Acknowledgments. We wish to thank NSSL staff including Kenneth Howard, Carrie Langston, and Jian Zhang for advice and supplying the NMQ radar products. Brian Nelson at NCDC supplied the warm-season mosaicked PPS products. Our project sponsors (Jeffrey Myers and Thomas Adams, Ohio River Forecast Center, and Greg Story, West Gulf River Forecast Center) gave valuable advice on the conduct of the study. ASOS data archives were maintained by the Meteorological Development Laboratory. Financial support was granted by the Advanced Hydrologic Prediction Service program. We thank the anonymous reviewers for their valuable comments, suggestions, and advice that contributed to improving the manuscript.
REFERENCES
Benjamin, S. G., and Coauthors, 2004: An hourly assimilation–forecast cycle: The RUC. Mon. Wea. Rev., 132, 495–518.
Ciach, G. J., 2003: Local random errors in tipping-bucket rain gauge measurements. J. Atmos. Oceanic Technol., 20, 752–759.
Fulton, R., 1998: WSR-88D polar-to-HRAP mapping. National Weather Service, Hydrologic Research Laboratory Tech. Memo. 9-9-01, 33 pp. [Available online at http://www.nws.noaa.gov/oh/hrl/papers/wsr88d/hrapmap.pdf.]
Fulton, R., 2005: Multisensor Precipitation Estimator (MPE) workshop. Advanced Hydrologic Applications Course, Kansas City, MO, National Weather Service Training Center, 74 pp. [Available online at http://www.nws.noaa.gov/oh/hrl/papers/wsr88d/MPE_workshop_NWSTC_lecture2_121305.pdf.]
Fulton, R., J. Breidenbach, D.-J. Seo, D. Miller, and T. O’Bannon, 1998: The WSR-88D rainfall algorithm. Wea. Forecasting, 13, 377–395.
Giangrande, S., and A. Ryzhkov, 2008: Estimation of rainfall based on the results of polarimetric echo classification. J. Appl. Meteor. Climatol., 47, 2445–2462.
Kitzmiller, D., and Coauthors, 2011: Evolving multisensor precipitation estimation methods: Their impacts on flow prediction using a distributed hydrologic model. J. Hydrometeor., 12, 1414–1431.
Kitzmiller, D., D. Miller, R. Fulton, and F. Ding, 2012: Radar and multisensor precipitation estimation techniques in National Weather Service hydrologic operations. J. Hydrol. Eng., in press.
Langston, C., J. Zhang, and K. Howard, 2007: Four-dimensional dynamic radar mosaic. J. Atmos. Oceanic Technol., 24, 776–790.
Lin, Y., and K. E. Mitchell, 2005: The NCEP Stage II/IV hourly precipitation analyses: Development and applications. Preprints, 19th Conf. on Hydrology, San Diego, CA, Amer. Meteor. Soc., 1.2. [Available online at http://ams.confex.com/ams/Annual2005/techprogram/paper_83847.htm.]
Maddox, R., J. Zhang, J. Gourley, and K. Howard, 2002: Weather radar coverage over the contiguous United States. Wea. Forecasting, 17, 927–934.
Reed, S. M., and D. R. Maidment, 1999: Coordinate transformations for using NEXRAD data in GIS-based hydrologic modeling. J. Hydrol. Eng., 4, 174–182.
Schaake, J., 1989: Importance of the HRAP grid for operational hydrology. Preprints, United States/People’s Republic of China Flood Forecasting Symp., Portland, OR, NOAA/NWS, 331–355.
Seo, D.-J., C. R. Kondragunta, D. Kitzmiller, K. Howard, J. Zhang, and S. Vasiloff, 2005: The National Mosaic and Multisensor QPE (NMQ) Project—Status and plans for a community testbed for high-resolution multisensor quantitative precipitation estimation (QPE) over the United States. Preprints, 19th Conf. on Hydrology, San Diego, CA, Amer. Meteor. Soc., 1.3. [Available online at http://ams.confex.com/ams/Annual2005/techprogram/paper_86485.htm.]
Sprinthall, R. C., 2003: Basic Statistical Analysis. 7th ed. Pearson, 672 pp.
Vasiloff, S., and Coauthors, 2007: Improving QPE and very short term QPF: An initiative for a community-wide integrated approach. Bull. Amer. Meteor. Soc., 88, 1899–1911.
Wilson, J., and E. A. Brandes, 1979: Radar measurement of rainfall—A summary. Bull. Amer. Meteor. Soc., 60, 1048–1058.
Zhang, J., K. Howard, and J. J. Gourley, 2005: Constructing three-dimensional multiple-radar reflectivity mosaics: Examples of convective storms and stratiform rain echoes. J. Atmos. Oceanic Technol., 22, 30–42.
Zhang, J., and Coauthors, 2011: National Mosaic and Multi-Sensor QPE (NMQ) System: Description, results, and future plans. Bull. Amer. Meteor. Soc., 92, 1321–1338.