AUGUST 2005
MCPHEE AND MARGULIS
441
Validation and Error Characterization of the GPCP-1DD Precipitation Product over the Contiguous United States
JAMES MCPHEE* AND STEVEN A. MARGULIS
Department of Civil and Environmental Engineering, University of California, Los Angeles, Los Angeles, California

(Manuscript received 29 September 2004, in final form 24 January 2005)

ABSTRACT

A validation and error characterization study of the Global Precipitation Climatology Project, 1 degree daily (GPCP-1DD) precipitation product over the contiguous United States is presented. Daily precipitation estimates over a 1° grid are compared against aggregated precipitation values obtained from the forcing field of the North American Land Data Assimilation System (LDAS). LDAS daily values are consistent with the National Centers for Environmental Prediction Climate Prediction Center (CPC) gauge-based daily precipitation product and hence are regarded as realistic ground-truth values with full coverage of the United States. Continuous and categorical measures of skill are presented, so that both the ability of GPCP-1DD to identify a precipitation event and its accuracy in determining cumulative precipitation amounts are evaluated. Daily values are aggregated into seasonal averages, and spatial averages are computed for five arbitrarily defined zones that cover most of the study area. Results show that in general there is good agreement between GPCP-1DD and LDAS values, except for areas where GPCP-1DD is unable to identify high-intensity events, particularly the Pacific coast north of parallel 40°N. Computation of continuous statistics shows that average bias is negligible in most areas of the United States except for humid regions north of parallel 40°N. However, the rmse statistics show that differences in estimated precipitation for individual 1° cells can be significant, exceeding in most cases the magnitude of the average precipitation. Beyond the validation, the error characterization presented here can significantly enhance the utility of the GPCP-1DD product by providing necessary inputs for ensemble hydrologic modeling and forecasting.
* Additional affiliation: Departamento de Ingenieria Civil, Facultad de Ciencias Fisicas y Matematicas, Universidad de Chile, Santiago, Chile.
Corresponding author address: Steven Margulis, 5732D Boelter Hall, Department of Civil and Environmental Engineering, UCLA, Los Angeles, CA 90095. E-mail: [email protected]
© 2005 American Meteorological Society

1. Introduction

For many decades, researchers have worked on improving methods for estimating global precipitation. Global precipitation estimates can only be obtained using satellite observations because rain gauge measurements are not available in vast regions of the globe (including oceans and unpopulated areas). With the development of advanced sensors and merging algorithms, new products that combine different kinds of data have been made available to the research community. One such product is the Global Precipitation Climatology Project, 1 degree daily (GPCP-1DD), of the Global Precipitation Climatology Center (Huffman et al. 2001).

Finescale estimates of global precipitation are required to support diverse research efforts such as numerical weather model initialization and validation, climate change analyses, and surface hydrology. Traditional rain gauge measurements have limitations in these applications. Rain gauge spatial coverage is sparse in many regions of the world, including vast extensions of land and oceans, and the lack of a reporting time standard among regions of the world increases the difficulty of generating global daily analyses (Huffman et al. 2001). Furthermore, point measurements provided by rain gauges are statistically noisy and can be extrapolated to the surrounding areas only with extreme caution (Petty 1995). On the other hand, radar estimates provide better spatial coverage at ground level, but are limited to isolated areas of the globe (United States, Europe, Japan, etc.) and are subject to diverse sources of uncertainty that preclude their direct adoption as accurate ground values. These sources of
JOURNAL OF HYDROMETEOROLOGY
uncertainty include the range and sampling volume effect, interstorm and intrastorm variability of raindrop size distributions, and statistical effects affecting the validity of functions relating reflectivity and rain intensity (Datta et al. 2003; Morin et al. 2003; Uijlenhoet et al. 2003; Joyce et al. 2004; Sharif et al. 2004).

Meteorological satellites offer the possibility of acquiring rainfall data for those areas of the globe that are not covered by dense surface observational networks. Furthermore, they can achieve this at a fraction of the cost of a surface-based network of equivalent spatial density. This advantage is countered by the fact that satellite observations measure quantities only indirectly related to precipitation intensity. Infrared (IR) sensors aboard geosynchronous satellites provide frequent cloud-top temperature measurements, which are related to rainfall rate using empirical algorithms. In cases where this indirect relationship is weak, these algorithms can perform poorly (Joyce et al. 2004). Passive microwave (PMW) sensors, on the other hand, estimate precipitation more directly by being sensitive to internal emissions from raindrops and hydrometeor-induced scattering of upwelling radiation from the earth’s surface. Unfortunately, passive microwave sensors are only deployed on polar-orbiting satellites, thus limiting significantly the spatial and temporal sampling resolution associated with these products. Only by averaging significantly over time can some of these sampling deficiencies be corrected. However, because of the sun-synchronous orbit of most PMW sensor-equipped satellites, parts of the diurnal cycle will never be adequately sampled.

The aforementioned characteristics of satellite-based rainfall estimates have prompted research efforts aimed at combining the strengths of each particular method, of which the GPCP-1DD is one example. Other satellite-based datasets are being continuously developed and improved. Those already in use by the modeling/diagnostic community (Adler et al. 2001) include merged products (Xie and Arkin 1996) and single-technique products, which make use of either IR or microwave sensors (Arkin and Meisner 1987; Wilheit et al. 1991; Spencer 1993; Ferraro and Marks 1995; Ferraro 1997; Susskind et al. 1997). All these products provide near-global data in gridded fields. The spatial resolution generally varies from 1° to 2.5°, and in most cases monthly values are reported. As is noted in Adler et al. (2001), the quasi-standard products generally outperform other techniques (such as model-based estimates) in estimating precipitation events over the globe. However, development issues remain in terms of reducing uncertainty of the estimates, improving bias scores, and augmenting frequencies of storm detection.
VOLUME 6
Recent work (Salvucci and Song 2000; Margulis and Entekhabi 2001) has shown that even coarse (monthly) satellite-derived precipitation products can be useful for hydrologic applications. The advent of the GPCP-1DD product provides a potentially more useful product, which requires not only validation but also characterization of errors that are not as important in the monthly product (e.g., undetected events, false alarms, etc.). The objective of this work is to validate the daily precipitation estimates provided by GPCP-1DD and characterize its error structure over the contiguous United States.

The United States presents advantageous characteristics for a validation exercise of this nature. First, a highly dense network of rain gauges and near-real-time sampling allow for the generation of gridded precipitation estimates that can be regarded as relatively accurate ground-truth estimates. Second, given the size of the United States, the analyzed domain is subject to a wide array of different climatic regimes. Therefore, the performance of the GPCP-1DD over these different climatic regions may be used more effectively to guide validity assessments in other ungauged regions.

This paper is organized as follows: section 2 describes the datasets used in the analysis. The validated dataset is the GPCP-1DD global estimate, whereas the reference dataset is the gauge-based precipitation field of the Land Data Assimilation System (LDAS) product. Section 3 presents the validation procedure, including the adopted fit measures and the spatial and temporal disaggregation adopted in order to incorporate climatic variability. Section 4 presents and discusses the results obtained from computing the fit indices for various U.S. regions at different time scales. Section 5 contains conclusions and suggested future work.
2. Datasets

a. GPCP-1DD

The GPCP-1DD product (resolution 1° × 1°, daily) is a complement to the GPCP Version 2 SG Combination (SG) product (Adler et al. 2003). The 1DD applies the threshold-matched precipitation index (TMPI) to data from geostationary-satellite IR sensors to compute precipitation estimates on a 1° × 1° grid at 3-hourly intervals within the 40°N–40°S band. The basic output of the TMPI is a sequence of instantaneous 3-hourly estimates, which are summed to produce the daily value. Estimates outside this band are computed based on recalibrated Television Infrared Observation Satellite (TIROS) Operational Vertical Sounder (TOVS) data from polar-orbiting satellites (Susskind et al. 1997). Additionally, the 1DD product is scaled to match the monthly accumulation provided by the SG product for both data regions, which combines satellite and gauge observations at a monthly time scale on a 2.5° × 2.5° grid. Figure 1a illustrates the GPCP-1DD average daily precipitation over the contiguous United States for different seasons for the period of analysis. From Fig. 1a, it is clear that the GPCP-1DD dataset shows realistic spatial and temporal climate regime signatures over the continental United States.

FIG. 1. Seasonal plot of average daily precipitation from (a) GPCP-1DD and (b) LDAS-CPC datasets (mm day⁻¹).

Rudolf and Rubel (2000) and Skomorowski et al. (2001) presented the first validation studies for GPCP-1DD using data from dense networks of rain gauges in the European Alps. These studies focused on satellite versus ground-truth comparisons for various portions of 1997. Skomorowski et al. (2001) based their analysis on approximately 60 days of rain gauge data obtained for the period of June to July 1997 from the Mesoscale Alpine Program (MAP) network. Results from their research suggest that GPCP underestimates the ground measurements by an average of 0.6 mm day⁻¹, or approximately 15% of the mean observed value, with an rmse of almost 6 mm day⁻¹ (150% of the mean). Analysis for the entire year yielded similar results for bias and rmse in terms of the percentage of observed precipitation (Rudolf and Rubel 2000). Approximately 3100 rain gauges were used for their verification. The study presented here is similar, but for the continental United States, and draws on data from a much longer observation period, thereby increasing the statistical significance of the results.
b. NLDAS–CPC

The reference dataset used to validate the GPCP-1DD product over the United States is the precipitation field provided by the North American Land Data Assimilation System (NLDAS; Cosgrove et al. 2003). The LDAS product provides hourly precipitation rates on a 1/8° × 1/8° grid by combining hourly stage IV Doppler radar precipitation (for temporal disaggregation purposes only) and daily National Centers for Environmental Prediction Climate Prediction Center (CPC) rain gauge precipitation. When aggregated to the daily time scale this product is identical to the CPC gauge-based product. It is important to emphasize that the reference dataset does not contain model-based precipitation data. On a typical day, the number of rain gauges incorporated into the NLDAS–CPC product ranges from approximately 12 000 to 15 000. The high density of this measurement network allows us to regard the NLDAS–CPC product as a realistic estimate of true precipitation values over the entire contiguous United States. NLDAS also includes data for Canada and Mexico, but because the rain gauge coverage is coarser in those countries, this work includes only U.S. data. For a detailed description of the algorithms and products used in generating the NLDAS product the reader is referred to Cosgrove et al. (2003).

As with any gauge-based precipitation estimate, systematic errors due to undercatch bias and poor spatial sampling in complex terrain are inevitable. Undercatch biases can be expected to range from 5% to 30% depending on location and time of year (Groisman and Legates 1994). While these potential biases need to be considered when making comparisons, the high quality of data and large number of gauges in the CPC daily rainfall network offer the best opportunity for validation and error characterization of the satellite-based GPCP data. Potential impacts of systematic errors are discussed below.

An additional source of potential concern is the overlap of rain gauges used in generating the GPCP-1DD and NLDAS datasets. GPCP-1DD has no day-to-day gauge input, so the two products can be regarded as independent in this sense. Nevertheless, the 1DD product is scaled to match the monthly values of the GPCP version-2 SG product, which does incorporate data from approximately 1200 rain gauges in the United States. However, because the number of gauges used in the GPCP estimate is much smaller than the number of gauges incorporated by the NLDAS product, the two estimates are treated as independent in this work.

To use the NLDAS estimate to validate the GPCP-1DD product, both datasets have to be converted to the same spatial and temporal discretization. The NLDAS data (1/8° × 1/8°, hourly) were converted into daily values over a 1° × 1° grid, which is exactly aligned with the GPCP-1DD grid. First, spatial aggregation was achieved by simple averaging of the 64 NLDAS values that fall into the 1° cell corresponding to the GPCP grid. Since the NLDAS data do not cover ocean surface, the averaged value was multiplied by the fraction of cells (within a particular 64-cell cluster) that correspond to land surface. If less than 50% of the original 64-cell cluster corresponded to land surface, the aggregated 1° cell was omitted from the analysis. Finally, temporal aggregation was achieved by simple accumulation of the hourly values for each day.

Figure 1b presents the seasonal averages of daily precipitation over a 1° × 1° grid computed from the LDAS product for the period of validation. The NLDAS product shows similar coherent spatial and seasonal climate patterns when compared to the GPCP-1DD data.
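The aggregation steps described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `aggregate_cell` and `daily_total` are hypothetical, and it assumes that the "simple average" is taken over the land cells of each 64-cell cluster before scaling by the land fraction, and that hourly values are expressed as depth per hour.

```python
def aggregate_cell(fine_values, land_mask, min_land_fraction=0.5):
    """Aggregate one cluster of fine-grid cells (e.g., 8 x 8) into one coarse value.

    fine_values: 2D list of precipitation values on the fine grid.
    land_mask:   2D list of booleans of the same shape, True for land cells.
    Returns None when the cluster is omitted from the analysis.
    """
    values = [v for row in fine_values for v in row]
    is_land = [m for row in land_mask for m in row]
    land_values = [v for v, m in zip(values, is_land) if m]
    land_fraction = len(land_values) / len(values)
    # Clusters with less than the minimum land fraction are dropped.
    if not land_values or land_fraction < min_land_fraction:
        return None
    # Assumption: average over land cells, then scale by the land fraction.
    land_mean = sum(land_values) / len(land_values)
    return land_mean * land_fraction


def daily_total(hourly_values):
    """Accumulate the hourly values of one day into a daily amount."""
    return sum(hourly_values)
```

Under this reading, a cluster that is half land with a land-cell mean of 4.0 yields an aggregated value of 2.0, and a cluster with fewer than 32 land cells is skipped.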
3. Validation method

Once the GPCP-1DD and NLDAS datasets are expressed on the same grid, results of the verification process can be quantified by computing continuous and categorical statistics. Continuous statistics include bias, rmse, and correlation coefficient. Categorical statistics include probability of detection (POD), false alarm ratio (FAR), and the Hanssen and Kuipers score, or true skill statistic (TSS). The POD and FAR indicators belong to the category of accuracy measures, whereas the TSS belongs to the category of skill scores.
a. Continuous statistics

Bias is computed as the simple average of differences between observed and estimated precipitation values for any given grid cell, or

$$\mathrm{BIAS} = \frac{1}{N_{di}} \sum_{t=1}^{N_{di}} \left( pg_{it} - pl_{it} \right), \qquad (1)$$

where $N_{di}$ is the total number of days with information for grid cell $i$; $pg_{it}$ and $pl_{it}$ are the GPCP-1DD and LDAS estimates for cell $i$ and day $t$, respectively. Likewise, rmse is

$$\mathrm{RMSE} = \sqrt{\frac{1}{N_{di}} \sum_{t=1}^{N_{di}} \left( pg_{it} - pl_{it} \right)^{2}}. \qquad (2)$$
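As an illustration of Eqs. (1) and (2), the following sketch computes both statistics for a single grid cell. The function names and the use of `None` to mark days without information are assumptions made here for the example; they are not part of the original study.

```python
import math


def _valid_pairs(pg, pl):
    """Keep only days for which both estimates are available."""
    return [(g, l) for g, l in zip(pg, pl) if g is not None and l is not None]


def bias(pg, pl):
    """Eq. (1): mean of (GPCP-1DD minus LDAS) over days with information."""
    pairs = _valid_pairs(pg, pl)
    return sum(g - l for g, l in pairs) / len(pairs)


def rmse(pg, pl):
    """Eq. (2): root-mean-square of (GPCP-1DD minus LDAS)."""
    pairs = _valid_pairs(pg, pl)
    return math.sqrt(sum((g - l) ** 2 for g, l in pairs) / len(pairs))
```

With this sign convention a positive bias means that GPCP-1DD exceeds the LDAS value on average, which matches the way positive and negative bias are discussed in the results.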
b. Categorical statistics

The categorical statistics are based on the contents of the contingency table described in Fig. 2. Using the notation for hits (h), false alarms (f), misses (m), and zeroes (z), the POD is written as

$$\mathrm{POD} = \frac{h}{h + m}. \qquad (3)$$

The probability of detection represents how often real precipitation events are detected in the GPCP-1DD product. It ranges from zero (no detection) to one (perfect detection). FAR is the ratio of false alarms to the total number of estimated rain events, or

$$\mathrm{FAR} = \frac{f}{f + h}. \qquad (4)$$

The FAR represents how often false precipitation events are registered (given no actual event). FAR also ranges from zero (no false alarms) to one (always registering a false alarm).

The previous definitions depend on what is considered to be a precipitation event in the observed data. A standard practice is to adopt precipitation thresholds: if the amount of rain for any given grid cell and day is less than the adopted threshold then the event is ignored. Various thresholds are adopted, so the performance of the estimate can be evaluated independently for different precipitation regimes (Skomorowski et al. 2001; Katsanos et al. 2004). In this study, most of the presented results are based on a threshold value of 0.1 mm day⁻¹, with sensitivity to this threshold discussed below.

Skill scores measure the improvement of the GPCP-1DD estimate over a reference estimate, such as random chance, persistence, or climatology. The true skill statistic is computed with

$$\mathrm{TSS} = \frac{h}{h + m} - \frac{f}{f + z}, \qquad (5)$$

which is equivalent to the probability of detection minus the probability of false detection. The TSS can adopt values between minus one and one. A value of zero indicates no skill, and the perfect score is equal to one.

FIG. 2. Contingency table for categorical statistic computation.
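The categorical measures in Eqs. (3)–(5) can be sketched as follows for one grid cell. The function names are hypothetical, an "event" is taken to be a daily value at or above the threshold, and returning `None` when a denominator is zero is a choice made for this example rather than something specified in the text.

```python
def contingency(pg, pl, threshold=0.1):
    """Count hits, false alarms, misses, and zeroes for one grid cell."""
    h = f = m = z = 0
    for g, l in zip(pg, pl):
        estimated = g >= threshold   # event in GPCP-1DD
        observed = l >= threshold    # event in LDAS
        if estimated and observed:
            h += 1
        elif estimated:
            f += 1
        elif observed:
            m += 1
        else:
            z += 1
    return h, f, m, z


def _ratio(num, den):
    return num / den if den else None


def pod(h, f, m, z):
    return _ratio(h, h + m)          # Eq. (3)


def far(h, f, m, z):
    return _ratio(f, f + h)          # Eq. (4)


def tss(h, f, m, z):
    detection = _ratio(h, h + m)
    false_detection = _ratio(f, f + z)
    if detection is None or false_detection is None:
        return None
    return detection - false_detection   # Eq. (5)
```

Recomputing the counts with a different `threshold` gives the kind of sensitivity analysis mentioned in the text, since a larger threshold moves light-rain days into the "zeroes" category.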
c. Spatial and temporal aggregation

Continuous and categorical statistics were computed for each individual cell over a 7-yr time window spanning from 1 January 1997 to 31 December 2003. Daily results were aggregated into seasonal and annual averages. For seasonal analysis the daily data are grouped by winter [December–January–February (DJF)], spring [March–April–May (MAM)], summer [June–July–August (JJA)], and fall [September–October–November (SON)].

Furthermore, the contiguous United States was divided into five zones in an attempt to capture the climatic variability observable over large latitude, longitude, and elevation variations. Four main zones were defined within the quadrants defined by parallel 40°N and meridian 100°W. These boundaries are entirely arbitrary, but result from an extremely simplified application of the Koeppen climatological classification to the conterminous United States (Bonan 2002). It has been argued that the Koeppen climatology fails to capture the complexity associated with a diverse climate such as that of the United States (Triantafyllou and Tsonis 1994). However, for the purposes of this work, the highly simplified zonation pattern was adopted to compare results across different climate regimes. In addition to simple climate zone delineation, the zone boundary along parallel 40°N has the explicit goal of separating the two geographical regions where the GPCP product obtains data from different sources (i.e., between parallels 40°N–S, GPCP uses data obtained using the TMPI, whereas north of parallel 40°N data come from TOVS). The zones are named, clockwise from the northwestern quadrant: Mountain, Northeast, Southeast, and Southwest. A fifth, smaller, zone (Northwest) was defined within the Mountain zone in order to account for more intense precipitation phenomena along the northern Pacific coast. Figure 3 presents the spatial aggregation scheme and its five zones. The LDAS mean daily precipitation by season and zone is shown in Table 1.

The LDAS precipitation product relies heavily on ground observations that are unevenly distributed across the U.S.–Canada and U.S.–Mexico borders. Therefore the LDAS product includes a “transition zone” of width 2° along both international borders, which is depicted in Fig. 3 without a caption. The grid cells corresponding to the transition zone were not considered in the analysis.

FIG. 3. Simplified Koeppen zonation scheme for the contiguous United States.
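The grouping of days into seasons and of cells into the four main zones can be sketched as below. This is only an illustration: the function names are hypothetical, longitude is assumed to be given in degrees east (so the meridian boundary over the United States is taken as −100), cells lying exactly on a boundary are assigned to the northern or eastern side by assumption, and the smaller Northwest zone and the border transition zone are not reproduced because their exact limits are not stated in the text.

```python
SEASON_BY_MONTH = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}


def season_of(month):
    """Return the season label for a calendar month (1-12)."""
    return SEASON_BY_MONTH[month]


def main_zone(lat, lon):
    """Assign a cell center to one of the four main quadrant zones.

    lat: latitude in degrees north; lon: longitude in degrees east
    (negative values are west of Greenwich).
    """
    north = lat >= 40.0
    east = lon >= -100.0
    if north:
        return "Northeast" if east else "Mountain"
    return "Southeast" if east else "Southwest"
```

Per-cell statistics can then be pooled by the pair (zone, season) to obtain the kind of zone-by-season summaries reported in the tables.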
4. Results

a. Continuous statistics

Figure 4 presents scatterplots of average daily precipitation for each zone. The Southeast region presents the best agreement between GPCP and LDAS products throughout the year, with high correlation coefficient values that are in all cases greater than 0.85. Second in performance is the Southwest region, with GPCP showing acceptable capability to estimate spring and summer convective precipitation. During winter and fall seasons the GPCP product has more difficulty in reproducing the LDAS estimates, showing a low bias for events of higher intensity (3 mm day⁻¹ and higher) for the Mountain, Southwest, and Northwest zones.

Beyond the latitude 40°N band, the Northeast region shows high correlation except for summer season, when larger data scatter yields a correlation coefficient of 0.66. Notwithstanding these high correlations, the Northeast region shows positive bias for all seasons. The bias seems to have an upper limit around 2 mm day⁻¹ and appears to be independent of precipitation intensity. For the Mountain region, in contrast, the scatter in the data generally increases with precipitation intensity. The correlation coefficient ranges between 0.59 and 0.89 for seasonal values and is equal to 0.56 for the annual estimate due to the large influence of winter precipitation errors on the annual statistics.

The GPCP product does not perform well in the Northwest region, which comprises a smaller area along the coast of Washington, Oregon, and northern California. Seasonal correlation coefficients range from 0.21 to 0.33, and the GPCP product underestimates intense
TABLE 1. Average daily precipitation obtained from the LDAS dataset, 1 Jan 1997–31 Dec 2003 (mm day⁻¹).

Time aggregation   Mountain   Northeast   Southeast   Southwest   Northwest
Winter (DJF)       1.16       1.28        2.75        0.96        6.03
Spring (MAM)       1.34       2.25        3.21        0.85        3.27
Summer (JJA)       1.27       2.85        3.44        1.10        0.87
Fall (SON)         1.00       2.07        2.89        0.90        3.78
Annual             1.19       2.12        3.08        0.95        3.47
FIG. 4. Scatterplot of GPCP-1DD vs LDAS daily precipitation for each zone and time period (mm day⁻¹).
precipitation events (negative bias). On the other hand, in the region of low LDAS values, GPCP yields estimates that are larger than the gauge-based product. Spatial analysis of the grid cells corresponding to the largest differences showed that greater overestimation occurs in the northernmost cells of the Northwest region.

Figure 5 presents histograms of bias for each zone, and Table 2 shows the average and standard deviation of the cell values. Results are temporally aggregated according to four seasons, and annual totals are also presented. Among the four main zones, only the Northeast presents a large number of values significantly greater than zero. Consequently, the average bias for Northeast ranges from 0.3 to 0.6 mm day⁻¹, whereas for Mountain, Southeast, and Southwest the average bias is much closer to zero. The worst performance among zones is observed for Northwest, where average bias ranges from −0.06 to −0.9 mm day⁻¹. The spread of the distribution is much higher in Northwest as well, with standard deviation values as high as 2.2 mm day⁻¹ for winter season.

No significant seasonal effects are apparent in the ability of GPCP to estimate the precipitation values obtained from the LDAS product, except for the Northwest zone. In the latter case, bias values are widely spread around zero during the heavy rainy season comprising the winter months (DJF), with magnitudes reaching up to 5 mm day⁻¹. The distribution seems to be low biased, with GPCP not being able to reproduce high-intensity precipitation events shown by LDAS. The same phenomenon can be observed in the Northwest region during spring and fall, when average
FIG. 5. Bias distribution (mm day⁻¹) for each zone and time period.
bias is approximately −1 to −2 mm day⁻¹. Positive bias is observed in the Northeast region for all seasons, with individual cell values up to 2 mm day⁻¹. This effect is accentuated during winter, when values on the order of 1 mm day⁻¹ increase their relative frequency.
TABLE 2. Bias (mm day⁻¹) in GPCP-1DD data by zone and season (mean value is shown in first row for each time period, and standard deviation is shown in second row for each time period).

Time aggregation          Mountain   Northeast   Southeast   Southwest   Northwest
Winter (DJF)    Mean      −0.08      0.56        −0.06       −0.14       −0.93
                Std dev   0.74       0.32        0.29        0.62        2.2
Spring (MAM)    Mean      0.01       0.37        −0.05       −0.01       −0.5
                Std dev   0.41       0.3         0.32        0.29        1.24
Summer (JJA)    Mean      0.06       0.25        0.02        −0.06       0.06
                Std dev   0.3        0.48        0.35        0.26        0.46
Fall (SON)      Mean      −0.02      0.35        −0.004      −0.03       −0.45
                Std dev   0.38       0.38        0.32        0.23        1.54
Annual          Mean      −0.01      0.38        −0.02       −0.06       −0.45
                Std dev   0.41       0.33        0.26        0.28        1.33

A similar summary for rmse is presented in Fig. 6 and Table 3. Geographical influence on the characteristics of rmse is expressed by higher values in the Southeast and Northwest zones. From Fig. 1b, these zones are the ones with higher seasonal precipitation averages. Average rmse for the Southwest and Mountain zones is relatively similar, so a direct influence of the change of latitude band across latitude 40°N cannot be stated with certainty. As with bias, no seasonal effects are directly seen. Only the Northwest zone shows large differences between rmse average between winter and summer, with spring and fall values resulting somewhere in between these two extremes.

FIG. 6. Rmse distribution (mm day⁻¹) for each zone and time period.
TABLE 3. Rmse (mm day⁻¹) in GPCP-1DD data by zone and season (mean value is shown in first row for each time period, and standard deviation is shown in second row for each time period).

Time aggregation          Mountain   Northeast   Southeast   Southwest   Northwest
Winter (DJF)    Mean      2.26       3.31        6.18        3.21        8.29
                Std dev   1.57       1.31        1.41        2.01        1.56
Spring (MAM)    Mean      2.63       4.55        6.10        2.77        5.16
                Std dev   0.67       0.95        0.88        1.12        1.00
Summer (JJA)    Mean      2.70       5.62        5.42        2.43        2.31
                Std dev   1.26       0.93        0.86        1.24        0.64
Fall (SON)      Mean      2.33       4.61        6.08        2.99        6.48
                Std dev   0.84       1.08        0.91        0.83        1.41
Annual          Mean      2.42       4.46        5.86        2.75        5.36
                Std dev   0.60       0.81        0.75        0.79        1.01
FIG. 7. POD distribution for each zone and time period.
b. Categorical statistics

Figure 7 presents histograms for POD for all zones for a reference precipitation threshold of 0.1 mm day⁻¹. Since categorization of a GPCP observation in the contingency table shown in Fig. 2 depends on the definition of what constitutes a precipitation event, it is common to adopt precipitation thresholds that separate rain from no-rain events (McBride and Ebert 2000; Skomorowski et al. 2001). The influence of the precipitation threshold over POD and FAR is discussed below. The results presented in Fig. 7 show that in general POD > 0.5 for most zones and seasons. However, with respect to the spread of the distribution of POD values within each region, some differences are apparent. For the three zones north of parallel 40°N (encompassing TOVS data), POD values are generally more concentrated, with the most likely value around 0.6. This indicates a consistently higher skill of GPCP in identifying precipitation events over the entire zone. In those zones south of latitude 40°N, POD values tend to be more widely distributed, with individual cell values as low as 0.2. This is particularly noticeable in the Southwest region, where fall and winter values are positively skewed, indicating that GPCP is less successful in identifying the precipitation events during those seasons. Table 4 shows the mean and standard deviation values from distributions shown in Fig. 7. It can be seen that the average POD for all zones and seasons lies between approximately 0.3 and 0.7.

TABLE 4. Probability of detection for GPCP-1DD data by zone and season (mean value is shown in first row for each time period, and standard deviation is shown in second row for each time period).

Time aggregation          Mountain   Northeast   Southeast   Southwest   Northwest
Winter (DJF)    Mean      0.56       0.58        0.47        0.31        0.64
                Std dev   0.07       0.07        0.07        0.09        0.05
Spring (MAM)    Mean      0.63       0.67        0.57        0.42        0.61
                Std dev   0.04       0.05        0.06        0.10        0.05
Summer (JJA)    Mean      0.59       0.68        0.67        0.51        0.52
                Std dev   0.07       0.06        0.06        0.11        0.08
Fall (SON)      Mean      0.59       0.65        0.57        0.41        0.63
                Std dev   0.07       0.05        0.05        0.06        0.05
Annual          Mean      0.59       0.65        0.57        0.41        0.60
                Std dev   0.05       0.05        0.05        0.06        0.05

In addition to the ability to detect precipitation, an important question for error characterization is the magnitude of true precipitation events that are not being detected. Figure 8 and Table 5 describe the characteristics of undetected precipitation amounts. While Table 5 summarizes the absolute values of the mean and standard deviation of undetected precipitation, Fig. 8 presents histograms of the spatial distribution of undetected precipitation divided by the mean precipitation obtained from LDAS, or relative undetected precipitation. From Table 5 it can be seen that the average of undetected precipitation increases slightly for the more humid Southeast, Northeast, and Northwest zones. At the same time, for these zones relative undetected precipitation is mostly concentrated around values that are less than or equal to 1.0. For the drier Mountain and Southwest zones, relative undetected precipitation shows higher variability, with values greater and
FIG. 8. Distribution of undetected precipitation divided by average daily precipitation (relative undetected precipitation) for individual cells by season and region.

TABLE 5. Undetected precipitation magnitude (mm day⁻¹) in GPCP-1DD data by zone and season (mean value is shown in first row for each time period, and standard deviation is shown in second row for each time period).

Time aggregation          Mountain   Northeast   Southeast   Southwest   Northwest
Winter (DJF)    Mean      1.02       1.13        2.16        1.70        3.72
                Std dev   0.59       0.38        0.47        0.86        1.36
Spring (MAM)    Mean      1.13       1.75        1.94        1.37        2.49
                Std dev   0.34       0.40        0.38        0.46        0.78
Summer (JJA)    Mean      1.21       2.52        1.44        1.02        1.16
                Std dev   0.60       0.60        0.34        0.35        0.40
Fall (SON)      Mean      1.03       1.71        1.88        1.35        3.05
                Std dev   0.34       0.51        0.37        0.37        0.92
Annual          Mean      1.09       1.78        1.84        1.33        2.52
                Std dev   0.28       0.30        0.23        0.32        0.78
smaller than 1.0 distributed in approximately equal proportion. The Southwest zone shows the largest variability, with values as large as 10 for individual cells. This means that the undetected precipitation in this region
can be as much as 10 times larger than the average LDAS precipitation value reported for those cells. Figure 9 depicts histograms and Table 6 presents summary statistics of FAR. Results in this case indicate
FIG. 9. FAR distribution for each zone and time period.
TABLE 6. False alarm ratio in GPCP-1DD data by zone and season (mean value is shown in first row for each time period, and standard deviation is shown in second row for each time period).

Time aggregation          Mountain   Northeast   Southeast   Southwest   Northwest
Winter (DJF)    Mean      0.30       0.38        0.26        0.52        0.08
                Std dev   0.22       0.13        0.11        0.16        0.05
Spring (MAM)    Mean      0.27       0.28        0.16        0.43        0.15
                Std dev   0.12       0.06        0.06        0.14        0.06
Summer (JJA)    Mean      0.26       0.25        0.08        0.21        0.38
                Std dev   0.13       0.05        0.05        0.19        0.08
Fall (SON)      Mean      0.36       0.33        0.20        0.34        0.19
                Std dev   0.14       0.08        0.06        0.11        0.07
Annual          Mean      0.30       0.31        0.18        0.39        0.21
                Std dev   0.13       0.06        0.06        0.09        0.06
a broader range of GPCP skill when compared to the previous POD values. The Northwest zone presents acceptable values during most of the year with the exception of summer season, when for some cells the value of FAR rises up to 60%. Overestimation in the number of rainy days appears to be more common in the Southwest, Northeast, and Mountain zones, with individual cell values reaching up to approximately 0.8. Values in Table 6 suggest that FAR is larger during winter and fall for most zones, except for the Northwest zone. In the latter case, FAR values for winter average 0.08 compared with 0.38 during the summer months. For both accuracy measures, inspection of Figs. 7 and 9 suggests that GPCP-1DD performs better at estimating precipitation events during the summer months (JJA). During that period, the histograms for most zones tend to shift slightly to the better end of the spectrum (1.0 for POD, 0 for FAR). This is likely explained by the fact that cloud-top temperature (IR) algorithms used in producing the GPCP product perform better in the summer because rainfall is more convective in nature. The Northwest zone presents the only exception to this behavior, with summer POD and FAR values that are generally worse than for other seasons because of lighter rain. Figure 10 and Table 7 provide information regarding the magnitude of false alarm precipitation estimates, that is, the value of those rainfall events appearing in GPCP-1DD but not in LDAS. Similarly to Fig. 8, Fig. 10 shows the distribution of relative false precipitation. It can be seen that the ratio of false alarm precipitation estimates to mean daily precipitation for each cell is less than 2.0 for most zones and seasons. Noticeably, winter season presents higher variability for this parameter, and the Southwest zone also has ratios as high as 10, with high spatial variability. 
Nonetheless, even in these last two cases the values of relative false alarm precipitation seem to be concentrated around 2.0, meaning that when precipitation is erroneously diagnosed by GPCP-1DD, estimated values are on average twice as large as the mean daily precipitation observed at each individual cell. From Table 7, mean values are higher for false alarm precipitation than for undetected precipitation, with most averages lying between 1.5 and 4.0 mm day⁻¹. Only the Mountain zone presents similar behavior for undetected and false alarm precipitation estimates. Within regions, values appear to be higher during the winter and fall seasons for the Southeast, Southwest, and Northwest zones. The Mountain and Northeast zones show higher values during the spring and summer seasons.

Finally, Fig. 11 and Table 8 provide information regarding TSS results over the five defined zones, for the four considered seasons, as well as on an annual basis. Average TSS values cluster around 0.3 for most zones and seasons, with extreme values of 0.15 and 0.57 for the Southwest winter and Southeast summer, respectively. From Fig. 11 it is seen that TSS values are rather uniform, showing little spread over the spatial domain of each zone and generally falling in the (0, 1) range. Skomorowski et al. (2001) report similar values of TSS, although the results shown here are not directly comparable with their work since they are averaged over time.
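The categorical measures discussed above can be illustrated with a minimal sketch. It assumes the standard contingency-table definitions (POD = h/(h + m), FAR = f/(h + f), TSS = POD − f/(f + z)), treats a day as a precipitation event when its value strictly exceeds the threshold, and computes relative false precipitation as the mean GPCP estimate on false alarm days divided by the mean observed daily precipitation. The function names and the sample series are hypothetical and are not taken from the study.

```python
import math


def contingency_counts(observed, estimated, threshold=0.1):
    """Count hits (h), misses (m), false alarms (f), and correct negatives (z).

    Assumption: a day is an "event" when its value is strictly above threshold.
    """
    h = m = f = z = 0
    for obs, est in zip(observed, estimated):
        obs_rain = obs > threshold
        est_rain = est > threshold
        if obs_rain and est_rain:
            h += 1
        elif obs_rain:
            m += 1
        elif est_rain:
            f += 1
        else:
            z += 1
    return h, m, f, z


def _ratio(num, den):
    """Return num/den, or NaN when the denominator is zero."""
    return num / den if den else math.nan


def skill_scores(h, m, f, z):
    """Return (POD, FAR, TSS) using the standard contingency-table formulas."""
    pod = _ratio(h, h + m)
    far = _ratio(f, h + f)
    tss = pod - _ratio(f, f + z)
    return pod, far, tss


def relative_false_precip(observed, estimated, threshold=0.1):
    """Mean estimate on false alarm days divided by mean observed precipitation."""
    false_vals = [est for obs, est in zip(observed, estimated)
                  if obs <= threshold and est > threshold]
    mean_obs = sum(observed) / len(observed)
    if not false_vals or mean_obs == 0:
        return math.nan
    return (sum(false_vals) / len(false_vals)) / mean_obs


# Hypothetical daily series (mm/day) for a single 1-degree cell.
observed = [0.0, 5.0, 0.0, 2.0, 0.0, 0.0, 8.0, 0.0]
estimated = [0.0, 4.0, 3.0, 0.0, 0.0, 6.0, 7.0, 0.0]

counts = contingency_counts(observed, estimated)  # (2, 1, 2, 3)
pod, far, tss = skill_scores(*counts)
print(counts, round(pod, 3), round(far, 3), round(tss, 3))
print(round(relative_false_precip(observed, estimated), 3))
```

Returning NaN for undefined ratios, rather than zero, keeps cells without any qualifying events from distorting spatial averages computed later.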
c. Sensitivity of categorical statistics to precipitation threshold

As noted before, computation of categorical statistics requires a precipitation threshold in order to identify nonzero precipitation events. Figure 12 depicts the influence of the precipitation threshold on POD. Each plot corresponds to the simple spatial average of individual-cell POD values obtained for different thresholds for each geographical zone. The ability of GPCP to identify precipitation events remains approximately constant at about 0.55 for all precipitation thresholds in the range 0.1–10 mm day⁻¹, for the two eastern zones
FIG. 10. Distribution of false alarm precipitation estimates divided by average daily precipitation (relative false precipitation) for individual cells by season and region.
and for the Northwest zone. On the other hand, POD decreases almost linearly with threshold values for the Southwest and Mountain regions, to reach a minimum of about 0.10. This behavior can be explained by the fact that large precipitation events tend to be less frequent west of the Rockies. Therefore, as the precipitation threshold increases, fewer events qualify as “rain > 0” events, causing the estimates by GPCP to fall
TABLE 7. False alarm precipitation estimate (mm day⁻¹) in GPCP-1DD data by zone and season (mean value is shown in first row for each zone, and standard deviation is shown in second row for each zone).

Zone       Statistic  Winter (DJF)  Spring (MAM)  Summer (JJA)  Fall (SON)  Annual
Mountain   Mean       1.36          1.50          1.49          1.34        1.44
           Std dev    0.65          0.33          0.66          0.32        0.32
Northeast  Mean       2.18          2.53          2.83          2.43        2.52
           Std dev    0.64          0.55          0.58          0.55        0.39
Southeast  Mean       3.79          2.90          2.29          3.15        2.99
           Std dev    0.95          0.91          0.88          0.92        0.46
Southwest  Mean       2.31          1.90          1.52          2.32        1.98
           Std dev    1.01          0.70          0.93          0.81        0.60
Northwest  Mean       3.91          2.51          1.54          2.51        2.74
           Std dev    1.28          0.62          0.52          0.92        0.79
FIG. 11. Distribution of TSS.
into the “f” (observed = no, predicted = yes) category, or into the “z” (observed = no, predicted = no) category. It is worth mentioning that as the precipitation threshold increases, fewer cells yield finite values for POD, and hence the spatial averages depicted in Fig. 12 are computed over a different number of cells. It is also important to note that most of the precipitation values reported by GPCP-1DD as well as NLDAS are smaller than or equal to 0.1 mm day⁻¹ (about 60%), as is shown in Table 9.
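The effect of the threshold on the number of cells entering the spatial average can be sketched as follows. This is an illustration only: it assumes POD is undefined (NaN) for a cell with no observed events above the threshold, that such cells are simply excluded from the zone average, and that an event is a value strictly above the threshold. The function names and the two sample cells are hypothetical.

```python
import math


def pod(observed, estimated, threshold):
    """Probability of detection for one cell; NaN if no observed events."""
    hits = misses = 0
    for obs, est in zip(observed, estimated):
        if obs > threshold:
            if est > threshold:
                hits += 1
            else:
                misses += 1
    events = hits + misses
    return hits / events if events else math.nan


def zone_average_pod(cells, thresholds):
    """Map each threshold to (mean POD over finite cells, number of cells used)."""
    results = {}
    for thr in thresholds:
        values = [pod(obs, est, thr) for obs, est in cells]
        finite = [v for v in values if not math.isnan(v)]
        mean = sum(finite) / len(finite) if finite else math.nan
        results[thr] = (mean, len(finite))
    return results


# Two hypothetical cells, each given as (observed, estimated) in mm/day.
cells = [
    ([0.0, 3.0, 12.0, 0.5], [0.0, 2.0, 11.0, 0.0]),
    ([0.0, 1.5, 0.0, 4.0], [0.2, 0.0, 0.0, 5.0]),
]

sweep = zone_average_pod(cells, [0.1, 1.0, 10.0])
for thr, (mean_pod, n_cells) in sweep.items():
    print(thr, round(mean_pod, 3), n_cells)
```

In this toy case the second cell has no observed value above 10 mm/day, so the average at that threshold rests on a single cell, which mirrors the caveat stated in the text.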
TABLE 8. True skill statistics by zone and season (mean value is shown in first row for each zone, and standard deviation is shown in second row for each zone).

Zone       Statistic  Winter (DJF)  Spring (MAM)  Summer (JJA)  Fall (SON)  Annual
Mountain   Mean       0.29          0.28          0.35          0.32        0.32
           Std dev    0.08          0.07          0.06          0.08        0.05
Northeast  Mean       0.29          0.37          0.32          0.35        0.34
           Std dev    0.06          0.05          0.06          0.06        0.04
Southeast  Mean       0.35          0.47          0.57          0.47        0.47
           Std dev    0.07          0.05          0.06          0.06        0.05
Southwest  Mean       0.15          0.25          0.42          0.31        0.31
           Std dev    0.08          0.08          0.10          0.06        0.06
Northwest  Mean       0.42          0.31          0.34          0.43        0.40
           Std dev    0.06          0.04          0.05          0.04        0.03
FIG. 12. Spatial average of POD for U.S. zones as a function of precipitation threshold.
Figure 13 presents an equivalent plot for FAR. An increasing trend is clearly observed as the precipitation threshold increases, which is expected since fewer observations are regarded as positive events. Beyond trend considerations, it is seen that the lowest values of FAR are obtained for the Northwest zone, with values ranging from about 0.1 to 0.5. However, care must be taken in drawing conclusions about the performance of GPCP estimates in the Northwest zone, since other indicators do rather poorly there. Next in performance come the Southeast, Northeast, and Mountain zones, where the best skill values are about 0.3. The worst performance is observed in the Southwest zone, where GPCP diagnoses precipitation events erroneously on average about 60% of the time in the best case. FAR for this zone reaches up to 0.9 when the precipitation threshold is 10 mm day⁻¹. This is explained by the fact that at that threshold, very few observed events are regarded as nonzero precipitation and the number of hits tends to zero.

Figure 14 shows little influence of the precipitation threshold on TSS spatial averages for all zones. Values increase slightly when the threshold is shifted from 0.1 to 1.0 mm day⁻¹, but then remain almost constant within the range [0.3, 0.5], depending on the geographical zone. The Mountain zone is an exception to this behavior, with TSS values decreasing almost linearly for threshold values larger than 2.0 mm day⁻¹ and reaching a minimum of approximately 0.2.
d. Discussion

The results presented here are relatively consistent with previous validation efforts by Skomorowski et al. (2001) and Huffman et al. (2001). Although the analysis presented in this paper considers significantly more data than the aforementioned studies, scatterplots of LDAS versus GPCP-1DD data show similar behavior to the comparison between the Oklahoma Mesonet dataset and GPCP-1DD (Huffman et al. 2001). Winter correlation coefficients reported here are of the same order of magnitude as those presented in Huffman et al. (2001), but are higher than those for the summer season.
TABLE 9. Relative frequency distribution of daily precipitation estimates (p).

p (mm day⁻¹)     GPCP-1DD  NLDAS
< 0.1            0.61      0.54
0.1 < p < 1.0    0.12      0.21
1.0 < p < 2.0    0.06      0.07
2.0 < p < 4.0    0.07      0.07
4.0 < p < 6.0    0.04      0.03
6.0 < p < 8.0    0.03      0.02
8.0 < p < 10.0   0.02      0.01
> 10.0           0.06      0.05
FIG. 13. Same as in Fig. 12, but for FAR.
The validation of GPCP-1DD over the conterminous United States is more promising than the results including MAP data presented by Skomorowski et al. (2001), which show higher scatter in the data. Among the continuous statistics, only rmse is common to this work and both of the aforementioned studies. Rmse values show similar behavior, with average values on the order of 5 mm day⁻¹ in all cases. Bias values reported by Rudolf
FIG. 14. Same as in Fig. 12, but for TSS.
and Rubel (2000) and Skomorowski et al. (2001) are −0.3 and −0.6 mm day⁻¹, respectively, while our results show an average bias for the United States of approximately +0.3 mm day⁻¹. As was discussed previously, the effect of undercatch bias cannot be neglected in this analysis. The GPCP-1DD product is adjusted for undercatch bias using the Legates–Willmott correction, so it can be expected to produce estimates that are 5% to 30% higher than NLDAS, depending on the location and season. From Table 2 and Fig. 5, no significant bias is seen on average, although positive bias is present for the Northeast and Northwest zones (north of parallel 40°N). One would expect the Mountain zone to present positive bias as well, but on average this is not observed. Given the level of spatial aggregation of the analysis presented here, it is difficult to hypothesize about the effect that explicitly accounting for undercatch bias would have on individual results. This issue, however, will be addressed in future work.

Average POD in this study (0.6) is almost exactly the same value as in Rudolf and Rubel (2000) and Skomorowski et al. (2001). On the other hand, FAR values presented here (0.3) are slightly higher than those computed with the MAP dataset (0.2), whereas TSS averages are almost the same for both studies (NLDAS = 0.37; MAP = 0.36). Finally, the behavior of categorical statistics as a function of precipitation threshold is qualitatively similar to that published in Skomorowski et al. (2001). However, the full range of thresholds adopted in the latter study could not be used here, since the maximum precipitation value observed in either dataset (GPCP-1DD and LDAS) was smaller than 15 mm day⁻¹.
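The continuous statistics compared in this discussion can be sketched in a few lines. The sign convention used here (bias as estimate minus reference, so that a positive value means GPCP-1DD exceeds LDAS) is an assumption chosen to match the way positive bias is described in the text; the function name and the sample values are hypothetical.

```python
import math


def continuous_stats(reference, estimate):
    """Return (bias, rmse, correlation) for paired daily values.

    Assumption: bias = mean(estimate - reference).
    """
    n = len(reference)
    diffs = [e - r for r, e in zip(reference, estimate)]
    bias = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    # Pearson correlation computed from deviations about the means.
    mean_r = sum(reference) / n
    mean_e = sum(estimate) / n
    cov = sum((r - mean_r) * (e - mean_e) for r, e in zip(reference, estimate))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_e = sum((e - mean_e) ** 2 for e in estimate)
    denom = math.sqrt(var_r * var_e)
    corr = cov / denom if denom else math.nan
    return bias, rmse, corr


# Hypothetical daily values (mm/day) for one cell.
ldas = [0.0, 2.0, 4.0, 6.0]
gpcp = [1.0, 2.0, 3.0, 8.0]

bias, rmse, corr = continuous_stats(ldas, gpcp)
print(round(bias, 3), round(rmse, 3), round(corr, 3))
```

Even in this small example the rmse is more than twice the bias, which illustrates how errors of opposite sign on individual days can cancel in the bias while still contributing fully to the rmse.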
5. Conclusions

This paper provided a systematic comparison of a recently developed global daily precipitation product (GPCP-1DD) and a gauge-based precipitation estimate (LDAS-CPC) over the contiguous United States. LDAS data are treated as a ground-truth estimate of daily precipitation because of the extensive coverage of the rain gauges that contribute information to the final product. By comparing a satellite-based estimate such as GPCP-1DD with a local product like LDAS, preliminary conclusions can be obtained regarding the applicability of GPCP-1DD in sparsely sampled regions of the world.

The comparison is carried out using continuous and categorical statistics. Continuous statistics include bias, correlation, and rmse, whereas categorical or skill measures include probability of detection, false alarm ratio, and true skill score. To investigate climatological effects on the ability of GPCP-1DD to reproduce LDAS observations, the area of study was divided into five zones, and seasonal averages were presented. The zonation pattern applied to the area of study is by and large arbitrary, but the main purpose was to be consistent with the climatological description of the United States that results from applying the Koeppen system for delineating climates. The comparison is performed using 7 yr of daily values over the period between 1 January 1997 and 31 December 2003.

Analysis of the continuous statistics shows that GPCP-1DD is able to replicate LDAS estimates with little bias for three zones and for all seasons. Biases significantly different from zero are observed in the Northeast and Northwest zones, with positive averages for the former and negative averages for the latter. Both zones have in common that they are located north of latitude 40°N (where fewer and different satellite data are available) and that they present relatively high intensity events. Rmse values within climate zones are more spread than bias, but in general show a fairly concentrated behavior with histograms that suggest Gaussian behavior. In drier regions like the Southwest and Mountain zones, rmse tends to be positively skewed, with modes on the order of 2 mm day⁻¹.

The ability of GPCP-1DD to identify precipitation events is characterized by POD and FAR. Typical values for POD are approximately 0.55, with the exception of the Southwest zone, where the average value is approximately 0.4. For the Southeast, Northeast, and Northwest zones the undetected precipitation values are generally less than or equal to the mean daily value, while the Mountain and Southwest zones show undetected precipitation centered about the mean daily value, but with higher variability.
In terms of FAR, the Mountain, Northeast, Southeast, and Northwest zones show average annual values of approximately 0.25, while the Southwest zone is generally the worst performer, with an average annual value of approximately 0.4. The magnitude of the falsely predicted precipitation is generally between zero and twice the mean for all zones except the Southwest, where false alarms as high as 10 times the mean are seen.

This paper presents simple temporal and spatial averages for all quantities computed at cell level. Future work will consist of investigating spatial and temporal autocorrelation and their effect on the inferences that can be made about the structure of estimation errors. In addition to the validation effort, the error characterization presented here provides an important dataset for efforts aimed at extending such remotely sensed products to hydrologic modeling applications. Specifically, a follow-on paper will use the results presented here in
conjunction with an ensemble-based disaggregation and forecasting system to illustrate the utility of GPCP-1DD precipitation products in hydrologic modeling applications.

Acknowledgments. This study was supported by the National Science Foundation under Grant EAR0333133. The authors would also like to thank George Huffman and two anonymous reviewers for their helpful comments on the manuscript.

REFERENCES

Adler, R. F., C. Kidd, G. Petty, M. Morissey, and H. M. Goodman, 2001: Intercomparison of global precipitation products: The third Precipitation Intercomparison Project (PIP-3). Bull. Amer. Meteor. Soc., 82, 1377–1396.
Adler, R. F., and Coauthors, 2003: The Version-2 Global Precipitation Climatology Project (GPCP) Monthly Precipitation Analysis (1979–present). J. Hydrometeor., 4, 1147–1167.
Arkin, P. A., and B. N. Meisner, 1987: The relationship between large-scale convective rainfall and cold cloud over the Western Hemisphere during 1982–84. Mon. Wea. Rev., 115, 51–74.
Bonan, G. B., 2002: Ecological Climatology: Concepts and Applications. Cambridge University Press, 678 pp.
Cosgrove, B. A., and Coauthors, 2003: Real-time and retrospective forcing in the North American Land Data Assimilation System (NLDAS) project. J. Geophys. Res., 108, 8842, doi:10.1029/2002JD003118.
Datta, S., W. L. Jones, B. Roy, and A. Tokay, 2003: Spatial variability of surface rainfall as observed from TRMM field campaign data. J. Appl. Meteor., 42, 598–610.
Ferraro, R. R., 1997: Special Sensor Microwave Imager derived global rainfall estimates for climatological applications. J. Geophys. Res., 102 (D14), 16 715–16 735.
Ferraro, R. R., and G. F. Marks, 1995: The development of SSM/I rain-rate retrieval algorithms using ground-based radar measurements. J. Atmos. Oceanic Technol., 12, 755–770.
Groisman, P. Y., and D. R. Legates, 1994: The accuracy of United States precipitation data. Bull. Amer. Meteor. Soc., 75, 215–227.
Huffman, G. J., R. F. Adler, M. M. Morrissey, D. T. Bolvin, S. Curtis, R. Joyce, B. McGavock, and J. Susskind, 2001: Global precipitation at one-degree daily resolution from multisatellite observations. J. Hydrometeor., 2, 36–50.
Joyce, R. J., J. E. Janowiak, P. A. Arkin, and P. Xie, 2004: CMORPH: A method that produces global precipitation estimates from passive microwave and infrared data at high spatial and temporal resolution. J. Hydrometeor., 5, 487–503.
Katsanos, D., K. Lagouvardos, V. Kotroni, and G. J. Huffman, 2004: Statistical evaluation of MPA-RT high-resolution precipitation estimates from satellite platforms over the central and eastern Mediterranean. Geophys. Res. Lett., 31, L06116, doi:10.1029/2003GL019142.
Margulis, S. A., and D. Entekhabi, 2001: Temporal disaggregation of satellite-derived monthly precipitation estimates and the resulting propagation of error in partitioning of water at the land surface. Hydrol. Earth Syst. Sci., 5, 27–38.
McBride, J. L., and E. E. Ebert, 2000: Verification of quantitative precipitation forecasts from operational numerical weather prediction models over Australia. Wea. Forecasting, 15, 103–121.
Morin, E., W. F. Krajewski, D. C. Goodrich, X. Gao, and S. Sorooshian, 2003: Estimating rainfall intensities from weather radar data: The scale-dependency problem. J. Hydrometeor., 4, 782–797.
Petty, G. W., 1995: The status of satellite-based rainfall estimation over land. Remote Sens. Environ., 51, 135–137.
Rudolf, B., and F. Rubel, 2000: Regional validation of satellite-based global precipitation estimates. Proc. Meteorological Satellite Data Users’ Conf., Bologna, Italy, EUMETSAT, 601–608.
Salvucci, G. D., and C. Song, 2000: Derived distributions of storm depth and frequency conditioned on monthly total precipitation: Adding value to historical and satellite-derived estimates of monthly precipitation. J. Hydrometeor., 1, 113–120.
Sharif, H. O., F. L. Ogden, W. F. Krajewski, and M. Xue, 2004: Statistical analysis of radar rainfall error propagation. J. Hydrometeor., 5, 199–212.
Skomorowski, P., F. Rubel, and B. Rudolf, 2001: Verification of GPCP-1DD global satellite precipitation products using MAP surface observations. Phys. Chem. Earth, 26, 403–409.
Spencer, R. W., 1993: Global oceanic precipitation from the MSU during 1979–91 and comparisons to other climatologies. J. Climate, 6, 1301–1326.
Susskind, J., P. Piraino, L. Rokke, T. Iredell, and A. Mehta, 1997: Characteristics of the TOVS Pathfinder Path A dataset. Bull. Amer. Meteor. Soc., 78, 1449–1472.
Triantafyllou, G. N., and A. A. Tsonis, 1994: Assessing the ability of the Koppen system to delineate the general world pattern of climates. Geophys. Res. Lett., 21, 2809–2812.
Uijlenhoet, R., M. Steiner, and J. A. Smith, 2003: Variability of raindrop size distributions in a squall line and implications for radar rainfall estimation. J. Hydrometeor., 4, 43–60.
Wilheit, T. T., A. T. C. Chang, and L. S. Chiu, 1991: Retrieval of monthly rainfall indices from microwave radiometric measurements using probability distribution functions. J. Atmos. Oceanic Technol., 8, 118–136.
Xie, P. P., and P. A. Arkin, 1996: Analyses of global monthly precipitation using gauge observations, satellite estimates, and numerical model predictions. J. Climate, 9, 840–858.