ATMOSPHERIC SCIENCE LETTERS
Atmos. Sci. Let. 11: 100–107 (2010)
Published online 25 February 2010 in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/asl.259

Comparing the scores of hydrological ensemble forecasts issued by two different hydrological models

A. Randrianasolo,1 M. H. Ramos,1* G. Thirel,2† V. Andréassian1 and E. Martin2

1 Cemagref, Hydrology Research Group, Antony, France
2 CNRM-GAME, Météo-France, CNRS, GMME/MOSAYC, Toulouse, France

*Correspondence to: M. H. Ramos, Cemagref Antony, UR HBAN, Parc de Tourvoie, BP 44-92163 Antony Cedex, France. E-mail: [email protected]
† Present address: G. Thirel, JRC, DG Joint Research Centre, European Commission, Institute for Environment and Sustainability, Ispra, Italy.

Abstract

A comparative analysis is conducted to assess the quality of streamflow forecasts issued by two different modeling conceptualizations of catchment response, both driven by the same weather ensemble prediction system (PEARP, Météo-France). The two hydrological modeling approaches are the physically based and distributed hydrometeorological model SIM (Météo-France) and the lumped soil-moisture-accounting type rainfall-runoff model GRP (Cemagref). Discharges are simulated at 211 catchments in France over 17 months. Skill scores are computed for the first 2 days of forecast range. The results suggest good performance of both hydrological models and illustrate the benefit of streamflow data assimilation for ensemble short-term forecasting. Copyright © 2010 Royal Meteorological Society

Keywords: streamflow forecasting; hydrological ensemble prediction; verification

Received: 31 August 2009 Revised: 9 December 2009 Accepted: 19 January 2010

1. Introduction

At operational flood forecasting centers, forecasters usually have to deal with forecasts issued by different models and combine them to support their decisions and communicate flood alerts to end users (Ramos et al., 2007). However, modeling approaches or setups are usually too different to allow a straightforward intercomparison of the results, and forecast interpretation, especially when model results diverge, can quickly become a puzzle. The objective of this paper is to assess the impact of using two different hydrological models, with different modeling conceptualizations of catchment response, on the scores of ensemble streamflow forecasts. Forecast verification is a vast topic, and discussions have evolved toward how to define objective, user-oriented verification measures for better guidance and decision making in hydrologic forecasting (Welles et al., 2007; Pappenberger et al., 2008). In this study, the focus is not on the development of new measures, but on the application of a selected number of well-known scores widely used in atmospheric science (Jolliffe and Stephenson, 2003) to both hydrological forecasting systems. Attention is paid to the following methodological aspects: (1) force the hydrological models with the same ensemble weather predictions; (2) evaluate streamflow predictions against observed discharges (and not against simulated, model-dependent, discharges); (3) apply the scores over a long time period of forecasts; and (4) conduct the analysis on a large database of catchments, representative of a variety of climate and physiographic conditions.

2. Data

2.1. The PEARP ensemble prediction system

This study is based on the PEARP ensemble prediction system (EPS), the Météo-France short-range EPS dedicated to detecting localized and severe events (Nicolau, 2002). In this study, PEARP is a 60-h EPS with a 0.25° grid resolution, which produces 11 members once a day. Its singular vectors are optimized over a 12-h period. Rainfall and temperature ensemble forecasts are the PEARP variables used to force the hydrological models. Other variables necessary to run the models (pressure, radiation, wind, humidity, or evapotranspiration) are evaluated from the climatology.

PEARP data are downscaled in order to better fit the observations, as well as to make the forecasts available on the grid resolution of the hydrometeorological model used by Météo-France (8 × 8 km). The downscaling is performed in two steps: first, the data are spatially interpolated on predefined zones, which are the climatologically homogeneous areas used to define the SAFRAN meteorological analysis system of Météo-France (see Vidal et al., 2009 for details on SAFRAN). Then, the temperature data are corrected by using the usual mean atmospheric lapse rate (−0.65 K/100 m). For the precipitation, a point-by-point bias removal is used (Thirel et al., 2008). This bias removal is calibrated by comparing precipitation observations (from the SAFRAN analysis) to the PEARP EPS precipitation obtained after the spatial interpolation on the SAFRAN zones. These two fields are compared over a whole year in order to define a SAFRAN/PEARP ratio for each grid point.
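For illustration, a minimal sketch of these two correction steps is given below (in Python); all array names and the precomputed ratio are hypothetical and not part of the operational SAFRAN/PEARP chain.

```python
import numpy as np

# Mean atmospheric lapse rate used in the study: -0.65 K per 100 m.
LAPSE_RATE = -0.65 / 100.0  # K/m

def downscale_temperature(t2m, z_source, z_target):
    """Shift 2-m temperature for the elevation difference between the
    interpolated forecast grid and the target 8 x 8 km grid."""
    return t2m + LAPSE_RATE * (z_target - z_source)

def debias_precipitation(precip, safran_pearp_ratio):
    """Point-by-point bias removal: scale each grid point by the
    SAFRAN/PEARP ratio calibrated over a full year of data."""
    return precip * safran_pearp_ratio
```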

Figure 1. Location of the studied catchments in France.

2.2. Catchments and data

A total of 211 catchments in France were studied (Figure 1). Precipitation ensemble forecasts were verified against precipitation data from the Météo-France meteorological analysis system SAFRAN. Discharges from the hydrological ensemble forecasts were compared to the observed daily streamflow data available in the French database Banque Hydro (http://www.hydro.eaufrance.fr/). A 17-month verification period was used (10 March 2005 to 31 July 2006). Two lead times were considered: day 1 corresponds to the 24-h period of the day following the day the forecast is issued, and day 2 to the 24-h period following day 1. For precipitation, the climatology comes from SAFRAN data from 1995 to 2005. The length of the observed streamflow climatology varies according to the catchment: time series range from 7 to 35 years of daily data, and 75% of the studied catchments have more than 27 years of data.

3. Methods

3.1. Hydrological ensemble predictions

Streamflow forecasts were issued by two different modeling conceptualizations of catchment response: (1) the coupled, physically based hydrometeorological model SAFRAN–ISBA–MODCOU (SIM) developed at Météo-France and based on a fully distributed catchment model and (2) the GRP model developed at Cemagref and based on a lumped soil-moisture-accounting type rainfall-runoff model.

3.1.1. The SAFRAN–ISBA–MODCOU model

The SIM hydrometeorological suite is a distributed model developed at Météo-France. It simulates the evolution of soil moisture over France and streamflows for a total of 881 stations. SIM is composed of three different models: SAFRAN, ISBA, and MODCOU. SAFRAN (a French acronym for Analysis System that Provides Data to Snow Models; Durand et al., 1993) is a meteorological analysis providing eight parameters: 10-m wind speed, 2-m relative humidity, 2-m air temperature, total cloud cover, incoming solar and atmospheric/terrestrial radiation, snowfall, and rainfall. ISBA (Interactions between Soil, Biosphere, and Atmosphere; Noilhan and Planton, 1989) is a land-surface model. It simulates water and energy fluxes between the soil and the atmosphere for 9892 grid meshes (8 × 8 km) distributed over France. MODCOU (MODèle COUplé; Ledoux et al., 1989) is a distributed hydrogeological model. It simulates the spatial and temporal evolution of some aquifers and routes the water toward and into rivers. Besides precipitation and temperature forecasts from PEARP, the model runs with mean values of pressure, radiation, wind, and humidity evaluated from the SAFRAN climatology. The drainage and runoff variables produced by ISBA are used by MODCOU. The internal time step of simulation in SIM is variable (20 min for the ISBA part of the model, 3 h for discharges in MODCOU, and 1 day for groundwater). SIM was validated over the 881 French stations (Habets et al., 2008) and showed realistic water and energy budgets, streamflow, aquifer levels, and snowpack simulations.

3.1.2. The GRP rainfall-runoff model

The GRP model is a lumped soil-moisture-accounting type rainfall-runoff model developed at Cemagref. Input data are daily precipitation and mean evapotranspiration. The model is composed of a production function, which computes the effective rainfall over the catchment, and a routing function, including a unit hydrograph and a nonlinear routing store, which transforms effective rainfall into flow at the catchment outlet. It has three parameters that need to be calibrated against observed discharge: the first is a volume-adjustment factor that controls the volume of effective rainfall; the second is the capacity of the routing store; and the third is the base time of the unit hydrograph. The maximum capacity of the production store is fixed. For flow forecasting, an updating procedure is applied, based on the assimilation of the last observed discharge to update the state of the routing store and on a correction of the model output according to the last model error (Berthet et al., 2009). The model runs with precipitation forecasts from PEARP and mean potential evapotranspiration (Oudin et al., 2005) from climatology. In this study, the model was adapted to run ensemble forecasts at a daily time step. To compare the GRP results to those from the SIM model, which, in this study, does not include updating, two versions are used: the GRP model with only the state updating (GRP) and the GRP model without updating (GRP no updating).

3.2. Scores for the evaluation of forecast quality

Skill scores were computed over the forecast verification period to evaluate the impact of the two hydrological models on the quality of their streamflow ensemble forecasts. The scores are briefly presented below and described in detail in Jolliffe and Stephenson (2003).

3.2.1. The root mean square error

The root mean square error (RMSE) is a measure of forecast accuracy: the lower the RMSE, the more accurate the forecast (Equation (1)). For each lead time, the RMSE was calculated for both the precipitation and the streamflow values. To compare the scores over the 211 studied catchments, we computed normalized scores by dividing the RMSE of each catchment by its average observed precipitation (or streamflow) over the verification period.

\[ \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(m_i - o_i)^2} \quad \text{(mm/day)} \tag{1} \]

where $o_i$ is the observation for day $i$, $m_i$ is the mean of the ensemble forecast for the same day, and $N$ is the total number of days used to compute the score.
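As a minimal sketch of Equation (1) and its normalization, assuming `forecasts` is an (N, n) array of ensemble members and `obs` an (N,) array of observed values (hypothetical names, not the study's code):

```python
import numpy as np

def rmse(forecasts, obs):
    """Equation (1): RMSE of the ensemble mean m_i against observations o_i."""
    ens_mean = forecasts.mean(axis=1)  # m_i, one value per day
    return np.sqrt(np.mean((ens_mean - obs) ** 2))

def normalized_rmse(forecasts, obs):
    """RMSE divided by the average observed value over the verification period."""
    return rmse(forecasts, obs) / obs.mean()
```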

3.2.2. The standard deviation (or spread)

The standard deviation provides a measure of the dispersion of the ensemble members (Equation (2)). Normalized scores were computed by dividing the standard deviation of each catchment by the average forecasted precipitation (or streamflow) over the verification period.

\[ \sigma = \frac{1}{N}\sum_{i=1}^{N}\sqrt{\frac{1}{n}\sum_{k=1}^{n}(x_{k,i} - m_i)^2} \quad \text{(mm/day)} \tag{2} \]

where $m_i$ is the mean of the ensemble forecast for day $i$; $x_{k,i}$ is the forecast value of member $k$ for day $i$; $n$ is the number of forecast members (here, $n = 11$); and $N$ is the total number of days used to compute the score.
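A corresponding sketch of Equation (2), with the same hypothetical `forecasts` array; the normalized version divides by the mean forecasted value, as described above:

```python
import numpy as np

def ensemble_spread(forecasts):
    """Equation (2): mean over days of the per-day ensemble standard deviation."""
    ens_mean = forecasts.mean(axis=1, keepdims=True)               # m_i
    daily_sd = np.sqrt(((forecasts - ens_mean) ** 2).mean(axis=1))
    return daily_sd.mean()

def normalized_spread(forecasts):
    """Spread divided by the average forecasted value over the period."""
    return ensemble_spread(forecasts) / forecasts.mean()
```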

3.2.3. Typical 2 × 2 contingency tables

Typical 2 × 2 contingency tables, with two possible outcomes (yes or no) for observed and forecasted events, were also computed. A perfect forecast system would only produce hits (events are forecasted and observed) and correct negatives (events are neither forecasted nor observed), and no misses (observed events are not forecasted) or false alarms (forecasted events are not observed). Thresholds were specified to separate 'yes' and 'no' events. For the observed events, two streamflow thresholds were defined for each catchment: the 50th and the 90th percentiles of daily streamflows, computed over the verification period (hereafter, Q50 and Q90, respectively). For the definition of forecasted events, three thresholds were selected: if p% of the ensemble members forecast discharges exceeding the considered streamflow threshold, the event is considered a 'forecasted event' (i.e. 'yes'). Otherwise, the event is considered not forecasted. The values of p% chosen in this study are 20, 50 and 80% (hereafter, p20, p50 and p80, respectively). The combination of these thresholds results in six contingency tables. For each contingency table, the following descriptive statistics were computed (a code sketch follows the list):

1. The probability of detection or hit rate (POD) gives the proportion of the observed 'yes' events that were correctly forecasted. It ranges from 0 to 1 (perfect score) (Equation (3)):

\[ \mathrm{POD} = \frac{\text{hits}}{\text{hits} + \text{misses}} \tag{3} \]

2. The false alarm ratio (FAR) gives the proportion of the forecasted 'yes' events that actually did not occur. It ranges from 0 (perfect score) to 1 (Equation (4)):

\[ \mathrm{FAR} = \frac{\text{false alarms}}{\text{hits} + \text{false alarms}} \tag{4} \]

3. The bias score (BIAS) measures the ratio of the frequency of forecasted events to the frequency of observed events. It indicates whether the forecast system has a tendency to underforecast (BIAS < 1) or overforecast (BIAS > 1) (Equation (5)):

\[ \mathrm{BIAS} = \frac{\text{hits} + \text{false alarms}}{\text{hits} + \text{misses}} \tag{5} \]
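A sketch of how one contingency table and Equations (3)-(5) could be computed for a given streamflow threshold (Q50 or Q90) and ensemble-agreement threshold p (0.2, 0.5 or 0.8 in this study); array names are hypothetical:

```python
import numpy as np

def contingency_stats(forecasts, obs, q_threshold, p=0.8):
    """POD, FAR and BIAS (Equations (3)-(5)) for one streamflow threshold
    and one ensemble threshold p."""
    obs_event = obs > q_threshold
    # 'yes' forecast if at least a fraction p of the members exceed the threshold
    fc_event = (forecasts > q_threshold).mean(axis=1) >= p

    hits = np.sum(fc_event & obs_event)
    misses = np.sum(~fc_event & obs_event)
    false_alarms = np.sum(fc_event & ~obs_event)

    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    bias = (hits + false_alarms) / (hits + misses)
    return pod, far, bias
```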

3.2.4. The Brier score

The Brier score averages the squared differences between pairs of forecast probabilities and the subsequent binary observations (Equation (6)):

\[ \mathrm{BS} = \frac{1}{N}\sum_{j=1}^{N}(y_j - o_j)^2 \tag{6} \]

where BS is the Brier score and $N$ is the total number of days used to compute the score. For each realization $j$, $o_j = 1$ if the event occurs and $o_j = 0$ if the event does not occur. $y_j$ is the forecasted probability, given by the proportion of ensemble members forecasting the event (a value between 0 and 1). The same streamflow thresholds used in the contingency tables were considered to define an event: percentiles Q50 and Q90. The Brier score is negatively oriented (the smaller the score, the better) and has a minimum value of zero for a perfect (deterministic) system.
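A sketch of Equation (6) for exceedances of a given percentile threshold, with the same hypothetical arrays as above:

```python
import numpy as np

def brier_score(forecasts, obs, q_threshold):
    """Equation (6): mean squared difference between the forecast
    probability y_j and the binary outcome o_j."""
    o = (obs > q_threshold).astype(float)       # o_j: 1 if observed, else 0
    y = (forecasts > q_threshold).mean(axis=1)  # y_j: fraction of members
    return np.mean((y - o) ** 2)
```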

3.2.5. The Brier skill score

The Brier skill score (BSS) was computed (Equation (7)) using climatology and persistence (i.e. the last discharge observed at the time of forecast is forecast to persist into the forecast period) as reference forecasts:

\[ \mathrm{BSS} = \frac{\mathrm{BS} - \mathrm{BS}_{\text{reference}}}{0 - \mathrm{BS}_{\text{reference}}} = 1 - \frac{\mathrm{BS}}{\mathrm{BS}_{\text{reference}}} \tag{7} \]

The BSS is positively oriented (higher values indicate better performance). It ranges from −∞ to 1 (perfect deterministic system). Negative scores indicate a system that performs worse than the reference forecast, while a score equal to 0 indicates a system that performs like the reference.
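Equation (7) then reduces to a one-liner; `bs_reference` would be the Brier score of the climatological or persistence forecast (a sketch under those assumptions):

```python
def brier_skill_score(bs, bs_reference):
    """Equation (7): BSS = 1 - BS / BS_reference (positively oriented)."""
    return 1.0 - bs / bs_reference
```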

4. Results

4.1. Normalized RMSE and spread

Figure 2 shows the maps of normalized RMSE obtained for the ensemble streamflow forecasts generated by the GRP (with and without updating) and SIM models, for the two lead times studied. Scores are better for GRP with state updating, while the performance of GRP without updating is closer to that of SIM. In general, RMSE values tend to increase with lead time. The differences between RMSE values for day 1 and day 2 are larger for the GRP model with state updating, while the models without updating seem to be less affected. Mean normalized RMSE values for lead times day 1 and day 2 are, respectively, 0.36 and 0.56 for the GRP model, 0.75 and 0.80 for GRP no updating, and 1.04 and 1.08 for the SIM model.

Figure 2. Normalized RMSE of the ensemble streamflow forecasts (RMSE divided by the mean observed streamflow) from the GRP model with state updating (left), GRP model without updating (GRP no updating, center) and the SIM model (right), for lead time day 1 (a, top) and day 2 (b, bottom), and 211 catchments in France.

It must be noted that the errors in the precipitation ensemble forecasts are already substantial: mean values of RMSE equal 3.45 and 3.60 mm/day for lead times day 1 and day 2, respectively. These values, evaluated on the basis of areal mean precipitation, are similar to those presented by Thirel et al. (2008) in their study conducted over gridded PEARP forecasts in France. Normalized RMSE values show that, in some cases, the errors in the precipitation forecast are twice as great as the mean observed precipitation over the same period. Higher errors are found in catchments located in south-east France, where river basins are exposed to localized and severe rain events.

Ensemble streamflow forecasts from the GRP model show less spread than those from the SIM model (Figure 3). For both models, there is an increase in spread from lead time day 1 to day 2, with a higher rate of increase for the GRP model: for GRP, the mean standard deviation for day 2 is approximately six times greater than the mean standard deviation for day 1, while for the SIM model it is approximately three times greater. The observed differences may not only be due to the assimilation procedure, since the results of GRP with and without updating are of the same order and present the same behavior (Figure 3). The lumped nature of the GRP hydrological model may also play a role. A simple test was performed forcing the lumped GRP model with the precipitation amount of each PEARP grid cell over the catchment (instead of using only the areal mean precipitation). This increased the number of ensemble members (variable according to the catchment size) and resulted in higher spread: median values of normalized standard deviation equal to 0.004 for day 1 and 0.034 for day 2, compared to the values of 0.0029 and 0.024 in Figure 3. Despite this increase, the spread remains generally low. The same conclusion was drawn from the evaluation of rank (Talagrand) histograms for each catchment (not shown) and from the visualization of flow hydrographs. A lack of ensemble spread in PEARP precipitation and in PEARP SIM-based streamflow gridded forecasts was also reported by Thirel et al. (2008).

Figure 3. Boxplots of normalized spread of the ensemble streamflow forecasts computed over 211 catchments for the GRP (with state updating and without updating) and SIM models and for lead times day 1 (left) and day 2 (right). Median values are indicated. The top and the bottom of the box represent the 75th and the 25th percentile, respectively, while the top and the bottom of the whiskers indicate the 95th and the 5th percentile, respectively.

4.2. Contingency tables

Figure 4 shows the values of POD, FAR, and BIAS for the streamflow thresholds Q50 and Q90 and for the ensemble threshold p80 (80% of ensemble members exceeding the streamflow threshold). Results are shown as a function of catchment area. In general, statistics are better for larger catchments (greater than 1500–2000 km²), although it must be noted that only about 30% of the catchments in our database are greater than 1500 km². Statistics from the contingency tables do not differ significantly when different ensemble thresholds (p20, p50, and p80) are used (not shown), which can be explained by the low spread of the ensemble forecasts. POD values are generally lower for lead time day 2 and FAR values are higher, although the differences are usually small (less than 10 percentage points, and smaller for the models that do not use updating procedures).

Figure 4. POD (circles), FAR (triangles), and BIAS (crosses) for 80% of ensemble members exceeding the discharge percentile 50% (left) and the discharge percentile 90% (right) for lead time day 1. Results for the GRP model with state updating (top), GRP without updating (center) and the SIM model (bottom) as a function of catchment area (211 catchments in France).

The statistical measures showed more sensitivity to changes in the streamflow threshold used to define observed events (Figure 4). As the threshold increases (here from Q50 to Q90), FAR tends to increase for all models: for day 1, mean values of FAR for the GRP model equal 4.8 and 13.5% for Q50 and Q90, respectively, 17.9 and 34.5% for GRP no updating, and 9.5 and 36.5% for the SIM model. Mean POD decreases for the GRP models, while it increases for the SIM model: the average value of POD for the GRP model is 96.2 and 89.0% for Q50 and Q90, respectively, while it is 89.0 and 81.2% for GRP no updating, and 65.9 and 73.7% for the SIM model. BIAS values are closer to unity for the GRP model: mean BIAS of 1.01 and 1.03 for Q50 and Q90, respectively, for lead time day 1.


They show a tendency to overforecast (BIAS > 1) when GRP no updating is considered: mean values of 1.11 and 1.40 for Q50 and Q90, respectively. The SIM model has a tendency to underforecast (BIAS < 1) streamflow exceedances of the lower threshold (Q50), while it has a tendency to overforecast (BIAS > 1) when the higher threshold (Q90) is considered: mean BIAS of 0.76 and 1.28 for Q50 and Q90, respectively. The same tendencies are observed for day 2.

Figure 5. Boxplots of Brier skill scores computed over 211 catchments for the GRP (with state updating and without updating) and SIM models and for lead times day 1 (left) and day 2 (right). The reference is the climatology. Brier scores are computed for exceedances of the discharge percentile 50% (top) and 90% (bottom). Median values are indicated. The top and the bottom of the box represent the 75th and the 25th percentile, respectively, while the top and the bottom of the whiskers indicate the 95th and the 5th percentile, respectively.

4.3. Brier skill score

Boxplots of BSSs computed over the 211 studied catchments and using climatological forecasts as the reference are shown in Figure 5. A better performance of streamflow ensembles is obtained for the GRP model with state updating, with BSS values closer to unity. BSS values do not vary significantly with lead time. However, better scores are generally obtained when considering high discharge thresholds (exceedances of the percentile Q90). This is especially observed in the case of the hydrological models that do not make use of an updating procedure during forecasting. The boxplot distributions of BSS shown in Figure 5 indicate positive values and therefore a better performance of the forecasting systems compared to the climatology. However, when the reference used is the persistence (not shown), BSS values decrease significantly and the majority of catchments show negative scores. The median BSS is greater than zero only for the GRP model with state updating, although the lowest percentile of the boxplot (5%) is negative (i.e. the forecasting system performs worse than the naive persistence model for at least ten of the studied catchments).

5. Conclusions

This paper presents an assessment of the performance of two hydrologic ensemble forecasting systems driven by the same weather EPS, the 11-member PEARP produced by Météo-France. Scores and statistical measures were computed over a 17-month period and 211 catchments in France. Two lead times (24 and 48 h) were considered and forecasts were compared to observed discharges. The results suggest good performance of both hydrological models forced by the PEARP ensemble predictions. In general, better scores were obtained from the lumped GRP model, running with a state updating technique that allows the assimilation of the last observed discharge during forecasting. The similar results obtained for the distributed SIM model and for the GRP model without updating highlight the importance of data assimilation and of the updating of initial hydrologic conditions in streamflow forecasting. Although scores like the BSS or the RMSE, POD, and FAR show globally good results, the ensemble standard deviation estimates indicate low spread of the ensemble predictions. This possible underrepresentation of the forecast uncertainty needs to be further investigated. In this study, easily understandable measures were used in a first attempt to evaluate the impact of different hydrologic model conceptualizations on the quality of ensemble forecasts. The benefit of streamflow data assimilation for ensemble short-term forecasting is demonstrated. Ongoing work seeks to apply other verification measures and to further investigate the differences between the two modeling approaches studied, by also taking into account results from the SIM model with data assimilation.

Acknowledgements

We acknowledge the MEEDM (Ministère de l'Écologie, de l'Énergie, du Développement durable et de la Mer) for the hydrological data and Météo-France for the weather data and PEARP forecasts. We also thank the French national service for flood forecasting (SCHAPI) for supporting this study.

References

Berthet L, Andréassian V, Perrin C, Javelle P. 2009. How crucial is it to account for the antecedent moisture conditions in flood forecasting? Comparison of event-based and continuous approaches on 178 catchments. Hydrology and Earth System Sciences 13: 819–831.

Durand Y, Brun E, Merindol L, Guyomarc'h G, Lesaffre B, Martin E. 1993. A meteorological estimation of relevant parameters for snow schemes used with atmospheric models. Annals of Glaciology 18: 65–71.

Habets F, Boone A, Champeau J-L, Etchevers P, Leblois E, Ledoux E, Le Moigne P, Martin E, Morel S, Noilhan J, Quintana Segui G, Rousset-Regimbeau F, Viennot P. 2008. The SAFRAN-ISBA-MODCOU hydrometeorological model applied over France. Journal of Geophysical Research 113: D06113, DOI: 10.1029/2007JD008548.

Jolliffe IT, Stephenson DB (eds). 2003. Forecast Verification. A Practitioner's Guide in Atmospheric Science. John Wiley and Sons: Chichester, West Sussex, England; 240p.

Ledoux E, Girard G, de Marsily G, Deschenes J. 1989. Spatially distributed modeling: conceptual approach, coupling surface water and groundwater. In Unsaturated Flow in Hydrologic Modeling: Theory and Practice, NATO ASI Series C, Vol. 275, Morel-Seytoux H-J (ed). Kluwer Academic Publishers: Norwell, MA; 435–454.

Nicolau J. 2002. Short-range ensemble forecasting. In Proceedings of the WMO/CBS Technical Conference on Data Processing and Forecasting Systems, Cairns, Australia, 2–3 December; 4p. Available at: http://www.wmo.ch/pages/prog/www/DPS/TC-DPFS2002/Papers-Posters/Topic1-Nicolau.pdf.

Noilhan J, Planton S. 1989. A simple parameterization of land surface processes for meteorological models. Monthly Weather Review 117: 536–549.

Oudin L, Hervieu F, Michel C, Perrin C, Andréassian V, Anctil F, Loumagne C. 2005. Which potential evapotranspiration input for a lumped rainfall-runoff model? Part 2 – Towards a simple and efficient potential evapotranspiration model for rainfall-runoff modelling. Journal of Hydrology 303: 290–306.

Pappenberger F, Scipal K, Buizza R. 2008. Hydrological aspects of meteorological verification. Atmospheric Science Letters 9: 43–52.

Ramos MH, Bartholmes J, Thielen J. 2007. Development of decision support products based on ensemble weather forecasts in the European Flood Alert System. Atmospheric Science Letters 8: 113–119.

Thirel G, Rousset-Regimbeau F, Martin E, Habets F. 2008. On the impact of short-range meteorological forecasts for ensemble streamflow prediction. Journal of Hydrometeorology 9: 1301–1317.

Vidal J-P, Martin E, Franchistéguy L, Martine B, Soubeyroux J-M. 2009. A 50-year high-resolution atmospheric reanalysis over France with the Safran system. International Journal of Climatology DOI: 10.1002/joc.2003.

Welles E, Sorooshian S, Carter G, Olsen B. 2007. Hydrologic verification: a call for action and collaboration. Bulletin of the American Meteorological Society 88: 503–511.
