

WEATHER AND FORECASTING

VOLUME 27

Development of Statistical Typhoon Intensity Prediction: Application to Satellite-Observed Surface Evaporation and Rain Rate (STIPER)

SI GAO AND LONG S. CHIU

Department of Atmospheric, Oceanic and Earth Sciences, College of Science, George Mason University, Fairfax, Virginia

(Manuscript received 8 March 2011, in final form 10 September 2011)

ABSTRACT

A statistical–dynamical model has been used for operational guidance for tropical cyclone (TC) intensity prediction. In this study, several multiple linear regression models and neural network (NN) models are developed for the intensity prediction of western North Pacific TCs at 24-, 48-, and 72-h intervals. The multiple linear regression models include a model of climatology and persistence (CLIPER); a model based on the Statistical Typhoon Intensity Prediction Scheme (STIPS), which serves as the base regression model (BASE); and a model of STIPS with additional satellite estimates of surface evaporation (SLHF) and inner-core rain rate (IRR), referred to as the STIPER model. A revised equation for the TC maximum potential intensity is derived using Tropical Rainfall Measuring Mission Microwave Imager optimally interpolated sea surface temperature data, which have higher temporal and spatial resolutions. Analyses of the resulting models show only a marginal improvement of STIPER over BASE. However, IRR and SLHF are found to be significant predictors in the predictor pool. Neural network models using the same predictors as STIPER show reductions of the mean absolute errors of 7%, 11%, and 16% relative to STIPER for 24-, 48-, and 72-h forecasts, respectively. The largest improvement is found for the intensity forecasts of the rapidly intensifying and rapidly decaying TCs.

1. Introduction

Tropical cyclones (TCs) cause huge economic losses and human casualties worldwide. Hurricane Katrina in 2005 and the associated flooding claimed over 1800 lives, and damage estimates were well in excess of $100 billion. Cyclone Nargis in 2008 in Myanmar left at least 138 000 people dead or missing, more than 80 000 of them confirmed deaths. Damage was estimated at over $10 billion, making Nargis the most damaging cyclone ever recorded in the north Indian Ocean basin. Accurate forecasting of TC tracks and intensities is therefore essential to minimizing the economic losses and human casualties caused by TCs. An evaluation of the National Hurricane Center and Joint Typhoon Warning Center (JTWC) operational tropical cyclone intensity forecasts for the three major Northern Hemisphere tropical cyclone basins (Atlantic, eastern North Pacific, and western North Pacific) for the past two

Corresponding author address: Long S. Chiu, Dept. of Atmospheric, Oceanic and Earth Sciences, College of Science, George Mason University, 4400 University Dr., Fairfax, VA 22030. E-mail: [email protected]

DOI: 10.1175/WAF-D-11-00034.1

© 2012 American Meteorological Society

decades shows only marginal improvement in intensity forecasting, whereas the track forecasts have improved steadily (DeMaria et al. 2007), indicating that challenges remain in TC intensity forecasting.

The Statistical Typhoon Intensity Prediction Scheme (STIPS) for the western North Pacific, developed by Knaff et al. (2005), has been run operationally at the JTWC since 2003 and has produced skillful forecasts through 4 days. STIPS contains static predictors related to climatology and persistence and time-dependent predictors related to environmental and sea surface temperature (SST) conditions. With the advent of remote sensing technology, a number of remote sensing products are now available in real time or near–real time. The inclusion of satellite information, such as brightness temperatures from infrared imagery and ocean heat contents (OHCs) from satellite-based altimetry (DeMaria et al. 2005; Mainelli et al. 2008), as well as predictors relating to the inner-core (within 110 km of the storm's center) precipitation and convective characteristics of TCs derived from passive microwave imagery (Jones et al. 2006), has yielded improvements of up to 8% in the Statistical Hurricane Intensity Prediction Scheme (SHIPS) for the eastern North Pacific and Atlantic (DeMaria and Kaplan 1994a;

FEBRUARY 2012


DeMaria and Kaplan 1999). However, the value of satellite-based information has yet to be fully exploited for the operational prediction of TC intensity.

Latent heat transfer at the air–sea interface, which can be quantified by the surface latent heat flux (SLHF), and latent heat release within the inner-core region of the atmosphere, which can be indicated by the inner-core rain rate (IRR), are two major heat sources for TC intensification. Gao and Chiu (2010) used satellite-based SLHF and rain-rate data to show that high initial SLHFs and IRRs are usually associated with rapid TC intensification within 24 h over the western North Pacific, suggesting that SLHF and IRR have the potential to be useful new predictors for TC intensity forecasting.

The goal of this study is to examine the impacts of SLHF and IRR on the intensity prediction of TCs over the western North Pacific. In addition to multiple linear regression models, artificial neural network (NN) models are also constructed for intensity prediction. NN models have been shown to work well in TC intensity forecasts (Baik and Paek 2000; Jin et al. 2008). Section 2 describes the datasets. The predictors and the development and evaluation of multiple linear regression models are presented in section 3. NN models are developed and assessed in section 4. Finally, a summary and discussion are given in section 5.

2. Datasets

The intensity and location information for each TC is taken from best-track data produced by the Japan Meteorological Agency's Regional Specialized Meteorological Center Tokyo (RSMC Tokyo). This postanalysis best-track dataset contains the 6-hourly location, minimum central pressure, and 10-min maximum sustained wind speed (MWS) of all TCs over the western North Pacific, including the South China Sea.

Environmental data are derived from the National Centers for Environmental Prediction's (NCEP) Global Forecast System (GFS) Final (FNL) gridded analysis (Yang et al. 2006) at 1.0° × 1.0° and 6-h resolution. The environmental data include the wind, air temperature, and relative humidity at 200, 250, 300, 350, 400, 450, 500, 700, 750, 800, and 850 hPa. The divergence, relative eddy flux convergence, and relative vorticity at each grid point are calculated from the wind field using the central difference method. All the environmental predictors are derived by averaging the corresponding data within a specified radius; the detailed calculations of these parameters are presented in the next section.

The rainfall data are taken from the Tropical Rainfall Measuring Mission (TRMM) Multisatellite Precipitation Analysis (TMPA; Huffman et al. 2007). TMPA provides combined precipitation estimates from multiple satellites, as well as gauge analyses where possible. The spatial and temporal resolutions are 0.25° × 0.25° and 3 hourly, respectively.

Daily gridded SLHF data at 1° × 1° resolution are obtained from the third release of the Objectively Analyzed Air–Sea Fluxes dataset (OAFlux; Yu et al. 2008). OAFlux is an objectively blended dataset: its inputs come from satellite retrievals and numerical weather prediction (NWP) model outputs, while in situ observations are used to assign the weights.

TRMM Microwave Imager (TMI) optimally interpolated (OI) daily SSTs (Wentz et al. 2000) at 0.25° × 0.25° resolution produced by Remote Sensing Systems, together with best-track data, are utilized to estimate the maximum potential intensity (MPI).

Table 1 shows the availability of the datasets, including their temporal and spatial coverages and resolutions. All the datasets are collected over the period from 2000 to 2008, which is the intersection of the periods covered by all the data used in this study. RSMC Tokyo best-track data, NCEP GFS FNL environmental data, and TMPA rain-rate data are collected twice daily at 0000 and 1200 UTC. Only overwater TC samples are considered: land effects on TC intensity change are not addressed in this work, and the OAFlux SLHF and TMI SST data are available over the ocean only. Except for SST and SLHF, the datasets are available for all storms at all forecast times. At 24-, 48-, and 72-h forecast times, SST and SLHF are concurrently available for 75%, 68%, and 62% of all the cases, respectively.

TABLE 1. Datasets used in this study, as well as the temporal and spatial coverages and resolutions.

| Dataset | Spatial resolution | Temporal resolution | Spatial coverage | Temporal coverage |
|---|---|---|---|---|
| RSMC Tokyo best track | 0.1° | 6 hourly | Western North Pacific and the South China Sea | 1951–2009 |
| NCEP GFS analysis | 1° | 6 hourly | Global | 2000–present |
| TMPA rain rate | 0.25° | 3 hourly | 40°N–40°S | 1998–present |
| OAFlux SLHF | 1° | Daily | Global | 1985–2008 |
| TMI OI SST | 0.25° | Daily | 40°N–40°S | 1998–present |
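The central difference calculation mentioned in section 2 can be illustrated for relative vorticity. The following is a minimal sketch only: the function name, the Earth radius value, the neglect of spherical curvature terms, and the restriction to interior grid points are all assumptions of this example, not details given in the paper.

```python
import math

EARTH_RADIUS_M = 6.371e6  # mean Earth radius (assumed value)


def relative_vorticity(u, v, lats_deg, lons_deg):
    """Central-difference estimate of dv/dx - du/dy at interior grid points.

    u, v are 2D lists indexed [j][i] (latitude index j, longitude index i), in m/s.
    Returns {(j, i): vorticity in s^-1}. Spherical curvature terms are neglected,
    which is an assumption of this sketch.
    """
    zeta = {}
    for j in range(1, len(lats_deg) - 1):
        cos_lat = math.cos(math.radians(lats_deg[j]))
        # Distance spanned by the two neighbours in the north-south direction
        dy = EARTH_RADIUS_M * math.radians(lats_deg[j + 1] - lats_deg[j - 1])
        for i in range(1, len(lons_deg) - 1):
            # Distance spanned by the two neighbours in the east-west direction
            dx = EARTH_RADIUS_M * cos_lat * math.radians(lons_deg[i + 1] - lons_deg[i - 1])
            dv_dx = (v[j][i + 1] - v[j][i - 1]) / dx
            du_dy = (u[j + 1][i] - u[j - 1][i]) / dy
            zeta[(j, i)] = dv_dx - du_dy
    return zeta
```

Divergence follows the same pattern with du/dx + dv/dy; area averages of these fields would then be taken over the radii listed for each predictor.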

3. Linear regression models

Multiple linear regression models are used to predict the overwater 24-, 48-, and 72-h intensity (i.e., MWS) changes (predictand, DELV) from the initial forecast time.

TABLE 2. Potential climatological, environmental, and satellite-based predictors. The predictors that are evaluated at the beginning of the forecast period are static (S), and the predictors that are averaged along a storm track from the initial time to the forecast time are time dependent (T).

| Group | Predictor | Description | Static (S) or time dependent (T) |
|---|---|---|---|
| Climatology and persistence | MWS0 | Initial maximum sustained wind speed (MWS) | S |
| Climatology and persistence | DMWS | Change in MWS during the past 12 h | S |
| Climatology and persistence | JDAY | Absolute value of (Julian day − 248) | S |
| Climatology and persistence | SPD | Storm translational speed | S |
| Climatology and persistence | LAT | Latitude of storm center | S |
| Climatology and persistence | LON | Longitude of storm center | S |
| Environmental | POT | Maximum potential intensity based on Eq. (1) minus initial maximum wind speed | T |
| Environmental | RHLO | Area-averaged (200–800 km) RH at 850–700 hPa | T |
| Environmental | RHHI | Area-averaged (200–800 km) RH at 500–300 hPa | T |
| Environmental | U200 | Area-averaged (200–800 km) zonal wind at 200 hPa | T |
| Environmental | T200 | Area-averaged (200–800 km) temperature at 200 hPa | T |
| Environmental | d200 | Area-averaged (0–1000 km) divergence at 200 hPa | T |
| Environmental | REFC | Relative eddy flux convergence within 600 km at 200 hPa | T |
| Environmental | SHR | Area-averaged (200–800 km) 200–850-hPa wind shear | T |
| Environmental | USHR | Area-averaged (200–800 km) 200–850-hPa zonal wind shear | T |
| Environmental | z850 | Area-averaged (0–1000 km) 850-hPa relative vorticity | T |
| Satellite based | SLHF | Area-averaged (5° × 5° box) surface latent heat flux | S |
| Satellite based | IRR | Area-averaged (0–100 km) inner-core rain rate | S |
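Two of the predictor definitions in Table 2 are simple enough to sketch directly. The example below assumes that "Julian day" means the day of the year and that a time-dependent predictor is an unweighted mean of the values sampled along the track; the function names and the sampling are illustrative assumptions, not specifications from the paper.

```python
from datetime import date

PEAK_DAY = 248  # day of year used in the JDAY definition of Table 2


def jday_predictor(d: date) -> int:
    """Absolute difference between the day of year and day 248."""
    day_of_year = d.timetuple().tm_yday
    return abs(day_of_year - PEAK_DAY)


def track_average(values):
    """Unweighted mean of a predictor sampled along the storm track,
    from the initial time to the forecast time (assumed sampling)."""
    values = list(values)
    return sum(values) / len(values)
```

For instance, `jday_predictor(date(2005, 9, 5))` is 0 because 5 September is day 248 of a non-leap year.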

a. Model formulation

Table 2 summarizes the predictors (16 original predictors and 2 new satellite-based predictors, SLHF and IRR) used in this study. The computation of the traditional climatological and environmental predictors follows the approach in Knaff et al. (2005). All of the environmental predictors are obtained using a "perfect prog" approach (Kalnay 2003). Both the NCEP GFS FNL analysis and the actual TC best track (from RSMC Tokyo) are used to develop the models. The predictors that are evaluated at the beginning of the forecast period are static, such as those related to climatology and persistence; predictors that are averaged along the track of the storm from the initial observation to the forecast time are time dependent and provide the mean conditions for the storm, such as those related to SST, moisture, and wind fields.

The MPI is defined as the upper bound of the TC intensity for a given set of atmospheric and oceanic thermal conditions (Camp and Montgomery 2001). MPI can be estimated theoretically (e.g., Miller 1958; Emanuel 1988; Holland 1997) or empirically (e.g., Merrill 1987; DeMaria and Kaplan 1994b; Whitney and Hobgood 1997; Knaff et al. 2005; Zeng et al. 2007). Theoretically derived MPIs are often represented by minimum central pressures. The relationship between TC maximum wind and minimum pressure is complex and is determined by factors such as latitude, TC size, environmental pressure, and intensification rate (Knaff and Zehr 2007). To be consistent with the TRMM-derived rainfall, we use the SST derived from TMI and therefore adopt the empirical approach, in which the MPI is an exponential function of SST. The interpolated 1° × 1° monthly SST climatology (Levitus 1982) or 1° × 1° weekly SST analysis (Reynolds et al. 2002) used in previous studies is replaced in this study by the high-resolution daily SST data retrieved from TMI. Daily SSTs are expected to provide more precise and up-to-date thermal information on the ocean.

TMI SSTs are stratified into SST bins with midpoints from 16.5° to 32.5°C at 0.5°C intervals, and each observation is assigned to the nearest SST midpoint. The empirical exponential MPI function described by Eq. (1) is derived with an SST cutoff of 28.5°C, since the flattening or decrease of maximum winds over the warmest waters occurs at this SST. The resulting coefficients are A = 29.59 kt (or 15.22 m s⁻¹, where 1 kt = 0.514 m s⁻¹), B = 108.1 kt, C = 0.1292 °C⁻¹, and T₀ = 30.0°C.
Figure 1 shows this MPI function together with the data used for its development. The highest MPI is given as 140 kt:

$$\mathrm{MPI} = A + B\,e^{C(T - T_0)}. \tag{1}$$
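Equation (1) with the coefficients quoted in the text can be evaluated as in the sketch below. Applying the 140-kt "highest MPI" as a hard cap and interpreting T as the SST in °C are assumptions of this example, as are the function names; note also that POT in Table 2 is a time-dependent predictor, so in practice the pointwise value shown here would be averaged along the track.

```python
import math

# Coefficients quoted in the text for Eq. (1)
A_KT = 29.59
B_KT = 108.1
C_PER_DEGC = 0.1292
T0_DEGC = 30.0
MPI_CAP_KT = 140.0  # highest MPI quoted in the text, used here as a cap (assumption)


def mpi_kt(sst_degc: float) -> float:
    """Empirical maximum potential intensity (kt) as a function of SST (deg C)."""
    raw = A_KT + B_KT * math.exp(C_PER_DEGC * (sst_degc - T0_DEGC))
    return min(raw, MPI_CAP_KT)


def pot_kt(sst_degc: float, mws0_kt: float) -> float:
    """Potential (POT): MPI minus the initial maximum sustained wind speed."""
    return mpi_kt(sst_degc) - mws0_kt
```

At an SST equal to T₀ the exponential term is 1, so the uncapped MPI is simply A + B = 137.69 kt.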

The coefficients in the MPI–SST relation differ from those of Knaff et al. (2005). Knaff et al. used the 1-min sustained maximum wind speed from the JTWC best-track data, whereas the RSMC Tokyo 10-min sustained maximum wind speed is used here.

The SLHF and IRR predictors are computed as averages within a 5° × 5° box and within a radius of 100 km, respectively, both centered on the TC position. These averaging areas maximize the correlation coefficients of the two predictors with intensity and intensity change, as has also been suggested in previous studies (Chang et al. 1997; Rodgers et al. 1994).
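The 100-km inner-core average can be sketched as a great-circle distance test over gridded rain-rate cells. This is an illustrative sketch: the haversine formula, the Earth radius, the unweighted mean (no area weighting), and the data layout as (lat, lon, value) tuples are assumptions, not the authors' stated procedure.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points, in km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def inner_core_mean(cells, center_lat, center_lon, radius_km=100.0):
    """Unweighted mean of the values of all cells within radius_km of the center.

    cells is an iterable of (lat, lon, value); returns None if no cell qualifies.
    """
    inside = [
        value
        for lat, lon, value in cells
        if great_circle_km(center_lat, center_lon, lat, lon) <= radius_km
    ]
    return sum(inside) / len(inside) if inside else None
```

The 5° × 5° SLHF box average would replace the distance test with simple latitude and longitude bounds around the storm center.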


FIG. 1. The empirical relationship between MPI (kt) and SST (°C). The relationship is derived from 9 yr (2000–08) of data and the individual data points used for its development are also shown.

For operational purposes, SLHF and IRR are treated as static predictors, since no coupled atmosphere–ocean models are used for prognostic purposes.

A linear stepwise regression procedure is used to select predictors from the potential predictor pool. A 99% statistical significance level based on an F test (e.g., Wilks 2006) is the threshold for an individual predictor to be added initially to the model. Once selected, a predictor can only be removed if its significance level falls below 98% after the addition or removal of another predictor.

Three regression models are developed. The first, which serves as a control, is a base regression model (hereafter referred to as BASE) created by applying the stepwise procedure to the original 16 STIPS predictors. The second (hereafter referred to as STIPER) is developed by applying the stepwise procedure to the original predictor pool plus the two satellite-based parameters, SLHF and IRR. The third, CLIPER, is created by applying the stepwise procedure to the six predictors related to climatology and persistence. CLIPER generally serves as a baseline for evaluating the skill of operational models: a model is considered to produce a skillful intensity forecast if it has a smaller error than CLIPER.

Assuming that the annual statistics are independent, the samples in one year are used for verification and the samples in the other years are used for model development. As a result, for each of the CLIPER, BASE, and STIPER models there are nine regression equations (one for each verification year), which may contain different sets of significant predictors because the training samples differ. For consistency and a fair comparison, the final predictor set for each model (CLIPER, BASE, and STIPER) is the union of all the significant predictors identified in the nine regression equations. Each model is then rerun using the same set of predictors for the nine verification years. The predictand and the predictors are normalized by subtracting their means and dividing by their standard deviations before regression, so that the resulting coefficients can be used to compare the relative contribution of each predictor directly.
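The bookkeeping around the regression (normalization, leave-one-year-out splitting, and taking the union of the selected predictors) can be sketched as follows. The stepwise F-test selection itself is not shown; the function names, the sample layout as dictionaries with a "year" key, and the use of the population standard deviation are assumptions of this sketch.

```python
from statistics import mean, pstdev


def standardize(values):
    """Subtract the mean and divide by the standard deviation
    (population form; the paper does not say which form was used)."""
    m, s = mean(values), pstdev(values)
    return [(x - m) / s for x in values]


def leave_one_year_out(samples):
    """Yield (year, training samples, verification samples) for each year."""
    years = sorted({s["year"] for s in samples})
    for year in years:
        train = [s for s in samples if s["year"] != year]
        verify = [s for s in samples if s["year"] == year]
        yield year, train, verify


def union_of_predictors(selected_by_year):
    """Union of the predictor names selected in each yearly equation,
    in first-seen order."""
    final = []
    for names in selected_by_year.values():
        for name in names:
            if name not in final:
                final.append(name)
    return final
```

With samples from 2000 to 2008 this yields nine folds, matching the nine regression equations per model described above; the union then fixes one predictor set that is reused in every fold.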

b. Model interpretation

1) 24-h MODELS

Table 3 lists the normalized coefficients associated with each predictor for each STIPER 24-h forecast equation. The numbers of samples used to develop the regression equations are shown in parentheses at the top of the table. The 99% statistically significant predictors for each verification year are indicated in boldfaced italics in Table 3. There are around 1100 samples for training in each model. The 24-h STIPER models contain 10 statistically significant predictors: initial intensity (MWS0), previous 12-h intensity change (DMWS), absolute value of Julian day minus 248 (JDAY), initial latitude of the storm (LAT), potential (POT), 850–700-hPa average relative humidity (RHLO), 500–300-hPa average relative humidity (RHHI), 200–850-hPa wind shear (SHR), SLHF, and IRR. DMWS, JDAY, POT, and SHR are significant for all the verification years. IRR is significant for the verification cases in the years 2001, 2002, 2003, 2005, 2007, and 2008 (six out of nine), and SLHF is significant for the 2001, 2004, 2006, 2007, and 2008 (five out of nine) verification cases. Correspondingly, MWS0, DMWS, JDAY, and LAT are used to develop the CLIPER models and MWS0, DMWS, JDAY, LAT, POT, RHLO, RHHI, and SHR are used in the BASE models.

In all of these models, the four most important predictors are POT, DMWS, SHR, and MWS0. As expected, the persistence term, DMWS, is associated with a positive regression coefficient, since storms that have intensified in the previous 12 h tend to intensify in the next 24 h (Knaff et al. 2005). Intensity change is negatively correlated with MWS0 because weak storms are further from their MPIs and hence have more potential to intensify. Vertical wind shear has a negative impact on the intensification of TCs. One explanation is that the heat and moisture at upper levels are advected in a different direction relative to the low-level cyclonic circulation, and this "ventilation" of heat away from the circulation inhibits the development of the storm (Gray 1968). DeMaria (1996) proposed an alternative explanation: the tilt of the upper- and lower-level potential vorticities due to vertical wind shear produces a midlevel temperature increase near the vortex center. This midlevel warming is hypothesized to reduce the convective activity and thus inhibit storm development.

The coefficient of JDAY is negative since this variable indicates the number of days from the peak of the typhoon season. LAT is negatively correlated with intensity change since the SST generally decreases toward the north in the western North Pacific basin. RHLO and RHHI can affect TC intensification rates because high relative humidity in the middle atmosphere reduces the entrainment of dry air into the cumulus convection, which is a direct source of TC energy. The nearly identical positive coefficients associated with the two new satellite-based predictors suggest that latent heat release in the atmosphere and latent heat transfer at the ocean–atmosphere interface have comparable effects on TC development in 24 h.

TABLE 3. Normalized regression coefficients in the 24-h STIPER forecast model during different verification years. The predictors are listed in the first column and the verification years are listed in the top row. Here, N (shown in parentheses) is the number of samples used to develop the equation. Coefficients significant above the 99% statistical significance level from an F test are indicated in boldfaced italics.

| Predictor | 2000 (1126) | 2001 (1102) | 2002 (1065) | 2003 (1097) | 2004 (1039) | 2005 (1108) | 2006 (1111) | 2007 (1144) | 2008 (1176) |
|---|---|---|---|---|---|---|---|---|---|
| 1) MWS0 | −0.15 | −0.17 | −0.17 | −0.14 | −0.15 | −0.15 | −0.13 | −0.16 | −0.14 |
| 2) DMWS | 0.29 | 0.31 | 0.30 | 0.30 | 0.30 | 0.30 | 0.30 | 0.28 | 0.28 |
| 3) JDAY | −0.12 | −0.10 | −0.12 | −0.11 | −0.12 | −0.10 | −0.10 | −0.12 | −0.11 |
| 4) LAT | −0.11 | −0.09 | −0.10 | −0.10 | −0.13 | −0.10 | −0.09 | −0.13 | −0.12 |
| 5) POT | 0.31 | 0.29 | 0.29 | 0.31 | 0.30 | 0.30 | 0.33 | 0.29 | 0.32 |
| 6) RHLO | 0.07 | 0.04 | 0.05 | 0.04 | 0.05 | 0.04 | 0.05 | 0.06 | 0.06 |
| 7) RHHI | −0.01 | 0.02 | 0.03 | 0.04 | 0.01 | 0.04 | 0.01 | 0.00 | −0.01 |
| 8) SHR | −0.17 | −0.18 | −0.17 | −0.18 | −0.18 | −0.16 | −0.17 | −0.18 | −0.18 |
| 9) SLHF | 0.05 | 0.06 | 0.05 | 0.03 | 0.04 | 0.04 | 0.05 | 0.05 | 0.05 |
| 10) IRR | 0.04 | 0.07 | 0.04 | 0.07 | 0.03 | 0.06 | 0.05 | 0.05 | 0.04 |

TABLE 4. As in Table 3, but for 48-h forecasts.

| Predictor | 2000 (759) | 2001 (736) | 2002 (702) | 2003 (726) | 2004 (684) | 2005 (738) | 2006 (744) | 2007 (770) | 2008 (837) |
|---|---|---|---|---|---|---|---|---|---|
| 1) MWS0 | −0.11 | −0.16 | −0.11 | −0.09 | −0.16 | −0.13 | −0.12 | −0.16 | −0.13 |
| 2) DMWS | 0.14 | 0.16 | 0.15 | 0.15 | 0.15 | 0.15 | 0.16 | 0.15 | 0.15 |
| 3) JDAY | −0.15 | −0.13 | −0.17 | −0.13 | −0.16 | −0.13 | −0.13 | −0.15 | −0.14 |
| 4) LAT | −0.13 | −0.07 | −0.12 | −0.11 | −0.16 | −0.11 | −0.09 | −0.13 | −0.11 |
| 5) POT | 0.56 | 0.50 | 0.55 | 0.57 | 0.51 | 0.53 | 0.53 | 0.51 | 0.53 |
| 6) RHLO | 0.05 | 0.01 | −0.00 | 0.02 | 0.02 | 0.01 | 0.02 | 0.04 | 0.02 |
| 7) RHHI | −0.05 | −0.02 | 0.03 | 0.01 | −0.04 | 0.02 | −0.01 | −0.05 | −0.01 |
| 8) REFC | 0.09 | 0.12 | 0.11 | 0.06 | 0.11 | 0.08 | 0.10 | 0.10 | 0.10 |
| 9) SHR | −0.16 | −0.21 | −0.19 | −0.20 | −0.22 | −0.18 | −0.18 | −0.20 | −0.19 |
| 10) USHR | 0.08 | 0.09 | 0.13 | 0.09 | 0.12 | 0.11 | 0.08 | 0.10 | 0.10 |
| 11) SLHF | 0.02 | 0.03 | −0.01 | −0.00 | −0.01 | −0.00 | 0.02 | 0.02 | 0.01 |
| 12) IRR | 0.09 | 0.13 | 0.11 | 0.12 | 0.09 | 0.13 | 0.10 | 0.10 | 0.10 |
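Because both the predictand and the predictors are normalized, the coefficients in Table 3 can be compared and combined directly. The sketch below uses the 2008 column of Table 3; the function names and the convention that an unspecified predictor sits at its mean (standardized value 0) are assumptions of this example.

```python
# Normalized coefficients for verification year 2008, copied from Table 3
COEF_24H_2008 = {
    "MWS0": -0.14, "DMWS": 0.28, "JDAY": -0.11, "LAT": -0.12, "POT": 0.32,
    "RHLO": 0.06, "RHHI": -0.01, "SHR": -0.18, "SLHF": 0.05, "IRR": 0.04,
}


def normalized_delv(z_scores, coefs=COEF_24H_2008):
    """Standardized intensity change: sum of coefficient times standardized predictor.

    Predictors missing from z_scores are treated as being at their mean (0).
    """
    return sum(coefs[name] * z_scores.get(name, 0.0) for name in coefs)


def rank_by_magnitude(coefs=COEF_24H_2008):
    """Predictor names ordered from largest to smallest absolute coefficient."""
    return sorted(coefs, key=lambda name: abs(coefs[name]), reverse=True)
```

Ranking this column by magnitude puts POT, DMWS, SHR, and MWS0 first, in line with the four most important predictors named in the text. A storm with POT and SHR each one standard deviation above their means, and everything else at the mean, would have a standardized 24-h change of about 0.14.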

2) 48-h MODELS

The statistically significant predictors in STIPER selected from the potential predictor pool for 48-h intensity forecasts include MWS0, DMWS, JDAY, LAT, POT, RHLO, RHHI, 200-hPa relative eddy flux convergence (REFC), SHR, the zonal component of the 200–850-hPa wind shear (USHR), SLHF, and IRR (Table 4). DMWS, POT, SHR, and IRR are significant for all of the verification years. SLHF is significant only for the years 2001 and 2006. There are around 700 samples for training each 48-h model.

The 48-h CLIPER models have the same predictors as the 24-h version, and the 48-h BASE models contain 10 predictors; the additional predictors relative to the 24-h models are REFC and USHR. The predictors related to SST and vertical wind shear (POT and SHR) are the most important. The contribution of the persistence term DMWS is weaker than in the 24-h STIPER model; persistence is more important for shorter-term forecasts. Relative to the 24-h STIPER model, the initial SLHF has a lower impact on the 48-h intensity change; however, IRR is significant in all of the verification years and has a larger normalized regression coefficient, suggesting an increasing role of initial inner-core latent heating in longer-term forecasts.

TABLE 5. As in Table 3, but for 72-h forecasts.

| Predictor | 2000 (522) | 2001 (505) | 2002 (470) | 2003 (489) | 2004 (464) | 2005 (509) | 2006 (509) | 2007 (536) | 2008 (572) |
|---|---|---|---|---|---|---|---|---|---|
| 1) MWS0 | −0.13 | −0.19 | −0.21 | −0.24 | −0.24 | −0.22 | −0.20 | −0.26 | −0.21 |
| 2) DMWS | 0.10 | 0.12 | 0.09 | 0.09 | 0.09 | 0.09 | 0.11 | 0.09 | 0.10 |
| 3) JDAY | −0.15 | −0.13 | −0.16 | −0.16 | −0.16 | −0.11 | −0.13 | −0.14 | −0.14 |
| 4) LAT | −0.11 | −0.02 | −0.09 | −0.14 | −0.14 | −0.08 | −0.08 | −0.11 | −0.09 |
| 5) LON | 0.05 | 0.01 | 0.07 | 0.06 | 0.06 | 0.07 | 0.03 | 0.04 | 0.05 |
| 6) POT | 0.64 | 0.57 | 0.53 | 0.52 | 0.52 | 0.53 | 0.54 | 0.49 | 0.55 |
| 7) RHHI | −0.04 | −0.01 | 0.02 | −0.06 | −0.06 | 0.02 | −0.01 | −0.02 | −0.01 |
| 8) REFC | 0.08 | 0.14 | 0.12 | 0.13 | 0.13 | 0.08 | 0.11 | 0.10 | 0.11 |
| 9) SHR | −0.12 | −0.19 | −0.16 | −0.21 | −0.21 | −0.15 | −0.16 | −0.17 | −0.17 |
| 10) USHR | 0.06 | 0.09 | 0.08 | 0.11 | 0.11 | 0.07 | 0.06 | 0.06 | 0.07 |
| 11) SLHF | 0.01 | 0.02 | 0.01 | −0.03 | −0.03 | 0.01 | 0.02 | 0.01 | 0.01 |
| 12) IRR | 0.12 | 0.13 | 0.12 | 0.11 | 0.11 | 0.15 | 0.12 | 0.12 | 0.13 |

The positive regression coefficients associated with USHR indicate that westerly shear (and westerly 200-hPa winds) is favorable for TC intensification. This positive relationship is consistent with the finding in STIPS, but differs from the negative relationships found in the eastern Pacific and Atlantic basins (DeMaria and Kaplan 1994a; DeMaria and Kaplan 1999). There are two possible explanations for this. 1) The accompanying westerly winds from tropical upper-tropospheric troughs (TUTTs; Sadler 1976, 1978) or midlatitude troughs on the north sides of the TCs weaken the easterly shear normally observed over the storm center and result in weak westerlies within the 200–800-km annulus where the vertical wind shear is computed. This mechanism is likely more effective in the western North Pacific basin, where most of the tropical cyclogenesis is associated with monsoon troughs (Zehr 1992; Briegel and Frank 1997; Ritchie and Holland 1999), which often have intense upper-level easterlies (Wang and Xu 1997) that inhibit storm intensification. 2) Some observations suggest that typhoon peak intensity often occurs at or near recurvature (Riehl 1972; Evans and McKinley 1998; Knaff 2009), that is, as storms begin to encounter the midlatitude westerlies.

The positive relationships between the momentum flux predictor REFC and the 48-h intensity change are expected. REFC tends to be large when a TC is moving toward an upper-level trough in the midlatitude westerlies or interacting with upper-level cold lows at low latitudes, and a large REFC, which makes the upper-level circulation more cyclonic, could increase the storm intensification rate (Holland and Merrill 1984; Molinari and Vollaro 1989; DeMaria et al. 1993). Note that REFC is not a significant predictor for the 24-h BASE models.

3) 72-h MODELS

Twelve predictors [MWS0, DMWS, JDAY, LAT, initial longitude of the storm (LON), POT, RHHI, REFC, SHR, USHR, SLHF, and IRR] are selected as statistically significant predictors in STIPER from the potential predictor pool for 72-h intensity forecasts (Table 5). DMWS, POT, SHR, and IRR are significant for all of the verification cases. SLHF is significant only for the 2001 and 2006 verification cases. Around 500 samples are available for the development of each 72-h model.

The 72-h CLIPER models have one more predictor (LON) than the 24- and 48-h CLIPER models. The 72-h BASE and 72-h STIPER models have two additional predictors (REFC and USHR) compared to the 24-h version of STIPER, while RHLO, which is included in the 48-h model, is eliminated in the 72-h models. The predictors related to SST and vertical wind shear (POT and SHR), and the initial intensity (MWS0), are the most important, while the persistence term DMWS is less important than in the 24- and 48-h versions of STIPER. As for the 48-h intensity change, the initial SLHF has a smaller impact on the 72-h intensity change than in the 24-h case. The other satellite-based predictor (IRR) has a normalized regression coefficient comparable to that in the 48-h models, suggesting that the importance of the initial inner-core latent heating extends to 72-h forecasts.

The positive regression coefficients associated with USHR and REFC indicate that westerly shear (and westerly 200-hPa winds) and large upper-level momentum flux are also favorable for TC intensification in 72 h. The positive regression coefficient of LON indicates that the intensification rate would be higher if the storms are at higher longitudes. A possible explanation is that storms situated farther east are at their initial stages and, hence, have a greater potential to intensify. Furthermore, SSTs are higher to the west over the western North Pacific.

4. Neural network models

The nonlinear response of the TC intensities suggests the use of nonlinear models. Models with nonlinear terms, such as those developed by Knaff et al. (2005), were developed and tested. Improvements of up to 4% are found for individual verification years. However, the NN method seems to work best.

Back-propagation three-layer NN models for 24-, 48-, and 72-h intensity predictions (NN24, NN48, and NN72) are developed using the same predictors as the linear regression model STIPER. The NN architecture is shown in Fig. 2. The input layers of the NN24, NN48, and NN72 models contain 10, 12, and 12 neurons, respectively, which correspond to the predictors used in the linear regression forecasting models for the same time period. All of the NN models have seven neurons in the hidden layer and one neuron in the output layer. The output neuron corresponds to the predictand (24-, 48-, or 72-h intensity change). In all NN models, the log-sigmoid transfer function is used from the input layer to the hidden layer and the linear transfer function is used from the hidden layer to the output layer.

As with the linear regression models, the NN models are developed using 8-yr data samples and the remaining 1-yr data samples are used for model verification. The samples in the development set of the NN models are separated into three sets: a training set, a verification set, and a test set (Demuth et al. 2007). The numerical solution of the model is the one that minimizes the mean absolute error (MAE) while trying to avoid model overfitting. Overfitting can result in some unreasonable forecast models.

FIG. 2. Architectural graph of an NN with one hidden layer.

TABLE 6. MAEs (kt) and adjusted explained variances (R², %) for CLIPER, BASE, and STIPER 24-h forecasts during different verification years with the number of samples (N). The best model for each verification year is indicated by boldfaced italics.

| Year | N | CLIPER MAE | CLIPER R² | BASE MAE | BASE R² | STIPER MAE | STIPER R² |
|---|---|---|---|---|---|---|---|
| 2000 | 120 | 7.01 | 37.5 | 6.54 | 43.8 | 6.49 | 44.0 |
| 2001 | 144 | 6.51 | 47.5 | 6.33 | 54.0 | 6.46 | 52.4 |
| 2002 | 181 | 7.30 | 42.6 | 6.77 | 48.7 | 6.76 | 49.5 |
| 2003 | 149 | 6.94 | 50.5 | 6.73 | 53.4 | 6.73 | 52.6 |
| 2004 | 207 | 7.94 | 39.5 | 7.58 | 45.5 | 7.37 | 46.7 |
| 2005 | 138 | 7.60 | 49.3 | 7.47 | 54.6 | 7.60 | 54.3 |
| 2006 | 135 | 7.43 | 50.5 | 7.18 | 54.6 | 7.18 | 54.7 |
| 2007 | 102 | 9.87 | 44.3 | 8.67 | 54.6 | 8.59 | 55.1 |
| 2008 | 70 | 8.67 | 55.5 | 8.09 | 60.8 | 8.05 | 61.7 |
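The forward pass of the three-layer architecture described in section 4 (log-sigmoid hidden layer with seven neurons, linear output neuron) can be sketched as follows. The weights here are random placeholders rather than trained values, back-propagation training is not shown, and all function names are assumptions of this sketch.

```python
import math
import random


def logsig(x):
    """Log-sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: inputs -> log-sigmoid hidden layer -> linear output."""
    hidden = [
        logsig(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def random_network(n_inputs, n_hidden=7, seed=0):
    """Placeholder weights in [-1, 1]; a trained model would replace these."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b_hidden = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b_out = rng.uniform(-1, 1)
    return w_hidden, b_hidden, w_out, b_out
```

An NN24-shaped network would be built with `random_network(10)` and an NN48- or NN72-shaped one with `random_network(12)`; the single output value plays the role of the normalized intensity change.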

5. Model evaluation

The linear regression models and NN models developed above are evaluated in this section. Tables 6–8 show MAEs with adjusted explained variances (R²) for the CLIPER, BASE, and STIPER models at 24-, 48-, and 72-h forecasts for different verification years. The best model for each verification year is indicated in boldfaced italics. The numbers of samples used for verification are also shown. Table 9 indicates the verification statistics for the NN24, NN48, and NN72 models.

For 24-h models, there are as few as 70 samples for verification year 2008 and up to 207 samples for verification year 2004. The BASE, STIPER, and NN24 models each can produce better forecasts than CLIPER. Among the nine verification years, STIPER is the best model in six of the cases, and STIPER provides a 1%–3% MAE reduction relative to BASE for those six cases. The average MAEs for all verification years of CLIPER, BASE, and STIPER are 7.58, 7.18, and 7.16 kt, respectively, indicating that STIPER is only slightly better than BASE for the 24-h forecast. The MAEs of the NN24 models range from 5.97 kt in 2001 to 8.04 kt in 2007 (Table 9). Figure 3 shows a direct comparison of the three linear regression models (CLIPER, BASE, and STIPER) and the NN models at the 24-h forecast interval for the nine verification years. Compared to 24-h STIPER, NN24 provides a 6%–10% MAE reduction for the nine verification years. The average MAE of NN24 for all nine verification years is 6.63 kt, a 7% improvement relative to that of 24-h STIPER (7.16 kt).

TABLE 7. As in Table 6, but for 48-h forecasts.

| Year | N | CLIPER MAE | CLIPER R² | BASE MAE | BASE R² | STIPER MAE | STIPER R² |
|---|---|---|---|---|---|---|---|
| 2000 | 78 | 10.92 | 33.3 | 10.21 | 40.9 | 9.89 | 43.9 |
| 2001 | 101 | 9.37 | 60.9 | 8.93 | 66.7 | 9.42 | 62.1 |
| 2002 | 135 | 10.42 | 50.9 | 9.73 | 55.6 | 9.68 | 56.8 |
| 2003 | 111 | 10.03 | 64.2 | 8.86 | 70.9 | 8.62 | 71.0 |
| 2004 | 153 | 11.62 | 45.8 | 10.92 | 50.1 | 10.55 | 52.2 |
| 2005 | 99 | 10.00 | 60.5 | 8.87 | 70.9 | 9.20 | 69.0 |
| 2006 | 93 | 10.35 | 61.4 | 9.09 | 70.1 | 8.92 | 71.3 |
| 2007 | 67 | 14.78 | 31.0 | 11.97 | 55.0 | 11.56 | 56.9 |
| 2008 | 40 | 12.57 | 59.3 | 12.24 | 62.1 | 11.85 | 63.9 |


TABLE 8. As in Table 6, but for 72-h forecasts.

             CLIPER        BASE        STIPER
Year    N    MAE    R²    MAE    R²    MAE    R²
2000   50  11.99  22.4  12.43  17.8  12.16  21.1
2001   67  10.76  69.2  11.34  68.7  11.50  68.3
2002  102  10.48  55.9   9.61  62.8   9.56  65.2
2003   83  11.82  70.4   9.62  80.3   9.45  80.6
2004  108  13.00  51.0  12.45  51.4  12.17  53.1
2005   63  10.62  68.7   8.79  79.8   9.51  76.9
2006   63  10.82  70.2   9.57  79.3   8.95  80.8
2007   36  16.11  26.0  12.08  63.9  11.44  67.8
2008   26   9.60  67.5   9.64  67.1  11.58  54.5

To examine the 24-h model’s performance at specific stages of TCs, MAEs of the BASE, STIPER, and NN24 models are stratified by initial TC intensity and 24-h intensity change for all nine verification years as indicated in Fig. 4. Figure 4 (top) shows MAEs as a function of initial intensity classified into 5-kt bins. The STIPER model outperforms the BASE model for typhoons with intensity between 80 and 105 kt except for the 85–95-kt bins. The two models have very similar levels of performance for bins with lesser intensity. The NN24 model outperforms the STIPER and BASE models in all TC stages except for the 90–95-kt bin, where the BASE model produces the best forecasts. Figure 4 (bottom) indicates MAEs binned at every 5-kt change as a function of 24-h intensity change. Most of the samples are located within an intensity range of ±20 kt. The STIPER model outperforms the BASE model for nearly all intensity change bins, especially for the change between 35 and 40 kt. The NN24 model outperforms the STIPER and BASE models for all intensity change bins. The largest MAE reduction of the NN24 model occurs for the rapidly intensifying (24-h intensity change ≥ 35 kt) or rapidly decaying (24-h intensity change ≤ −30 kt) storms.

For 48-h models, there are 40 samples for verification year 2008 and up to 153 samples for verification year 2004. Each of the BASE, STIPER, and NN48 models can provide better forecasts than CLIPER except for STIPER in verification year 2001. Among the nine verification years of the three linear regression models, STIPER produces the best forecasts except for verification years 2001 and 2005, and MAE reductions of STIPER are 1%–3% compared to BASE for the seven years. The average MAE for all nine verification years of the 48-h CLIPER, BASE, and STIPER simulations are 10.88, 9.90, and 9.79 kt, respectively, indicating that STIPER is only slightly better than BASE. The MAEs of the NN48 models range from 7.81 kt in 2003 to 10.45 kt in 2007 (Table 9).
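The stratification used for Fig. 4 amounts to grouping verification samples into 5-kt bins of a stratifying variable (initial intensity or 24-h intensity change) and averaging the absolute forecast errors within each bin. A minimal Python sketch follows; the function name, the left-closed bin convention, and the five sample values are hypothetical and are not taken from the verification data.

```python
import numpy as np


def stratified_mae(stratifier, abs_error, bin_width=5.0):
    """Return {bin lower edge: (MAE, sample count)} for fixed-width bins.

    A sample with value v falls in the bin [edge, edge + bin_width),
    where edge = floor(v / bin_width) * bin_width (assumed convention).
    """
    stratifier = np.asarray(stratifier, dtype=float)
    abs_error = np.asarray(abs_error, dtype=float)
    lower_edges = np.floor(stratifier / bin_width) * bin_width
    result = {}
    for edge in np.unique(lower_edges):
        in_bin = lower_edges == edge
        result[float(edge)] = (float(abs_error[in_bin].mean()), int(in_bin.sum()))
    return result


# Hypothetical samples: initial intensity (kt), observed and forecast
# 24-h intensity change (kt).
initial_intensity = [82, 84, 91, 93, 101]
observed_change = [10, -5, 20, 0, -15]
forecast_change = [6, -1, 12, 3, -10]

abs_err = np.abs(np.subtract(forecast_change, observed_change))
by_intensity = stratified_mae(initial_intensity, abs_err)   # as in Fig. 4 (top)
by_change = stratified_mae(observed_change, abs_err)        # as in Fig. 4 (bottom)
```

Returning the sample count alongside each MAE mirrors the figure, which also displays the number of valid observations per bin; sparsely populated bins should be read with caution.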
TABLE 9. MAEs (kt) and adjusted explained variances (R², %) of neural network models (NN24, NN48, and NN72) for 24-, 48-, and 72-h intensity forecasts during different verification years.

         NN24         NN48         NN72
Year    MAE    R²    MAE    R²    MAE    R²
2000   6.02  47.2   8.29  53.6  10.03  52.6
2001   5.97  62.7   8.10  70.8   9.61  74.0
2002   6.27  55.9   8.43  62.9   8.23  72.8
2003   6.21  58.8   7.81  74.4   7.79  83.0
2004   6.95  51.1   9.13  61.9  10.26  62.5
2005   6.91  57.2   8.71  69.3   8.26  81.3
2006   6.72  59.0   8.52  73.3   8.51  80.0
2007   8.04  59.1  10.45  63.6   8.72  74.4
2008   7.21  67.2   9.78  72.7   8.70  75.5

Figure 5 shows the direct comparison of three linear regression models (CLIPER, BASE, and STIPER) and the NN models at 48-h forecast intervals for the nine verification years. NN48 provides a 4%–17% MAE reduction relative to the 48-h STIPER simulation for the nine verification cases, with an average MAE of 8.68 kt, which shows an 11% improvement relative to that of 48-h STIPER (9.79 kt).

For 72-h models, there are 26 samples for verification year 2008 and up to 108 samples for verification year 2004. Among the nine verification years, five STIPER models, three CLIPER models, and one BASE model provide the best forecasts, and MAEs of the STIPER models are 1%–6% smaller than those of the corresponding BASE models for the five cases. The 72-h models are less stable than shorter-term forecast models for 24- and 48-h intensity prediction. This is probably due to the smaller sample size of the 72-h models compared to the 24- and 48-h models, as indicated in Tables 3–5. The average MAEs for all nine verification years of the 72-h CLIPER, BASE, and STIPER simulations are 11.63, 10.61, and 10.58 kt, respectively, again indicating that STIPER is only slightly better than BASE.
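The averages quoted for the 48-h models can be checked against Tables 7 and 9. The text does not state how the nine yearly MAEs are combined; the sketch below assumes a mean weighted by the number of verification samples N in each year, which reproduces the reported 9.79 kt (STIPER) and 8.68 kt (NN48), whereas an unweighted mean of the yearly NN48 values would give about 8.80 kt. The function names are illustrative.

```python
import numpy as np

# Values transcribed from Tables 7 and 9 (48-h forecasts, verification years 2000-2008).
n_samples = np.array([78, 101, 135, 111, 153, 99, 93, 67, 40])
mae_stiper = np.array([9.89, 9.42, 9.68, 8.62, 10.55, 9.20, 8.92, 11.56, 11.85])
mae_nn48 = np.array([8.29, 8.10, 8.43, 7.81, 9.13, 8.71, 8.52, 10.45, 9.78])


def pooled_mae(mae_by_year, n_by_year):
    """Sample-weighted mean of yearly MAEs (assumed averaging method)."""
    return float(np.average(mae_by_year, weights=n_by_year))


def percent_reduction(reference, candidate):
    """MAE reduction of `candidate` relative to `reference`, in percent."""
    return 100.0 * (reference - candidate) / reference


avg_stiper = pooled_mae(mae_stiper, n_samples)       # about 9.79 kt
avg_nn48 = pooled_mae(mae_nn48, n_samples)           # about 8.68 kt
overall = percent_reduction(avg_stiper, avg_nn48)    # about 11%
per_year = percent_reduction(mae_stiper, mae_nn48)   # roughly 4.5% (2006) to 17.5% (2008)
```

The same two functions can be applied to the 72-h columns of Tables 8 and 9, using the 72-h sample counts as weights.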

The MAEs of the NN72 models range from 7.79 kt in 2003 to 10.26 kt in 2004 (Table 9). Figure 6 shows a direct comparison of the CLIPER, BASE, STIPER, and NN models at 72-h forecast intervals for the nine verification years. NN72 provides a 6%–28% improvement relative to 72-h STIPER for the nine verification years. The average MAE of NN72 for all nine verification years is 8.92 kt, which shows a 16% improvement relative to that (10.58 kt) of 72-h STIPER. Student’s t tests show that, for all of the 9-yr forecasts, the differences between the NN models and each of the linear regression models for the three forecast intervals are statistically significant at a 95% confidence level, indicating that the NN models significantly outperform the linear regression models for the typhoon intensity forecast. The increase of the average MAEs with the longer forecast time intervals is primarily due to the inclusion of some predictors related to CLIPER, which have a greater impact on the intensity changes over shorter time periods.

FIG. 3. The MAEs (kt) of four regression models (CLIPER, BASE, STIPER, and NN) in different verification years at 24-h forecast intervals.

FIG. 4. (top) The 24-h BASE, STIPER, and NN24 model MAEs stratified by best-track initial intensity (MWS0) in 5-kt bins and (bottom) 24-h intensity change (DELV) in 5-kt bins for all nine verification years. Lower values of MAEs represent better forecasts. Dashed–dotted lines represent the numbers of valid observations within a particular bin.

FIG. 5. As in Fig. 3, but for the 48-h forecast interval.

6. Summary and discussion

A new maximum potential intensity equation is derived using TMI OI sea surface temperature data with high temporal and spatial resolutions (daily and 1/4°). This equation, together with environmental information obtained from NCEP GFS FNL analysis, best tracks taken from RSMC Tokyo, and SLHF and IRR results derived from satellite data, is then utilized to develop multiple linear regression models and NN models for western North Pacific TC intensity forecasting at 24-, 48-, and 72-h intervals. Compared to the multiple linear regression models (BASE) with climatology, persistence, and environmental predictors only, the linear regression models that include additional satellite-based SLHF and IRR data (STIPER) provide a 1%–3% improvement in 24-h forecasting (for six out of nine cases) and 48-h forecasting (for seven out of nine cases) and a 1%–6% improvement in 72-h forecasting (for five out of nine cases). In terms of the average MAE for all nine of the verification years, STIPER is only slightly better, and the largest improvement of the satellite-enhanced STIPER model occurs for those rapidly intensifying storms with intensity change between 35 and 40 kt.

FIG. 6. As in Fig. 3, but for the 72-h forecast interval.

The NN models developed using the same predictors as those used in the STIPER models outperform the STIPER models. The improvements are up to 10%, 17%, and 28% at 24-, 48-, and 72-h forecast times for individual verification cases, respectively. The average MAEs of all nine of the verification years show 7%, 11%, and 16% reductions relative to STIPER for the 24-, 48-, and 72-h NN models, respectively. The error reduction exists at almost all of the TC stages and for all intensity change categories, and forecasts of the rapidly intensifying and rapidly decaying storms show the largest improvement.

The NN model outperforms CLIPER at all forecast ranges (24, 48, and 72 h), indicating skill in all ranges. The three models other than CLIPER (BASE, STIPER, and NN) show improvements over CLIPER for the 24-h forecasts (see Fig. 3), indicating forecast skill at near range. However, for extended-range (72-h) forecasts, STIPER and BASE outperform CLIPER in only six out of nine verification years (Table 8), while CLIPER performs better than BASE and STIPER in three out of nine cases. It would be interesting to add IRR and SLHF to the static parameters in the CLIPER model and to compare their use in extended forecasts.

The predictor IRR is significant for six out of the nine verification years at 24 h, and for all 48- and 72-h forecast intervals, indicating the importance of IRR. However, the impacts of the SLHF predictor are only significant for some of the verification years. The SLHF used in this study is based on a merged analysis and has relatively coarse temporal and spatial resolutions (daily and 1°).
Higher temporal and spatial resolution versions of SLHF may provide a more detailed depiction of TC–ocean interaction. Progress on the third version of the National Aeronautics and Space Administration (NASA) Goddard Space Flight Center Satellite-based Surface Fluxes (GSSTF3; Shie et al. 2009) SLHF data with resolutions of 12 h and 1/4°, which are retrieved from a number of satellite datasets, may be useful for further improvements to our ability to forecast TC intensity.

A study by Knaff and DeMaria (2005) found 5%–10% improvements by using a NN version of SHIPS for the Atlantic, but they were not able to duplicate the NN improvement in operational forecasts. In fact, the NN results were inferior to the original multiple regression version of the SHIPS model when both versions were run. They examined 2 yr’s worth of independent cases using real-time GFS forecast fields and National Hurricane Center (NHC) forecast tracks. The use of GFS forecasts injected forecast errors into their TC intensity model, while perfect prognoses are used in our model development. Thus, in order to ascertain the value of IRR and SLHF in TC intensity model forecasts, our models need to be run in an operational setting to determine if the NN models with IRR and SLHF can actually improve TC intensity forecasts operationally.

Acknowledgments. This work constitutes part of a Ph.D. dissertation submitted by SG to the Institute of Space and Earth Information Science, Chinese University of Hong Kong, Hong Kong, China. Support from the CEMAPS project ITS/058/09FP is acknowledged. The authors thank Dr. Roongroj Chokngamwong for helpful discussions and the two anonymous reviewers for their valuable comments. LSC thanks the NASA PMM/TRMM program for its support.

REFERENCES

Baik, J.-J., and J.-S. Paek, 2000: A neural network model for predicting typhoon intensity. J. Meteor. Soc. Japan, 78, 857–869.
Briegel, L. M., and W. M. Frank, 1997: Large-scale influences on tropical cyclogenesis in the western North Pacific. Mon. Wea. Rev., 125, 1397–1413.
Camp, J. P., and M. T. Montgomery, 2001: Hurricane maximum intensity: Past and present. Mon. Wea. Rev., 129, 1704–1717.
Chang, A. T. C., L. S. Chiu, G. R. Liu, and K. H. Wang, 1997: Analysis of 1994 typhoons in the Taiwan region using satellite data. Proceedings of COSPAR Colloquium on Space Remote Sensing of Subtropical Oceans (SRSSO), C.-T. Liu, Ed., COSPAR Colloquia Series, Vol. 8, Elsevier Science, 89–96.
DeMaria, M., 1996: The effect of vertical shear on tropical cyclone intensity change. J. Atmos. Sci., 53, 2076–2087.
——, and J. Kaplan, 1994a: A Statistical Hurricane Intensity Prediction Scheme (SHIPS) for the Atlantic basin. Wea. Forecasting, 9, 209–220.
——, and ——, 1994b: Sea surface temperature and the maximum intensity of Atlantic tropical cyclones. J. Climate, 7, 1324–1334.
——, and ——, 1999: An updated Statistical Hurricane Intensity Prediction Scheme (SHIPS) for the Atlantic and eastern North Pacific basins. Wea. Forecasting, 14, 326–337.
——, J.-J. Baik, and J. Kaplan, 1993: Upper-level eddy angular momentum fluxes and tropical cyclone intensity change. J. Atmos. Sci., 50, 1133–1147.
——, M. Mainelli, L. K. Shay, J. Knaff, and J. Kaplan, 2005: Further improvements to the updated Statistical Hurricane Intensity Prediction Scheme (SHIPS). Wea. Forecasting, 20, 531–543.
——, J. A. Knaff, and C. Sampson, 2007: Evaluation of long-term trends in tropical cyclone intensity forecasts. Meteor. Atmos. Phys., 97, 19–28.
Demuth, H., M. Beale, and M. Hagan, 2007: Neural Network Toolbox 5 users’ guide. MathWorks, Inc. [Available online at http://www.mathworks.com/help/pdf_doc/nnet/nnet_ug.pdf.]
Emanuel, K. A., 1988: The maximum intensity of hurricanes. J. Atmos. Sci., 45, 1143–1155.


Evans, J. L., and K. McKinley, 1998: Relative timing of tropical storm lifetime maximum intensity and track recurvature. Meteor. Atmos. Phys., 65, 241–245.
Gao, S., and L. S. Chiu, 2010: Surface latent heat flux and rainfall associated with rapidly intensifying tropical cyclones over the western North Pacific. Int. J. Remote Sens., 31, 4699–4710.
Gray, W. M., 1968: Global view of the origin of tropical disturbances and storms. Mon. Wea. Rev., 96, 669–700.
Holland, G. J., 1997: The maximum potential intensity of tropical cyclones. J. Atmos. Sci., 54, 2519–2541.
——, and R. T. Merrill, 1984: On the dynamics of tropical cyclone structure changes. Quart. J. Roy. Meteor. Soc., 110, 723–745.
Huffman, G. J., and Coauthors, 2007: The TRMM Multisatellite Precipitation Analysis (TMPA): Quasi-global, multiyear, combined-sensor precipitation estimates at fine scales. J. Hydrometeor., 8, 38–55.
Jin, L., C. Yao, and X. Y. Huang, 2008: A nonlinear artificial intelligence ensemble prediction model for typhoon intensity. Mon. Wea. Rev., 136, 4541–4554.
Jones, T. A., D. Cecil, and M. DeMaria, 2006: Passive-microwave enhanced Statistical Hurricane Intensity Prediction Scheme. Wea. Forecasting, 21, 613–635.
Kalnay, E., 2003: Atmospheric Modeling, Data Assimilation and Predictability. Cambridge University Press, 341 pp.
Knaff, J. A., 2009: Revisiting the maximum intensity of recurving tropical cyclones. Int. J. Climatol., 29, 827–837.
——, and M. DeMaria, 2005: Improvements in deterministic and probabilistic tropical cyclone surface wind predictions. U.S. Weather Research Program, Joint NOAA/Navy/NASA Hurricane Testbed Final Rep., 7 pp.
——, and R. M. Zehr, 2007: Reexamination of tropical cyclone–pressure wind relationships. Wea. Forecasting, 22, 71–88.
——, C. R. Sampson, and M. DeMaria, 2005: An operational statistical typhoon intensity prediction scheme for the western North Pacific. Wea. Forecasting, 20, 688–699.
Levitus, S., 1982: Climatological Atlas of the World Ocean. NOAA Prof. Paper 13, 173 pp. and 17 microfiche.
Mainelli, M., M. DeMaria, L. K. Shay, and G. Goni, 2008: Application of oceanic heat content estimation to operational forecasting of recent Atlantic category 5 hurricanes. Wea. Forecasting, 23, 3–16.
Merrill, R. T., 1987: An experiment in statistical prediction of tropical cyclone intensity change. NOAA Tech. Memo. NWS NHC 34, 33 pp.
Miller, B. I., 1958: On the maximum intensity of hurricanes. J. Meteor., 15, 184–195.
Molinari, J., and D. Vollaro, 1989: External influences on hurricane intensity. Part I: Outflow layer angular momentum fluxes. J. Atmos. Sci., 46, 1093–1105.


Reynolds, R. W., N. A. Rayner, T. M. Smith, D. C. Stokes, and W. Wang, 2002: An improved in situ and satellite SST analysis for climate. J. Climate, 15, 1609–1625.
Riehl, H., 1972: Intensity of recurved typhoons. J. Appl. Meteor., 11, 613–615.
Ritchie, E. A., and G. J. Holland, 1999: Large-scale patterns associated with tropical cyclogenesis in the western Pacific. Mon. Wea. Rev., 127, 2027–2043.
Rodgers, E. B., S. W. Chang, and H. F. Pierce, 1994: A satellite observational and numerical study of the precipitation characteristics in western North Atlantic tropical cyclones. J. Appl. Meteor., 33, 573–593.
Sadler, J. C., 1976: A role of the tropical upper tropospheric trough in early season typhoon development. Mon. Wea. Rev., 104, 1266–1278.
——, 1978: Mid-season typhoon development and intensity changes and the tropical upper tropospheric trough. Mon. Wea. Rev., 106, 1137–1152.
Shie, C.-L., and Coauthors, 2009: A note on reviving the Goddard Satellite-Based Surface Turbulent Fluxes (GSSTF) dataset. Adv. Atmos. Sci., 26, 1071–1080.
Wang, B., and X. Xu, 1997: Northern Hemispheric summer monsoon singularities and climatological intraseasonal oscillation. J. Climate, 10, 1071–1085.
Wentz, F. J., C. L. Gentemann, D. K. Smith, and D. B. Chelton, 2000: Satellite measurements of sea-surface temperature through cloud. Science, 288, 847–850.
Whitney, L. D., and J. S. Hobgood, 1997: The relationship between sea surface temperatures and maximum intensities of tropical cyclones in the eastern North Pacific Ocean. J. Climate, 10, 2921–2930.
Wilks, D. S., 2006: Statistical Methods in the Atmospheric Sciences. 2nd ed. International Geophysics Series, Vol. 59, Academic Press, 627 pp.
Yang, F., H.-L. Pan, S. K. Krueger, S. Moorthi, and S. J. Lord, 2006: Evaluation of the NCEP Global Forecast System at the ARM SGP site. Mon. Wea. Rev., 134, 3668–3690.
Yu, L., X. Jin, and R. A. Weller, 2008: Multidecade Global Flux Datasets from the Objectively Analyzed Air–Sea Fluxes (OAFlux) project: Latent and sensible heat fluxes, ocean evaporation, and related surface meteorological variables. Woods Hole Oceanographic Institution, OAFlux Project Tech. Rep. OA-2008-01, 64 pp.
Zehr, R. M., 1992: Tropical cyclogenesis in the western North Pacific. NOAA Tech. Rep. NESDIS 61, 181 pp.
Zeng, Z., Y. Wang, and C.-C. Wu, 2007: Environmental dynamical control of tropical cyclone intensity—An observational study. Mon. Wea. Rev., 135, 38–59.
