Statistical modeling of total ozone: Selection of appropriate ... - Wiley

JOURNAL OF GEOPHYSICAL RESEARCH, VOL. 112, D11108, doi:10.1029/2006JD007694, 2007

Statistical modeling of total ozone: Selection of appropriate explanatory variables

Jörg A. Mäder,1 Johannes Staehelin,1 Dominik Brunner,1,2 Werner A. Stahel,3 Ingo Wohltmann,4 and Thomas Peter1

Received 23 June 2006; revised 19 January 2007; accepted 12 February 2007; published 7 June 2007.

[1] A set of 44 potential explanatory variables is used for statistical modeling of monthly mean total ozone values of 158 ground-based stations. A stepwise elimination process leads to zonally optimized multiple regression models, which account for approximately 78% of the variance in total ozone in the tropics, 85% south of 60°S and more than 90% in the three remaining zones while only retaining six explanatory variables in the model. In all regions the dynamics appear to dominate ozone variability, which is primarily described by a proxy specifically designed to describe the effects of short-term isentropic excursions at different altitude levels and the compression/expansion of air connected to convergence/divergence. The influence of equivalent effective stratospheric chlorine (EESC) is also important in all regions, indicating the significant effects of anthropogenic emissions of ozone depleting substances (ODS). In addition to the dynamics and EESC, the influence of volcanic eruptions, represented by the integrated surface area density of stratospheric aerosols (SAD), has the largest impact on total ozone in northern regions. The results of the analysis are less clear in the Southern Hemisphere where only a few long-enough ozone time series are available.

Citation: Mäder, J. A., J. Staehelin, D. Brunner, W. A. Stahel, I. Wohltmann, and T. Peter (2007), Statistical modeling of total ozone: Selection of appropriate explanatory variables, J. Geophys. Res., 112, D11108, doi:10.1029/2006JD007694.

1. Introduction

[2] The largest fraction of stratospheric ozone, which is responsible for filtering harmful solar UV-B and UV-C radiation, resides in the middle and lower stratosphere. Processes in this region of the atmosphere therefore dominate the variability in total ozone columns. In previous assessments of the World Meteorological Organization (WMO), long-term ozone trends were determined by fitting ozone observations to a statistical model using the Quasi-Biennial Oscillation (QBO) and the 11-year solar cycle as explanatory variables describing natural variability, plus a linear trend, which was attributed to anthropogenic ozone depletion [see, e.g., Staehelin et al., 2001]. In addition to these factors, a number of other processes influencing total ozone have been identified. The influence of synoptic-scale meteorological variability, for instance, has been known for decades [Dobson and Harrison, 1926; Schubert and Munteanu, 1988; Steinbrecht et al., 1998]. More important for long-term ozone trends, however, is the influence of decadal or even longer-scale climate variability, as discussed for instance by Hood and Zaff [1995], Chandra et al. [1996] and Hood [1997]. Several studies indicate that climate variability described by the North Atlantic Oscillation (NAO) [e.g., Appenzeller et al., 2000; Orsolini and Limpasuvan, 2001; Orsolini and Doblas-Reyes, 2003], the Arctic Oscillation (AO) [Thompson and Wallace, 2000] or ENSO [Brönnimann et al., 2004] might have significantly modulated long-term stratospheric ozone changes over the northern extratropics.

[3] In this paper we systematically explore the most dominant factors influencing long-term total ozone evolution using a statistical modeling approach. Using ground-based total ozone measurements from 158 stations obtained from the World Ozone and Ultraviolet Data Center (WOUDC) at Toronto, Canada, and the British Antarctic Survey (BAS), we first construct a multiple regression model for each station separately, in which we include many explanatory variables that might conceivably affect ozone variability, including short-term fluctuations as well as long-term trends. Then we apply a stepwise elimination procedure in order to determine those explanatory variables which contribute most significantly to the observed variability. The method is based on the significance levels of the individual variables. This leads to ranking tables of the explanatory variables for the individual stations. The obtained ranking tables are optimized in a second elimination procedure for five zonal bands: one tropical, two midlatitude and two polar bands.

1 Institute for Atmospheric and Climate Science, Eidgenössische Technische Hochschule Zürich, Zurich, Switzerland.
2 Now at EMPA – Materials Science and Technology, Dübendorf, Switzerland.
3 Seminar für Statistik, Eidgenössische Technische Hochschule Zürich, Zurich, Switzerland.
4 Alfred Wegener Institute for Polar and Marine Research, Potsdam, Germany.

Copyright 2007 by the American Geophysical Union. 0148-0227/07/2006JD007694


Table 1. Used Abbreviations^a

Abbreviation  Definition
AO            Arctic/Antarctic Oscillation
AS            Asian Summer Index
BAS           British Antarctic Survey
EA            East Atlantic Index
EAJ           East Atlantic Jet Index
EAWR          East Atlantic – Western Russian Index
ECMWF         European Centre for Medium-Range Weather Forecasts
EESC          equivalent effective stratospheric chlorine
EL            equivalent latitude ozone proxy
EP            East Pacific Index
EPFC          Eliassen-Palm-Flux
ERA           ECMWF reanalysis
JAK           sea level pressure at Jakarta
JISAO         Joint Institute for the Study of the Atmosphere and Ocean
M             annual cycle on a monthly basis (12 values)
NAO           North Atlantic Oscillation
NAT           nitric acid trihydrate
NCEP          National Center of Environmental Prediction
NM            northern midlatitude region
NOAA          National Oceanographic and Atmospheric Administration
NP            northern polar region
NP            North Pacific Index
ODS           ozone depleting substances
PANC          polar area at 50 hPa below T_NAT
PC1-3         principal pressure components
PDO           Pacific Decadal Oscillation Index
PE            Polar-Eurasia Index
PNA           Pacific – North American Pattern
PSC           polar stratospheric clouds
PSCVNC        PSC volume below T_NAT
PT            Pacific Transition Index
PTP           pressure at tropopause altitude (thermal definition)
PV#           potential vorticity at isentropic level (e.g., PV340)
QBO           Quasi-Biennial Oscillation
S             annual cycle on a seasonal base (4 values)
SAD           surface area density of the stratospheric aerosols
SCA           Scandinavia Index
SF            solar flux
SM            southern midlatitude region
SOI           Southern Oscillation Index
SP            southern polar region
SST           Global-SST ENSO Index
SZ            Subtropical Zonal Index
T             tropical region
T#            temperature at pressure level (e.g., T10)
TOZ           total ozone
TNH           Tropical – Northern Hemisphere Index
WMO           World Meteorological Organization
WOUDC         World Ozone and Ultraviolet Data Center
WP            West Pacific Index

^a Italics indicate explanatory variables used in this study (see Appendix A for definitions).

[4] A similar approach to determining the importance of explanatory variables has so far been applied only to the total ozone series of Hohenpeissenberg, southern Germany [Steinbrecht et al., 2001] and, with fewer potential explanatory variables, in a recent study of total ozone satellite measurements [Steinbrecht et al., 2003].

[5] This purely statistical approach for selecting the most relevant explanatory variables is complementary to a selection based on a priori knowledge about the physical and chemical processes influencing stratospheric ozone, an approach which is followed in a companion paper [Wohltmann et al., 2007]. While the former approach requires a posteriori justification by a proper analysis of physical and chemical causalities in order to avoid statistical coincidence, the latter may be biased toward the expectations of the respective scientist. Satellite observations offer an alternative to the ground-based network. They provide near-global coverage, but data records useful for trend analysis only began in the late 1970s, much later than the start of operation of many ground stations. A statistical analysis of satellite observations of total and vertically resolved stratospheric ozone is presented in a second companion paper by Brunner et al. [2006a], which is based on the new quasi three-dimensional ozone data set CATO [Brunner et al., 2006b].

2. Method, Data Set, and Performance Tests

2.1. Data Set: Overview

[6] The complete list of data used in this study, including references and additional information, is given in Appendix A, and a list of the abbreviations used is given in Table 1. Our data set consists of the monthly averaged total ozone values of 158 ground-based stations (Dobson, Brewer and filter instruments) from the WOUDC and BAS (see Figure 1) and 44 time series of explanatory variables. Sites were included if they fulfilled the following criteria simultaneously: (1) at least 120 monthly means are available; (2) all measurements come from one single type of instrument, in order to avoid problems of inhomogeneity of the records. For data quality considerations, see Appendix A1. All selected total ozone time series were grouped into five different zonal bands (see Table 2 and Figure 1). The number of stations in the two southern regions is small, which makes the interpretation of the corresponding results delicate. Therefore we will focus our interpretations on the other regions.

[7] To describe the anthropogenic influence on stratospheric ozone due to the release of ozone depleting substances (ODS), we used equivalent effective stratospheric chlorine (EESC), as introduced by WMO [World Meteorological Organization (WMO), 1995]. EESC is a measure of the stratospheric halogen loading that weights the influence of individual ODS by considering the enhanced efficiency of bromine to deplete ozone and the relative rates at which different halocarbons decompose and release their halogen into the stratosphere. In previous statistical modeling studies, the effect of ODS on stratospheric ozone trends was usually described by a linear trend starting at the beginning of the 1970s. An early exception is the study of Steinbrecht et al. [2001], in which measurements of stratospheric chlorine were used instead.
The use of a linear trend was appropriate until the mid-1990s, since EESC increased almost linearly from the beginning of the 1970s until then. However, the ratification of the Montreal Protocol and its amendments prevented EESC from growing linearly after 1995. More recently, other concepts were developed to study the effectiveness of the Montreal Protocol [e.g., Reinsel, 2002; Reinsel et al., 2002, 2005; Newchurch et al., 2003; Steinbrecht et al., 2004; Weatherhead and Andersen, 2006; Brunner et al., 2006a], which is beyond the scope of the present analysis.

Figure 1. Map of total ozone measurements used in this study. The symbols indicate the regions, and the symbol size indicates the number of available monthly means. For abbreviations, see Table 1.

[8] For the description of the dynamical influences on total ozone we used various climate indices such as the Arctic Oscillation (AO) as well as several station-related time series derived from meteorological analysis data sets (see Appendix A). For example, temperature and potential vorticity at various stratospheric levels as well as tropopause height above each station were obtained from National Center for Environmental Prediction (NCEP) reanalysis data. A proxy describing the influence of short-term dynamical variability, which is mainly associated with planetary waves, was computed separately for each station. The calculation of this proxy as well as its applicability to total ozone trend studies is described in detail by Wohltmann et al. [2005] and Wohltmann et al. [2007]. In short, the proxy (hereafter termed the equivalent latitude ozone proxy, EL) represents a synthetic ozone column, obtained by vertically integrating a climatological ozone distribution along the equivalent latitude profile above the station. Equivalent latitude profiles were calculated at 6 hour time intervals from isentropic potential vorticity fields of the 40-year reanalysis data of the European Centre for Medium-Range Weather Forecasts (ECMWF). Synthetic ozone columns were then calculated for each profile separately and finally averaged monthly. The EL proxy describes the vertically integrated effect of meridional advection of air from other latitudes, where climatological mean ozone mixing ratios may be higher or lower, as well as the effect of vertical compression or expansion of isentropic layers [Wohltmann et al., 2005]. For all local variables, only the days on which ozone measurements are available were selected, and the monthly means were derived by averaging over the respective daily mean values.

[9] The set of explanatory variables also contains the Quasi-Biennial Oscillation (QBO), the solar cycle (solar flux at 10.7 cm, SF), a proxy for the volcanic influence (vertically integrated surface area density of the stratospheric aerosols, SAD, separated for 32 zonal bands), the Eliassen-Palm-Flux (cumulative, starting in early winter, EPFC) to describe the strength of the residual (or Brewer-Dobson) circulation, and a proxy for polar ozone depletion represented by the vortex area at 50 hPa below nitric acid trihydrate (NAT) formation temperature (integrated starting in early winter). For more details see Appendix A.

2.2. Statistical Model

[10] The linear statistical regression model for total ozone (TOZ) used in this study can be written as

$$\mathrm{TOZ} = a_k + \sum_{j=1}^{m} b_{kj} X_j + \varepsilon, \qquad (1)$$

which describes total ozone as being the result of linear effects of the explanatory variables $X_j$, with possibly different coefficients $b_{kj}$ for each month $k$ of the year ($k = 1, 2, \ldots, 12$ for January to December), an additional seasonal cycle $a_k$, and a random deviation $\varepsilon$. Such a model allows the full observed data set to be used, without deseasonalizing or splitting it into different subsets for various months or seasons.
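As an illustration of model (1), the following minimal sketch builds the corresponding design matrix and fits it by ordinary least squares. It is not the authors' code: the function names `design_matrix` and `fit`, the `monthly` argument, and the synthetic noise-free data are assumptions made only for this example.

```python
import numpy as np

def design_matrix(month, X, monthly=()):
    """Columns for TOZ = a_k + sum_j b_kj X_j + eps.

    month   : int array (n,), calendar month 1..12 of each observation
    X       : float array (n, m), explanatory variables X_j
    monthly : indices j that get 12 month-specific coefficients b_kj;
              every other variable gets a single coefficient b_j.
    """
    month = np.asarray(month)
    X = np.asarray(X, dtype=float)
    # One indicator column per calendar month: these carry the a_k.
    M = (month[:, None] == np.arange(1, 13)[None, :]).astype(float)
    cols = [M]
    for j in range(X.shape[1]):
        if j in monthly:
            cols.append(M * X[:, [j]])   # one slope per month
        else:
            cols.append(X[:, [j]])       # a single slope
    return np.hstack(cols)

def fit(month, X, toz, monthly=()):
    """Least-squares coefficients and residuals for the model above."""
    A = design_matrix(month, X, monthly)
    coef, *_ = np.linalg.lstsq(A, toz, rcond=None)
    return coef, toz - A @ coef

# Synthetic, noise-free example (hypothetical numbers, not station data).
rng = np.random.default_rng(0)
month = np.tile(np.arange(1, 13), 20)            # 20 years of monthly means
X = rng.normal(size=(month.size, 2))             # two made-up proxies
a_true = 300.0 + 30.0 * np.cos(2 * np.pi * (np.arange(12) - 2) / 12)
b_true = np.array([2.0, -1.0])
toz = a_true[month - 1] + X @ b_true
coef, resid = fit(month, X, toz)                 # coef = [a_1..a_12, b_1, b_2]
```

Because the month indicators already span the constant term, no separate intercept column is added; passing `monthly=(0,)` would give the first proxy twelve month-specific slopes instead of one.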

Table 2. Definition of the Zonal Bands and Number of Used Stations

Zone                   Abbreviation  Latitude        Number of Stations
North polar            NP            north of 62°N   14
Northern midlatitudes  NM            33–62°N         95
Tropics                T             30°S to 33°N    33
Southern midlatitudes  SM            60–30°S         10
South polar            SP            south of 60°S   6
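The band limits in Table 2 translate directly into a small lookup. The sketch below is only illustrative: the function name `zonal_band` is hypothetical, and the treatment of stations lying exactly on a boundary (62°N, 33°N, 30°S, 60°S) is an assumption, since the table does not specify it.

```python
def zonal_band(lat_deg):
    """Zonal band abbreviation for a latitude in degrees north (south < 0).

    Boundary handling is an assumption: 62N and 33N fall into NM,
    30S into T, and 60S into SM.
    """
    if lat_deg > 62.0:
        return "NP"   # north of 62N
    if lat_deg >= 33.0:
        return "NM"   # 33-62N
    if lat_deg >= -30.0:
        return "T"    # 30S to 33N
    if lat_deg >= -60.0:
        return "SM"   # 60-30S
    return "SP"       # south of 60S
```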


[11] As part of the elimination process described in the next section, model (1) was also tested to find out whether it is necessary to use different coefficients $b$ for each month separately, or whether it is sufficient to use seasonal coefficients (4 instead of 12) or even a single coefficient $b_j$ without deteriorating the fit.

[12] In the literature of modern applied statistics and in statistical software packages, the following convenient notation is used to describe our model [Venables and Ripley, 2002]:

$$\mathrm{TOZ} \sim M + X_1 + \ldots + X_m + M{:}X_1 + \ldots + M{:}X_m, \qquad (2)$$

where the first term on the right-hand side stands for the "factor" month ($M$), i.e., for the parameters $a_k$ in the model. The sum $X_1 + \ldots + X_m$ collects the main effects (with a single coefficient $b_j$ for each variable) and the sum $M{:}X_1 + \ldots + M{:}X_m$ describes the interactions with the months, i.e., the different coefficients $b_{kj}$ of a variable for the different months. An interaction term $M{:}X_j$ is dropped if a single coefficient is used, or replaced by $S{:}X_j$ if four coefficients are used for the four seasons. We shall use this model notation below.

2.3. Selection of the Variables: Elimination Process

[13] To derive a reliable set of explanatory variables for total ozone, we first arranged all variables according to their potential influence. This order was calculated by the following nested stepwise elimination process, which was applied separately to the zonal bands.

1. Initialization: As a starting point, a regression model including all adequate explanatory variables (including the month as a factor and all interaction terms) is used. This model will overfit the data with irrelevant explanatory variables.
2. Repeat the following (outer) loop (steps 2.1 and 2.2) as long as the current starting model has more than one explanatory variable:
   2.1. For each station of a region perform the following (inner) loop (steps 2.1a–2.1d):
      2.1a. Eliminate the least significant term (highest p-value). If this concerns an interaction $M{:}X_j$ or $S{:}X_j$, the term is reduced to $X_j$. If the candidate for elimination is a variable $X_j$ for which the interaction term $M{:}X_j$ (or $S{:}X_j$) still appears in the model, it is not eliminated; instead, the term with the next-highest p-value is considered. This prevents $X_j$ from being eliminated before its interaction.
      2.1b. Recalculate the reduced model.
      2.1c. Repeat steps 2.1a and 2.1b until only one explanatory variable is retained.
      2.1d. Calculate the ranks of the variables $X_j$, defined as the reversed order of the elimination: The variable eliminated first obtains rank $m$, the variable remaining in the last model obtains rank 1. Note that with this definition a high rank ("bad" rank) indicates poor significance whereas a low rank ("good" rank) indicates high significance.
   2.2. Average the rankings of all stations resulting from step 2.1. Drop the variable with the highest average rank and use this reduced model as the new starting model for step 2.1.
3. The result of the outer loop (step 2) represents the overall zonal ranking (reversed order of the elimination).

[14] Not all stations provide enough measurements to calculate the complete initial model. Therefore these stations are used in the inner loop only once enough variables have been eliminated (number of ozone measurements larger than number of coefficients to be estimated).

[15] This procedure is applied to each zonal band by using all the stations of the band. On the basis of the resulting rankings, a suitable model size is chosen taking some additional considerations into account (see section 2.5).

[16] The procedure may appear somewhat elaborate for selecting a model. However, the nested application of a selection method is needed for finding a common model for a collection of data sets. To our knowledge, this problem has not been addressed in the statistical literature. We will give more justification for it in the following subsection. For model selection in simple data sets, the traditional procedures are forward selection, backward elimination and a combination of both, often referred to as the stepwise procedure. Normally, these methods lead to a single model by applying a stopping rule. Since we are interested in ranking tables, we ignore stopping rules and retain only the sequence of variables identified by the procedure without stopping. The rankings from forward selection and backward elimination may differ, but general practice shows that such differences are usually minimal. We chose backward elimination, because it is known that forward selection may stop prematurely, and therefore backward elimination is generally preferred. The general stepwise procedure would not lead directly to a ranking and is therefore less suitable for our purpose.

[17] Some of the explanatory variables we use, especially climate indices and QBO, are often used with a time lag. In the case of QBO we use two different time series at different heights. Therefore the model can adjust the correct time lag directly [Bojkov and Fioletov, 1996]. However, this approach works only for (quasi) periodic time series, which is not the case for the other explanatory variables.
For these, the time series could be included with different lags as new (independent) variables, and the elimination process would select the optimal lags. An alternative would be to preselect an optimal lag using the correlation with total ozone. However, we decided not to use lagged time series for the following reasons: (1) Many climate indices do not show distinct lags [see, e.g., Appenzeller et al., 2000; Brönnimann et al., 2004]; (2) including time lags of up to six months for each variable would lead to too large a number of time series for the elimination procedure; (3) a preselection of a time lag requires an optimization based on ozone but ignoring all other variables, which contradicts the concept of our study to include all possible interactions; (4) the preoptimization possibly leads to an additional and not necessarily justified "advantage" in the elimination procedure compared to nonlagged variables; and (5) one of our main goals is that the whole optimization of model and coefficients is done in a single procedure.

2.4. Correlation

[18] Autocorrelation may significantly affect the p-values used for testing the terms of a model. However, as long as all important variables remain in the model, the structure of the autocorrelation of the errors must basically be the same. Because this is the case in our process of sorting out the unimportant variables according to significance, autocorrelation is expected to have only a small influence on our approach and therefore we did not include an autocorrelation term in our model. As a test of our hypothesis that autocorrelation does not influence the elimination process, we included an autocorrelation term in our calculations for Arosa (Switzerland), but this did not lead to major changes in the confidence intervals.

Figure 2. Boxplots of the rankings for 100 simulated ozone series performed to explore the stability of the determined ranking (here shown for Arosa, Switzerland; see text for more details). The horizontal line inside each box indicates the median. The notch represents the interval median $\pm 1.58\,\mathrm{QD}/\sqrt{n}$, where $n$ is the number of values (here 100) and QD is the difference of the first and third quartiles. The box covers the middle 50% of the data, and the dashed line covers all points which are closer than 1.5 QD to the box. All other values are indicated by single points. The right plot is an enlargement of the first twenty ranks. The y axis is the same as in the left plot.

[19] Correlations among explanatory variables cause problems due to collinearity [see, e.g., Kerzenmacher et al., 2006]. In our data sets, 5% of all pairs of variables showed a correlation higher than 0.5 or lower than −0.5. The variables can be clustered into five groups, each representing specific modes of climate variability or distinct stratospheric processes, such that high correlations only occur within groups (see Appendix A for a description of all explanatory variables). The groups are (1) climate indices in the Southern Hemisphere; (2) first mode of climate variability in the Northern Hemisphere (i.e., the Arctic or North Atlantic Oscillation); (3) second mode of climate variability in the Northern Hemisphere; (4) the QBO measured at two different pressure levels; and (5) station-related meteorological and dynamical variables such as tropopause pressure, temperature and PV at different levels.
Group 5 was also found to be correlated with Eliassen-Palm-Flux (EPF) and with the volume of polar air below the formation temperature of NAT.

[20] Some authors advocate orthogonalization of explanatory variables before statistical modeling, e.g., by principal component regression. The coefficients then refer to linear combinations of the original explanatory variables. Since the physical interpretation of these linear combinations is difficult, we refrain from applying such a method and rather explain the target variable by nonorthogonal but physically meaningful variables.

[21] Collinearity has a strong effect on a stepwise elimination process. If, for example, two variables are highly correlated, they are usually not simultaneously needed in the model. The elimination process will remove one of them early in the process, which leads to a high rank for the removed variable. The remaining variable, on the other hand, may attain a low rank, assuming that it is important. When processing many stations, one would then expect some low and some high ranks for both variables, and averaging them would lead to the conclusion that neither variable was really important. It is the purpose of the nested scheme to avoid this difficulty. If one of the two variables is eliminated in the outer loop, then it no longer appears in the inner loop, and the remaining variable will get a low rank for all stations, and therefore a low average rank. Hence the elimination process automatically leads to a decision between correlated variables.

2.5. Model Size and Performance Tests

[22] The result of an elimination process is a ranking table with a meaningful order of the variables. However, there is no direct information about the ideal number of variables that should be used in the model. To help in the decision on the optimal model size we used the "adjusted" version of the coefficient of determination, $R^2_{\mathrm{adj}}$ (which includes a penalty term for larger models [Draper and Smith, 1998]), and a sensitivity analysis based on the following simulations.

[23] We first calculated the fitted ozone values on the basis of the model including all explanatory variables. To these values we added normally distributed noise with a variance and autocorrelation structure similar to the residuals of the model. In this way we generated 100 different simulated ozone data sets for every single station. To each of these we applied a backward elimination procedure (step 2.1 of the algorithm described in section 2.3), leading to 100 different ranking tables for each station. Figure 2 shows the distributions of the ranks of each variable over these 100 repetitions for the station of Arosa, Switzerland, as an example. It demonstrates that the best rankings of the elimination process are relatively well defined (here the best five). In middle ranks the variation increases strongly, and an interpretation of higher ranks is not meaningful. Close inspection of the rankings of all stations shows that the four to seven best variables show a stable behavior in the sense that they always attain the best (i.e., lowest) ranks. Less significant variables, however, show increasing variability of ranks up to a level where an interpretation of the ranks is no longer reasonable.

[24] One problem of this procedure is that it likely yields too large a number of well defined variables, as the following argument shows: The simulated ozone time series have partly been constructed from variables which have no influence on ozone but, by chance, have relatively large nonzero coefficients (within their uncertainty limits). Since these estimated coefficients are used as true coefficients in the simulation, the corresponding variables will generally obtain a better rank than they deserve.

[25] As a second indication of an upper bound on the useful number of variables, we proceeded as follows: The selection procedure was run for each station until all variables included in the model had a formally significant coefficient (different from zero, tested with F-statistics at the 10% level, ignoring autocorrelation). In most cases the models contained up to 7 significant variables, as shown in Figure 3, except for the tropical region where many stations show only two or three significant variables.

Figure 3. Distribution of the number of significant terms determined for each station separately by eliminating formally insignificant terms. Many stations show around six to seven significant terms.

[26] Figure 4 shows the final result of the elimination process. It is clearly visible that the variables with the best ranking (up to rank 4 to 6) show the smallest differences between the regions. The two southern regions show the largest differences. This may be caused by the small number of stations. Variables with higher (worse) ranks behave in a less structured way, which suggests that they have negligible influence on total ozone.

[27] The final selected model sizes based on this analysis are presented in section 3.2.
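The nested elimination of section 2.3 can be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses a plain intercept instead of the month factor, treats every variable as a single-coefficient term (no $M{:}X_j$ or $S{:}X_j$ interactions), and does not handle stations with too few observations. Within one fitted model all coefficients share the same residual degrees of freedom, so the term with the highest p-value is the one with the smallest |t|; the function names are hypothetical.

```python
import numpy as np

def t_abs(A, y):
    """|t| statistic of every coefficient in the least-squares fit y ~ A."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    dof = A.shape[0] - A.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(A.T @ A)
    return np.abs(coef) / np.sqrt(np.diag(cov))

def station_ranks(X, y, active):
    """Inner loop (step 2.1): backward elimination for one station.

    Returns {variable index: rank}; rank 1 is the variable kept last.
    """
    remaining = list(active)
    ranks = {}
    while len(remaining) > 1:
        A = np.column_stack([np.ones(len(y)), X[:, remaining]])
        t = t_abs(A, y)[1:]                    # skip the intercept
        worst = remaining[int(np.argmin(t))]   # smallest |t| = highest p-value
        ranks[worst] = len(remaining)          # eliminated first -> highest rank
        remaining.remove(worst)
    ranks[remaining[0]] = 1
    return ranks

def zonal_ranking(stations, m):
    """Outer loop (step 2) over a list of (X, y) station data sets.

    Returns the variable indices ordered from best to worst zonal rank.
    """
    active = list(range(m))
    dropped = []
    while len(active) > 1:
        per_station = [station_ranks(X, y, active) for X, y in stations]
        mean_rank = {j: np.mean([r[j] for r in per_station]) for j in active}
        worst = max(active, key=lambda j: mean_rank[j])   # step 2.2
        dropped.append(worst)
        active.remove(worst)
    dropped.append(active[0])
    return dropped[::-1]

# Synthetic check: variables 0 and 1 matter, variables 2 and 3 are pure noise.
rng = np.random.default_rng(1)
stations = []
for _ in range(3):
    X = rng.normal(size=(200, 4))
    y = 5.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(size=200)
    stations.append((X, y))
ranking = zonal_ranking(stations, 4)
```

Re-running the inner loop after every outer-loop removal is what lets a variable recover a good rank once its correlated competitor has been dropped for the whole band.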

3. Results of the Statistical Analysis

3.1. Seasonal Dependencies

[28] In order to account for a possible seasonal dependence of the influence of different explanatory variables on total ozone, we used several different variations of the statistical model with or without the inclusion of monthly or seasonally varying coefficients. A seasonal differentiation only has a notable effect on the quality of the model in terms of explained variance in polar regions for EESC. In these cases the results of a model with seasonal dependency (using 4 values, represented as $S{:}\mathrm{EESC}$ for example) are comparable to those of a model with monthly dependency (12 values). Therefore the smaller seasonal model should be preferred. The remaining seasonal cycle of ozone not explained by the proxies, on the other hand, is better represented by monthly coefficients (represented as $M$) than by seasonal coefficients.

[29] One of our main goals is to develop a simple and comprehensive model for total ozone in all seasons optimized for zonal bands. Therefore we decided not to split up the data set into single months or seasons. The major part of seasonal cycles in total ozone can be captured by the seasonal cycle of the explanatory variables themselves in combination with a remaining annual cycle ($M$). Only for polar stations is a seasonal dependency for EESC required. Further seasonal dependencies do not lead to significant improvements of the model.

Figure 4. Overview of ranking tables of the variables of the individual stations. Horizontal axis shows mean rank of the explanatory variables, and vertical axis gives explanatory variables, sorted by the global mean rank. The five regions are represented by different symbols (see right plot). For abbreviations, see Appendix A. Each variable occurs twice, including and excluding the interaction term with season $S{:}X_j$. The right plot is an enlargement of the lower left part of the left plot. Because of the large number of explanatory variables, the vertical axis of the left plot is not labeled.

Figure 5. Proportion of the variance explained by the models in relation to the number of terms in the model for the different regions. The list describes the ranking list for the individual regions (top down: reverse order of elimination).

3.2. Optimized Models

[30] Figure 5 shows the adjusted coefficient of determination ($R^2_{\mathrm{adj}}$) for all five zonal bands in relation to model size. The explained variance for the tropical stations is clearly smaller than for the other regions. This is primarily caused by the smaller variability of total ozone in this region. Therefore the instrumental uncertainty is proportionally larger than in other regions (see Table 3 for the median standard deviation of the residuals). The other regions show a strong increase of $R^2_{\mathrm{adj}}$ in the first two to five terms but no further significant increase after the seventh term (please note that an explanatory variable including a seasonal dependency is counted here as two terms since it would need two steps to remove it according to our elimination process; see section 2.3). On the basis of these $R^2_{\mathrm{adj}}$ curves and Figure 3 we decided on the following model sizes for the different regions:

[31] In the north polar (NP) region, most stations show seven significant terms. This choice is consistent with $R^2_{\mathrm{adj}}$ (see Figure 5).

Table 3. Comparison Between the WMO and the Optimized Model (Both Including SAD) for the Median of the Five Regions in Terms of R2adj and Standard Deviation of the Residuals

        R2adj                                 sd(residuals)
Region  Optimized Model  WMO    Difference    Optimized Model  WMO      Difference
NP      92.6%            81.5%  +11.1%        14.3 DU          21.4 DU  −7.0 DU
NM      91.7%            81.5%  +10.1%        9.8 DU           14.2 DU  −4.4 DU
T       71.3%            74.3%  −3.0%         6.9 DU           6.3 DU   +0.5 DU
SM      91.6%            88%    +3.6%         7.5 DU           9.3 DU   −1.8 DU
SP      86.8%            69.7%  +17.1%        16 DU            23.1 DU  −7.0 DU


[32] In the northern midlatitudes (NM) region, the shape of R2adj is not conclusive, but the ranking is relatively stable in the first five ranks. Most stations show five or more significant terms (see Figure 3).

[33] In the tropical (T) region, the picture is not clear. The first three variables show a stable behavior in the elimination process. This group is followed by another group of six variables which reach similar ranks. This second group contains, among others, SF and QBO. Many stations show only two or three significant terms. Possible reasons for this behavior include (1) the smaller variability in total ozone compared to extratropical stations (see above) and (2) the fact that the class "tropics" includes subtropical as well as tropical sites (extending from 30°S to 33°N), where variability might be caused by different processes. On the basis of this result we cautiously decided on three terms only.

[34] In the southern midlatitudes (SM) region, neither figure shows a clear picture. In the light of the small amount of data we selected only four terms.

[35] In the south polar (SP) region, above six terms, the increase in R2adj is small and the first six variables show a stable behavior in the elimination process.

[36] The decisions described above lead to the following models representing the results of our optimization process (see Table 1 for the abbreviations):

\[
\begin{aligned}
\text{North polar:}\quad & \mathrm{TOZ} \sim M + \mathrm{EL} + S{:}\mathrm{EESC} + \mathrm{PSCVNC} + \mathrm{T50} + \mathrm{SAD} \\
\text{Northern midlatitudes:}\quad & \mathrm{TOZ} \sim M + \mathrm{EL} + \mathrm{EESC} + \mathrm{T10} + \mathrm{SAD} \\
\text{Tropical:}\quad & \mathrm{TOZ} \sim M + \mathrm{EL} + \mathrm{EESC} \\
\text{Southern midlatitudes:}\quad & \mathrm{TOZ} \sim M + \mathrm{EL} + \mathrm{QBO30} + \mathrm{T50} \\
\text{South polar:}\quad & \mathrm{TOZ} \sim M + \mathrm{EL} + S{:}\mathrm{EESC} + \mathrm{T50} + \mathrm{PV470}
\end{aligned}
\tag{3}
\]

Figure 6. Mean correlation in percent of the finally used variables of the five zonal bands.
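A model of the form used in equation (3), for instance the tropical model TOZ ~ M + EL + EESC, can be read as an ordinary least-squares regression in which M contributes one level per calendar month. The following is a minimal sketch under that reading, not the authors' implementation: the function name `fit_zonal_model`, the synthetic series, and all numerical values are assumptions made for illustration only.

```python
import numpy as np

def fit_zonal_model(toz, month, proxies):
    """Least-squares fit of TOZ ~ M + proxies, with M as a 12-level month factor."""
    toz = np.asarray(toz, dtype=float)
    month = np.asarray(month, dtype=int)                    # calendar month, 1..12
    M = (month[:, None] == np.arange(1, 13)).astype(float)  # one indicator column per month
    P = np.column_stack([np.asarray(v, dtype=float) for v in proxies.values()])
    A = np.hstack([M, P])                                   # month levels take the place of an intercept
    beta, *_ = np.linalg.lstsq(A, toz, rcond=None)
    resid = toz - A @ beta
    n, k = A.shape                                          # k counts the 12 month levels plus the proxies
    r2 = 1.0 - (resid @ resid) / ((toz - toz.mean()) ** 2).sum()
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k)
    return dict(zip(proxies, beta[12:])), r2_adj, resid

# Synthetic monthly data (hypothetical values, 30 years)
rng = np.random.default_rng(1)
n = 360
month = np.arange(n) % 12 + 1
el = rng.normal(0.0, 15.0, n)        # stand-in for the EL proxy
eesc = np.linspace(0.0, 3.0, n)      # stand-in for EESC
toz = (300.0 + 20.0 * np.cos(2 * np.pi * (month - 3) / 12)
       + 0.8 * el - 10.0 * eesc + rng.normal(0.0, 3.0, n))

coef, r2_adj, resid = fit_zonal_model(toz, month, {"EL": el, "EESC": eesc})
print(coef, round(r2_adj, 3), round(resid.std(ddof=1), 2))
```

The adjusted coefficient of determination and the standard deviation of the residuals returned here correspond to the two quality measures reported in Tables 3 and 4. A seasonally dependent term such as S:EESC would need one EESC column per season, which this sketch does not include.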

[37] In all five regions the dynamical influence represented by the equivalent latitude proxy (EL) is selected by the elimination process, in four of them on the best rank. The residual seasonal cycle (M) was selected in all regions too. Less outstanding but also important in all regions (except southern midlatitudes) is the anthropogenic ozone depletion described by the EESC time series. The selection of the other variables varies between the regions. The often used QBO and solar cycle (SF) show good rankings at tropical and midlatitudes stations. The importance of the QBO and the solar cycle on total ozone in the tropics has also been shown in previous studies [e.g., van Loon and Labitzke, 2000; Baldwin et al., 2001]. However, regarding the number of significant variables (see Figure 3) we decided not to include them in the final model for the tropical region. Also remarkable is the influence of volcanic eruptions (variable SAD) and the local temperature at 10 hPa (T10) in northern midlatitudes. In the Southern Hemisphere more dynamical variables apart from EL, such as temperature and potential vorticity, obtain good ranks. Given the small number of stations, this result should be interpreted with care. Nevertheless, EL is an important variable also in the Southern Hemisphere, in agreement with the results for the other regions. [38] Figure 6 shows the mean correlation structure of the used variables for the five zonal bands. The correlations between the selected variables are generally low which allows a good interpretation of the results. Only at southern extra tropical stations the dynamical variables show higher correlations (see also section 4.1). 3.3. Comparison With the WMO Model [39] The model used in the previous WMO assessments was compared to the optimized model (equation (3)) on the basis of the long-term series of Arosa, as well as on the basis of the complete data sets of each region. 
Since both models use slightly different approaches, the following adaptations were made to enable the comparison: (1) Instead of using deseasonalized data, the month factor M was introduced in the WMO model; (2) the coefficient of EESC in our model was estimated independently for each month (M:EESC) since the WMO model does the same for the linear trend (M:LT); (3) instead of using a lagged QBO we used two QBO time series at two pressure levels allowing the model to adjust the correct time lag automatically [Bojkov and Fioletov, 1996]; and (4) the explanatory variable SAD is also included in the WMO model. This adaptation leads to the following two models:

\[
\text{WMO model:}\quad \mathrm{TOZ} \sim M + M{:}\mathrm{LT} + \mathrm{QBO30} + \mathrm{QBO50} + \mathrm{SF} + \mathrm{SAD}
\tag{4}
\]

\[
\text{Optimized model:}\quad \mathrm{TOZ} \sim M + M{:}\mathrm{EESC} + \mathrm{EL} + \mathrm{T10} + \mathrm{SAD}
\tag{5}
\]

[40] In Figure 7 and Table 4 the two models are compared on the basis of the adjusted coefficient of determination, R2adj, and the residuals, that is, the differences between observed and fitted values of the model, using the measurements of Arosa. The optimized model shows a clear improvement compared to the WMO model in all extratropical regions. Table 3 shows the median improvement per region of our optimized models of section 3.2 compared to the WMO model as shown in equation (4). All extratropical stations (except three at northern midlatitudes) show higher R2adj values for the optimized model.

Figure 7. (top) Time series of total ozone at Arosa, Switzerland (monthly (black) and annual (grey) means), and residuals of the two statistical models (see text). (middle) Residuals for the WMO model including (black) and excluding (grey) SAD (volcanic influences). The grey curve is only visible after large volcanic eruptions (around 1983 and 1993); otherwise it is hidden by the black curve. (bottom) Same for the optimized model.

Table 4. Comparison Between the WMO and the Optimized Model for Arosa (Switzerland) in Terms of R2adj and Standard Deviation of the Residuals

                 R2adj                      sd(residuals)
                 Optimized Model  WMO       Optimized Model  WMO
Excluding SAD    93.2%            82.7%     8.7 DU           13.7 DU
Including SAD    93.6%            83.7%     8.4 DU           13.3 DU

4. Interpretation of the Statistical Model Results

[41] In four of the five regions, the variable with the largest influence is EL, which mainly represents the dynamical influence exerted by planetary waves. At midlatitudes to high latitudes, EL is closely connected to the polar vortices. Meridional excursions of polar air masses have a large impact on total ozone values at stations which are alternately located inside and outside of the vortex domain. A detailed inspection of EL and T10 is given in section 4.1.

[42] The QBO and solar cycle, which were widely used in earlier studies, show relatively strong impacts on total ozone in the tropics, where their influence is comparatively direct and comprehensible. Outside the tropics their influence is indirect and consists of modulating the circulation [Lawrence et al., 2000; Ruzmaikin and Feynman, 2002] and the wave propagation. In our results the effects of QBO and SF outside the tropics are rather small. Their effects, however, may be indirectly included by other variables such as T50. Over the Arctic, for instance, Labitzke and van Loon [2000] found a strong correlation between SF and lower stratospheric temperatures in February when the QBO is in the westerly phase.

[43] The anthropogenic influence (EESC) is clearly visible in all regions, although in southern midlatitudes it appears only on rank 10 (see Figure 5). Because of the small number of stations and the correlation between EESC and some local variables in this region, this may be just an artifact of the elimination process. As mentioned in section 2.1, we prefer the use of EESC instead of a synthetic linear trend, because this time series reflects the expected anthropogenic influence in a more realistic way than just a single slope.

[44] Large volcanic eruptions and the subsequent aerosol loading in the stratosphere (SAD) can cause a decrease in total ozone after the eruption [Randel et al., 1995]. The eruption of Mount Pinatubo in 1991, for example, led to very large SAD values in the following two years for the stations in the northern midlatitudes and, to a lesser degree, in the neighboring zones. Total ozone decreased correspondingly. The importance of this process is visible in our analysis by the fact that SAD attained rank 5 in northern midlatitudes and ranks 6 and 7 at northern polar latitudes and in the tropics, respectively. In southern regions it reached higher


ranks, in agreement with previous findings that the effect of Pinatubo was stronger in the Northern Hemisphere [Randel et al., 1995]. Excluding SAD leads to larger residuals in the time period following volcanic eruptions as shown by the grey line in Figure 7. By including SAD, the residuals are of the same order as during quiescent time periods not affected by volcanic eruptions. We prefer to include SAD in the model instead of excluding the data following the Mount Pinatubo eruption as did by Reinsel et al. [2005] for example. 4.1. Representation of Dynamical Processes [45] The influence of dynamical processes on total ozone has been known for a long time [Dobson and Harrison, 1926]. It was represented in statistical models in different ways, by climate indices such as AO, NAO and ENSO, by local variables such as tropopause pressure, temperature, geopotential height, and potential vorticity (PV) evaluated at different levels above a station, or by zonal winds and finally by EP flux. A comprehensive overview of the use of these approaches in a number of studies is given in our companion paper [Wohltmann et al., 2007]. [46] The relation between changes in ozone and the dynamical proxies is not always straightforward. Changes in tropopause altitude, for instance, do not simply lead to a compression or expansion of the stratospheric ozone layer but are at the same time associated with meridional transport of air from regions with higher or lower climatological mean ozone concentrations [Salby and Callaghan, 1993; Koch et al., 2005; Wohltmann et al., 2007]. [47] Therefore we have looked for a dynamical proxy which describes the effects of meridional excursions and vertical compression of isentropic layers on total ozone columns in a more direct way. This led to the development of a new proxy which quantifies the effect of these processes directly. 
The proxy is obtained by integrating a climatological ozone field, given as a function of potential temperature and equivalent latitude, along the equivalent latitude profile computed for each station and each measurement separately. The method is described in detail by Wohltmann et al. [2005]. The proxy is based on the assumption that short-term meridional transport, which is mainly associated with planetary waves in the stratosphere, is adiabatic and frictionless to first order. For these conditions PV, as well as potential temperature, is conserved simultaneously [Hoskins et al., 1985]. Further, it is assumed that the lifetime of ozone is long enough to be considered as a passive tracer on the typical timescales of this transport. [48] The PV-equivalent latitude and potential temperature may therefore be used as a coordinate system in which passive tracers are approximately conserved on a short timescale. This property has been used for mapping stratospheric trace gas observations into a physically meaningful reference framework [Lait et al., 1990; Schoeberl et al., 1989] or for comparing spatially and temporally noncoincident observations from different platforms [Lait et al., 2004]. However the EL proxy is expected to provide erroneous results in case of a permanent displacement of isentropes that can alter photochemical and dynamical conditions and in turn change the ozone mixing ratios at that isentropes. This particular aspect is discussed by Wohltmann et al. [2007].


[49] The introduction of this new EL proxy in the statistical model leads to the removal of most other dynamical variables by the elimination process. The fact that EL reaches the best rank in four of the five regions demonstrates its ability to account for a large part of dynamical variability in the stratosphere. The EL proxy can be approximated by the other local variables using the following equation:

\[
\mathrm{EL} \sim M + \mathrm{T10} + \mathrm{T50} + \mathrm{T150} + \mathrm{T300} + \mathrm{PTP} + \mathrm{PV340} + \mathrm{PV400} + \mathrm{PV470} + \mathrm{PV550} + \mathrm{PV650}
\tag{6}
\]
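Equation (6) is an ordinary multiple regression of EL on a month factor and local variables, so a fit obtained where EL is known can predict EL where it is missing. The sketch below illustrates that idea on synthetic data; the helper `extend_series`, the use of only three stand-in predictors instead of the ten local variables of equation (6), and all numbers are assumptions for illustration.

```python
import numpy as np

def extend_series(target, month, predictors):
    """Fill NaN gaps in `target` with predictions from target ~ M + predictors."""
    target = np.asarray(target, dtype=float)
    month = np.asarray(month, dtype=int)                    # calendar month, 1..12
    M = (month[:, None] == np.arange(1, 13)).astype(float)  # month factor as indicator columns
    A = np.hstack([M, np.asarray(predictors, dtype=float)])
    known = ~np.isnan(target)                               # rows where the target is available
    beta, *_ = np.linalg.lstsq(A[known], target[known], rcond=None)
    filled = target.copy()
    filled[~known] = A[~known] @ beta                       # predict only where the target is missing
    return filled

# Synthetic example: the first 24 months of the target series are missing
rng = np.random.default_rng(2)
n = 120
month = np.arange(n) % 12 + 1
X = rng.normal(size=(n, 3))                                 # stand-ins for local variables
truth = (50.0 + 5.0 * np.sin(2 * np.pi * month / 12)
         + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(0.0, 0.1, n))
el = truth.copy()
el[:24] = np.nan
filled = extend_series(el, month, X)
```

Observed values are left untouched; only the gaps are replaced by regression predictions.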

[50] We used this equation to extend the EL time series to the periods not covered by the ERA-40 data set (ECMWF reanalysis, data from September 1957 to December 2003). The R2adj of this regression model is always above 95%, and it is above 98% at 88% of all stations. [51] In earlier studies, strong correlations between ozone and local meteorological variables (e.g., temperatures at 100 hPa [WMO, 1993]) were documented, but the scientific interpretation of such relations was controversial. The close link of these variables with EL supports our interpretation that they are largely related to dynamical processes. However, each of these variables appears to reproduce only a limited fraction of the overall dynamical variability and can only represent the processes in a certain altitude range. The EL proxy, on the other hand, is able to describe the vertically integrated effect, at least to a large extent. [52] The strong impact of T10 in the northern midlatitudes (rank 3) is more difficult to interpret. If T10 were primarily a proxy for variations in gas phase chemical reaction rates, a negative coefficient would be expected. This means that a positive temperature anomaly would lead to a negative ozone anomaly. However, at only 2 out of the 95 stations in the northern zonal bands the corresponding coefficient is negative. A second process related to temperature is the Brewer-Dobson (BD) circulation. An enhanced BD circulation leads to faster transport of air from the tropics to the extratropics. Ozone in the lower tropical stratosphere is reduced in this case because the faster upwelling leaves less time for the air to approach its radiative equilibrium ozone concentration. For the same reason more ozone is produced in the tropics per unit of time which, however, is transported away more rapidly to the extratropics where ozone concentrations are larger than normal in this case. 
At the same time, a strong BD circulation leads to a negative temperature anomaly in the tropics and a positive anomaly at high latitudes due to adiabatic expansion and compression, respectively. Hence variations in the BD circulation are associated with positively correlated variations in temperature and ozone [Salby and Callaghan, 2002; Randel et al., 2002]. This positive temperature anomaly will additionally decrease the ozone destruction by heterogeneous chemistry. This process would be in agreement with the sign of the coefficients in our model. A third explanation for the good ranking of T10 is related to the calculation of the EL proxy. In the absence of the EL proxy, T10 is usually eliminated very early and hence attains only poor rankings. Integration of EL in the model improves the rankings of T10 significantly. One explanation would be that the EL proxy overestimates the


effects of transport processes at 10 hPa which then needs to be compensated by the T10 proxy.

Figure 8. Quantitative influence of the variables of the optimized models. The height of each bar represents the mean of standardized coefficients averaged over all stations of a region (see equation (7)). The vertical bar marks the annual mean confidence interval (95%) disregarding autocorrelation.

[53] In the Southern Hemisphere other local dynamical variables also show good rankings (T50 and PV400). Because of the small number of stations, an interpretation of these results, however, would be questionable. Additionally, T50 is more strongly correlated with EESC in this region than in others [Steinbrecht et al., 2003] and may therefore replace EESC during the elimination process.

4.2. Model Run With Data Starting at 1979

[54] Whereas in the two northern regions the data set is very large and the results are stable and reliable, the situation for the three other regions is more difficult. In particular, the dynamical data before 1979 for the Southern Hemisphere are not as reliable as for the Northern Hemisphere [Uppala et al., 2005]. Therefore we reprocessed our model using only data after 1979. The results of this analysis do not strongly differ from the results of the analysis including the years before 1979, but there are some smaller reorderings of the ranks. These reorderings are, as expected, rather small in the two northern regions and larger in the two southern regions, but not so large as to justify a revision of our selection. In the tropics the situation is the most complex because (1) the variance of total ozone is relatively low, (2) the data quality of some total ozone series may be doubtful, (3) the EL is more problematic than in the other regions but still reliable [Brunner et al., 2006b], and (4) this region has the highest internal variability as it includes both tropical and subtropical stations.
Nevertheless, also the changes at the tropical stations are not larger than in the other regions when the analysis only includes data after 1979. 4.3. Quantitative Influence on Ozone [55] The elimination process is based on the p-values of a regression model and is therefore primarily driven by the significance of the variables. To compare the quantitative

influence of a variable on total ozone, we used standardized coefficients given by

\[
c_j = b_j \, \frac{\mathrm{sd}_j}{\mathrm{sd}_{\mathrm{TOZ}}},
\tag{7}
\]

where c_j is the standardized coefficient of variable X_j, b_j is the coefficient of explanatory variable X_j, sd_j is the standard deviation of variable X_j, and sd_TOZ is the standard deviation of total ozone.

[56] Standardized coefficients are dimensionless and therefore comparable within a model as well as between different models. A value of c_j = -0.5 for a standardized coefficient means that total ozone decreases by one half of its own standard deviation if the variable X_j increases by one standard deviation. As shown in Figure 8, the annual ozone cycle M, the dynamics (described by EL) and the anthropogenic influence (EESC) are most important in all regions. The other variables have a much smaller direct influence on total ozone. However, this does not imply that these variables can be removed, since they improve the quality of the model, e.g., by reducing the uncertainty of the other coefficients.

4.4. Trends

[57] We also used our optimized model to attribute the trends in total ozone (between 1970 and 1995) to the different explanatory variables. The trend influence was calculated by

\[
I_j = b_j \, T_j,
\tag{8}
\]

where I_j is the trend influence of variable X_j on total ozone trends, b_j is the coefficient of explanatory variable X_j, and T_j is the simple linear trend of variable X_j (units per year) obtained by linear regression of the variable against time after 1970.
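Equations (7) and (8) are both simple rescalings of a fitted regression coefficient. A minimal numerical sketch follows; the function names, the use of the sample standard deviation (ddof=1), the slope estimate via `np.polyfit`, and the synthetic series are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def standardized_coefficient(b_j, x_j, toz):
    """Equation (7): c_j = b_j * sd_j / sd_TOZ (dimensionless)."""
    return b_j * np.std(x_j, ddof=1) / np.std(toz, ddof=1)

def trend_influence(b_j, x_j, years):
    """Equation (8): I_j = b_j * T_j, with T_j the linear slope of x_j per year."""
    T_j = np.polyfit(years, x_j, 1)[0]   # slope of the degree-1 fit, in units of x_j per year
    return b_j * T_j

# Hypothetical proxy rising by 0.1 units per year, with a coefficient of -20 DU per unit
years = 1970 + np.arange(312) / 12.0     # monthly time axis, 1970 through 1995
x = 0.1 * (years - 1970)
b = -20.0
toz = 320.0 + b * x                      # ozone series driven by this proxy alone

c = standardized_coefficient(b, x, toz)
I = trend_influence(b, x, years)
print(c, I)
```

In this constructed case ozone depends on the single proxy only, so the standardized coefficient is -1 and the attributed trend is -2 DU per year; with several explanatory variables the magnitudes of the standardized coefficients would be smaller.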


Figure 9. Seasonal contribution of the selected variables to the ozone trends (after 1970, see equation (8)). The first group represents the observed linear mean ozone trend for the region (O3), and the second group represents the trend in the model (FIT), which is the sum of the trends attributed to the explanatory variables (remaining groups). The scale of the trends is different for each panel. The vertical bar marks the confidence interval (95%).

[58] The results are shown in Figure 9. As expected, the magnitude of the trend is lowest and not significant in the tropics and increases poleward. Also clearly visible is the dominating role of anthropogenic emissions (EESC) at the polar stations. In (northern) midlatitudes the dynamics (EL) also have an important influence on trends, of a similar magnitude as the changes due to EESC. This effect is particularly strong during the Northern Hemispheric winter (DJFM).

[59] In contrast to other current studies [e.g., Reinsel, 2002; Reinsel et al., 2002, 2005; Newchurch et al., 2003; Steinbrecht et al., 2004; Weatherhead and Andersen, 2006; Brunner et al., 2006a], our analysis is not focused on trends and their possible changes.

5. Conclusions

[60] Optimized statistical models to describe the long-term total ozone measurements of 158 ground-based stations were developed for five zonal bands representing the tropics and the midlatitude and polar regions in both hemispheres. In this study the variability is described on the basis of monthly mean data, but other statistical methods (such as wavelet transformation) might be more suitable to describe ozone variability on other timescales [Borchi et al., 2006]. The optimized models were obtained using a systematic elimination procedure for selecting the most significant

variables from an initial set of 44 potential explanatory variables. The elimination procedure does not lead to the same selection for all stations of a given zonal band. Nevertheless, in view of the small differences, it makes sense to define standard models for larger regions instead of individual solutions for every location. On the other hand, the differences between the five regions are too large to yield a satisfactory global variable selection. [61] In four bands, the largest part of variance of monthly mean total ozone values can be attributed to the dynamics represented by the proxy EL which describes the effects of short-term isentropic transport [Wohltmann et al., 2005]. At the fifth region (south polar) the variable PV470, representing dynamics on the 470 K isentrope, reaches the best rank and EL appears at the third place. The anthropogenic influence on total ozone described by EESC is also discernible in all regions. In the two polar regions this influence shows a pronounced dependency on season, in contrast to the other regions. The influence of volcanic eruptions is primarily visible in the northern midlatitudes and still discernible in the neighboring regions. By introducing this explanatory variable, a large amount of variance around the eruption of Mt Pinatubo may be covered, thus preventing the need of excluding this period from analysis. [62] Many statistical models, used for example in previous WMO assessments, employed ‘‘global variables’’ such as QBO, solar cycle, and climate indices to describe the

influence of natural variability on ozone. Global in this context means that the same time series is used for all stations. As opposed to this, our models take advantage of "local variables" such as EL and meteorological quantities defined at different pressure levels, which are specified for each station separately. Our results suggest that the effects of large-scale forcings such as NAO or the solar cycle are largely integrated into changes in the local variables. Since local variables may better reproduce the different effects of these forcings on the different stations than global ones, the latter usually drop out earlier during the elimination process than their local counterparts. Using this approach, we are therefore able to cover a large amount of the variance in total ozone by simple multiple regression models including only 3 to 7 variables. The standard deviation of the residuals (measurements minus model prediction) is only of the order of 2 to 3% of the total ozone columns, which is much lower than in earlier regression models including global variables only.

Table A1. Web Addresses of Data Sets Used in This Study (See Appendix A)

Data Set  Address
TOZ       ftp://woudc:woudc*@ftp.tor.ec.gc.ca/Archive-NewFormat/totalozone_1.0_1,
          http://www.antarctica.ac.uk/met/jds/ozone/
NCEP      http://www.cdc.noaa.gov/Datasets/ncep.reanalysis
EL        http://www.ecmwf.int/research/era/
NAO       http://www.cru.uea.ac.uk/ftpdata/nao.dat
NAO_H     http://jisao.washington.edu/data_sets/nao/nao.ascii
AO        NH: http://www.cpc.ncep.noaa.gov/products/precip/CWlink/daily_ao_index/monthly.ao.index.b50.current.ascii,
          SH: http://www.cpc.noaa.gov/products/precip/CWlink/daily_ao_index/aao/monthly.aao.index.b79.current.ascii
SST       http://www.jisao.washington.edu/data/globalsstenso/globalsstenso18002004.ascii
SOI       http://www.cgd.ucar.edu/cas/catalog/climind/SOI.signal.annstd.ascii
JAK       http://www.jisao.washington.edu/data/jakarta_slp
PNA       http:///www.jisao.washington.edu/data/pna/pna19482004.dat
PC1-3     NH: http://jisao.washington.edu/analyses0302/slpanompc1948may2005daily.ascii,
          SH: http://jisao.washington.edu/data/aao/850zpcsh19482002.ascii
NHTP      http://www.cpc.ncep.noaa.gov/data/teledoc/telecontents.shtml,
          ftp://ftp.cpc.ncep.noaa.gov/wd52dg/data/indices/tele_index.nh
PDO       ftp://ftp.atmos.washington.edu/mantua/pnw_impacts/INDICES/PDO.latest
SAD       http://www.giss.nasa.gov/data/strataer/tau_line.txt
EESC      http://dataservice.eea.eu.int/dataservice
QBO       ftp://ftp.ncep.noaa.gov/pub/cpc/wd52dg/data/indices/Old_data/singa50,
          ftp://ftp.ncep.noaa.gov/pub/cpc/wd52dg/data/indices/qbo.u50.index,
          ftp://ftp.ncep.noaa.gov/pub/cpc/wd52dg/data/indices/Old_data/singa30,
          ftp://ftp.ncep.noaa.gov/pub/cpc/wd52dg/data/indices/qbo.u30.index
SF        ftp://ftp.ngdc.noaa.gov/STP/SOLAR_DATA/SOLAR_RADIO/FLUX/MONTHPLT.ADJ
PANC      http://www.cdc.noaa.gov/Datasets/ncep.reanalysis/pressure
PSCVNC    http://www.awi-potsdam.de/www-pot/atmo/candidoz
EPF       http://www.awi-potsdam.de/www-pot/atmo/candidoz

Appendix A: Description of the Data Set

[63] Most of the used data are available through the Internet. The URLs are listed in Table A1, and the abbreviations are summarized in Table 1. They are available if not indicated differently from January 1948 to July 2006. A1. Total Ozone Values [64] Total ozone measurements were obtained from the WOUDC and the BAS. In this study we included measurements of Dobson (94 stations) and Brewer (24) spectrophotometers and the Russian filter instruments (40) [see, e.g., Staehelin et al., 2001]. At stations with instrumental changes the records were treated as separated series. The 158 selected sites are shown in Figure 1. Continuous

measurements before 1950 are only available for the station Arosa (Switzerland [Staehelin et al., 1998]). Measurements at several stations started in 1958 because of coordinated activities during the International Geophysical Year and the number of selected stations was approximately constant since the middle of the 1970s. [65] Data quality of some total ozone records, in particular in the early years, is debatable and remains a significant problem. The data quality assurance program of WMO, including regular intercomparisons of the various Dobson instruments with standard instruments, started in the 1970s. Some of the total ozone series stored at WOUDC were excluded because of instrumental problems documented in earlier WMO reports [WMO, 1999]. Parts of the records of some stations showing obvious discontinuities were excluded after visual inspection. In this way, a total of 12 periods of 9 different total ozone series were excluded from the analysis [see Ma¨der, 2004]. No additional attempts were made to exclude or correct suspicious measurements. A2. Local Variables [66] Most of the local variables are from the reanalysis of the National Center of Environmental Prediction (NCEP). They were extracted with linear interpolation from the grid points nearest to the individual stations and only for those days on which total ozone measurements were available (using linear interpolation to calculate a value for twelve noon local time). [67] 1. The temperatures (°C) at four pressure levels are as follows: T300 (at 300 hPa, approximately 9 km asl), T150 (at 150 hPa), T50 (at 50 hPa, corresponding to the altitude of ozone maximum in partial pressure at midlatitudes, approximately 22 km asl), and T10 (at 10 hPa, approximately 32 km asl).


[68] 2. PTP is pressure (hPa) at tropopause altitudes for the thermal tropopause definition. For more information see NCEP Web site (URL given in Table A1). [69] 3. Potential vorticities (PV) at the altitudes of five isentropes are as follows: PV340 (at 340 K, lowermost stratosphere), PV400 and PV470 (at 400 and 470 K, lower stratosphere), and PV550 and PV650 (at 550 K and 650 K, middle stratosphere, close to 40 and 20 hPa, respectively). [70] The equivalent latitude weighted by a (monthly and zonal) climatological ozone profile was calculated by Wohltmann on the basis of the ERA-40 data set [Wohltmann et al., 2005]. In time periods where the ERA-40 data set is not available (before September 1957 and after December 2003) we used a simple regression model based on the local data from NCEP and a seasonal component to extend the EL time series (equation (6)). A3. Global Variables [71] These variables are globally defined which means that there is only one single global value even if the definition might be of local nature (e.g., North Atlantic Oscillation). A3.1. Climatic Indices [72] For the climatic indices of NAO and PNA we used data from two different sources. Those from the NHTP data set (see below) will be marked with an index T in the text. For the proxies AO and PC1-3, either the Northern or Southern Hemisphere data set was used depending on the location of the station. Therefore the Antarctic Oscillation (AAO) is referenced as AO too. [73] 1. The North Atlantic Oscillation (NAO) index gives the pressure difference between Iceland and Azores. It is an important climate index for the Atlantic region, particularly in winter. It is provided by the University of East Anglia, Norwich (United Kingdom). The period is January 1948 to February 2006. [74] 2. The North Atlantic Oscillation (NAOH) index gives yearly values according to the definition of Hurrell [1995]. It is provided by the Joint Institute for the Study of the Atmosphere and Ocean (JISAO). 
The period is January 1948 to December 2000.

[75] 3. The Arctic/Antarctic Oscillation (AO) index gives the climate pattern of the Northern/Southern Hemisphere derived from the hemispheric sea level pressure field. It is provided by NOAA. The period is January 1979 to February 2006 for SH and January 1950 to February 2006 for NH.

[76] 4. The Pacific-North American Pattern (PNA) index gives the dominant component of low-frequency oscillation of the extratropical Northern Hemisphere, consisting of a quadrupole including poles over the Aleutian and southeast of USA and Hawaii and northern America/central Canada. We used the definition of Wallace and Gutzler [1981]. It is provided by JISAO. The period is January 1948 to September 2004.

[77] 5. The Global-SST ENSO (SST) Index gives the averaged anomaly of sea surface temperature between 20°S and 20°N minus values outside this area. It is provided by JISAO. The period is January 1948 to November 2004.

[78] 6. The Southern Oscillation Index (SOI) is calculated from the monthly averages of sea surface pressure anomalies between Tahiti and Darwin. It is provided by the National Center for Atmospheric Research (NCAR). The period is January 1948 to February 2006.

[79] 7. Sea level pressure at Jakarta (JAK) data are provided by JISAO. The period is January 1948 to February 2006.

[80] 8. The three first principal components of the pressure anomaly north of 20°N are PC1, PC2, and PC3. Note that PC1 is strongly related to AO/AAO and, in the Northern Hemisphere, PC2 to PNA. They are provided by JISAO. The period is January 1948 to December 2002 for SH and January 1948 to March 2006 for NH.

[81] 9. The Pacific Decadal Oscillation (PDO) Index gives the first principal component of the EOF analysis of the sea surface temperature (SST) between November and March north of 20°N. Long-term changes were removed from the data. It is provided by JISAO. The period is January 1948 to September 2004.

[82] 10. The Northern Hemisphere Teleconnection Patterns (NHTP) are AS, EA, EAJ, EAWR, EP, NAOT, NP, PE, PNAT, PT, SCA, SZ, TNH, and WP. This data set is derived from a rotated principal component analysis (RPCA) based on the anomaly of the pressure plane at 700 mbar. For calculation of the monthly means, the values of the preceding and the following months were also included in the analysis. If the teleconnection patterns were not important they were set to zero. The following indices and patterns are used: Asian Summer (AS), East Atlantic (EA), East Atlantic Jet (EAJ), East Atlantic-Western Russian (EAWR), East Pacific (EP), North Atlantic Oscillation (NAOT), North Pacific (NP), Polar-Eurasia (PE), Pacific-North America (PNAT), Pacific Transition (PT), Scandinavia (SCA), Subtropical Zonal (SZ), Tropical-Northern Hemisphere (TNH), and West Pacific (WP). They are provided by NOAA. The period is March 1950 to November 2005.

A3.2. Remaining Global Variables

[83] 1. M (month) and S (season) are synthetic variables to describe the seasonal variation.
In the final model M is only used for the remaining annual ozone cycle and has 12 different values for each month (but the same for all years). S is used to describe the seasonal dependencies of the explanatory variables. It is similar to M but has only 4 values for the different seasons (DJFM, AM, JJA and SON). See also section 3.1. [84] 2. Vertically integrated surface area density (SAD) of stratospheric aerosol is measured from satellite [Thomason et al., 1997] (available for 32 zonal bands) extended by solar light absorption at 550 nm. The data after 2000 were set to zero as recommended by the responsible scientific staff of NASA. There are 32 zonal bands. The data are provided by NASA. The period is January 1948 to July 2006. [ 85 ] 3. Effective Equivalent Stratospheric Chlorine (EESC) describes anthropogenic ozone depletion by ozone depleting substances such as chlorofluorocarbons (CFCs) and halons [WMO, 2003]. Note that this time series includes a time lag corresponding to a mean transport time to the stratosphere (note that we did not use a different EESC values for polar regions). The data are provided by the European Environment Agency (EEA). The period is January 1948 to July 2006.


[86] 4. The 11-year solar cycle (SF) describes the intensity of solar radio flux at 10.7 cm from measurements at Penticton, Canada [Covington, 1969; Jain and Hasan, 2004]. The data are provided by NOAA. The period is January 1948 to May 2006.

[87] 5. The Quasi-Biennial Oscillation (QBO30 and QBO50) series is a combination of two data sources: the older series from Canton (1953–1967), Gan (1967–1975), and Singapore (after 1976), and the more recent data from the NCEP reanalysis. The older series were adapted to the NCEP data by a linear transformation whose coefficients were determined from the overlapping period. The NCEP data series shows an inhomogeneity in 1978 [Huesmann and Hitchman, 2003] and was used only after 1979. The use of two series on different pressure levels (30 and 50 mbar) allows the statistical model to estimate an optimized lag for each station by calculating a superposition of them [Bojkov et al., 1995]. The data are provided by NOAA. The period is January 1953 to May 2006.

[88] 6. The polar area at 50 hPa below T_NAT (PANC) is an explanatory variable that was introduced to describe Arctic ozone depletion. It was calculated in two steps: (1) The area north of 60°N at 50 hPa with temperatures below 195 K, used as a proxy for the equilibrium temperature of nitric acid trihydrate (NAT), was determined from NCEP reanalysis data. NAT is one of the most important compounds that induce heterogeneous ozone depletion over the Arctic [see, e.g., Peter, 1997]. (2) The areas were accumulated over the months from December to the month of the ozone measurement because it has been empirically shown that the Arctic ozone depletion is proportional to the accumulated PSC volume [Rex et al., 2004]. The influence of polar ozone depletion mainly reaches the midlatitude stations after the breakdown of the polar vortex, which usually occurs in late winter or early spring. Polar ozone depletion in winter also influences the ozone measurements at extratropical stations in the following months until ozone is replenished by transport from the tropics by the Brewer-Dobson circulation [Fioletov and Shepherd, 2003]. The variable is calculated from NCEP data. The period is January 1948 to December 2005.

[89] 7. Eliassen-Palm flux (EPFC) is used to describe the strength of the residual circulation (one time series for each hemisphere). It is used in a cumulative way starting before winter (October in the Northern Hemisphere and April in the Southern Hemisphere). The data are provided by the EU project CANDIDOZ. The period is October 1948 to December 2004.

[90] 8. PSC volume below T_NAT (PSCVNC) is similar to PANC, but PSC volume has been calculated by counting all grid points below the formation temperature of NAT north of 60°N and weighting them with their volume. NAT formation temperatures have been calculated according to Hanson and Mauersberger [1988] as a function of nitric acid mixing ratio, water vapor mixing ratio, and pressure. More details are available at the Web site of the CANDIDOZ project (see Table A1). Data are provided by the EU project CANDIDOZ. The period is January 1948 to December 2004.

[91] Acknowledgments. We are grateful for the support of MeteoSwiss and the EU project CANDIDOZ, as well as NCEP and ECMWF for the provided data sets and for the ozone measurements of the WOUDC and BAS. The software R (http://www.R-project.org) was used for the statistical analysis in this study.
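As a rough illustration of two bookkeeping steps described in Appendix A3.2, the sketch below maps calendar months to the four seasons used for S (DJFM, AM, JJA, SON) and builds a cumulative series that restarts at a chosen month, as described for EPFC (October in the Northern Hemisphere, April in the Southern Hemisphere) and PANC (December). The names `SEASON_OF_MONTH` and `accumulate_from` are hypothetical, the input values are made up, and letting the sum run until the next start month is an assumption. The original analysis was carried out in R, and its code is not reproduced here.

```python
from typing import Dict, List, Tuple

# Season groups as listed for the synthetic variable S (DJFM, AM, JJA, SON).
SEASON_OF_MONTH: Dict[int, str] = {
    12: "DJFM", 1: "DJFM", 2: "DJFM", 3: "DJFM",
    4: "AM", 5: "AM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}


def accumulate_from(
    series: List[Tuple[int, int, float]], start_month: int
) -> List[float]:
    """Running sum of monthly values that restarts at `start_month`.

    `series` holds chronologically ordered (year, month, value) tuples.
    Assumption: the sum keeps running until the next occurrence of
    `start_month`; months before the first restart are summed from the
    beginning of the series.
    """
    total = 0.0
    cumulative = []
    for _year, month, value in series:
        if month == start_month:
            total = 0.0  # a new accumulation period begins
        total += value
        cumulative.append(total)
    return cumulative


# Made-up monthly values, for illustration only.
demo = [
    (2000, 10, 1.0), (2000, 11, 2.0), (2000, 12, 3.0),
    (2001, 1, 4.0), (2001, 10, 5.0), (2001, 11, 6.0),
]
print(accumulate_from(demo, start_month=10))
print([SEASON_OF_MONTH[month] for _year, month, _value in demo])
```

Passing `start_month=10` or `start_month=4` would correspond to the Northern or Southern Hemisphere EPFC convention, and `start_month=12` to the PANC accumulation.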

References

Appenzeller, C., A. Weiss, and J. Staehelin (2000), North Atlantic oscillation modulates total ozone winter trends, Geophys. Res. Lett., 27(8), 1131–1134.
Baldwin, M., et al. (2001), The quasi-biennial oscillation, Rev. Geophys., 39(2), 179–229.
Bojkov, R., and V. Fioletov (1996), Total ozone variations in the tropical belt: An application for quality of ground based measurements, Meteorol. Atmos. Phys., 58(1–4), 223–240.
Bojkov, R., L. Bishop, and V. Fioletov (1995), Total ozone trends from quality-controlled ground-based data (1964–1994), J. Geophys. Res., 100, 25,867–25,876.
Borchi, F., P. Naveau, P. Keckhut, and A. Hauchecorne (2006), Detecting variability changes in arctic total ozone column, J. Atmos. Sol. Terr. Phys., 68(12), 1383–1395.
Brönnimann, S., J. Luterbacher, J. Staehelin, T. Svendby, G. Hansen, and T. Svenoe (2004), Extreme climate of the global troposphere and stratosphere in 1940–42 related to El Nino, Nature, 431(7011), 971–974.
Brunner, D., J. Staehelin, H. Kunsch, and G. Bodeker (2006a), A Kalman filter reconstruction of the vertical ozone distribution in an equivalent latitude-potential temperature framework from TOMS/GOME/SBUV total ozone observations, J. Geophys. Res., 111, D12308, doi:10.1029/2005JD006279.
Brunner, D., J. Staehelin, J. A. Maeder, I. Wohltmann, and G. E. Bodeker (2006b), Variability and trends in total and vertically resolved stratospheric ozone based on the CATO ozone data set, Atmos. Chem. Phys., 6, 4985–5008.
Chandra, S., C. Varotsos, and L. Flynn (1996), The mid-latitude total ozone trends in the Northern Hemisphere, Geophys. Res. Lett., 23(5), 555–558.
Covington, A. (1969), Solar radio emission at 10.7 cm, 1947–1968, J. R. Astron. Soc. Can., 63, 125.
Dobson, G. B. M., and D. N. Harrison (1926), Measurements of the amount of ozone in the Earth's atmosphere and its reaction to other geophysical conditions, Proc. R. Soc. London, Ser. A, 110, 660–693.
Draper, N., and H. Smith (1998), Applied Regression Analysis, 3rd ed., John Wiley, New York.
Fioletov, V., and T. Shepherd (2003), Seasonal persistence of midlatitude total ozone anomalies, Geophys. Res. Lett., 30(7), 1417, doi:10.1029/2002GL016739.
Hanson, D., and K. Mauersberger (1988), Laboratory studies of the nitric acid trihydrate: Implications for the south polar stratosphere, Geophys. Res. Lett., 15(8), 855–858.
Hood, L. (1997), The solar cycle variation of total ozone: Dynamical forcing in the lower stratosphere, J. Geophys. Res., 102, 1355–1370.
Hood, L., and D. Zaff (1995), Lower stratospheric stationary waves and the longitude dependence of ozone trends in winter, J. Geophys. Res., 100, 25,791–25,800.
Hoskins, B. J., M. McIntyre, and A. Robertson (1985), On the use and significance of isentropic potential vorticity maps, Q. J. R. Meteorol. Soc., 111, 877–975.
Huesmann, A., and M. Hitchman (2003), The 1978 shift in the NCEP reanalysis stratospheric quasi-biennial oscillation, Geophys. Res. Lett., 30(2), 1048, doi:10.1029/2002GL016323.
Hurrell, J. W. (1995), Decadal trends in the North Atlantic Oscillation: Regional temperatures and precipitation, Science, 269, 676–679.
Jain, K., and S. Hasan (2004), Reconstruction of the past total solar irradiance on short timescales, J. Geophys. Res., 109, A03105, doi:10.1029/2003JA010222.
Kerzenmacher, T., P. Keckhut, A. Hauchecorne, and M. Chanin (2006), Methodological uncertainties in multi-regression analyses of middle-atmospheric data series, J. Environ. Monit., 8(7), 682–690.
Koch, G., H. Wernli, C. Schwierz, J. Staehelin, and T. Peter (2005), A composite study on the structure and formation of ozone miniholes and minihighs over central Europe, Geophys. Res. Lett., 32, L12810, doi:10.1029/2004GL022062.
Labitzke, K., and H. van Loon (2000), The QBO effect on the solar signal in the global stratosphere in the winter of the Northern Hemisphere, J. Atmos. Sol. Terr. Phys., 62(8), 621–628.
Lait, L. R., et al. (1990), Reconstruction of O3 and N2O fields from ER-2, DC-8, and balloon observations, Geophys. Res. Lett., 17, 521–524.
Lait, L., et al. (2004), Non-coincident inter-instrument comparisons of ozone measurements using quasi-conservative coordinates, Atmos. Chem. Phys., 4, 2345–2352.
Lawrence, J., A. Cadavid, and A. Ruzmaikin (2000), The response of atmospheric circulation to weak solar forcing, J. Geophys. Res., 105, 24,839–24,848.


Mäder, J. A. (2004), Haupteinflussfaktoren auf das stratosphärische Ozon in der nördlichen Hemisphäre (in German), Ph.D. thesis, Swiss Fed. Inst. of Technol., ETH Zurich, Zurich, Switzerland.
Newchurch, M. J., E. Yang, D. M. Cunnold, G. C. Reinsel, J. M. Zawodny, and J. M. Russell III (2003), Evidence for slowdown in stratospheric ozone loss: First stage of ozone recovery, J. Geophys. Res., 108(D16), 4507, doi:10.1029/2003JD003471.
Orsolini, Y., and F. Doblas-Reyes (2003), Ozone signatures of climate patterns over the Euro-Atlantic sector in the spring, Q. J. R. Meteorol. Soc., 595, 3251–3263.
Orsolini, Y., and V. Limpasuvan (2001), The North Atlantic Oscillation and the occurrences of ozone miniholes, Geophys. Res. Lett., 21, 4099–4102.
Peter, T. (1997), Microphysics and heterogeneous chemistry of polar stratospheric clouds, Annu. Rev. Phys. Chem., 48, 785–822.
Randel, W., F. Wu, J. Russell, J. Waters, and L. Froidevaux (1995), Ozone and temperature changes in the stratosphere following the eruption of Mount Pinatubo, J. Geophys. Res., 100, 16,753–16,764.
Randel, W., F. Wu, and R. Stolarski (2002), Changes in column ozone correlated with the stratospheric EP flux, J. Meteorol. Soc. Jpn., 80(4B), 849–862.
Reinsel, G. (2002), Trend analysis of upper stratospheric Umkehr ozone data for evidence of turnaround, Geophys. Res. Lett., 29(10), 1451, doi:10.1029/2002GL014716.
Reinsel, G. C., E. Weatherhead, G. C. Tiao, A. J. Miller, R. M. Nagatani, D. J. Wuebbles, and L. E. Flynn (2002), On detection of turnaround and recovery in trend for ozone, J. Geophys. Res., 107(D10), 4078, doi:10.1029/2001JD000500.
Reinsel, G. C., A. J. Miller, E. C. Weatherhead, L. E. Flynn, R. M. Nagatani, G. C. Tiao, and D. J. Wuebbles (2005), Trend analysis of total ozone data for turnaround and dynamical contributions, J. Geophys. Res., 110, D16306, doi:10.1029/2004JD004662.
Rex, M., R. J. Salawitch, P. von der Gathen, N. R. P. Harris, M. P. Chipperfield, and B. Naujokat (2004), Arctic ozone loss and climate change, Geophys. Res. Lett., 31, L04116, doi:10.1029/2003GL018844.
Ruzmaikin, A., and J. Feynman (2002), Solar influence on a major mode of atmospheric variability, J. Geophys. Res., 107(D14), 4209, doi:10.1029/2001JD001239.
Salby, M., and P. Callaghan (1993), Fluctuations of total ozone and their relationship to stratospheric air motions, J. Geophys. Res., 98, 2715–2727.
Salby, M., and P. Callaghan (2002), Interannual changes of the stratospheric circulation: Relationship to ozone and tropospheric structure, J. Clim., 15(24), 3673–3685.
Schoeberl, M., et al. (1989), Reconstruction of the constituent distribution and trends in the Antarctic polar vortex from ER-2 flight observations, J. Geophys. Res., 94, 16,815–16,845.
Schubert, S., and M. Munteanu (1988), An analysis of tropopause pressure and total ozone correlations, Mon. Weather Rev., 116(3), 569–582.
Staehelin, J., A. Renaud, J. Bader, R. Mcpeters, P. Viatte, B. Hoegger, V. Bugnion, M. Giroud, and H. Schill (1998), Total ozone series at Arosa (Switzerland): Homogenization and data comparison, J. Geophys. Res., 103, 5827–5841.
Staehelin, J., N. Harris, C. Appenzeller, and J. Eberhard (2001), Ozone trends: A review, Rev. Geophys., 39(2), 231–290.


Steinbrecht, W., H. Claude, U. Kohler, and K. Hoinka (1998), Correlations between tropopause height and total ozone: Implications for long-term changes, J. Geophys. Res., 103, 19,183–19,192.
Steinbrecht, W., H. Claude, U. Kohler, and P. Winkler (2001), Interannual changes of total ozone and Northern Hemisphere circulation patterns, Geophys. Res. Lett., 28, 1191–1194.
Steinbrecht, W., B. Hassler, H. Claude, P. Winkler, and R. Stolarski (2003), Global distribution of total ozone and lower stratospheric temperature variations, Atmos. Chem. Phys., 3, 1421–1438.
Steinbrecht, W., H. Claude, and P. Winkler (2004), Enhanced upper stratospheric ozone: Sign of recovery or solar cycle effect?, J. Geophys. Res., 109, D02308, doi:10.1029/2003JD004284.
Thomason, L., L. Poole, and T. Deshler (1997), A global climatology of stratospheric aerosol surface area density deduced from Stratospheric Aerosol and Gas Experiment II measurements: 1984–1994, J. Geophys. Res., 102, 8967–8976.
Thompson, D., and J. Wallace (2000), Annular modes in the extratropical circulation, Part I: Month-to-month variability, J. Clim., 5, 1000–1016.
Uppala, S., et al. (2005), The ERA-40 re-analysis, Q. J. R. Meteorol. Soc., 131(612), 2961–3012.
van Loon, H., and K. Labitzke (2000), The influence of the 11-year solar cycle on the stratosphere below 30 km: A review, Space Sci. Rev., 94(1–2), 259–278.
Venables, W. N., and B. D. Ripley (2002), Modern Applied Statistics with S, Springer, New York.
Wallace, J., and D. Gutzler (1981), Teleconnections in the geopotential height field during the Northern Hemisphere winter, Mon. Weather Rev., 4, 784–812.
Weatherhead, E., and S. Andersen (2006), The search for signs of recovery of the ozone layer, Nature, 441, 39–45.
Wohltmann, I., M. Rex, D. Brunner, and J. Mäder (2005), Integrated equivalent latitude as a proxy for dynamical changes in ozone column, Geophys. Res. Lett., 32, L09811, doi:10.1029/2005GL022497.
Wohltmann, I., M. Rex, R. Lehmann, D. Brunner, and J. Mäder (2007), A process-oriented regression model for column ozone, J. Geophys. Res., doi:10.1029/2006JD007573, in press.
World Meteorological Organization (1993), Handbook for Dobson Ozone Data Re-evaluation, Rep. 29, Geneva, Switzerland.
World Meteorological Organization (1995), Scientific Assessment of Ozone Depletion: 1994, Rep. 37, Geneva, Switzerland.
World Meteorological Organization (1999), Scientific Assessment of Ozone Depletion: 1998, Rep. 44, Geneva, Switzerland.
World Meteorological Organization (2003), Scientific Assessment of Ozone Depletion: 2002, Rep. 47, Geneva, Switzerland.

D. Brunner, EMPA – Materials Science and Technology, CH-8600 Dübendorf, Switzerland.
J. A. Mäder, T. Peter, and J. Staehelin, Institute for Atmospheric and Climate Science, ETH Zurich, 8092 Zurich, Switzerland. (joerg.maeder@env.ethz.ch)
W. A. Stahel, Seminar für Statistik, ETH Zurich, CH-8092 Zurich, Switzerland.
I. Wohltmann, Alfred Wegener Institute for Polar and Marine Research, D-14401 Potsdam, Germany.
