title: investigating the influence of synoptic-scale ... - Rob J Hyndman

9 downloads 1668 Views 701KB Size Report
where the dominant synoptic „types‟ over the region of Melbourne, Australia are ..... step in the selection of individual models for O3, PM10, and NO2 was to fit a.
1

TITLE:

2

METEOROLOGY ON AIR QUALITY USING SELF-ORGANIZING MAPS AND

3

GENERALIZED ADDITIVE MODELLING.

4

John L. Pearcea*, Jason Beringera, Neville Nichollsa, Rob J. Hyndmanb, Petteri Uotilaa, and

5

Nigel J. Tappera

6

a

School of Geography and Environmental Science, Monash University, Melbourne, Australia

7

b

Department of Econometrics and Business Statistics, Monash University, Melbourne,

8

Australia

INVESTIGATING

THE

INFLUENCE

OF

SYNOPTIC-SCALE

9 10

Keywords: air pollution, generalized additive models, self-organizing maps, and synoptic

11

meteorology.

12 13

*Corresponding author: School of Geography and Environmental Science, Monash

14

University,

15

[email protected]

Clayton,

Victoria

3800.

Tel:

+61399054457.

E-mail:

16 17 18 19 20 21 22 23 24 25 1

26

INVESTIGATING THE INFLUENCE OF SYNOPTIC-SCALE CIRCULATION ON

27

AIR

28

ADDITIVE MODELLING.

QUALITY

USING

SELF-ORGANIZING

MAPS

AND

GENERALIZED

29

ABSTRACT

30

The influence of synoptic-scale circulations on air quality is an area of increasing interest to

31

air quality management in regards to future climate change. This study presents an analysis

32

where the dominant synoptic „types‟ over the region of Melbourne, Australia are determined

33

and linked to regional air quality. First, a self-organising map (SOM) is used to generate a

34

time series of synoptic charts that classify the annual daily circulation affecting Melbourne

35

into 20 different synoptic types. SOM results are then employed within the framework of a

36

generalized additive model (GAM) to identify links between synoptic-scale circulations and

37

observed changes air pollutant concentrations. The GAMs estimate shifts in pollutant

38

concentrations under each synoptic type after controlling for long-term trends, seasonality,

39

weekly emissions, spatial variation, and temporal persistence. Results showed the aggregate

40

impact of synoptic circulations in the models to be quite modest as only 5.1% of the daily

41

variance in O3, 4.7% in PM10, and 7.1% in NO2 were explained by shifts in synoptic

42

circulations. Further analysis of the partial residual plots identified that despite a modest

43

response at the aggregate level, individual synoptic categories had differential effects on air

44

pollutants. In particular, increases of up to 40% in NO2 and PM10 and 30% in O3 occur when

45

a synoptic conditions result in a north-easterly gradient wind over the Melbourne area.

46

Additionally, NO2 and PM10 levels also showed increases of up to 40% when a strong high

47

pressure system was centered directly over the Melbourne area. In sum, the unified approach

48

of SOM and GAM proved to be a complementary suite of tools capable of identifying the

49

entire range synoptic circulation patterns over a particular region and quantifying how they

50

influence local air quality. 2

51

1. Introduction

52

Increased air pollutant concentrations in the urban environment do not typically result

53

from sudden increases in emissions, but rather from meteorological conditions that impede

54

dispersion in the atmosphere or result in increased pollutant generation (Cheng, Campbell et

55

al. 2007). A combination of meteorological variables important to these conditions includes

56

temperature, winds, radiation, atmospheric moisture, and mixing depth (EPA 2009). Because

57

synoptic-scale circulations are the envelope that govern all the above meteorological features

58

synoptic weather typing has become a popular approach for evaluating impacts of

59

meteorological conditions on air pollution (Triantafyllou 2001; Chen, Cheng et al. 2008;

60

Beaver and Palazoglu 2009). This has led the air quality community to recognize synoptic-

61

scale circulations as an important driver of local air pollution (EPA 2009).

62

We wish to increase the understanding of the relationship between synoptic-scale

63

circulations and air pollution in Melbourne, Australia. The city of Melbourne, with a

64

population of approximately 3.9 million (ABS 2010), is situated on Port Phillip Bay at the

65

south-eastern edge of continental Australia in close proximity to the Southern Ocean at 37°

66

48‟ 49” S and 144° 57 47” E (Figure 1). The climate of Melbourne can best be described as

67

moderate oceanic and the city is famous for its changeable weather conditions (BOM 2009).

68

This is due in part to the city's location at the pole-ward margin of a sub-tropical continent

69

that results in the passage of very differing air masses over the region. The mid-latitude

70

synoptic weather systems that affect the region produce persistent westerly winds between

71

the subtropical high pressures to the north and the Southern Ocean lows to the south. This

72

region is also dominated by fronts, which result from the interaction of subtropical and polar

73

air masses. Although Melbourne‟s air quality can be described as relatively good when

74

compared to other urban centres of similar size, recent periods of anomalous environmental

75

conditions present an interesting opportunity for analysis (Murphy and Timbal 2008). During 3

76

this time levels of ozone and particles have not always met air quality standards and events

77

such as bushfires, heatwaves, and dust storms have likely influenced regional air quality. The

78

overall objective of this research is to investigate how the dominant types of synoptic-scale

79

circulation over Melbourne influence local air quality. Moreover, a practical methodology is

80

presented in order to achieve these results.

81

2. Data

82

a. Large Scale Meteorological Data

83

The four-time daily gridded MSLP data from the ERA-Interim reanalysis (1989 to

84

2008) was used to develop the SOM for synoptic-scale circulations affecting Melbourne. This

85

reanalysis was produced by the European Centre for Medium-Range Weather Forecasts

86

(ECMWF) and is discussed in more detail by Uppala et al. (2008). MSLP fields were

87

obtained for 10 a.m., 4 p.m., 10 p.m., and 4 a.m. local standard time (LST) for each day in the

88

reanalysis period at a spatial resolution of 0.72° over the spatial domain of 35-44° S and 140-

89

150° E. It is important to note that ERA40 and NCEP/NCAR reanalysis products were also

90

trialled for this analysis. While these products produced climatologies of similar agreement

91

ERAI was chosen as the enhanced spatial resolution of the data produced synoptic types that

92

were more interpretable for the Melbourne region. MSLP was chosen as a proxy for

93

atmospheric circulation because it has been shown to relate well to the spatial pattern of these

94

processes.

95

b. Local Meteorological Data

96

Links between synoptic types and local weather conditions were made using daily

97

automatic weather station observations for site number 086282 (Melbourne International

98

Airport) for the period of 1999 to 2006. This site is located at 37° 40‟ 12” S and 144° 49‟ 48”

99

E with an elevation of 113 m and was chosen because a comprehensive range of measures are

4

100

collected on a consistent basis. Variables provided by Climate Information Services, National

101

Climate Centre, Bureau of Meteorology Variables included:

102



Maximum daily temperature (°C)

103



Mean sea level pressure (hPa)

104



Global radiation (MJ/m2)

105



Water vapour pressure (hPa)

106



Zonal (u) and meridional (v) wind components (km/hr)

107



Precipitation (mm).

108

Additionally, boundary layer height (BLH) was taken from the ERA-Interim data using the

109

location of 37° 30‟ 0” S and 145° 30‟ 0” E for 4 p.m. LST - the approximate time of

110

maximum boundary layer depth.

111

c. Air Pollutant Monitoring Data

112

Local air pollution data was provided by the Environmental Protection Authority

113

Victoria (EPAV) taken from the Port Phillip Bay air monitoring network (Figure 1).

114

Pollutants included ozone (O3), particulate matter ≤ 10 µg (PM10), and nitrogen dioxide

115

(NO2). O3 and NO2 concentrations are reported in parts per billion by volume (ppb) and were

116

measured using pulsed fluorescence chemiluminescence and ultra violet absorption

117

techniques. PM10 concentrations were measured using photospectrometry and are reported in

118

micrograms per cubic meter (µg/m3). This analysis uses the daily maximum value for 8-hr O3,

119

the 24-hr mean value of PM10, and the daily maximum value for 1-hr NO2 from all available

120

monitoring locations over the period of 1999 to 2006 (Table 1). These timeframes were

121

selected to parallel air quality objectives in the State Environment Protection Policy for

122

ambient air quality (SEPP 1999). Additionally, days on which significant air quality events

123

(bushfires, dust storms, factory emissions, etc.) were known to have occurred and data below

124

5 ppb for O3 and NO2 and 3 µg/m3for PM10 were removed. 5

125

3. Methods

126

a. Self-organizing Maps

127

Synoptic typing is an approach used in synoptic climatology where the atmospheric

128

state is partitioned into broad categories either in terms of the spatial patterns associated with

129

a proxy (ex. MSLP, 500 hPa height) or the multivariate characteristics of an air mass. These

130

synoptic categories or „types‟ are then used to provide insight into the influence of large scale

131

processes on local environmental conditions (Hewitson and Crane 2002). The classification

132

of synoptic types used in this study was produced using a neural networking algorithm known

133

as self-organizing maps (SOM) (Kohonen 2001). This approach was has been shown to be

134

effective means of classifying the expected synoptic circulation patterns over a particular

135

region (Hewitson and Crane 2002; Hope, Drosdowsky et al. 2006). In short, SOMs can be

136

described as a data reduction technique used to visualize and interpret large high-dimensional

137

data sets onto a two dimensional array of representative nodes (Kohonen 2001). This two

138

dimensional array of nodes is commonly referred to as the self-organizing „map‟. One feature

139

of the SOM that has been found particularly useful in synoptic climatology is the matching of

140

each period in the analysis to a particular synoptic type (Hope, Drosdowsky et al. 2006). This

141

essentially results in a time series of synoptic charts that a can be used to link environmental

142

responses to specific types of systems through time. For more information on SOM and their

143

application in synoptic climatology please see Hewitson and Crane (2002).

144

It is important to note that in practice the dimensions of the SOM are selected by the

145

user. This has direct implications for the number of synoptic states represented. We assessed

146

a multiple array of SOM dimensions in effort to identify a dimension that was adequate to

147

represent the expected range of synoptic patterns for the region and practical for use in our

148

statistical analysis. It was determined that a 4X5 array of circulation patterns met these

6

149

criteria. The software used to create the SOM is part of the SOM_PAK, available from

150

http://www.cis.hut.fi/research/som-research.

151

b.Generalized Additive Models

152

There are a multitude of techniques for analysing the effects of meteorology on air

153

pollution (Thompson, Reynolds et al. 2001). One approach that has been found particularly

154

effective at handling the complex non-linearity‟s associated with air pollution is Generalized

155

Additive Modelling (GAM) (Aldrin and Haff 2005; Carslaw, Beevers et al. 2007). The

156

additive model in the context of a concentration time series can be written in the form (Hastie

157

and Tibshirani 1990):

( )

158

∑ (

)

(2.1)

159

where g[ ] is a Gaussian distributed „log-link‟ function, yi is the ith air pollution concentration,

160

E( ) is the expected concentration of y, β0 is the overall mean of the response, sj(xij) is the

161

smooth function of ith value of covariate j, n is the total number of covariates, and εi is the ith

162

residual with var(εi)= σ2, which is assumed to be normally distributed. Smooth functions are

163

developed through an integration of model selection and automatic smoothing parameter

164

selection using penalised regression splines, which while optimizing the fit, make an effort to

165

minimize the number of dimensions in the model (Wood 2006). Interaction terms, e.g. s(x1,

166

x2), can also be modelled as thin-plate regression splines or tensor product smooths. This is a

167

notable feature for air quality applications as interaction terms have used to effectively for

168

modelling complex responses to wind components and to account for spatial trends (Bivand

169

et al. 2008; Carslaw 2007). The choice of the smoothing parameters is made through

170

restricted maximum likelihood (REML) and confidence intervals are estimated using an 7

171

unconditional Bayesian method (Wood 2006). This analysis was conducted using the gam

172

modelling function in R environment for statistical computing (R Development Core Team

173

2009) with packages „mgcv‟ (Wood 2006).

174

c. Model Development

175

The first step in the selection of individual models for O3, PM10, and NO2 was to fit a

176

preliminary base model. This was fit to each pollutant in order to control for the seasonality,

177

persistence, spatial trend, and weekly emissions patterns that exist in these data. Following

178

model (2.1) the preliminary model can be written as:

179

log(yi) = β0 + s(time) + s(dow) + s (long, lat) + s (yi-1) + εi

180

(2.2)

181

where time is a numeric vector ranging from 1 to 2922 included to account for long-term

182

trends and seasonality, dow is a numeric vector ranging from 0 to 6 included to account for

183

day-of-the-week, long and lat are the spatial coordinates of each monitor location included to

184

account for spatial trend, and yi-1 is one day lag term included to account for short-term

185

temporal persistence. It is important to note that the residual spatial variation is controlled by

186

including a tensor product smooth , s(long, lat), in the model and a smooth function of the

187

preceding day‟s pollutant concentration, s(yi-1), was included to control for autocorrelation in

188

residuals. Additionally, since air pollution data are inherently cyclic, a predetermined

189

smoothing parameter of k=32 (one knot (k) for each change of season) was used for the

190

construction of the spline function for time. The motivation for this control is that function

191

should represent a relatively symmetric cyclic pattern in the data. To check the adequacy of

192

our methods for controlling for space-time effects, box-plots and time-series plots of

8

193

residuals by monitor location were examined. No violations of assumptions were obvious in

194

any pollutant.

195

Finally, the categorical predictor for synoptic-scale circulations (C) -- included to

196

represent synoptic-scale circulation types – was added to the model. Following model (2.1)

197

final models can be written as:

198

log(yi) = β0 +s(time) + s(doy) + s(dow) + s (long, lat) + s(yi-1) + Cp + ε, (2.3)

199 200

where the term Cp represents the effect of the pth synoptic type and p = 0, 1, ..., 19.

201

d. Characterization of Synoptic Influence on Air Pollution

202

The explanatory power of model (2.3) was measured using the R2 statistic. The

203

aggregate impacts of synoptic circulations on each pollutant are assessed by taking the

204

difference in the R2 of model (2.2) and model (2.3). Individual relationships between

205

particular synoptic types and each air pollutant are assessed using partial response plots.

206

Partial response plots are used to reveal the marginal effect of each synoptic type on

207

each air pollutant. A partial response plot shows the static effect (i.e. effects that are stable

208

over time) of a particular synoptic type on a particular pollutant after accounting (i.e.,

209

controlling) for the effects of all other explanatory variables in the model (Camalier, Cox et al.

210

2007). The y-axis of each plot, which has been centred to the mean value of the response,

211

shows the marginal effect of a covariate (Harrell 2001) on a percentage scale. Specifically,

212

the marginal effect is the average percentage change in pollutant concentration as the

213

covariate of interest is varied, while all other explanatory variables are kept fixed. Partial

214

response plots make it easy to compare the size of the marginal effects of different covariates

215

on the different pollutants.

216

Technically, the displayed marginal effects are given by 100 * (Cp / Cp{ref}-1 ), where p

217

is the synoptic type of interest, Cp is the corresponding coefficient in model (2.3), and Cp{ref} is 9

218

the coefficient for the chosen reference value for p – in our case synoptic type 00. In sum,

219

the marginal effects are the estimated average effects of each synoptic type on the predicted

220

values produced by model (2.3).

221

4. Results and Discussion

222

a. Self-Organizing Maps

223

The SOM of MSLP provides a clear visualization of the range of circulation features

224

affecting Melbourne (Figure 2). Additionally, the method arranges the output grid so that

225

similar types are near each other while more distinct types are further apart. Individual maps

226

within the grid are referenced throughout the text using XY coordinates with category 00

227

being in the top left corner and category 43 being in the bottom right corner. Synoptic types

228

with strong regions of high pressure are located near the bottom right corner of the grid while

229

types with broad regions of low pressure are located near the top left. These pressure patterns

230

can be used to infer regional scale wind speeds and direction along with mixing. A secondary

231

feature of the SOM algorithm is the generation of a time series where each six hour period in

232

the analysis has been classified under a particular synoptic type represented in Figure 2.

233

Frequency analysis of these periodic classifications found that higher frequency nodes are

234

presented on the periphery of the SOM grid and that lower frequency nodes are presented

235

towards the center (Figure 3). This indicates that dominant circulations on are presented on

236

the periphery and transitional states are presented closer to the center of the grid. It is should

237

be noted that the transitions between synoptic features would be expected to reflect the

238

generally eastward movement of weather systems in this region.

239

Meteorological conditions associated with each synoptic type were determined by

240

matching the automatic weather station observations on any one day to the synoptic types

241

associated with the SOM classes 10 a.m. for that day (Table 2). Across group variability for

10

242

each meteorological variable indicates that each synoptic type has distinct local

243

meteorological conditions.

244

b.Generalized Additive Models

245

b.1 Ozone (O3)

246

Results found significant differences (F=122.6, d.f.=19, p