Climatic Change (2014) 125:7–21 DOI 10.1007/s10584-013-0935-9
Assessment of CMIP5 global model simulations over the subset of CORDEX domains used in the Phase I CREMA

N. Elguindi · F. Giorgi · U. Turuncoglu
Received: 21 December 2012 / Accepted: 8 September 2013 / Published online: 27 September 2013 © Springer Science+Business Media Dordrecht 2013
Abstract We present an assessment of the CMIP5 global model simulations over a subset of CORDEX domains used in the Phase I CREMA (CORDEX RegCM hyper-MAtrix) experiment (Africa, HYMEX-MED (Mediterranean), South America, Central America and West Asia). Three variants of the transformed Mielke measure are used to assess (1) the model skill in simulating surface temperature and precipitation historical climatology, (2) the degree of surface temperature and precipitation change occurring under greenhouse gas forcing, and (3) the consistency of a model's projected change with that of the Multi Model Ensemble (MME) mean. The majority of models exhibit varying degrees of skill depending on the region and season; however, a few models are identified as performing well globally. We find that higher resolution improves model skill in most regional and seasonal cases, especially for temperature. Models with the highest and lowest climate sensitivity, as well as those whose change most resembles the ensemble mean, are also discussed. Although the higher resolution models perform better, we find that resolution does not have a statistically significant impact on the models' response to GHG forcing, indicating that model biases do not play a primary role in shaping the model response to GHG forcing. We also assess the three models selected for the CREMA Phase I experiment (HADGEM2ES, MPI-ESMMR and GFDL-ESM2M) and find that they are characterized by a relatively good level of performance, a range of high to low climate sensitivities and good consistency with the MME changes, thereby providing a reasonably representative sample of the CMIP5 ensemble.

This article is part of a Special Issue on "The Phase I CORDEX RegCM4 Experiment MAtrix (CREMA)" edited by Filippo Giorgi, William Gutowski, and Ray W. Arritt.

Electronic supplementary material The online version of this article (doi:10.1007/s10584-013-0935-9) contains supplementary material, which is available to authorized users.

N. Elguindi · F. Giorgi
Earth System Physics Section, The Abdus Salam International Centre for Theoretical Physics, Trieste, Italy
e-mail: [email protected]

U. Turuncoglu
Informatics Institute, Istanbul Technical University, Istanbul, Turkey
1 Introduction

The first step towards the production of scenarios based on nested Regional Climate Model (RCM) simulations is the selection of the driving Global Climate Models (GCMs). The information derived from the lateral boundaries strongly affects the RCM solution (e.g. Giorgi and Mearns 1999), and this effect depends on the size of the domain, its location and the lateral boundary forcing technique (Laprise et al. 2008). The quality of the driving GCM meteorological fields is thus critical for a successful RCM simulation (Giorgi and Mearns 1999). In addition, because different GCMs respond differently to the same greenhouse gas (GHG) forcing, the choice of GCMs represents an important element of uncertainty in RCM-based projections. Ideally a large ensemble of GCMs should be used, as recommended in the CORDEX protocol (Giorgi et al. 2009). In practice, however, the number of experiments performed is limited by the availability of computing and storage resources, so that only a subset of the full available range of GCMs can be used. This further stresses the importance of the selection of driving GCMs.

Different criteria can guide the selection of the driving GCMs. One is the performance of the GCM in reproducing present day climate over the domain of interest: a better performing GCM should provide better quality lateral boundary conditions for the RCM. Another is the GCM climate change response over the domain of interest: the selected GCMs should cover as much as possible of the range of responses spanned by the full ensemble of available GCMs. Thirdly, one might want to select GCMs that are representative of the Multi-Model Ensemble (MME) mean changes. In practice, however, the choice is heavily constrained by the availability of high temporal resolution boundary data from the GCMs: most often, data from only a limited number of GCMs are available for RCM experiments. For these reasons, in order to assess the value of an RCM ensemble, it is important to place the selected subset of driving GCMs within the context of the full ensemble available.

In the Phase I CORDEX RegCM hyper-MAtrix (CREMA) experiment detailed in this special issue, a subset of GCMs was chosen among those participating in the CMIP5 (Climate Model Intercomparison Project 5) program that had made 6-hourly boundary fields usable for RCM nesting available by the beginning of the experiment. In order to place this subset within the context of the full CMIP5 ensemble, in this paper we present an analysis of the CMIP5 GCMs over the five domains employed in CREMA Phase I. A standard metric, the M-score, is used to assess the performance of the GCMs in reproducing present day climate, the inter-model spread of 21st century climate change responses and the distance of the individual models from the ensemble average change. Seasonal and annual surface air temperature and precipitation are analyzed. After an overall analysis of the entire CMIP5 ensemble, the subset of CREMA GCMs is discussed within the context of the full ensemble.
2 Data and methods

2.1 Models

An ensemble of 45 GCMs from 19 institutes participating in the CMIP5 experiment (Taylor et al. 2012) is analyzed (Supplemental Table 1). The complexity of the GCMs varies considerably, from coupled atmosphere-ocean models to Earth System Models with more complex components representing chemistry, carbon and biogeochemical processes. All of the GCM outputs, which have varying spatial resolutions, are interpolated to a common 0.5 × 0.5 degree grid corresponding to the grid of the observation dataset (see next section). Seasonal and annual climatologies of surface air temperature and precipitation are constructed for the historical (1976–2005) and future (2070–2099) periods (note that all models have an ocean component, so the historical simulations are not forced with observed SSTs). For the future period, for brevity we show results only from the high end RCP8.5 scenario, but we emphasize that similar conclusions were found also for the lower end RCP4.5 scenario. To simplify our analysis, we evaluate only surface temperature and precipitation in this study. While these are considered among the most important variables for impact studies, one of the ultimate goals of CORDEX, we recognize that a complete evaluation of model performance would also include other variables, including upper-atmosphere variables.
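As an illustration of this preprocessing step, the regridding and climatology construction could be sketched in Python with xarray as follows. This is a sketch only: the paper does not state the interpolation method, and the file name, variable name and coordinate names are assumptions following common CMIP5 conventions.

```python
import numpy as np
import xarray as xr

# Hypothetical CMIP5 monthly-mean file; "tas" and the lat/lon coordinate
# names follow common CMIP5 conventions.
ds = xr.open_dataset("tas_Amon_MODEL_historical_r1i1p1_185001-200512.nc")

# Interpolate to the common 0.5 x 0.5 degree grid of the CRU observations
# (bilinear interpolation here; the paper does not specify the method).
target_lat = np.arange(-89.75, 90.0, 0.5)
target_lon = np.arange(-179.75, 180.0, 0.5)
tas = ds["tas"].interp(lat=target_lat, lon=target_lon)

# Seasonal and annual climatologies for the historical period 1976-2005.
hist = tas.sel(time=slice("1976-01-01", "2005-12-31"))
seasonal_clim = hist.groupby("time.season").mean("time")  # DJF, MAM, JJA, SON
annual_clim = hist.mean("time")
```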
2.2 Observations

For observations we use the global 0.5 × 0.5 degree 2-meter temperature and precipitation datasets (version 2.1) produced by the Climatic Research Unit (CRU) of the University of East Anglia (New et al. 2000). Seasonal and annual climatologies of surface temperature and precipitation are constructed for the historical period 1976–2005.
2.3 Skill metrics

Three variations of the transformed Mielke measure, M, are utilized to evaluate the CMIP5 models over the five regions depicted in Fig. 1 (land only), which broadly encompass the five CORDEX domains used in the CREMA Phase I experiment. All calculations are performed on time-averaged (over the historical (1976–2005) and future (2070–2099) periods), spatially distributed fields. The M-score has proven to be a useful measure of climate model performance (Meehl et al. 2007; Watterson et al., under review) and is described in detail by Watterson (1996). Essentially, the M-score combines the mean square error (mse) and the correlation coefficient, r, of two fields. In our case, the mse is calculated between two climatological fields over a particular spatial domain, and the correlation coefficient refers to the spatial correlation between the two fields. When the variance and mean of the two fields are the same, M is equivalent to r, but it converges to the root of the mse rather than the mse itself, which is useful for fields such as temperature where r tends to be close to one (Watterson et al., under review).
Fig. 1 The CREMA regions
First, we use the M-score in its classical sense to validate the models' ability to reproduce the historical climate. The M1 score is defined as

$$M_1 = \frac{2}{\pi}\sin^{-1}\left[1 - \frac{\mathrm{mse}}{V_X + V_Y + (G_X - G_Y)^2}\right] \qquad (1)$$

where X and Y are the time-averaged, spatially distributed model and observed fields, respectively, V is the spatial variance, G is the spatial mean, and 2/π is a normalizing factor for the arcsin term, which varies from zero to π/2. The mean square error (mse) is calculated as

$$\mathrm{mse} = \frac{1}{n}\sum_{i=1}^{n}(Y_i - X_i)^2 \qquad (2)$$

where i indexes the grid points of a particular spatial domain. An M1 score of 1 indicates perfect model skill, while a score of zero indicates no skill.

Next, to assess the models' climate sensitivity to GHG increases, we define the M2 score as

$$M_2 = 1 - \frac{2}{\pi}\sin^{-1}\left[1 - \frac{\mathrm{mse}}{V_X + V_Y + (G_X - G_Y)^2}\right] \qquad (3)$$

where X refers to the model's future field and Y to the model's historical field. Subtracting the score from one means that higher scores indicate greater change, and a value of zero indicates no change.

Lastly, we define the M3 score to evaluate how much a model's projected change resembles that of the ensemble mean. Seasonal and annual climatologies of the MME for the future period are constructed using an unweighted average of all available models. The M3 score is identical to the M1 score (Eq. 1), except that X refers to the change in the individual model field and Y to the change in the MME field. In this case, values closer to 1 indicate that the model is closer to the ensemble mean.
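For concreteness, Eqs. (1)–(3) reduce to a single function of two co-located fields. Below is a minimal NumPy sketch, assuming the fields are arrays on the common grid with NaN at masked (e.g. ocean) points; the masking details are our assumption, not stated in the paper.

```python
import numpy as np

def m_score(x, y):
    """Transformed Mielke measure of Eq. (1): x is the test field and y the
    reference field, co-located arrays; NaN points (e.g. ocean) are dropped."""
    valid = ~np.isnan(x) & ~np.isnan(y)
    x, y = x[valid], y[valid]
    mse = np.mean((y - x) ** 2)          # Eq. (2)
    v_x, v_y = np.var(x), np.var(y)      # spatial variances
    g_x, g_y = np.mean(x), np.mean(y)    # spatial means
    return (2.0 / np.pi) * np.arcsin(1.0 - mse / (v_x + v_y + (g_x - g_y) ** 2))

# The three variants then differ only in their inputs:
# M1 = m_score(model_hist, obs)                # skill vs. observations
# M2 = 1 - m_score(model_future, model_hist)   # Eq. (3), response to GHG forcing
# M3 = m_score(model_future - model_hist,      # model change vs. MME mean change
#              mme_future - mme_hist)
```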
The M-score based metrics described here offer a simple and elegant method for comparing the performance and behavior among a large set of models, such as the CMIP5 ensemble. However, the metric does have some limitations. Namely, the M-score is based on squared differences and thus does not provide any information regarding the sign of biases or changes. Furthermore, when applied to large domains encompassing a variety of physical environments with high spatial variance, the M-score becomes domain-dependent, so that values cannot be compared across different regions. Despite these caveats, we find the M-score metrics to be a useful and concise method for ranking large sets of models.

To visualize the results, the M-scores are presented in so-called "portrait" plots using four different colors, each of which represents a quartile. The quartiles are based on the model scores for a particular region and season. This allows for easier visual interpretation of the results, as the actual score value is less important than a model's relative score in the context of the full ensemble. Descriptive statistics of the M scores are also presented in tables to assess inter-model variability.
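The quartile binning behind the portrait plots is straightforward. A sketch with pandas, assuming a hypothetical DataFrame `scores` with one row per model and one column per region/season combination:

```python
import pandas as pd

def quartile_bins(scores: pd.DataFrame) -> pd.DataFrame:
    """Rank each column of M-scores into quartiles: 0 = bottom 25% of models
    (blue in the portrait plot), 3 = top 25% (red)."""
    return scores.apply(lambda col: pd.qcut(col, 4, labels=False))
```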
3 Results

3.1 Validation of historical climate

In this section, we evaluate the ability of the CMIP5 models to reproduce the historical climate over the five CREMA domains. The models' M1 scores for precipitation and temperature are presented in Fig. 2. As mentioned in the previous section, the plots only provide information regarding which quartile a given model's score falls in for a particular region or season, so the four colors are represented equally within each column of Fig. 2; only the percentiles, rather than the actual scores, are presented in the portrait plots. High (low) M1 scores are represented by red (blue) colors, indicating better (worse) model performance. Descriptive statistics (range and median values) are provided in Supplemental Table 2, and the standard deviations of the M1 scores for each domain are presented in Table 1 to further assess the spread in model skill over the regions of interest. In addition, to determine how much influence resolution has on model performance, we divide the models into high and low resolution groups using a cutoff value of 2 degrees and compare the average M1 scores for each region and season (Table 2).

3.1.1 Precipitation

The M1-scores for precipitation are presented in Fig. 2. While most models perform better in some regions and seasons and worse in others, a few stand out with consistently high scores in nearly all cases. Notably, all versions of the UKMO.HADGEM model (HADGEM2ES, HADGEM2CC, and HADGEM2AO) are in the top quartile for almost every domain and season, and the UKMO.HADCM3 model is in the top 50th percentile for all cases. Other top performers include the NCAR models (with the exception of NCAR.CESM1WACCM) and CSIRO.AC10. A few models perform better in one season than the other, such as MPI.ESMP and ECMWF. With the exception of the few models noted above, none of the models simulates precipitation exceptionally well globally.
Fig. 2 M1 score for precipitation (top) and temperature (bottom) from the historical simulations. Color scale represents quartiles based on each season and region
Table 1 Standard deviation of M1-scores for all models

| Region | Precip DJF | Precip JJA | Precip ANN | Temp DJF | Temp JJA | Temp ANN |
|---|---|---|---|---|---|---|
| Africa | 0.073 (0.658) | 0.067 (0.655) | 0.059 (0.647) | 0.046 (0.762) | 0.045 (0.736) | 0.067 (0.690) |
| Central America | 0.104 (0.433) | 0.094 (0.419) | 0.108 (0.280) | 0.042 (0.751) | 0.090 (0.469) | 0.068 (0.595) |
| HYMEX-MED | 0.067 (0.479) | 0.113 (0.579) | 0.051 (0.540) | 0.069 (0.714) | 0.100 (0.676) | 0.070 (0.733) |
| South America | 0.086 (0.367) | 0.093 (0.464) | 0.093 (0.295) | 0.068 (0.655) | 0.043 (0.784) | 0.049 (0.744) |
| West Asia | 0.070 (0.634) | 0.066 (0.544) | 0.061 (0.561) | 0.030 (0.816) | 0.048 (0.715) | 0.035 (0.796) |

Mean values are given in parentheses
The standard deviations of the M1 scores are presented as a proxy for variability in model performance (Table 1). West Asia and Africa have the smallest standard deviations, ranging from 0.059 to 0.073, indicating that there is relatively little difference in skill among the models over those regions. In contrast, the models display the largest variability in skill over South and Central America, as well as over the Mediterranean region during JJA, where standard deviations range from 0.086 to 0.113.

We also find that resolution has a clear impact on model skill. In all cases, the average M1-score for precipitation is greater for the high resolution than for the low resolution models (Table 2). These differences are statistically significant annually for all domains. Among these cases, the difference in M1-score ranges from 0.032 in HYMEX-MED to 0.069 in Central America.

Table 2 Average M1-score for low and high resolution models

| Region | Resolution | Precip DJF | Precip JJA | Precip ANN | Temp DJF | Temp JJA | Temp ANN |
|---|---|---|---|---|---|---|---|
| Africa | Low | 0.631 | 0.633 | 0.624 | 0.741 | 0.709 | 0.647 |
| Africa | High | 0.687 | 0.678 | 0.671 | 0.784 | 0.765 | 0.734 |
| Central America | Low | 0.421 | 0.400 | 0.246 | 0.736 | 0.417 | 0.561 |
| Central America | High | 0.445 | 0.438 | 0.315 | 0.767 | 0.523 | 0.630 |
| HYMEX-MED | Low | 0.469 | 0.544 | 0.574 | 0.694 | 0.652 | 0.713 |
| HYMEX-MED | High | 0.490 | 0.615 | 0.606 | 0.735 | 0.700 | 0.753 |
| South America | Low | 0.356 | 0.452 | 0.270 | 0.610 | 0.760 | 0.713 |
| South America | High | 0.379 | 0.476 | 0.320 | 0.701 | 0.809 | 0.775 |
| West Asia | Low | 0.620 | 0.530 | 0.544 | 0.803 | 0.683 | 0.770 |
| West Asia | High | 0.648 | 0.559 | 0.578 | 0.831 | 0.749 | 0.822 |

Values in bold represent means in which the differences are statistically significant

3.1.2 Temperature

In general, the M1-scores indicate that model performance for temperature tends to be more consistent across domains and seasons than for precipitation (Fig. 2), although there are still quite a few exceptions in which a model performs well in some cases and poorly in others. As with precipitation, the NCAR models (again with the exception of NCAR.CESM1-WACCM) stand out as consistently having the highest scores in most cases. The UKMO models, which had high precipitation M1-scores globally, fare slightly worse for temperature but still perform relatively well, in particular the HADGEM2AO and HADCM3 models. Other models which perform relatively well globally include MRI.CGCM3, the MPI models, IPSL.CM5AMR, CSIRO.AC10, the CMCC models, and CCSR.MIROC4H.

Among the domains, the largest variability of skill in simulating temperature is found during JJA in the Central America and HYMEX-MED domains, while West Asia has the smallest variability, with an annual standard deviation of 0.035 (Table 1). The impact of resolution on model skill in simulating temperature is consistent among the models as well (Table 2). Annually, the average increase in skill for the higher resolution models is 0.062. Almost all of the cases are statistically significant, the exception being HYMEX-MED during JJA.

3.2 Evaluation of projected climate change

In this section we use the M2 score to evaluate the degree of change in surface temperature and precipitation that occurs under projected GHG warming among the models and across the different regions. Climatologies from the future period (2070–2099) for both the RCP4.5 and RCP8.5 scenarios are compared to a reference climatology based on the years 1976–2005. Portrait plots of the M2 scores for the RCP8.5 simulations are presented in Fig. 3. High (low) M2 scores are represented by red (blue) colors, indicating higher (lower) climate sensitivity. Although the magnitudes differ, the overall patterns of the RCP4.5 M2 scores are quite similar to those of RCP8.5; the RCP4.5 portrait plots are therefore presented only as online supplementary material (Supplemental Figure 1). Descriptive statistics (range and median values) are provided in Supplemental Table 3. To quantify the amount of model agreement in change for each region and season, standard deviations of the M2 scores are provided in Table 3.

3.2.1 Precipitation

The M2-scores for precipitation are presented in Fig. 3. The models with the highest M2 scores globally include IPSL.CM5AMR, IPSL.CM5ALR, the CMCC models, and CCCMA.ESM2. These models show the largest response in precipitation to GHG warming across nearly all regions and seasons. In contrast, NCAR.CESM1BGC, NCAR.CCSM4, IPSL.CM5BLR, CNRM.CM5, and ECMWF.EC-EARTH have the lowest climate sensitivity globally. The rest of the models exhibit varying degrees of sensitivity to increases in GHG depending on the region and season.

Standard deviations of the M2-scores (Table 3) range between 0.024 and 0.055 across most regions and seasons, indicating that the spread in climate sensitivity is similar for most cases. The exception is the Mediterranean region, which has the highest standard deviation during JJA (more than twice that of the other regions), meaning that, relative to the other regions, there is high inter-model variability in projected precipitation change in this season.

Although not shown here, we found no statistically significant differences between the M2 scores of the high and low resolution models, suggesting that, although the higher resolution models perform better, model resolution does not have an impact on a model's response to GHG forcing, at least in terms of precipitation. This also suggests that model biases do not affect the projected changes, as also found by Giorgi and Coppola (2010) for the CMIP3 ensemble.
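Both here and in Section 3.1, statistical significance of the difference between the high and low resolution groups is reported, but the paper does not name the test used; a two-sample (Welch) t-test is one natural choice. A sketch with SciPy, using purely illustrative score values:

```python
import numpy as np
from scipy import stats

# Per-model M-scores for one region and season, split at the 2-degree cutoff.
# The values below are hypothetical, for illustration only.
m1_high = np.array([0.687, 0.671, 0.665, 0.702])  # models finer than 2 degrees
m1_low = np.array([0.631, 0.624, 0.618, 0.640])   # models coarser than 2 degrees

t_stat, p_value = stats.ttest_ind(m1_high, m1_low, equal_var=False)  # Welch test
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")     # significant if p < 0.05
```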
Fig. 3 M2 score for precipitation (top) and temperature (bottom) from the RCP8.5 simulations. Color scale represents quartiles based on each season and region
Table 3 Standard deviation of M2-scores for all RCP8.5 models

| Region | Precip DJF | Precip JJA | Precip ANN | Temp DJF | Temp JJA | Temp ANN |
|---|---|---|---|---|---|---|
| Africa | 0.024 (0.118) | 0.043 (0.126) | 0.031 (0.119) | 0.047 (0.343) | 0.068 (0.454) | 0.074 (0.549) |
| Central America | 0.034 (0.165) | 0.055 (0.178) | 0.050 (0.186) | 0.063 (0.357) | 0.077 (0.672) | 0.085 (0.582) |
| HYMEX-MED | 0.054 (0.206) | 0.126 (0.275) | 0.038 (0.158) | 0.077 (0.441) | 0.113 (0.594) | 0.093 (0.531) |
| South America | 0.036 (0.173) | 0.051 (0.161) | 0.050 (0.187) | 0.060 (0.440) | 0.051 (0.307) | 0.056 (0.372) |
| West Asia | 0.029 (0.134) | 0.031 (0.150) | 0.034 (0.141) | 0.036 (0.221) | 0.066 (0.391) | 0.050 (0.303) |

Mean values are given in parentheses
3.2.2 Temperature

The M2 scores for temperature, which reflect the amount of regional warming, are tied to a model's global climate sensitivity, since it has been shown by various authors that regional temperature change scales well with global climate sensitivity (Giorgi 2008; Mitchell 2003). This is evident in the high level of cross-regional agreement seen in Fig. 3. The UKMO models, the IPSL.CM5A models, GFDL.CM3, the CCSR.ESM models, and CCCMA.ESM2 have the highest climate sensitivity across all regions and seasons. In contrast, MRI.CGCM3, INM.CM4, GISS.E2R, the GFDL.ESM models, and CNRM.CM5 have the lowest climate sensitivity globally. A few models, such as the IPSL.CM5A models and CCCMA.ESM2, exhibit a large response to GHG warming for both temperature and precipitation.

The M2 score standard deviations range from 0.036 to 0.113 for temperature (Table 3). The greatest variability occurs in the Mediterranean region during all seasons, which may in part be influenced by the relatively small size of the domain. We find that resolution does not have a significant impact on the response of temperature to GHG forcing, again indicating the relative independence of the projected changes from model performance.

3.3 Resemblance to MME

Here we use the M3 score to measure how close the projected change of each model is to that of the MME. This information is useful, for example, when attempting to select models that are representative of the ensemble mean changes. Portrait plots of M3 scores for the RCP8.5 scenario are presented in Fig. 4 (RCP4.5 results are presented in Supplemental Figure 2). High (low) M3 scores, represented by red (blue) colors, indicate that the change projected by a particular model is similar (dissimilar) to the MME. Statistics (min, max, median) of the M3 scores are given in Supplemental Table 4.

Fig. 4 M3 score for precipitation (top) and temperature (bottom) from the RCP8.5 simulations. Color scale represents quartiles based on each season and region

3.3.1 Precipitation

In general, the M3 scores for precipitation change range from about zero to almost 0.80 (Supplemental Table 4). In DJF, the Mediterranean region has the highest median score, indicating that overall the models are in better agreement with the MME there, or in other words, that there is less uncertainty in projected precipitation change patterns than in the other regions. The median score is substantially higher in DJF than in JJA over the Africa and Mediterranean regions, suggesting less uncertainty during DJF. For the other regions, the median scores are similar during both seasons.

Clear patterns emerge in Fig. 4 identifying those models which consistently produce changes similar to the MME, as well as those which do not. Namely, the UKMO, NOR, NCAR, and CMCC models project changes which are relatively similar to the MME. In contrast, the IPSL.CM5AMR, IPSL.CM5ALR, INM, GISS, CSIRO.MK6, CSIRO.AC13, CCSR.ESM-CHEM, CCSR.ESM, CCCMA.ESM2 and BNU.ESM models are dissimilar to the MME in their projected change. The remaining models show mixed results, with changes similar to the MME in some regions and seasons but not in others. For example, the MPI.ESMMR model behaves very similarly to the MME in DJF, having M3 scores in the upper quartiles, but less so during the summer, where its M3 scores are in the lower 50th percentile.

3.3.2 Temperature

The range of M3 scores for temperature change is generally similar to that for precipitation, except during JJA, where the maximum and median values are slightly higher (Supplemental Table 4). The highest median scores are found in the Mediterranean region during DJF, and in Africa, South America and West Asia during JJA. The models which project temperature changes similar (or dissimilar) to the MME are easily identifiable in the portrait plots (Fig. 4). In almost every case, the UKMO.HADGEM2AO, NCAR, MPI, CSIRO and BCC models have M3 scores in the upper 50th percentile, indicating a change relatively similar to the MME. On the other hand, the following models stand out as dissimilar, with M3 scores in the bottom 50th percentile in most regions and seasons: UKMO.HADGEM2ES, UKMO.HADGEM2CC, MRI.CGCM3, IPSL.CM5AMR, INM.CM4, GISS.E2R, GFDL.ESM, GFDL.ESM2M and CCSR.ESM-CHEM.

3.4 The CREMA GCMs within the context of the full CMIP5 ensemble

The selection of the CREMA GCMs was based on the availability of a relatively small set of models that had provided the 6-hourly boundary fields necessary for running the RegCM (Giorgi et al. 2012) by the beginning of the experiment. After some specific RegCM tests over the different CORDEX domains, based on the performance of the GCM-driven RegCM runs we selected three GCMs: HADGEM2ES (Africa, HYMEX-MED, S. America and C. America domains), MPI-ESMMR (all domains) and GFDL-ESM2M (W. Asia and S. America domains). The two main driving GCMs were therefore HADGEM2ES and MPI-ESMMR; however, HADGEM2ES produced substantially underestimated monsoon precipitation over the Indian subcontinent, leading to a corresponding large underestimation in the nested RegCM, and thus was not used as driver for the West Asia domain. GFDL-ESM2M was used as a third GCM to better explore the inter-model source of uncertainty in a subset of domains where it performed better. In this section we attempt to place these three GCMs within the context of the broader CMIP5 ensemble.

In terms of model performance, Fig. 2 shows that HADGEM2ES exhibits overall some of the highest M1 scores of all the CMIP5 GCMs; it is therefore one of the
better performing models in terms of reproducing present day precipitation patterns. Its skill for temperature is somewhat lower, but still mostly in the upper quartiles of the full ensemble of cases. In addition, HADGEM2ES is a relatively high sensitivity model (see Fig. 3, bottom panel), and thus represents a high end response member. Finally, the precipitation response patterns in HADGEM2ES are mostly in line with the MME, although less so for temperature. In summary, HADGEM2ES shows good performance, has a high sensitivity, and is well representative of the ensemble, particularly for precipitation.

The other model used in the RCP4.5 and RCP8.5 scenarios is MPI-ESMMR. Figure 2 indicates a medium performance in terms of precipitation, but a good one in terms of temperature, together with a medium climate sensitivity compared to HADGEM2ES (Fig. 3) and reasonably good consistency with the MME (Fig. 4). MPI-ESMMR is thus also a generally good performing model, but with lower climate sensitivity.

Finally, GFDL-ESM2M shows reasonably good performance over the two domains where it is used (Fig. 2), particularly over the Indian subcontinent, and a low climate sensitivity (Fig. 3), although some of its regional responses differ from the MME (Fig. 4).

In summary, the three models selected are characterized by a generally good performance in terms of M1 score, a range of climate sensitivities and a variety of similarities in regional response compared to the MME. Given the relatively small size of the CREMA Phase I ensemble, they thus offer a good spectrum of responses within the full CMIP5 ensemble.
4 Discussion and conclusions

This paper presents an assessment of the CMIP5 global model simulations over a subset of CORDEX domains used in the Phase I CREMA experiment. Three variations of the transformed Mielke measure, M, are used to assess (1) the models' skill in simulating surface temperature and precipitation historical climatology (1976–2005), (2) the degree of surface temperature and precipitation change occurring under greenhouse gas warming, and (3) the consistency of a model's projected change with that of the ensemble mean. These are all important criteria for the selection of GCMs in nested RCM experiments. Furthermore, the impact of resolution on model skill and response to GHG forcing is evaluated.

In terms of precipitation, we find that in general the relative skill of the models depends on the region and season. However, as measured by the M1 score, a few models stand out as performing well across most regions. These include the UKMO models, the NCAR models and CSIRO.AC10. We also find that resolution has a statistically significant impact on precipitation skill in DJF and ANN (with higher skill at higher resolution), but not in JJA.

For temperature, we find more consistency in model performance across domains than for precipitation, although many models still exhibit considerable variations in performance depending on the region and season. Models with high M1 scores globally include the UKMO models (HADGEM2AO and HADCM3), the NCAR models (all except NCAR.CESM1-WACCM), MRI.CGCM3, the MPI models, IPSL.CM5AMR, CSIRO.AC10, the CMCC models and CCSR.MIROC4H. We find that resolution impacts temperature skill in JJA and ANN, but not DJF.
We also find considerable inter-model variability in regional precipitation changes resulting from GHG forcing, although some models, such as the IPSL.CM5A models, the CMCC models and CCCMA.ESM2, exhibit large changes across most regions. Conversely, NCAR.CESM1-BGC, NCAR.CCSM4, IPSL.CM5BLR, CNRM.CM5 and ECMWF.EC-EARTH all exhibit small changes in precipitation across most regions. For temperature, there is clearly a high level of cross-regional agreement: the UKMO models, the IPSL.CM5A models, GFDL.CM3, the CCSR.ESM models and CCCMA.ESM2 show the highest climate sensitivity globally, while MRI.CGCM3, INM.CM4, GISS.E2R, the GFDL.ESM models and CNRM.CM5 have the lowest. We find that resolution does not have a statistically significant impact on the models' response to GHG forcing which, given that resolution does affect model skill, implies that model biases do not play a primary role in shaping the model response to GHG forcing.

The UKMO, NOR, NCAR and CMCC models project precipitation changes which are most similar to the MME, while for temperature the changes projected by the NCAR, MPI, CSIRO and BCC models most resemble the MME. The Mediterranean region has the least uncertainty associated with the projected changes in temperature and precipitation, as indicated by relatively high M3 scores showing that overall the models are in better agreement with the MME there than in the other regions.

Finally, we assessed the three models selected for the CREMA Phase I experiment, HADGEM2ES, MPI-ESMMR and GFDL-ESM2M, and showed that they are characterized by a relatively good level of performance (M1 score), a range of high to low climate sensitivities (M2 score) and reasonably good consistency with the MME (M3 score). Given the necessarily small size of the initial CREMA ensemble, they therefore provide a reasonably representative sample of the full CMIP5 ensemble.
References

Giorgi F, Coppola E, Solmon F, Mariotti L, Sylla M, Bi X, Elguindi N et al (2012) RegCM4: model description and preliminary tests over multiple CORDEX domains. Clim Res. doi:10.3354/cr01018

Giorgi F, Coppola E (2010) Does the model regional bias affect the projected regional climate change? An analysis of global model projections. Clim Chang Lett. doi:10.1007/s10584-010-9864-z

Giorgi F, Jones C, Asrar G (2009) Addressing climate information needs at the regional level: the CORDEX framework. WMO Bull 58:175–183

Giorgi F (2008) A simple equation for regional climate change and associated uncertainties. J Clim 21:1589–1604

Giorgi F, Mearns L (1999) Regional climate modeling revisited: an introduction to the special issue. J Geophys Res 104:6335–6352

Laprise R, de Elía R, Caya D, Biner S, Lucas-Picher P, Diaconescu E, Leduc M, Alexandru A, Separovic L (2008) Challenging some tenets of regional climate modeling. Meteorol Atmos Phys 100:3–22

Meehl G, Stocker T, Collins W, Friedlingstein P, Gaye A, Gregory J, Kitoh A, Knutti R, Murphy J, Noda A, Raper S, Watterson I, Weaver A, Zhao ZC (2007) Global climate projections. In: Solomon S, Qin D, Manning M, Chen Z, Marquis M, Averyt KB, Tignor M, Miller HL (eds) Climate change 2007: the physical science basis. Cambridge University Press, pp 747–845

Mitchell T (2003) Pattern scaling. An examination of the accuracy of the technique for describing future climate. Clim Chang 60:217–242

New M, Hulme M, Jones P (2000) Representing twentieth century space time climate fields part II: development of a 1901–1996 mean monthly terrestrial climatology. J Clim 13:2217–2238

Taylor K, Stouffer R, Meehl G (2012) An overview of CMIP5 and the experiment design. Bull Am Meteorol Soc. doi:10.1175/BAMS-D-11-00094.1

Watterson I (1996) Non-dimensional measures of climate model performance. Int J Climatol 16:379–391