(BWUK AVSC 1100 Webpdf:=07/19/2010 03:34:57 2315247 Bytes 19 PAGES n operator=n.bhuvaneswari) 7/19/2010 3:56:08 PM

Applied Vegetation Science ]]: 1–19, 2010. DOI: 10.1111/j.1654-109X.2010.01100.x. © 2010 International Association for Vegetation Science


Remotely sensed vegetation phenology for describing and predicting the biomes of South Africa

Konrad Wessels, Karen Steenkamp, Graham von Maltitz & Sally Archibald

Wessels, K. & Steenkamp, K. (Corresponding author, [email protected]): CSIR Meraka Institute, PO Box 395, Pretoria 0001, South Africa.
von Maltitz, G. & Archibald, S.: CSIR Natural Resources and the Environment, PO Box 395, Pretoria 0001, South Africa.

Abstract

Questions: What are the patterns of remotely sensed vegetation phenology, including their inter-annual variability, across South Africa? What are the phenological attributes that contribute most to distinguishing the different biomes? How well can the distribution of the recently redefined biomes be predicted using a regression tree based on remotely sensed phenology and productivity metrics?

Location: Southern Africa, south of 12°30′00″S and east of 34°12′00″E; however, the biome-based analysis focused solely on South Africa.

Method: Ten-day, 1 km, normalized difference vegetation index (NDVI) data from the Advanced Very High Resolution Radiometer (AVHRR) were analysed for the period 1985–2000. Phenological metrics such as the start, end and length of the growing season, and estimates of net primary production based on the small and large integrals (SI, LI) of the NDVI curve, were calculated for each season. From these, the long-term means, coefficients of variation and standard deviations were calculated for each metric. A random forest regression tree was run using a range of phenology and productivity metrics as the input variables and the South African biomes as the dependent variable. The importance of the different phenology metrics in differentiating the biomes was assessed. A map of the biomes was reproduced using the final regression tree based on remotely sensed metrics of phenology and productivity.

Results: Phenology metrics showed a clear relationship with the known seasonality of rainfall, i.e. a winter growing season along the south-western and west coast versus a summer growing season in the rest of the sub-continent. The distributions of the LI and SI were significantly correlated with mean annual precipitation. The regression tree initially split the biomes based on vegetation production and then by the seasonality of growth. The difference in the biome-specific responses given by the LI and SI can be used as a key variable for differentiating biomes. A regression tree was used to produce a predicted biome map with a high level of accuracy (75%).

Conclusion: The long-term mean phenology metrics gave ecologically meaningful results reflecting the spatial patterns of production and seasonality of vegetation growth in southern Africa. The current paper used advanced methods for extracting phenology metrics, including the start, end and length of growing season, as well as their inter-annual variability. Regression tree analysis based on remotely sensed phenology and productivity metrics performed as well as, or better than, previous climate-based predictors of biome distribution. The results confirm that the metrics capture sufficient functional diversity to classify and map biome-level vegetation patterns and function. Remotely sensed phenology and productivity metrics can, and thus should, play an indispensable role in the production of regional vegetation maps (biome or ecosystem functional types) and in understanding the predominant differences in their function in relation to bioclimatic drivers.

Keywords: AVHRR; Biomes; NDVI; Net primary production; Phenology; Regression tree; Vegetation mapping.
Abbreviations: AMP = Seasonal amplitude; AVHRR = Advanced Very High Resolution Radiometer; BASE = Base level; CART = Classification and regression trees; CV = Coefficient of variation; DST = Department of Science and Technology; END = End date of growing season; ENSO = El Niño–Southern Oscillation; Evap = Annual mean (potential) evaporation; FPAR = Fraction of Photosynthetically Active Radiation; LAC = Local Area Coverage; LI = Large seasonal integral; LENGTH = Length of growing season; MAP = Mean Annual Precipitation; MAX = Maximum NDVI; MID = Mid position date of season; NDVI = Normalised Difference Vegetation Index; NLC2000 = South African National Land Cover 2000; NPP = Net primary production; SD = Standard deviation; SM = Soil moisture; SI = Small seasonal integral; SRP = CSIR Strategic Research Panel; START = Start date of growing season; Tmin = Mean minimum temperature of the coldest month.

Introduction

Remotely sensed vegetation phenology data have been much touted for their potential to provide spatially explicit, temporally rich information on vegetation dynamics and patterns at landscape to regional scales (Justice et al. 1985; Jönsson & Eklundh 2004). These data provide an integrated measure of the responses of vegetation to climatic factors such as temperature and rainfall, as well as to fire and other disturbances. The normalized difference vegetation index (NDVI) calculated from satellite data provides an estimate of the fraction of photosynthetically active radiation (FPAR) absorbed by vegetation (Myneni & Williams 1994). Phenological metrics are derived from NDVI time-series data and include the start, end and length of the growing season, from which seasonal net primary production (NPP) can be estimated. This landscape-scale, spatio-temporal development of the vegetated land surface is often referred to as ‘land surface phenology’, as the remotely sensed phenology deals with mixtures of land cover and vegetation (de Beurs & Henebry 2005; Brown & de Beurs 2008). Remotely sensed land surface phenology has been used for assessing the impact of climate change on the seasonality of vegetation (Myneni et al. 1997; Osborne & Woodward 2001; Zhou et al. 2003; Reed 2006), as an input to biophysical land-surface models (Denning et al. 2003) and to improve our understanding of the drivers of vegetation growth (Jolly & Running 2004; Archibald & Scholes 2007). Moreover, these data provide information on key aspects of vegetation function, such as NPP, seasonality and inter-annual variation in vegetation productivity. There is therefore tremendous potential for characterizing, classifying and mapping vegetation based on remotely sensed phenology (Ferreira & Huete 2004; Hoare & Frost 2004; Paruelo et al. 2004; Alcaraz et al. 2006; Karlsen et al. 2006).
Historically, vegetation functional units have often been defined from the ‘bottom-up’ using field data on vegetation composition as a point of departure (Blydenstein 1967). Floristically distinct units (which are assumed also to be functional units) are aggregated into groups that share similar broad-scale vegetation drivers such as climate, topography and soils (Mucina & Rutherford 2006). This approach has several disadvantages, however. It primarily maps potential rather than current vegetation (Paruelo et al. 2001) since it relies on broad-scale abiotic data to aggregate vegetation units (Rutherford & Westfall 1994), which may be inadequate where major drivers of vegetation patterns include localized disturbances such as fire, herbivory or human activities (Bond et al. 2003; Woodward et al. 2004). Moreover, the extent to which vegetation maps rely on abiotic rather than floristic information depends largely on the amount of information available in an area; thus in southern Africa there is a great disparity in the quality and detail of mapping across the region. Finally, mapping efforts that rely heavily on field-derived floristic information are not easy to update when the vegetation at a location changes, and such maps cannot be used as real-time indications of the vegetated state of a region.

Remotely sensed land surface phenology data have the potential to resolve some of these vegetation-mapping problems, as they capture the spatial patterns of vegetation dynamics through repetitive observations over vast areas (Jönsson & Eklundh 2002). Moreover, as mentioned above, these data are directly related to key aspects of vegetation function such as seasonality, productivity and inter-annual variability (Nemani et al. 2003). Furthermore, vegetation function responds faster to environmental change and variability than vegetation structure and composition (Paruelo et al. 2001). Most biomes in South Africa (Fig. 1a) are characterized by high inter-seasonal variability in precipitation, with the coefficient of variation in rainfall ranging between 20% and 40% (Schulze 2007).
Fig. 1. (a) South African biomes after Mucina & Rutherford (2006). (b) Areas with mean NDVI amplitude less than 0.1. (c) Mean annual precipitation in mm (Schulze 2007).

Unlike previous studies in this region, which focused on describing the short-term seasonal patterns of land surface phenology in relation to rainfall (Vanacker et al. 2005; Lupo et al. 2007), the current study analysed long-term means and variance of phenology metrics derived from 1 km² Advanced Very High Resolution Radiometer (AVHRR) NDVI data spanning a 15-yr period, thus taking a more representative approach to investigating the long-term vegetation patterns and their inter-annual variability. In this study a recently updated biome map of South Africa (Mucina & Rutherford 2006) is used to test whether the phenology metrics can differentiate these biomes based on their function (Paruelo et al. 2001, 2004). The biome map was compiled by hierarchically grouping floristically defined vegetation units into broader classes (i.e. bioregions, then biomes) (Mucina & Rutherford 2006). In this paper, regression-tree analysis was used to identify the phenology metrics that are most characteristic of each biome and to predictively map the distribution of the biomes. Mucina & Rutherford (2006) used a similar regression-tree approach to explore how well the biomes relate to climate variables, and their climate-based tree is compared here with a phenology-based regression tree. If the remotely sensed phenological data can identify and distinguish different biomes as well as, or better than, the climate-based models, these methods have tremendous potential to be used both in finer-scale classification of vegetation functional types and for monitoring large-scale changes in the distribution of these biomes/vegetation types with changing climates (Ellery et al. 1991). They would also be useful for mapping vegetation over large regions that lack standardized, fine-scaled floristic information. While similar vegetation classification studies have relied on basic NDVI-derived indicators of phenology and productivity (e.g. annual integral NDVI, intra-annual range of NDVI and month of maximum NDVI) (Garbulsky & Paruelo 2004; Hoare & Frost 2004; Paruelo et al. 2004; Alcaraz et al. 2006), the current paper uses more advanced methods for extracting phenology and productivity metrics (Jönsson & Eklundh 2002, 2004), including the start, end and length of growing season, as well as their inter-annual variability.

This paper aims to:
1. Describe patterns of remotely sensed land surface phenology, including their inter-annual variability, across South Africa.
2. Identify the phenological attributes that distinguish between the recently redefined biomes of South Africa.
3. Test how well remotely sensed land surface phenology can predict the distribution of these biomes using a regression tree analysis.


Methodology

Biome map of South Africa

A biome is generally viewed as a high-level hierarchical unit having a similar vegetation structure, from a functional and physiological perspective (Woodward et al. 2004; Rutherford et al. 2006). Biomes are thought to have resulted from similar regional macroclimatic patterns, associated with longer-term time-scales, although there is a complex interplay between evolutionary (long-term) and ecological (short-term) time-scales (Rutherford et al. 2006). The recently revised biome map of South Africa was compiled using a bottom-up approach, by hierarchically grouping floristically defined vegetation units into broader classes (i.e. bioregions, then biomes) (Mucina & Rutherford 2006). Vegetation units were defined based on the classification of volumes of historical and recent plant distribution data, after which the local geographic distribution of the units was mapped using climate (Schulze 2007), geology, soil and topography data ranging from 1:50 000 to 1:250 000 scale. The vegetation units (types) were grouped into a nested hierarchy based on similarity in their structure, floristic composition and physico-geographic traits. After the biome map was derived, the diagnostic climatic explanations of the biomes were investigated using a decision-tree analysis (Rutherford et al. 2006). The latest biome map does not imply changes in biome distribution when compared with the 1996 biome map (Low & Rebelo 1996), but instead represents a collaborative revision based on new definitions and methods.

1 km² AVHRR data

The local area coverage (LAC) (1 km² resolution) Advanced Very High Resolution Radiometer (AVHRR) data were received at the CSIR Satellite Application Centre at Hartebeesthoek in South Africa. The daily data were processed to 10-day maximum value composites. Data were consistently processed and calibrated to correct for sensor degradation and satellite changes (Rao & Chen 1995, 1996). Only AVHRR data for 1985 to 2000 were used, to avoid differences in the spectral response function of NOAA-16 in relation to previous sensors (van Leeuwen et al. 2006). The data set could not be corrected for sensor orbital drift, which causes trends in the sun-target-view geometry and associated residual variability in NDVI (Privette et al. 1995). Owing to the failure of NOAA-13 shortly after its launch, data were not available for 1994. Therefore the 15-yr period investigated (1985–2000) contained data for only 13 full growing seasons. For further details on processing see Wessels et al. (2006). Although the periods 1985–1993 (288 decads) and 1995–2000 (180 decads) were analysed independently, the statistics were calculated across both periods.

Calculating phenology and productivity metrics from AVHRR data

A variety of methods have been used to extract land surface phenology metrics from time-series data (Reed et al. 2003; de Beurs & Henebry 2005; Zhang et al. 2006; White et al. 2009). A widely used time-series analysis program, TIMESAT, was used to calculate phenology metrics from the AVHRR data (Jönsson & Eklundh 2002, 2004). The threshold method used by TIMESAT provides a robust and computationally simple method for identifying the start and end of the growing season (Heumann et al. 2007). Furthermore, the threshold method consistently identifies the same 10-day composite periods as the start and end of the growing season as more complex methods do (Wessels et al. 2009). A user-defined threshold of 20% of the seasonal amplitude, as measured from the left minimum of the seasonal curve, was set as the start of growing season (START) (Fig. 2). Similarly, the end of growing season (END) was defined as the date at which the right edge has declined to 20% of the seasonal amplitude, as measured from the right minimum. The other seasonal phenology metrics were calculated accordingly (Fig. 2, Table 1). The 20% threshold was chosen using TIMESAT’s graphical user interface in Matlab to visually inspect the impact of different thresholds across all biomes on the START and END dates displayed on the fitted time-series curves for individual random pixels. It is well established that NDVI is correlated with photosynthetic activity, and therefore the NDVI summed over the growth season can be used as an estimate of NPP (Prince 1991; Myneni et al. 1997; Diouf & Lambin 2001; Wessels et al. 2006). The large integral (LI) is an estimate of total vegetation production from the zero level, whereas the small integral (SI) is a measure of the vegetation production within a growing season, calculated above the BASE value (i.e. the average of the left and right minima of the curve; Jönsson & Eklundh 2004). In evergreen areas the SI (within-growing-season production) will be small while the LI (total vegetation production) will be large (Jönsson & Eklundh 2004).


[Figure 2 plot: seasonal NDVI curve (y-axis: NDVI, 0 to 0.8) against months (x-axis), annotated with the points a to i described in the caption.]

Fig. 2. Phenology metrics extracted from the seasonal normalized difference vegetation index (NDVI) curve, as defined in TIMESAT. Redrawn from Jönsson & Eklundh (2004). (a) Start of season (START), (b) End of season (END), (c) Length of season (LENGTH), (d) Mid position of season (MID), (e) Maximum NDVI (MAX), (f) Base level (BASE), (g) Seasonal amplitude (AMP), (h) Small seasonal integral (SI), (i) Large seasonal integral (LI). See Table 1 for details.
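The metric definitions in Fig. 2 can be illustrated with a minimal Python sketch. It is not TIMESAT's implementation: the function name `season_metrics`, the linear interpolation of threshold crossings, the reading of the threshold as a fraction of (MAX minus the left or right minimum), and the use of sums over whole decads in place of integrals are all assumptions made for illustration, and the two NDVI curves are invented.

```python
import math


def _crossing(t0, y0, t1, y1, level):
    """Linearly interpolate the time at which the curve reaches `level`."""
    return float(t0) if y1 == y0 else t0 + (level - y0) * (t1 - t0) / (y1 - y0)


def season_metrics(ndvi, frac=0.2):
    """Metrics for ONE season of a smoothed NDVI curve, one value per decad.

    `ndvi` is assumed to run from the minimum before the peak to the minimum
    after it. Dates are returned as fractional decad indices into `ndvi`.
    """
    peak = max(range(len(ndvi)), key=lambda i: ndvi[i])
    vmax = ndvi[peak]
    left_min, right_min = min(ndvi[:peak + 1]), min(ndvi[peak:])
    base = (left_min + right_min) / 2.0

    def rising(fraction):
        # Assumption: the threshold is a fraction of (MAX - left minimum).
        level = left_min + fraction * (vmax - left_min)
        for i in range(1, peak + 1):
            if ndvi[i - 1] < level <= ndvi[i]:
                return _crossing(i - 1, ndvi[i - 1], i, ndvi[i], level)
        return 0.0

    def falling(fraction):
        # Assumption: the threshold is a fraction of (MAX - right minimum).
        level = right_min + fraction * (vmax - right_min)
        for i in range(len(ndvi) - 1, peak, -1):
            if ndvi[i] < level <= ndvi[i - 1]:
                return _crossing(i - 1, ndvi[i - 1], i, ndvi[i], level)
        return float(len(ndvi) - 1)

    start, end = rising(frac), falling(frac)
    in_season = range(math.ceil(start), math.floor(end) + 1)
    return {
        "START": start, "END": end, "LENGTH": end - start,
        "MID": (rising(0.8) + falling(0.8)) / 2.0,
        "MAX": vmax, "BASE": base, "AMP": vmax - base,
        # Discrete sums over whole decads stand in for the integrals.
        "LI": sum(ndvi[i] for i in in_season),
        "SI": sum(ndvi[i] - base for i in in_season),
    }


# Invented curves: same seasonal shape, different dry-season base level.
arid = [0.0, 0.125, 0.25, 0.375, 0.5, 0.375, 0.25, 0.125, 0.0]
evergreen = [v + 0.25 for v in arid]
print(season_metrics(arid))
print(season_metrics(evergreen))
```

In this toy case both curves have the same SI, while LI is larger for the curve with the higher BASE, which is the contrast between SI and LI that the text relies on.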

Table 1. Definitions of metrics shown in Fig. 2, after Jönsson & Eklundh (2004). NDVI = normalized difference vegetation index.

Phenology metrics
a. Start of growing season (START): increase to 20% of seasonal amplitude as measured from the left minimum of the curve
b. End of growing season (END): decrease to 20% of seasonal amplitude as measured from the right minimum of the curve
c. Length of growing season (LENGTH): length of time from START to END
d. Mid-position of season (MID): mean value of the dates for which the left edge increased to 80% and the right edge decreased to 80%

Productivity metrics
e. Maximum NDVI (MAX): largest data value for the fitted function during the season
f. Base level (BASE): average of the left and right minima of the curve
g. Seasonal amplitude (AMP): difference between MAX and BASE
h. Small seasonal integral (SI): integral of growing season calculated between the fitted function and the BASE
i. Large seasonal integral (LI): integral of growing season calculated between the fitted function and the zero level

In contrast, an arid grassland will have an SI and LI that are very similar in magnitude because the BASE value is close to the zero level in the dry season. The START, END, length of growing season (LENGTH) and mid position (MID) are hereafter referred to as ‘phenology metrics’, while LI, SI, MAX, AMP and BASE are referred to as ‘productivity metrics’ (Table 1). The adaptive Savitzky–Golay filter, which uses a local polynomial fit together with small moving windows in two fitting steps, proved to be the most appropriate curve-fitting algorithm as it closely models the raw data while capturing sudden rises in data values (Chen et al. 2004; Bachoo & Archibald 2007). Window widths of 3 and 4 data points were used in the successive fitting steps. TIMESAT has the capability to detect multiple growth seasons per year. However, vegetation in South Africa has only a single growing season in any one area. Initial investigations indicated that only a few scattered pixels occasionally exhibited secondary growth peaks in the same year. The amplitudes of the detected secondary peaks were very low, and therefore the number of growth seasons was assumed to be 1 by setting the ‘season cut-off value’ to 1 in TIMESAT. The long-term mean was calculated for all metrics, while inter-annual variability of (1) the phenology metrics was expressed as standard deviation (SD) in days and (2) that of the productivity metrics as coefficient of variation (CV). The CV allowed the inter-annual variability of LI and SI to be expressed as a percentage of the mean productivity. Although the time series includes 13 growth cycles, the full set of 13 seasonal values could not always be resolved for each pixel owing to periodic very low NDVI values caused by erratic rainfall in arid areas. Although Heumann et al. (2007) caution against the use of phenology metrics extracted from pixels with very low mean amplitude values (Fig. 1b, NDVI < 0.1), the long-term mean and SD of the resolved phenology metrics provided plausible results.

Data extraction

Transformed areas such as cultivated land, plantations, built-up areas and surface mining were identified from the South African National Land Cover 2000 (NLC2000; Thompson 2001) and extended to include a 1 km buffer to avoid adjacency effects. Although the derived metrics maps included these transformed pixels, they were excluded from the summary statistics for each biome and from the regression tree analysis to ensure that only a pure natural vegetation signal was considered. The obvious impacts of different land-cover types on the maps of metrics are briefly discussed. A total of 400 pixels per biome were randomly selected from the remaining untransformed areas for the regression tree analysis. Only 200 pixels could be extracted from the Forest biome because of its limited area. A total of 3400 pixels were thus extracted. Selected pixels were unavoidably biased towards the biome boundaries or mountainous areas within the Fynbos and Grassland biomes, which are largely transformed by agriculture and commercial forestry.

Regression tree analyses

Hoare & Frost (2004) previously produced a biome map from basic AVHRR-derived phenology indices by subjectively choosing cut-off values in order to mimic the previous biome map of South Africa (Low & Rebelo 1996). The present study, however, used more ecologically meaningful and advanced phenology and productivity metrics in a fully supervised decision-tree classification, based on the new biome map of South Africa (Mucina & Rutherford 2006). This allowed an objective assessment of the phenological attributes that are most characteristic of the newly defined biomes.
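The data-extraction step described above (masking transformed pixels plus a 1 km buffer, then drawing a fixed number of random pixels per biome) can be sketched as follows. This is an illustrative sketch only: the function names, the square one-pixel neighbourhood used for the buffer, and the toy 5 x 5 grid are assumptions, not the authors' GIS workflow.

```python
import random


def buffer_mask(transformed, radius=1):
    """Grow a boolean grid of transformed pixels by `radius` pixels.

    Assumption: one pixel is 1 km, and the buffer is a square neighbourhood.
    """
    rows, cols = len(transformed), len(transformed[0])
    out = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if transformed[r][c]:
                for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                    for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                        out[rr][cc] = True
    return out


def sample_per_biome(biomes, excluded, n_per_biome, seed=0):
    """Randomly pick up to `n_per_biome` non-excluded pixels for each biome."""
    rng = random.Random(seed)
    candidates = {}
    for r, row in enumerate(biomes):
        for c, biome in enumerate(row):
            if biome is not None and not excluded[r][c]:
                candidates.setdefault(biome, []).append((r, c))
    # Small biomes simply contribute every pixel they have left.
    return {b: rng.sample(px, min(n_per_biome, len(px)))
            for b, px in candidates.items()}


# Toy 5 x 5 example: one transformed pixel in the centre of the grid.
transformed = [[(r, c) == (2, 2) for c in range(5)] for r in range(5)]
biomes = [["Grassland" if c < 3 else "Forest" for c in range(5)] for r in range(5)]
excluded = buffer_mask(transformed, radius=1)
samples = sample_per_biome(biomes, excluded, n_per_biome=8)
print({b: len(px) for b, px in samples.items()})
```

Capping the sample at the number of available pixels mirrors the situation in the text, where a biome of limited area yields fewer sample pixels than the others.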


A random forest regression tree was run on the 3400 sample points to test how well metrics derived from satellite data can predict the South African biomes (Mucina & Rutherford 2006). A regression tree is a classification method that predicts class membership by recursively partitioning data into more homogeneous subsets, referred to as nodes (Breiman et al. 1984). Also called a ‘decision tree’, it can be seen as a set of rules for classifying data into particular categories (in this case biomes) by identifying split conditions that decrease the deviance at each node in the tree. Usually some a priori rule for stopping the splitting process is used to prevent over-fitting of the tree. A tree has the advantage of being very easy to interpret because the splitting rules are explicit, and it can accommodate non-linear relationships. Prior probabilities can be used when the relative likelihoods of the different classes vary, as in this case, where different biomes cover vastly different areas of South Africa: 0.1% of the landscape is forest and 33% grassland. The random forest method is a bootstrapping procedure developed to improve the predictive ability of the regression tree by repeatedly running individual trees on a random subset of the data (Breiman 2001; Prasad et al. 2006). In this case 1000 trees were generated, each using 63.2% of the data and no stopping rules, so that each tree ran until every point was correctly classified. For each run the resulting tree was used to classify the points that were not included in the subset, and ultimately the predicted class of each point was determined as the class that received the highest proportion of votes from the individual trees. The mean values of the phenology and productivity metrics, as well as their respective SDs and CVs, were used as predictor variables in the regression tree.
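The 63.2% figure quoted above matches the expected share of distinct points in a bootstrap sample drawn with replacement, 1 - (1 - 1/n)^n, which approaches 1 - 1/e (about 0.632) for large n. The sketch below checks this numerically and shows how votes from the left-out ("out-of-bag") points can be tallied. Whether the authors' software sampled in exactly this way is an assumption; the toy nearest-value learner `train_nearest` merely stands in for a tree, and the feature values and class labels are invented.

```python
import math
import random
from collections import Counter


def bootstrap_indices(n, rng):
    """Draw n indices with replacement (one bootstrap sample, the 'bag')."""
    return [rng.randrange(n) for _ in range(n)]


def mean_in_bag_fraction(n, n_trees, seed=0):
    """Average share of distinct points that land in each bootstrap sample."""
    rng = random.Random(seed)
    fractions = [len(set(bootstrap_indices(n, rng))) / n for _ in range(n_trees)]
    return sum(fractions) / n_trees


def oob_predictions(features, labels, train, n_trees, seed=0):
    """Majority vote for each point, using only models that did not see it."""
    rng = random.Random(seed)
    n = len(labels)
    votes = [Counter() for _ in range(n)]
    for _ in range(n_trees):
        bag = bootstrap_indices(n, rng)
        predict = train([features[i] for i in bag], [labels[i] for i in bag])
        for i in set(range(n)) - set(bag):
            votes[i][predict(features[i])] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]


def train_nearest(xs, ys):
    """Toy stand-in for a tree: return the label of the nearest training value."""
    def predict(x):
        j = min(range(len(xs)), key=lambda k: abs(xs[k] - x))
        return ys[j]
    return predict


# Invented one-dimensional example with two classes.
features = [0.10, 0.20, 0.15, 0.90, 0.80, 0.85]
labels = ["A", "A", "A", "B", "B", "B"]
print(mean_in_bag_fraction(1000, 200), 1 - 1 / math.e)
print(oob_predictions(features, labels, train_nearest, n_trees=200))
```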
To determine how much information was contributed by the phenology metrics versus the productivity metrics, three decision trees were applied: (1) using all the metrics, (2) using phenology metrics only and (3) using productivity metrics only. The accuracy (purity) of each node is assessed using the Gini index, which indicates how evenly the data in the node are distributed among the different classes (in this case, biomes). In nodes with a high Gini index most of the data fall into one class; in nodes with a low Gini index the data are more equally split between the different classes, and therefore have a low accuracy. It was essential to assess how much information each metric variable adds to the final prediction. This was done by randomly permuting each input variable in turn and re-running the model. For each tree the accuracy (purity) of each node was recorded and subtracted from the accuracy of the same node in the original tree, and the result was then averaged and normalized over all trees for each variable. See Breiman (2001) for more details of this measure of variable importance. The 3400 sample points were used to develop the random forest model, which was then applied to all untransformed pixels in South Africa as a test of the accuracy of the classification. Finally, one more regression tree was run on the predicted values from the random forest; see Archibald et al. (2009) for details of this method. This single tree represented the most important split conditions for distinguishing between the different biomes. This final phenology-based regression tree was compared with the climate-based regression tree (Mucina & Rutherford 2006).
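The two ideas in this paragraph, node purity and permutation-based variable importance, can be sketched in a few lines of Python. Two caveats apply. First, the code uses the common Gini impurity convention, 1 minus the sum of squared class shares, under which a pure node scores 0; the text describes the index in the opposite (purity) sense. Second, importance is measured here as the drop in overall accuracy of an arbitrary predictor after shuffling one column, which is a simplification of the per-tree procedure in Breiman (2001). The rule-based predictor, its 0.5 threshold and the data rows are invented for illustration.

```python
import random
from collections import Counter


def gini_impurity(labels):
    """1 - sum(p_k^2): 0.0 for a pure node, larger for evenly mixed nodes."""
    n = len(labels)
    return 1.0 - sum((k / n) ** 2 for k in Counter(labels).values())


def accuracy(predict, rows, labels):
    """Share of rows for which the predictor returns the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean drop in accuracy when one column (rows are tuples) is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Invented rows of (LI, START); the hypothetical rule uses LI only.
rows = [(0.9, 3), (0.8, 7), (0.1, 5), (0.2, 1)]
labels = ["Forest", "Forest", "Desert", "Desert"]


def rule(row):
    return "Forest" if row[0] > 0.5 else "Desert"


print(gini_impurity(["Forest", "Forest"]), gini_impurity(["Forest", "Desert"]))
print(permutation_importance(rule, rows, labels))
```

Because the hypothetical rule never looks at START, shuffling that column leaves accuracy unchanged and its importance is zero, whereas shuffling LI degrades the predictions.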

Results and Discussion

Phenology metrics

Start date of growing season (START)

The maps of the phenology metrics are given in Fig. 3, while the frequency histograms of the START, MID and END of growing season for each biome are given in Fig. 4. Fig. 3a shows a clearly defined spatial pattern in the date of START that is consistent with known plant phenological patterns and regional rainfall patterns (Fig. 1c). The winter rainfall of the south-western and western coastal strip is clearly distinguished from the summer rainfall regions in the remainder of the subcontinent (Cowling et al. 1997; Schulze 1997). Frequency histograms of START dates per South African biome show clear patterns in seasonality, with Succulent Karoo START dates in autumn (May), followed by the Fynbos (late May) (Fig. 4). START dates in Fynbos extend from late May throughout the winter. According to their histograms, all other biomes have a distinct early spring START, with Forests starting to grow slightly earlier (Aug) than Albany Thicket (Sep), followed by Indian Ocean Coastal Belt (Sep), Grassland (Oct) and finally Savanna biomes (late Oct) (Fig. 4). The Albany Thicket and Indian Ocean Coastal Belt biomes are hereafter referred to as Thicket and Coastal biomes, respectively. The Fynbos, although predominantly in a winter rainfall region in the west, extends into summer rainfall regions in the east (Mucina & Rutherford 2006) (Figs 3a and 4). The areas in the western Fynbos with later START dates (Oct) are mainly wheat fields, where the pattern of planting and harvesting gives very uniform phenologies that are distinctly different from the surrounding vegetation (point a in Fig. 3a). The Nama Karoo, which has no clearly defined START (Fig. 4), ranges from predominantly winter rainfall in the west to entirely summer rainfall in the east along a diffuse gradient (Fig. 3a). This could in part be an artifact of the difficulty of resolving annual cycles in areas with very low NDVI values (Heumann et al. 2007). The range in START indicates the range of green-up dates within each biome (spatial variation) (Fig. 4). Most biomes have START dates within a 60- to 110-d range, while the Fynbos has a range of over 160 d (Fig. 4). The SD of START is a measure of variation in START between seasons for a specific pixel (temporal variation) (Fig. 3b). The greatest variability in START dates is associated with arid regions, especially the Nama Karoo (Fig. 3b). Grassland, Savanna and Coastal show the least variability in START date, but even in these biomes the SD ranges from 20 to 60 d (Fig. 3b). High variability in START dates is also associated with some areas of commercial forestry plantations in South Africa.

End date and length of growing season (END, LENGTH)

The spatial variation in mean END date (Fig. 3c) follows a similar pattern to the START date, but with a time offset equivalent to LENGTH (Fig. 3e). Frequency histograms of END date by South African biome show very similar overall patterns to those of START (Fig. 4). In the case of Grassland, Savanna, Coastal and Succulent Karoo the entire biome has ended growth before any area in the biome starts the new season’s growth. In the case of Thicket, Forest and Desert there is an overlap in time, with parts of the biome entering a new season while other areas have not yet ended the previous season. This is in keeping with these biomes being distributed over areas with diverse timing of rainfall. Fynbos and Nama Karoo have a significant overlap in the timing of START and END of growing season, which agrees with the range in the timing of rainfall in these biomes. The mean LENGTH (Fig. 3e) is more than 180 d for almost the entire region, with LENGTH of up to 300 d common in the moist Savanna areas in Angola and Zambia. The arid Savanna areas have shorter and more variable LENGTH and this is


Fig. 3. (a) Mean and (b) standard deviation (SD) of the start date of the growing season (START), (c) mean end date (END), (d) SD of END, (e) growing season length (LENGTH) and (f) SD of LENGTH, as derived from 1 km² Advanced Very High Resolution Radiometer (AVHRR) data, 1985–2000. All dates and standard deviations are expressed as decads (10-d periods) starting on 1 Jan of each year. Decads are named by the first letter of the month (J, F, . . . D) followed by the decad number (1–3). Map a covers the entire area processed (i.e. southern Africa), while maps b–f cover only South Africa. Biome boundaries are overlaid – see Fig. 1a.


Fig. 4. Frequency histograms of the mean start (START), mid position (MID) and end (END) date of the growing season, per biome (South Africa), based on 1 km² Advanced Very High Resolution Radiometer (AVHRR) data, 1985–2000. Decads are named by the first letter of the month (J, F, . . . D) followed by the decad number (1–3).


most pronounced in Namibia and the southern Kalahari. Comparing the mean LENGTH with the SD of LENGTH reveals areas with both long LENGTH and low SD of LENGTH, such as the moist Savannas, and areas such as the Nama Karoo and Desert with long LENGTH and high SD of LENGTH (in a mosaic in which adjacent pixels sometimes have low SD). These latter areas coincide closely with areas of low mean NDVI amplitude (see Fig. 1b); here it is the inability of the methods to determine annual phenology metrics reliably (mainly END) that produces the apparently long growing season. The wheat fields in the Rhenosterveld region of the Western Cape (point a in Fig. 3e) have a short LENGTH (60 d) and low SD of LENGTH, a reflection of the defined planting and harvesting season.
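The decad convention of Figs 3 and 4, and the per-pixel mean and SD of a date metric, can be illustrated with a short sketch. It assumes decads are numbered 1 to 36 from 1 Jan, as the captions describe, and that the SD is the sample standard deviation across seasons; the function names and the START values are hypothetical, and dates that wrap around the year end are not handled.

```python
from statistics import mean, stdev

MONTH_LETTERS = "JFMAMJJASOND"  # first letter of each month, Jan to Dec


def decad_label(decad: int) -> str:
    """Return the figure label (e.g. 'O1') for a decad number from 1 to 36."""
    if not 1 <= decad <= 36:
        raise ValueError("decad must be between 1 and 36")
    month_index, position = divmod(decad - 1, 3)
    return f"{MONTH_LETTERS[month_index]}{position + 1}"


def summarise_dates(decads_by_season):
    """Long-term mean and inter-annual SD of one pixel's date metric.

    Assumes one value per season, in decads, with no wrap-around at year end.
    """
    sd = stdev(decads_by_season)  # sample SD is an assumption
    return {
        "mean_decad": mean(decads_by_season),
        "sd_decad": sd,
        "sd_days": sd * 10,  # one decad is a 10-d period
    }


# Hypothetical START dates (in decads) for one pixel over five seasons.
summary = summarise_dates([28, 27, 30, 29, 26])
print(decad_label(28), summary)
```

Under this numbering decad 28 is the first decad of October, which matches the 1 Oct threshold quoted for the START split later in the paper.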

Productivity metrics

Large and small integral (LI, SI)

The mean LI was significantly correlated with mean annual precipitation (Schulze 2007) across South Africa (R² = 0.75, n = 3362 random pixels, P < 0.05) (Figs 1c and 5). As expected, Desert has the lowest LI, followed by Succulent and Nama Karoo. Fynbos and Thicket have a wide range in LI values, overlapping with the Karoos at the low end and Forest at the high end. The Grasslands and Savannas have a similar range of values to the Thicket, but in Grasslands the distribution is weighted towards the higher values, while Savannas, on average, have a lower LI value, probably because most of the South African savannas occur in drier climates than the grasslands (Figs 6 and 1c).
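As a rough illustration of how the two integrals relate, the sketch below treats the LI as the sum of the 10-d NDVI values over a season and the SI as the sum of the part of each value that lies above a base level. This discretisation, the function names and the NDVI values are assumptions made for illustration and are not the processing chain used in the study. The inter-annual coefficient of variation mapped in Fig. 5 is included as the SD expressed as a percentage of the mean.

```python
from statistics import mean, stdev


def season_integrals(ndvi, base):
    """Return (LI, SI) for one season of 10-d NDVI values.

    LI sums the full NDVI values; SI sums only the part above the base level.
    Both are simple sums over decads, which is an assumption of this sketch.
    """
    large = sum(ndvi)
    small = sum(max(value - base, 0.0) for value in ndvi)
    return large, small


def coefficient_of_variation(values):
    """Inter-annual CV: sample SD as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical season of six decads with a base NDVI of 0.25.
li, si = season_integrals([0.30, 0.40, 0.55, 0.60, 0.50, 0.35], base=0.25)
print(li, si, li - si)

# Hypothetical LI values for three seasons of one pixel.
print(coefficient_of_variation([4.0, 5.0, 6.0]))
```

In this sketch the difference between LI and SI equals the base level multiplied by the number of decads in the season, so a pixel with a high base NDVI has a high LI but a proportionally low SI.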

Fig. 5. (a) Mean large integral and (b) associated inter-annual coefficient of variation. (c) Mean small integral and (d) associated inter-annual coefficient of variation. Derived from 1 km² Advanced Very High Resolution Radiometer (AVHRR) data, 1985–2000.


Fig. 6. Frequency histograms of large integral (LI) and small integral (SI) for South African biomes.

The SI shows similar patterns to those of the LI, but its values are about half those of the LI. The SI was significantly correlated with mean annual precipitation across South Africa (R² = 0.76, n = 3362 random pixels).


There are, however, some instances where significant differences between LI and SI exist. Forests in particular are differentiated as having high LI, but proportionally low SI (Fig. 6). This is true for both indigenous Forests and forestry plantations


(point b in Fig. 5). Subtracting the SI from the LI in effect gives the BASE value, which is distinctly higher in Forests than in other vegetation types (Figs 2 and 6). The Fynbos also shows a relatively large difference between LI and SI values (Fig. 6). Both Forest and Fynbos have large standing biomass, which results in high BASE NDVI values. The coefficient of variation (CV) is the SD expressed as a percentage of the mean and thus avoids the correlation between SD and mean values. Both the LI CV and SI CV are high for the arid areas (Fig. 5b,d). The moist eastern Savanna areas tend to have low LI CVs, but relatively high SI CVs (Fig. 5b,d). This could be because the tree cover in these areas gives a stable BASE value, while variable rainfall gives a large variance in grass growth (Archibald & Scholes 2007).

Regression tree classification

When all phenology and productivity metrics were used as inputs to the regression tree analysis, the prediction had an overall accuracy of 0.75 (proportion of sample pixels correctly classified). When only the date-related phenology metrics were used, or only NPP-related phenology metrics were used, this accuracy was reduced by 14% and 20%, respectively (Table 2). Most noticeable was the sharp drop in the classification accuracy of Savanna when only NPP-related phenology metrics were used, indicating that this biome is not clearly distinguished by productivity alone (as is confirmed in Tables 2 and 3). The regression tree derived from all phenology metrics was used for all subsequent analyses. The regression tree initially split the biomes based on vegetation production and then by the seasonality of growth (Table 3). The three arid biomes (Desert, Succulent and Nama Karoo) were isolated as having LI < 4.7 and the two high-biomass biomes (Forest and Coastal vegetation) were isolated as having LI > 10.9. These broad NPP-defined vegetation units were then subcategorized using the timing of the growth season.
The winter rainfall Succulent Karoo biome is separated from the summer rainfall Nama Karoo biome by the midpoint of their growing seasons (pixels with midpoints before Jul were identified as Nama Karoo). The winter rainfall Fynbos is separated from the summer rainfall Grassland and Savanna by having a mean START date before the 1st of Oct (decad 28). It is appropriate that Thicket was identified in both the winter and summer rainfall split, as this biome bridges the divide between the two climate


regions. Thicket is identified as having a higher amplitude than Fynbos and a more variable growing season length than Grassland or Savanna. The split between Grassland and Savanna is hard to explain: Grassland has a lower inter-annual coefficient of variation in its MAX value than Savanna (Table 3). Grasslands would be expected to have a higher CV, because the productivity of grasses is more strongly controlled by the current year's rainfall, whereas tree productivity is less variable between years (Archibald & Scholes 2007). However, the fact that the Savanna stretches into very arid areas in the north-western Kalahari explains why Savanna overall had a higher CV of MAX. A sub-biome classification of Savanna and Grassland would therefore be most informative. Mean SI and LI were identified as the two most important metrics used in the classification, closely followed by other productivity metrics (i.e. mean MAX and the difference between integrals; Fig. 7). Thereafter, phenology metrics (i.e. SD MID, mean MID, mean START and mean END) became important. The variance parameters of the productivity and phenology metrics were of medium importance (Fig. 7). The CV and SD were expected to distinguish easily between less variable systems such as Forests and more variable systems such as the Nama Karoo, as seen in Fig. 4. However, it appears that mean productivity metrics are better at distinguishing between these systems, simply as a result of the very large difference in their NDVI values. The values in Table 2 represent the accuracies of the model when predicting a small number of sample points (n = 3400). When the regression tree model was applied to all the untransformed pixels in South Africa (749692 pixels), the individual biomes were clearly differentiated with an overall accuracy of 73% and a kappa coefficient (corrected for chance agreement) of 0.638 (Table 4 and Fig. 8).
A kappa coefficient larger than 0.61 is considered good, while a coefficient larger than 0.81 is considered very good (Landis & Koch 1977). The Succulent Karoo, Nama Karoo, Fynbos, Grassland and Savanna biomes all had very low errors of commission and omission (Table 4). However, certain biomes were not predicted well from the phenological data. In particular, the Desert, Coastal and Thicket biomes all had large errors of omission (>70%) because pixels in these classes were classified into alternative biome classes by the regression tree. This inaccuracy is unlikely to be explained by the limited extent of these biomes because the Forest biome, which covers <1% of the country, was classified with an accuracy close to 80%.
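The accuracy measures discussed above follow directly from a confusion matrix. The sketch below computes overall accuracy, the kappa coefficient and the producer's and user's accuracies from the pixel counts as read from Table 4a; the orientation (rows as predicted biomes, columns as observed biomes) is inferred from the row and column totals, and the function and variable names are hypothetical.

```python
BIOMES = ["Desert", "Succulent Karoo", "Nama Karoo", "Fynbos", "Thicket",
          "Grassland", "Savanna", "Forests", "Coastal"]

# Rows: predicted biome; columns: observed biome (counts read from Table 4a).
MATRIX = [
    [1000, 315, 241, 0, 0, 0, 0, 0, 0],
    [1030, 43833, 2336, 6466, 14, 0, 145, 0, 2],
    [3391, 14433, 144406, 2945, 2783, 1794, 6028, 0, 73],
    [7, 3716, 1312, 27526, 4631, 261, 554, 28, 23],
    [0, 104, 718, 989, 5431, 442, 525, 23, 161],
    [0, 160, 14891, 1214, 7239, 152347, 70812, 61, 3468],
    [75, 880, 30855, 1758, 1899, 12169, 166647, 17, 1380],
    [0, 0, 0, 698, 648, 326, 174, 621, 255],
    [0, 0, 0, 15, 219, 519, 1170, 34, 1455],
]


def accuracy_summary(matrix):
    """Return overall accuracy, kappa, producer's and user's accuracies."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]                       # predicted
    col_totals = [sum(row[c] for row in matrix) for c in range(size)]  # observed
    diagonal = [matrix[i][i] for i in range(size)]

    overall = sum(diagonal) / total
    # Agreement expected by chance from the marginal totals.
    chance = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - chance) / (1 - chance)

    producers = [d / c for d, c in zip(diagonal, col_totals)]  # 1 - omission
    users = [d / r for d, r in zip(diagonal, row_totals)]      # 1 - commission
    return overall, kappa, producers, users


overall, kappa, producers, users = accuracy_summary(MATRIX)
for name, p, u in zip(BIOMES, producers, users):
    print(f"{name:16s} producer's {p:.2f}  user's {u:.2f}")
print(f"kappa {kappa:.3f}")
```

With these counts the kappa coefficient evaluates to about 0.638, in agreement with the value reported for the full map.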


Table 2. Accuracy of the random forest regression tree model developed on 3400 sample points from nine biomes. Values represent the proportion of the sample points that were correctly classified by the predictive model. Three different trees were applied using: (1) all phenology and productivity metrics, (2) only phenology metrics and (3) only productivity metrics. Results of a climate-based regression tree of the biomes (Rutherford et al. 2006) are presented for comparison. NA = not applicable, as Forest was not included in the climate-based tree.

Biome              Productivity   Phenology   All       Climate-based tree
                   metrics        metrics     metrics   (Rutherford et al. 2006)
Desert             0.83           0.72        0.89      0.70
Succulent Karoo    0.50           0.58        0.69      0.75
Nama Karoo         0.39           0.49        0.67      0.86
Fynbos             0.43           0.43        0.66      0.80
Thicket            0.47           0.56        0.70      0.65
Grassland          0.71           0.67        0.76      0.85
Savanna            0.15           0.62        0.71      0.87
Forests            0.80           0.68        0.79      NA
Coastal            0.65           0.77        0.90      0.75
Total              0.54           0.61        0.75      0.78

Table 3. Split conditions to classify pixels into the nine biomes. (a) The regression tree model using phenological and productivity metrics. (b) The climate-based regression tree model (Mucina & Rutherford 2006). SM = soil moisture, Evap = annual mean (potential) evaporation, Tmin = mean minimum temperature of the coldest month.

(a) Split conditions
Mean large integral < 4.7
    Mean small integral < 1.1: Desert
    Mean small integral ≥ 1.1
        Mean midpoint ≥ 18.5: Succulent Karoo
        Mean midpoint < 18.5: Nama Karoo
Mean large integral ≥ 4.7
    Mean large integral < 10.9
        Mean start date < 27.5
            Mean amplitude < 0.14: Fynbos
            Mean amplitude ≥ 0.41: Thicket
        Mean start date ≥ 27.5
            SD growing season ≥ 6.3: Thicket
            SD growing season < 6.3
                CV of maximum value < 16.6: Grassland
                CV of maximum value ≥ 16.6: Savanna
    Mean large integral ≥ 10.9
        Mean amplitude < 0.23: Forest
        Mean amplitude ≥ 0.23: Coastal vegetation

(b) Split conditions
SM days summer < 25.2 / SM days summer ≥ 25.2
SM days winter < 25.5 / SM days winter ≥ 25.5
SM days winter < 11.8 / SM days winter ≥ 11.8
Evap < 2411 / Evap > 2411
Heat units < 21.9 / Heat units ≥ 21.9
Tmin < 2.2 / Tmin ≥ 2.2
Evap < 2592 / Evap ≥ 2592
SM days winter < 19.1 / SM days winter ≥ 19.1
SM days winter < 22.2 / SM days winter ≥ 22.2
Tmin < 1.95 / Tmin ≥ 1.95
SM days summer < 17.4 / SM days summer > 17.4
Heat units < 10.3 / Heat units ≥ 10.3
SM days summer < 41.4 / SM days summer ≥ 41.4
Evap < 2389 / Evap ≥ 2389
Tmin < 4.0 / Tmin ≥ 4.0
Terminal nodes: Savanna, Nama Karoo, Albany Thicket, Nama Karoo, Nama Karoo, Desert, Nama Karoo, Fynbos, Succulent Karoo, Grassland, Grassland, Savanna, Albany Thicket, Savanna, Savanna, Coastal vegetation
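The phenology-based split conditions in Table 3a can be written as a small decision function. The sketch below follows the hierarchy as far as it can be read from the table and the accompanying text; the function and parameter names are hypothetical. The printed table gives two different amplitude values (0.14 and 0.41) for the Fynbos and Thicket branches, so the single threshold used here is an assumption exposed as a parameter.

```python
def classify_biome(mean_li, mean_si, mean_mid, mean_start, mean_amplitude,
                   sd_length, cv_max, fynbos_thicket_amplitude=0.14):
    """Assign a biome from phenology and productivity metrics (after Table 3a).

    Dates are in decads. fynbos_thicket_amplitude is an assumed threshold:
    the printed table lists both 0.14 and 0.41 for this split.
    """
    if mean_li < 4.7:  # arid biomes
        if mean_si < 1.1:
            return "Desert"
        # Winter-growing Succulent Karoo has the later (mid-year) midpoint.
        return "Succulent Karoo" if mean_mid >= 18.5 else "Nama Karoo"

    if mean_li >= 10.9:  # high-biomass biomes
        return "Forest" if mean_amplitude < 0.23 else "Coastal"

    if mean_start < 27.5:  # season starts before about 1 Oct
        if mean_amplitude < fynbos_thicket_amplitude:
            return "Fynbos"
        return "Thicket"

    if sd_length >= 6.3:  # variable growing season length
        return "Thicket"
    return "Grassland" if cv_max < 16.6 else "Savanna"


# Hypothetical pixel: moderate LI, October start, stable season, low CV of MAX.
print(classify_biome(mean_li=8.0, mean_si=4.0, mean_mid=5, mean_start=29,
                     mean_amplitude=0.3, sd_length=4.0, cv_max=12.0))
```

Thicket is the only biome returned from two branches, which mirrors its position across the winter and summer rainfall split.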


Fig. 7. The importance of different phenological variables in predicting biome class, derived by randomly permuting each variable in turn and assessing how much this reduces the explanatory power of the regression tree model. The X-axis represents the sum of the impurity (Gini index) over all nodes in the tree and is an indication of the explanatory power of individual variables relative to other variables in predicting biome class.

A more likely explanation might be that these biomes are not functionally distinguishable. The Coastal biome is a recent addition to the biome map (Mucina & Rutherford 2006) and was previously included in the Savanna biome (with which it was largely confused in the classification). Similarly, the Desert biome represents the southern-most tip of a much larger desert region in Namibia and in previous classifications it was not considered to extend into South Africa at all (Low & Rebelo 1996). The Desert biome was indistinguishable from the other arid biomes based on its phenology and productivity metrics and therefore could not be accurately delineated. The confidence of the prediction varied between classes – Grassland, Savanna and the two Karoo biomes were classified with high confidence (Fig. 8c). Interestingly, the interface between the Nama Karoo biome and its eastern neighbors (Savanna and Grassland) was quite diffuse, and confidence was low at the boundaries (Fig. 8c). The transition to a semi-arid shrubland is largely driven by a gradual decrease in rainfall, and its exact location might vary with variability in rainfall. In contrast, the confidence did not decrease across the Savanna–Grassland, Grassland–Forest and Nama–Succulent Karoo boundaries, indicating that these boundaries are more clearly defined (Fig. 8c).

Comparison with climate-based predictions

The overall accuracy of the climate-based biome prediction (Mucina & Rutherford 2006) and the phenology-based biome prediction (this study) were quite similar: 78% and 75%, respectively (Table 2). However, while the phenology-based tree could identify forests with 76% accuracy, the broad-scale climate-based tree was unable to resolve the Forest biome, as forests occur within the same regional climate bounds as both Savanna and Grassland, and other factors such as microclimate, topography, fire history and land use determine where Forest patches occur in this region. In the climate-based tree the less productive Nama and Succulent Karoo biomes were distinguished from other biomes by their high annual potential evaporation (Table 3). Other splits in the climate-based tree were harder to interpret because the indices were not directly related to plant function. Often the same biome would be predicted by different combinations of climate factors. For example, there are four separate nodes that all classify as Savanna, which means the Savanna biome is characterized as having both a very low number of winter soil moisture days (<12) and a very high number of winter soil moisture days (>22). In contrast, in the phenology-based tree, each biome was predicted by only one terminal node (with the exception of Thicket) (Table 3), which is an indication that the phenology and productivity metrics used to define the biomes are meaningful.
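The contrast in the number of terminal nodes per biome can be checked by counting the leaves listed in Table 3. The sketch below uses the terminal node labels as read from the two trees (their order carries no meaning here); the list and function names are hypothetical.

```python
from collections import Counter

# Terminal nodes as read from Table 3a (phenology-based tree).
PHENOLOGY_LEAVES = [
    "Desert", "Succulent Karoo", "Nama Karoo", "Fynbos", "Thicket",
    "Thicket", "Grassland", "Savanna", "Forest", "Coastal vegetation",
]

# Terminal nodes as read from Table 3b (climate-based tree).
CLIMATE_LEAVES = [
    "Savanna", "Nama Karoo", "Albany Thicket", "Nama Karoo", "Nama Karoo",
    "Desert", "Nama Karoo", "Fynbos", "Succulent Karoo", "Grassland",
    "Grassland", "Savanna", "Albany Thicket", "Savanna", "Savanna",
    "Coastal vegetation",
]


def repeated_biomes(leaves):
    """Return the biomes predicted by more than one terminal node."""
    counts = Counter(leaves)
    return {biome: n for biome, n in counts.items() if n > 1}


print(repeated_biomes(PHENOLOGY_LEAVES))
print(repeated_biomes(CLIMATE_LEAVES))
```

Only Thicket is repeated in the phenology-based tree, whereas the climate-based tree reaches Savanna and Nama Karoo through four terminal nodes each.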

Conclusions

The phenology metrics derived from remote sensing data gave ecologically meaningful results that reflect the spatial patterns of production and seasonality of vegetation growth across South Africa. Even in arid areas with very low NDVI values the long-term mean phenology metrics reflected the expected seasonality. Moreover, the 15-yr study period made it possible to quantify inter-annual variability in phenology and productivity, as the period included a number of El Niño–Southern Oscillation (ENSO) cycles (Tyson 1980). The phenological information gathered for naturally vegetated areas of each biome can provide a baseline or reference for assessing the impacts of human-induced land cover change and future climate change (Alcaraz-Segura et al. 2009). This paper offers new insights into the spatial patterns of the phenology metrics and their inter-annual variability that could


Table 4. (a) Confusion matrix of the prediction of the biome map of South Africa (excluding transformed pixels) using the phenology-based regression tree. (b) Accuracy of the phenology-based predicted biome map of South Africa. Producer's accuracy represents the proportion of the biome that was correctly classified. User's accuracy represents the probability that a pixel classified into that biome is correctly classified. For example, 79% of the Forest biome was correctly classified as Forest, but only 23% of the pixels classified as Forest fell in the Forest biome. Forest therefore has a high producer's accuracy but a low user's accuracy. Also included are the accuracy scores of the climate-based regression tree (Mucina & Rutherford 2006). The kappa coefficient is a measure of the accuracy corrected for chance agreement.

(a) Pixel counts; rows are predicted biomes, columns are observed biomes.

Predicted         Desert  Succulent Karoo  Nama Karoo  Fynbos  Thicket  Grassland  Savanna  Forests  Coastal   Total
Desert              1000              315         241       0        0          0        0        0        0    1556
Succulent Karoo     1030            43833        2336    6466       14          0      145        0        2   53826
Nama Karoo          3391            14433      144406    2945     2783       1794     6028        0       73  175853
Fynbos                 7             3716        1312   27526     4631        261      554       28       23   38058
Thicket                0              104         718     989     5431        442      525       23      161    8393
Grassland              0              160       14891    1214     7239     152347    70812       61     3468  250192
Savanna               75              880       30855    1758     1899      12169   166647       17     1380  215680
Forests                0                0           0     698      648        326      174      621      255    2722
Coastal                0                0           0      15      219        519     1170       34     1455    3412
Total               5503            63441      194759   41611    22864     167858   246055      784     6817  749692

(b)

Biome             % of total  Producer's  Error of      User's    Error of        Accuracy of
                  area        accuracy    omission (%)  accuracy  commission (%)  climate-based tree
Desert            0.6         0.18        82            0.64      36              0.70
Succulent Karoo   7           0.69        31            0.81      19              0.75
Nama Karoo        20.8        0.74        26            0.82      18              0.86
Fynbos            7           0.66        34            0.72      28              0.80
Thicket           2.7         0.24        76            0.65      35              0.65
Grassland         28.5        0.91        9             0.61      39              0.85
Savanna           32.1        0.68        32            0.77      23              0.87
Forests           0.1         0.79        21            0.23      77              NA
Coastal           1.3         0.21        79            0.43      57              0.75

Overall accuracy: 0.73
Kappa coefficient: 0.638 (z = 1033, P < 0.0001)

not be mapped without long-term satellite time-series data. It is technically very challenging to collect ground observations on vegetation phenology and aggregate these species-specific, plant-level measurements to the landscape scale (Archibald & Scholes 2007; Studer et al. 2007; Maignan et al. 2008). Thus, with a few exceptions (Studer et al. 2007; Maignan et al. 2008; White et al. 2009), very little data are available covering large geographic areas, especially in Southern Africa. Independent ground validation of remotely sensed, land surface phenology products therefore remains a significant challenge in this field of study. Consistent with similar studies that classified vegetation based on remotely sensed indicators of phenology (Paruelo et al. 1998; Alcaraz et al. 2006), the regression tree initially split the biomes based on vegetation production and then by the seasonality of growth. Despite the fact that the biomes are internally very diverse in function, the accuracy of the phenology-based decision tree analysis was comparable to the

more traditional climate-based regression tree (Rutherford et al. 2006) and, unlike the climate-based tree, was able to distinguish biomes such as Forest, which occur in the same climate space as other South African biomes. Moreover, the high errors of omission in the newly added Desert and Coastal biomes may also be highlighting areas of potentially contentious biome classifications (Fig. 8). The phenology-based tree included additional functional information, such as the independent estimates of NPP (LI and SI) and seasonality of growth, which the climate-based approach had to infer from rainfall and evapotranspiration with mixed results (Rutherford et al. 2006). The difference in the biome-specific responses given by the LI and SI can be used as a key variable for differentiating biomes (e.g. Forest and Savanna biomes). The split conditions derived from the phenology and productivity metrics matched well-known, macro-ecological differences in the functional dynamics of the biomes. Although not investigated here, it is likely that phenology-based classifications may perform even


Fig. 8. (a) Biome map of South Africa (Mucina & Rutherford 2006). (b) Biome distribution predicted by regression tree model based on Advanced Very High Resolution Radiometer (AVHRR)-derived phenology and productivity metrics. (c) Confidence of predicted biomes over the whole country. Transformed- and no-data areas are masked out in white.

better at the sub-biome than at the biome level, because considerable within-biome variation in vegetation structure and function is driven by factors such as soils and topography, which are captured by the remotely sensed metrics, but not reflected in climate data. Sub-biome level vegetation mapping using remotely sensed metrics is therefore entirely feasible and is currently being investigated. The phenology and productivity metrics captured functional processes that were not readily predictable from the combination of floristic data and climate variables (Hoare & Frost 2004). Given the fundamental difference in measurements and approaches, it remains intriguing that vegetation indices measured from a satellite could be used to reliably reproduce a biome map that was created by grouping detailed vegetation types based on their floristic composition and structure (Mucina & Rutherford 2006). This ultimately suggests a convergence of composition, structure and function at a biome level (Scholes et al. 1996; Paruelo et al. 2001, 2004). Although not specifically attempted in this paper, an unsupervised classification based on phenology and productivity metrics could be useful for mapping vegetation functional types across vast, sparsely surveyed regions (Paruelo et al. 2001; Alcaraz et al. 2006; Karlsen et al. 2006). Overall these findings lead us to conclude that remotely sensed metrics can play an indispensable role in producing regional vegetation (biome) maps and in characterizing the functional dynamics of vegetation.

Acknowledgements. This research was funded by the CSIR Strategic Research Panel (SRP). We thank the three anonymous reviewers for their contributions to improving the final manuscript.


References

Alcaraz, D., Paruelo, J. & Cabello, J. 2006. Identification of current ecosystem functional types in the Iberian Peninsula. Global Ecology and Biogeography 15: 200–212.
Alcaraz-Segura, D., Cabello, J. & Paruelo, J. 2009. Baseline characterization of major Iberian vegetation types based on the NDVI dynamics. Plant Ecology 202.
Archibald, S. & Scholes, R.J. 2007. Leaf green-up in a semi-arid African savanna – separating tree and grass responses to environmental cues. Journal of Vegetation Science 18: 583–594.
Archibald, S.A., Roy, D.P., van Wilgen, B.W. & Scholes, R.J. 2009. What limits fire? An examination of drivers of burnt area in Southern Africa. Global Change Biology 15: 613–630.
Bachoo, A. & Archibald, S. 2007. Influence of using date-specific values when extracting phenological metrics from 8-day composite NDVI data. Multi Temp 2007. IEEE, Leuven, BE.
Blydenstein, J. 1967. Tropical savanna vegetation of the llanos of Colombia. Ecology 48: 1–15.
Bond, W.J., Midgley, G.F. & Woodward, F.I. 2003. What controls South African vegetation – climate or fire? South African Journal of Botany 69: 79–91.
Breiman, L. 2001. Random forests. Machine Learning 45: 5–32.
Breiman, L., Friedman, J.H., Olshen, R.A. & Stone, C.J. 1984. Classification and regression trees. Wadsworth & Brooks/Cole, Monterey, CA, US.
Brown, M.E. & de Beurs, K.M. 2008. Evaluation of multisensor semi-arid crop season parameters based on NDVI and rainfall. Remote Sensing of Environment 112: 2261–2271.
Chen, J., Jonsson, P., Tamura, M., Gu, Z.H., Matsushita, B. & Eklundh, L. 2004. A simple method for reconstructing a high-quality NDVI time-series data set based on the Savitzky–Golay filter. Remote Sensing of Environment 91: 332–344.
Cowling, R.M., Richardson, D.M. & Pierce, S.M. 1997. Vegetation of Southern Africa. Cambridge University Press, Cambridge.
de Beurs, K.M. & Henebry, G.M. 2005. Land surface phenology and temperature variation in the International Geosphere-Biosphere Program high-latitude transects. Global Change Biology 11: 779–790.
Denning, A.S., Nicholls, M., Prihodko, L., Baker, I., Vidale, P.L., Davis, K. & Bakwin, P. 2003. Simulated variations in atmospheric CO2 over a Wisconsin forest using a coupled ecosystem–atmosphere model. Global Change Biology 1241–1250.
Diouf, A. & Lambin, E. 2001. Monitoring land-cover changes in semi-arid regions: remote sensing data and field observations in the Ferlo, Senegal. Journal of Arid Environments 48: 129–148.
Ellery, W.N., Scholes, R.J. & Mentis, M.T. 1991. An initial approach to predicting the sensitivity of the South-African grassland biome to climate change. South African Journal of Science 87: 499–503.
Ferreira, L.G. & Huete, A.R. 2004. Assessing the seasonal dynamics of the Brazilian Cerrado vegetation through the use of spectral vegetation indices. International Journal of Remote Sensing 25: 1837–1860.
Garbulsky, M.F. & Paruelo, J.M. 2004. Remote sensing of protected areas to derive baseline vegetation functioning characteristics. Journal of Vegetation Science 15: 711–720.
Heumann, B.W., Seaquist, J.W., Eklundh, L. & Jonsson, P. 2007. AVHRR derived phenological change in the Sahel and Soudan, Africa, 1982–2005. Remote Sensing of Environment 108: 385–392.
Hoare, D. & Frost, P. 2004. Phenological description of natural vegetation in southern Africa using remotely-sensed vegetation data. Applied Vegetation Science 7: 19–28.
Jolly, W. & Running, S. 2004. Effects of precipitation and soil water potential on drought deciduous phenology in the Kalahari. Global Change Biology 10: 303–308.
Jönsson, P. & Eklundh, L. 2002. Seasonality extraction by function fitting to time-series of satellite sensor data. IEEE Transactions on GeoScience and Remote Sensing 40: 1824–1832.
Jönsson, P. & Eklundh, L. 2004. TIMESAT – a program for analyzing time-series of satellite sensor data. Computers and Geosciences 30: 833–845.
Justice, C.O., Townshend, J.R.G., Holben, B.N. & Tucker, C.J. 1985. Analysis of the phenology of global vegetation using meteorological satellite data. International Journal of Remote Sensing 6: 1271–1318.
Karlsen, S.R., Elvebakk, A., Hogda, K.A. & Johansen, B. 2006. Satellite-based mapping of the growing season and bioclimatic zones in Fennoscandia. Global Ecology and Biogeography 15: 416–430.
Landis, J.R. & Koch, G.G. 1977. The measurement of observer agreement for categorical data. Biometrics 33: 159–174.
Low, A.B. & Rebelo, A.G. 1996. Vegetation of South Africa, Lesotho and Swaziland. Department of Environmental Affairs and Tourism, Pretoria, SA.
Lupo, F., Linderman, M., Vanacker, V., Bartholome, E. & Lambin, E.F. 2007. Categorization of land-cover change processes based on phenological indicators extracted from time series of vegetation index data. International Journal of Remote Sensing 28: 2469–2483.
Maignan, F., Breon, F.M., Bacour, C., Demarty, J. & Poirson, A. 2008. Interannual vegetation phenology estimates from global AVHRR measurements: Comparison with in situ data and applications. Remote Sensing of Environment 112: 496–505.
Mucina, L. & Rutherford, M.C. 2006. The vegetation of South Africa, Lesotho and Swaziland. Strelitzia 19. South African National Biodiversity Institute, Pretoria, SA.


Myneni, R., Keeling, C.D., Tucker, C.J., Asrar, G. & Nemani, R.R. 1997. Increased plant growth in northern high latitudes from 1981–1991. Nature 386: 698–702. Myneni, R.B. & Williams, D.L. 1994. On the relationship between FAPAR and NDVI. Remote Sensing of Environment 49: 200–211. Nemani, R.R., Keeling, C.D., Hashimoto, H., Jolly, W.M., Piper, S.C., Tucker, C.J., Myneni, R.B. & Running, S.W. 2003. Climate-driven increases in global terrestrial net primary production from 1982 to 1999. Science 300: 1560–1563. Osborne, C.P. & Woodward, F.I. 2001. Biological mechanisms underlying recent increases in the NDVI of Mediterranean shrublands. International Journal of Remote Sensing 22: 1895–1907. Paruelo, J.M., Jobbagy, E.G., Sala, O.E., Lauenroth, W.K. & Burke, I.C. 1998. Functional and structural convergence of temperate grassland and shrubland ecosystems. Ecological Applications 8: 194–206. Paruelo, J.M., Jobbagy, E.G. & Sala, O.E. 2001. Current distribution of ecosystem functional types in temperate South America. Ecosystems 4: 683–698. Paruelo, J.M., Golluscio, R.A., Guerschman, J.P., Cesa, A., Jouve, V.V. & Garbulsky, M.F. 2004. Regional scale relationships between ecosystem structure and functioning: the case of the Patagonian steppes. Global Ecology and Biogeography 13: 385–395. Prasad, A.M., Iverson, L.R. & Liaw, A. 2006. Newer classification and regression tree techniques: bagging and random forests for ecological prediction. Ecosystems 9: 181–199. Prince, S.D. 1991. A model of regional primary production for use with coarse-resolution satellite data. International Journal of Remote Sensing 12: 1313–1330. Privette, J.L., Fowler, C., Wick, G.A., Baldwin, D. & Emery, W.J. 1995. Effects of orbital drift on advanced very high resolution radiometer products: normalized difference vegetation index and sea surface temperature. Remote Sensing of Environment 53: 164–171. Rao, C.R.N. & Chen, J. 1995. 
Inter-satellite calibration linkages for the visible and near-infrared channels of the Advanced Very High Resolution Radiometer on the NOAA-7, -9, and -11 spacecraft. International Journal of Remote Sensing 16: 1931–1942. Rao, C.R.N. & Chen, J. 1996. Post-launch calibration of the visible and near-infrared channels of the Advanced Very High Resolution Radiometer on the NOAA-14 spacecraft. International Journal of Remote Sensing 17: 2743–2747. Reed, B.C. 2006. Trend analysis of time-series phenology of North America derived from satellite data. GIScience & Remote Sensing 43: 1–15. Reed, B.C., White, M.A. & Brown, J.F. 2003. Remote sensing phenology. In: Schwartz, M.D. (ed.) Phenology: an integrative science. Kluwer Academic Publishing, Dordrecht, NL.


Rutherford, M.C. & Westfall, R. 1994. Biomes of southern Africa – an objective categorization. Memoirs of the Botanical Survey of South Africa 63: 1–94. Rutherford, M.C., Mucina, L. & Powrie, L.W. 2006. Biomes and bioregions of Southern Africa. In: Mucina, L. & Rutherford, M.C. (eds.) The vegetation of South Africa, Lesotho and Swaziland. pp. 32–50. Strelitzia, Cape Town, SA. Scholes, R.J., Ellery, W.N., Pickett, G. & Blackmore, A. 1996. Plant functional types in African savannas and grasslands. In: Woodward, I. (ed.) Plant functional types. IGBP series, Vol 1. Cambridge University Press, Cambridge, UK. Schulze, R.E. 1997. Climate. In: Cowling, R.M., Richardson, D.M. & Pierce, S.M. (eds.) Vegetation of Southern Africa. pp. 21–42. Cambridge University Press, New York, NY, US. Schulze, R.E. 2007. South African Atlas of Climatology and Agrohydrology. In: WRC Report 1489/1/06, Water Research Commission, Pretoria. Studer, S., Stockli, R., Appenzeller, C. & Vidale, P.L. 2007. A comparative study of satellite and ground-based phenology. International Journal of Biometeorology 51: 405–414. Thompson, M.W. 2001. Guideline Procedures for the National Land-Cover Mapping and Change Monitoring. CSIR Report, Pretoria. Tyson, P.D. 1980. Temporal and spatial variation of rainfall anomalies in Africa south of latitude 22° during the period of meteorological record. Climatic Change 2: 363–371. Vanacker, V., Linderman, M., Lupo, F., Flasse, S. & Lambin, E. 2005. Impact of short-term rainfall fluctuation on interannual land cover change in sub-Saharan Africa. Global Ecology and Biogeography 14: 123–135. van Leeuwen, W.J.D., Orr, B.J., Marsh, S.E. & Herrmann, S.M. 2006. Multi-sensor NDVI data continuity: uncertainties and implications for vegetation monitoring applications. Remote Sensing of Environment 100: 67–81. Wessels, K.J., Prince, S.D., Zambatis, N., MacFadyen, S., Frost, P.E. & VanZyl, D. 2006.
Relationship between herbaceous biomass and 1-km² Advanced Very High Resolution Radiometer (AVHRR) NDVI in Kruger National Park, South Africa. International Journal of Remote Sensing 27: 951–973. Wessels, K.J., Bachoo, A.K. & Archibald, S. 2009. Influence of composite period and date of observation on phenological metrics extracted from MODIS data. In: 33rd International Symposium on Remote Sensing of Environment, ISRSE, Stresa, Italy. White, M.A., de Beurs, K.M., Didan, K., Inouye, D.W., Richardson, A.D., Jensen, O.P., O’Keefe, J., Zhang, G., Nemani, R.R., Van Leeuwen, W.J.D., Brown, J.F., De Wit, A.J.W., Schaepman, M.E., Lin, X., Dettinger, M., Bailey, A.S., Kimball, J., Schwartz, M.D., Baldocchi, D.D., Lee, J.T. & Lauenroth, W.K. 2009. Intercomparison, interpretation, and assessment of spring phenology in North America estimated from remote sensing for 1982–2006. Global Change Biology 15: 2335–2359. Woodward, F.I., Lomas, M.R. & Kelly, C.K. 2004. Global climate and the distribution of plant biomes. Philosophical Transactions of the Royal Society of London Series B – Biological Sciences 359: 1465–1476. Zhang, X.Y., Friedl, M.A. & Schaaf, C.B. 2006. Global vegetation phenology from Moderate Resolution Imaging Spectroradiometer (MODIS): evaluation of global patterns and comparison with in situ measurements. Journal of Geophysical Research-Biogeosciences 111. Zhou, L., Kaufmann, R.K., Tian, Y., Myneni, R.B. & Tucker, C.J. 2003. Relation between interannual variations in satellite measures of northern forest greenness and climate between 1982 and 1999. Journal of Geophysical Research-Atmospheres 108.


Supporting Information

Additional supporting information may be found in the online version of this article:

Fig. S1. Mean start of growing season dates calculated from Advanced Very High Resolution Radiometer (AVHRR) 1 km data, 1985–2000.

Please note: Wiley-Blackwell is not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than about missing material) should be directed to the corresponding author for the article.

Received 19 November 2009; Accepted 7 June 2010.
Coordinating Editor: Dr Geoffrey Henebry
