
Remote Sensing of Environment 112 (2008) 2222 – 2231 www.elsevier.com/locate/rse

Using local transition probability models in Markov random fields for forest change detection

Desheng Liu a,⁎, Kuan Song b, John R.G. Townshend b, Peng Gong c,d

a Department of Geography and Department of Statistics, The Ohio State University, 1036 Derby Hall, 154 North Oval Mall, Columbus, OH 43210-1361, United States
b Department of Geography & Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, United States
c Department of Environmental Science, Policy, and Management, University of California at Berkeley, 137 Mulford Hall # 3114, Berkeley, CA 94720-3110, United States
d State Key Laboratory of Remote Sensing Science, Jointly Sponsored by the Institute of Remote Sensing Applications of Chinese Academy of Sciences and Beijing Normal University, Beijing, 100101, China

Received 30 July 2007; received in revised form 2 October 2007; accepted 3 October 2007

Abstract

Change detection based on the comparison of independently classified images (i.e. post-classification comparison) is well known to be negatively affected by classification errors of individual maps. Incorporating spatial-temporal contextual information in the classification helps to reduce the classification errors, thus improving change detection results. In this paper, spatial-temporal Markov Random Fields (MRF) models were used to integrate spatial-temporal information with spectral information for multi-temporal classification in an attempt to mitigate the impacts of classification errors on change detection. One important component in spatial-temporal MRF models is the specification of transition probabilities. Traditionally, a global transition probability model is used that assumes spatial stationarity of transition probabilities across an image scene, which may be invalid if areas have varying transition probabilities. By relaxing the stationarity assumption, we developed two local transition probability models to make the transition model locally adaptive to spatially varying transition probabilities. The first model, called the locally adjusted global transition model, adapts to the local variation by multiplying the global transition model by a pixel-wise probability of change. The second model, called the pixel-wise transition model, was developed as a fully local model based on the estimation of the pixel-wise joint probabilities. When applied to forest change detection in Paraguay, the two local models showed significant improvements in the accuracy of identifying the change from forest to non-forest compared with traditional models. This indicates that the local transition probability models can represent temporal information more accurately in change detection algorithms based on spatial-temporal classification of multi-temporal images. The comparison between the two local transition models showed that the fully local model better captured the spatial heterogeneity of the transition probabilities and achieved more stable and consistent results over different regions of a large image scene.

Published by Elsevier Inc.

Keywords: Forest change detection; Paraguay; Post-classification comparison; Markov random fields; Local transition probability model; Spatial-temporal information

⁎ Corresponding author. Tel.: +1 614 247 2775; fax: +1 614 292 6213. E-mail address: [email protected] (D. Liu).
0034-4257/$ - see front matter. Published by Elsevier Inc. doi:10.1016/j.rse.2007.10.002

1. Introduction

Forests on the earth have undergone extensive loss during the last several decades (Tucker & Richards, 1983; Williams, 1989). According to DeFries et al. (2002), the estimated mean annual net forest area losses in tropical Latin America are 3.6 million hectares in the 1980s and 3.2 million hectares in the 1990s. The large-scale deforestation has profound impacts on global climate change by increasing greenhouse gas emissions (Cook et al., 1990; Houghton, 1991; Keller et al., 1991; Dixon et al., 1994; Fearnside, 1996) and on the loss of biodiversity by destruction and fragmentation of natural habitats (Chiarello, 1999; Laurance & Bierregaard, 1997; Laurance et al., 1997, 1998; Pimm et al., 1995; Skole & Tucker, 1993). It is therefore important to estimate the extents and rates of deforestation
across different regions in order to understand these significant impacts.

Satellite remote sensing plays an important role in the monitoring of deforestation due to its capability to observe forest change in a repetitive and consistent manner over large areas (Alves et al., 1999; DeFries et al., 2002; Huang et al., 2007; Mertens and Lambin, 1997; Skole and Tucker, 1993; Steininger et al., 2001). Many change detection algorithms using remotely sensed imagery (Gong and Xu, 2003; Lu et al., 2004; Singh, 1989) are available for forest change detection. Among them, change detection based on the comparison of independently classified images (also known as post-classification comparison) is one of the most widely used approaches. It has the advantage of being free of atmospheric influences and radiometric differences and providing “from-to” change information (Gong and Xu, 2003; Lu et al., 2004; Singh, 1989). However, it is well known that this approach suffers from the propagation of errors from individual maps due to mis-registration and misclassification (Gong et al., 1992; Serra et al., 2003; Townshend et al., 1992). The effect of misclassification errors is partly attributed to the fact that classification of multi-date images is performed independently in the post-classification comparison approach. Consequently, pixels misclassified in one date will result in errors on the “from-to” change detection map regardless of whether the corresponding pixels in another date are correctly classified. Linking multi-temporal images and making use of temporal information in the classification process can reduce classification errors and thus improve change detection results.

One intuitive approach to linking multi-temporal images together is through the so-called multi-date classification. In this approach, image bands from two or more dates are stacked together, and then a single classification is performed to identify change classes separately from classes where no change occurs between the dates. Nevertheless, this multi-date classification approach is challenging to implement because it demands large numbers of training data due to the increased number of classes and image bands. Alternatively, one could link multi-temporal images by incorporating the temporal information derived from them into their classifications. This enables the joint classification of multiple images so that some of the errors that occur in an independent classification mode could be corrected in the joint mode. Change classes are then detected through the comparison of jointly classified images.

The Markov Random Fields (MRF) model is a mathematical tool that permits the use of temporal correlation in multi-temporal image classification (Liu et al., 2005, 2006; Melgani & Serpico, 2003; Solberg et al., 1996). Solberg et al. (1996) introduced a general framework based on MRF to integrate multi-source data in remote sensing image classification, in which temporal information contained in a classified image at a given date can contribute to the classification of the same area at a subsequent date. Melgani and Serpico (2003) extended this framework from a uni-directional cascade approach to a bi-directional one, thus allowing the temporal information to be exchanged mutually between two dates. Liu et al. (2006) considered both temporal correlation and temporal exclusion in MRF in an attempt to reduce prohibitive transitions between two dates. All
these MRF approaches led to improved classification results by exploring temporal components of multi-temporal imagery in terms of transition probabilities.

It is important to note that one common feature in these MRF models is that the transition probabilities are defined globally. Hence, the probability of one specific transition is fixed across the whole image. This assumption is simple and easy to specify but may not be sufficient to model the spatial heterogeneity of the transition probabilities between two images. For example, in the case of forest change detection, the transition probability from forest to non-forest is often spatially variable depending on many physical and socioeconomic factors such as accessibility and land value. It is therefore desirable to relax the global transition probabilities to a local level (e.g. pixel-wise) to accommodate the spatial variability of transition probabilities.

The purpose of this paper is to develop two local transition probability models in order to provide more accurate temporal information for the multi-temporal classification based on MRF for forest change detection. As changes tend to occur in a spatially autocorrelated way, it is desirable to incorporate spatial information in the change detection. Thus, the temporal information obtained from the proposed local transition probability models will be incorporated into MRF in conjunction with the spatial information, giving rise to a change detection approach based on spatial-temporal classification. Nevertheless, our focus is mainly on the evaluation of the proposed local transition probability models in estimating more accurate temporal information for change detection.

The rest of the paper is organized as follows. We first present a spatial-temporal classification framework based on MRF. Then, we provide the details for a conventional global transition probability model and the proposed local transition probability models.
Finally, we apply these methods to multi-temporal Landsat images for forest change detection for a test area in Paraguay. To validate the proposed local models, we compare the change detection results from the spatial-temporal classification using the proposed local transition probability models with the results from spatial-temporal classification using a global transition probability model.

2. Study site and data

2.1. Study site

The study site is located in eastern Paraguay, near the tri-country border of Paraguay, Brazil, and Argentina (Fig. 1). Deforestation in South America in the 1990s was mainly in the southern Brazilian provinces, Paraguay, and Bolivia (Hansen and DeFries, 2004). The native forest canopy of eastern Paraguay, the Atlantic forest, has been identified as one of the top priority ecosystems for global biological conservation (Myers, 1988; Olson & Dinerstein, 2002). The forest has extremely high levels of biodiversity, hosting an estimated 20,000 plant species and over 1300 non-fish vertebrate species (Mittermeier et al., 1999). During the past 30 years or so, the Atlantic forest has been rapidly replaced by commercial agriculture, largely for the cultivation of soybean and for raising livestock, and also by subsistence agriculture by legal and illegal settlers (Huang et al., 2007). This land cover conversion process results in a checkerboard landscape of human land use and remaining forest patches. We chose the study site in an area with a very high proportion of land use change. Within this study area of 6760.52 km², 45.87% of the forest cover in 1990 had been turned into farmland as of 2001. The land use change consists mostly of mechanized soybean farms and dairy farms, and there is no visible forest regrowth during this period.

Fig. 1. Study site, Paraguay, South America.

2.2. Landsat data and reference data

The data used in this study include Landsat imagery and the corresponding reference forest cover change map, all of which were provided by the Global Land Cover Facility (GLCF) at the University of Maryland, College Park. The Landsat data include Landsat TM data acquired on May 24, 1990 and Landsat ETM+ data acquired on March 3, 2001. Fig. 2(a) and (b) provide a snapshot of the two Landsat images clipped to the study area. Both images have 28.5 m resolution, except for the thermal infrared bands, which have 57 m resolution. The TM image was selected from the orthorectified Geocover imagery set. The ETM+ image was manually co-registered to the TM image. The RMSE for the georeferencing is within a half pixel.

Fig. 2(c) shows the reference forest cover change map of the study area. The wall-to-wall forest cover change map for the entire Paraguay was developed using the 1990s and 2000s Landsat data through image interpretation and classification performed by experienced data analysts at GLCF. The final map was improved by intensive human editing using supplemental data provided by collaborators in Paraguay. When validated using IKONOS, Quickbird imagery, and aerial photos, the final forest cover change map was shown to have an accuracy of higher than 95%. This highly reliable forest cover map product provides a full reference map for all the pixels in our study site, thus serving as a comprehensive reference data set for testing our models.

Fig. 2. (a) TM imagery (7-4-2), (b) ETM+ imagery (7-4-2), and (c) Reference map. Legend for reference map: Red (deforestation), Yellow (non-forest), Green (remaining forest). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

3. Methods

3.1. Markov random fields

Markov Random Fields (MRF) are well-established probabilistic models for analyzing contextual relationships between image pixels (Besag, 1986; Dubes and Jain, 1989; Cressie, 1993; Li, 2001). Using MRF, the contextual relationships (i.e. dependences) between pixels can be modeled by conditional probabilities with respect to a neighborhood system which is defined by an image analyst. Neighbors of a given pixel in a single image usually consist of its spatially adjacent pixels, on which the conditional probabilities are conditioned. The definition of the neighbors of a given pixel can be generalized to include spatial-temporally adjacent neighbors in multi-temporal images (Liu et al., 2005, 2006; Melgani & Serpico, 2003; Solberg et al., 1996). In this way, MRF can be used to model the spatial-temporal dependencies between a pixel and its spatial-temporal neighbors.

Considering two images observed at time T1 and T2 (T1 < T2), denote the spatial and temporal neighbors of an arbitrary pixel u by NS(u) and NT(u) respectively. Using the same spatial-temporal neighborhood system as that used in Liu et al. (2006) (Fig. 3), the spatial-temporal dependence of pixel u at time T1 on its spatial neighbors NS(u) at time T1 and temporal neighbors NT(u) at T2 is modeled as a conditional prior distribution of the following form:

$$P(c_1(u) \mid c_1(N_S(u)), c_2(N_T(u))) = \frac{1}{Z} \exp\{-U(c_1(u), c_1(N_S(u)), c_2(N_T(u)))\}, \tag{1}$$

where c1(u), c1(NS(u)), c2(NT(u)) denote the class label of pixel u at time T1, class labels of its spatial neighbors (NS(u)) at time T1, and class labels of its temporal neighbors (NT(u)) at time T2 respectively; Z is a normalizing constant; U is the total spatial-temporal energy function. The total energy function U can be further modeled as a linear combination of a spatial component US and a temporal component UT:

$$U(c_1(u), c_1(N_S(u)), c_2(N_T(u))) = U_S(c_1(u), c_1(N_S(u))) + U_T(c_1(u), c_2(N_T(u))) \tag{2}$$

The spatial energy function US in Eq. (2) is characterized by the agreement in class labels between c(u) and c(NS(u)):

$$U_S(c(u), c(N_S(u))) = -\beta_S \sum_{u' \in N_S(u)} I(c(u) = c(u')), \tag{3}$$

where βS is a non-negative parameter controlling the spatial dependence; I(•) is an indicator function which equals 1 if c(u) = c(u′) is true and 0 otherwise. The temporal energy function UT is characterized by temporal transition probabilities:

$$U_T(c_1(u), c_2(N_T(u))) = -\beta_T \sum_{u' \in N_T(u)} P(c_1(u') \mid c_2(u')), \tag{4}$$

where P(c1(u′)|c2(u′)) is the transition probability of c2(u′)⇒c1(u′); βT is a non-negative parameter controlling the temporal dependence. The transition probability P(c1(u′)|c2(u′)) is the major component pertaining to the temporal dependences in the MRF model. The specification of P(c1(u′)|c2(u′)) is therefore crucial to correctly determine the contribution of temporal dependence to multi-temporal change detection. This formulation is advanced in this paper through further consideration of different transition probability models (Section 3.2). As we adopt a mutual MRF approach, the temporal dependence for an arbitrary pixel u at time T2 is defined in the same way. Therefore, the spatial-temporal dependence of a pixel u at time T2 given its spatial-temporal neighbors can be defined by analogy with Eqs. (1)–(4).

Fig. 3. MRF spatial-temporal neighborhood system. Axes X, Y are the spatial coordinates of pixels. Axis T is the temporal index of pixels (Liu et al., 2006).

3.2. Transition probability models

The transition probability model in Eq. (4) can be characterized in various ways. It can be based on expert opinion or empirically estimated from the images. Specifically, it can be specified as a global model by assuming spatial stationarity of the transition probabilities or a local model which is adaptive to the spatial variation of the transition probabilities. Transition probabilities can be expressed by a transition matrix T, whose (i, j)-th element Ti,j represents the probability of the change from class i to class j. In a global model, all transition probabilities are constant for all the pixels across the image, so one transition matrix is needed for the whole image. In a local model, transition probabilities vary at each pixel, so one transition matrix is needed for each pixel.

Considering a classification with three classes {ω1, ω2, ω3}, let us denote T(•) and T(u) as the global and local transition matrices from T2 to T1 respectively. The global model specifies the constant transition probabilities from T2 to T1 for the whole image with T(•) in Eq. (5), where c1(•) and c2(•) represent the labels of an arbitrary pixel at T1 and T2 respectively.

$$T(\bullet) = \begin{pmatrix}
P(c_1(\bullet)=\omega_1 \mid c_2(\bullet)=\omega_1) & P(c_1(\bullet)=\omega_2 \mid c_2(\bullet)=\omega_1) & P(c_1(\bullet)=\omega_3 \mid c_2(\bullet)=\omega_1) \\
P(c_1(\bullet)=\omega_1 \mid c_2(\bullet)=\omega_2) & P(c_1(\bullet)=\omega_2 \mid c_2(\bullet)=\omega_2) & P(c_1(\bullet)=\omega_3 \mid c_2(\bullet)=\omega_2) \\
P(c_1(\bullet)=\omega_1 \mid c_2(\bullet)=\omega_3) & P(c_1(\bullet)=\omega_2 \mid c_2(\bullet)=\omega_3) & P(c_1(\bullet)=\omega_3 \mid c_2(\bullet)=\omega_3)
\end{pmatrix} \tag{5}$$

By adding a location index u in the transition matrices, the local model specifies the pixel-wise transition probabilities from T2 to T1 at pixel u with T(u) in Eq. (6), where c1(u) and c2(u) represent the labels of the pixel u at T1 and T2 respectively.

$$T(u) = \begin{pmatrix}
P(c_1(u)=\omega_1 \mid c_2(u)=\omega_1) & P(c_1(u)=\omega_2 \mid c_2(u)=\omega_1) & P(c_1(u)=\omega_3 \mid c_2(u)=\omega_1) \\
P(c_1(u)=\omega_1 \mid c_2(u)=\omega_2) & P(c_1(u)=\omega_2 \mid c_2(u)=\omega_2) & P(c_1(u)=\omega_3 \mid c_2(u)=\omega_2) \\
P(c_1(u)=\omega_1 \mid c_2(u)=\omega_3) & P(c_1(u)=\omega_2 \mid c_2(u)=\omega_3) & P(c_1(u)=\omega_3 \mid c_2(u)=\omega_3)
\end{pmatrix} \tag{6}$$

In the following, three different transition probability models are discussed in detail. For the sake of brevity, we will only explicitly define the transition probability models from T2 to T1 (global model: T(•) and local model: T(u)). The transition probability models from T1 to T2 (global model: T†(•) and local model: T†(u)) can be similarly defined.
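As a rough illustration of how the energy terms of Section 3.1 combine into a conditional prior, the sketch below evaluates Eqs. (1)–(4) for a single pixel. It is only a sketch under stated assumptions: the function and parameter names are hypothetical, class labels are taken to be integer indices, lower energy is taken to mean higher probability, and the temporal term is read as scoring each candidate label at T1 against the label of every temporal neighbor at T2.

```python
import math


def conditional_prior(spatial_labels, temporal_labels, transition, beta_s, beta_t):
    """Conditional prior over the T1 label of one pixel, in the spirit of Eqs. (1)-(4).

    spatial_labels  : class indices of the spatial neighbors at T1
    temporal_labels : class indices of the temporal neighbors at T2
    transition      : nested list, transition[i][j] = P(c1 = j | c2 = i)
    beta_s, beta_t  : non-negative spatial and temporal weights
    """
    n_classes = len(transition)
    energies = []
    for k in range(n_classes):
        # Spatial energy: each spatial neighbor that agrees with label k lowers the energy.
        u_s = -beta_s * sum(1 for s in spatial_labels if s == k)
        # Temporal energy (assumed reading): likely transitions into k lower the energy.
        u_t = -beta_t * sum(transition[t][k] for t in temporal_labels)
        energies.append(u_s + u_t)
    # Subtract the minimum energy before exponentiating for numerical stability.
    lowest = min(energies)
    weights = [math.exp(-(e - lowest)) for e in energies]
    z = sum(weights)  # normalizing constant Z
    return [w / z for w in weights]
```

With both weights set to zero the prior is uniform, which gives a quick sanity check that the contextual terms are the only source of preference among labels.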

3.2.1. Model 1: Global transition model

In the global transition model, the transition probabilities are constant for every pixel and are estimated empirically by comparing the classified images. From the classified images, the total number of pixels with a specific class label ωi at T2 and the total number of pixels with a specific transition (from class label ωi at T2 to class label ωj at T1) can be counted. The (i, j)-th element in T(•), P(c1(•) = ωj | c2(•) = ωi), where ωi, ωj ∈ {ω1, ω2, ω3}, is then estimated as the proportion of pixels that are assigned class label ωj at T1 among all pixels that are assigned class label ωi at T2, as in Eq. (7).

$$P(c_1(\bullet)=\omega_j \mid c_2(\bullet)=\omega_i) = \frac{\sum_u I(c_1(u)=\omega_j \text{ AND } c_2(u)=\omega_i)}{\sum_u I(c_2(u)=\omega_i)} \tag{7}$$

The estimation starts from the initial non-contextual classification and is updated iteratively using the contextual classification (Section 3.3). With Eq. (7), we complete the specification of the global model defined in Eq. (5).
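The counting estimate in Eq. (7) can be sketched as follows. This is a minimal illustration, assuming the two classified images are supplied as flat sequences of integer class indices in the same pixel order; the function name and the uniform fallback for a class that never occurs at T2 are assumptions made here, not part of the original method.

```python
def estimate_global_transition(labels_t1, labels_t2, n_classes):
    """Estimate T[i][j] = P(c1 = j | c2 = i) by counting label pairs, as in Eq. (7)."""
    # counts[i][j] = number of pixels labeled i at T2 and j at T1
    counts = [[0] * n_classes for _ in range(n_classes)]
    for c1, c2 in zip(labels_t1, labels_t2):
        counts[c2][c1] += 1

    transition = []
    for row in counts:
        total = sum(row)  # number of pixels labeled i at T2
        if total > 0:
            transition.append([n / total for n in row])
        else:
            # Assumption: a class absent at T2 gets a uniform row to avoid dividing by zero.
            transition.append([1.0 / n_classes] * n_classes)
    return transition
```

In the iterative scheme described above, such a function would simply be called again on the updated label maps after each contextual classification pass.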


3.2.2. Model 2: Locally adjusted global transition model

The diagonal elements in the transition matrix T(•) in Eq. (5) represent the probabilities of no-changes whereas the off-diagonal elements specify the probabilities of changes. The global transition model applies the same transition probabilities to the whole image no matter whether changes occur or not. This may not be appropriate because pixels in areas where changes occur should have smaller values for the diagonal elements and larger values for the off-diagonal elements compared to pixels in areas with no changes.

In the locally adjusted global transition model, we incorporate a pixel-wise change probability into the global transition model to make it a local model. In doing so, we first estimate the pixel-wise probability that a change occurs at any pixel u, denoted by P(u), by a logistic function using some difference bands from the multi-temporal images. This gives us an approximate idea of where change occurs, thus allowing the transition probabilities to be adjusted locally. Then, we multiply all off-diagonal elements in T(•) by the probability of change P(u) and all diagonal elements by the probability of non-change (1 − P(u)). This will improve the global transition model in that pixels with a high probability of change will receive higher values in change transitions (off-diagonal elements) and lower values in non-change transitions (diagonal elements). The resultant matrix is denoted by T(u) in Eq. (8) as it depends on each pixel u in terms of its change probability. To make each row of the modified transition probability matrix T(u) sum to 1, we normalize each row in Eq. (8) by dividing each element by its row sum.

$$T(u) = \begin{pmatrix}
T_{1,1}(\bullet) \cdot (1 - P(u)) & T_{1,2}(\bullet) \cdot P(u) & T_{1,3}(\bullet) \cdot P(u) \\
T_{2,1}(\bullet) \cdot P(u) & T_{2,2}(\bullet) \cdot (1 - P(u)) & T_{2,3}(\bullet) \cdot P(u) \\
T_{3,1}(\bullet) \cdot P(u) & T_{3,2}(\bullet) \cdot P(u) & T_{3,3}(\bullet) \cdot (1 - P(u))
\end{pmatrix} \tag{8}$$
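The adjustment and row normalization of Eq. (8) can be sketched for a single pixel as follows. The sketch assumes the change probability P(u) has already been estimated elsewhere and is passed in as a number between 0 and 1; the function name and the fallback to the global row when a scaled row sums to zero are illustrative assumptions.

```python
def locally_adjusted_transition(global_transition, p_change):
    """Scale a global transition matrix by a pixel's change probability, as in Eq. (8).

    Diagonal (no-change) entries are multiplied by (1 - p_change), off-diagonal
    (change) entries by p_change, and each row is then renormalized to sum to 1.
    """
    adjusted = []
    for i, row in enumerate(global_transition):
        scaled = [
            value * ((1.0 - p_change) if i == j else p_change)
            for j, value in enumerate(row)
        ]
        total = sum(scaled)
        if total > 0:
            adjusted.append([v / total for v in scaled])
        else:
            # Assumption: keep the global row if every scaled entry is zero.
            adjusted.append(list(row))
    return adjusted
```

Under this sketch a change probability of 0.5 scales every entry equally, so normalization returns the global matrix unchanged; values above 0.5 shift mass toward the change transitions and values below 0.5 toward the no-change transitions.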

3.2.3. Model 3: Pixel-wise transition model

The locally adjusted global transition model achieves local adaptation through a pixel-wise change probability. This model is partially local in that it still relies on the global transition probabilities and does not make any adaptation to different types of changes. We could make the transition model completely local by estimating the pixel-wise transition probabilities explicitly. This essentially requires the pixel-wise joint probability of the class labels at two dates to be estimated from the images. Assuming Gaussian models for the images, we estimate the pixel-wise joint probability P(c1(u) = ωj, c2(u) = ωi), where ωi, ωj ∈ {ω1, ω2, ω3}, through maximum likelihood estimation with the spectral observations of training pixels. From P(c1(u) = ωj, c2(u) = ωi), the marginal probability P(c2(u) = ωi) can be derived by summing the joint probabilities P(c1(u) = ωj, c2(u) = ωi) over c1(u) ∈ {ω1, ω2, ω3}. Then, the transition probability P(c1(u) = ωj | c2(u) = ωi) is estimated as the conditional probability by dividing the joint probability by the marginal probability, as in Eq. (9).

$$P(c_1(u)=\omega_j \mid c_2(u)=\omega_i) = \frac{P(c_1(u)=\omega_j, c_2(u)=\omega_i)}{\sum_{\omega_k \in \{\omega_1, \omega_2, \omega_3\}} P(c_1(u)=\omega_k, c_2(u)=\omega_i)} \tag{9}$$

Eq. (9) corresponds to the (i, j)-th element Ti,j(u), with which we complete the specification of the local model defined in Eq. (6).
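The conversion from joint to conditional probabilities in Eq. (9) can be sketched for one pixel as below. The Gaussian maximum likelihood step that produces the joint probabilities is not reproduced here; the sketch assumes they are already available as a nested list indexed first by the T2 class and then by the T1 class, and the function name and uniform fallback are hypothetical.

```python
def pixelwise_transition(joint):
    """Turn one pixel's joint label probabilities into a transition matrix, as in Eq. (9).

    joint[i][j] is assumed to hold P(c2(u) = class i, c1(u) = class j).
    The result satisfies result[i][j] = P(c1(u) = class j | c2(u) = class i).
    """
    n_classes = len(joint)
    transition = []
    for row in joint:
        marginal = sum(row)  # P(c2(u) = class i), summed over the T1 classes
        if marginal > 0:
            transition.append([p / marginal for p in row])
        else:
            # Assumption: fall back to a uniform row when the marginal is zero.
            transition.append([1.0 / n_classes] * n_classes)
    return transition
```

Because each row is divided by its own marginal, the rows of the result sum to 1 without a separate normalization step.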


3.3. Spatial-temporal change detection

The spatial-temporal change detection algorithm developed here is also classification-based. However, it differs from the conventional post-classification comparison in that the multi-temporal images are not independently classified but are jointly classified with spatial-temporal dependence incorporated into the classification process using MRF. After the spatial-temporal classification, change detection is performed by comparing the jointly classified images. Specifically, the spatial-temporal classification makes use of the Bayes rule, MAP (maximum a posteriori), so that each pixel is assigned to the class label which maximizes the posterior probability. Considering a pixel indexed by u observed at T1, its posterior probability is further divided into two components: 1) the spectral conditional probability, P(c1(u)|d1(u)), which is modeled by a Gaussian distribution of the spectral data d1(u), and 2) the spatial-temporal conditional probability, P(c1(u)|c1(NS(u)), c2(NT(u))), which is modeled by MRF. Following Liu et al. (2006), the posterior probability is expressed in terms of three different energy functions:

$$P(c_1(u) \mid d_1(u), c_1(N_S(u)), c_2(N_T(u))) = \frac{1}{Z} \exp\{-[U_D(c_1(u), d_1(u)) + U_S(c_1(u), c_1(N_S(u))) + U_T(c_1(u), c_2(N_T(u)))]\}, \tag{10}$$

where D, S, and T stand for spectral data, spatial dependence, and temporal dependence respectively; UD(c1(u), d1(u)) is the spectral energy function, estimated as −ln[P(c1(u)|d1(u))]; US(c1(u), c1(NS(u))) is the spatial energy function, defined by Eq. (3); UT(c1(u), c2(NT(u))) is the temporal energy function, defined by Eq. (4), in which P(c1(u′)|c2(u′)) is specified according to one of the three models described in Section 3.2; Z is a normalization term to make the posterior probability range from 0 to 1. The classification is solved by an iterative optimization algorithm, called Iterated Conditional Modes (ICM) (Besag, 1986),

$$c_1(u)^{(t+1)} = \arg\sup \left\{ P\left(c_1(u) \mid d_1(u), c_1(N_S(u))^{(t)}, c_2(N_T(u))^{(t)}\right) \right\}, \tag{11}$$

where the MAP solution of pixel u at the (t + 1)-th iteration is conditioned on its spatial-temporal neighbors at the t-th iteration. The ICM starts with an initial non-contextual classification by maximizing the spectral conditional probability P(c1(u)|d1(u)), which is essentially a maximum likelihood classification. This will generate the initial spatial-temporal neighbors, from which ICM proceeds to update the classification by integrating the three energy functions from the spectral, spatial and temporal components until convergence is obtained. Model parameters involved in the spatial and temporal energy functions are crucial to determine the importance of different energy functions, and are estimated by genetic algorithms using the overall accuracy as the objective function (Brandt et al., 1999).

3.4. Model evaluations

To evaluate the performances of different transition probability models in accommodating the spatial variability of transition probabilities, it is necessary to test the models with different local regions with spatially varying transition probabilities. In doing so, we divide the whole image scene (TM, ETM+, and reference map) into 32 equal sized sub-scenes of 510 pixels by 510 pixels (Fig. 2). These 32 sub-scenes exhibit a wide range of spatial patterns of varying deforestation probabilities, allowing us to evaluate the performances of our local transition probability models. At each sub-scene, change detection results based on different models are obtained using 1% of reference pixels randomly selected from the sub-scene reference map as training data and then evaluated against the full reference sub-scene map. The model evaluation will generate six different accuracy indices at each sub-scene, namely:

- the overall classification accuracy of the 1990 image
- the overall classification accuracy of the 2001 image
- the overall post-classification comparison accuracy
- the user's accuracy of deforestation
- the producer's accuracy of deforestation
- the average of user's and producer's accuracies of deforestation.

There are two reasons that we also calculate the three accuracy indices for deforestation: 1) the detection of deforestation is our primary interest in this paper, and 2) as Fig. 2(c) indicates, the overall change detection accuracy will be dominated by the non-deforestation pixels (about 75% of all pixels are no-change), which will mask out the differences between different local transition probability models. The means and standard deviations of the six accuracy indices across the 32 sub-scenes are calculated as overall measures to evaluate the performances of different models in accommodating local variation in transition probabilities. In addition to the six accuracy indices, the means of MRF parameters on spatial and temporal energy functions for both years across the 32 sub-scenes are also calculated for further comparison of model performances.

Table 1. The means of accuracy indices

Models     Overall                           Deforestation
           1990 (%)   2001 (%)   PCC (%)     User (%)   Producer (%)   Average (%)
Model-0    94.3       91.7       87.5        81.8       82.6           82.2
Model-1    95.1       93.4       90.1        78.4       86.4           82.4
Model-2    95.2       93.0       90.2        79.8       88.6           84.2
Model-3    95.8       94.8       92.0        85.7       87.1           86.4

PCC: post-classification comparison, Model-0: non-contextual model, Model-1: global transition model, Model-2: locally adjusted global transition model, Model-3: pixel-wise transition model.
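The three deforestation indices listed in Section 3.4 can be sketched as follows, using the standard definitions of user's accuracy (correctly detected deforestation pixels over all pixels mapped as deforestation) and producer's accuracy (correctly detected deforestation pixels over all reference deforestation pixels). The function name and the boolean-mask input format are assumptions made for illustration.

```python
def deforestation_accuracy(predicted_change, reference_change):
    """User's, producer's, and average accuracy for the deforestation class.

    Both arguments are flat sequences of booleans in the same pixel order,
    where True marks a pixel labeled as deforestation (forest to non-forest).
    """
    true_positive = sum(
        1 for p, r in zip(predicted_change, reference_change) if p and r
    )
    n_predicted = sum(1 for p in predicted_change if p)
    n_reference = sum(1 for r in reference_change if r)

    # An index is undefined (NaN) when its denominator is zero.
    users = true_positive / n_predicted if n_predicted else float("nan")
    producers = true_positive / n_reference if n_reference else float("nan")
    return users, producers, (users + producers) / 2.0
```

A low user's accuracy corresponds to commission error (false deforestation) and a low producer's accuracy to omission error (missed deforestation), which is why the two indices are reported alongside their average.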

4. Results

The images of 1990 (TM) and 2001 (ETM+) at each sub-scene were classified into two classes (forest and non-forest), either independently or jointly using three different transition models. Change detection was then derived through post-classification comparison. In total, four change detection results were generated based on different classification models, from which six accuracy indices were calculated at each sub-scene. The means and standard deviations of these accuracy indices over the 32 sub-scenes are reported in Tables 1 and 2, respectively. The means of the MRF parameters for the spatial and temporal energy functions in both years are reported in Table 3.

The four models evaluated are denoted model-0, model-1, model-2, and model-3. Model-0 refers to change detection by comparing two independent classifications, which we call the non-contextual model. It serves as the benchmark against which the other three models are compared, as it is the most widely used multi-category change detection algorithm. Model-1, model-2, and model-3 correspond to spatial-temporal change detection by comparing two joint classifications linked by a global transition probability model, a locally adjusted global transition probability model, and a pixel-wise transition probability model, respectively.

Table 2
The standard deviations of accuracy indices

Models     Overall                            Deforestation
           1990 (%)   2001 (%)   PCC (%)      User (%)   Producer (%)   Average (%)
Model-0    2.3        2.9        4.1          6.6        9.7            6.8
Model-1    2.3        2.9        3.4          16.5       5.2            9.5
Model-2    1.9        2.4        2.9          7.9        6.3            5.8
Model-3    1.7        1.7        2.2          5.0        6.9            4.8

PCC: post-classification comparison, Model-0: non-contextual model, Model-1: global transition model, Model-2: locally adjusted global transition model, Model-3: pixel-wise transition model.

4.1. Means of accuracy indices

For the classification results, all the models achieved >90% accuracy for both 1990 and 2001. The accuracy for 1990 was consistently higher than that for 2001 for all the models. This may be attributed to two facts: 1) many forest areas in 1990 were cleared and converted to farmland by 2001, and 2) there was some spectral confusion between farmland and forest. Model-wise comparison indicated that model-0 had the lowest accuracies in both years, model-1 and model-2 were in the middle with similar accuracies, and model-3 achieved the highest accuracies in both years. This result was not surprising because model-0 did not use any spatial or temporal information, whereas the other three models used spatial-temporal contextual information in classification. It is interesting to note that the classification in 2001 benefited more from the use of spatial-temporal information, showing larger improvements from model-0 to the other three models. This may be because the temporal information conveyed from 1990 to 2001 was more reliable than that from 2001 to 1990, owing to the higher classification accuracy in 1990.

The performances of the four models in the overall accuracy of post-classification comparison followed the same pattern as in the individual classification accuracy. Compared to model-1 and model-2, model-3 improved more upon model-0,
indicating the effectiveness of the pixel-wise transition model in modeling temporal dependence.

In terms of the accuracy for deforestation, model-1 achieved a 3.8% improvement in producer's accuracy over model-0, but at the cost of a 3.4% decrease in user's accuracy, resulting in nearly no improvement in the average of the two indices. Model-2 gained 6% in producer's accuracy and lost only 2% in user's accuracy, resulting in a 2% improvement in average accuracy over model-0. In contrast, model-3 improved both producer's and user's accuracies, resulting in a 4.2% improvement in average accuracy over model-0. The fact that model-1 and model-2 reduced omission error at the expense of commission error may indicate that the global model tended to overestimate the transition probability from forest to non-forest in many pixels.

4.2. Standard deviations of accuracy indices

The standard deviations of the accuracy indices were calculated to evaluate the consistency of the different models over the 32 sub-scenes. In general, smaller standard deviations in Table 2 corresponded to higher mean accuracy indices in Table 1. Specifically, model-0 had the largest standard deviation in the overall classification and post-classification comparison accuracies, reflecting the spatial variation of spectral confusion between forest and agricultural land. Incorporating spatial-temporal dependence helped to reduce the spectral confusion, and thus reduced the variability in accuracy indices when comparing model-0 with the three contextual models.

The comparison among the three transition probability models is of more interest, given that the main objective of the paper is to compare the two local transition models with the global transition model. Model-1 had larger standard deviations in almost all of the accuracy indices compared to model-2 and model-3. In particular, the standard deviation of the user's accuracy of deforestation in model-1 was 16.5%, much larger than that in model-2 (7.9%) and model-3 (5.0%). This means that model-1 performed very well in some of the sub-scenes and very poorly in others. By adding a local component (the pixel-wise probability of change) to the global transition model, model-2 achieved much smaller standard deviations. The use of a fully local transition model in model-3 led to a further reduction in the standard deviations compared to model-2. Overall, the above results indicated that 1) the global transition model was insufficient to adapt to the spatial heterogeneity in the probability of deforestation; and 2) both the locally adjusted global model and the pixel-wise transition model were more adaptive to the spatial variation of transition probabilities.

4.3. Means of model parameters

Table 3
The means of model parameters

Models     βS (1990)   βT (1990)   βS (2001)   βT (2001)
Model-1    2.6         0.8         2.4         0.7
Model-2    1.9         1.5         3.3         0.5
Model-3    2.3         2.3         0.9         6.4

Model-1: global transition model, Model-2: locally adjusted global transition model, Model-3: pixel-wise transition model.
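As a rough illustration of how the weights βS and βT in Table 3 enter the classification, the sketch below combines a spectral, a spatial, and a temporal energy term for a single pixel, with the spectral weight fixed at 1. The function names, the energy definitions (negative log-probabilities and the share of disagreeing neighbours), and all input values are assumptions made for this example; this is not the paper's Eq. (10).

```python
import math

def total_energy(u_spectral, u_spatial, u_temporal, beta_s, beta_t):
    """Weighted sum of energies; the spectral term has weight 1."""
    return u_spectral + beta_s * u_spatial + beta_t * u_temporal

def classify_pixel(spectral_prob, neighbor_labels, transition_prob, beta_s, beta_t):
    """Return the class index with the lowest total energy at one pixel.

    spectral_prob[c]   : hypothetical class likelihood from the spectral data
    neighbor_labels    : current labels of the spatial neighbours
    transition_prob[c] : hypothetical probability of class c given the label
                         at the other date
    """
    energies = []
    for c in range(len(spectral_prob)):
        u_spectral = -math.log(spectral_prob[c])
        # Assumed Potts-style spatial term: share of neighbours that disagree.
        u_spatial = sum(1 for n in neighbor_labels if n != c) / len(neighbor_labels)
        u_temporal = -math.log(transition_prob[c])
        energies.append(total_energy(u_spectral, u_spatial, u_temporal, beta_s, beta_t))
    return min(range(len(energies)), key=lambda c: energies[c])

# Classes: 0 = forest, 1 = non-forest. All inputs below are made up.
spectral_prob = [0.45, 0.55]      # spectral evidence slightly favours non-forest
neighbor_labels = [0, 0, 0, 1]    # most neighbours are forest
transition_prob = [0.9, 0.1]      # temporal model favours forest

# beta_s = beta_t = 0 mimics the non-contextual model-0.
label_model0 = classify_pixel(spectral_prob, neighbor_labels, transition_prob, 0.0, 0.0)
# Mean 2001 weights of model-3 taken from Table 3.
label_model3 = classify_pixel(spectral_prob, neighbor_labels, transition_prob, 0.9, 6.4)
print(label_model0, label_model3)
```

With these made-up inputs, the zero-weight setting follows the spectral evidence alone and labels the pixel non-forest, whereas the model-3 weights let the temporal term dominate and label it forest.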

The model parameters listed in Table 3 are their means over the 32 sub-scenes. As model-0 is a non-contextual model, its model parameters can be regarded as zeros, so that the spatial and temporal information makes no contribution to the model. From Eq. (10), it is easy to show that the parameter for spectral data defaults to 1, so the model parameters (βS and βT) reflect the relative weights of the spatial and temporal dependence with respect to spectral information. For the 1990 image, the spatial weight in model-1 was larger than 1 and about three times the temporal weight, which was less than 1, indicating that temporal information was not as useful as spatial information; the spatial and temporal weights in model-2 and model-3 were about the same and both larger than 1.

For the 2001 image, the comparison of spatial and temporal weights among the three models is more revealing. Both model-1 and model-2 had much larger spatial weights than temporal ones; this may be attributed to the inaccurate temporal information contained in the global transition models, so that small temporal weights were found by the genetic algorithm to limit its influence. In contrast, model-3 had a much larger temporal weight than model-1 and model-2. This indicated that the pixel-wise transition model contained much more accurate temporal information owing to its improved spatial adaptation, which greatly increased the contribution of temporal information in the classification model. It was also interesting to note that the temporal weight in model-3 was much larger than its spatial counterpart. This was probably because the relatively low classification accuracy in 2001 made the spatial information less useful than the temporal information, leading to a smaller contribution from spatial information in the model.

5. Discussion and conclusions

Change detection based on the comparison of independently classified images (also known as post-classification comparison) is widely used in the remote sensing community. Although conceptually simple, this method can be negatively affected by classification errors in the individual maps. Incorporating spatial-temporal contextual information in the classification algorithm helps to reduce the classification errors, thus improving the change detection results. In this paper, a Markov Random Fields (MRF) model was used to integrate spatial-temporal information with spectral information for multi-temporal classification in an attempt to mitigate the impacts of classification errors on change detection. One important component of the MRF model is the specification of transition probabilities. Traditionally, a global transition probability model is used that assumes spatial stationarity of transition probabilities across the image scene, which may not be valid for areas with varying transition probabilities. In this paper, we relaxed the stationarity assumption and developed two local transition probability models to make the transition model locally adaptive to spatially varying transition probabilities. The first model called locally adjusted
global transition model adapts to the local variation by multiplying a pixel-wise probability of change with the global transition model. When applied to forest change detection in Paraguay, this model showed significant improvements in various accuracy indices compared to the global model and the non-contextual model. This model, however, is not a fully locally adaptive model because it is based on the global transition model and does not differentiate between different types of change. The second model, called the pixel-wise transition model, was developed as a fully local model based on the estimation of pixel-wise joint probabilities. The change detection results showed that this model outperformed all other evaluated models on the various accuracy indices. This model also showed higher consistency over the different regions of a large image scene, with the lowest standard deviations of the accuracy indices, indicating that the pixel-wise transition model better captured the spatial heterogeneity of the transition probabilities than the global model and the locally adjusted global model. This was further illustrated by the fact that the largest temporal weight associated with the transition probabilities was obtained from the pixel-wise transition model.

Computationally, the global model is the cheapest and fastest. The locally adjusted global model additionally requires estimating the probability of change from a logistic function, which adds little cost. The pixel-wise transition model requires the estimation of joint probabilities, which may become computationally more expensive as the number of class combinations increases. However, since most classifications involve only a small number of classes, the computational burden is manageable. With a large number of classes, the locally adjusted model seems to be a good compromise between accuracy and computation.
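The locally adjusted global transition model can be sketched from the description above alone: a pixel-wise probability of change, obtained from a logistic function, is multiplied with the global transition model. The particular reweighting and row renormalization used here, the single predictor, the coefficients, and the matrix values are all assumptions for illustration and should not be read as the paper's exact formulation.

```python
import math

# Hypothetical global transition matrix P(class at t2 | class at t1).
# Classes: 0 = forest, 1 = non-forest. Values are illustrative only.
GLOBAL_TRANSITION = [
    [0.80, 0.20],
    [0.05, 0.95],
]

def probability_of_change(distance_km, intercept=1.0, slope=-0.8):
    """Logistic probability of change from one made-up predictor.

    The predictor (distance to the nearest cleared area) and the
    coefficients are assumptions, not values from the paper.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * distance_km)))

def locally_adjusted_transition(p_change, global_transition=GLOBAL_TRANSITION):
    """Reweight a global transition matrix with a pixel-wise change probability.

    Assumed form: change (off-diagonal) entries are multiplied by p_change,
    no-change (diagonal) entries by 1 - p_change, and each row is renormalized
    so that it remains a probability distribution.
    """
    adjusted = []
    for i, row in enumerate(global_transition):
        weighted = [p * (p_change if i != j else 1.0 - p_change)
                    for j, p in enumerate(row)]
        total = sum(weighted)
        adjusted.append([w / total for w in weighted])
    return adjusted

# A pixel close to existing clearings versus one deep inside intact forest.
near = locally_adjusted_transition(probability_of_change(0.5))
far = locally_adjusted_transition(probability_of_change(10.0))
print(near[0][1], far[0][1])  # forest -> non-forest probability at each pixel
```

Because a single change probability rescales every off-diagonal entry, this form cannot distinguish between types of change, which is the limitation noted above for the locally adjusted model. A pixel-wise transition model would instead need a separate probability for every class pair at every pixel.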
In summary, we developed two local transition probability models to obtain more accurate temporal information for forest change detection. The proposed models were incorporated into an MRF model as temporal energy functions within a spatial-temporal classification framework for change detection. The results from applying the proposed models to forest change detection in Paraguay showed their superior performance in identifying the change from forest to non-forest compared to the global transition model and the non-contextual model. This indicates that local transition probability models can represent temporal information more accurately in change detection algorithms based on spatial-temporal classification of multi-temporal images.

Acknowledgements

This study was made possible through the following NASA programs: REASoN (NNG 04GC53A), Land Cover Land Use Change (NAG5-9337), and Advancing Collaborative Connections for Earth–Sun System Science (NNH05ZDA001N-ACCESS). Peng Gong was supported by an NNSFC grant (30590370) and a "one hundred talented people" grant from the Chinese Academy of Sciences. The paper was strengthened by the comments of three anonymous reviewers.
