Change Detection Based on Multi-Feature Clustering Using ... - MDPI

3 downloads 0 Views 7MB Size Report
Oct 21, 2018 - Interferences in Landsat images include Gaussian noise and local ... statistical distribution of the changed and unchanged classes. ... the supervised methods, while there is no need for training data in .... DI can be mapped to a data point in the feature space according to its ...... Without loss of generality, the.
remote sensing Article

Change Detection Based on Multi-Feature Clustering Using Differential Evolution for Landsat Imagery Mi Song , Yanfei Zhong *

and Ailong Ma *

State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China; [email protected] * Correspondence: [email protected] (Y.Z.); [email protected] (A.M.) Received: 31 August 2018; Accepted: 16 October 2018; Published: 21 October 2018

 

Abstract: Change detection (CD) of natural land cover is important for environmental protection and to maintain an ecological balance. The Landsat series of satellites provide continuous observation of the Earth’s surface and is sensitive to reflection of water, soil and vegetation. It offers fine spatial resolutions (15–80 m) and short revisit times (16–18 days). Therefore, Landsat imagery is suitable for monitoring natural land cover changes. Clustering-based CD methods using evolutionary algorithms (EAs) can be applied to Landsat images to obtain optimal changed and unchanged clustering centers (clusters) with minimum clustering index. However, they directly analyze difference image (DI), which finds itself subject to interference by Gaussian noise and local brightness distortion in Landsat data, resulting in false alarms in detection results. In order to reduce image interferences and improve CD accuracy, we proposed an unsupervised CD method based on multi-feature clustering using the differential evolution algorithm (M-DECD) for Landsat Imagery. First, according to characteristics of Landsat data, a multi-feature space is constructed with three elements: Wiener de-noising, detail enhancement, and structural similarity. Then, a CD method based on differential evolution (DE) algorithm and fuzzy clustering is proposed to obtain global optimal clusters in the multi-feature space, and generate a binary change map (CM). In addition, the control parameters of the DE algorithm are adjusted to improve the robustness of M-DECD. The experimental results obtained with four Landsat datasets confirm the effectiveness of M-DECD. Compared with the results of conventional methods and the current state-of-the-art methods based on evolutionary clustering, the detection accuracies of the M-DECD on the Mexico dataset and the Sardinia dataset are very close to the best results. The accuracies of the M-DECD in the Alaska dataset and the large Canada dataset increased by about 3.3% and 11.9%, respectively. This indicates that multiple features are suitable for Landsat images and the DE algorithm is effective in searching for an optimal CD result. Keywords: change detection; detail enhancement; differential evolution; multi-feature clustering; noise robust; remote sensing imagery; structural similarity

1. Introduction Human activities and frequent natural disasters accelerate natural land-cover changes on the Earth. Over the past few decades, remote sensing image CD has become the primary solution to effectively monitor changes in large areas. CD is a process that aims at identifying differences in land cover by analyzing the multi-temporal images acquired in the same geographical area [1]. Timely and accurate CD of natural land cover can provide foundational data for disaster assessment, environmental protection, sustainable development, and maintenance of ecological balance [2,3]. The Landsat series of satellites has been providing continuous observation of the Earth’s surface since 1972; its spatio-temporal historical extension is ripe with information that detect land cover changes. Furthermore, the Landsat program has launched seven satellites (Landsat 1, 2, 3, 4, 5, 7, 8) and Remote Sens. 2018, 10, 1664; doi:10.3390/rs10101664

www.mdpi.com/journal/remotesensing

Remote Sens. 2018, 10, 1664

2 of 26

is equipped with five types of sensors—Multispectral Scanner (MSS), Thematic Mapper (TM), Enhanced Thematic Mapper (ETM+), Operational Land Imager (OLI), and Thermal Infrared Sensor (TIRS)—which have moderate spatial resolution (15–80 m) and short revisit times (16–18 days). These sensors are suitable for observing natural land cover, such as water, soil, coastlines, and vegetation. In addition, since 2008, the US Geological Survey (USGS) has provided more than 11 million current and historical Landsat images free of charge to users over the Internet (https://earthexplorer.usgs.gov/), with ease of access. Thus, Landsat imagery has been widely used in monitoring the natural land cover changes caused by floods [4], forest fires [5,6], and glacial melting [7]. Interferences in Landsat images include Gaussian noise and local brightness distortion; it is not sensitive to the changes of small objects. On one hand, the process of acquiring a remote sensing image—from acquisition to transmission—is inevitably affected by different degrees of noise, e.g., speckle noise in SAR images and Gaussian noise in optical satellite images. On the other hand, Landsat satellites carry passive sensors that record electromagnetic energy; however, the passive sensors are sensitive to the difference in illumination and atmospheric conditions. Therefore, the multi-temporal Landsat images generally have a difference in local brightness [8]. In addition, since intra-class variation and inter-class variation in the Landsat image are relatively small, the expression of changed details is weak. As the spatial resolution of the Landsat image is not high (≥15 m) and the intra-class variability in the image is small, pixel-based methods are suitable for Landsat image CD, which can be further divided into threshold-based or classification-based [3]. In the threshold-based method [9–13], it is crucial to analyze the statistical distribution of DI and find the optimal threshold to distinguish changed pixels from the unchanged ones. For example, Kittler and Illingworth proposed a minimum-error thresholding algorithm (KI), which determines the threshold by optimizing the average pixel classification error rate [11]. The expectation maximization (EM) algorithm automatically selects the decision threshold by minimizing the overall CD error probability under the assumption that pixels in the DI are independent [9]. Zanetti constructed a compound model to describe the statistical distribution of the DI, and the threshold value was determined by the EM algorithm and Bayes decision [12]. Threshold-based methods are efficient and useful. However, they require an accurate estimation of the DI statistical distribution and decision threshold, and are sensitive to image interferences. The Markov random field (MRF) model was introduced to integrate the spatial-contextual information included in the neighborhood of each pixel, and it can remove small false detection regions [9]. However, MRF also requires the selection or estimation of a model for the statistical distribution of the changed and unchanged classes. As CD is essentially a classification problem, there exist in literature both supervised and unsupervised approaches (clustering-based) to address it [14–21]. A set of training data is required for the supervised methods, while there is no need for training data in the clustering-based methods. Thus, the clustering-based methods are commonly applied in real applications. Clustering is a classification process under the assumption that samples of the same cluster are more similar to one another than samples belonging to different clusters. In clustering-based CD methods, because the image pixels belonging to different clusters cannot be separated with sharp boundaries, one of the most popular methods is the fuzzy c-means (FCM) algorithm [17–21]. The conventional FCM algorithm is not robust to image noise and local brightness distortion because the spatial-contextual information is not taken into account [17]. Krindis and Chatzis proposed a fuzzy local information c-means clustering algorithm (FLICM), which uses a fuzzy local similarity measure to resist image noise. In particular, a fuzzy factor is introduced into its objective function to enhance the clustering performance [18]. Gong et al. proposed a reformulated fuzzy local-information c-means clustering algorithm to classify the changed and unchanged regions in the fused DI [19]. However, fuzzy clustering-based CD methods are sensitive to the initialization of the clusters and easily get stuck to the local minima because the distribution of DI is complex due to image interferences.

Remote Sens. 2018, 10, 1664

3 of 26

Since EAs have promising optimization ability, which can jump out of the local minima, EAs-based optimal clustering has been applied to band selection [22], image classification [23–25], and change detection [8,20,21,26–32]. Celik used the genetic algorithm (GA) to generate optimal CM with minimum mean square error (MSE) [20]. This method is suitable for CD on different types of satellite images. However, the algorithm is sensitive to image interferences and converges slowly due to its encoding strategy, which is based on the whole binary mask. Ghosh et al. proposed a context-sensitive CD method, which adopts a mean filter to remove the image noise in DI and uses GA to evolved the changed and unchanged clusters [21]. Li et al. proposed an evolutionary multi-objective fuzzy clustering method, which can obtain a set of clusters with different trade-offs between removing noise and preserving detail [28]. However, filtering will blur the weak changed details, especially on boundaries or edges, and it cannot resist local brightness distortion in the multi-temporal images. Thus, although the above EAs-based methods can obtain optimal clusters, they are overly dependent on the DI and are affected by image interferences, resulting in inaccurate detection results. In order to reduce image interferences and improve CD accuracy, we propose an unsupervised CD method based on multi-feature clustering using DE algorithm (M-DECD) for Landsat Imagery. In M-DECD, according to characteristics of Landsat images, the Wiener filter, edge detection and structural similarity (SSIM) index are used to extract spectral-spatial features from the multi-temporal images and DI. To obtain optimal clusters in the multi-feature space, a CD method based on DE algorithm and FCM is proposed and an adaptive strategy adopted to adjust DE’s control parameters. Compared to previous works, the major contributions of this paper are as follows. (1)

(2)

(3)

The optimal multi-feature clustering-based CD framework. This framework is composed of two steps. In the first step, a suitable multi-feature space is constructed according to characteristics of the data. In the second step, an EA is adopted to search for optimal CD clusters using the minimum clustering index. The binary CM can be generated according to the distance between the pixels and the clusters. A multi-feature space for Landsat images CD. Three complementary spectral-spatial features are extracted from the multi-temporal images and DI, including the Wiener de-noising, structural similarity, and detail enhancement, which are designed to resist the interference of the Gaussian noise and local brightness distortion, while preserving weak changes. Adaptive DE algorithm for the optimization of CD results. Since the DE algorithm has promising global optimization capabilities, it is employed to search for the optimal changed and unchanged clusters in the complex multi-feature space. The two clusters are encoded into an individual, and the objective function derived from the FCM is utilized to evaluate the quality of clusters. The genetic operators, i.e., crossover, mutation and selection are used for evolutionary processes. In addition, the control parameters of DE, i.e., the scaling factor F and crossover rate CR, which have significant influence on optimization performance, have been adaptively adjusted to improve the robustness of the proposed method.

The experimental results show that the M-DECD is robust to interferences of Landsat images, and it can preserve some weak changed edges of small objects. Compared with the conventional methods and the state-of-the-art methods based on evolutionary clustering, M-DECD can generate a better CM and improve detection accuracy, especially in the Canada dataset. This paper is organized as follows. Section 2 briefly introduces the FCM algorithm and the basic DE algorithm. Section 3 presents the methodology of the proposed M-DECD, including the construction of multi-feature space and the adaptive DE algorithm. Dataset description, experimental results, and parameters analyses are presented in Section 4. Finally, the conclusions are drawn in Section 5.

Remote Sens. 2018, 10, 1664

4 of 26

2. Background 2.1. Fuzzy Clustering-Based Change Detection Each pixel of the DI can be mapped to a data point in the feature space according to its gray value. Clustering-based CD approaches attempt to divide the DI into the changed and unchanged classes by searching two clusters in the feature space. Since the pixels of DI belonging to the two clusters are highly overlapped in the feature space, the fuzzy clustering is more suitable to partition the DI than the crisp clustering in which the data points are divided into distinct clusters by hard boundaries, such as k-means. In fuzzy clustering, data points can potentially belong to multiple clusters, which have robust characteristics for ambiguity. Thus, fuzzy clustering is effective in separating the overlapping clusters. One of the most widely used fuzzy clustering algorithm is the FCM algorithm [3,17–21,28,31]. It is put in a tabular form in Table 1. Table 1. FCM algorithm. Input unlabeled data, output clusters, fuzzy membership matrix. Set number of classes, fuzzy component, termination condition. Step 1.

Initialize clusters randomly and compute the membership matrix U (0) using (2)

Step 2

Calculate the new clusters G (k) with fuzzy membership matrix U (k) using (3)

Step 3.

Update the fuzzy membership U (k) → U (k+1) using (2)

Step 4.

Repeat Step 2 and Step 3, until the termination condition is met.

FCM is an iterative method that searches the clusters and produces the partition by minimizing the objective function Jm distance [3]. N

Jm =

c





∑ ∑ umn,k Xn − Gk

(1)

n =1 k

For the fuzzy clustering-based CD, partition c = {0, 1}. N is the size of the image, Xn denotes the n-th pixel (n-th data point in the feature space). G1 and G0 correspond to the changed and unchanged clusters, respectively. un,k is the fuzzy membership of belonging of pixel Xn to the cluster G k . m is the fuzzy component. k · k represents the Euclidean distance. The membership un,k and the clusters G k are updated by using Equations (2) and (3), respectively. −1/(m−1)

un,k =

k Xn − G k k c

−1/(m−1) ∑ Xn − G k

(2)

k

N

∑ um n,k Xn

G (k) =

n =1 c

(3)

∑ um n,k k

However, the FCM is sensitive to the initialization of the clusters and usually converges to a local minimum, which can affect detection accuracy. In order to find the optimal clusters with the minimum Jm distance, we can combine FCM with a global optimization algorithm, such as the DE algorithm. 2.2. Differential Evolution Algorithm The DE algorithm was first proposed by Storn and Price [33], and it is a real value-encoded EA with promising global optimization capability. DE often performs well in all types of optimization problems without any assumption, and it has been proven effective in image clustering [23]. Zhong et al.

Remote Sens. 2018, 10, 1664

5 of 26

proposed a clustering method based on the adaptive multi-objective DE algorithm, which could achieve high accuracy [24]. Ma et al. proposed a multi-objective memetic algorithm for remote sensing image clustering, which adopt DE in the global search step [25]. The classical DE algorithm is put in a tabular form in Table 2. DE is a population-based stochastic heuristic global search method and one of the most powerful real-parameter optimization algorithms in current use [34,35]. The purpose of DE is to search for a global optimal point, which can minimize the specific optimization problem, in a D dimensional real parameter space, defined as:  min f G1 , . . . , Gj , . . . , GD s.t. Low j ≤ Gj ≤ U p j , j = 1, 2, . . . , D

(4)

where Gj is the j-th dimension of point G. Low j and U p j denote the minimum and maximum boundaries of the j-th dimension of the real parameter space. DE begins with a randomly initiated population of NP real-valued vectors, and evolves them through evaluation, mutation, crossover, and selection. The crucial idea of DE is to perturb the current generation population members with the scaled differences of randomly selected and distinct population members. After each generation, DE preserves the elitist vectors and abandons the inferior ones. Each vector is then considered the target vector and the evolutionary process is repeated until the termination condition is met, e.g., reach the maximum iterations. In this paper, DE is employed to optimize the objective function of CD and search for the optimal clusters in the multi-feature space due to its simplicity, ease of implementation, fast convergence, and robustness. To the best of the authors’ knowledge, the use of DE for optimization of CD result has not been reported in the literature. Details are described in Section 3.2. Table 2. Differential evolution algorithm. Model the objective function according to the specific optimization problem.Set population size NP, termination condition, crossover rate CR, and scale factor F. Step 1.

Initialization: Initialize the population of NP real-valued parameter vectors randomly. Each vector, also known as individual, forms a candidate solution to the optimization problem.

Step 2.

Evolutionary process. 2.1. Fitness evaluation: Fitness is the objective function value of each individual, which is used to evaluate the quality of the solution. 2.2. Mutation: For each target individual, randomly choose other two individuals from the population and generate a difference vector. A mutant individual is created by adding the weighted difference vector to the target vector under the scale factor F. 2.3. Crossover: The trial vector is generated by mixing the components of the mutant vector with the target vector under the crossover rate CR. 2.4. Greedy selection: For a minimization problem, the solution is better with smaller fitness; thus the target individual will survive to the next generation if it has smaller fitness than the trial individual, and vice versa.

Step 3.

Repeat the evolutionary process until the termination condition is met, then output the optimal solution with minimum fitness value.

3. Proposed Methodology The framework of the proposed M-DECD (shown in Figure 1) is composed of two main steps: (1) Constructing a multi-feature space from the multi-temporal images and DI; and (2) adopting the adaptive DE algorithm to search for the optimal changed and unchanged clusters and generate the binary CM. For a better understanding, the meaning of commonly used symbols is listed in Table 3.

Remote Sens. 2018, 10, 1664

6 of 26

Table 3. Symbol annotation. Symbol

Meaning

XT1 ,XT2 Xd H,W CM Xedge Xw Xdetail XSSI M X F CR

multi-temporal images difference image image height and width binary change map edge feature wiener de-noised feature detail enhancement feature structural similarity feature multi-feature scale factor crossover rate

We assume that two co-registered images XT1 and XT2 of size H × W are captured over the same geographical area at two different times, T1 and T2 , respectively. The purpose of the M-DECD is to generate an accurate CM, CM = {cm(i, j)|1 ≤ i ≤ H, 1 ≤ j ≤ W }, where cm(i, j) ∈ {0, 1}. The first step of the M-DECD is to calculate a DI using multi-temporal images XT1 and XT2 . For the Landsat images, we use the images subtraction to compute the DI (Xd ), which calculates the absolute value of the spectral difference. Xd = | XT1 − XT2 |

(5)

Remote Sens. 2018, 10, 1664 Remote Sens. 2018, 10, x FOR PEER REVIEW

7 of 26 7 of 26

Figure 1.Figure Framework of the proposed M-DECD. (a) Given two registered images, the DIthe is DI generated by images subtraction. (b) (b) Three image features areare extracted 1. Framework of the proposed M-DECD. (a) Given two registered images, is generated by images subtraction. Three image features extractedtoto constructconstruct a multi-feature space, including the Wiener de-noised feature, detaildetail enhancement feature and and structural similarity feature. (c) (c) TheThe optimal clusters ininthe a multi-feature space, including the Wiener de-noised feature, enhancement feature structural similarity feature. optimal clusters the multi-feature space are obtained by the adaptive DE algorithm. (d) Generate the binary CM according to the optimal clusters and fuzzy membership matrix. multi-feature space are obtained by the adaptive DE algorithm. (d) Generate the binary CM according to the optimal clusters and fuzzy membership matrix.

Remote Sens. 2018, 10, 1664

8 of 26

3.1. Multifeature Space Construction According to the characteristics of Landsat images, a multi-feature space is constructed to resist the interferences of Gaussian noise and local brightness distortion, while preserving important weak changes. The multi-feature space consists of three complementary elements: Wiener de-noising, the detail enhancement and structural similarity, which are extracted from multi-temporal images and DI. 3.1.1. Wiener De-Noised Feature Image denoising is an important step in digital image processing, and it can effectively reduce the small false detection regions in CD. The evolutionary clustering-based CD methods apply mean filtering to the difference image and utilize the average value of the neighborhood pixels as the spatial feature [21,28]. Although the mean filter can remove some image noise, it will blur important changed details, such as boundaries or edges. For Landsat images, Gaussian noise is generated by the resistive components of the receiver. Since the Wiener filter can effectively remove Gaussian noise and retain the high-frequency information of the image (e.g., boundaries and edges), in M-DECD, we apply Wiener filtering to the DI and generate the Wiener de-noised feature. The Wiener filter removes the noise by minimizing the MSE between the output image and the expected result, and it can adaptively conduct smoothing according to the local image variance. When the local variance is large, it performs little smoothing, and vice versa. Therefore, the quality of filtering result will not decrease drastically with an increase of the filter window size [36]. The Wiener filter [37] estimates the local mean µ and variance σ2 around each pixel using Equations (6) and (7), respectively. The Wiener de-noised feature Xw is then calculated using Equation (8). η is the n × m local neighborhood of each pixel in Xd , and v2 is the average of the local estimated variances. 1 µ= Xd (i, j) (6) nm i,j∑ ∈η σ2 =

  1 2 2 X i, j − µ ( ) d nm i,j∑ ∈η

Xw (i, j) = µ +

σ 2 − v2 ( Xd (i, j) − µ) σ2

(7)

(8)

3.1.2. Detail Enhancement Feature For Landsat images, the intra-class variation and inter-class variation are relatively small, and the expression of changed details is weak, especially at the edge of the changed region. Thus, except for the noise removal, another challenge of Landsat images CD is detail preservation. In [38], a modified Perona-Malik filter was employed to remove image noise and extract stable edges for CD. In [39], a novel edge-preserving MRF method for CD was proposed, which integrated the edge and detail-preserving information in the MRF modeling. In the proposed M-DECD, the image details are enhanced by fusing the edge features with the DI. In order to obtain a full range of edge features, four edge detection operators of different directions are applied to the DI, including 0◦ , 90◦ , 180◦ , and 270◦ . The convolution templates are shown in Figure 2.

Remote Sens. 2018, 10, 1664 Remote Sens. 2018, 10, x FOR PEER REVIEW

(a)

9 of 26 9 of 26

(b)

(c)

(d)

◦ . (b) ◦ . (c) 180 ◦ . (d) 270°. Figure2.2.The Theconvolution convolutiontemplates templatesfor foredge edgedetection. detection.(a) (a)00°. (b) 90 90°. 180°. Figure 270◦ .

After edge features XedgeX, edge in order to prevent abnormal sample data possibly After extraction extractionofofmultiple multiple edge features , in order to prevent abnormal sample data arising out of the multiple features from interfering with the subsequent change judgment, all the possibly arising out of the multiple features from interfering with the subsequent change judgment, features need be normalized to [0, 1]. The DI is then combined with the edge features using the all the features need be normalized to [0, 1]. The DI is then combined with the edge features using weighted sum strategy and the detail enhancement feature Xdetail is defined as: the weighted sum strategy and the detail enhancement feature X detail is defined as: 4 4 Xdetail = Xd + ∑ Xedge (k) (9) (9) X  X  X k  detail



d k =1 k 1

edge

3.1.3. Structural Similarity Feature 3.1.3. Structural Similarity Feature In order to resist the local brightness difference between the multi-temporal Landsat images, the order resist to themeasure local brightness difference between the multi-temporal Landsat images, SSIMIn [40] was to utilized the similarity between the multi-temporal images, and is calculated the SSIM [40] was utilized to measure the similarity between the multi-temporal images, and is based on local luminance, contrast, and structure comparisons. The SSIM has been used in various calculated based contrast, and structure comparisons. SSIM [41]. has been applications, suchon as local imageluminance, compression [40], video surveillance, and object The detection In [8],used the in various applications, such as image compression [40], video surveillance, and object detection difference image was obtained by the SSIM. [41]. The In [8], the uses difference image was obtained by luminance the SSIM. l, contrast c, and structure s, to compute SSIM three components, including l c , and s , to The SSIM uses three components, including luminance the structural similarity feature XSSI M using Equation (10), where,α,contrast β, and γ are the structure exponents used    compute the structural similarity feature using Equation (10), where , , and are the X SSIM to adjust the influence of each measurement. In M-DECD, α = β = γ = 1. The variables l, c, and s are exponents using used Equation to adjust(11). the influence of each measurement. In M-DECD,       1 . The calculated variables l , c , and s are calculated using Equation (11). γ XSSI M = [l ( XT1 , XT2 )]α [c( XT1 , XT2 )]β [s( XT1 , XT2 )] (10)  X SSIM l  X T 1 , X T 2  c  X T 1 , X T 2   s  X T 1 , X T 2  (10) 2µ XT1 µ XT2    l ( X1 , X2 ) = µXT1 +µXT2 +ε 1     22σ  XX σXX 2 (11) c(XlT1 X, X X 2) = σXT1 +TT11σXT2TT2  +ε 2 1 , T2         X X 1   T1 T2   s(XT1 , XT2 ) = σXT1 XT2 σX σX +ε 3

2 XTT21 XT 2 T1  c  X T1, X T 2   (11)  where µ, σ, and σXT1 XT2 are the mean, standard deviation, cross correlation between XT1 and XT2 ,  XT 1  and XT 2   2  respectively. ε 1 , ε 2 , and ε 3 are small positive constants to avoid instability. The image statistics (µ, σ,   XT 1 XT 2 and σXT1 XT2 ) are computed as proposed in s [40]. X T1, X T 2    the local structures  XT 1 XT 2  of If XSSI M is close to 1, it indicates that 3 the two-phase images have a high similarity. On the contrary, if XSSI M is close to 0, it means that the local structures of the two-phase  , quite  X T 1 X T 2 are the mean, standard deviation, and cross correlation between X T 1  , and where are images different. extracting the1Wiener de-noised feature, the detailconstants enhancement feature and theThe structural  3 are and After ,  2 , and small positive to avoid instability. image X T 2 , respectively. similarity feature, a multi-feature space X = X , X , X is constructed by stacking these [ ] w SSI M detail   statistics ( , , and  X T 1 X T 2 ) are computed as proposed in [40]. feature maps. Then, each pixel in the DI can be mapped to a data point in the multi-feature space and If X is close to 1, it indicates that the local structures of the two-phase images have a high the CD is SSIM transformed to a classification problem, and it is crucial to find the optimal changed and similarity. On the contrary, if X SSIM is close unchanged clusters in the multi-feature space.to 0, it means that the local structures of the two-phase images are quite different. After extracting the Wiener de-noised feature, the detail enhancement feature and the structural similarity feature, a multi-feature space X  X w , X detail, X SSIM  is constructed by stacking these feature maps. Then, each pixel in the DI can be mapped to a data point in the multi-feature space and

Remote Sens. 2018, 10, x FOR PEER REVIEW

10 of 26

the CD is transformed to a classification problem, and it is crucial to find the optimal changed and unchanged clusters in the multi-feature space.

Remote Sens. 2018, 10, 1664

10 of 26

3.2. Adaptive Differential Evolution-Based Change Detection

3.2. Adaptive Differential Change Detection In M-DECD, a CD Evolution-Based method based on adaptive DE algorithm and FCM is proposed. The DE is employed to search the optimal clusters in the multi-feature space minimizing theThe objective In M-DECD, a CD method based on adaptive DE algorithm and by FCM is proposed. DE is function derived FCM. clusters Input the multi-feature data , then randomly X by  Xminimizing employed to searchfrom the optimal in the multi-feature space theobjective function w , X detail, X SSIM derived Input the multi-feature data X = [ Xwspace, , Xdetailand , XSSIproduce randomly generate a generatefrom a setFCM. of potential clusters in the multi-feature clusters through M ], thennew set of potential clusters in space, and produce clusters through mutation and mutation and crossover of the DE multi-feature algorithm. The Jm distance of FCMnew is used as the objective function of crossover of DE the algorithm. Jm distance FCM is used the objective functionobjective of DE to evaluate DE to evaluate quality The of these clusters,ofand those eliteasclusters with smaller function the quality of theseto clusters, and those eliteuntil clusters with smaller objective function iterations. value will survive value will survive the next generation the process reaches the maximum Finally, to the next until the reaches thedata maximum iterations. Finally, output optimal output the generation optimal clusters andprocess the multi-feature can be divided into changed and the unchanged clusters and the multi-feature datamembership can be divided intooptimal changedclusters. and unchanged classesDE-based accordingCD to classes according to their fuzzy to the The adaptive their fuzzy membership to the optimal clusters. The adaptive DE-based CD method is composed of method is composed of four steps, which are described as follows: four steps, which are described as follows: 3.2.1. Population Initialization 3.2.1. Population Initialization The population consists of NP individuals, G  G1,G2, ..., G NP . Each individual represents a The population consists of NP individuals, G = G , G , . , GNP a { }. Each individual 2 1 candidate solution of the CD problem, i.e., a vector of changed. .and unchanged clusters, represents as shown in candidate solution of the CD problem, i.e., a vector of changed and unchanged clusters, as shown in Figure 3. Each individual in the population is randomly generated in the multi-feature space. Figure 3. Each individual in the population is randomly generated in the multi-feature space. Gi, j  min X j  rand (0,1)  (max X j  min X j ) (12) Gi,j = minX j + rand (0, 1) · (maxX j − minX j ) (12) where i  1,2,..., NP , j  1,2,..., D , Gi, j represents the j-th dimension of the i-th individual in the where i = 1, and 2, . . . ,DNP, = 1, 2, . . . , D, Gof represents the j-th min dimension theXi-thcorrespond individual to in the i,j the population, is jthe dimension two clusters. the X j and ofmax j population, and D is the dimension of the two clusters. minX j and maxX j correspond to the minimum minimum and maximum values in the j-th dimension of the input multi-feature data, respectively, and maximum values in the j-th dimension of the input multi-feature data, respectively, and rand( a, b) and rand(a, b) is a random value between a and b . is a random value between a and b.

Figure unchanged classes. Gi1 G is1 the cluster of the Figure 3. 3. Individual Individual represents representsthe theclusters clustersofofchanged changedand and unchanged classes. i is the cluster of 0 changed class, and Gi is the cluster of the unchanged class. the changed class, and Gi0 is the cluster of the unchanged class.

3.2.2. Fitness Evaluation 3.2.2. Fitness Evaluation The objective function is modeled according to the fuzzy clustering-based CD problem. In this The is modeled according fuzzy clustering-based problem. In this paper, weobjective adoptedfunction the Jm distance of FCM, whichtoisthe designed to minimize the CD intra-class distance, paper, we adopted the Jmof distance of FCM,Fitness which is to minimize intra-class as as the objective function DE algorithm. isdesigned the objective functionthe value of each distance, individual, the objective function DE algorithm. FitnessThe is the objective function valueCD of clusters each individual, which can evaluate theof quality of the solution. purpose of DE is to obtain with the which can fitness evaluate the quality of the solution. The purpose of DE is using to obtain CD clusters minimum value. The fitness of individual Gi can be calculated Equation (13). with the minimum fitness value. The fitness of individual Gi can be calculated using Equation (13).



H ×W

m W u min f ( Gi ) = H∑ (13)

Xn − Gi0 + um

Xn − Gi1 , i = 1, 2, . . . , NP n,0 n,1 m 0 m 1 min f (Gi )  n=1 un,0 X n  Gi  un,1 X n  Gi , i  1, 2,..., NP (13)

 n 1

where Gi is a D-dimension individual consisting of changed cluster Gi1 and unchanged cluster Gi0 . Xn 1 Dwhere Gi is a D -dimension individual consisting of changed cluster Gi and unchanged is a 2 -dimension vector which denotes the n-th point of the multi-feature data X = [ Xw , Xdetail , XSSI M ]. D membership which represents the probability of belonging of point X to 0 1} is the fuzzy u {i0, n n,k , k =G cluster . X n is a -dimension vector which denotes the n-th point of the multi-feature data k 0 k represents the intraclass the cluster Gi , and it is 2calculated according to Equation (2). um k X − G n n,0 i 1 un, k , k and  0,1 fuzzy membership which represents the probability of X  X w , of X detail , X SSIM  . pixels, distance unchanged um is k Xthe n − G k represents the intra-class distance of changed pixels. n,1

i

belonging of point X n to the cluster Gik , and it is calculated according to Equation (2). 3.2.3. Mutation and Crossover

DE upsets the current population with the scaled difference of randomly selected and distinct individuals. For each target individual Gi , we randomly pick three exclusive individuals Gr1 , Gr2 , Gr3

Remote Sens. 2018, 10, 1664

11 of 26

from the current population and generate a mutant individual vi using Equation (14). The mutant individual is a vector of potential clusters around the target individual. F ∈ [0, 2] is the scaling factor. vi has to be constrained according to the domain of multi-feature space. vi,j = Gr1,j + F · Gr2,j − Gr3,j



(14)

In order to produce a desirable vector of clusters, the crossover operator is then applied to the mutant individual vi to generate a trial individual qi,j using Equation (15), where CR ∈ [0, 1] is the crossover rate. K is a random integer between 0 and D. ( qi,j =

vi,j , i f j = K or rand(0, 1) ≤ CR Gi,j , otherwise

(15)

Two control parameters (scaling factor F, crossover rate CR) affect the optimization result. If properly designed, an adaptive strategy can enhance the robustness of the algorithm and improve the convergence rate. In the M-DECD, a self-adaptive strategy is adopted, and it can dynamically adjust the parameters according to the fitness of the individual (details can be found in [42]). The mutation rate pm is adaptively determined by Equation (16), which means that a vector of clusters with a smaller fitness is more likely to retain its control parameters. Before performing mutation and crossover, the new control parameters F 0 and CR0 are updated using Equations (17) and (18), respectively. pm =

f ( Gi ) − min f ( Gi ) max f ( Gi ) − min f ( Gi )

(

(1−

gen

)

2

1 − rand1 (0, 1) maxgen , i f rand2 (0, 1) < pm F = F, otherwise ( rand3 (0, 1), i f rand4 (0, 1) < pm CR0 = CR, otherwise 0

(16)

(17)

(18)

where randk (0, 1), k ∈ {1, 2, 3} denotes the uniform random values within the range [0,1], gen is the iteration number, maxgen is the maximum iterations. In the initial generations, adaptive DE tends to search the solutions in the multi-feature space uniformly and, in the later generations, it tends to search the solutions near the optimal clusters. 3.2.4. Selection To keep the population size a constant, DE applies the greedy selection strategy to decide whether the new solutions can survive to the next generation or not, which is executed between the target individual Gi and trial individual qi using Equation (19), where Gi 0 is the i-th individual in the next generation, and f (·) is the objective function. ( Gi0

=

qi , i f f (qi ) ≤ f ( Gi ) Gi , otherwise

(19)

After selection, potential clusters with smaller Jm distance are preserved. The evolutionary process should be repeated until the number of iterations reach the maximum, Finally, output the optimal clusters and its fuzzy membership un,k , n = {1, 2, . . . , N }, k = {0, 1}.

Remote Sens. 2018, 10, 1664

12 of 26

3.2.5. Generation of the Change Map The optimal CM can be obtained by assigning a label (0: unchanged, 1: changed) to each pixel according to its fuzzy membership. ( CMn =

0, i f (un,0 ≥ un,1 ) 1, i f (un,0 un,1 )

(20)

3.3. Computational Complexity of the M-DECD Algorithm The computational complexity of M-DECD can be split into two parts: (1) the construction of the multi-feature space; (2) the adaptive DE clustering-based CD. We assume that the size of the input image is M×N, and the dimension of the multi-feature space is D, then the computational complexity for part (1) is O(M×N×D). The adaptive DE clustering-based CD can be further split into six steps as follows: (1) (2) (3)

(4)

Population initialization. The computational complexity should be O(L × NP), where L denotes the length of the individual, i.e., dimension of the changed and unchanged clusters. Fitness evaluation. The fitness of each individual is calculated based on the input data, so the computational complexity of the fitness evaluation of NP individuals should be O(M × N × NP). Population optimization. This step is composed of fitness evaluation, mutation, crossover and selection. The whole computational complexity should be O(L × NP × G), where G is the maximum generations. Generation of the change map. The computational complexity should be O(M × N).

4. Experiments and Analysis In order to demonstrate the effectiveness of the M-DECD, four real Landsat datasets were tested in the experiments. Seven pixel-based methods for remote sensing images with moderate spatial resolution are presented, including a threshold-based method (KI [11]), a spatial-contextual method (Markov random field expectation maximization, MRF+EM [9]), fuzzy clustering-based methods (FCM [17]; fuzzy local information c-means algorithm, FLICM [18]), and EA-based methods (genetic algorithm encoding change mask, GA-mask [20]; genetic algorithm combined with FCM algorithm, GA-FCM [21]; multi-objective EA with spatial information, MOEA/D [28]). The CD result is a binary CM, where the white pixels in the map represent the changed objects, and the black pixels represent the unchanged objects. In order to quantitatively analyze the experiment results, we compare the CMs with the reference image. Generally speaking, the false alarms (FAs), missed alarm (MAs), overall error (OE), and kappa coefficient are commonly used as evaluation indexes to assess the result of CD. MA represents the number of pixels that are classified into the unchanged class but are changed in the reference image. FA denotes the number of pixels, which are classified into the changed class but are unchanged in the reference image. OE is defined as: OE = FA + MA

(21)

The kappa coefficient is usually applied to evaluate the effect of the classification. A higher value of kappa indicates a better detection result. Kappa can be calculated as: Kappa =

PCC − PRE 1 − PRE

(22)

where PCC represents the percentage of correct classification, PRE represents the proportion of expected agreement. PCC and PRE are defined by Equations (23) and (24). PCC = ( TP + TN )/( TP + FA + TN + MA)

(23)

Kappa 

PCC  PRE 1  PRE

(22)

where PCC represents the percentage of correct classification, PRE represents the proportion of expected agreement. PCC and PRE are defined by Equations (23) and (24). PCC  TP  TN  / TP  FA  TN  MA

Remote Sens. 2018, 10, 1664

2 2 TP  Nc MA+ TN  / N)/N NuNu PRE =PRE FA) FA × Nc + ( MA TN (( TP + )×

13 of 26 (23)

(24) (24)

N number where is the number of pixels. image pixels. is theof number of pixels changed that are classified, correctly where N is the of image TP is theTP number changed thatpixels are correctly Nc number TN is of classified, andnumber theunchanged number ofpixels unchanged thatclassified. are correctly classified. and Nu and TN is the that arepixels correctly Nc and Nu are the of are the number of changed pixels and unchanged pixels in the reference image, respectively. changed pixels and unchanged pixels in the reference image, respectively.

4.1. Dataset Description

4.1.1. Mexico Mexico 4.1.1. The Mexico forfor change detection evaluation. It isItmade up of The Mexico dataset datasetisisan anopen-access open-accessdataset dataset change detection evaluation. is made uptwo of images acquired by the sensor of the satellite in aninarea Mexico on 18on April 2000 two images acquired byETM+ the ETM+ sensor ofLandsat-7 the Landsat-7 satellite an of area of Mexico 18 April and 20 May 2002.2002. The spatial resolution was 30 m.30Am. section of 512 512 pixels waswas selected as the 2000 and 20 May The spatial resolution was A section of× 512 × 512 pixels selected as test site. Between the two acquisition dates, forest fires destroyed a large proportion of the vegetation. the test site. Between the two acquisition dates, forest fires destroyed a large proportion of the Band 4 (spectral from 0.775 to 0.900 was selected ourselected experiment, band vegetation. Band range 4 (spectral rangeµm from 0.775 µm) μm to 0.900 μm) in was in ourbecause experiment, 4 of the ETM+ sensor is the near-infrared band, which is sensitive to the difference of vegetation because band 4 of the ETM+ sensor is the near-infrared band, which is sensitive to the difference of type. Figuretype. 4a,b shows band 4 of the andimages 2002, respectively. The reference map was vegetation Figurethe 4a,b shows the images band 42000 of the 2000 and 2002, respectively. The manually defined according to a detailed visual analysis of the available multi-temporal images and reference map was manually defined according to a detailed visual analysis of the available the difference image. map image. contained changed pixels and 236,545 unchanged multi-temporal imagesThe andreference the difference The 25,599 reference map contained 25,599 changed pixels pixels. No radiometric applied. correction was applied. and 236,545 unchangedcorrection pixels. Nowas radiometric

(a)

(b)

Figure 4. Mexico dataset. (a) Image acquired in April 2000. (b) Image acquired in May 2002.

4.1.2. Sardinia Sardinia Island Island 4.1.2. The second second dataset dataset tested tested in in our is the The our experiment experiment is the Sardinia Sardinia island island dataset, dataset, which which is is composed composed of two multispectral images acquired by the TM sensor of the Landsat-5 satellite in September 1995 of two multispectral images acquired by the TM sensor of the Landsat-5 satellite in September 1995 and July July 1996. The spatial spatial resolution resolution was was 30 30 m. The test test site site was 412 pixels pixels of of aa and 1996. The m. The was aa section section of of 300 300 × × 412 scene covered Lake Mulargia on the island of Sardinia. Between the two acquisition dates, the water scene covered Lake Mulargia on the island of Sardinia. Between the two acquisition dates, the water level of of the the lake lake increased increased due due to to the the flood. flood. Band Band 44 (spectral (spectral range range from from 0.760 0.760 μm µm to to 0.960 0.960 μm) µm) of of the the level TM sensor is the near-infrared band, because water has a strong absorption effect in the near-infrared TM sensor is the near-infrared band, because water has a strong absorption effect in the band, which makes the water profile clear profile in bandclear 4. Figure show 5a,b the tested image 1995 and near-infrared band, which makes the water in band5a,b 4. Figure show the tested image imageand 1996, respectively. The reference was manually which 7480 changed and 1995 image 1996, respectively. The map reference map was defined, manuallyindefined, in which 7480pixels changed 116,120 unchanged pixels were identified. No radiometric correction was applied to the multi-temporal images. The images were co-registered with 12 ground control points.

Remote Sens. 2018, 10, x FOR PEER REVIEW Remote Sens. 2018, 10, x FOR PEER REVIEW

14 of 26 14 of 26

pixels and 116,120 unchanged pixels were identified. No radiometric correction was applied to the 26 pixels were identified. No was applied14toofthe multi-temporal images. The images were co-registered withradiometric 12 ground correction control points. multi-temporal images. The images were co-registered with 12 ground control points.

Remote 10, 1664 pixelsSens. and2018, 116,120 unchanged

Figure 5. Image acquired in September 1995. (b) Image acquired in July Figure 5. Sardinia Sardiniaisland islanddataset. dataset.(a)(a) Image acquired in September 1995. (b) Image acquired in Figure 5. Sardinia island dataset. (a) Image acquired in September 1995. (b) Image acquired in July 1996. July 1996. 1996.

4.1.3. Alaska 4.1.3. Alaska The third dataset is the Alaska dataset, which is made up of two multispectral images acquired dataset the Alaska dataset, which is made up of2005. twoThe multispectral imageswas acquired by theThe TM sensor of satellite ininJuly and 2005. spatial resolution 30 30 m. TMthird sensor ofthe theisLandsat-5 Landsat-5 satellite July1985 1985 andJuly July The spatial resolution was by the TM sensor of the Landsat-5 satellite in July 1985 and July 2005. The spatial resolution was 30 Band 4 was selected for the experiment for the reasons mentioned above. A section of 400 400 m. Band 4 was selected for the experiment for same the same reasons mentioned above. A section of× 400 × m. Band 4 was selected for the experiment for the same reasons mentioned above. A section of 400 pixels waswas selected for for thethe experiment. Figure 6a,b display 400 pixels selected experiment. Figure 6a,b displaythe thetested testedimages. images. The The reference reference map× 400 pixels was selectedpixels for the experiment. Figure 6a,b display the tested images. Thewas reference map contains 9741 changed and 150,259 pixels. No correction applied to 9741 changed pixels and 150,259unchanged unchanged pixels. Noradiometric radiometric correction was applied contains 9741 changed pixels and 150,259 unchanged pixels. No radiometric correction was applied the multi-temporal images. The former image was registered to the latter one. to the multi-temporal images. The former image was registered to the latter one. to the multi-temporal images. The former image was registered to the latter one.

(a) (b) (a) (b) Figure 6. Alaska dataset. (a) Image acquired in July 1985. (b) Image acquired in July 2005. Figure 6. Alaska dataset. (a) Image acquired in July 1985. (b) Image acquired in July 2005.

4.1.4. Canada 4.1.4. Canada 4.1.4. Canada The The Canada Canada dataset dataset is is composed composed of of two two multispectral multispectral images images acquired acquired by by the the Landsat-8 Landsat-8 OLI OLI The Canada dataset is composed of two multispectral images acquired by the OLI sensor on 2 July 2017 and 22 August 2017. The spatial resolution was 30 m. The test site is a scene near sensor on 2 July 2017 and 22 August 2017. The spatial resolution was 30 m. The test Landsat-8 site is a scene sensor on 2George, July 2017 and 22 August 2017. spatial waswhich 30 m.isThe testis site isthan abigger scene Prince George, British Columbia, Canada. TheThe size is 3000 × 2500 pixels, much bigger the near Prince British Columbia, Canada. The size resolution is 3000 × 2500 pixels, which much near Prince George, British Columbia, Canada. The size is 3000 × 2500 pixels, which is much bigger above three datasets. The forest fires lasted more than two weeks, which led to the suspension of three than the above three datasets. The forest fires lasted more than two weeks, which led to the than timber the above three datasets. forest people fires lasted thanpeople twoInweeks, led detect to the major andtimber more The than 14,000 had than tomore be 14,000 evacuated. order towhich accurately suspension ofproducers, three major producers, and more had to be evacuated. In suspension of three major timber producers, and more than 14,000 people had to be evacuated. In the disaster area in the forest, we selected band 5 for our experiments. Band 5 (spectral range from order to accurately detect the disaster area in the forest, we selected band 5 for our experiments. order to accurately detect the disaster area in the forest, we selected band 5 for our experiments. 0.845 µm to 0.885 µm) belongs to the near-infrared band, which is sensitive to changes in vegetation Band 5 (spectral range from 0.845 μm to 0.885 μm) belongs to the near-infrared band, which is Band 5 (spectral fromof0.845 μm Figure to 0.885 belongs the5 near-infrared band, which is and suitable for therange detection fire scars. 7aμm) and (b) show to band of the multi-temporal images. According to a detailed visual analysis, 1,795,840 changed pixels and 5,704,160 unchanged pixels were

Remote Sens. 2018, 10, x FOR PEER REVIEW

15 of 26

Remote Sens. 2018, 10, 1664

15 of 26

sensitive to changes in vegetation and suitable for the detection of fire scars. Figure 7a and (b) show band 5 of the multi-temporal images. According to a detailed visual analysis, 1,795,840 changed pixels andin 5,704,160 unchanged were correction identified orinimages the reference map. radiometric identified the reference map. No pixels radiometric registration wasNo applied to the correction orimages. images registration was applied to the bi-temporal images. bi-temporal

Figure Figure7. 7. Canada Canada dataset. dataset. (a) (a) Image Image acquired acquired on on 22 July July 2017. 2017. (b) (b) Image Image acquired acquired on on 22 22 August August2017. 2017.

4.2. 4.2. Parameter Parameter Setting Setting The proposed method is composed of two main steps: multi-feature space construction and The proposed method is composed of two main steps: multi-feature space construction and DE DE clustering-based first step, threeimage imagefeatures featuresare areextracted extracted and and the the parameters clustering-based CD.CD. In In thethe first step, three parameters are are adjusted by trial and error. The window size of the Wiener filtering was set to be 13, 13, adjusted by trial and error. The window size of the Wiener filtering was set to be 13, 13, 3, 3, and and 13 13 for for the Mexico, Sardinia, Alaska, and Canada datasets, respectively. SSIM illustrates a locally isotropic the Mexico, Sardinia, Alaska, and Canada datasets, respectively. SSIM illustrates a locally isotropic property, property,and andwe weused usedaa53 53×× 53 53 circular circular symmetric symmetric Gaussian Gaussian weighting weighting function function with with aa standard standard deviation of one sample for all of datasets. In the second step, the pixels in the multi-feature space are deviation of one sample for all of datasets. In the second step, the pixels in the multi-feature space classified into either changed or unchanged classes using using the DEthe algorithm. The population size forsize all are classified into either changed or unchanged classes DE algorithm. The population tests is set to be 30, and the initial scale factor F and the initial crossover rate CR are set to be 0.8 and for all tests is set to be 30, and the initial scale 0factor F0 and the initial crossover0rate CR0 are set to be 0.2, respectively. The maximum numbernumber of generations is set toisbe component m in 0.8 and 0.2, respectively. The maximum of generations set100. to beThe 100.fuzzy The fuzzy component the objective function is set to 2.0,4.1, 2.0,2.0, and2.0, 3.0 and for the Sardinia,Sardinia, Alaska, and Canada m in the objective function is be set4.1, to be 3.0Mexico, for the Mexico, Alaska, and datasets, respectively. Canada datasets, respectively. For For the the fuzzy fuzzyclustering-based clustering-based CD CDmethods, methods,i.e., i.e.,FCM, FCM,FLICM, FLICM,GA-FCM, GA-FCM,and andMOEA/D, MOEA/D, their their fuzzy components are the same as the proposed M-DECD. For the EAs-based CD fuzzy components are the same as the proposed M-DECD. For the EAs-based CD methods, methods, i.e., i.e., GA-mask, GA-FCM and MOEA/D, the population size is set to be 30. The maximum number GA-mask, GA-FCM and MOEA/D, the population size is set to be 30. The maximum number of of iterations is set set to iterations of of GA-FCM GA-FCM and and MOEA/D MOEA/D is to be be 100. 100. Especially, Especially, the the maximum maximum number number of of iterations iterations of of GA-mask GA-mask is is set set to to be be 100,000, 100,000, because because the the GA-mask GA-mask method method directly directly encodes encodes the the whole whole binary binary CM, resulting in slow convergence. In contrast, the GA-FCM, MOEA/D and M-DECD encode CM, resulting in slow convergence. In contrast, the GA-FCM, MOEA/D and M-DECD encode the the CD CD clusters, clusters, whose whose length length is is much much lower lower than than the the binary binary CM. CM. Thus, Thus, these these methods methods can can converge converge in in 100 100generations. generations. 4.3. Experimental Results 4.3. Experimental Results 4.3.1. Results of the Mexico Dataset 4.3.1. Results of the Mexico Dataset The CMs of the experiments tested on the Mexico dataset are shown in Figure 8. From Figure 8a, The of the experiments the Mexico dataset are shown in Figure 8. obvious, From Figure we can seeCMs that the DI of the Mexico tested dataseton is relatively simple, because the change area is and 8a, we can see that the DI of the Mexico dataset is relatively simple, because the change area is the background interference is small. Therefore, the CD of the Mexico dataset is not very challenging. obvious, and the background interference is small. Therefore, the tested methods obtained similar results.Therefore, the CD of the Mexico dataset is not veryAs challenging. Therefore, the tested methods obtained similar shown in Figure 8, compared with the reference map (see results. Figure 8j), we see that many white As shown in Figure 8, compared with the reference map (see Figure we see that noise spots exist in the CMs of KI (see Figure 8b), FCM (see Figure 8d), and8j), GA-mask (seemany Figurewhite 8f)), noise spots exist in the CMs of KIthat (see the Figure 8b),are FCM (see Figure 8d), and other GA-mask 8f)), because these methods assume pixels independent of each and(see onlyFigure consider because these methods assume that the pixels are independent of each other and only consider their their spectral information. It should be mentioned that GA-mask is very time-consuming due to spectral information. It should mentioned that 8GA-mask is very time-consuming to its its mask-based encoding strategy,beand it took about h to process the Mexico dataset. Indue contrast, mask-based encoding strategy, and it took about 8 h to process the Mexico dataset. In contrast, MRF+EM (see Figure 8c), FLICM (see Figure 8e), GA-FCM (see Figure 8g), and MOEA/D (see

Remote RemoteSens. Sens.2018, 2018,10, 10,1664 x FOR PEER REVIEW

16ofof2626 16

MRF+EM (see Figure 8c), FLICM (see Figure 8e), GA-FCM (see Figure 8g), and MOEA/D (see Figure Figure 8h) consider the influence of the neighborhood pixels, so these methods are more robust to 8h) consider the influence of the neighborhood pixels, so these methods are more robust to background interference, but they lose some changed details. For example, some weak changed background interference, but they lose some changed details. For example, some weak changed regions in the middle of the DI are not detected. The result of the proposed method is shown in regions in the middle of the DI are not detected. The result of the proposed method is shown in Figure 8i, where M-DECD shows better detail-preserving capability for the weak changes, due to Figure 8i, where M-DECD shows better detail-preserving capability for the weak changes, due to the the detail enhancement feature. In addition, the proposed method has fewer detections, due detail enhancement feature. In addition, the proposed method has fewer falsefalse detections, due to theto the Wiener denoising feature. However, did findthat thatthe thedetail detailenhancement enhancementfeature featureenhances enhances Wiener denoising feature. However, wewe did find background interference, so some pixels are wrongly identified as changed in the left of the CM. background interference, so some pixels are wrongly identified as changed in the left of the CM.As Asto the algorithm efficiency, thethe proposed method takes TheDE DE to the algorithm efficiency, proposed method takesabout about33min minfor forthe the Mexico Mexico dataset. dataset. The algorithm can converge in 100 iterations and obtain a satisfactory result. algorithm can converge in 100 iterations and obtain a satisfactory result.

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

(i)

(j)

Figure8.8.Results Resultsofof Mexico dataset. (a) difference image. (b)(c)KI. (c) MRF+EM. (d) (e) FCM. (e) Figure thethe Mexico dataset. (a) difference image. (b) KI. MRF+EM. (d) FCM. FLICM. FLICM. (f) GA-mask. (g) GA-FCM. (h) MOEA/D. (i) Proposed M-DECD. (j) Reference Yellow (f) GA-mask. (g) GA-FCM. (h) MOEA/D. (i) Proposed M-DECD. (j) Reference map. map. Yellow ovals ovalstorefer to thedetection false detection the comparison algorithms compared the M-DECD, refer the false areas areas of theofcomparison algorithms compared withwith the M-DECD, and andred theovals red ovals to the miss-detection areas comparisonalgorithms. algorithms. Green ovals the referrefer to the miss-detection areas of of thethe comparison ovals refer refertoto over-smoothingregion. region. over-smoothing

4.3.2. 4.3.2.Results Resultsof ofthe theSardinia Sardinia Dataset Dataset The datasetare areshown shownininFigure Figure9.9.We We can that Figure 9a) TheCMs CMs of of the the Sardinia dataset can seesee that thethe DI DI (see(see Figure 9a) of ofthe theSardinia Sardiniadataset datasethas hassome somebackground backgroundinterference interferencecaused causedbybythe theGaussian Gaussiannoise. noise.InInaddition, addition, the it is a challenging challenging work workto toremove removethe thenoise noisewhile while thelake lakeboundary boundary isis weak weak and and slim. slim. Therefore, Therefore, it preserving the lake boundary. preserving the lake boundary. We effect of of the the noise noise due due to to ignoring ignoring ofof We can can see see that that some some methods methods cannot cannot resist the effect spatial-contextual suchasasKIKI(see (see Figure FCM Figure and GA-mask spatial-contextual information, information, such Figure 9b),9b), FCM (see(see Figure 9d), 9d), and GA-mask (see (see Figure 9f)), in which many pixels are wrongly detected as changed. MRF+EM (see Figure 9c) shows Figure 9f)), in which many pixels are wrongly detected as changed. MRF+EM (see Figure 9c) shows a agood goodperformance performanceininnoise noiseremoval, removal,but butits itsCM CM appears appearsover-smoothing over-smoothingand andthe theweak weakchanged changed boundary of of the the lake lake isis not not detected. detected. Although Although the boundary the average average of of the the neighborhood neighborhoodpixels pixelshas hasbeen been considered, there are some small false detected regions in the CMs of FLICM (see Figure 9e), considered, there are some small false detected regions in the CMs of FLICM (see Figure 9e), GA-FCM GA-FCM 9g), and (see MOEA/D 9h). of The result of themethod proposed methodinisFigure shown9i; (see Figure(see 9g), Figure and MOEA/D Figure(see 9h).Figure The result the proposed is shown Figure 9i;that it was that can the not M-DECD can not only effectively resistinterference background interference itinwas found thefound M-DECD only effectively resist background and generate a and generate a clean CM,the but alsochanged retain the weak changed boundary of theinlake, especially in the clean CM, but also retain weak boundary of the lake, especially the red oval regions. red oval regions.

Remote Sens. 2018, 10, x FOR PEER REVIEW Remote Sens. 2018, 10, 1664 Remote Sens. 2018, 10, x FOR PEER REVIEW

17 of 26 17 of 26 17 of 26

(a) (a)

(b) (b)

(c) (c)

(d) (d)

(e) (e)

(f) (f)

(g) (g)

(h) (h)

(i) (i)

(j) (j)

Figure 9. Results of the Sardinia dataset. (a) Difference image. (b)(c) KI.MRF+EM. (c) MRF+EM. (d) FCM. (e) Figure 9. Results of the Sardinia dataset. (a) Difference image. (b) KI. (d) FCM. (e) FLICM. FLICM. GA-mask. (g) Sardinia GA-FCM. (h) MOEA/D. (i) Proposed M-DECD. (j) Reference 9.(f)Results of the dataset. Difference image. (b) KI. (c) MRF+EM. (d) FCM. (e) (f) Figure GA-mask. (g) GA-FCM. (h) MOEA/D. (i)(a) Proposed M-DECD. (j) Reference map. map. FLICM. (f) GA-mask. (g) GA-FCM. (h) MOEA/D. (i) Proposed M-DECD. (j) Reference map.

4.3.3. Results ofof the 4.3.3. Results theAlaska AlaskaDataset Dataset 4.3.3. Results of the Alaska Dataset The results ofofthe 10. In InFigure Figure10a, 10a,we wesee seethat that the Alaska The results theAlaska Alaskadataset datasetare are shown shown in Figure Figure 10. the Alaska The results of the Alaska dataset are shown in Figure 10. In Figure 10a, weand see that the Alaska dataset has complex background interference, caused by Gaussian noise brightness dataset has complex background interference, caused by both both Gaussian noise andlocal local brightness dataset has complex background interference, caused by both Gaussian noise and localTherefore, brightness distortion (marked the blue ovals). addition, there many weak and slim edges. Therefore,CD distortion (marked byby the blue ovals). InIn addition, there areare many weak and slim edges. (marked by theisblue ovals). In addition, there are many weak and slim edges. Therefore, CD Alaska of the Alaska dataset quite challenging. of distortion the dataset is quite challenging. CDAs ofAs the Alaska dataset is10, quite challenging. shown Figure10, compared with the the reference reference map some shown ininFigure compared with map(see (seeFigure Figure10j), 10j),we wesee seethat that some As shown in Figure 10, compared with the reference map (see Figure 10j), we see that some methods are edge-sensitive, including KI (see Figure 10b), FCM (see Figure 10d), and GA-mask (see methods are edge-sensitive, including KI (see Figure 10b), FCM (see Figure 10d), and GA-mask methods are edge-sensitive, including KI also (see Figure 10b), FCM (see Figure 10d), GA-mask (see Figure 10f). However, thesethese methods are noise-sensitive, and many pixels areand wrongly detected (see Figure 10f). However, methods are also noise-sensitive, and many pixels are wrongly Figure 10f). However, these are also noise-sensitive, pixels are wrongly detected as changed. MRF+EM (see methods Figure 10c), FLICM (see Figure 10e),many GA-FCM (see Figure 10g), and detected as changed. MRF+EM (see Figure 10c), FLICM (see and Figure 10e), GA-FCM (see Figure 10g), as changed. (see Figure FLICM (see Figure 10e), GA-FCM (seeinformation Figure 10g),ofand MOEA/D (seeMRF+EM Figure 10h) can reduce10c), some false alarms by considering the spatial the and MOEA/D (see Figure 10h) can reduce some false alarms by considering the spatial information MOEA/D (see Figure can reduce some smoothness false alarms may by considering the spatial the neighborhood pixels; 10h) however, the spatial blur the edges and it information cannot resistoflocal of the neighborhood pixels; however, the spatial smoothness may blur the edges and it cannot neighborhood pixels; however, thethese spatialmethods smoothness may blur thepixels edges in andbrightness it cannot resist local brightness distortion. Therefore, misjudge some distortion resist local brightness distortion. Therefore, these methods misjudge some pixels in brightness brightness distortion. misjudge some pixels in brightness distortion regions. Some methodsTherefore, lose many these weak methods changed edges due to over-smoothing (MRF+EM, FLICM, distortion regions. Some methods lose many weak changed edges due to over-smoothing (MRF+EM, regions. Some methods loseofmany weak changed edgesindue to over-smoothing FLICM, and MOEA/D). The results the M-DECD are shown Figure 10i; it was found(MRF+EM, that the M-DECD FLICM, andthe MOEA/D). Theofresults ofand thegenerate M-DECD are shown in Figure 10i; it that was the found that the and MOEA/D). Theof results the M-DECD are shown in Figure 10i;toitthe wasWiener found de-noised M-DECD can resist effect Gaussian noise a clean CM due feature. M-DECD can resist the effect of Gaussian noise and generate a clean CM due to the Wiener de-noised can resist the effect of Gaussian noisewell and generate a cleandistortion CM due toregions the Wiener feature. Moreover, the M-DECD performs in brightness due de-noised to the structural feature. Moreover, the M-DECD performs in brightness distortion regions structural Moreover, the M-DECD performs well well in brightness distortion regions duedue to to thethe structural similarity feature. similarity feature. similarity feature.

(a) (a)

(b) (b)

(c) (c)

(d) (d)

(e) (e)

(f) (f)

(g) (g)

(h) (h)

(i) (i)

(j) (j)

Figure 10. Results of the Alaska dataset. (a) Difference image, the blue ovals refer to the local brightness deformation region. (b) KI. (c) (d) FCM. (e)blue FLICM. (f)refer GA-mask. (g) Figure Results of Alaska the Alaska dataset. (a) Difference image, theovals blue ovals to GA-FCM. the local Figure 10.10. Results of the dataset. (a)MRF+EM. Difference image, the torefer the local brightness (h) MOEA/D. (i) Proposed Reference map. brightness deformation (b) KI.(j) (c) MRF+EM. (d) FCM.(f) (e)GA-mask. FLICM. (f)(g) GA-mask. GA-FCM. deformation region. (b) KI.region. (c)M-DECD. MRF+EM. (d) FCM. (e) FLICM. GA-FCM.(g)(h) MOEA/D. MOEA/D. (i) Proposed M-DECD. (j) Reference map. (i) (h) Proposed M-DECD. (j) Reference map.

Remote Sens. 2018, 10, 1664

18 of 26

Remote Sens. 2018, 10, x FOR PEER REVIEW

18 of 26

4.3.4. Results of the Canada Dataset 4.3.4. Results of the Canada Dataset

The CMs of the experiments tested on the Canada dataset are shown in Figure 11. First, the CMs of the experiments tested the Canada dataset are shown in Figure 11. First, the11a, sizewe size of The the Canada dataset is larger thanon the other three datasets. In addition, from Figure of the Canada dataset is larger than the other three datasets. In addition, from Figure 11a, we see that see that the Canada dataset has strong background interference, caused by both Gaussian noise and thebrightness Canada dataset has strong background interference, caused by Gaussian noise and localhas local distortion (marked by the blue ovals), especially inboth the top right corner, where brightness distortion distortion. (marked byMoreover, the blue ovals), especially in thebetween top rightthe corner, where severe severe local brightness the spectral contrast changed andhas unchanged local brightness distortion. Moreover, the spectral contrast between the changed and unchanged regions is low. Therefore, CD of the Canada dataset is the most difficult in our experiments. regions is low.inTherefore, CD of the Canada is the most in our11j), experiments. As shown Figure 11, compared with dataset the reference mapdifficult (see Figure we can see that the As shown in Figure 11, compared with the reference map (see Figure 11j), that the advantage of the proposed M-DECD is obvious. Some methods are sensitivewetocan thesee image noise advantage of the proposed M-DECD is obvious. Some methods are sensitive to the image noise and and local brightness distortion, including KI (see Figure 11b), FCM (see Figure 11d), and GA-mask local brightness distortion, including KI (see Figure 11b), FCM (see Figure 11d), and GA-mask (see (see Figure 11f)). However, some other methods can resist the image noise, including MRF+EM Figure 11f)). However, some other methods can resist the image noise, including MRF+EM (see (see Figure 11c), FLICM (see Figure 11e), GA-FCM (see Figure 11g), and MOEA/D (see Figure 11h). Figure 11c), FLICM (see Figure 11e), GA-FCM (see Figure 11g), and MOEA/D (see Figure 11h). However, they cannot resist local brightness distortion, and many regions are wrongly detected as However, they cannot resist local brightness distortion, and many regions are wrongly detected as changed, especially in in thethe top right corner ofof the CMs. changed, especially top right corner the CMs.The Theresult resultofofthe theM-DECD M-DECDisisshown shownininFigure Figure11i; it was found that the M-DECD performs wellwell in brightness distortion regions, while preserving 11i; it was found that the M-DECD performs in brightness distortion regions, while preservingthe changed details. the changed details.

(a)

(b)

(c)

(d)

(e)

(f)

(g)

(h)

(i)

(j)

Figure Resultsofofthe theCanada Canada dataset. dataset. (a) to to thethe local Figure 11.11.Results (a) Difference Differenceimage, image,the theblue blueovals ovalsrefer refer local brightness deformation region. (b) KI. (c) MRF+EM. (d) FCM. (e) FLICM. (f) GA-mask. (g) GA-FCM. brightness deformation region. (b) KI. (c) MRF+EM. (d) FCM. (e) FLICM. (f) GA-mask. (g) GA-FCM. MOEA/D.(i)(i)Proposed ProposedM-DECD. M-DECD. (j) (j) Reference Reference map. (h)(h) MOEA/D. map.

4.3.5. Quantitative 4.3.5. QuantitativeAnalysis Analysis The quantitative evaluationofofthe theexperimental experimentalresults resultsisislisted listedin inTable Table4.4. The The optimal optimal results results of The quantitative evaluation of each evaluation index are marked in bold, and the suboptimal results are underlined. each evaluation index are marked in bold, and the suboptimal results are underlined.

Among the four datasets, CD of the Mexico dataset is the easiest, and CD of the Sardinia dataset Tabledue 4. Quantitative evaluation the experimental results obtained is more difficult to background noiseofand weak changed edges. Thewith OEfour anddatasets. kappa of M-DECD are Dataset very close to the best because the FA of M-DECD is a little higher thanMOEA/D the best result due to Index KI results, MRF+EM FCM FLICM GA-mask GA-FCM M-DECD MA which 1918 3236 enhancement 3320 4681 2795of the M-DECD 2114 edge expansion, is caused6103 by the detail feature. The3270 performance FA 5389 429 1014 3453 1049 1247 2068 could be further improved by adopting a 2357 more appropriate edge detection method. Mexico OE 7308 6532 5593 4334 8134 4319 4042 4182 CD of the Alaska dataset is quite challenge, due to the complex background interference caused kappa 0.8508 0.8430 0.8770 0.9022 0.8200 0.9027 0.9101 0.9094 by Gaussian noise and local brightness distortion, and there are many weak changed edges. CD MA 739 2266 910 1038 1478 1004 877 485of the FA 5448 250 3834 1078 2617 1408 1392 1944 Canada dataset is the most difficult, due to its large size and strong brightness distortion caused by Sardinia OE 6187The accuracies 2516 2116 on the 4095 2412 2269 illumination difference. of4744 the M-DECD Alaska dataset and the large2429 Canada Alaska

kappa MA

0.6645 2160

0.7994 4117

0.7189 2378

0.8525 3505

0.7325 3927

0.8355 2116

0.8463 3644

0.8442 3095

Remote Sens. 2018, 10, 1664 Remote Sens. 2018, 10, x FOR PEER REVIEW

FA

6114

1064

19 of 26 19 of 26

4550

1447

7076

2853

1500

984

dataset increased by about 3.3% and 11.9%, respectively. This indicates that the multi-feature space is OE 8274 5181 6928 4952 9741 4969 5144 4079 suitable for Landsat DE clustering-based CD method kappa images 0.6199 and the 0.6682 0.6571 0.6997 0.4776is effective. 0.7184 0.6866 0.7519 MA 823,031 189,863 406,428 293,202 831,436 778,996 338,566 386,034 TableFA 4. Quantitative experimental results325,081 obtained with four datasets. 331,124 evaluation 1,002,245 of the 855,901 722,737 364,579 723,305 205,690 Canada OE 1,154,155 1,192,108 1,262,329 1,015,939 1,156,517 1,143,575 1,061,871 591,724 Dataset kappa Index KI MRF+EM FCM FLICM M-DECD 0.5337 0.6222 0.5744 0.6563 GA-Mask 0.5314 GA-FCM 0.6120MOEA/D 0.6379 0.7757 MA 1918 6103 3236 3320 4681 3270 2795 2114 429 1014 3453 1247 2068 AmongFA the four 5389 datasets, CD of the 2357 Mexico dataset is the easiest,1049 and CD of the Sardinia dataset Mexico OE 7308 6532 5593 4334 8134 4319 4042 4182 is more difficult to background and weak changed edges.0.9027 The OE and kappa0.9094 of M-DECD kappa due0.8508 0.8430 noise 0.8770 0.9022 0.8200 0.9101

are very close to the best because 910 the FA of1038 M-DECD1478 is a little1004 higher than result due MA 739 results, 2266 877 the best 485 to edge expansion, which is caused by the detail enhancement feature. The performance FA 5448 250 3834 1078 2617 1408 1392 1944 of the Sardinia OE 6187 2516 4744 2116 4095 2412 2269 2429 M-DECD could be further improved by adopting a more appropriate edge detection method. kappa 0.6645 0.7994 0.7189 0.8525 0.7325 0.8355 0.8463 0.8442 CD of the Alaska dataset is quite challenge, due to the complex background interference caused 2160 4117 2378 3505 3927 2116weak changed 3644 3095 CD of by GaussianMA noise and local brightness distortion, and there are many edges. 4550 1447 7076 2853 1500 984 FA 6114 1064 Alaska the Canada dataset is8274 the most5181 difficult, 6928 due to its4952 large size9741 and strong OE 4969brightness 5144 distortion 4079 caused by illumination accuracies M-DECD on the0.7184 Alaska 0.6866 dataset and the large kappa difference. 0.6199 The 0.6682 0.6571of the 0.6997 0.4776 0.7519 Canada dataset by about406,428 3.3% and 11.9%,831,436 respectively. indicates that the MA increased 823,031 189,863 293,202 778,996 This 338,566 386,034 1,002,245 855,901 images 722,737and 325,081 723,305 CD 205,690 multi-featureFAspace 331,124 is suitable for Landsat the DE 364,579 clustering-based method is Canada 1,154,155 1,192,108 1,262,329 1,015,939 1,156,517 1,143,575 1,061,871 591,724 effective. OE kappa 0.5337 0.6222 0.5744 0.6563 0.5314 0.6120 0.6379 0.7757 In addition, we can find that the CD results of the EAs clustering-based CD methods (e.g., GA-FCM, MOEA/D, and M-DECD) perform better than those conventional clustering-based CD In addition, can find that demonstrates the CD resultsthe ofpromising the EAs clustering-based CD methods methods (FCM,we FLICM), which optimization capability of the (e.g., EAs for GA-FCM, MOEA/D, and M-DECD) perform better than those conventional clustering-based clustering-based CD. The M-DECD obtains stable results with high accuracy in all of the CD tested methods (FCM, FLICM), demonstrates the promising optimization capability of thethe EAsoptimal for datasets, which proveswhich that adaptive DE algorithm is robust and effective in searching clustering-based CD results. CD. The M-DECD obtains stable results with high accuracy in all of the tested datasets, which proves that adaptive DE algorithm is robust and effective in searching the optimal CD results. 4.4. Experiment Analysis 4.4. Experiment Analysis 4.4.1. Analysis of the Self-Adaptive Strategy 4.4.1. Analysis of the Self-Adaptive Strategy The proposed M-DECD adopted a self-adaptive parameter adjustment strategy [42] for DE. In The proposed M-DECD adopted a self-adaptive parameter adjustment strategy [42] for DE. order to prove the significance of the adaptive strategy, the relationship between the DE In order to prove the significance of the adaptive strategy, the relationship between the DE performance performance and different combinations of control parameters was analyzed. Without loss of and different combinations of control parameters was analyzed. Without loss of generality, the generality, the experimental result obtained with the Alaska dataset is presented in Figure 12. The experimental result obtained with the Alaska dataset is presented in Figure 12. The abscissas are abscissas are parameters CR and F, respectively, and the ordinate is the fitness value of the optimal parameters CR and F, respectively, and the ordinate is the fitness value of the optimal solution. It can solution. It can be seen that the different combinations of the control parameters have a significant be seen that the different combinations of the control parameters have a significant influence on the influence on the optimization result. DE with an inappropriate combination of CR and F converges optimization result. DE with an inappropriate combination of CR and F converges to the local minima, to the local minima, which will affect the detection result. Therefore, it is crucial to use an adaptive which will affect the detection result. Therefore, it is crucial to use an adaptive strategy to ensure that strategy to ensure that DE converges to the global optimal solution. DE converges to the global optimal solution.

fitness

CR

F

Figure 12. Relationship between the search performance (fitness) and different combinations of control Figure 12. Relationship between the search performance (fitness) and different combinations of parameters (F and CR). control parameters (F and CR).

In order to prove the effectiveness of the adopted self-adaptive strategy, we analyzed the relationship between the algorithm convergence process and different combinations of control parameters, as shown in Figure 13. The abscissa is the number of DE evolutionary iterations, and the ordinate is the fitness value of the optimal solution in each generation. It can be concluded that the adopted strategy (red line) can not only improve convergence speed, but also enhance the Remote Sens. 2018, 10, 1664 20 of 26 robustness of the algorithm. Remote Sens. 2018, 10, x FOR PEER REVIEW 20 of 26

In order order to of the strategy, we we analyzed the In to prove prove the the effectiveness effectiveness of the adopted adopted self-adaptive self-adaptive strategy, analyzed the relationship between the algorithm convergence process and different combinations of control relationship between the algorithm convergence process and different combinations of control parameters, shown in in Figure Figure 13. The abscissa abscissa is is the the number number of of DE DE evolutionary evolutionary iterations, iterations, and and the the parameters, as as shown 13. The ordinate is the fitness value of the optimal solution in each generation. It can be concluded that the ordinate is the fitness value of the optimal solution in each generation. It can be concluded that the adopted strategy line) cancan not only improve convergence speed, but also enhance robustness adopted strategy(red (red line) not only improve convergence speed, but alsothe enhance the of the algorithm. robustness of the algorithm.

Figure 13. Relationship between the search performance (fitness) and different combinations of control parameters (F and CR).

4.4.2. A Sensitivity Analysis on Adaptive DE Algorithm There are three parameters that have to be set before the adaptive DE-based CD step, i.e., population size NP, initial scale factor F0, and initial crossover rate CR0. In order to prove the robustness of the adaptive DE, we tested the influence of these parameters. Without loss of Figure 13. 13. Relationship between search performance (fitness) and combinations of between thethe search performance (fitness) and different of control generality, theRelationship tested results of Alaska dataset are shown in Figure 14. different Forcombinations demonstration purposes, parameters (F and CR). control parameters (F and CR). after each generation, we recorded the fitness of the best individual as the Y-axis, i.e., the Jm distance of the best clusters, and the X-axis is the parameter of DE. Population size is an important parameter, 4.4.2. A A Sensitivity Sensitivity Analysis Analysis on on Adaptive Adaptive DE DE Algorithm Algorithm 4.4.2. which affects the distribution of population and convergence behavior. We fixed F0 = 0.8 and CR0 = There are three parameters that have to be set before before the adaptive adaptive DE-based CDFigure step, i.e., i.e., 0.2, There then recorded performance DE to at be different population sizes, as shown in 14a. are threethe parameters that of have set the DE-based CD step, population NP, scale factor FF00,,and and initial rate . DE In order order to prove prove the Further, wesize fixed NPinitial = 30 and 0 = 0.2, the performance with various initial population size NP, initial scaleCR factor andrecorded initial crossover crossover rate CR CR0of 0. In to the robustness of the adaptive DE, we tested the influence of these parameters. Without loss of generality, scale factors, shown in Figure Furthermore, Figure 14c depictsparameters. the performance of DE robustness of asthe adaptive DE, 14b. we tested the influence of these Without losswith of the tested results of Alaska dataset are shown in Figure 14. For demonstration purposes, after each different initial crossover rates, where NP = 30 and F 0 = 0.8. generality, the tested results of Alaska dataset are shown in Figure 14. For demonstration purposes, generation, we results, recorded fitness of the best individual as the Y-axis, i.e., the JmDE distance ofdistance the Fromgeneration, the it the can be concluded thatof the of the adaptive algorithm is best very after each we recorded the fitness theperformance best individual as the Y-axis, i.e., the Jm clusters, and the X-axis is the parameter of DE. Population size is an important parameter, which stable. Theclusters, population sizeX-axis NP, initial factor of F0,DE. andPopulation initial crossover rateimportant CR0 haveparameter, little effect of the best and the is thescale parameter size is an affects the distribution of population and convergence behavior. We fixed F = 0.8 and CR = 0.2,CR then on convergence behavior. On one hand, the optimization of CD clusters is not difficult owing to the 0 0 which affects the distribution of population and convergence behavior. We fixed F0 = 0.8 and 0= recorded the performance of DE at different population sizes, as shown in Figure 14a. Further, we low-dimensional encoding strategy. On the other hand, the parameter self-adaptive strategy can 0.2, then recorded the performance of DE at different population sizes, as shown in Figure 14a. fixed NPwe = 30 andNP CR 0.2, and the performance of DE with various initial scale factors, dynamically adjust the scale factor the crossover rate, improve the search efficiency of DE. 0= =30 Further, fixed and CRrecorded 0 and = 0.2, and recorded theand performance of DE with various initial as shown in Figure 14b. Furthermore, Figure 14c depicts the performance of DE with different initial Therefore, the adaptive DE algorithm can converge to satisfactory optimization results in 100 scale factors, as shown in Figure 14b. Furthermore, Figure 14c depicts the performance of DE with crossover rates, where NP = 30 and F = 0.8. iterationsinitial undercrossover various parameters. different rates, where0 NP = 30 and F0 = 0.8. From the results, it can be concluded that the performance of the adaptive DE algorithm is very stable. The population size NP, initial scale factor F0, and initial crossover rate CR0 have little effect on convergence behavior. On one hand, the optimization of CD clusters is not difficult owing to the low-dimensional encoding strategy. On the other hand, the parameter self-adaptive strategy can dynamically adjust the scale factor and the crossover rate, and improve the search efficiency of DE. Therefore, the adaptive DE algorithm can converge to satisfactory optimization results in 100 iterations under various parameters. (a)

(b)

(c)

Figure Figure14. 14.Relationship Relationshipbetween between the theperformance performance and anddifferent differentparameters parametersof ofDE. DE.(a) (a)Performance Performance with population size size NP. (b) Performance with different scaleinitial factor scale F0 . (c) factor Performance withdifferent different population NP. (b) Performance with initial different F0. (c) with different with initialdifferent crossover rate crossover CR0 . Performance initial rate CR0.

From the results, it can be concluded that the performance of the adaptive DE algorithm is very stable. The population size NP, initial scale factor F0 , and initial crossover rate CR0 have little effect on convergence of CD clusters is not difficult owing to (a)behavior. On one hand, the optimization (b) (c) the low-dimensional encoding strategy. On the other hand, the parameter self-adaptive strategy can Figure 14. Relationship between the performance and different parameters of DE. (a) Performance with different population size NP. (b) Performance with different initial scale factor F0. (c) Performance with different initial crossover rate CR0.

Remote Sens. 2018, 10, 1664

21 of 26

dynamically adjust the scale factor and the crossover rate, and improve the search efficiency of DE. Therefore, the adaptive DE algorithm Remote Sens. 2018, 10, x FOR PEER REVIEW can converge to satisfactory optimization results in 100 iterations 21 of 26 under various parameters. 4.4.3. Performance Comparison of GA, ACO, and DE 4.4.3. Performance Comparison of GA, ACO, and DE The DE algorithm is employed to search for the optimal changed and unchanged clusters with The DE employedDE to search the optimal and unchanged clusters with minimum Jmalgorithm distance. isHowever, is just for a case study ofchanged the proposed optimal multi-feature minimum Jm distance. However,and DE can is just case study of theEAs proposed optimal multi-feature clustering-based CD framework, be areplaced by other with powerful optimization clustering-based CD framework, and can be replaced by other EAs with powerful optimization ability. In this section, we further test the performance of other EAs, i.e., genetic algorithm (GA [43]), ability. In this section, wealgorithm further test the performance of other EAs, i.e., genetic algorithm (GA [43]), ant colony optimization (ACO [44]) in the proposed framework, namely M-GACD and ant colony optimization algorithm (ACO [44]) in the proposed framework, namely M-GACD and M-ACOCD. The purpose is to compare the convergence process of GA, ACO, and DE algorithms M-ACOCD. The purpose is to function. compare the convergence processconsidering of GA, ACO, algorithms using the proposed objective In the GA algorithm, thatand theDE individual is using thewith proposed objective In the GA algorithm, considering that the individual is encoded encoded real-value, we function. use intermediate recombination and real-value mutation operators [43] with real-value, use intermediate recombination and real-value operators [43] to operator produce to produce newwe individuals. As for the ACO algorithm, we adoptmutation the real-value mutation new individuals. As for the ACO algorithm, we adopt the real-value mutation operator to construct to construct new solutions, and the transition probability and pheromone are updated according to new solutions, and theThese transition probabilitygenerate and pheromone are and updated according to the basic the basic ACO [44,45]. EAs randomly NP clusters, evolve the clusters through ACO [44,45]. These EAs randomly generate NP clusters, and evolve the clusters through different different evolutionary operators. For fairness, their population size is set to 30, and the maximum evolutionary operators. isFor theirSardinia population size is setthe to 30, and dataset the maximum number number of generations setfairness, to 100. The dataset and Alaska are selected as of generations is set to 100. The Sardinia dataset and the Alaska dataset are selected as examples, examples, and Figure 15 depicts the convergence process of the compared EAs. The abscissa is the and Figure 15 depicts the convergence of the compared EAs. The abscissa thegeneration. number of number of generations, and the ordinateprocess is the best fitness of the optimal clusters in is each generations, andresults the ordinate the best fitness the optimal clusters in each From the of theisSardinia datasetofand the Alaska dataset, wegeneration. find that the initial From the results of the Sardinia dataset and the Alaska dataset, we find that the initial populations populations of GA and ACO are better than DE, because the populations are randomly generated. of GA and ACO are better than DE, because the populations are randomly generated. However, However, eventually, the DE algorithm converges to the best solution with minimal fitness, which eventually, thethat DE algorithm convergesglobal to thesearch best solution withThe minimal fitness, which demonstrates demonstrates DE has a powerful capability. GA algorithm can obtain a better that DE has a powerful searchSince capability. The GA algorithm can obtain a better solution than solution than the ACOglobal algorithm. the ACO algorithm is more suitable for combinatorial the ACO algorithm. thetraveling ACO algorithm is more suitable[46], for combinatorial such optimization, such Since as the salesman problem the proposedoptimization, CD problem is as a the traveling salesman problem [46], the proposed CD problem is a continuous optimization problem, continuous optimization problem, so the ACO algorithm tends to fall into the local minima, so the ACO algorithm tends to fall clusters. into the local minima, resulting in failure to find optimal clusters. resulting in failure to find optimal

(a)

(b)

Figure Convergence analysis analysis of of GA, GA, ACO, ACO, and DE in the multi-feature Figure 15. 15. Convergence multi-feature clustering-based CD framework. of different EAsEAs for the dataset.dataset. (b) Convergence process framework. (a) (a)Convergence Convergenceprocess process of different forSardinia the Sardinia (b) Convergence of different EAs for the Alaska dataset. process of different EAs for the Alaska dataset.

4.4.4. Robustness Test 4.4.4. Robustness Test Taking into consideration interferences in the Landsat image, three features are extracted to Taking into consideration interferences in the Landsat image, three features are extracted to construct a multi-feature space—detail enhancement, Wiener de-noising, and structural similarity. construct a multi-feature space—detail enhancement, Wiener de-noising, and structural similarity. In this section, we test the robustness of the proposed M-DECD in resisting the interferences of the In this section, we test the robustness of the proposed M-DECD in resisting the interferences of the Landsat images, and analyze the effect of the three features. Landsat images, and analyze the effect of the three features. The Mexico dataset and the Sardinia dataset are selected for the Gaussian noise robustness The Mexico dataset and the Sardinia dataset are selected for the Gaussian noise robustness test, test, because the background interference of the two datasets is mainly composed of Gaussian because the background interference of the two datasets is mainly composed of Gaussian noise with little brightness distortion. The Gaussian noise, which obeys the normal distribution N   ,  2  ,   0,   0, 2,...,32 , is added into the tested datasets. The accuracy (Kappa coefficient) of the comparison methods is recorded as the Y-axis, and the X-axis is the standard variance of the Gaussian noise, as shown in Figure 16. It can be found that the MRF-EM and FLICM methods can

Remote Sens. 2018, 10, 1664

22 of 26

noise with little brightness distortion. The Gaussian noise, which obeys the normal distribution  N µ, σ2 , µ = 0, σ = 0, 2, . . . , 32, is added into the tested datasets. The accuracy (Kappa coefficient) of the comparison methods is recorded as the Y-axis, and the X-axis is the standard variance of the Remote Sens. 2018, 10, x FOR PEER REVIEW 22 of 26 Gaussian noise, as shown in Figure 16. It can be found that the MRF-EM and FLICM methods can remove derived from from remove strong strong Gaussian Gaussian noise, noise, because because their their CD CD is is conducted conducted with with spatial spatial information information derived the pixels in in DI. method has has employed employed the the Wiener Wiener the neighborhood neighborhood pixels DI. Although Although the the proposed proposed M-DECD M-DECD method filter to remove some Gaussian noise, the detail enhancement feature strengthens the interference when filter to remove some Gaussian noise, the detail enhancement feature strengthens the interference the noise too strong > 14 for σ > 16and for the dataset), resulting in  >Sardinia when theisnoise is too (σ strong (  the > 14Mexico for thedataset Mexicoand dataset 16 for the Sardinia dataset), reduced accuracy. Therefore, we further tested the M-DECD method without the detail enhancement resulting in reduced accuracy. Therefore, we further tested the M-DECD method without the detail feature and thefeature M-DECD only with the Wiener M-DECD-noD and enhancement andmethod the M-DECD method only de-noise with thefeature, Wienernamely de-noise feature, namely M-DECD-noDS, respectively. We found that the M-DECD-noDS similar results as the M-DECD-noD and M-DECD-noDS, respectively. We found that can the obtain M-DECD-noDS can obtain FLICM when the noise is strong. The M-DECD-noD performs better than M-DECD, but it still cannot similar results as the FLICM when the noise is strong. The M-DECD-noD performs better than resist strong image noise, which means theimage structural feature also enhances the image noise M-DECD, but it still cannot resist strong noise,similarity which means the structural similarity feature to some extent. the Thus, the proposed could be further combining also enhances image noise to method some extent. Thus, the improved proposed by method couldthe bemultiple further features in a more appropriate way. improved by combining the multiple features in a more appropriate way.

(a)

(b)

16.Gaussian Gaussiannoise noise robustness analysis of comparison the comparison methods. (a) Accuracy of the Figure 16. robustness analysis of the methods. (a) Accuracy of the Mexico Mexico dataset with respect to different noise level. (b) Accuracy of Sardinia the Sardinia dataset with respect dataset with respect to different noise level. (b) Accuracy of the dataset with respect to different noise levels. to different noise levels.

Furthermore, dataset and and the the Canada Canada dataset dataset are are tested tested to to prove Furthermore, the the Alaska Alaska dataset prove the the effectiveness effectiveness of of the feature and the detail detail enhancement enhancement feature and the the structural structural similarity similarity feature, feature, because because the the Alaska Alaska dataset dataset has has many and some locallocal brightness distortion, while while the Canada dataset is affected many weak weakchanged changededges edges and some brightness distortion, the Canada dataset is by stronger brightness distortion (marked by (marked the blue ovals in blue Figure 11a).in For demonstration, affected by local stronger local brightness distortion by the ovals Figure 11a). For we tested the M-DECD method the detail enhancement (M-DECD-noD) to analyze demonstration, we tested thewithout M-DECD method without feature the detail enhancement feature the effect of detail preservation, and we also tested the M-DECD without structural similarity feature (M-DECD-noD) to analyze the effect of detail preservation, and we also tested the M-DECD (M-DECD-noS) to analyze the feature robustness of the algorithm to local the brightness distortion. The CMs are without structural similarity (M-DECD-noS) to analyze robustness of the algorithm to depicted in Figure 17 and their accuracy is listed in Table 5. local brightness distortion. The CMs are depicted in Figure 17 and their accuracy is listed in Table 5. The CMs of the Alaska dataset show that the M-DECD-noD lost some weak changed edges (marked by red oval), compared to the M-DECD, which indicates that the M-DECD has better detail preservation space ofof M-DECD-noS is preservation ability abilitydue dueto tothe thedetail detailenhancement enhancementfeature. feature.The Themulti-feature multi-feature space M-DECD-noS composed of the detail enhancement feature and the Wiener de-noise feature;feature; thus, its thus, CM contains is composed of the detail enhancement feature and the Wiener de-noise its CM several changed details. However, without the structural similarity feature, the CM of M-DECD-noS contains several changed details. However, without the structural similarity feature, the CM of has many falsehas detected in the brightness region (marked region by yellow oval).by From the M-DECD-noS manypixels false detected pixels in distortion the brightness distortion (marked yellow CMs the Canada dataset, find that the M-DECD-noD M-DECD-noS miss of changed oval).ofFrom the CMs of thewe Canada dataset, we find thatand the the M-DECD-noD and thea lot M-DECD-noS pixels, prove that the detail feature the structural similarity can miss a which lot of changed pixels, whichenhancement prove that the detailand enhancement feature and thefeature structural preserve details the changed thechanged spectral object. contrastSince between changed and similaritylocal feature canofpreserve localobject. detailsSince of the the the spectral contrast unchanged is low the Canadaregions dataset,isitlow needs information possible the between theregions changed andinunchanged in as themuch Canada dataset, as it needs asfor much information as possible for the accurate CD. The M-DECD-noD does not consider the spectral value of DI, so its CM has some false detected pixels in the right top corner. On comparing the CMs of the M-DECD and the M-DECD-noS, we can conclude that the structural similarity feature is effective in resisting strong local brightness. The accuracy of the M-DECD and M-DECD-noD demonstrates that the detail enhancement

Remote Sens. 2018, 10, 1664

23 of 26

accurate CD. The M-DECD-noD does not consider the spectral value of DI, so its CM has some false detected pixels in the right top corner. On comparing the CMs of the M-DECD and the M-DECD-noS, we can conclude that the structural similarity feature is effective in resisting strong local brightness. of the M-DECD RemoteThe Sens.accuracy 2018, 10, x FOR PEER REVIEW and M-DECD-noD demonstrates that the detail enhancement 23 of 26 feature can decrease the miss alarm, and the accuracy of the M-DECD and M-DECD-noS proves that the structural similarity feature can significantly decrease the false alarm, especially when the interference of images is mainly caused by local brightness distortion. Results of the Alaska dataset with weak changed edges and some local brightness distortion

Results of the Canada dataset with strong local brightness distortion

(a)

(b)

(c)

(d)

Resultsof ofthe theM-DECD M-DECDseries seriesofofmethods methods with various combinations of image features. Figure 17. Results with various combinations of image features. (a) (a) CMs of the M-DECD. CMs theM-DECD-noD. M-DECD-noD.(c) (c)CMs CMsof ofthe the M-DECD-noS. M-DECD-noS. (d) (d) Reference CMs of the M-DECD. (b)(b) CMs ofofthe Yellow ovals refer to the false detection areas compared with the M-DECD, and the red ovals map. Yellow refer to the miss-detection areas compared with the M-DECD. M-DECD. Table 5. Accuracy of the results obtained with various combinations of image features. Table 5. Accuracy of the results obtained with various combinations of image features. Dataset DatasetIndex Index MAMA FA FA Alaska Alaska OE OE Kappa Kappa MA

MA FA Canada FA OE Canada OEKappa Kappa

M-DECDM-DECD-noD M-DECD-noD M-DECD-noS M-DECD M-DECD-noS 3095 3524 1925 3095 3524 1925 984984 624 5902 624 5902 4079 4148 7827 4079 4148 7827 0.7519 0.736619 0.7366190.640776 0.640776 0.7519 386,034 458,155 310,156 386,034 458,155 310,156 205,690 230,708 795,020 205,690 230,708 795,020 591,724 688,863 1,105,176 591,724 0.736385 688,863 0.629674 1,105,176 0.7757 0.7757

0.736385

0.629674

5. Discussion and Conclusions 5. Discussion and Conclusions In this paper, we proposed an unsupervised CD method based on multi-feature clustering this paper,differential we proposed an unsupervised CD for method basedimagery. on multi-feature clustering using usingInadaptive evolution (M-DECD) Landsat The proposed method adaptive differential evolution (M-DECD) for Landsat imagery. The proposed method constructs constructs an effective multi-feature space for Landsat image CD, which can resist Gaussian noise an effective multi-feature space while for Landsat image CD,changes. which can resist Gaussian and from local and local brightness distortion, preserving weak Three features are noise extracted brightness distortion, while preserving weak changes. Three features are extracted from the the multi-temporal images and DI—Wiener denoising, detail enhancement, and structural multi-temporal images and optimal DI—Wiener denoising, detail enhancement, structural with similarity. similarity. In order to obtain CD results, the M-DECD combines theand DE algorithm FCM In order to obtain optimal CD results, the M-DECD combines the DE algorithm with FCM to search to search the changed and unchanged clusters in the multi-feature space by minimizing the the Jm distance. In addition, the control parameters of the DE algorithm have been adaptively adjusted to improve the robustness of M-DECD. Four Landsat datasets with typical natural land cover changes have been tested and the experiment results demonstrate the effectiveness and superiority of the M-DECD when compared with seven conventional and state-of-the-art methods, especially for the Canada dataset, whose detection accuracy is increased by 11.9%.

Remote Sens. 2018, 10, 1664

24 of 26

changed and unchanged clusters in the multi-feature space by minimizing the Jm distance. In addition, the control parameters of the DE algorithm have been adaptively adjusted to improve the robustness of M-DECD. Four Landsat datasets with typical natural land cover changes have been tested and the experiment results demonstrate the effectiveness and superiority of the M-DECD when compared with seven conventional and state-of-the-art methods, especially for the Canada dataset, whose detection accuracy is increased by 11.9%. However, the proposed method treats all the features equally. It could be further improved by analyzing the noise level and brightness distortion level of the data, then incorporating the multiple image features in a more appropriate way. In addition, the detail enhancement feature strengthens image noise, so the M-DECD is not suitable for processing SAR image with strong speckle noise. Author Contributions: All the authors made significant contributions to the work. Y.Z., M.S. and A.M. proposed the conception and designed the methodology. Funding: This work was supported by National Key Research and Development Program of China under Grant No. 2017YFB0504202, National Natural Science Foundation of China under Grant Nos. 41622107 and 41771385, and Natural Science Foundation of Hubei Province in China under Grant No. 2016CFA029. Acknowledgments: The authors would like to thank Liangpei Zhang for providing advice for experiments and the preparation of the paper. The authors would also like to thank the National Aeronautics and Space Administration (NASA) and the US Geological Survey (USGS) for acquiring and providing the Landsat images used in this study, and Zhang Hua (China University of Mining and Technology, Jiangsu, China) for sharing the Mexico, Sardinia, and Alaska datasets. Conflicts of Interest: The authors declare no conflict of interest.

References 1. 2.

3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13. 14.

Singh, A. Digital change detection techniques using remotely-sensed data. Int. J. Remote Sens. 1989, 10, 989–1003. [CrossRef] Bruzzone, L.; Prieto, D.F. An adaptive semiparametric and context-based approach to unsupervised change detection in multitemporal remote-sensing images. IEEE Trans. Image Process. 2002, 11, 452–466. [CrossRef] [PubMed] Mishra, N.S.; Ghosh, S.; Ghosh, A. Fuzzy clustering algorithms incorporating local information for change detection in remotely sensed images. Appl. Soft Comput. 2012, 12, 2683–2692. [CrossRef] Yoon, G.W.; Yun, Y.B.; Park, J.H. Change vector analysis: Detecting of areas associated with flood using landsat TM. Int. Geosci. Remote Sens. 2003, 5, 3386–3388. Chen, W.; Moriya, K.; Sakai, T.; Koyama, L.; Cao, C.X. Mapping a burned forest area from landsat tm data by multiple methods. Geomat. Nat. Hazards Risk. 2016, 7, 384–402. [CrossRef] Schroeder, T.A.; Wulder, M.A.; Healey, S.P.; Moisen, G.G. Detecting post-fire salvage logging from landsat change maps and national fire survey data. Remote Sens. Environ. 2012, 122, 166–174. [CrossRef] Yavasli, D.D.; Tucker, C.J.; Melocik, K.A. Change in the glacier extent in turkey during the landsat ERA. Remote Sens. Environ. 2015, 163, 32–41. [CrossRef] Yavariabdi, A.; Kusetogullari, H. Change detection in multispectral landsat images using multiobjective evolutionary algorithm. IEEE Geosci. Remote Sens. Lett. 2017, 14, 414–418. [CrossRef] Bruzzone, L.; Prieto, D.F. Automatic analysis of the difference image for unsupervised change detection. IEEE Trans. Geosci. Remote Sens. 2000, 38, 1171–1182. [CrossRef] Otsu, N. Threshold selection method from gray-level histograms. IEEE Trans. Syst. Man Cybern. 1979, 9, 62–66. [CrossRef] Kittler, J.; Illingworth, J. Minimum error thresholding. Pattern Recognit. 1986, 19, 41–47. [CrossRef] Zanetti, M.; Bruzzone, L. A theoretical framework for change detection based on a compound multiclass statistical model of the difference image. IEEE Trans. Geosci. Remote Sens. 2018, 56, 1129–1143. [CrossRef] Zanetti, M.; Bovolo, F.; Bruzzone, L. Rayleigh-rice mixture parameter estimation via em algorithm for change detection in multispectral images. IEEE Trans. Image Process. 2015, 24, 5004–5016. [CrossRef] [PubMed] Peiman, R. Pre-classification and post-classification change-detection techniques to monitor land-cover and land-use change using multi-temporal landsat imagery: A case study on pisa province in italy. Int. J. Remote Sens. 2011, 32, 4365–4381. [CrossRef]

Remote Sens. 2018, 10, 1664

15.

16.

17. 18. 19. 20. 21. 22. 23. 24.

25. 26. 27. 28. 29. 30.

31.

32. 33. 34. 35. 36. 37. 38.

25 of 26

Ghosh, S.; Bruzzone, L.; Patra, S.; Bovolo, F.; Ghosh, A. A context-sensitive technique for unsupervised change detection based on hopfield-type neural networks. IEEE Trans. Geosci. Remote Sens. 2007, 45, 778–789. [CrossRef] Bovolo, F.; Bruzzone, L.; Marconcini, M. A novel approach to unsupervised change detection based on a semisupervised svm and a similarity measure. IEEE Trans. Geosci. Remote Sens. 2008, 46, 2070–2082. [CrossRef] Bezdek, J.C.; Ehrlich, R.; Full, W. FCM—The fuzzy c-means clustering-algorithm. Comput. Geosci. 1984, 10, 191–203. [CrossRef] Krinidis, S.; Chatzis, V. A robust fuzzy local information c-means clustering algorithm. IEEE Trans. Image Process. 2010, 19, 1328–1337. [CrossRef] [PubMed] Gong, M.G.; Zhou, Z.Q.; Ma, J.J. Change detection in synthetic aperture radar images based on image fusion and fuzzy clustering. IEEE Trans. Image Process. 2012, 21, 2141–2151. [CrossRef] [PubMed] Celik, T. Change detection in satellite images using a genetic algorithm approach. IEEE Geosci. Remote Sens. Lett. 2010, 7, 386–390. [CrossRef] Ghosh, A.; Mishra, N.S.; Ghosh, S. Fuzzy clustering algorithms for unsupervised change detection in remote sensing images. Inf. Sci. 2011, 181, 699–715. [CrossRef] Wang, Q.; Zhang, F.; Li, X. Optimal clustering framework for hyperspectral band selection. IEEE Trans. Geosci. Remote Sens. 2018, 56, 5910–5922. [CrossRef] Zhong, Y.F.; Ma, A.L.; Ong, Y.S.; Zhu, Z.X.; Zhang, L.P. Computational intelligence in optical remote sensing image processing. Appl. Soft Comput. 2018, 64, 75–93. [CrossRef] Zhong, Y.F.; Zhang, S.; Zhang, L.P. Automatic fuzzy clustering based on adaptive multi-objective differential evolution for remote sensing imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2013, 6, 2290–2301. [CrossRef] Zhong, Y.F.; Ma, A.L.; Zhang, L.P. An adaptive memetic fuzzy clustering algorithm with spatial information for remote sensing imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 1235–1248. [CrossRef] Celik, T. Image change detection using gaussian mixture model and genetic algorithm. J. Vis. Commun. Image Represent. 2010, 21, 965–974. [CrossRef] Celik, T.; Yetgin, Z. Change detection without difference image computation based on multiobjective cost function optimization. Turk. J. Electr. Eng. Comput. Sci. 2011, 19, 941–956. Li, H.; Gong, M.G.; Wang, Q.; Liu, J.; Su, L.Z. A multiobjective fuzzy clustering method for change detection in sar images. Appl. Soft Comput. 2016, 46, 767–777. [CrossRef] Aghababaee, H.; Tzeng, Y.C.; Amini, J. Swarm intelligence and fractals in dual-pol synthetic aperture radar image change detection. J. Appl. Remote Sens. 2012, 6, 063596. [CrossRef] Kusetogullari, H.; Yavariabdi, A.; Celik, T. Unsupervised change detection in multitemporal multispectral satellite images using parallel particle swarm optimization. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 2151–2164. [CrossRef] Singh, K.K.; Mehrotra, A.; Nigam, M.J.; Pal, K. Unsupervised change detection from remote sensing images using hybrid genetic FCM. In Proceedings of the 2013 Students Conference on Engineering and Systems (Sces): Inspiring Engineering and Systems for Sustainable Development, Allahabad, India, 12–14 April 2013. Shang, R.H.; Qi, L.P.; Jiao, L.C.; Stolkin, R.; Li, Y.Y. Change detection in sar images by artificial immune multi-objective clustering. Eng. Appl. Artif. Intell. 2014, 31, 53–67. [CrossRef] Storn, R.; Price, K. Differential evolution—A simple and efficient heuristic for global optimization over continuous spaces. J. Glob. Optim. 1997, 11, 341–359. [CrossRef] Das, S.; Mullick, S.S.; Suganthan, P.N. Recent advances in differential evolution—An updated survey. Swarm Evol. Comput. 2016, 27, 1–30. [CrossRef] Das, S.; Suganthan, P.N. Differential evolution: A survey of the state-of-the-art. IEEE Trans. Evol. Comput. 2011, 15, 4–31. [CrossRef] Wang, D.; Zhang, X.F.; Liu, Y.; Zhao, Z.W.; Song, Z.J. Remote sensing image denoising with iterative adaptive wiener filter. Springer Proc. Phys. 2017, 192, 361–370. Lim, J.S. Two-Dimensional Signal and Image Processing; Prentice Hall: Englewood Cliffs, NJ, USA, 1990; 710p. Stephane, M.; Charlotte, P. Primal sketch of image series with edge preserving filtering application to change detection. In Proceedings of the 2015 8th International Workshop on the Analysis of Multitemporal Remote Sensing Images (Multi-Temp), Annecy, France, 22–24 July 2015.

Remote Sens. 2018, 10, 1664

39. 40. 41. 42. 43. 44. 45. 46.

26 of 26

Moser, G.; Serpico, S.B. Unsupervised change detection with high-resolution sar images by edge-preserving markov random fields and graph-cuts. Int. Geosci. Remote Sens. 2012, 1984–1987. [CrossRef] Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [CrossRef] [PubMed] Loza, A.; Mihaylova, L.; Bull, D.; Canagarajah, N. Structural similarity-based object tracking in multimodality surveillance videos. Mach. Vis. Appl. 2009, 20, 71–83. [CrossRef] Zhong, Y.F.; Zhang, L.P. Remote sensing image subpixel mapping based on adaptive differential evolution. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2012, 42, 1306–1329. [CrossRef] [PubMed] Mühlenbein, H.; Schlierkamp-Voosen, D. Predictive models for the breeder genetic algorithm I. Continuous parameter optimization. Evol. Comput. 1993, 1, 25–49. [CrossRef] Dorigo, M. Optimization, learning and natural algorithms. PhD Thesis, Politecnico di Milano, 1992. Dorigo, M.; Blum, C. Ant colony optimization theory: A survey. Theor. Comput. Sci. 2005, 344, 243–278. [CrossRef] Dorigo, M.; Caro, G.D. Ant colony optimization: A new meta-heuristic. In Proceedings of the 1999 Congress on Evolutionary Computation (CEC 99), Washington, DC, USA, 6–9 July 1999; Volume 2, pp. 1470–1477. © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).