information Article
Land Cover Classification from Multispectral Data Using Computational Intelligence Tools: A Comparative Study André Mora 1, * ID , Tiago M. A. Santos 1 , Szymon Łukasik 2,3 , João M. N. Silva 4 António J. Falcão 1 , José M. Fonseca 1 ID and Rita A. Ribeiro 1 ID 1
2 3 4
*
ID
,
Computational Intelligence Group of CTS/UNINOVA, FCT, University NOVA of Lisbon, 2820-516 Caparica, Portugal;
[email protected] (T.M.A.S.);
[email protected] (A.J.F.);
[email protected] (J.M.F.);
[email protected] (R.A.R.) Faculty of Physics and Applied Computer Science, AGH University of Science and Technology, 30-059 Kraków, Poland;
[email protected] Systems Research Institute, Polish Academy of Sciences, 01-447 Warsaw, Poland Forest Research Centre, School of Agriculture, University of Lisbon, 1349-017 Lisbon, Portugal;
[email protected] Correspondence:
[email protected]; Tel.: +351-21-2948527
Received: 8 October 2017; Accepted: 13 November 2017; Published: 15 November 2017
Abstract: This article discusses how computational intelligence techniques are applied to fuse spectral images into a higher level image of land cover distribution for remote sensing, specifically for satellite image classification. We compare a fuzzy-inference method with two other computational intelligence methods, decision trees and neural networks, using a case study of land cover classification from satellite images. Further, an unsupervised approach based on k-means clustering has been also taken into consideration for comparison. The fuzzy-inference method includes training the classifier with a fuzzy-fusion technique and then performing land cover classification using reinforcement aggregation operators. To assess the robustness of the four methods, a comparative study including three years of land cover maps for the district of Mandimba, Niassa province, Mozambique, was undertaken. Our results show that the fuzzy-fusion method performs similarly to decision trees, achieving reliable classifications; neural networks suffer from overfitting; while k-means clustering constitutes a promising technique to identify land cover types from unknown areas. Keywords: image fusion; aggregation operators; remote sensing; land cover classification
1. Introduction The main objective of this study is to discuss the suitability of different computational intelligence methods for studying land cover spatiotemporal modifications, mainly for improving land usage and management. This paper is based on a preliminary conference paper [1] where we presented a novel fuzzy image fusion technique and compared it with two other computational intelligence methods, decision trees and neural networks, for fusing images and performing classification of terrains as waterbody, river bank, bare area, cropland, grassland, shrubland and forest. In this article, we extend the previous work by reproducing a spatiotemporal case study with three years (1989, 2002, 2005) run by Temudo et al. [2] where the changes of land cover usage in the post-war period (>1992) in the district of Mandimba, Niassa province, Mozambique, were studied (http://earthexplorer.usgs.gov). In our article, we also include an unsupervised approach for enriching the comparative study. The aim of this extension is to strengthen the claims about the accuracy of the fuzzy-fusion approach and to demonstrate its suitability for spatiotemporal image fusion.
Information 2017, 8, 147; doi:10.3390/info8040147
www.mdpi.com/journal/information
Information 2017, 8, 147
2 of 15
Nowadays, with the free availability in web repositories of multispectral satellite images from Earth observation missions, we can efficiently classify terrain types and, therefore, contribute to studies and improved analysis of the Earth environment. These studies fall into the domain of remote sensing analysis and cover topics such as deforestation, degradation of coastal areas, wildfires and shifting cultivation, among others [3]. The term remote sensing is understood as a technique for identifying, classifying and determining an object’s properties through the analysis of data acquired remotely, without physical contact with the object itself [4]. There are several works in the literature about applying Computational Intelligence (CI) techniques to satellite images for feature extraction and classification, with the aim of improving land cover analysis. For example, CI techniques such as fuzzy logic [5–7], Decision Trees (DT) [8–10] and Artificial Neural Networks (ANN) [11–14] are already applied in many classification problems and in particular in some cases of remote sensing [15–17]. The discussed fuzzy-fusion inference method, instead of just using a basic fuzzy inference model, is improved with reinforcement aggregation operators to perform the inference reasoning [1, 18]. Reinforcement aggregation operators have been applied in decision making methods and data fusion [19–22], but here, they are used in the inference scheme to ensure positive or negative reinforcement in the classification. The fuzzy-fusion inference method includes three steps: (i) fuzzification of the spectral information (bands); (ii) creation of the rule set for the land cover classes; (iii) evaluation of the rules using aggregation operators. To create the fuzzy membership functions (Step i), we generated histograms from the training data and then built the membership functions by fitting Gaussian functions to the histogram’s clusters (clustered with Otsu’s thresholding method [23]) using a hybrid algorithm that combines the Levenberg–Marquardt optimization method [24] and the classic mean and standard deviation approach [16]. For the classification process (Step ii), seven rules were defined, one for each class to be classified, having seven inputs each, corresponding to the spectral bands’ membership functions. For the inference scheme (Step iii), we followed the Takagi–Sugeno model of fuzzy inputs, crisp output [25] and tested four types of aggregation operators: average; minimum; and two reinforcement operators: Fixed Identity Monotonic Identity Commutative Aggregation (FIMICA); Fuzzy-Fusion (FF)-Uninorm [21,26,27]. This paper is organized as follows. Section 2 provides an overview of techniques used in remote sensing, with special consideration of approaches based on the techniques of computational intelligence. In Section 3, the case study and the proposed classification method are presented. In Section 4, a comparative study of the results obtained with our approach, DT, ANN and unsupervised k-means clustering is given. Finally, in Section 5, we provide some conclusions and future work. 2. Related Work in Remote Sensing Analysis Land cover classification from satellite images constitutes a typical problem related to remote sensing. It can be performed in a supervised way (having a preclassified training set) or an unsupervised way (where no additional information regarding the content of the image is provided). This means that the unsupervised algorithm extracts class labels for each image region (or individual pixel), being equivalent to a clustering task. Both classic statistical approaches and others based on Computational Intelligence (CI) techniques have been successfully applied for this clustering task. The classic statistical approach involves algorithms such as k-means [28], c-means [29] or the expectation-maximization method [30]. On the other hand, the approaches based on CI frequently employ nature-inspired heuristics like genetic algorithms [31] or particle swarm optimization [32]. These CI supervised methods represent an alternative solution to the aforementioned clustering techniques, where the algorithms are supplemented with additional knowledge related to the assignment of pixels/regions to different classes. Among the most common approaches of this type are DT, ANN and fuzzy inference systems. As the paper studies a supervised variant of a remote sensing method and employs the aforementioned techniques for comparison, they will be covered below in more detail.
Information 2017, 8, 147
3 of 15
Decision Tree (DT) classifiers divide data into smaller subsets with some similar features, until subsets are homogeneous. A DT is composed of nodes (root and interior) and leaves (terminal nodes). Each node represents a rule that is applied to the data and that controls the path to be followed to the next node. The design of a tree can be done manually, when it is based on a priori knowledge, or automatically using mathematical evaluation algorithms, such as ID3 (Iterative Dichotomiser 3), C4.5 or CART (Classification And Regression Tree) methods. Friedl and Brodley [3] and Fauvel et al. [17] used DTs to perform terrain classification and concluded that they are adaptable to noisy and nonlinear relations between land cover classes and remotely-sensed data. Artificial Neural Networks (ANN) form a classification technique where neurons are trained to detect patterns in the training dataset and then the trained classifier is applied to unknown data. In a multi-layer ANN, the number of neurons in the input and output layers is determined by the data being analyzed (external elements), while the number of hidden layers is determined mainly by trial and error, allowing the network to solve more complex problems [15]. This classification technique has some drawbacks, in particular a long learning time, lack of transparency on the reasoning process (typically presenting a black-box behavior) and a tendency to produce overfitting. One of the first applications of ANN in remotely-sensed data was the work of Kanellopoulos and Wilkinson [33]. In their article, a set of best practices for applying ANN in remote sensing data was presented. Later, Ayhan and Kansu [15] conducted a study, comparing three multispectral image classification techniques, namely ANN, the maximum-likelihood estimator and fuzzy logic, using images from IKONOS II and Landsat. They concluded that ANN is a robust method, but with the drawback that determining the optimum network structure is a hard and fundamental task to ensure good results. Finally, Fuzzy Inference Systems (FIS) are rule-based models described by logic operators in rules that establish relationships between fuzzy sets [34]. The set of rules, which can be provided either by experts or by creating all possible combinations between input variables [35] are composed by a set of propositions, antecedents (inputs) and consequents (outputs). FIS include three main processes [34]: inputs’ fuzzification; fuzzy rules’ definition and inference scheme selection to obtain the outputs. The fuzzification process refers to the representation of all input variables on the [0, 1] domain through the use of fuzzy sets. The most common fuzzification processes [34] are performed through: intuition, inference or induction. Induction uses observation data to generate membership functions, and this is the chosen method in this work. Regarding the inference scheme, the most well-known models are those of Mamdani [36] and Takagi–Sugeno [25]. The Mamdani model requires defuzzification and is more computationally intensive, while Takagi–Sugeno is more suitable for mathematical analysis because it is computationally efficient and produces crisp outputs. In this work, and due to its efficiency, we use the weighted average of the Takagi–Sugeno model, but substituting the classical algebraic operators by the reinforcement aggregation operator. Throughout the literature, there are several published works where fuzzy logic classifiers were applied in remote sensing problems, but none with our specifications. Some, like [7,15,37], train the classifier (fuzzification) using simple Gaussian membership functions, which are defined only with the data’s mean and standard deviation, and use a discrete max-min operator for the inference scheme. This method will be hereafter referred to as the classical method. 3. Materials and Methods 3.1. Input Data As mentioned, our case study covers the 1989–2005 period because there is a previous study [2] where the authors studied the population return to one Mozambique district (Mandimba), after the 1992 peace accord, which resulted in observable changes in land usage. Mainly, a change was observed for shifting cultivation, a known agricultural technique, sometimes called slash and burn [38], where deforestation allows new cultivation areas, which are well observed from satellite imagery.
Information 2017, 8, 147
4 of 15
Information 2017, 8, 147
4 of 15
The imagery used was borrowed from Temudo et al. [2], which included Landsat 5 (1989) and The imagery used was borrowed from Temudo et al. [2], which included Landsat 5 (1989) and 7 7 (2002 and 2005) high-resolution satellite images (8-bit data), taken over the district of Mandimba, (2002 and 2005) high-resolution satellite images (8-bit data), taken over the district of Mandimba, province of Niassa, in Mozambique (Figure 1a). The choice of dataset was due to the existing expert province of Niassa, in Mozambique (Figure 1a). The choice of dataset was due to the existing expert land classification, on the mentioned study, that could act as the ground-truth. Landsat 5 used a land classification, on the mentioned study, that could act as the ground-truth. Landsat 5 used a Thematic Mapper (TM) multispectral scanning radiometer and Landsat 7 the Enhanced Thematic Thematic Mapper (TM) multispectral scanning radiometer and Landsat 7 the Enhanced Thematic Mapper Plus (ETM+), which has similar resolution (28.5 m/pixel) and the same spectral bands (seven) Mapper Plus (ETM+), which has similar resolution (28.5 m/pixel) and the same spectral bands (seven) plus a panchromatic one (15 m/pixel), and each image covers the same area. The satellite images were plus a panchromatic one (15 m/pixel), and each image covers the same area. The satellite images were acquired in 1989, 2002 and 2005 and cover a region of approximately 16,831 km2 2(4605 × 4500 pixels acquired in 1989, 2002 and 2005 and cover a region of approximately 16,831 km (4605 × 4500 pixels with a 28.5-m/pixel effective resolution). A mask of the Mandimba district and areas not covered with a 28.5-m/pixel effective resolution). A mask of the Mandimba district and areas not covered by by nebulosity in any of the three images was created (Figure 1b), reducing the total covered area to nebulosity in any of the three images was created (Figure 1b), reducing the total covered area to 3367 km2 . We used five spectral bands, namely Green-Band 2, Red-Band 3, Near-Infrared (c)-Band 3367 km2. We used five spectral bands, namely Green-Band 2, Red-Band 3, 4, Short Wave Infrared 1 (SWIR-1)-Band 5 and SWIR-2-Band 7, together with two vegetation indices, Near-Infrared (c)-Band 4, Short Wave Infrared 1 (SWIR-1)-Band 5 and SWIR-2-Band 7, together with to ensure the distinction of vegetation types: the Normalized Difference Vegetation Index (NDVI) [39] two vegetation indices, to ensure the distinction of vegetation types: the Normalized Difference and the Vegetation Index (VI) proposed in [40], respectively depicted by Equations (1) and (2). In their Vegetation Index (NDVI) [39] and the Vegetation Index (VI) proposed in [40], respectively depicted original equations, they provide normalized values in the interval [−1, 1], but here, they were rescaled by Equations (1) and (2). In their original equations, they provide normalized values in the interval to a normalized 8-bit unsigned scale [0, 255]. These vegetation indices have the advantage of being [−1, 1], but here, they were rescaled to a normalized 8-bit unsigned scale [0, 255]. These vegetation less dependent on illumination and having a good discrimination between different land cover types. indices have the advantage of being less dependent on illumination and having a good discrimination They show higher values for vegetation, positive low values for water and bare soils and negative between different land cover types. They show higher values for vegetation, positive low values for index values for clouds and snow. water and bare soils and negative index values for clouds and snow. N𝑁𝐼𝑅 IR –−𝑅𝐸𝐷 RED NVDI + 127 (1) (1) 𝑁𝑉𝐷𝐼= =127 127×× N IR + RED + 127 𝑁𝐼𝑅 + 𝑅𝐸𝐷 N IR − SW IR 2 + 127 V I 𝑉𝐼 = =127 × × 𝑁𝐼𝑅 – 𝑆𝑊𝐼𝑅 2 + 127 127 N𝑁𝐼𝑅 IR + 𝑆𝑊𝐼𝑅 SW IR2 2
(a)
(2) (2)
(b)
Figure 1. (a) Landsat 5 image taken in 1989 over the district of Mandimba (RGB-Bands 743) and (b) in Figure 1. (a) Landsat 5 image taken in 1989 over the district of Mandimba (RGB-Bands 743) and (b) in black, the the mask mask of of the the Mandimba Mandimba district district and and areas areas not not covered coveredby byclouds cloudsin inthe the33images. images. black,
Our case study encompassed a total of seven land cover classes, as described in Table 1. Our case encompassed a total of seven land cover classes, et as al. described in Table 1. However, However, we study will follow the general analysis procedure of Temudo [2], merging similar classes, we will follow the general analysis procedure of Temudo et al. [2], merging similar classes, reducing reducing its number to five (Table 1, column “merged”) to facilitate understanding the its number to five (Table 1, column “merged”) to facilitate understanding the comparative results. comparative results.
Information 2017, 8, 147 Information 2017, 2017, 8, 8, 147 147 Information Information Information 2017, 2017, 8, 8, 147 147 Information 2017, 8, 147 Table Information Information 2017, 2017, 8, 8, 147 147
ID ID ID ID ID ID ID ID 1111 11 11
5 of 15
1. Classes used for classification and merged 5-class distribution.
Table 1. 1. Classes Classes used used for for classification classification and and merged merged 5-class 5-class distribution. distribution. Table Table 1. Classes used for classification and merged 5-class distribution. Table 1. Classes used for classification and merged 5-class distribution. Table 1. Classes used for classification and merged 5-class distribution. Name Abbreviation Class Description Merged Table 1. used for and 5-class Class Class Name Abbreviation Class Description Merged Table 1. Classes Classes used for classification classificationClass and merged merged 5-class distribution. distribution. Class Name Abbreviation Description Merged Class Abbreviation Class Merged Class Name Name Abbreviation Class Description Description Merged Class Name Abbreviation Class Description Merged Class Abbreviation Class Description Merged Class Name Name Abbreviation Classcovered Description Merged Areas covered by water Areas by(e.g., waterrivers, Waterbody WaterB WaterBAreas 1 11 Waterbody Areas covered covered by by water water (e.g., (e.g., rivers, rivers, Waterbody WaterB Areas covered by water (e.g., rivers, (e.g., rivers, lakes) lakes) Waterbody WaterB 1 Areas covered by water (e.g., rivers, lakes) Waterbody WaterB 1 Areas water lakes) Waterbody WaterB 1 Areas covered covered by by water (e.g., (e.g., rivers, rivers, lakes) Waterbody WaterB 1 lakes) Waterbody WaterB 1 lakes) lakes)
2222 22 22
River River Bank River River Bank Bank Bank River Bank River Bank River River Bank Bank
RiverB RiverB RiverB RiverB RiverB RiverB RiverB RiverB
3333 33 33
Bare Area Area Bare Bare Area Bare Area Bare Bare Area Area Bare Area Bare Area
Areas without without vegetation vegetation (e.g., (e.g., rock rock Areas BareA Areas without vegetation (e.g., without vegetation BareA Areas Areas without vegetation (e.g., rock rock outcrops) BareA BareA Areas without vegetation (e.g., rock outcrops) BareA Areas vegetation (e.g., (e.g.,outcrops) rock outcrops) BareA Areas without without vegetation (e.g., rock rock outcrops) BareA outcrops) BareA outcrops) outcrops)
2 2 2 2 2 2 2 2
444 444 44
Croplands Croplands Croplands Croplands Croplands Croplands Croplands Croplands
CropL CropL CropL CropL CropL CropL CropL CropL
Areas covered covered by by crops crops Areas Areas covered by crops Areas Areascovered coveredby bycrops crops Areas covered by crops Areas Areas covered covered by by crops crops
2 2 2 2 2 2 22
5 555 555 5
Grasslands Grasslands Grasslands Grasslands Grasslands Grasslands Grasslands Grasslands
GrassL GrassL GrassL GrassL GrassL GrassL GrassL GrassL
Areas covered covered by by herbaceous herbaceous Areas Areas by herbaceous Areas covered covered by herbaceous vegetation Areas covered by herbaceous Areas covered by vegetation Areas by vegetation Areas covered covered by herbaceous herbaceous vegetation herbaceous vegetation vegetation vegetation vegetation
33 3 3 3 3 33
666 6666 6 7 777 77 77
Areas nearby water bodies Areas Areasnearby nearbywater waterbodies bodies Areas nearby water bodies Areas nearby water bodies Areas nearby water bodies Areas nearby water bodies Areas nearby water bodies
Shrublands and and Areas covered covered by by shrubs shrubs (closed (closed to to Shrublands Areas ShrubL Shrublands and Areas covered by shrubs (closed to ShrubL Shrublands and Areas covered by shrubs (closed to Thickets open) ShrubL Shrublands and Areas covered by shrubs (closed to Thickets open) ShrubL Shrublands and Areas covered by (closed covered by shrubs Thickets open) ShrubL Shrublands and Areas Areas covered by shrubs shrubs (closed to to Thickets open) ShrubL Shrublands Thickets and Thickets open) ShrubL ShrubL (closed to open) Thickets open) Thickets open) Forest and and Areas with with a tree tree canopy canopy cover cover greater Forest Areas ForestW Forest and Areas with a a tree canopy cover greater greater ForestW Forest Areas canopy Woodlands than 10% cover ForestW Forest and and Areas with with a a tree tree canopy cover greater greater Woodlands than 10% ForestW Forest and Areas with a tree canopy cover greater Woodlands than 10% ForestW Forest and Areas with a tree canopy cover greater Areas with a tree canopy Woodlands than 10% 10% Woodlands than ForestWForestW Forest and WoodlandsForestW Woodlands than 10% cover than greater than 10% Woodlands 10%
5 of of 15 15 5 5 5 of of 15 15 5 of 15 55 of of 15 15
Samples Samples Samples Samples Samples Samples Samples Samples
1 1 1 1 1 1 11
4 4 4 4 4 4 44 55 5 5 5 5 55
3.2. Overview of of the Fuzzy-Fusion Fuzzy-Fusion Uninorm Method Method 3.2. 3.2. Overview Overview of of the the Fuzzy-Fusion Fuzzy-Fusion Uninorm Uninorm Method Method 3.2. Overview the Uninorm 3.2. Overview of the Fuzzy-Fusion Uninorm Method 3.2. of Method The fuzzy-fusion fuzzy-fusion classifier Uninorm is supervised using an an inference scheme scheme with specialized specialized 3.2. Overview Overview of the the Fuzzy-Fusion Fuzzy-Fusion Uninorm Methodsystem using The classifier is aaa supervised 3.2. Overview of the Fuzzy-Fusion Uninorm Method system The fuzzy-fusion classifier is supervised system using an inference inference scheme with with specialized The fuzzy-fusion classifier is a supervised system using inference with The fuzzy-fusion fuzzy-fusion classifier is aa supervised supervised system using an inference scheme with specialized specialized reinforcement aggregation operators. The output output of each each rulean is the the result scheme of the the aggregation aggregation of its its The classifier is system using an inference scheme with specialized reinforcement aggregation operators. The of rule is result of of The fuzzy-fusion classifier is a supervised system using an inference scheme with specialized reinforcement aggregation operators. The output of each rule is the result of the aggregation of its reinforcement aggregation operators. output of each rule is the result the of reinforcement aggregation operators. The output of evaluates each rule aaan isgiven the result ofusing the aggregation aggregation of its its antecedents, i.e., each rule represents represents aThe class, which evaluates given pixelof using its seven seven spectral The fuzzy-fusion classifier is a supervised system using inference scheme with specialized reinforcement aggregation operators. The output of each rule is the result of the aggregation of its antecedents, i.e., each rule a class, which pixel its spectral reinforcementi.e., aggregation operators. a The output of evaluates each rule a isgiven the result the aggregation of its antecedents, each rule represents class, which pixelof using its seven spectral antecedents, i.e., represents a evaluates a given using its seven antecedents, i.e., each each rule representsThe a class, class, which evaluates a is given pixel using its seven spectral information bands. Therule classification result is which the class class thatrule obtained thepixel maximum score. Tospectral build aaof its reinforcement aggregation operators. output of evaluates each the result of the aggregation antecedents, i.e., each rule represents a class, which a given pixel using its seven spectral information bands. The classification result is the that obtained the maximum score. To build antecedents, i.e., each rule represents a class, which evaluates a given pixel using its seven spectral information bands. The classification result is the class that obtained the maximum score. To build a information bands. The classification result is the that obtained the maximum To build a information bands. The classification result iswhich the class class thatare obtained the(1) maximum score. To set build a classifieri.e., based on the the fuzzy-fusion approach, three stages are required: (1) create thescore. training set and information bands. The classification the class that obtained the maximum score. To build a classifier based on fuzzy-fusion approach, three stages required: create the training and antecedents, each rule represents aresult class,is evaluates a given pixel using its seven spectral information bands. The classification result is the class that obtained the maximum score. To build a classifier based on the fuzzy-fusion approach, three stages are required: (1) create the training set and classifier based on the fuzzy-fusion approach, three stages are required: (1) create the training set and classifier based on membership the fuzzy-fusion fuzzy-fusion approach, three stages are required: required: (1) create create the training set and and build the thebased inputs’ membership functions; (2) three define the rule-base rule-base system; and the (3) training implement the classifier on the approach, stages are (1) set build inputs’ functions; (2) define the system; and (3) implement the information bands. The classification result is the class that obtained the maximum score. To build classifier based on the fuzzy-fusion approach, three stages are required: (1) create the training set and build the inputs’ membership functions; (2) define the rule-base system; and (3) implement the build the inputs’ membership functions; (2) the build the scheme inputs’ with membership functions; (2) define define the rule-base rule-base system; system; and and (3) (3) implement implement the the inference scheme with reinforcement aggregation operators. build the inputs’ membership functions; (2) define the rule-base system; and (3) implement the inference reinforcement aggregation operators. build the inputs’ membership functions; (2) define the rule-base system; and (3) implement the inference scheme with reinforcement aggregation operators. a classifier based on the fuzzy-fusion approach, three stages are required: (1) create the training set inference inference scheme scheme with with reinforcement reinforcement aggregation aggregation operators. operators. inference scheme with reinforcement aggregation operators. inference scheme with reinforcement aggregation operators. and build inputs’ 3.2.1. the Training Set membership functions; (2) define the rule-base system; and (3) implement the 3.2.1. Training Set 3.2.1. Training Set 3.2.1.scheme Trainingwith Set reinforcement aggregation operators. inference 3.2.1. Training Set 3.2.1. Training Set Since the satellite satellite images for for the the three three years years were were not not calibrated, calibrated, different different supervised supervised training training 3.2.1.Since Training Set the images Since the satellite images for the three years were not calibrated, different supervised training Since the satellite images for the three years were not calibrated, different supervised training Since the satellite images for the three years were not calibrated, different supervised training sets for each year were created. Further, as mentioned before, since there is no ground-truth data Since the satellite for the were calibrated, different training sets for each year wereimages created. Further, asyears mentioned before, since there there is no nosupervised ground-truth data 3.2.1. sets Training Set Since theyear satellite images forFurther, the three three years were not not calibrated, different supervised training for each were created. as mentioned before, since is ground-truth data sets for each year were created. Further, as mentioned before, since there is no ground-truth data sets for each year were created. Further, as mentioned before, since there is no ground-truth data available, we used the training sets built manually by the experts, Temudo et al. [2], through visual sets created. Further, as before, since there ground-truth data available, we year used were the training training sets built manually manually by the the experts, Temudo et no al. [2], [2], through visual visual sets for for each each year were created.sets Further, as mentioned mentioned before, since there is is no ground-truth data available, we used the built by experts, Temudo et al. through available, we used the training built manually by the et al. [2], through visual Since the satellite images for sets the threeimages yearsand were notexperts, calibrated, different supervised training available, we used the training sets built manually by the experts, Temudo et al. [2], through visual inspection of Landsat Landsat and SPOT satellite images and supported by Temudo land cover cover statistics produced by available, we used the training sets built manually by the experts, Temudo et al. [2], through visual inspection of and SPOT satellite supported by land statistics produced by available, we used the training sets built manually by the experts, Temudo et al. [2], through visual inspection of Landsat and SPOT satellite images and supported by land cover statistics produced by inspection of Landsat and SPOT satellite images and supported by land cover statistics produced by inspection of Landsat and SPOT satellite images and supported by land cover statistics produced by Mozambique National Cartography and Remote Sensing Centre (CENACARTA) (available at sets for each year were created. Further, as mentioned before, since there is no ground-truth inspection and SPOT images and supported by cover produced Mozambique National Cartography and Remote Sensing Centre (CENACARTA) (available at inspection of of Landsat Landsat and SPOT satellite satellite images and Sensing supported by land land cover statistics statistics (available produced by by Mozambique National Cartography and Remote Centre (CENACARTA) at Mozambique National Cartography and Remote Sensing Centre (CENACARTA) (available at Mozambique National Cartography and Remote Sensing Centre (CENACARTA) (available at http://www.cenacarta.com). They selected a total of 10,840, 9950 and 10,106 sample pixels (seven data available, we used the training sets built manually by the experts, Temudo et al. [2], through Mozambique and Centre (available at http://www.cenacarta.com). They total 10,840, and 10,106 (seven Mozambique National National Cartography Cartography and a Remote Sensing Centre (CENACARTA) (available at http://www.cenacarta.com). They selected selected aRemote total of ofSensing 10,840, 9950 9950 and(CENACARTA) 10,106 sample sample pixels pixels (seven http://www.cenacarta.com). They selected a total of 10,840, 9950 and 10,106 sample pixels (seven They selected a total of 10,840, 9950 and 10,106 sample pixels (seven bands each), representing representing the seven aforementioned classes, respectively for the the years 1989, 2002 and http://www.cenacarta.com). They selected a total of 10,840, 9950 and 10,106 sample pixels (seven bands each), the seven aforementioned classes, respectively for years 1989, 2002 and visualhttp://www.cenacarta.com). inspection of Landsat and SPOT satellite images and supported by land cover statistics http://www.cenacarta.com). They selected a total of 10,840, 9950 and 10,106 sample pixels (seven bands each), each), representing representing the the seven seven aforementioned aforementioned classes, classes, respectively respectively for for the the years years 1989, 1989, 2002 2002 and and bands bands each), representing the seven aforementioned classes, respectively for the years 1989, 2002 and 2005. by bands representing seven respectively for years 2002 2005. produced Mozambique National Cartography classes, and Remote Sensing (CENACARTA) bands each), each), representing the the seven aforementioned aforementioned classes, respectively for the theCentre years 1989, 1989, 2002 and and 2005. 2005. 2005. After defining the training set, seven intensity histograms (one per band or vegetation index) 2005. defining intensity histograms per or index) 2005.After (available at http://www.cenacarta.com). selected a total(one of 10,840, and 10,106 sample After defining the the training training set, set, seven sevenThey intensity histograms (one per band band9950 or vegetation vegetation index) After defining the training set, seven intensity histograms (one per band or vegetation index) defining the training set, seven intensity histograms (one per band or vegetation wereAfter defined for each each class using the spectral intensity information of the sample pixels. To Toindex) each After defining the training set, seven intensity histograms (one per band or defined for using spectral intensity information of sample pixels. each After defining the class training set,the seven intensity histograms (oneclasses, per band or vegetation vegetation index) were defined foreach), each class using the spectral intensity information of the the sample pixels.for Toindex) eachyears pixelswere (seven bands representing the seven aforementioned respectively the were defined for each class using the spectral intensity information of the sample pixels. To each were defined for class using the spectral information of the to sample pixels. To each histogram, one oreach more membership functionsintensity were assigned according to a fitting process, as were defined for class using information of sample pixels. To histogram, one more membership functions were according a process, as were and defined foror each class using the the spectral spectral intensity information of the the to sample pixels. To each each histogram, one oreach more membership functionsintensity were assigned assigned according to a fitting fitting process, as 1989, 2002 2005. histogram, one or more membership functions were assigned according a fitting process, as histogram, one or more membership functions were assigned according to a fitting process, as illustrated in Figure 2, for Waterbodies and Forests and Woodlands classes. histogram, one or more functions were assigned according to a fitting as illustrated in Figure 2, for membership Waterbodies and Forests and Woodlands classes. histogram, or training more functions were fitting process, process, illustrated in Figure 2, Waterbodies and Forests and Woodlands classes. illustrated inone Figure 2, for for membership Waterbodies and Forests and assigned Woodlands classes. After defining the set, seven intensity histograms (oneaccording per bandtooravegetation index)aswere illustrated in Figure 2, for Waterbodies and Forests and Woodlands classes. illustrated in Figure 2, for Waterbodies and Forests and Woodlands classes. illustrated in Figure 2, for Waterbodies and Forests and Woodlands classes.
defined for each class using the spectral intensity information of the sample pixels. To each histogram, one or more membership functions were assigned according to a fitting process, as illustrated in Figure 2, for Waterbodies and Forests and Woodlands classes.
Information 2017, 8, 147 Information 2017, 8, 147
Class7 (ForestW)
Band1
Band7
⋮
⋮
Pixel Intensity
Pixel Intensity
Normalized Occurrences Normalized Occurrences
Class1 (WaterB)
6 of 15 6 of 15
Figure 2. Example of histograms and unimodal and bimodal membership function fits for classes Figure 2. Example of histograms and unimodal and bimodal membership function fits for classes Waterbodies and Forests and Woodlands. Waterbodies and Forests and Woodlands.
There are several methods to create fuzzy membership functions. A good overview can be found There arecase, several methodsan toinductive create fuzzy membership functions. good overview can be in [34]. In our we followed method using histograms andA fitted Gaussian functions, found In our case, inductive method using histograms and Gaussian which in on[34]. the one hand, canwe befollowed generatedanautomatically from the training set and, on fitted the other hand, functions, which on the one hand, can be generated automatically from the training set and, on the provide a smooth model of the histogram distribution. Knowing that histograms describe the other hand, provide a smooth model of the histogram distribution. Knowing that histograms describe frequency of pixel values within a class, pixels more likely to appear have higher membership values, the frequency of pixel values within a class, more likely to appear have higher membership while the inverse applies to pixels with lowerpixels occurrences. values, while the inverse applies to pixels with lower occurrences. Since the area covered by one pixel (28.5 m × 28.5 m) may contain more than one land type or, Since the area covered by one pixel (28.5 m × 28.5 m) mayland contain more thantypes one land type or, in other words, the class selected can be composed of multiple or vegetation (example of in other words, the class selected can be composed of multiple land or vegetation types (example of Shrublands that are a mixture of bushes and soil), these histograms are not always smooth, unimodal Shrublands that Bimodalities are a mixtureand of bushes and soil), these histograms are not always in smooth, or symmetrical. asymmetries may occur like the ones illustrated Figureunimodal 2. Fitting or symmetrical. Bimodalities and asymmetries may occur like the ones illustrated in Figure 2. Fitting just one single Gaussian function did not always converge to the best solution. Therefore, we devised just one single Gaussian function did not always converge to the best solution. Therefore, we devised five steps to obtain the pseudo-optimal membership functions: five steps to obtain the pseudo-optimal membership functions: 1. Fit the histogram with one asymmetrical Gaussian membership function; 1. Fit the Otsu’s histogram with one asymmetrical membership 2. Apply thresholding method [23] toGaussian the histogram to find function; the two main clusters; 2. Apply Otsu’s thresholding [23] to the histogram to find thefunction, two mainusing clusters; 3. Obtain for each cluster an method asymmetrical Gaussian membership the cluster’s deviation values above and below the mean value; using the cluster’s mean 3. mean Obtainand for standard each cluster an asymmetrical Gaussian membership function, 4. Fit clusterdeviation by an asymmetrical Gaussian membership function; andeach standard values above and below the mean value; 5. Use the cluster root mean error toGaussian select themembership membership function that best fits the cluster 4. Fit each by ansquare asymmetrical function; (choosing the resulting membership function of Step 1, 3 or 4). 5. Use the root mean square error to select the membership function that best fits the cluster (choosing the process, resultingthe membership function of Step 1, 3 or 4). algorithm [24] was used. It solves For the fitting Levenberg–Marquardt optimization non-linear square problems, reducing the sum ofoptimization square differences between theused. original data For theleast fitting process, the Levenberg–Marquardt algorithm [24] was It solves points (histogram) and the model function (membership function), by adjusting the model non-linear least square problems, reducing the sum of square differences between the original data parameters. Our model uses asymmetrical Gaussian functions. topologies triangular points (histogram) and the model function (membership function),Other by adjusting the such modelasparameters. and model trapezoidal were also tested; however, Gaussian functions achieved better fittingand for trapezoidal the data. In Our uses asymmetrical Gaussian functions. Other topologies such as triangular the case of classes with few samples (bare areas) or in some VI histograms where sparsity occurred were also tested; however, Gaussian functions achieved better fitting for the data. In the case of classes due to rounding effects, a final manual adjustment of the Gaussian fitting was required.
Information 2017, 8, 147
7 of 15
with few samples (bare areas) or in some VI histograms where sparsity occurred due to rounding effects, a final manual adjustment of the Gaussian fitting was required. 3.2.2. Rule-Based System The rule set defined for our method is composed of seven rules, one for each land cover class. The fuzzy variables are the spectral bands, with a specific membership function for each class. In general, if a class receives high scores through all N bands, then it should be assigned to the pixel. A set of K rules (K-number of classes) is then automatically identified in the fuzzy rule-based inference system. It should be noticed that when classes contain a mixture of land or vegetation types and it is possible to have a more detailed training set, it is better to have more rules for each land type, although the output class is the same. In our case, the rule-set is composed of seven rules: Rule 1: Rule 2: Rule 3: Rule 4: Rule 5: Rule 6: Rule 7:
If Band 1 is Waterbody (WaterB) and Band 2 is WaterB . . . and Band 7 is WaterB If Band 1 is River Bank (RiverB) and Band 2 is RiverB . . . and Band 7 is RiverB If Band 1 is Bare Area (BareA) and Band 2 is BareA . . . and Band 7 is BareA If Band 1 is Cropland (CropL) and Band 2 is CropL . . . and Band 7 is CropL If Band 1 is Grassland (GrassL) and Band 2 is GrassL . . . and Band 7 is GrassL If Band 1 is Shrubland (ShrubL) and Band 2 is ShrubL . . . and Band 7 is ShrubL If Band 1 is Forest and Woodlands (ForestW) and Band 2 is ForestW . . . and Band 7 is ForestW
Then output is WaterB; Then output is RiverB; Then output is BareA; Then output is CropL; Then output is GrassL; Then output is ShrubL; Then output is ForestW.
The firing level of every rule (i.e., rule output score) is calculated with a reinforcement aggregation operator, detailed in the next section. This technique also generates a certainty measure for all classes, thus enabling producing certainty distribution maps. 3.2.3. Reinforcement Inference Scheme As mentioned before, here, we follow an inference scheme that uses reinforcement aggregation operators to perform the fusion of images [1]. This fusion process with specialized aggregation operators is based on other work performed by some co-authors [18]. Reinforcement aggregation operators penalize results with lower scores (negative reinforcement) and reward high scores (positive reinforcement) allowing one to discard alternative classes with very low scores (details about these operators can be seen in [26,27]). Formally, the discussed inference scheme includes, as the premise, the scores from each band and then the firing level determination (aggregation operation) selection of the best score, i.e., the class identification is as follows: Score = maxi ⊕ j Classi band j
(3)
where:
• • • • •
⊕ j = aggregation operator; i ∈ [0, NClasses ] = the class number; j ∈ [0, NBands ] = the band number; Classi = class under evaluation; band j = input bands.
It should be noted that performing inference with reinforcement operators is an innovative method to determine a more positive or negative reinforcement of the rule’s firing level and respective classification certainty for each class (for more details, see [1]). In the same article, it was found that the Uninorm reinforcement aggregation operator was better for classification of satellite images; hence, it is the one considered in this comparative study. 4. Assessment and Discussion In the next sub-sections, we discuss the details of the four methods for: (1) the classifiers’ training performances and (2) the results comparison of the four land cover classifications of the district of
Information 2017, 8, 147
8 of 15
Mandimba. In the ground-truth study [2], the inputs were preprocessed by a mean filtering to produce more homogenous land cover maps. Then, the authors used a decision tree classifier to generate their land cover maps. Information 2017, 8, 147 The effect of filtering on the land cover maps can be seen in Figure 3 where 8 ofthe 15 original image was compared with the FF-Uninorm method before and after applying a 3 × 3 mean Figure 3b contains a noisier image than the ground-truth one, and Figure contains 3filtering. × 3 mean filtering. Figure 3b contains a noisier image than the ground-truth one,3c and Figurethe 3c same level smoothness Figure 3a. as Following this observation, preprocessed all the imagesall in contains theofsame level ofas smoothness Figure 3a. Following this we observation, we preprocessed the images same fashion. in the same fashion.
(a)
(b)
(c)
Figure Land ground-truth classification as by published byal.Temudo et al. [2]; Figure 3.3.(a)(a) Land covercover ground-truth classification as published Temudo et [2]; (b) Fuzzy-Fusion (b) Fuzzy-Fusion (FF)-Uninorm classification; (c) FF-Uninorm classification after applying a (FF)-Uninorm classification; (c) FF-Uninorm classification after applying a mean filtering. mean filtering.
The Decision Tree (DT) ground-truth classifier was created with the CART algorithm, while the The Decision Tree (DT) ground-truth classifier was created with the CART algorithm, while the ANN and k-means algorithms were developed using MATLAB Toolboxes. The ANN configuration ANN and k-means algorithms were developed using MATLAB Toolboxes. The ANN configuration was a feedforward network composed of seven input neurons, one hidden layer with eight neurons was a feedforward network composed of seven input neurons, one hidden layer with eight neurons and seven output neurons fully interconnected. The k-means was configured to detect eight clusters, and seven output neurons fully interconnected. The k-means was configured to detect eight clusters, being this extra one the background mask, and then, it was applied to the full image. Its training being this extra one the background mask, and then, it was applied to the full image. Its training accuracy was computed by locating in the image the training samples, obtaining their outputs and accuracy was computed by locating in the image the training samples, obtaining their outputs and computing the accuracy scores. Since the ground-truth study [2] aggregates the seven classes into a computing the accuracy scores. Since the ground-truth study [2] aggregates the seven classes into a smaller set of five classes, we merged the classes according to the same merged column in Table 1. smaller set of five classes, we merged the classes according to the same merged column in Table 1. 4.1. Comparison of Training Results 4.1. Comparison of Training Results We compare our fuzzy-fusion model, configured with the Uninorm reinforcement aggregation We compare our fuzzy-fusion model, configured with the Uninorm reinforcement aggregation operator (FF-Uninorm), against other CI techniques, namely with DT (ground-truth study), ANN and operator (FF-Uninorm), against other CI techniques, namely with DT (ground-truth study), ANN k-means (an unsupervised approach). All techniques share the same training sets, previously presented and k-means (an unsupervised approach). All techniques share the same training sets, previously on Section 3.2.1, one for each year. The accuracy was computed as the rate of correct classifications presented on Section 3.2.1, one for each year. The accuracy was computed as the rate of correct versus the total number of training samples. classifications versus the total number of training samples. The classifier training accuracies are shown in Table 2 for each training set (year). In the “training The classifier training accuracies are shown in Table 2 for each training set (year). In the “training samples” row are presented the percentage of samples for each class within the training set, which in samples” row are presented the percentage of samples for each class within the training set, which in addition are used to weight the total average calculation. From the results shown, ANN consistently addition are used to weight the total average calculation. From the results shown, ANN consistently produced more accurate classifications along the three years (95.6%, 91.8% and 92.1%), although produced more accurate classifications along the three years (95.6%, 91.8% and 92.1%), although less less consistent along the several classes, as can be seen in 2002, where only 22.4% accuracy for consistent along the several classes, as can be seen in 2002, where only 22.4% accuracy for Shrublands Shrublands was achieved. On the opposite side, k-means, being an unsupervised technique, in the was achieved. On the opposite side, k-means, being an unsupervised technique, in the 1989 subset 1989 subset obtained very reasonable results (78.8%), but decreased the accuracy in the subsequent obtained very reasonable results (78.8%), but decreased the accuracy in the subsequent training sets training sets (63.0% and 53.7%). Fuzzy-fusion with Uninorm and DT obtained good classification (63.0% and 53.7%). Fuzzy-fusion with Uninorm and DT obtained good classification accuracies (78.1– accuracies (78.1–88.2% for FF-Uninorm and 85.5–90.2% for DT). It can also be noticed that all techniques 88.2% for FF-Uninorm and 85.5–90.2% for DT). It can also be noticed that all techniques including DT including DT and FF-Uninorm had difficulties correctly classifying Shrublands, which could indicate a and FF-Uninorm had difficulties correctly classifying Shrublands, which could indicate a misclassified training set. misclassified training set.
Information 2017, 8, 147
9 of 15
Information 2017, 8, 147
9 of 15
Table 2. Accuracy of classifying the training set with FF-Uninorm, DT, ANN and k-means. Year Method Other CropL GrassL ShrubL ForestW Total Avg Table 2. Accuracy of classifying the training set with FF-Uninorm, DT, ANN and k-means. FF-Uninorm 99.9% 83.0% 81.5% 69.6% 92.5% 88.2% DT 100.0% 88.8% 86.0% 87.0% 90.4% 90.2% Year Method Other CropL GrassL ShrubL ForestW Total Avg 1989 ANN 97.7% 99.3% 66.8% 70.8% 98.4% 95.6% FF-Uninorm 99.9% 83.0% 81.5% 69.6% 92.5% 88.2% k-means 91.4% DT 100.0% 89.4% 88.8% 44.0% 86.0%67.0% 87.0%75.0% 90.4% 78.8% 90.2% samples 7.4% ANN 97.7% 35.3% 99.3% 7.1% 66.8% 3.1% 70.8%47.1% 98.4% 95.6% 1989 Training k-means 91.4% 91.7% 89.4% 59.9% 44.0%63.1% 67.0%76.8% 75.0% 81.5% 78.8% FF-Uninorm 99.9% Training samples 7.4% 35.3% 7.1% 3.1% 47.1% DT 100.0% 87.4% 73.4% 69.3% 85.5% 85.5% FF-Uninorm 99.9% 91.7% 59.9% 63.1% 76.8% 81.5% 2002 ANN 96.5% 99.6% 73.5% 22.4% 94.6% 91.8% DT 100.0% 87.4% 73.4% 69.3% 85.5% 85.5% k-means 92.6% ANN 96.5% 53.0% 99.6% 33.3% 73.5%41.9% 22.4%74.2% 94.6% 63.0% 91.8% 2002 k-means 92.6% 34.5% 53.0% 10.3% 33.3% 3.4% 41.9%44.2% 74.2% 63.0% Training samples 7.7% Training samples 7.7% 34.5% 10.3% 3.4% 44.2% FF-Uninorm 99.3% 87.3% 63.8% 67.3% 72.6% 78.1% FF-Uninorm 99.3% 87.3% 63.8% 67.3% 72.6% 78.1% DT 100.0% 90.4% 88.4% 78.1% 80.8% 86.4% DT 100.0% 90.4% 88.4% 78.1% 80.8% 86.4% 2005 ANN 99.7% ANN 99.7% 98.8% 98.8% 84.4% 84.4%58.5% 58.5%94.4% 94.4% 92.1% 92.1% 2005 k-means 94.7% 46.3% 46.3% 21.0% 21.0%38.3% 38.3%68.0% 68.0% 53.7% 53.7% k-means 94.7% Training samples 7.6% 34.6% 13.7% 8.0% 36.0% Training samples 7.6% 34.6% 13.7% 8.0% 36.0%
From these these training training accuracies, accuracies, itit was was expected expected to tohave haveaamore morereliable reliableclassification classificationfrom fromANN, ANN, From but to to confirm confirm this, we classified the full image and discuss the results in the following section. but 4.2. Comparison Comparison of of Classification Classification Results Results 4.2. The full full size size image image of of the the Mandimba Mandimba region region was was also also classified classified and and results results compared compared with with the the The ground-truth. The analyzing thethe total areas covered by each landland covercover type, ground-truth. The comparison comparisonwas wasmade madebyby analyzing total areas covered by each computing the Rand Index clustering agreement validity measure (explained in the next section) and type, computing the Rand Index clustering agreement validity measure (explained in the next by visualand inspection of theinspection land cover classified images. In the study run by Temudo et al. [2], the section) by visual of the land cover classified images. In the study runmain by objective was to analyze the land cover changes that occurred in the period 1989–2005. The authors Temudo et al. [2], the main objective was to analyze the land cover changes that occurred in the period found that,The mainly observed border with Malawi (west), there a shifting in there the use of 1989–2005. authors foundnear that,the mainly observed near the border withwas Malawi (west), was from and areas devoted to crops. alands shifting inforest the use of shrublands lands from to forest and shrublands to areas devoted to crops. The land cover area distribution over the three-year study for for the the four four methods methods can can be be found found in in The land cover area distribution over the three-year study Figure 4, while Figure 5 demonstrates land cover classified images. Analyzing the areas of ANN and Figure 4, while Figure 5 demonstrates land cover classified images. Analyzing the areas of ANN and k-means, only mentioned by by thethe ground-truth study, and all theallother k-means, only croplands croplandsshowed showedthe theevolution evolution mentioned ground-truth study, and the land cover types had irregular behaviors, which does not reflect the reality in this district. In the case other land cover types had irregular behaviors, which does not reflect the reality in this district. In of ANN, which in the training best accuracy results, the irregularity be justified the case of ANN, which in theobtained trainingthe obtained the best accuracy results, the can irregularity canbybea possible by overfitting. ANN does not adapt tonot small training sets, while other techniques justified a possibleConcluding, overfitting. Concluding, ANN does adapt to small training sets, while other do. The k-means could identify the main land cover types, as depicted in Figure 5 and in more in techniques do. The k-means could identify the main land cover types, as depicted in Figure 5detail and in Figuredetail 6, where a road6,is where clearlyadetected, and thedetected, other main classes weremain correctly classified. Perhaps more in Figure road is clearly and the other classes were correctly more tests with a higher number of clusters can identify more subtypes, which can be merged the classified. Perhaps more tests with a higher number of clusters can identify more subtypes,into which land cover groups previously defined and achieve a higher accuracy. can be merged into the land cover groups previously defined and achieve a higher accuracy. 60.0% 50.0%
60.0%
FF-Uninorm
50.0%
40.0%
40.0%
30.0%
30.0%
20.0%
20.0%
10.0%
10.0%
0.0%
0.0%
DT
Other
CropL
GrassL
ShrubL
ForestW
Other
CropL
GrassL
ShrubL
ForestW
1989
1.2%
4.5%
23.0%
14.8%
56.5%
1989
1.0%
5.0%
19.8%
23.0%
51.3%
2002
0.9%
12.6%
18.7%
22.7%
45.0%
2002
1.0%
12.3%
17.1%
21.7%
48.0%
2005
1.1%
21.9%
15.7%
27.3%
34.0%
2005
0.9%
17.8%
23.9%
20.9%
36.5%
(a)
(b) Figure 4. Cont.
Information 2017, 8, 147
10 of 15
Information 2017, 8, 147
10 of 15
75.0%
60.0%
ANN
50.0%
60.0%
k-means
40.0%
45.0%
30.0% 30.0%
20.0%
15.0% 0.0%
10.0% 0.0%
Other
CropL
GrassL
ShrubL
ForestW
Other
CropL
GrassL
ShrubL
ForestW
1989
0.8%
10.1%
8.0%
10.6%
70.5%
1989
0.8%
4.4%
36.8%
19.3%
38.6%
2002
0.8%
14.2%
4.7%
22.8%
57.5%
2002
0.8%
13.8%
17.5%
22.6%
45.2%
2005
0.6%
21.8%
16.2%
0.2%
61.2%
2005
0.8%
17.1%
42.4%
23.0%
16.8%
(c)
(d)
Figure 4. Land cover areas distribution along the three-year study for the four classification methods: Figure 4. Land cover areas distribution along the three-year study for the four classification methods: (a) FF-Uninorm; (b) Decision Tree; (c) Artificial Neural Network; (d) k-means clustering. (a) FF-Uninorm; (b) Decision Tree; (c) Artificial Neural Network; (d) k-means clustering.
Regarding FF-Uninorm and DT, it clearly depicts the trend presented in the ground-truth study Regarding FF-Uninorm DT,»it21.9% clearlyfor depicts the trendand presented the ground-truth study as as croplands area increasedand (4.5% FF-Uninorm 5.0% »in17.8% for DT) and forest croplands area increased (4.5% » 21.9% for FF-Uninorm and 5.0% » 17.8% for DT) and forest decreased decreased its area (56.5% » 34.0% for FF-Uninorm and 51.3% » 36.5% for DT), while the Others class its area (56.5% » 34.0% for FF-Uninorm and (1.2 51.3% » 36.5% for DT), while remained remained with approximately the same area » 1.1% for FF-Uninorm andthe 1.0Others » 0.9% class for DT). When with approximately thegrassland same areaand (1.2shrubland, » 1.1% for FF-Uninorm and 1.0 » 0.9% for DT). Whenaanalyzing analyzing the areas of the results differ. FF-Uninorm identified negative the areas of grassland shrubland, theshrubland, results differ. a negative trend trend for grassland and aand positive trend for whileFF-Uninorm DT obtainedidentified approximately stable areas for grassland a positive trend for shrubland, whileare DT difficult obtainedto approximately stableby areas for for both typesand along this time period. These results distinguish even visual both types along this time period. These results are difficult to distinguish even by visual inspection inspection (top row of Figure 7), since small bushes can be interleaved with grasslands and only a (top rowanalysis of Figure 7),identify since small bushes can be interleaved with grasslands a detailed detailed can which classification method was more accurate.and Toonly improve their analysis can identify which classification method was more accurate. improve classification classification a neighboring analysis should be undertaken to detectTo mixed landtheir cover areas. If wea neighboring be undertaken to detect we sum trend. the areas of sum the areasanalysis of theseshould two classes, their differences aremixed lowerland and cover show areas. a slightIf positive theseAnother two classes, theirmeasure, differences are lower show a slightperformance, positive trend. validity especially to and assess k-means was the use of the Rand validity measure, especially to assess k-means performance, was the use on of the Rand IndexAnother (RI) cluster agreement measure. RI is an external clustering validity measure based pairwise Index (RI) cluster agreement measure. RI an external clustering measureclusters based on comparison of clustering assignments of is points belonging to the validity same/different (inpairwise both of comparison of clustering assignments of points belonging to the same/different clustersclusterings (in both of compared clustering solutions) [41]. A value of one of RI indicates that both investigated compared clustering [41]. valueonofany onepair of RI both investigated clusterings are identical and zero solutions) that they do notAagree ofindicates points. Inthat Table 3, the agreements between are four identical and zero that year they and do not on any pair points. In study Table 3, thebeagreements between the methods for each theagree average along theofthree-year can seen. the four forresults each year and thethe average along that the three-year study seen. The methods agreement reinforced conclusion FF-Uninorm andcan DTbeare the most similar agreement results reinforcedFurthermore, the conclusionitthat FF-Uninorm DT areclustering the most similar ones ones The among them (0.8 on average). showed that theand k-means technique, among them (0.8 an on unsupervised average). Furthermore, it showed that consistent the k-means clustering technique, although although being technique, was quite along the study and obtained being an unsupervised technique, was quite along and obtained relatively high relatively high scores when compared with consistent FF-Uninorm andthe DTstudy (0.73–0.84 with FF-Uninorm and scores when compared with FF-Uninorm and DT (0.73–0.84 with FF-Uninorm and 0.69–0.76 with DT). 0.69–0.76 with DT). In summary, summary, our model with the FF-Uninorm FF-Uninorm operator operator produced produced results results that that are are adaptable adaptable and and In consistent with DT (bottom row of Figure 6). k-means was shown to be an adequate technique consistent with DT (bottom row k-means was shown to be an adequate technique whenever there is no prior knowledge on the land cover classes. whenever Table 3. 3. Rand RandIndex Index agreement agreement validity validity measure measure among among FF-Uninorm, FF-Uninorm, DT, ANN and k-means. Table 1989 1989 DT ANN k-Means DT ANN k-Means FF-Uninorm 0.77 0.73 0.75 FF-Uninorm 0.77 0.73 0.75 DT 0.75 0.69 DT 0.75 0.69 ANN 0.65 ANN 0.65
2002 2002 ANN k-Means ANN k-Means 0.79 0.73 0.85 0.79 0.73 0.75 0.74 0.75 0.74 0.62 0.62
DT DT 0.85
DT DT 0.80 0.80
2005 2005 ANN k-Means ANN k-Means 0.71 0.84 0.71 0.84 0.75 0.76 0.75 0.76 0.70 0.70
Average Average DT DT 0.80
0.80
ANN k-Means k-Means 0.74 0.77 0.74 0.77 0.75 0.73 0.75 0.73 0.65 0.65
ANN
Information 2017, 8, 147 Information 2017, 8, 147
11 of 15 11 of 15
Other Cropland Grassland Shrubland Forest
Figure FF-Uninorm, decision decision trees, trees,artificial artificialneural neuralnetworks networks Figure5.5.Land Landcover cover classification classification results results using using FF-Uninorm, and andk-means k-meansclustering. clustering.
Information 2017, 8, 147
12 of 15
Information 2017, 8, 147 Information 2017, 8, 147
12 of 15 12 of 15
RGB/743 RGB/743
FF-Uninorm FF-Uninorm
K-means K-means
Figure Detail of the land cover classification using FF-Uninorm and k-means, where aa road is clearly Figure 6. 6. Figure 6. Detail Detail of of the the land landcover coverclassification classification using using FF-Uninorm FF-Uninorm and and k-means, k-means, where where a road road is is clearly clearly seen and correctly identified by both methods. seen and correctly identified by both methods. seen and correctly identified by both methods.
RGB/743 RGB/743
FF-Uninorm FF-Uninorm
DT DT
Figure Figure 7. 7. (Top (Top row) row) Detail Detail on on the the Grassland Grassland and and Shrubland Shrubland classification classification comparison comparison between between Figure 7. (Top row) Detail on the Grassland and Shrubland classification comparison between FF-Uninorm and DT. (Bottom row) Correct and detailed classification obtained by FF-Uninorm FF-Uninorm and andDT. DT. (Bottom row) Correct and detailed classification obtained by FF-Uninorm FF-Uninorm (Bottom row) Correct and detailed classification obtained by FF-Uninorm and DT. and and DT. DT.
5. Conclusions 5. 5. Conclusions Conclusions In this work, we extended previous work where aa fuzzy-fusion approach with reinforcement In In this this work, work, we we extended extended previous previous work work where where a fuzzy-fusion fuzzy-fusion approach approach with with reinforcement reinforcement aggregation operators, for land cover cover classification classification from from multispectral multispectral satellite satellite images, images, was was proposed. proposed. aggregation operators, for land aggregation operators, for land cover classification from multispectral satellite images, was proposed. The main aim of the approach was to fuse spectral information (from a multispectral satellite The The main main aim aim of of the the approach approach was was to to fuse fuse spectral spectral information information (from (from aa multispectral multispectral satellite satellite imagery imagery imagery source or other) to produce land cover maps andthe compare the preliminary approachothers with source source or or other) other) to to produce produce land land cover cover maps maps and and compare compare the preliminary preliminary approach approach with with two two others two others from the computational intelligence realm. In this article, we improved the preliminary from from the the computational computational intelligence intelligence realm. realm. In In this this article, article, we we improved improved the the preliminary preliminary study study with with study with two First, additions. First, wethe enriched the comparison assessment by introducing a new two additions. we enriched comparison assessment by introducing a new two additions. First, we enriched the comparison assessment by introducing a new example example example composed of a three-year period ofimages satellite images (taken in the years 1989, 2002 and 2005) composed composed of of aa three-year three-year period period of of satellite satellite images (taken (taken in in the the years years 1989, 1989, 2002 2002 and and 2005) 2005) enabling enabling enabling comparing the evolution of the land cover area distribution with the results of a study, comparing the evolution of the land cover area distribution with the results of a study, acting comparing the evolution of the land cover area distribution with the results of a study, acting as as the the acting as the ground-truth, run in this region. Second, we included another promising method on ground-truth, run in this region. Second, we included another promising method on our comparison ground-truth, run in this region. Second, we included another promising method on our comparison study, study, the the k-means k-means unsupervised unsupervised clustering clustering technique, technique, to to assess assess its its suitability suitability as as an an alternative alternative classification method. classification method.
Information 2017, 8, 147
13 of 15
our comparison study, the k-means unsupervised clustering technique, to assess its suitability as an alternative classification method. Our fuzzy-fusion approach improved the training of common fuzzy classifiers by: (a) proposing a hybrid algorithm that performs clustering of the pixel intensity histograms and fits them with Gaussian functions; (b) uses a reinforcement operator (Uninorm) as a novel inference scheme for the rule-based classification. Furthermore, this method can produce classification certainty maps that help to assess the results and detect possible improvements (although this functionality was not used in this study). When comparing the accuracy of our FF-Uninorm approach with the other three computational intelligence techniques (decision trees, artificial neural networks and k-means), the general accuracy was lower than DT and ANN. However, when applied to the full images, it performed similarly to DT, and we manage to successfully reproduce the ground-truth results. A higher disparity in the classification of grassland and shrubland between FF-Uninorm and DT was noticed, although this might be due to these classes being composed of a mixture of bare and vegetation areas, and these methods are not able to process this spatial distribution just with a single pixel classification. The introduction of the k-means unsupervised technique in this study gave positive insights on the ability of the method to analyze unknown areas or to prepare training sets for other methods. In most tests, it produced reasonable classifications and obtained a good agreement with FF-Uninorm and DT. As future directions, we can identify the need for improvements on the training set generation and consequent improvements on the membership function creation, mainly to deal with the small number of samples of less frequent classes. Furthermore, we plan to introduce a pixel neighboring analysis upon the pixel classification, to improve the classification of land cover classes that have a mixture of land types (ex. Shrublands). Acknowledgments: This work was partially funded by FCT Strategic Program UID/EEA/00066/203 of the Center of Technologies and System (CTS) of UNINOVA—Institute for the Development of new Technologies. It is also partially based on work from COST Action TD1403 “Big Data Era in Sky and Earth Observation”, supported by COST (European Cooperation in Science and Technology). Author Contributions: A.M., T.M.A.S. and S.Ł. prepared and ran the experiments. All the authors have scientifically contributed to the proposed method and collaborated in writing and reviewing the article. Conflicts of Interest: The authors declare no conflict of interest.
References 1.
2. 3. 4. 5. 6. 7.
8. 9.
Santos, T.M.A.; Mora, A.; Ribeiro, R.A.; Silva, J.M.N. Fuzzy-fusion approach for land cover classification. In Proceedings of the 2016 IEEE 20th Jubilee International Conference on Intelligent Engineering Systems (INES), Budapest, Hungary, 30 June–2 July 2016; pp. 177–182. [CrossRef] Temudo, M.P.; Silva, J.M.N. Agriculture and forest cover changes in post-war Mozambique. J. Land Use Sci. 2012, 7, 425–442. [CrossRef] Friedl, M.A.; Brodley, C.E. Decision tree classification of land cover from remotely sensed data. Remote Sens. Environ. 1997, 61, 399–409. [CrossRef] Schowengerdt, R.A. Remote Sensing: Models and Methods for Image Processing; Academic Press: Cambridge, MA, USA, 2006. Ishibuchi, H.; Nozaki, K.; Yamamoto, N.; Tanaka, H. Construction of fuzzy classification systems with rectangular fuzzy rules using genetic algorithms. Fuzzy Sets Syst. 1994, 65, 237–253. [CrossRef] Nguyen, C.H.; Pedrycz, W.; Duong, T.L.; Tran, T.S. A genetic design of linguistic terms for fuzzy rule based classifiers. Int. J. Approx. Reason. 2013, 54, 1–21. [CrossRef] Shackelford, A.K.; Davis, C.H. A combined fuzzy pixel-based and object-based approach for classification of high-resolution multispectral data over urban areas. IEEE Trans. Geosci. Remote Sens. 2003, 41, 2354–2363. [CrossRef] Barros, R.C.; Basgalupp, M.P.; de Carvalho, A.C.P.L.F.; Freitas, A.A. A Survey of Evolutionary Algorithms for Decision-Tree Induction. IEEE Trans. Syst. Man Cybern. Part C (Appl. Rev.) 2012, 42, 291–312. [CrossRef] Belgiu, M.; Drăgu¸t, L. Random forest in remote sensing: A review of applications and future directions. ISPRS J. Photogramm. Remote Sens. 2016, 114, 24–31. [CrossRef]
Information 2017, 8, 147
10.
11.
12. 13.
14. 15. 16. 17. 18. 19. 20. 21.
22. 23. 24. 25. 26. 27. 28.
29. 30. 31. 32.
33.
14 of 15
Millán, V.E.G.; Sanchez-Azofeifa, G.A.; Malvárez, G.C. Mapping Tropical Dry Forest Succession With CHRIS/PROBA Hyperspectral Images Using Nonparametric Decision Trees. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2015, 8, 3081–3094. [CrossRef] Ahmed, B.; Al Noman, M.A. Land cover classification for satellite images based on normalization technique and Artificial Neural Network. In Proceedings of the 2015 1st International Conference on Computer and Information Engineering (ICCIE), Rajshahi, Bangladesh, 26–27 November 2015; pp. 138–141. [CrossRef] Castelluccio, M.; Poggi, G.; Sansone, C.; Verdoliva, L. Land Use Classification in Remote Sensing Images by Convolutional Neural Networks. arXiv, 2015. Sharapov, R.; Varlamov, A. Using neural networks in remote sensing monitoring of exogenous processes. In Proceedings of the Sixth International Conference on Graphic and Image Processing (ICGIP 2014), Beijing, China, 24–26 October 2014. Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117. [CrossRef] [PubMed] Ayhan, E.; Kansu, O. Analysis of image classification methods for remote sensing. Exp. Tech. 2012, 36, 18–25. [CrossRef] Mather, P.; Tso, B. Classification Methods for Remotely Sensed Data, 2nd ed.; CRC Press: Boca Raton, FL, USA, 2009. Fauvel, M.; Chanussot, J.; Benediktsson, J.A. Decision Fusion for the Classification of Urban Remote Sensing Images. IEEE Trans. Geosci. Remote Sens. 2006, 44, 2828–2838. [CrossRef] Ribeiro, R.A.; Falcão, A.; Mora, A.; Fonseca, J.M. FIF: A fuzzy information fusion algorithm based on multi-criteria decision making. Knowl.-Based Syst. 2014, 58, 23–32. [CrossRef] Rudas, I.J.; Pap, E.; Fodor, J. Information aggregation in intelligent systems: An application oriented approach. Knowl.-Based Syst. 2013, 38, 3–13. [CrossRef] Beliakov, G.; Pradera, A.; Calvo, T. Aggregation Functions: A Guide for Practitioners; Springer Publishing Company: Heidelberg, Germany, 2007; Volume 221. Ribeiro, R.A.; Pais, T.C.; Simões, L.F. Benefits of Full-Reinforcement Operators for Spacecraft Target Landing. In Preferences and Decisions, Models and Applications; Greco, S., Marques Pereira, R.A., Squillante, M., Yager, R.R., Kacprzyk, J., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 353–367. Torra, V.; Narukawa, Y. Modeling Decisions: Information Fusion and Aggregation Operators; Springer: Berlin/Heidelberg, Germany, 2007; Volume XIV. Otsu, N. A Threshold Selection Method from Gray-Level Histograms. IEEE Trans. Syst. Man. Cybern. 1979, 9, 62–66. [CrossRef] Marquardt, D.W. An algorithm for least-squares estimation of non-linear parameters. J. Soc. Ind. Appl. Math. 1963, 11, 431–441. [CrossRef] Sugeno, M. An introductory survey of fuzzy control. Inf. Sci. 1985, 36, 59–83. [CrossRef] Yager, R.R.; Rybalov, A. Full reinforcement operators in aggregation techniques. IEEE Trans. Syst. Man Cybern. Part B 1998, 28, 757–769. [CrossRef] [PubMed] Calvo, T.; Mayor, G.; Mesiar, R. Aggregation Operators: New Trends and Applications; Physica: Berlin/Heidelberg, Germany, 2012. Rekik, A.; Zribi, M.; Benjelloun, M.; Ben Hamida, A. A k-means clustering algorithm initialization for unsupervised statistical satellite image segmentation. In Proceedings of the 2006 1ST IEEE International Conference on E-Learning in Industrial Electronics, Hammamet, Tunisia, 18–20 December 2006; pp. 11–16. [CrossRef] Mitra, S.; Kundu, P.P. Satellite image segmentation with Shadowed C-Means. Inf. Sci. 2011, 181, 3601–3613. [CrossRef] Pal, S.K.; Mitra, P. Multispectral image segmentation using the rough-set-initialized EM algorithm. IEEE Trans. Geosci. Remote Sens. 2002, 40, 2495–2501. [CrossRef] Bandyopadhyay, S.; Maulik, U. Genetic clustering for automatic evolution of clusters and application to image classification. Pattern Recognit. 2002, 35, 1197–1208. [CrossRef] Omkar, S.N.; Kumar, M.M.; Mudigere, D.; Muley, D. Urban Satellite Image Classification using Biologically Inspired Techniques. In Proceedings of the IEEE International Symposium on Industrial Electronics, Vigo, Spain, 4–7 June 2007; pp. 1767–1772. [CrossRef] Kanellopoulos, I.; Wilkinson, G.G. Strategies and best practice for neural network image classification. Int. J. Remote Sens. 1997, 18, 711–725. [CrossRef]
Information 2017, 8, 147
34. 35. 36. 37. 38. 39.
40. 41.
15 of 15
Ross, T.J. Fuzzy Logic with Engineering Applications; John Wiley & Sons: Chichester, UK, 2009. Mendel, J.M. Uncertain Rule-Based Fuzzy Logic Systems: Introduction and New Directions; Prentice Hall PTR: Upper Saddle River, NJ, USA, 2001. Mamdani, E.H.H.; Assilian, S. An experiment in linguistic synthesis with a fuzzy logic controller. Int. J. Man Mach. Stud. 1975, 7, 1–13. [CrossRef] Nedeljkovic, I. Image classification based on fuzzy logic. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2004, 34, 685. Ruthenberg, H.; MacArthur, J.D. Farming Systems in the Tropics; Clarendon Press: Oxford, UK, 1980. Rouse, J.W.; Haas, R.H.; Schell, J.A.; Deering, D.W. Monitoring the Vernal Advancement and Retrogradation (Green Wave Effect) of Natural Vegetation; Technical Report for Texas A&M University; Texas A&M University: College Station, TX, USA, 1974. Garcia, M.L.; Caselles, V. Mapping burns and natural reforestation using Thematic Mapper data. Geocarto Int. 1991, 6, 31–37. [CrossRef] Rand, M. Objective Criteria for the Evaluation of Methods Clustering. J. Am. Stat. Assoc. 1971, 66, 846–850. [CrossRef] © 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).