IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 10, NO. 5, SEPTEMBER 2013


Satellite Image Classification Using Morphological Component Analysis of Texture and Cartoon Layers

Chong Yu, Qinfu Qiu, Yilu Zhao, and Xiong Chen

Abstract—In this letter, we propose a high-resolution satellite image classification method using morphological component analysis of texture and cartoon layers. The dictionary matrix used in the algorithm is constructed by independent component analysis. After the decomposition, we obtain the morphological coefficient vectors of both the texture and cartoon layers, which together form the sparse representation of the input high-resolution satellite image. By combining the features from the two layers, the total probability of classifying the target image is calculated according to the maximum likelihood mechanism. Quantitative analysis of the experimental results and comparisons with classic image classification algorithms show that the proposed classification method has better accuracy, efficiency, and performance than most state-of-the-art classification methods.

Index Terms—Independent component analysis (ICA), morphological component analysis (MCA), satellite image classification, sparse representation, texture and cartoon layers.

I. INTRODUCTION

SATELLITE image analysis has attracted growing interest in various applications, such as mineralogy, forestry, agriculture, military, mapping, urban planning, ocean monitoring, and disaster prevention. Most of these applications can ultimately be cast as classification problems, which in remote sensing amount to labeling imaged terrains or regions. This trend mainly comes from the fact that high-resolution satellite images have recently become more accessible. With increasing spatial resolution, satellite images capture more details about the objects and their structures on the surface of the Earth. Moreover, high-resolution satellite images make it possible to discriminate more detailed categories and eventually improve the classification accuracy.

In the literature, various techniques have been developed for classification in remote sensing applications. The most representative state-of-the-art methods include neural networks, support vector machines (SVMs) [1], the graph method [2], AdaBoost, the border vector detection method, linear discriminant analysis, the Gaussian process approach, and the nearest neighbor (NN) classifier [3]. Despite these techniques and achievements, the problems of how to use high-resolution satellite images more effectively and how to further increase the classification accuracy with computers are still far from being solved. The high-resolution satellite image classification task poses several challenges.

For high-resolution satellite images, the objects and their structures often dominate the classification process. Examples in [4] show that objects in different categories may have similar global responses to a classification algorithm, while objects in the same category may have quite different responses to the same algorithm. As a result, a whole-image-based representation may sometimes lead to classification mistakes.

Among the previous classification approaches, SVMs have proven to be a powerful tool for supervised classification of high-dimensional data. However, SVM-based classification methods are extremely sensitive to model selection [5]. Moreover, as an SVM requires a sophisticated learned model for each category, this method cannot scale up to classify thousands of categories.

Recently, nonparametric NN schemes have attracted great interest in the image classification field. However, despite the popularity of this algorithm, it is still unclear how to suitably define the “true” neighbors of high-resolution optical satellite images, which limits the applicability of this method.

Generally speaking, collecting labeled training samples is difficult, time consuming, and expensive, while unlabeled training samples can be collected or generated much more easily. In remote sensing applications, a large number of labeled training samples are usually required to obtain good classification performance. Unfortunately, the lack of a sufficient number of labeled training samples is a general problem in pattern classification. The problem is even more serious in remote sensing applications because identifying and labeling the training samples is extremely difficult and expensive, and sometimes not even feasible.

Manuscript received August 15, 2012; revised September 26, 2012, October 20, 2012, and November 17, 2012; accepted November 18, 2012. Date of publication February 25, 2013; date of current version June 13, 2013. The authors are with the School of Information Science and Technology, Fudan University, Shanghai 200433, China (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/LGRS.2012.2230612
In conclusion, the tremendous amount of information contained in the acquired high-resolution satellite images, together with the increasingly demanding classification requirements, makes most state-of-the-art classification algorithms inadequate. In this letter, we focus on addressing the aforementioned challenges of high-resolution satellite image classification by using morphological component analysis (MCA). The remainder of this letter is organized as follows. Section II describes the MCA algorithm. Section III describes the MCA-based decomposition of an image into texture and cartoon layers. Section IV proposes the satellite image classification method. Section V presents experimental results that test the capabilities and performance of the proposed classification algorithm. Finally, Section VI concludes this letter.

1545-598X/$31.00 © 2013 IEEE


II. MCA FRAMEWORK

A. Algorithm Framework

The MCA framework assumes that an image $x$ containing $M$ pixels is a linear combination of $P$ morphological components, and that the observation $y$ may be contaminated with additive noise $\varepsilon$:

$$y = \sum_{k=1}^{P} x_k + \varepsilon, \qquad \sigma_\varepsilon^2 = \mathrm{Var}[\varepsilon] < +\infty. \tag{1}$$

The MCA algorithm recovers the morphological components $x_k$ ($k = 1, \ldots, P$) from the observed linear mixture $y$. Each morphological component $x_k$ can be represented as

$$x_k = D_k \alpha_k, \qquad k = 1, \ldots, P \tag{2}$$

where $D_k$ is the $k$th basis used to represent the image $x$ and $\alpha_k$ is the coefficient of the $k$th basis. Combining these linear equations, the linear system can be written in vector and matrix form as

$$x = D\alpha \tag{3}$$

where $x \in \mathbb{R}^M$ is the input image, represented as a 1-D vector of length $M$ by lexicographic ordering. The matrix $D = [D_1, \ldots, D_P] \in \mathbb{R}^{M \times P}$ (typically $M \ll P$) is a dictionary built from $P$ elements, and $\alpha \in \mathbb{R}^P$ is the vector of sparse coefficients. The vector $\alpha$ is termed the sparse representation of the input image $x$. Here, “sparse” means that only a few coefficients are significantly large, while most of the coefficients are close to zero and therefore negligible.

In the MCA framework, each of the $P$ columns of the matrix $D$ is a possible image in $\mathbb{R}^M$. These columns are therefore referred to as atomic images, and the matrix $D$ is termed a dictionary of atoms. The matrix $D$ can be likened to the periodic table of the fundamental elements in chemistry. We refer to the coefficient vector $\alpha$ that generates the image $x$ as its representation, since $\alpha$ determines which atoms are used for the construction and in what proportions. The constructed image $x$ is then a molecule in this “signal chemistry.” The process of linearly combining atoms to form the image $x$ is termed atom composition.

The sparsity of the coefficient vector $\alpha$ can be quantified by quasi-norms of the form

$$\|\alpha\|_p^p = \sum_i |\alpha(i)|^p \tag{4}$$

where $\|\alpha\|_p$ is the penalty quantifying sparsity (typically $0 \le p \le 1$). In particular, the $\ell_0$ pseudonorm $\|\alpha\|_0 = \lim_{p \to 0} \|\alpha\|_p^p$ equals the number of nonzero components in the coefficient vector $\alpha$.

The MCA algorithm solves the underdetermined system of equations $x = D\alpha$ and estimates the morphological components $x_k$ ($k = 1, \ldots, P$) by solving the constrained optimization problem

$$\min_{x_1, \ldots, x_P} \sum_{k=1}^{P} \|D_k^T x_k\|_p^p \quad \text{such that} \quad \Big\| y - \sum_{k=1}^{P} x_k \Big\|_2 \le \tau$$

where $\tau$ is typically chosen as a constant, $\sqrt{P}\,\sigma_\varepsilon$. The constraint in this optimization problem accounts for the presence of model imperfection and noise.

B. Dictionary Matrix Constructed by ICA

Independent component analysis (ICA) is a signal processing method widely used in blind source separation. The ICA algorithm separates a mixture of multisource signals into independent components. By applying the ICA algorithm to a real image, we obtain a group of independent bases, which are feature vectors of high-order statistics. The real image can then be represented as a linear combination of these independent bases. In this letter, we construct the dictionary matrix $D$ used in the MCA framework with the ICA algorithm instead of conventional transforms. The contrast experiments in Section V show that the dictionary matrix constructed by ICA performs better than the alternatives.

C. MCA Algorithm

In general, the constrained optimization problem above is hard to solve, particularly when $p < 1$. When $p = 0$, the problem is even non-deterministic polynomial (NP)-hard. However, if all morphological components $x_i = D_i \alpha_i$ are fixed except the $k$th morphological component $x_k = D_k \alpha_k$, then it has been proven that the solution $\alpha_k$ is given by hard thresholding (when $p = 0$) or soft thresholding (when $p = 1$) of the marginal residual $r_k = y - \sum_{i \neq k} D_i \alpha_i$. Because the contributions of the other morphological components have been removed from it, the marginal residual $r_k$ mainly contains the feature information of the morphological component $x_k$. This idea leads to a coordinate relaxation algorithm that cycles through all the morphological components and applies a threshold to the marginal residuals in each iteration. The steps of the MCA algorithm summarized in the Algorithm listing follow from this idea. The symbol $\mathrm{TH}_\lambda(\alpha)$ denotes the thresholding operator suitable for the morphological components, with threshold $\lambda$. In hard thresholding, $\mathrm{HT}_\lambda(u) = u$ if $|u| > \lambda$, and $\mathrm{HT}_\lambda(u) = 0$ otherwise.
In soft thresholding, $\mathrm{ST}_\lambda(u) = u \max(1 - \lambda/|u|, 0)$.

Algorithm: MCA decomposition algorithm
Task: signal and image decomposition.
Parameters: the signal or image $x$; the dictionary matrix $D = [D_1, \ldots, D_P]$; the number of iterations $N_{\mathrm{iteration}}$; the stopping threshold $\lambda_{\min}$; the threshold update schedule.
Initialization:
  Initialize the solution: $x_k^{(0)} = 0$, $\forall k$.
  Initialize the residual: $r^{(0)} = y$.
  Initialize the threshold: let $k^* = \arg\max_k \|D_k^T y\|_\infty$ and set $\lambda^{(0)} = \max_{k \neq k^*} \|D_k^T y\|_\infty$.
Main iteration:
  For $t = 1$ to $N_{\mathrm{iteration}}$
    For $k = 1$ to $P$
      1) Compute the marginal residual: $r_k^{(t)} = r^{(t-1)} + x_k$.
      2) Update the coefficients of the $k$th morphological component by thresholding: $\alpha_k^{(t)} = \mathrm{TH}_{\lambda^{(t-1)}}(D_k^T r_k^{(t)})$.
      3) Update the $k$th morphological component: $x_k^{(t)} = D_k \alpha_k^{(t)}$.
    4) Update the residual: $r^{(t)} = y - \sum_{k=1}^{P} x_k^{(t)}$.
    5) Update the threshold $\lambda^{(t)}$ according to the given threshold update schedule. If $\lambda^{(t)} \le \lambda_{\min}$, stop the iteration.
Output: the morphological components $(x_k^{(N_{\mathrm{iteration}})})_{k=1,\ldots,P}$.

The MCA algorithm naturally handles data perturbed by additive noise $\varepsilon$ with bounded variance $\sigma_\varepsilon^2 = \mathrm{Var}[\varepsilon] < +\infty$. Because MCA is a coarse-to-fine iterative procedure in terms of coefficient amplitude in the dictionary, additive noise with bounded energy can be handled simply by stopping the iteration when the residual is of the same order of magnitude as the noise. Assuming that the noise variance $\sigma_\varepsilon^2$ is known, the MCA algorithm can be stopped at iteration $t$ when the $\ell_2$-norm of the residual satisfies $\|r^{(t)}\|_2 \le \sqrt{P}\,\sigma_\varepsilon$.

III. DECOMPOSITION OF TEXTURE AND CARTOON LAYERS

The MCA algorithm performs very well in decomposing an image into linearly combined texture and cartoon layers. A real image generally contains both geometry and texture content. The piecewise smooth content of the real image is referred to as the cartoon layer, which carries only the geometric information of the image. The remaining textured content is referred to as the texture layer. Assume that the matrices $D_t$ and $D_c$ are the representation dictionary matrices of the texture layer and the cartoon layer, respectively. For an arbitrary image $x$ containing both layers, which are segmented or superposed, we seek a sparse representation over the combined dictionary consisting of both $D_t$ and $D_c$. If we quantify the sparsity of the coefficients with the $\ell_0$ pseudonorm, then the constrained optimization problem to be solved is

$$\{\alpha_t^{\mathrm{opt}}, \alpha_c^{\mathrm{opt}}\} = \arg\min_{\{\alpha_t, \alpha_c\}} \|\alpha_t\|_0 + \|\alpha_c\|_0 \quad \text{subject to} \quad x = D_t \alpha_t + D_c \alpha_c.$$

Intuitively, it is very desirable to obtain the solution of this constrained optimization problem. The solution would lead to a successful decomposition, with $D_t \alpha_t$ containing the texture layer and $D_c \alpha_c$ containing the cartoon layer. This expectation relies on the assumption that each of the matrices $D_t$ and $D_c$ sparsely represents one layer while being highly ineffective at sparsely representing the other layer.

However, this constrained optimization problem is nonconvex and intractable: its computational complexity grows exponentially with the number of columns in the overall dictionary matrix. The basis pursuit method proposes replacing the $\ell_0$ pseudonorm with the $\ell_1$ norm. The intractable nonconvex problem then becomes a tractable convex problem that can be solved with reduced computational complexity by linear programming:

$$\{\alpha_t^{\mathrm{opt}}, \alpha_c^{\mathrm{opt}}\} = \arg\min_{\{\alpha_t, \alpha_c\}} \|\alpha_t\|_1 + \|\alpha_c\|_1 \quad \text{subject to} \quad x = D_t \alpha_t + D_c \alpha_c.$$
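To make the iteration of Section II-C and the two-dictionary problem above concrete, the following Python sketch runs the coordinate-relaxation loop on a toy 1-D signal. It is only an illustration under stated assumptions: an orthonormal DCT basis stands in for the texture dictionary and the identity (spike) basis stands in for the second dictionary, whereas this letter builds its dictionary by ICA. The function names, the linearly decreasing threshold schedule, and the toy signal are likewise hypothetical, and the marginal residual is recomputed from the current components at every inner step.

```python
import numpy as np

def hard_threshold(u, lam):
    """HT_lam(u): keep entries with |u| > lam, set the rest to zero."""
    return np.where(np.abs(u) > lam, u, 0.0)

def soft_threshold(u, lam):
    """ST_lam(u) = sign(u) * max(|u| - lam, 0), i.e. u * max(1 - lam/|u|, 0) for u != 0."""
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

def dct_basis(n):
    """Orthonormal DCT-II basis; column j is the j-th cosine atom."""
    i = np.arange(n)[:, None]          # sample index
    j = np.arange(n)[None, :]          # frequency index
    basis = np.cos(np.pi * (i + 0.5) * j / n) * np.sqrt(2.0 / n)
    basis[:, 0] /= np.sqrt(2.0)
    return basis

def mca(y, dictionaries, n_iter=100, lam_min=1e-6, threshold=hard_threshold):
    """Coordinate-relaxation MCA for two or more orthonormal dictionaries."""
    parts = [np.zeros_like(y) for _ in dictionaries]
    # Initial threshold: second largest of the per-dictionary maxima of |D_k^T y|.
    peaks = sorted(float(np.max(np.abs(D.T @ y))) for D in dictionaries)
    lam0 = peaks[-2]
    lam = lam0
    for t in range(1, n_iter + 1):
        for k, D in enumerate(dictionaries):
            # Marginal residual: what the other components do not explain.
            r_k = y - sum(p for i, p in enumerate(parts) if i != k)
            alpha_k = threshold(D.T @ r_k, lam)
            parts[k] = D @ alpha_k
        # Assumed schedule: decrease the threshold linearly toward lam_min.
        lam = lam0 - t * (lam0 - lam_min) / n_iter
        if lam <= lam_min:
            break
    return parts

# Toy mixture: two cosine atoms plus two isolated spikes.
n = 128
D_t = dct_basis(n)                     # stand-in "texture" dictionary (assumption)
D_c = np.eye(n)                        # stand-in second dictionary (assumption)
alpha_t = np.zeros(n)
alpha_t[[12, 30]] = [5.0, 3.0]
alpha_c = np.zeros(n)
alpha_c[[20, 90]] = [4.0, -3.5]
y = D_t @ alpha_t + D_c @ alpha_c

x_t, x_c = mca(y, [D_t, D_c])
print("max error, oscillating part:", np.max(np.abs(x_t - D_t @ alpha_t)))
print("max error, spike part:", np.max(np.abs(x_c - D_c @ alpha_c)))
```

Hard thresholding corresponds to the $p = 0$ case; passing `threshold=soft_threshold` gives the $p = 1$ variant, which additionally shrinks every retained coefficient by the current threshold.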


If the real image is perturbed by noise, it cannot be exactly decomposed into sparse texture and cartoon layers. In this case, a noise-aware version of the convex constrained optimization problem solved by the basis pursuit method is used:

$$\{\alpha_t^{\mathrm{opt}}, \alpha_c^{\mathrm{opt}}\} = \arg\min_{\{\alpha_t, \alpha_c\}} \|\alpha_t\|_1 + \|\alpha_c\|_1 \quad \text{subject to} \quad \|x - D_t \alpha_t - D_c \alpha_c\|_2 \le \varepsilon.$$

In this way, the decomposition is only an approximate representation of the real image. The residual error absorbs the content that cannot be represented well by either the texture or the cartoon dictionary matrix. The parameter $\varepsilon$ stands for the noise level in the real image $x$.

IV. SATELLITE IMAGE CLASSIFICATION

The satellite image classification is directly based on the sparse coefficient vector of the texture layer $\alpha_t$ and the sparse coefficient vector of the cartoon layer $\alpha_c$, both of which are obtained from the MCA decomposition. We assume that there are $J$ categories of target objects in a given real image. Each category has $Z$ category labels, $C = \{c_1, \ldots, c_Z\}$, where each category label takes a value $c_i \in \{1, \ldots, L\}$ ($i = 1, \ldots, Z$). The symbol $w_i^j$ ($i = 1, \ldots, Z$; $j = 1, \ldots, J$), which represents the weight of the $i$th label $c_i$ of the $j$th category, is introduced to further boost the classification performance.

For example, if there are three categories in a given real image, then $J = 3$. If each category has two category labels, then $Z = 2$. The category labels of the first, second, and third categories are $\{c_1, c_2\}_1$, $\{c_1, c_2\}_2$, and $\{c_1, c_2\}_3$. Since the extra subscripts would make the expressions less clear, we use $C = \{c_1, \ldots, c_Z\}$ to represent the set of $\{c_1, c_2\}_1$, $\{c_1, c_2\}_2$, and $\{c_1, c_2\}_3$. If the values of the category labels $c_1$ and $c_2$ are chosen from $\{1, \ldots, 10\}$, then $L = 10$. Thus, the symbol $w_2^3$ represents the weight of the second label $c_2$ of the third category.

To quantify the image categories, a set of coefficients $CW = [cw_1, \ldots, cw_J]^T$ is introduced to measure the weight of each selected category:

$$cw_j = \sum_{i=1}^{Z} w_i^j \alpha_t^i + \sum_{i=1}^{Z} w_i^j \alpha_c^i, \qquad j = 1, \ldots, J \tag{5}$$

where $\alpha_t^i$ is the sparse coefficient of the $i$th label of the $j$th category in the texture layer and $\alpha_c^i$ is the sparse coefficient of the $i$th label of the $j$th category in the cartoon layer. By defining a weight matrix $W$ as

$$W = \begin{bmatrix} w_1^1 & \cdots & w_Z^1 \\ \vdots & \ddots & \vdots \\ w_1^J & \cdots & w_Z^J \end{bmatrix} \tag{6}$$

we can reformulate the definition as

$$CW = W_t \alpha_t + W_c \alpha_c. \tag{7}$$

According to the maximum likelihood mechanism, by combining the features from both the texture layer and the cartoon layer, the total probability of classifying the target image $x_{\mathrm{target}}$ into the $j$th category is defined as

$$P_j(x_{\mathrm{target}}) = \frac{\sum_{i=1}^{Z} \max(w_i^j \alpha_t^i, 0) + \sum_{i=1}^{Z} \max(w_i^j \alpha_c^i, 0)}{\sum_{j'=1}^{J} \left[ \sum_{i=1}^{Z} \max(w_i^{j'} \alpha_t^i, 0) + \sum_{i=1}^{Z} \max(w_i^{j'} \alpha_c^i, 0) \right]}. \tag{8}$$

Before the decision rule above is evaluated, all sparse coefficient vectors should be normalized so that the result is less sensitive to the magnitudes of the coefficients.
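The decision rule (8) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, a single weight matrix of the form (6) is applied to both layers, each layer is summarized by one coefficient vector of length Z shared by all categories, and unit l2 normalization is assumed because the text does not specify which normalization is used.

```python
import numpy as np

def normalize(alpha):
    """Scale a coefficient vector to unit l2 norm (assumed normalization)."""
    norm = np.linalg.norm(alpha)
    return alpha / norm if norm > 0 else alpha

def category_probabilities(W, alpha_t, alpha_c):
    """Evaluate Eq. (8) for every category.

    W has shape (J, Z); row j holds the weights w_i^j of category j.
    alpha_t and alpha_c are the texture and cartoon coefficient vectors, shape (Z,).
    """
    a_t = normalize(np.asarray(alpha_t, dtype=float))
    a_c = normalize(np.asarray(alpha_c, dtype=float))
    # Rectified weighted sums over the labels, one score per category.
    scores = np.maximum(W * a_t, 0.0).sum(axis=1) + np.maximum(W * a_c, 0.0).sum(axis=1)
    total = scores.sum()
    if total == 0:
        raise ValueError("all rectified scores are zero; probabilities are undefined")
    return scores / total

# Hypothetical example with J = 3 categories and Z = 2 labels per category.
W = np.array([[1.0, 0.5],
              [0.2, 0.1],
              [0.0, 1.0]])
probs = category_probabilities(W, alpha_t=[3.0, 4.0], alpha_c=[1.0, 0.0])
print(probs)                   # approximately [0.625 0.125 0.25]
print(int(np.argmax(probs)))   # index of the most probable category
```

The rectification max(., 0) discards negative contributions, so every score is nonnegative and the scores sum to one after division by the total.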

Fig. 1. Dictionary constructed by ICA.

Fig. 2. Original satellite image collected from DigitalGlobe.

V. EXPERIMENTS AND ANALYSIS

In this section, we report extensive experiments that evaluate the performance of the proposed method. All experiments are performed on the MATLAB R2011b platform. To reduce random errors, each overall accuracy reported is the average of 15 Monte Carlo runs. In every experiment, the labeled training samples are chosen randomly so that our results and conclusions do not depend on any particular choice of labeled training data. The high-resolution images were collected by three DigitalGlobe satellites: WorldView-1, WorldView-2, and QuickBird. The highest spatial resolution of the satellite images reaches 0.5 m/pixel.

We first test the general classification ability of the proposed method and then focus on comparison experiments with state-of-the-art classification methods. We select some representative methods, namely, the dynamic subspace (DS) method in [6], the homotopy method (HM) in [7], the NN method in [3], the semisupervised (S_S) method in [2], and the SVM in [1]. The SVM is a nonlinear SVM based on Gaussian radial basis kernel functions, with the kernel width parameter set as specified in [1].

We select the satellite image of an island to demonstrate the classification ability of the proposed MCA of texture and cartoon layers (MCA_TC) method. The image has a dimension of 1500 × 750 pixels. To make the experimental analysis more significant from a statistical viewpoint, we mark seven categories of physical scenes that have sufficient samples for labeling and training: Meadow, Racetrack, Road, Roof, Shadow, Tree, and Water. The remaining pixels are classified into the Remaining category.

We construct the dictionary matrix D used in the MCA_TC method by applying the ICA algorithm to the satellite image itself. The dictionary, which is learned from a group of labeled training samples (15% of the total samples), is shown in Fig. 1. The dictionary contains 256 atoms, each arranged as a patch of 16 × 16 pixels. The original satellite image is shown in Fig. 2, and the general classification results on the satellite image are shown in Fig. 3. In this case, 15% of the samples in each category are selected randomly as the labeled training samples, and the remaining samples are used to test the performance. Figs. 2 and 3 show that the proposed MCA_TC algorithm classifies the seven chosen physical scene categories with good accuracy in general.
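The evaluation protocol described above (a random per-category selection of labeled training samples, averaged over 15 Monte Carlo runs) can be sketched as follows. The experiments in this letter were run in MATLAB; this Python sketch only illustrates the protocol. The function names, the toy label list, and the majority-class baseline used in place of a real classifier are all hypothetical.

```python
import random
from collections import Counter, defaultdict

def stratified_split(labels, train_fraction, rng):
    """Randomly pick train_fraction of the sample indices of every category."""
    by_category = defaultdict(list)
    for index, label in enumerate(labels):
        by_category[label].append(index)
    train, test = [], []
    for indices in by_category.values():
        shuffled = indices[:]
        rng.shuffle(shuffled)
        n_train = max(1, round(train_fraction * len(shuffled)))
        train.extend(shuffled[:n_train])
        test.extend(shuffled[n_train:])
    return train, test

def monte_carlo_accuracy(labels, evaluate, train_fraction=0.15, runs=15, seed=0):
    """Average evaluate(train, test) over independent random stratified splits."""
    rng = random.Random(seed)
    scores = [evaluate(*stratified_split(labels, train_fraction, rng))
              for _ in range(runs)]
    return sum(scores) / len(scores)

# Toy data: category names follow the text, sample counts are made up.
labels = ["Meadow"] * 40 + ["Road"] * 20 + ["Water"] * 60

def majority_baseline(train, test):
    """Placeholder classifier: always predict the most common training label."""
    majority = Counter(labels[i] for i in train).most_common(1)[0][0]
    return sum(labels[i] == majority for i in test) / len(test)

print(monte_carlo_accuracy(labels, majority_baseline))
```

Any classifier can be plugged in through the `evaluate` callback, which receives the training and test index lists of one run and returns that run's overall accuracy.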

Fig. 3. Classification results obtained by the MCA_TC algorithm. The categories and their corresponding colors are shown in the legend.

To illustrate the effects of two factors (the number of labeled training samples and the category) on the classification accuracy, we plot the curves shown in Fig. 4. The first conclusion is that constructing the dictionary matrix by applying ICA to the satellite image itself yields strong performance in satellite image classification. Compared with conventional transforms, the bases obtained by ICA can represent the high-order statistical features of satellite images and are more suitable for the MCA decomposition into texture and cartoon layers. Choosing an appropriate dictionary matrix is the key step toward a good sparse decomposition and classification algorithm. Thus, the dictionary matrix constructed by ICA plays an important role in improving the overall accuracy of the MCA_TC algorithm. The second conclusion is that the number of labeled training samples is crucial to the classification accuracy of the MCA_TC method. The experimental data show that, although the classification accuracy of MCA_TC suffers from the increase of the test set and the decrease of the labeled training samples,

the MCA_TC can still obtain satisfactory performance if the training set has more than 15% of the total available samples. This experiment also demonstrates that the MCA_TC method performs differently on physical scenes from different categories. In Fig. 4, MCA_TC clearly has the best category accuracy for trees and meadows, while the category accuracy for roads is the worst. The category accuracy of the other four categories is close to the overall accuracy. The texture and cartoon features of trees and meadows are relatively distinctive. By contrast, the buses, cars, pedestrians, etc., on the road act as noise in the satellite image of the road category. In this respect, the experimental results are reasonable and explainable.

Fig. 4. Effect of two factors on the classification accuracy.

In the comparison experiment, we analyze the effect of the number of labeled training samples on the classification accuracy. Ten different situations are tested, i.e., 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, and 50% of the total available samples of each category are randomly selected as the labeled training samples. The remaining samples are used for evaluation. The classification performances of MCA_TC, DS, HM, NN, S_S, and SVM are plotted in Fig. 5. As can be seen from Fig. 5, the overall accuracy of the proposed MCA_TC approach is quite good when labeled training samples are scarce. When the labeled training samples account for only 1% of all samples, the overall accuracies achieved by the MCA_TC method and the HM method are 67.74% and 62.18%, respectively, which are better than those of the DS, NN, S_S, and SVM methods. As the number of labeled training samples increases, the classification performance of all six methods improves as expected. With few labeled training samples in each category, the classification results of the NN method are slightly better than those of the DS approach, but as the number of labeled training samples grows, the NN method is overtaken by the DS method. In all cases, the proposed MCA_TC method performs better than the compared methods. The overall accuracy of the HM approach becomes comparable to that of the proposed MCA_TC approach as more labeled samples are used for training. However, when there are fewer labeled samples, the overall accuracy of the HM approach decreases faster than that of the MCA_TC approach. As noted in Section I, the lack of a sufficient number of labeled training samples is a general problem in pattern classification, and it is even more serious in remote sensing applications because identifying and labeling the training samples is extremely difficult and expensive, and sometimes not even feasible. Thus, the capability to classify with few labeled training samples is of great value in applications.

Fig. 5. Classification results of MCA_TC, DS, HM, NN, S_S, and SVM.

VI. CONCLUSION

In this letter, we have presented a high-resolution satellite image classification method based on MCA. The MCA algorithm is designed as an iterative process that is straightforward to implement on a computer. The dictionary matrix is constructed by ICA. We quantify the sparsity of the morphological coefficient vector by quasi-norms and transform the underdetermined system of equations into a constrained optimization problem. By applying the MCA algorithm, we decompose the high-resolution satellite image into texture and cartoon layers. By combining the features from both layers, the total probability of classifying the target image is calculated according to the maximum likelihood mechanism. Experimental results have shown the capabilities and performance of the proposed classification algorithm. To achieve the same classification accuracy, the MCA_TC method needs fewer labeled training samples and less learning time than conventional methods. Quantitative analysis and comparisons with classic image classification algorithms show that the satellite image classification method based on MCA_TC has better accuracy and efficiency, and is better suited to increasingly demanding classification requirements, than most state-of-the-art classification algorithms.

REFERENCES

[1] F. Melgani and L. Bruzzone, “Classification of hyperspectral remote sensing images with support vector machines,” IEEE Trans. Geosci. Remote Sens., vol. 42, no. 8, pp. 1778–1790, Aug. 2004.
[2] G. Camps-Valls, T. V. Bandos Marsheva, and D. Zhou, “Semi-supervised graph-based hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 45, no. 10, pp. 3044–3054, Oct. 2007.
[3] J. Yang, P. Yu, and B. Kuo, “A nonparametric feature extraction and its application to nearest neighbor classification for hyperspectral image data,” IEEE Trans. Geosci. Remote Sens., vol. 48, no. 3, pp. 1279–1293, Mar. 2010.
[4] D. X. Dai and W. Yang, “Satellite image classification via two-layer sparse coding with biased image representation,” IEEE Geosci. Remote Sens. Lett., vol. 8, no. 1, pp. 173–176, Jan. 2011.
[5] Q. S. Haq, L. M. Tao, F. C. Sun, and S. Q. Yang, “A fast and robust sparse approach for hyperspectral data classification using a few labeled samples,” IEEE Trans. Geosci. Remote Sens., vol. 50, no. 6, pp. 2287–2302, Jun. 2012.
[6] J.-M. Yang, B.-C. Kuo, P.-T. Yu, and C.-H. Chuang, “A dynamic subspace method for hyperspectral image classification,” IEEE Trans. Geosci. Remote Sens., vol. 48, no. 7, pp. 2840–2853, Jul. 2010.
[7] Q. S. ul Haq, L. Shi, L. Tao, and S. Yang, “Hyperspectral data classification via sparse representation in homotopy,” in Proc. IEEE Int. Conf. Inf. Sci. Eng., Dec. 2010, pp. 3748–3752.