The 22nd Iranian Conference on Electrical Engineering (ICEE 2014), May 20-22, 2014, Shahid Beheshti University
Hyperspectral Image Classification Based on Spatial Graph Kernel

Mostafa Borhani, Hassan Ghassemian
Faculty of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran
[email protected], [email protected]
Abstract—This paper proposes a new strategy for spectral-spatial hyperspectral image classification. The proposed strategy concentrates on a spatial graph kernel and automatically detected "outstanding" spatial structures. The contribution of this paper is the analysis of probabilistic classification results to select the most reliably classified pixels as outstanding points of spatial regions. Experiments with four datasets (Indiana Pine, Hekla, University of Pavia and Centre of Pavia) demonstrate the advantages of the proposed method in hyperspectral remote sensing applications. From the empirical results, we conclude that the proposed approach meaningfully decreases oversegmentation, improves classification accuracies and provides classification maps with more homogeneous regions.

Keywords—Remote Sensing; Hyperspectral; Spatial Graph Kernel; Minimum Spanning Forest; Spectral-Spatial Classification; Outstanding Points; Probabilistic SVM; Majority Voting; Indiana Pine; Hekla; University and Centre of Pavia
I. INTRODUCTION

Recently, many approaches have been directed toward the use of spatial information to refine spectral-based classifiers for remotely sensed data. The machine learning algorithms used for this task fall into five main groups: Naïve Bayes [1] [2], k-Nearest Neighbors [3] [4], Random Forests (RF) [5], Support Vector Machines (SVM) [6] [7], and Artificial Neural Networks [8]. Spectral-spatial kernels for hyperspectral image classification, e.g., composite [9], morphological [10], and graph kernels [11], have also been introduced recently to improve the SVM classifier. Kernel-based methods have given good results in terms of accuracy for classifying hyperspectral images [9]–[14]. As a main contribution of this paper, the proposed strategy concentrates on techniques to reduce over-segmentation in a hyperspectral image [10], achieved by automatically "true labelling" the meaningful spatial structures before performing an outstanding-controlled segmentation. An important contribution consists in analysing probabilistic classification results to select the most reliably classified pixels as representer points of spatial regions. Several representer selection approaches are proposed, using either individual classifiers or a multiple classifier system. We then develop different approaches for an outstanding-point-based spatial graph kernel classifier, using either a probabilistic SVM or Multiple Spatial Spectral Classifiers [22] [23], followed by a Minimum Spanning Forest. Section 2 discusses the proposed classification scheme, using two proposed representer selection procedures and a spatial graph kernel, in particular a Minimum Spanning Forest [19] followed by optional majority voting for decision fusion. The classification approaches using automatically selected outstanding points are then evaluated on real remote sensing datasets in Section 3. Finally, conclusions are outlined in Section 4.

II. THE PROPOSED APPROACH

The goal of the proposed approach is to improve hyperspectral classification accuracy. This can be achieved by incorporating additional spatial knowledge into the spectral information. We propose to use outstanding points for this purpose. Our approach determines outstanding points for each region and then splits the image in such a way that each region in a pre-classification map is grown from an outstanding point. In order to obtain an accurate spectral-spatial classification map, we must select an indicator for each spatial object in the image. The proposed approach is shown in Figure 1. It has three significant blocks: 1. outstanding point selection, 2. construction of a spatial graph kernel, and 3. an optional decision fusion step.
978-1-4799-4409-5/14/$31.00 ©2014 IEEE
Figure 1. The block diagram of the proposed approach

A. Outstanding Points Selection

This section presents two proposed representer selection procedures. The first uses probabilistic SVM results to choose the most reliable pixels as outstanding points. For a B-band hyperspectral image, the selection proceeds in the following steps. The first step consists in performing a pixel-wise classification of the hyperspectral image. We propose to use an SVM classifier for this purpose, which is extremely well suited to classifying hyperspectral images. This step results in a classification map (where each pixel contains the label of the class it is assigned to) and a probability map (if a particular pixel is assigned to class k, the probability map contains a probability estimate for this pixel to actually belong to class k). Looking at the results of a pixel-wise classification for the considered image, we can see that there are parts of the image with a high probability of correct classification and other parts where the classification results are less reliable.
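As a minimal sketch, the pixel-wise step can be illustrated with scikit-learn (the paper does not name a library; the toy cube, class layout, and all variable names here are our own assumptions):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy "hyperspectral" cube: 10 x 10 pixels, 5 bands. The left and right
# halves carry different spectral signatures (two classes).
h, w, bands = 10, 10, 5
cube = rng.normal(0.0, 0.3, size=(h, w, bands))
cube[:, : w // 2, 0] += 2.0                                     # class-0 signature
truth = np.repeat([[0, 1]], h, axis=0).repeat(w // 2, axis=1)   # ground truth

# A handful of labelled training pixels (flattened indexing).
X = cube.reshape(-1, bands)
y = truth.ravel()
train = rng.choice(h * w, size=40, replace=False)

# Probabilistic SVM: probability=True enables Platt scaling, producing the
# per-class probability estimates the selection step relies on.
svm = SVC(kernel="rbf", probability=True, random_state=0).fit(X[train], y[train])

class_map = svm.predict(X).reshape(h, w)                    # label per pixel
prob_map = svm.predict_proba(X).max(axis=1).reshape(h, w)   # reliability per pixel
```

The two arrays `class_map` and `prob_map` correspond to the classification map and probability map described above.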
The aim of the next step is to choose the most reliably classified pixels, in order to define suitable outstanding points. We propose to analyse the obtained classification and probability maps for this purpose:

- First, we perform a connected components labelling of the classification map.
- Then, we analyse each connected component:
  - If it is large (in our experiments, if it contains more than 20 pixels), we consider it a relevant region (because the classifier has assigned a large group of adjacent pixels to the same class). Therefore, it must contain an indicator. We determine the indicator as the 5% of pixels within this component with the highest probabilities.
  - If a region is small, we investigate whether its pixels were classified with a high probability. If this is the case, the region represents a small spatial structure. Its tagged pixels with high accuracy potential are the pixels with probability estimates higher than a defined threshold. We have shown that the method is robust to the selection of this threshold.
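The selection rules above can be sketched as follows (an illustrative implementation, not the authors' code; the function name is ours, and `min_size`, `top_frac` and `thresh` mirror the paper's example values of 20 pixels, 5% and a probability threshold):

```python
import numpy as np
from scipy import ndimage

def outstanding_points(class_map, prob_map, min_size=20, top_frac=0.05, thresh=0.9):
    """Mark the most reliably classified pixels as outstanding points."""
    marker = np.zeros(class_map.shape, dtype=bool)
    for cls in np.unique(class_map):
        # Connected components labelling, one class at a time.
        labels, n = ndimage.label(class_map == cls)
        for comp in range(1, n + 1):
            mask = labels == comp
            probs = prob_map[mask]
            if mask.sum() > min_size:
                # Large region: keep the top 5 % most confident pixels.
                k = max(1, int(np.ceil(top_frac * probs.size)))
                cutoff = np.sort(probs)[-k]
                marker[mask & (prob_map >= cutoff)] = True
            else:
                # Small region: keep only high-probability pixels, if any.
                marker[mask & (prob_map > thresh)] = True
    return marker
```

For a map with two 50-pixel regions, this keeps ceil(0.05 x 50) = 3 pixels per region (more if probabilities tie at the cutoff).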
The drawback of the presented method is that the choice of outstanding points strongly depends on the performance of the selected probabilistic pixelwise classifier (in our proposal, an SVM classifier). Our next objective is to mitigate this dependence. This can be achieved by using not a single classification algorithm for representer selection, but rather multiple classifiers. Thus, we propose a new representer selection approach based on a multiple classifier system, which consists of two steps:

- First, several classifiers are used independently to classify the image.
- Then, an indicator map is constructed: for every pixel, if all the classifiers agree, the pixel is considered reliably classified and is kept in the map of outstanding points.

Furthermore, we propose to include spatial information in the representer selection procedure. The proposed approach consists of the following steps:

- First, unsupervised image segmentation is performed. Segmentation methods based on different principles must be chosen. We have investigated the use of three techniques: Watershed segmentation [10], segmentation by Expectation Maximization (EM) [20] and segmentation using the Hierarchical SEGmentation (HSEG) [21] method.
- Then, pixelwise classification is applied. We propose to use an SVM method for classifying the hyperspectral image.
- Then, each of the obtained unsupervised segmentation maps is combined with the pixelwise classification map using the majority voting principle: for every region in the segmentation map, all the pixels are assigned to the most frequent class within this region. Different segmentation methods based on dissimilar principles lead to different classification results, and the use of spectral-spatial classifiers yields more accurate classification maps when compared to those obtained by pixelwise techniques.
- Finally, we construct a map of outstanding points by selecting the pixels assigned by all the classifiers to the same class.

The next step is to use the obtained map of outstanding points for agent-based region growing.

B. Construction of a Spatial Graph Kernel

Here, we discuss the proposed graph-based approach, which consists in the construction of a minimum spanning forest, where each tree is rooted on one outstanding point. First, we map the image onto a graph:

- Each pixel is considered as a vertex of an undirected graph. Each edge of this graph connects a pair of vertices corresponding to neighbouring samples (in the experiments of this paper, an 8-proximity area is used).
- Furthermore, a weight is assigned to each edge, indicating the degree of dissimilarity between the two pixels connected by this edge. For instance, the Spectral Angle Mapper distance between pixel vectors $x_i$ and $x_j$ of a $B$-band image can be used as a dissimilarity measure:

$$\mathrm{SAM}(x_i, x_j) = \arccos\left(\frac{\sum_{b=1}^{B} x_{ib}\, x_{jb}}{\sqrt{\sum_{b=1}^{B} x_{ib}^{2}}\, \sqrt{\sum_{b=1}^{B} x_{jb}^{2}}}\right) \qquad (1)$$

Since we have a map of outstanding points, each identified pixel is associated with the corresponding true tagged pixel.

The Minimum Spanning Forest as a Spatial Graph Kernel: given a graph G, a Minimum Spanning Forest rooted on m vertices is a:

- non-connected graph without cycles,
- which contains all the vertices of G,
- which consists of connected subgraphs, where each subgraph (a tree) contains one root,
- and whose sum of edge weights is minimal (among all possible spanning forests).

In order to obtain the MSF rooted on our outstanding points, m additional vertices corresponding to the m outstanding points are introduced (one extra vertex per agent). The procedure of the construction of an MSF is a region growing method, which consists of the following steps:
- First, the m roots are chosen to belong to the forest.
- Then, at each iteration, we choose the edge of the modified graph with the minimal weight, such that one vertex adjacent to this edge is in the forest and the other is not.
- We add this pixel and edge to the forest (in other words, at each iteration a new pixel is added to the segmentation map, such that the dissimilarity criterion between this pixel and one of the pixels already belonging to the map is minimal).
- We go to the next iteration, until all the vertices belong to the forest.
- Finally, we assign the class of each labelled point with high accuracy potential to all the pixels grown from it.
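The construction steps above amount to a multi-seed, Prim-style growth over the pixel graph. A sketch using the SAM weight of Eq. (1) and 8-proximity connectivity follows (an illustrative implementation under our own naming, not the authors' code):

```python
import heapq
import numpy as np

def sam(a, b):
    # Spectral Angle Mapper distance of Eq. (1).
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def msf_grow(cube, seeds):
    """Grow a minimum spanning forest rooted on seed pixels.

    cube  : (h, w, b) hyperspectral image.
    seeds : (h, w) int map of root labels, -1 for unlabelled pixels.
    Returns an (h, w) label map where every pixel inherits the label of
    the root whose tree absorbed it.
    """
    h, w, _ = cube.shape
    out = seeds.copy()
    heap = []
    nbrs = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1) if (di, dj) != (0, 0)]

    def push_edges(i, j):
        # Push all frontier edges from pixel (i, j) to unassigned neighbours.
        for di, dj in nbrs:
            ni, nj = i + di, j + dj
            if 0 <= ni < h and 0 <= nj < w and out[ni, nj] == -1:
                heapq.heappush(heap, (sam(cube[i, j], cube[ni, nj]), ni, nj, out[i, j]))

    for i, j in zip(*np.nonzero(seeds >= 0)):   # roots enter the forest first
        push_edges(i, j)
    while heap:                                 # smallest-weight frontier edge first
        _, i, j, lbl = heapq.heappop(heap)
        if out[i, j] == -1:                     # pixel not yet in the forest
            out[i, j] = lbl
            push_edges(i, j)
    return out
```

Because the globally smallest frontier edge is always added first, each tree grows through spectrally similar pixels before any high-dissimilarity (class-boundary) edge is taken.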
C. Post-processing

In order to make the proposed classification scheme more robust, we propose an optional post-processing step:

- We consider the obtained spectral-spatial classification map as a segmentation map. For this purpose, we apply connected component labelling (using four-proximity connectivity) to get a map of regions.
- Then, for every connected component, all the pixels are assigned to the class that is most frequent in the pixel-wise classification map within this region.

The proposed post-processing scheme is shown as the red box in Figure 1. The classification maps obtained by the proposed method contain much more homogeneous regions when compared to a pixel-wise map.
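The two post-processing steps above can be sketched as follows (a hedged illustration with our own function name; four-proximity connectivity follows the text):

```python
import numpy as np
from scipy import ndimage

def majority_vote_postprocess(ss_map, pixelwise_map):
    """Reassign every 4-connected region of the spectral-spatial map to the
    most frequent class of the pixel-wise map inside that region."""
    out = ss_map.copy()
    four = ndimage.generate_binary_structure(2, 1)      # 4-proximity connectivity
    for cls in np.unique(ss_map):
        # Connected component labelling of the spectral-spatial map.
        labels, n = ndimage.label(ss_map == cls, structure=four)
        for comp in range(1, n + 1):
            mask = labels == comp
            votes = np.bincount(pixelwise_map[mask])
            out[mask] = votes.argmax()                  # majority class wins
    return out
```

A region that the spectral-spatial map labels as one class but the pixel-wise map predominantly labels as another is flipped wholesale, which is what produces the more homogeneous final maps.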
III. EXPERIMENTAL RESULTS
To assess the performance of the proposed algorithm, we examine the classification accuracies. Figure 6 reports the accuracies of a pixel-wise k-Nearest Neighbour classifier (k = 3), a Maximum Likelihood classifier, an SVM, the Extraction and Classification of Homogeneous Objects (ECHO) [15] and Extended Morphological Profile (EMP) [14] classifiers, as well as the accuracies of the proposed outstanding-point spatial-graph-based classification: using either a probabilistic SVM [16] or the multiple spectral-spatial classifier approach for selecting outstanding points, and a minimum spanning forest for the region growing (here also with post-processing). All our proposed approaches yield higher accuracies than the pixelwise SVM or the ECHO classification. The proposed strategy gives the best classification results; in particular, it is of great interest to use a Multiple Spectral-Spatial Classifier approach for representer selection, a minimum spanning forest for spatial graph kernel-based region growing, and majority voting as the post-processing block.

Our experiments are based on four available hyperspectral remotely sensed images, conventionally used in previous works [9] [10] [11] [14]. We do not list all of the parameters used for the different approaches; most of them are chosen either to make each approach as accurate as possible or to match the parameters of the previous works for comparison. Figures 2-5 present classification maps for the AVIRIS Indian Pines [17], Hekla [18], University of Pavia and Centre of Pavia images, using a pixelwise SVM classification and two results obtained by selecting outstanding points with either the SVM or the multiple spectral-spatial classifier approach, and then constructing a spatial graph kernel. All the spectral-spatial classification maps contain much more homogeneous regions when compared to a pixelwise map. The drawback typical of most spectral-spatial classification approaches is that they smooth the classification map, and in some parts of the image they may smooth it too much: a region may consist of a mix of different classes, which can be guessed from the pixelwise classification map, yet spectral-spatial techniques assign this region (or large parts of it) to a single class.

Figure 6 summarizes the accuracies; the proposed strategy improves the overall accuracy by more than 14 percentage points on the Indiana Pine dataset, when compared to a pixel-wise classification. Looking at the corresponding classification accuracies, we can draw similar conclusions: the proposed strategy, consisting of multiple spectral-spatial classifier representer selection followed by the construction of a minimum spanning forest, yields the best global accuracy and most of the class-specific accuracies. We have tested the proposed approaches on four conventional hyperspectral datasets, and the corresponding results can be found in Figures 2-6. In all cases, our proposed approaches achieve considerably better overall accuracy (OA), average accuracy (AA) and kappa coefficient than the others.
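The three reported metrics can be computed from a confusion matrix as in the following sketch (standard definitions; the function name is ours):

```python
import numpy as np

def accuracy_metrics(conf):
    """OA, AA and the kappa coefficient from a confusion matrix
    (rows = reference classes, columns = predicted classes)."""
    conf = np.asarray(conf, dtype=float)
    n = conf.sum()
    oa = np.trace(conf) / n                                   # overall accuracy
    aa = np.mean(np.diag(conf) / conf.sum(axis=1))            # mean per-class accuracy
    pe = np.sum(conf.sum(axis=0) * conf.sum(axis=1)) / n**2   # chance agreement
    kappa = (oa - pe) / (1.0 - pe)                            # agreement beyond chance
    return oa, aa, kappa
```

For a two-class matrix [[9, 1], [2, 8]], this gives OA = AA = 0.85 and kappa = 0.7.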
IV. CONCLUSION
This section concludes by recalling the main contributions of this paper. A new strategy for spectral-spatial classification of hyperspectral data has been proposed and investigated. The proposed strategy concentrates on techniques to reduce oversegmentation in hyperspectral images, achieved by automatically marking "outstanding" spatial structures of interest before performing an indicator-controlled segmentation. An important contribution consists in analysing probabilistic classification results to select the most reliably classified pixels as outstanding points of spatial regions. We have demonstrated the interest of using spatial information and multiple classifier approaches for representer selection, and of using a spatial graph kernel-based approach for outstanding-point-controlled region growing. Finally, we applied and adapted the proposed method to the analysis of four datasets (Indiana Pine [17], Hekla [18], University of Pavia and Centre of Pavia) in hyperspectral remote sensing applications. From the experimental results, we conclude that:

- The proposed spectral-spatial classification approach, using automatically selected outstanding points:
  - meaningfully decreases oversegmentation,
  - improves classification accuracies,
  - and provides classification maps with more homogeneous regions.
- For the representer selection step, it is advantageous to use an SVM classifier, spatial information and multiple classifier approaches.
- The spatial graph kernel for agent-controlled region growing has proven to be an efficient and robust technique.
Figure 2: Classification maps for the Indian Pines image. (a) ML. (b) SVM. (c) ECHO. (d) ML+ Spatial Graph Kernel. (e) ML+ Spatial Graph Kernel + Majority Voting. (f) Watershed Segmentation + Majority Voting. (g) EM+ Majority Voting. (h) RHSEG+ Majority Voting. (k) SVM+ Spatial Graph Kernel. (m) SVM+ Spatial Graph Kernel + Majority Voting. (n) Multiple Classifier - Spatial Graph Kernel. (o) Multiple Spatial Spectral Classifiers + Spatial Graph Kernel.
Figure 3: Classification maps for the Center of Pavia image. (a) SVM. (b) SVM+ Spatial Graph Kernel + Majority Voting. (c) Multiple Classifier+ Spatial Graph Kernel. (d) Multiple Spatial Spectral Classifiers + Spatial Graph Kernel.
Figure 4: Classification maps for the University of Pavia image. (a) SVM. (b) ECHO. (c) Watershed Segmentation + Majority Voting. (d) EM + Majority Voting. (e) RHSEG + Majority Voting. (f) SVM + Spatial Graph Kernel + Majority Voting. (g) Multiple Classifier + Spatial Graph Kernel. (h) Multiple Spatial Spectral Classifiers + Spatial Graph Kernel.
Figure 5: Classification maps for the Hekla image. (a) SVM. (b) ECHO. (c) Watershed Segmentation + Majority Voting. (d) EM + Majority Voting. (e) RHSEG + Majority Voting. (f) SVM + Spatial Graph Kernel + Majority Voting. (g) Multiple Classifier + Spatial Graph Kernel. (h) Multiple Spatial Spectral Classifiers + Spatial Graph Kernel.
Figure 6 accuracy data (% accuracy):

Indiana Pine
        3-NN    ML      SVM     ECHO    SVM-MSF  SVM-MSF+MV  MC-MSF  MSSC-MSF
OA      66.27   75.41   78.17   82.64   88.41    91.8        86.66   92.32
AA      76.77   79.61   85.97   83.75   91.57    94.28       92.58   94.22
Kappa   62.04   72.25   75.33   80.38   86.71    90.64       84.82   91.19

Hekla
        3-NN    ML      SVM     ECHO    SVM-MSF  SVM-MSF+MV  MC-MSF  MSSC-MSF
OA      90.17   96.18   88.56   96.63   90.34    98.96       99.08   98.41
AA      86.6    96.99   89.44   97.67   94.89    98.45       99.07   97.52
Kappa   88.64   95.59   86.91   96.12   89.04    98.8        98.93   98.16

University of Pavia
        3-NN    ML      SVM     ECHO    EMP     SVM-MSF  SVM-MSF+MV  MC-MSF  MSSC-MSF
OA      68.38   79.06   81.01   87.58   85.22   84.14    91.08       87.98   97.9
AA      77.21   84.85   88.25   92.16   90.76   92.35    94.76       92.05   98.59
Kappa   59.85   72.9    75.86   83.9    80.86   79.71    88.3        84.32   97.18

Center of Pavia
        3-NN    ML      SVM     ECHO    EMP     SVM-MSF  SVM-MSF+MV  MC-MSF  MSSC-MSF
OA      -       90.3    95.75   95.64   96.22   96.37    96.62       97.04   97.78
AA      -       80.51   91.13   90.6    92.47   92.55    92.78       94.34   94.82
Kappa   -       83.9    92.91   92.71   93.7    93.93    94.35       95.04   96.28

Figure 6: Classification accuracy for various images. (a) SVM. (b) ECHO. (c) WH + Majority Voting. (d) EM + Majority Voting. (e) RHSEG + Majority Voting. (f) SVM + Spatial Graph Kernel + Majority Voting. (g) Multiple Classifier + Spatial Graph Kernel. (h) Multiple Spatial Spectral Classifiers + Spatial Graph Kernel.

REFERENCES

[1] R. J. Henery, "Classification," in D. Michie, D. J. Spiegelhalter, and C. C. Taylor (Eds.), Machine Learning, Neural and Statistical Classification, Ellis Horwood, New York, pp. 6–16, 1994.
[2] I. Guyon, "A practical guide to model selection," in J. Marie (Ed.), Proceedings of the Machine Learning Summer School, Canberra, Australia, Springer, p. 37, 2009.
[3] T. Cover and P. Hart, "Nearest neighbor pattern classification," IEEE Trans. Inf. Theory, vol. 13, pp. 21–27, 1967.
[4] E. Fix and J. L. Hodges, "Discriminatory analysis. Nonparametric discrimination: consistency properties," U.S. Air Force, Texas, 1951.
[5] L. Breiman, "Random forests," Mach. Learn., vol. 45, pp. 5–32, 2001.
[6] V. N. Vapnik, Statistical Learning Theory, John Wiley & Sons, New York, USA, 1998.
[7] C.-W. Hsu, C.-C. Chang, and C.-J. Lin, "A practical guide to support vector classification," Department of Computer Science, National Taiwan University, Taipei, Taiwan, 2010.
[8] T. Hastie, R. Tibshirani, and J. H. Friedman, The Elements of Statistical Learning: Data Mining, Inference and Prediction, 2nd ed., Springer, New York, USA, 2009.
[9] G. Camps-Valls, L. Gomez-Chova, J. Munoz-Mari, J. Vila-Frances, and J. Calpe-Maravilla, "Composite kernels for hyperspectral image classification," IEEE Geosci. Remote Sens. Lett., vol. 3, no. 1, pp. 93–97, Jan. 2006.
[10] M. Fauvel, J. Chanussot, and J. A. Benediktsson, "A spatial–spectral kernel-based approach for the classification of remote-sensing images," Pattern Recognit., vol. 45, no. 1, pp. 381–392, Jan. 2012.
[11] G. Camps-Valls, N. Shervashidze, and K. M. Borgwardt, "Spatio-spectral remote sensing image classification with graph kernels," IEEE Geosci. Remote Sens. Lett., vol. 7, no. 4, pp. 741–745, Oct. 2010.
[12] A. Plaza, P. Martinez, J. Plaza, and R. Perez, "Dimensionality reduction and classification of hyperspectral image data using sequences of extended morphological transformations," IEEE Trans. Geosci. Remote Sens., vol. 43, no. 3, pp. 466–479, Mar. 2005.
[13] M. Dalla Mura, A. Villa, J. A. Benediktsson, J. Chanussot, and L. Bruzzone, "Classification of hyperspectral images by using extended morphological attribute profiles and independent component analysis," IEEE Geosci. Remote Sens. Lett., vol. 8, no. 3, pp. 542–546, May 2011.
[14] J. A. Benediktsson, J. A. Palmason, and J. R. Sveinsson, "Classification of hyperspectral data from urban areas based on extended morphological profiles," IEEE Trans. Geosci. Remote Sens., vol. 43, no. 3, pp. 480–491, Mar. 2005.
[15] R. L. Kettig and D. A. Landgrebe, "Classification of multispectral image data by extraction and classification of homogeneous objects," IEEE Trans. Geosci. Electron., vol. 14, no. 1, pp. 19–26, Jan. 1976.
[16] J. Platt, "Probabilities for support vector machines," in Advances in Large Margin Classifiers, A. Smola, P. Bartlett, B. Schölkopf, and D. Schuurmans, Eds., Cambridge, MA: MIT Press, 2000, pp. 61–74.
[17] AVIRIS NW Indiana's Indian Pines 1992 data set [Online]. Available: ftp://ftp.ecn.purdue.edu/biehl/MultiSpec/92AV3CS
[18] J. A. Benediktsson and I. Kanellopoulos, "Classification of multisource and hyperspectral data based on decision fusion," IEEE Trans. Geosci. Remote Sens., vol. 37, no. 3, pp. 1367–1377, May 1999.
[19] J. Stawiaski, "Mathematical morphology and graphs: application to interactive medical image segmentation," Ph.D. dissertation, Paris School of Mines, Paris, France, 2008.
[20] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society, Series B, vol. 39, no. 1, pp. 1–38, 1977.
[21] J. C. Tilton, "Analysis of hierarchically related image segmentations," in Proc. IEEE Workshop on Advances in Techniques for Analysis of Remotely Sensed Data, pp. 60–69, 2003.
[22] M. Borhani and H. Ghassemian, "Novel spatial approaches for classification of hyperspectral remotely sensed landscapes," in Proc. Symposium on Artificial Intelligence and Signal Processing, 2013.
[23] M. Borhani and H. Ghassemian, "Hyperspectral image classification based on spectral-spatial features using probabilistic SVM and locally weighted Markov random fields," in Proc. Iranian Conference on Intelligent Systems (ICIS), pp. 1377–1382, 2014.