Building Collapse Assessment in Urban Areas Using Texture Information From Postevent SAR Data

Weidong Sun, Lei Shi, Member, IEEE, Jie Yang, and Pingxiang Li, Member, IEEE
Abstract—A number of earthquakes have occurred in recent years, posing a challenge to Earth observation techniques. Owing to its all-weather response capability, synthetic aperture radar (SAR) has become a key tool for collapse interpretation. As the requirement for multitemporal data is usually not satisfied in practice, interpretation using only postevent SAR imagery is indispensable for emergency rescue. Although texture has been found to be related to building damage, only a few texture measures have been adopted to date, usually with simple threshold criteria. To fully explore the spatial contextual information in very high resolution (VHR) SAR images, two questions should be discussed: 1) Which textural features are helpful for collapse assessment? 2) How do diverse imaging configurations influence the interpretation? In response, five texture descriptors are used in this paper to extract texture over the stricken area, and a random forests classifier is applied to identify the building collapse level. It was found that most of the gray-level histogram features perform quite well, and several other primitive features, such as the isolated bright points originating from scattered rubble, can also help to discriminate the different collapse levels. Moreover, the experiments with data from the Yushu earthquake of April 14, 2010, indicated that spatial detail quality is the key for texture interpretation, and the span image can be considered an appropriate choice when VHR PolSAR data are available. The optimal interpretation results (with an overall accuracy of 84.7%) were obtained using the 122 proposed measures in a VHR X-band image.

Index Terms—Building collapse, earthquake, postevent, random forests (RFs), synthetic aperture radar (SAR), texture.
Manuscript received November 10, 2015; revised January 7, 2016 and April 22, 2016; accepted June 1, 2016. Date of publication June 29, 2016; date of current version August 24, 2016. This work was supported in part by the Public Welfare Project of Surveying and Mapping under Grant 201412002, in part by the National Natural Science Foundation of China under Grant 41501382, Grant 91438203, and Grant 61371199, in part by the Hubei Provincial Natural Science Foundation of China under Grant 2015CFB328, in part by the Hong Kong Scholars Program, and in part by the National Administration of Surveying, Mapping and Geo-Information through the Key Laboratory of Geo-Informatics under Grant 201406. (Corresponding author: Weidong Sun.)

W. Sun, J. Yang, and P. Li are with the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China (e-mail: [email protected]; [email protected]; [email protected]).

L. Shi is with the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China, and also with the Department of Land Surveying and Geo-Informatics, Hong Kong Polytechnic University, Kowloon, Hong Kong (e-mail: [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org

Digital Object Identifier 10.1109/JSTARS.2016.2580610

I. INTRODUCTION

A number of natural disasters have occurred worldwide in recent years, including earthquakes and possible consequent tsunamis in the Indian Ocean (2004), Peru (2007), Haiti (2010), and Chile (2010), as well as the earthquakes in Wenchuan (2008), Yushu
(2010), and Ludian (2014) in China. Owing to their unpredictability and destructiveness, earthquakes pose a serious threat to people's lives and property. Once a major disaster occurs, damage assessment and rapid observation of the stricken areas are crucial for the subsequent rescue operations, posing a severe challenge to Earth observation remote sensing techniques.

In recent years, optical sensors and synthetic aperture radar (SAR) have been widely used to quickly investigate the destruction over large areas after an earthquake. On the one hand, optical images have been applied to detect damaged buildings and assess the extent of damage [1]–[4]. Unfortunately, night-time acquisition and severe weather often limit the use of optical images in practice (e.g., after the Yushu earthquake). On the other hand, thanks to the unique characteristics of microwaves, SAR sensors have a nearly all-weather capability and can work day and night [5]. Because backscattering and phase information are sensitive to ground changes [6], building damage can be detected from changes in intensity [7], [8] and coherence [9], [10] between pre- and postevent SAR images. Moreover, combined analyses of optical and SAR images have been used to explore damage information in detail [8], [11], [12].

Although the results of the above methods have attained a certain reliability, the requirement for multitemporal or multisource data sometimes cannot be satisfied, especially for very high resolution (VHR) pre- and postevent airborne images. To avoid this dilemma, a number of researchers have started to focus on earthquake damage assessment using only postevent SAR data [5]. Balz and Liao [13] preliminarily analyzed the spatial characteristics of damaged buildings in postevent SAR images after the Wenchuan earthquake of 2008. Brunner et al. [14] later showed that an increased resolution could support a more reliable identification of damaged buildings, allowing them to classify the buildings into several classes in decimeter-resolution SAR images. This study pointed out a possible way to explore spatial contextual information, but did not give a detailed and quantitative analysis. Lately, encouraged by the suggestion that polarimetric parameters such as the polarization orientation angle and entropy could help to distinguish damaged buildings [15]–[17], Zhao et al. [6] made use of postevent polarimetric SAR (PolSAR) data from Yushu to assess the damage based on a damage-level index. In their decision process, only one specific gray-level co-occurrence matrix (GLCM) measure was adopted, to correct the inaccuracy caused by oblique buildings, which indicates the potential power of texture information; however, only a simple threshold criterion was used, which does not make the best use of the spatial contextual information. In our recent research [18], three SAR image information types, namely, polarimetric, interferometric,
and textural feature sets, were investigated, and it was found that the texture descriptors are the most useful for postevent collapse interpretation. Therefore, in this paper, we focus on independently applying texture information to collapse assessment and investigating its effectiveness, since, with more and more SAR sensors offering meter- or even decimeter-level resolution coming into service, image quality is becoming much finer, and the spatial information is therefore becoming more critical.

Owing to the difficulty of postevent SAR imagery interpretation, a stricken area is traditionally first divided into several city blocks according to the road map, and the interpretation result is given per block [4], [6], [19], which implies that once a block is classified incorrectly, the interpretation of the buildings in this block will also be wrong. In fact, however, textural features can reveal the local spatial variation around the center pixel of an extraction window, which strongly supports pixel-based interpretation. Furthermore, as is well known, different texture measures capture different properties of the contextual information, such as regularity, coarseness, and directionality [20], [21]. We therefore propose to exploit not just a single feature, but a variety of texture measures, to classify pixels into three collapse levels; note that this is a pixel-based collapse assessment rather than a block-based approach. Five common texture extraction methods which are widely used in image processing are introduced to explore the spatial characteristics of buildings at the different collapse levels, and we make a comprehensive analysis of their macro-/micro-effectiveness using a random forests (RFs) classifier. To the best of the authors' knowledge, this is the first time that research has focused on the application and analysis of various textural features for pixel-based collapse assessment using only postevent VHR SAR data. Furthermore, other possible factors influencing the interpretation are also discussed.

The remainder of this paper is organized as follows. Section II gives an overview of the study area, the Chinese domestic dual-band airborne SAR system named CASMSAR, and the preprocessing steps. The main algorithms and processing flow used in the paper are introduced in Section III, including the texture extraction and classification. The results and the analysis of how texture, resolution, channels, and bands influence these results are provided in Section IV. Finally, the conclusions and future work are presented in Section V.

II. CASMSAR PRODUCTS AND PREPROCESSING

A. Product Description

On April 14, 2010, a major earthquake occurred in Yushu County, Qinghai Province, China, with a maximum magnitude of 7.1, making it one of the biggest disasters of the last decade in China. The epicenter was located at 33.271°N, 96.629°E, at a depth of 10 km. This disaster caused severe damage: a large number of buildings were damaged or collapsed, resulting in the deaths of 2600 people and leaving countless others homeless. Unfortunately, limited by the rainfall and cloudy conditions at the time, optical remote sensing technology was unable to provide a rapid response for the emergency
TABLE I
ACQUISITION INFORMATION OF THE POSTEVENT CASMSAR X-BAND HH POLARIZATION MASTER IMAGES AND THE P-BAND FULLY POLARIMETRIC IMAGES OVER YUSHU COUNTY, QINGHAI PROVINCE, CHINA, APRIL 17, 2010

Image        Code                     Date         Polar.   Angle
X-strip 13   20100417_083814_13       2010/04/17   HH       45.9°
X-strip 14   20100417_083814_14       2010/04/17   HH       45.9°
X-strip 15   20100417_083814_15       2010/04/17   HH       45.9°
X-strip 16   20100417_083814_16       2010/04/17   HH       45.9°
P-strip 1    201004170831_F1_39_46    2010/04/17   Full     47.5°
P-strip 2    201004180933_F6_36_43    2010/04/17   Full     47.5°
Fig. 1. Typical structure instances of three collapse levels: minor, moderate, and serious. (a)–(f) and (g)–(l) are the pre- and postevent Google Earth VHR optical images from 2009/05/25 and 2010/05/06, respectively. (m)–(r) are the corresponding postevent X-band VHR SAR images from 2010/04/17.
rescue task. Fortunately, the Chinese domestic dual-band airborne SAR mapping system (CASMSAR) was able to undertake an urgent investigation of the stricken area [22]. The CASMSAR system is equipped with two independent high-resolution SAR sensors (X-band and P-band), integrating interferometric and fully polarimetric functions. On April 17, 2010, four postevent VHR X-band HH polarization interferometric pairs (strips 13–16, resolution 0.5 m), with product codes 20100417_083814_13 to 20100417_083814_16, and two P-band PolSAR datasets (resolution 1.1 m), with product codes 201004170831_File1_0039_0046 (201004170831_F1_39_46) and 201004180933_File6_0036_0043 (201004180933_F6_36_43), were collected. The relevant imaging information is listed in Table I. As only the master images in these interferometric pairs are used to extract the textural features in this paper, the slave images are ignored.

Some typical structures of the different collapse levels in the Yushu earthquake are presented in Fig. 1. As the earthquake occurred on 2010/04/14, Fig. 1(a)–(f) are pre-event optical images from 2009/05/25, and Fig. 1(g)–(l) are postevent images from 2010/05/06, with both sets of images obtained from Google Earth. The corresponding postevent X-band VHR SAR images are shown in Fig. 1(m)–(r).
B. Preprocessing

As the acquired data have diverse imaging configurations, in terms of coverage, resolution, center incidence angle, and polarization state, a slightly cumbersome preprocessing flow, shown in Fig. 2(a), was required. In general, the procedure is as follows. For the two P-band datasets, because the nominal polarimetric calibration accuracy of the CASMSAR products is not public, a re-calibration step using the improved Quegan technique [23] and our co-polarization channel imbalance calibration algorithm [24] was carried out to guarantee the reduction of distortion. The four X-band master images, collected successively with about 20–25% overlap, were matched by building mosaic models based on quadratic polynomials using dozens of ground control points (GCPs). The co-registration between the X-band and P-band images was then conducted. Considering the different penetration performances and flight paths, we first manually chose dozens of GCPs from the X-band and P-band images to carry out a coarse co-registration using the quadratic polynomial model. Denser control points at the sub-pixel level were then extracted by the scale-invariant feature transform [25], and the fine co-registration was performed in small local patches, with 20% overlap, to ensure three-pixel accuracy at the large scale. The overlapping region in the built-up areas between the two bands was 5200 × 11100 pixels. The corresponding X-band intensity image and P-band Pauli-RGB images are displayed in Fig. 2(b)–(d). A 4 × 4 multilook operation was later carried out to suppress the speckle noise and reduce the image size. In order to undertake a resolution analysis (Section IV), we also constructed X/P-band multiresolution pyramids, which are shown in Fig. 3. As the original resolution was 0.5 m for the X-band and 1.1 m for the P-band, the X-band pyramid includes Res 0/1/2 for the 0.5/1.0/2.0 m resolutions, and the P-band pyramid includes Res 1/2/3 for the 1.1/2.2/4.4 m resolutions.

Fig. 2. (a) Preprocessing flow. (b) X-band intensity image covering the overlapping built-up areas. (c) P-strip 1 Pauli-RGB image. (d) P-strip 2 Pauli-RGB image.

Fig. 3. Multiresolution pyramids of the X/P-band datasets. To the left is the X-band pyramid (Res 0: 0.5 m, Res 1: 1.0 m, Res 2: 2.0 m), and to the right is the P-band pyramid (Res 1: 1.1 m, Res 2: 2.2 m, Res 3: 4.4 m).
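To make the multilook and pyramid steps described above concrete, the following is a minimal NumPy sketch (not the CASMSAR processing code) of a 4 × 4 multilook by block averaging and a dyadic resolution pyramid; all function and variable names are illustrative.

```python
import numpy as np

def multilook(intensity, looks_rows=4, looks_cols=4):
    """Average non-overlapping blocks of an intensity image (simple multilook)."""
    rows = (intensity.shape[0] // looks_rows) * looks_rows
    cols = (intensity.shape[1] // looks_cols) * looks_cols
    img = intensity[:rows, :cols]
    return img.reshape(rows // looks_rows, looks_rows,
                       cols // looks_cols, looks_cols).mean(axis=(1, 3))

def build_pyramid(intensity, levels=3):
    """Resolution pyramid: each level halves the resolution by 2 x 2 averaging."""
    pyramid = [intensity]
    for _ in range(levels - 1):
        pyramid.append(multilook(pyramid[-1], 2, 2))
    return pyramid

# e.g., x_pyramid = build_pyramid(x_band_intensity, levels=3)  # Res 0/1/2 (0.5/1.0/2.0 m)
```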
III. COLLAPSE ASSESSMENT METHODS

A. Texture Extraction

Earthquake-induced building damage assessment using optical images has been investigated for many years [5]. Owing to SAR's all-weather, quick-response capability, there has recently been increasing interest in damage interpretation using SAR images. In the SAR imaging field, the characteristics of typical urban structures and collapsed structures have been discussed in detail in the past few years [13], [14], [16]. For simplicity, Brunner et al. [14] showed an example of the backscattering range profile of a flat-roofed building modeled as a rectangular structure with uniform surfaces and flat surroundings, which indicated that the variation of gray levels in built-up areas should follow some intrinsic rules, owing to the regular appearance of the backscattering from the ground, front walls, building roofs, and shadows, and especially the double-bounce scattering from the dihedral corners between walls and the surrounding ground. Undoubtedly, collapsed structures break these rules. Some instances of the different collapse levels are shown in Fig. 1, based on our previous work [18]. Clearly, the two minor collapse instances show regular gray-level variation originating from the layover, double-bounce scattering, and shadows, as shown in Fig. 1(m)–(n). However, in the instances of the moderate and serious collapse levels, this regularity has been gradually destroyed, because the building structures have been damaged and the rubble is randomly distributed in the surroundings, as shown in Fig. 1(o)–(r). As we know, textural features have the capacity to capture different patterns of gray-level variation, so, according to the above analysis, they may also have the potential to distinguish the different collapse levels.

Our previous work [18] indicated that, compared with polarimetric and interferometric features, textural descriptors are more important in collapse assessment. Surprisingly, however, reports specifically discussing the application of abundant textural features to collapse assessment based on postevent SAR imagery have been very few in number until now. The main purpose of this paper is therefore to investigate the applicability of contextual descriptors for revealing the inner discriminatory features of the different collapse-level cases, aiming at a deeper understanding of the spatial behaviors in SAR
imagery of stricken areas, and the impact of diverse imaging configurations. To explore the spatial contextual information in VHR SAR imagery, different kinds of texture extraction approaches are introduced to comprehensively evaluate the building collapse levels. Unlike the case of low or medium resolutions, high-resolution SAR images capture much more detail and information about the ground. In this case, the traditional pixel-based PolSAR scattering properties cannot be used to classify objects correctly, as different distributions of the same objects can have different semantic meanings [26]. There is therefore a demand for contextual information support. As in image segmentation and retrieval tasks, suitable texture descriptors are of crucial importance for improving the performance.

Research into textural features has a long history, and reviews can be found in [27] and [28]. Many texture features have become widely accepted by researchers, although it is still quite difficult to define texture precisely. Moreover, it is also important to comprehensively compare and analyze the advantages and disadvantages of the various texture extraction techniques, especially methods based on different starting points, e.g., statistics-based, model-based, and signal processing techniques. The following texture descriptors, which are familiar from the image processing field, are all investigated using the dual-band CASMSAR products in Section IV.

1) Gray-Level Histogram: The most intuitive context is the statistical histogram [29]. After gray-level quantization, a gray-level histogram counts the frequency of every gray level directly, and certain statistical properties can then be derived from it. In this paper, the features computed from the gray-level histogram are the mean, variance, skewness, kurtosis, energy, and entropy.

2) GLCM: The GLCM is a second-order statistic of how often different combinations of gray levels occur in an image [30]. A total of 13 textural features were derived from the GLCM in Haralick's research [31], but many of them are correlated [32]. The GLCM is created from a grayscale image by selecting a horizontal (0°), vertical (90°), or diagonal (45° or 135°) orientation; we adopt all of these orientations to obtain four GLCMs, and then compute five representative textural features (angular second moment, contrast, correlation, entropy, and inverse difference moment) for each. Note that the size of the GLCM depends on the number of gray values available, and we used a quantized 4-bit SAR image to obtain a GLCM of 16 × 16 elements, in order to speed up the calculation (a minimal sketch of the histogram and GLCM feature computation is given after this list of descriptors).
Fig. 4. Schematic diagram of the LBP and GMRF. (a) Original LBP. (b) Fifth-order GMRF neighbors of a center pixel, where the first number in each lattice indicates the order to which the neighbor belongs.
3) Local Binary Pattern (LBP): The LBP method is a simple but powerful descriptor, which has been widely used in face recognition [33], [34] and texture analysis [35], [36] in recent years. In its original form, the operator thresholds a 3 × 3 neighborhood with the value of the center pixel, forming a local binary pattern, as shown in Fig. 4(a). The occurrences of the different local patterns are collected into a histogram that is used as the texture descriptor. However, the dimension of the original LBP histogram is very high (256 dimensions), so we adopt its rotation-invariant variant (36 dimensions) [37] for a better comparison and analysis, since this variant treats the different codes formed by rotation of the same pattern as one identical pattern (a sketch of this descriptor is given after Table II).

4) Gaussian Markov Random Fields (GMRF): As a model-based method [38], the Markov random field (MRF) approach supposes that texture can be modeled as a two-dimensional (2-D) separable Markov field [39]; that is, a pixel brightness value is a function of its neighbors in all directions. If the brightness values obey a Gaussian distribution, the MRF model is equivalently characterized by a linear interpolative equation containing a set of unknown parameters. Based on a series of such equations, these parameters, which represent the textural features, can be estimated using a least-squares approach. Conventionally, we apply a fifth-order model, for which the number of estimated parameters is 12 (note that the GMRF considers symmetrical neighboring pixels to share the same parameters). Fig. 4(b) shows the neighbors in the fifth-order GMRF model.

5) Gabor Filters: The Gabor function, first proposed by Dennis Gabor [40], has been used to process signals and belongs to the family of short-time Fourier transforms. The 2-D Gabor filters are particularly appropriate for texture representation and discrimination [41]–[44]. In the spatial domain, a 2-D Gabor filter is a Gaussian kernel function modulated by a sinusoidal plane wave. However, the parameter settings (scale, orientation, and frequency) are complicated, and no a priori knowledge can help us find a suitable frequency. Therefore, in this paper, we select the self-adaptive method proposed in [45] to relieve this dilemma. The features we use are the mean and variance of every processed patch generated by convolving the original extraction window with the Gabor filters.

All of the textural features used in this paper are numbered and listed in Table II.
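As a concrete reference for descriptors 1) and 2), the following is a minimal NumPy sketch (not the authors' implementation) of the gray-level histogram statistics and the five GLCM measures for one quantized extraction window and one row/column offset; the 16-level quantization follows the description above, while the function names and offset convention are illustrative.

```python
import numpy as np

def histogram_features(win, levels=16):
    """Mean, variance, skewness, kurtosis, energy, entropy of the gray-level histogram.
    win: 2-D integer array quantized to `levels` gray levels."""
    p = np.bincount(win.ravel(), minlength=levels).astype(float)
    p /= p.sum()
    g = np.arange(levels)
    mu = (g * p).sum()
    var = ((g - mu) ** 2 * p).sum()
    skew = ((g - mu) ** 3 * p).sum() / (var ** 1.5 + 1e-12)
    kurt = ((g - mu) ** 4 * p).sum() / (var ** 2 + 1e-12)
    energy = (p ** 2).sum()
    entropy = -(p[p > 0] * np.log2(p[p > 0])).sum()
    return mu, var, skew, kurt, energy, entropy

def glcm_features(win, dr, dc, levels=16):
    """ASM, contrast, correlation, entropy, IDM for one (dr, dc) offset."""
    glcm = np.zeros((levels, levels))
    rows, cols = win.shape
    for r in range(max(0, -dr), rows - max(0, dr)):
        for c in range(max(0, -dc), cols - max(0, dc)):
            glcm[win[r, c], win[r + dr, c + dc]] += 1   # count co-occurring gray levels
    p = glcm / glcm.sum()
    i, j = np.indices(p.shape)
    asm = (p ** 2).sum()
    contrast = ((i - j) ** 2 * p).sum()
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt(((i - mu_i) ** 2 * p).sum())
    sd_j = np.sqrt(((j - mu_j) ** 2 * p).sum())
    corr = ((i - mu_i) * (j - mu_j) * p).sum() / (sd_i * sd_j + 1e-12)
    entropy = -(p[p > 0] * np.log2(p[p > 0])).sum()
    idm = (p / (1.0 + (i - j) ** 2)).sum()
    return asm, contrast, corr, entropy, idm
```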
TABLE II
ONE HUNDRED TWENTY-TWO PRACTICAL TEXTURAL FEATURES USED IN THIS PAPER. GLCM, GMRF, AND RILBP REPRESENT THE GRAY-LEVEL CO-OCCURRENCE MATRIX, GAUSSIAN MARKOV RANDOM FIELDS, AND ROTATION-INVARIANT LOCAL BINARY PATTERNS, RESPECTIVELY

No.        Features               Symbol                                                        Description
001–006    Gray-level histogram   μ, Mv, Ms, Mk, Ene, Ent                                       Mean, variance, skewness, kurtosis, energy, and entropy from the gray-level histogram
007–026    GLCM                   ASM(Δx,Δy), CON(Δx,Δy), COR(Δx,Δy), ENT(Δx,Δy), IDM(Δx,Δy)    Angular second moment, contrast, correlation, entropy, and inverse difference moment of the GLCM, where (Δx, Δy) is (0, 1), (−1, 1), (−1, 0), or (−1, −1)
027–062    RILBP                  P1, P2, ..., P36                                              Rotation-invariant local binary patterns
063–074    GMRF                   θ1, θ2, ..., θ12                                              GMRF model parameters
075–122    Gabor                  μ1, μ2, ..., μ24; Mv1, Mv2, ..., Mv24                         Means and variances of the Gabor features (four scales and six orientations)
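As referenced in descriptor 3) above, the following is a small NumPy sketch of the 36-dimensional rotation-invariant LBP histogram (features 027–062 in Table II); mapping the 256 raw 8-bit codes onto 36 rotation-invariant patterns follows the general definition in [37], while the fixed neighbor ordering and names used here are illustrative.

```python
import numpy as np

def _rotation_invariant_map(bits=8):
    """Map each raw LBP code to the minimum value over all circular bit rotations."""
    def rotate_min(code):
        return min(((code >> k) | (code << (bits - k))) & (2 ** bits - 1)
                   for k in range(bits))
    canon = np.array([rotate_min(c) for c in range(2 ** bits)])
    # 36 distinct rotation-invariant patterns exist for an 8-bit neighborhood
    labels = {v: i for i, v in enumerate(np.unique(canon))}
    return np.array([labels[v] for v in canon])

RI_MAP = _rotation_invariant_map()

def rilbp_histogram(win):
    """36-bin normalized histogram of rotation-invariant LBP codes over a window."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]          # 8 neighbors in circular order
    center = win[1:-1, 1:-1]
    code = np.zeros_like(center, dtype=int)
    for bit, (dr, dc) in enumerate(offsets):
        neigh = win[1 + dr:win.shape[0] - 1 + dr, 1 + dc:win.shape[1] - 1 + dc]
        code |= (neigh >= center).astype(int) << bit      # threshold by the center pixel
    hist = np.bincount(RI_MAP[code.ravel()], minlength=36)
    return hist / hist.sum()
```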
B. Classification

Building collapse assessment can be seen as a special classification task. However, unlike common classification, which separately adopts a dimensionality reduction technique and a specific classifier [46], [61], in this study the performance evaluation of all 122 proposed features for discriminating the collapse levels and the classification results needed to be acquired simultaneously. Following [47]–[49], the random forests (RF) classifier has become a hotspot in machine learning, and it has since been applied to SAR images. RF is a supervised ensemble classifier constructed from a series of decision trees generated by bootstrap sampling and random feature subspaces drawn from the original sample set and feature space [50]. The attractiveness of RF stems from two aspects.

On the one hand, the RF classification process consists of three steps: 1) obtain a bootstrap sample set from the training samples; 2) based on the bootstrap sample set, grow a decision tree, carrying out the best split at each node in a randomly selected feature subspace; and 3) repeat the above two steps to build a number of trees. Each tree finally produces a classification result for the test samples, and the final RF output is the ensemble result obtained by a majority vote over all the decision trees, which makes RF highly robust.

On the other hand, RF is not only a classification or regression method, but also a feature evaluation technique [48]. It offers several evaluation approaches, including permutation importance, Gini importance, and variable selection frequency. Since permutation importance takes into account the impact of each predictor variable individually, as well as its interaction with the other input variables, it is the most commonly used of the three [47]. In general, the calculation of the permutation importance includes three steps: 1) in each tree, apart from the roughly two-thirds of the training samples which are used to construct the bootstrap sample set, the remaining one-third of the training samples are collected as the so-called out-of-bag (OOB) sample set for calculating the permutation importance; 2) for every specified feature, all the values of this feature in the OOB samples are permuted randomly, while the other features are kept unchanged; and 3) the difference in the prediction error or accuracy between the permuted and original OOB samples in each tree is calculated, and the mean permutation accuracy importance is then obtained by averaging these differences over all the trees in the forest. This importance score can be considered an indicator of how the feature influences the classification performance. The importance calculation formula is shown in (1). In summary, the importance score represents the classification accuracy loss when the instances of a specific feature are permuted by RF. The coupling of feature selection and classification in RF provides an importance evaluation path that directly measures the impact of each feature on the collapse assessment. This advantage exactly meets our requirement. The RF code is freely available from http://cda.psych.uiuc.edu/statistical_learning_course/

Importance = \frac{1}{n} \sum_{i=1}^{n} \left( error_{OOB_i^P} - error_{OOB_i} \right)    (1)

where OOB_i^P and OOB_i are the permuted and original OOB samples in tree i, respectively, and n is the total number of decision trees.
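A minimal scikit-learn sketch of the classifier and of an accuracy-loss importance in the spirit of (1) is given below; note that scikit-learn's permutation_importance uses a held-out set rather than the per-tree OOB samples of the original formulation, and the data arrays are synthetic placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the (n_pixels x 122) texture matrix and collapse labels.
rng = np.random.default_rng(0)
X = rng.random((5000, 122))
y = rng.integers(0, 3, 5000)              # 0: minor, 1: moderate, 2: serious

X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.02, stratify=y, random_state=0)

rf = RandomForestClassifier(n_estimators=500,   # number of trees used in the paper
                            max_features=0.5,   # half of the input feature dimension per split
                            random_state=0, n_jobs=-1)
rf.fit(X_train, y_train)

# Permute one feature at a time and record the mean drop in accuracy, cf. (1).
result = permutation_importance(rf, X_test, y_test, scoring="accuracy",
                                n_repeats=10, random_state=0)
importance = result.importances_mean            # one relative score per texture dimension
```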
C. Framework of the Collapse Assessment and Influence Analysis

The main concerns and contributions of this paper for building collapse assessment are the exploration of the following three aspects: 1) the potential of independently applying spatial contextual information for pixel-based collapse assessment using only postevent VHR SAR imagery; 2) the influence of diverse configurations, such as texture, resolution, channels, and bands (especially the first), on the assessment results; and 3) an investigation into which forms of spatial dependence behavior can best reveal the different collapse levels. For these three aspects, the experimental settings and impact analysis flow are described in detail in Section IV.

The collapse interpretation includes both feature extraction and supervised classification. The different kinds of textural features are first extracted from the various SAR images. In order to ensure that most kinds of buildings were included in the texture extraction window, the window size was set to 21 × 21 pixels for the X-band Res 0 and P-band Res 1 images, and 11 × 11 pixels for the X-band Res 1 and P-band Res 2 images. For the X-band Res 2 and P-band Res 3 images, we employed a 7 × 7 window rather than a 5 × 5 window, because too small a window is not suitable for extracting texture; for example, for the fifth-order GMRF model, a 5 × 5 window yields only one estimating equation, which always results in an abnormal solution. Next, in the supervised classification step, 2% of the pixels available in the ground-truth map were randomly picked as a training set to build the RF model, which was then used to identify the collapse level. The two common RF parameters, i.e., the total number of trees and the number of features considered at each node split, were empirically set to 500 and one-half of the input feature dimension, respectively. To guarantee reliability, the sample selection and RF classification steps were repeated 20 times to obtain the average performance, which represents the power of the texture response to aspect 1) above. It is worth noting that this approach is quite different from the traditional block-based assessment [4], [6], [16], which involves assigning specified damage indexes or ratios to street blocks and dividing the blocks into several levels by manually selected thresholds; the traditional assessment gives only a rough general performance at the street-block level, and depends on manual experience.
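The repeated evaluation protocol described above (2% training pixels, 500 trees, half of the features per split, 20 repetitions, averaged OA) could be organized along the following lines; this is an illustrative sketch, not the authors' code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def repeated_assessment(X, y, repeats=20, train_fraction=0.02):
    """Average overall accuracy over repeated random draws of the training pixels."""
    accuracies = []
    for seed in range(repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, train_size=train_fraction, stratify=y, random_state=seed)
        rf = RandomForestClassifier(n_estimators=500, max_features=0.5,
                                    random_state=seed, n_jobs=-1)
        rf.fit(X_tr, y_tr)
        accuracies.append(accuracy_score(y_te, rf.predict(X_te)))
    return float(np.mean(accuracies))
```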
TABLE III
RELEVANT CONFIGURATIONS IN THE FOUR CASES, INCLUDING TEXTURAL FEATURES, RESOLUTION, CHANNELS, AND BANDS

Case   Features           Resolution             Channels   Bands
i      HI/GL/RI/GM/GA     Res 1                  All        All
ii     All                X-Res 0–2/P-Res 1–3    All        All
iii    All                Res 1                  All/Tex    P
iv     All                Res 1                  HH         X/P
For aspect 2), once the relevant configuration is changed, how this impacts the interpretation result is also thought-provoking. Thanks to the dual-band CASMSAR system, an investigation into the influence of texture, resolution, channels, and bands could be carried out. Regretfully, no relevant published works have specifically focused on which kind of texture is preferable for postevent interpretation. In fact, the traditional comparative studies of texture have focused on the macro-effectiveness, i.e., they were based on the overall accuracy (OA) or error rate [51]. However, although the macro-effectiveness tells us which texture extraction method is better for classification, it does not help with the analysis of individual features. Fortunately, the application of RF allows us to focus on the micro-effectiveness, i.e., to explore the direct impact of each individual feature dimension on the collapse assessment. As for aspect 3), the micro-effectiveness analysis means that the texture descriptors that are valid for discriminating the different collapse levels, and the spatial dependence behaviors which correspond to these descriptors, are obtained simultaneously, aiming at a deeper understanding of the characteristics of stricken buildings in postevent SAR images.

The results based on the different configurations are displayed in the next section. A total of four specific discussion cases (case i: texture; case ii: resolution; case iii: channels; case iv: bands) are given separately according to their macro-/micro-effectiveness, namely, the collapse assessment performance and the RF importance scores. In addition, for the completeness of the comparison, we also took the so-called texture images and normalized texture images, which can be estimated by the spherically invariant random vector (SIRV) model and the fixed point (FP) estimator, into account in the channel analysis. The details are provided in the following section. Once it is known which texture measures are valuable, the inner discriminatory spatial behaviors of the different collapse levels can easily be found. The relevant configurations in the four cases are listed in Table III, where "HI/GL/RI/GM/GA" stands for the gray-level histogram/GLCM/RILBP/GMRF/Gabor features, and "Tex" represents the texture image and normalized texture image.
IV. EXPERIMENTS WITH DATA FROM THE YUSHU EARTHQUAKE

In this section, the various collapse assessment results and the corresponding impact analysis are given together, as we are interested in not only the assessment itself, but also the influence of the different variables. Using the settings described in Section III, the classification process was carried out many times, based on the diverse configurations: bands (X/P); resolution (Res 0–3); channels (HH, HV, VV, span, etc.); and textural features (gray-level histogram, GLCM, RILBP, GMRF, Gabor filters). In the channel configuration, HH, HV, and VV stand for the different polarization state combinations between transmitting and receiving, where "H" is the horizontal polarization and "V" is the vertical polarization. The span is the total power, equivalent to the intensity sum of the HH, 2HV, and VV returns under the reciprocity hypothesis. In order to guarantee the reliability of the assessment, we repeated the sample selection and classification 20 times in each configuration and recorded the average performance.

In this study, the ground truth was generated from two main sources: the collapse map published by the International Charter on Space and Major Disasters (ICSMD) (http://www.disasterscharter.org) and the Major Natural Disasters Atlas 2010 (http://www.chinabookshop.net/major-natural-disasters-atlas-2010-p-14623.html). Both of these sources consider three collapse levels: minor, moderate, and serious. With the aid of visual interpretation based on the pre-/postevent optical images (2009/05/25 and 2010/05/06, from Google Earth), several expert volunteers investigated and compared the sources to yield a more detailed collapse map, which ensured that each small block corresponded, as far as possible, to an approximately pure collapse level. Our ground-truth reference scale is therefore much smaller than the city block unit, but is not a single building, and we refer to it as the "single collapse type block." Since SAR is a side-looking remote sensing system, it should be mentioned that some specific damage, such as inner damage and cracks in intact buildings, cannot be identified in SAR imagery [11], [52]. As a consequence, our study focuses on building collapse, not damage; in general, however, it would be possible to convert the ground-truth map into the D0–D5 building damage grading system through the minor (D0–D1), moderate (D2–D4), and serious (D5) levels.

The common accuracy evaluation parameters are the OA and the Kappa coefficient (Kappa), based on a confusion matrix. However, according to [60], the standard Kappa cannot reveal more information than the OA. Therefore, in this paper, the collapse assessment performance is assessed primarily by the OA. In addition, the two measures of quantity disagreement (QD) and allocation disagreement (AD) are used instead of Kappa. The QD is defined as the amount of difference between the ground-truth map and a classification map that is due to the less than perfect match in the proportions of the categories, and the AD is defined as the amount of difference between the ground-truth map and a classification map that is due to the less than optimal match in the spatial allocation of the categories [60]. The relationship between the OA, QD, and AD is

OA + QD + AD = 1    (2)
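For reference, the OA, QD, and AD of (2) can be computed from a confusion matrix as in the following sketch, which follows the definitions in [60]; it is an illustration rather than the authors' evaluation code.

```python
import numpy as np

def oa_qd_ad(confusion):
    """OA, quantity disagreement, and allocation disagreement from a confusion
    matrix of pixel counts (rows: ground truth, columns: predicted class)."""
    p = confusion / confusion.sum()                   # joint proportions
    diag = np.diag(p)
    truth = p.sum(axis=1)                             # ground-truth class proportions
    pred = p.sum(axis=0)                              # predicted class proportions
    oa = diag.sum()
    qd = 0.5 * np.abs(truth - pred).sum()             # mismatch in class proportions
    ad = np.minimum(truth - diag, pred - diag).sum()  # mismatch in spatial allocation
    return oa, qd, ad                                 # oa + qd + ad == 1, cf. (2)
```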
A. Case I: Texture Impact Analysis

Before the main discussion about the impact of texture, there is a question that should first be raised: it has been shown that
Fig. 5. Scatter plots of variance and energy textural features based on the gray-level histogram for three collapse levels: (a) variance, (b) energy, and (c) variance and energy.
Fig. 6. Accuracy evaluation results in the X/P-band Res 1 cases: (a) OA, (b) QD, and (c) AD.
Fig. 7. Importance scores and k-means classification accuracies of each dimension in the gray-level histogram texture. (a) X-band HH results. (b) P-strip 1 HH/HV/VV average results. (c) P-strip 2 HH/HV/VV average results.
Fig. 8. Importance scores and k-means classification accuracies of each dimension in the GLCM texture. (a) X-band HH results. (b) P-strip 1 HH/HV/VV average results. (c) P-strip 2 HH/HV/VV average results.
some textural features in postevent SAR images are correlated with the different damage levels [4], [19], but the large dispersion of the texture variables for each damage level prevents further reliable damage evaluation [6]. In the relevant literature, Polli et al. [4] simply employed a single GLCM-based feature (e.g., the mean) to divide the damaged-area ratio into three parts, and a homogeneity measure was adopted by Zhao et al. [6] to correct the inaccuracy caused by oblique buildings in collapse interpretation, but neither study took multiple texture measures into account. A significant dispersion phenomenon is probably inevitable if only a single specific texture measure is employed to indicate the damage levels. In this study, a further test was made: we calculated the variance and energy measures based on the gray-level histogram using the X-band Res 1 imagery. For simplicity and visualization, we then assigned the corresponding average values to each local region in which an approximately uniform collapse type existed. The results are displayed in Fig. 5. When variance or energy is observed alone, there are indeed some differences between the three collapse levels, but the large dispersions cannot be ignored
in Fig. 5(a) and (b). Fortunately, when the two measures are combined, the situation is relieved and an acceptable distinction appears in Fig. 5(c). That is to say, a clear tendency to distinguish the collapse levels emerges when more texture measures are surveyed, which encourages us to employ abundant textural features to identify the building collapse levels. It is worth mentioning that the irregular behavior of the minor collapse level is a result of the inconsistent construction, density, and size of the buildings and their surroundings.

For the macro-effectiveness comparison, the accuracy evaluation results in the X/P-band Res 1 cases are first given in Fig. 6, i.e., the OA and the QD/AD values after 20 repeats. In Fig. 6, "P1," "P2," and "SP" represent the P-band strip 1, strip 2, and span, respectively. It can be seen that most of the OA values are greater than 70% and, accordingly, the QD/AD values are generally less than 0.07/0.25, representing an acceptable accuracy, which confirms the underlying power of the texture information to distinguish the building collapse levels. The best result at the X-band 1.0 m/P-band 1.1 m resolution is based on the RILBP features extracted from the P-strip 1 HH imagery [the
Fig. 9. Importance scores and k-means classification accuracies of each dimension in the GMRF texture. (a) X-band HH results. (b) P-strip 1 HH/HV/VV average results. (c) P-strip 2 HH/HV/VV average results.
Fig. 10. Importance scores and k-means classification accuracies of each dimension in the RILBP texture. (a) X-band HH results. (b) P-strip 1 HH/HV/VV average results. (c) P-strip 2 HH/HV/VV average results.
Fig. 11. Importance scores and k-means classification accuracies of each dimension in the Gabor texture. (a) X-band HH results. (b) P-strip 1 HH/HV/VV average results. (c) P-strip 2 HH/HV/VV average results.
corresponding classification map is shown in Fig. 16(a)], with an OA of 77.2% and QD/AD values of 0.067/0.16, respectively. Although this accuracy is a little worse than the state of the art in postevent SAR imagery interpretation, it should be noted that this is not the optimal result among all the various configurations, since the 0.5 m X-band image was not used here; moreover, the previous studies which utilized only postevent SAR imagery considered block-based assessment, so their mapping reference scale was larger than ours. In other words, their results are not meaningful at the pixel level. From the macro-effectiveness point of view, the accuracy ranking is GLCM, gray-level histogram, Gabor, RILBP, and GMRF for the X-band dataset, and RILBP, Gabor, GLCM, gray-level histogram, and GMRF for the P-band datasets. On the whole, the results using the various textural features are satisfactory, except for the GMRF. This may be caused by the assumption of a Gaussian distribution in the GMRF model: since the real and imaginary components of the backscattered returns obey independent and identical Gaussian distributions, the intensity of the backscattered returns does not obey a Gaussian distribution, leading to the failure of the GMRF model.

When further focusing on the texture micro-effectiveness, i.e., the importance of each individual texture measure for collapse assessment, we again utilize the RF classifier to evaluate the texture measures in the different band and polarization cases. Firstly, the RF importance scores based on each kind of texture
descriptor are shown in Figs. 7–11. In addition, as the RF importance score represents the classification accuracy loss when a certain feature totally fails, the interpretation accuracies based on each feature dimension alone are also given for comparison, by means of a supervised k-means classifier, which simply assigns pixels to the nearest class center. All of these RF scores and k-means classification accuracies make use of the Res 1 version of the pyramids, and Figs. 7–11(b) and (c) show the averages of the P-band HH, HV, and VV results. The feature numbers are listed in Table II. It should be noted that the importance score is a relative indicator, so it is meaningless to compare scores from different RF classification models. The k-means classification accuracies of each dimension are shown in Figs. 7–11, along with the corresponding RF scores. Clearly, there are some differences between the RF scores and the k-means accuracies. However, the overall trends and most of the local peaks coincide with each other, confirming the robustness of the RF importance score.

Considering the relative RF scores and the k-means accuracies in Fig. 7(a)–(c) for the gray-level histogram texture, the variance measure (No. 002) makes the greatest contribution. As we know, an intact building has some typical architecture in SAR images, such as the double-bounce scattering between the facing walls and the ground. In an extraction window, the embedded buildings and their surroundings, which represent a mixture of these typical architectures, result in a large gray-level dispersion. With the disappearance of this mixture to various degrees, corresponding to the various collapse levels, the dispersion becomes gradually smaller, similar to the tendency in Fig. 5(a). This is the reason why the No. 002 feature has a very high score in most cases, which is in agreement with Dell'Acqua's related work based on the GLCM [4].

As for the RF scores of the GLCM features (No. 007–026), in most cases the contrast (No. 008, 013, 018, and 023) and correlation (No. 009, 014, 019, and 024) measures obtain good results in general in Fig. 8, followed by the inverse difference moment (No. 011, 016, 021, and 026), which is also called homogeneity, and finally the angular second moment (No. 007, 012, 017, and 022) and entropy (No. 010, 015, 020, and 025). Considering the measures based on the GLCM as weighted averages of the normalized GLCM cell contents, all the GLCM features can be divided into three groups describing different texture properties [53]: 1) contrast; 2) orderliness; and 3) descriptive statistics. There are two points worth mentioning here. The first point is that, in most cases, the texture measures in the contrast group (contrast and homogeneity) and the descriptive statistics group (correlation) are superior to the orderliness descriptors (angular second moment and entropy). The reason for this is that the contrast, homogeneity, and correlation measures allow for the relative positions in the co-occurrence matrix, whereas the other two measures use every element of the matrix, or its derivative, only as a weight. The second point is that, despite homogeneity being adopted in the work of Zhao et al. [6], we found that the contrast measures have a better discriminative capability for comprehensive collapse interpretation, as shown in Fig. 8. As was observed in [53], although contrast and homogeneity essentially describe similar properties, their effectiveness is a little different. As the weight is (i − j)^2 for calculating the contrast, and the weight
Fig. 12. Rotation-invariant binary patterns corresponding to: (a) No. 027, (b) No. 028, (c) No. 031, (d) No. 035, and (e) No. 043. “0” means that the gray value is less than the center value; and “1” means the opposite.
is 1/(1 + (i − j)^2) for homogeneity, where (i, j) is a certain combination of gray levels occurring in the image, the former will clearly always be higher than the latter, thus enhancing the performance.

We prefer to investigate the GMRF features (No. 063–074) before the RILBP, because both the GLCM and GMRF methods take the different spatial orientations into account. Among the RF scores of the GLCM, the No. 007–011 features, corresponding to (Δx, Δy) = (0, 1), are a little better than No. 012–016, corresponding to (Δx, Δy) = (−1, 1), and No. 017–021 are better than No. 022–026, corresponding to (−1, 0) and (−1, −1), respectively. This indicates that the features based on the horizontal and vertical orientations are superior to the features based on the diagonal orientations. In the GMRF model, the textural features are the estimated parameters corresponding to the neighbors, representing their contributions to the center pixel. Although the assumption of a Gaussian distribution makes the GMRF an inappropriate texture descriptor for a SAR intensity image, we can see that the No. 063, 064, 067, and 068 features are stable and relatively effective. The first two are the parameters of lattices 1.1 and 1.2 in Fig. 4(b), and the last two are those of lattices 3.1 and 3.2. It seems that the parameters corresponding to the horizontal and vertical neighbors are more important than the parameters corresponding to the diagonal neighbors in the GMRF model, which agrees with the GLCM cases. Both of these observations imply that, when a directional texture is extracted, the various orientations can make different contributions to distinguishing the collapse levels.

Finally, the two high-dimensional texture descriptors, RILBP and Gabor, are investigated. Conventionally, the components of the LBP histogram are utilized to describe the texture directly, and therefore no further statistical characterizations, such as the mean and variance, exist, resulting in a complicated and unstable performance for most of the RILBP features (No. 027–062) in Fig. 10. However, there are still a few features that are quite stable and effective, namely, No. 027, 028, 031, 035, and 043, representing the five rotation-invariant binary patterns in Fig. 12, respectively. According to the definition of the RILBP, it is easy to find that the No. 027 feature means that the gray-level values of the neighbors next to the center pixel are all less than the center value, describing an isolated bright point originating from scattered rubble, which tends to be common in collapsed areas. For intact buildings, isolated points are very rare, whereas the lines and rectangles caused by walls and roofs are more familiar. Moreover, No. 028 is very similar to No. 027, representing an approximately isolated point in a looser sense; therefore, No. 028 also tends to increase when more collapsed buildings appear. In Fig. 12(c)–(e), the property which No. 031, 035, and 043 have in common is that the gray values on one side are greater than the
Fig. 13. RILBP extraction results in typical regions for three collapse levels. (a) Typical X-band regions. (b) No. 027 results. (c) No. 028 results. (d) No. 031 results. (e) No. 035 results. (f) No. 043 results.
center value, but not the values on the other side, which tends to appear on the boundaries between buildings and their associated shadows. Therefore, these values become lower when buildings collapse. From all of the above, the five descriptors can, to some extent, reveal the states of the rubble and buildings. This makes them relatively effective, which is consistent with the observation in Fig. 13, where some RILBP results corresponding to the typical structure instances of the three collapse levels in Fig. 1 are shown. In Fig. 13(b) and (c), the values of No. 027 and 028 increase, while the other three values gradually decrease in Fig. 13(d)–(f).

For the Gabor texture extraction, the features can be divided into four groups according to a gradual increase in the scale parameter values of the Gabor wavelets: No. 075–080/No. 099–104 (mean/variance), No. 081–086/No. 105–110, No. 087–092/No. 111–116, and No. 093–098/No. 117–122, as each group shares an identical scale parameter. Unfortunately, the behavior of each individual Gabor feature in Fig. 11 is puzzling to understand in depth, as the scores are unstable across the various images, the dimensionality is the highest, and the frequency-domain operation is difficult to explain intuitively. However, there is a slight tendency for the mean measures in the latter two groups (No. 087–098) to be more valuable than the mean measures in the former two groups (No. 075–086); similarly, when considering the variance measures, the latter groups (No. 111–122) are better than the former (No. 099–110). As the two scale parameters of the former groups are wider than those of the latter, No. 087–098 and No. 111–122 contain additional
Fig. 14. Collapse assessment importance scores of the 122 proposed texture dimensions extracted from the X- and P-band images. (a) X-band HH result. (b) P-strip 1 HH/HV/VV average results. (c) P-strip 2 HH/HV/VV average results.
detail information at a higher frequency, which suggests that the importance of the Gabor features is sensitive to scale. However, No. 093–098/No. 117–122 do not show an advantage over No. 087–092/No. 111–116. The reason for this may be that an excessively high frequency focuses on the image details but is sensitive to noise. Therefore, the selection of an appropriate frequency range remains an open issue. Overall, the performance of the Gabor features is confusing and cannot be explained very well at present.

In Figs. 7–11, the five types of texture are independently investigated based on their RF scores and k-means accuracies. However, the question as to which kind of texture is more appropriate for collapse interpretation should be answered from both the macro- and micro-effectiveness points of view. In Fig. 6, the accuracy ranking is GLCM, gray-level histogram, Gabor, RILBP, and GMRF for the X-band dataset, and RILBP, Gabor, GLCM, gray-level histogram, and GMRF for the P-band datasets. Nevertheless, when all 122 proposed texture dimensions are adopted together to generate the RF importance scores, the situation changes from the micro-effectiveness point of view. In Fig. 14, the relative importance scores of the 122 proposed texture dimensions listed in Table II, extracted from the X-band data, P-strip 1, and P-strip 2, are shown, respectively. On the whole, the importance priority ranking is gray-level histogram, GLCM, RILBP, Gabor, and GMRF, in turn. This ranking shows some differences from the OA values of the five methods, which is caused by the difference between the macro- and micro-effectiveness. For example, as a result of their lower dimensionality, the scores of the gray-level histogram features in Fig. 14 are generally higher than the others, although the corresponding assessment result is not the best in Fig. 6.
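For completeness, a possible form of the four-scale, six-orientation Gabor mean/variance features (Nos. 075–122) analyzed above is sketched below; the wavelength and bandwidth values are illustrative, since the paper relies on the self-adaptive parameter setting of [45].

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(wavelength, theta, sigma, size=21, gamma=0.5):
    """Real 2-D Gabor kernel: a Gaussian envelope modulated by a cosine wave."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2)) \
        * np.cos(2.0 * np.pi * xr / wavelength)

def gabor_features(win, wavelengths=(2, 4, 8, 16), n_orient=6):
    """Mean and variance of the window response for each (scale, orientation) pair."""
    means, variances = [], []
    for lam in wavelengths:                       # four scales
        for k in range(n_orient):                 # six orientations
            kern = gabor_kernel(lam, np.pi * k / n_orient, sigma=0.5 * lam)
            resp = fftconvolve(win, kern, mode="same")
            means.append(resp.mean())
            variances.append(resp.var())
    return np.array(means + variances)            # 24 means followed by 24 variances
```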
Fig. 15. OA values of the interpretation results in the various resolution cases.
As all the features are taken into account uniformly, the importance of every dimension is influenced not only by the features based on the same texture method but also by those of the other four methods, giving a texture comparison at the micro-effectiveness level. The relative score trends of the five kinds of descriptors in Fig. 14 show some differences compared with Figs. 7–11. It is clear that the statistics-based methods (gray-level histogram, GLCM, and RILBP) perform relatively well. Generally speaking, the most intuitive gray-level histogram texture is the best from a micro-effectiveness perspective, and some of the GLCM and RILBP features also have a good collapse identification capability. In conclusion, after a comprehensive analysis of the macro- and micro-effectiveness, it can be concluded that, owing to the inconsistent performances between the interpretation accuracy and the individual dimension contributions, the GLCM and Gabor features are fairly stable and give a good interpretation performance, but the Gabor features are difficult to analyze further, as a result of their puzzling micro-effectiveness behavior. However, all the gray-level histogram features have a fairly good discriminative capability, and the RILBP features give a moderate performance from both the macro- and micro-effectiveness points of view.
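The per-dimension accuracies reported alongside the RF scores in Figs. 7–11 come from the supervised k-means rule that simply assigns each pixel to the nearest class center; a minimal single-feature sketch (illustrative names, training and test arrays assumed to be given) is:

```python
import numpy as np

def single_feature_accuracy(x_train, y_train, x_test, y_test):
    """Accuracy of one texture dimension alone: nearest class-mean assignment."""
    classes = np.unique(y_train)
    centers = np.array([x_train[y_train == c].mean() for c in classes])
    pred = classes[np.argmin(np.abs(x_test[:, None] - centers[None, :]), axis=1)]
    return float((pred == y_test).mean())
```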
Fig. 16. Partial collapse interpretation results. (a) The best pixel-based interpretation result map in the Res 1 case (configuration: P-strip 2, HH polarization, RILBP texture). (b) The local dominant collapse assessment result map (configuration: P-strip 1, span image, 1.1 m resolution, GLCM texture). (c) The ground-truth map.
TABLE IV
OA AND QD/AD VALUES OF THE LOCAL DOMINANT COLLAPSE ASSESSMENT USING THE GLCM TEXTURE EXTRACTED FROM THE MULTIRESOLUTION SPAN IMAGES IN THE TWO P-BAND STRIPS

         Strip 1                       Strip 2
         OA        QD/AD               OA        QD/AD
Res 1    83.42%    0.050/0.115         82.41%    0.089/0.086
Res 2    78.09%    0.085/0.134         72.56%    0.123/0.151
Res 3    72.14%    0.096/0.183         63.79%    0.127/0.235
B. Case II: Resolution Impact Analysis

In Fig. 15, the OA values of the different interpretation results using the multiresolution pyramids of the X/P-band datasets are displayed. Undoubtedly, for both the P-band and the X-band, the results become more precise as the spatial resolution improves. Moreover, many of these broken lines are concave downward, and the improvements from X-Res 1/P-Res 2 to X-Res 0/P-Res 1 are usually larger than the improvements from X-Res 2/P-Res 3 to X-Res 1/P-Res 2. This means that, for the same step in resolution, the gain in accuracy is greater in the higher-resolution cases. One possible reason for this is that, when a smaller window size is used for the lower-resolution imagery, the texture calculation can be a biased statistical observation, because the number of samples in the window is so small.

According to the first half of the lines in Fig. 15, although the assessment results for Res 2 and Res 3 are not good enough, and the best P-band OA among them is only 70.1%, in Fig. 15(c), we still conducted an additional test. Considering each small block that corresponds to an approximately pure collapse level in the ground-truth map as one unit, a majority vote based on the predicted collapse levels of all the pixels in each small block was used to obtain the local dominant collapse level, similar to the conventional block-based assessment using postevent SAR imagery (a sketch of this aggregation is given below). As the GLCM has frequently been utilized in recent years [4], [6], [19], the local dominant collapse assessment result based on the GLCM texture is displayed in Fig. 16(b). Because the result of Zhao et al. [6] for the Yushu earthquake made use of our P-strip 1, as well as the GLCM features extracted from the span image, Table IV lists the corresponding OA and QD/AD values of the dominant collapse assessment using the P-band span images in the multiresolution pyramid for comparison.
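The local dominant (majority vote) aggregation described above, from per-pixel predictions to one collapse level per "single collapse type block," can be sketched as follows; block_id is an assumed per-pixel block label map, and the names are illustrative.

```python
import numpy as np

def dominant_collapse_level(pixel_pred, block_id):
    """Majority vote of the pixel-level predictions within each ground-truth block."""
    levels = {}
    for b in np.unique(block_id):
        votes = pixel_pred[block_id == b]
        levels[b] = int(np.bincount(votes).argmax())   # most frequent collapse level
    return levels
```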
Clearly, it can be seen that, in this case, even the interpretation performances using the Res 2 or Res 3 span images are relatively effective, by and large, demonstrating that non-VHR postevent SAR imagery also has the ability to discriminate the average collapse levels at a coarse scale. In addition, we used the span image from the same PolSAR data (P-strip 1) as that used by Zhao et al., resulting in an OA of 83.42% in Table IV; the corresponding dominant collapse interpretation map is displayed in Fig. 16(b). Another test using the P-strip 2 span image (Res 1) obtained an OA of 82.41%. Both results are quite good when compared with the results of Polli et al. and Zhao et al. [4], [6]. Despite the fact that the "single collapse type block" scale we used here is finer than the city-block scale used by our predecessors, the results still reveal the power of the texture information in these non-VHR images for block-based interpretation.

C. Case III: Channel Impact Analysis

Apart from the resolution and the texture descriptors, the various polarization state combinations and their derivatives (the span and texture images) can also have a significant impact on the postevent collapse interpretation, as shown in Figs. 6 and 15. Different polarization state combinations are sensitive to different scattering mechanisms: backscattered returns in the HH or VV basis reflect a mixture of surface scattering, double-bounce scattering, and some volume scattering, whereas returns in the HV basis mainly reflect volume scattering. This indicates that HV is not appropriate for collapse interpretation. The reason for this is that intact buildings and their surroundings usually show very complex backscattering mechanisms, e.g., surface scattering from the ground near the buildings and double-bounce scattering between the facing walls and the ground, but these significant features of intact buildings are "blurred" in HV imagery. Therefore, the results for the HV polarization basis are worse than the others in Fig. 15. However, this disadvantage of the HV basis has different impacts on the different textures. As some significant features in the HH/VV basis are no longer obvious in the HV basis, after gray-level quantization, the differences between the local statistics directly based on the gray-level histogram and the co-occurrence matrix for the three collapse levels (e.g., variance and contrast) are reduced. Thus, for the gray-level histogram and GLCM, the HV results are much worse than the others in Fig. 15(a)–(d). On the other hand, although the RILBP is also a statistics-based method, it first thresholds the neighbors by the center gray value, and the corresponding binary code and local pattern histogram are then acquired. The "blurring"
In addition, the performances of the GMRF and Gabor features in the HV imagery are only a little worse than the others in Fig. 15(g) and (j), because GMRF only considers the contributions of the neighbors when generating facsimiles of the center pixel, rather than local statistical information, and Gabor filters can handle texture extraction in various frequency ranges.
On the whole, the performances in Fig. 15 are inconclusive except for HV, and no obvious superiority exists among the HH, VV, and span images. The differences in the OA values between HH and VV are also very small in most cases, especially for Res 1, so there seems to be no preference between them. Moreover, as it is difficult to confirm which combination of transmitting and receiving polarization states is most suitable for extracting texture information from a PolSAR dataset, the span image can be adopted, as it represents the total power of the polarimetric covariance matrix and has a lower speckle noise level than the single-polarization images. Furthermore, as HH, HV, and VV have different scattering characteristics, the scattering responses of features, which may appear differently in each polarization channel, are likely to appear in the span image [54]. However, it is worth mentioning that despite the results in the span images usually being a little better than HH and VV for Res 1, this superiority may not be kept at the lower resolutions (Res 2 and Res 3). The reason for this may be the smearing of significant features in the Res 2 and Res 3 span images, where the noise level is excessively reduced.
Furthermore, to investigate the micro-effectiveness impact in the P-band highest-resolution version, all five groups of texture variables were used to evaluate the interpretation performance. Considering the RF importance scores shown in Figs. 7–11, features were gradually added into the RF model, from the highest RF score to the lowest, to obtain a series of collapse assessment results, in order to compare the influences of the various channel configurations. In addition, considering PolSAR data as the product of two separate random processes, i.e., the polarimetric diversity and the texture diversity, which are governed by the polarization information and the spatial scene variations, respectively [55], [56], the SIRV model and the FP estimator can be employed to extract a pure "texture" image, decoupled from the polarization process, from the two PolSAR datasets [57]. For completeness of comparison, we also calculated the 122 textural features in the so-called texture images [57] and normalized texture images [56], and the corresponding interpretation results are included in the following discussion. As this topic is beyond the scope of this paper, the description of the related algorithm procedure is omitted here, and the two kinds of texture images generated from P-strip 1 are shown in Fig. 17 as an example.
In Figs. 18 and 19, the OA values of the collapse assessment with increasing input dimensionality are displayed. Because the focus here, from the micro-effectiveness point of view, is the impact of the polarization configuration rather than the interpretation accuracy itself, the highest input feature dimension was limited to 20 for simplicity, which was enough to investigate the general tendency of the OA change curve.
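As a rough illustration of this incremental procedure (not the authors' exact implementation), the sketch below ranks the texture features by random-forest importance and retrains the classifier with a growing feature subset. The training and test samples are assumed to be held in NumPy arrays, and the number of trees and other hyperparameters are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def oa_vs_input_dimension(X_train, y_train, X_test, y_test, max_dim=20):
    """Add texture features one by one, from the highest to the lowest
    random-forest importance score, and record the overall accuracy (OA)
    obtained with each input dimensionality."""
    # rank all features once with a forest trained on the full feature set
    ranker = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0)
    ranker.fit(X_train, y_train)
    order = np.argsort(ranker.feature_importances_)[::-1]

    oa_curve = []
    for d in range(1, max_dim + 1):
        cols = order[:d]
        rf = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0)
        rf.fit(X_train[:, cols], y_train)
        oa_curve.append(accuracy_score(y_test, rf.predict(X_test[:, cols])))
    return order[:max_dim], np.array(oa_curve)
```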
Fig. 17. Texture images generated from the P-band PolSAR strip 1 using the SIRV model and the FP estimator: (a) texture image; (b) normalized texture image.
Fig. 18. OA values of collapse assessment using a series of input dimensions according to the RF importance scores in the Res 1 P-strip 1 images. “TEX” and “NTEX” stand for the texture image and normalized texture image, respectively.
Fig. 19. OA values of collapse assessment using a series of input dimensions according to the RF importance scores in the Res 1 P-strip 2 images. “TEX” and “NTEX” stand for the texture image and normalized texture image, respectively.
Limited by the high complexity of texture research, relevant theoretical analyses of the impact of polarization on texture are very rare. Conventionally, the span image has been considered as the primary basis for image processing, for the reasons outlined above; e.g., the classical refined Lee PolSAR filter utilizes the span image to select an edge-aligned window [58]. However, span shows no advantage for Res 2 and Res 3 in Fig. 15, which implies that span is not always the best choice. Furthermore, the results in the Res 1 span images appear better than those in the HH, HV, and VV images, but when each individual feature is selected in turn to assess building collapse in Figs. 18 and 19, the span OA curves are not always at the top. Only when the features used reach a sufficient quantity does the span image start to show its advantage.
Fig. 20. Typical regions in three collapse levels in the SAR image and normalized texture image. The first row shows the SAR image patches, and the second row shows the normalized texture image patches.
Fig. 21. Building collapse assessment OA curves with increasing input dimensionality in the 0.5 m X-band image and the 1.1 m P-band span images.
However, if there is a requirement for rapid processing using limited inputs, the span image is probably not the best choice. For HH, HV, and VV, the performances are very similar to those in Fig. 15. There is still no preference between HH and VV, and HV is again worse than both, to varying degrees. As for the two additional images, the texture images perform a little worse than span, and the OA values corresponding to the normalized texture images are usually the best of all in the first few dimensions, but are then gradually overtaken. Fig. 20 shows the same typical regions of the three collapse levels in the SAR image and the normalized texture image; it is possible that in the normalized texture image the polarimetric information is removed and the underlying texture details which reflect the spatial scene variations are further uncovered. However, the discarded polarimetric information could also have benefited the building collapse assessment, which explains the reduced reliability of most of the features. On the other hand, the assumption of a positive scalar texture variable is still debatable, as a multitexture model for PolSAR data has recently been proposed [59]. Even so, the texture image greatly enhances some of the texture measures in the statistics-based methods in distinguishing collapse levels, which could help to improve the traditional block-based assessment [4], [6].
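Although the algorithmic details are omitted above, a compact sketch of the fixed-point (FP) texture estimation may help the reader follow this comparison. It implements the generic SIRV iteration in the spirit of [55]–[57], with illustrative variable names; the exact normalization used to produce the "normalized texture" image of [56] may differ from the simple trace constraint shown here.

```python
import numpy as np

def sirv_texture(k, n_iter=10):
    """Fixed-point (FP) estimation of the SIRV speckle covariance matrix and
    of the per-pixel texture parameter.

    k : complex array of shape (N, d) holding the d-dimensional scattering
        target vectors of N pixels (d = 3 for reciprocal PolSAR data).
    Returns the FP covariance estimate M (d x d) and the texture values
    tau (length N), with tau_i = k_i^H M^{-1} k_i / d.
    """
    N, d = k.shape
    M = np.eye(d, dtype=complex)                    # initialisation
    for _ in range(n_iter):
        Minv = np.linalg.inv(M)
        # quadratic form q_i = k_i^H M^{-1} k_i for every pixel
        q = np.real(np.einsum('ni,ij,nj->n', k.conj(), Minv, k))
        M = (d / N) * np.einsum('ni,nj,n->ij', k, k.conj(), 1.0 / q)
        M = d * M / np.trace(M).real                # fix the scale (trace = d)
    tau = np.real(np.einsum('ni,ij,nj->n', k.conj(), np.linalg.inv(M), k)) / d
    return M, tau
```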
Fig. 22. Best collapse interpretation result map using the 122 proposed textural features (configuration: X-band, HH polarization, 0.5 m resolution).
TABLE V CONFUSION MATRIX OF THE COLLAPSE INTERPRETATION RESULT USING THE 122 DIMENSIONS IN THE 0.5 M RESOLUTION X-BAND IMAGE AFTER 20 REPEATS
              MIN.              MOD.               SER.               PA
MIN.     65 563 ± 707      21 279 ± 588       3367 ± 197       0.727 ± 0.008
MOD.      8245 ± 602      174 128 ± 890     13 535 ± 562       0.889 ± 0.005
SER.      1145 ± 125       23 826 ± 711    154 453 ± 671       0.861 ± 0.004
UA      0.875 ± 0.006      0.794 ± 0.004     0.901 ± 0.003     0.847 ± 0.002
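For reference, the UA, PA, and OA entries in Table V follow directly from the mean confusion counts; the short check below reproduces them, under the assumption that the rows hold the reference data and the columns the classification results.

```python
import numpy as np

# Mean confusion counts of Table V (rows = reference, columns = classification)
cm = np.array([[65563,  21279,   3367],
               [ 8245, 174128,  13535],
               [ 1145,  23826, 154453]])

ua = np.diag(cm) / cm.sum(axis=0)   # user's accuracy per classified (column) class
pa = np.diag(cm) / cm.sum(axis=1)   # producer's accuracy per reference (row) class
oa = np.trace(cm) / cm.sum()        # overall accuracy

print(np.round(ua, 3), np.round(pa, 3), round(oa, 3))
# -> [0.875 0.794 0.901] [0.727 0.889 0.861] 0.847
```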
D. Case IV: Band Impact Analysis
As can be seen in Fig. 15, the comparison between the two bands is inconclusive. For the gray-level histogram and GLCM features, the OA of X-HH Res 1 (1.0 m) is better than that of P-HH Res 1 (1.1 m) in Fig. 15(a)–(d), and even X-HH Res 2 (2.0 m) is almost comparable to P-HH Res 1, indicating a significant advantage for the X-band. However, the other cases do not show this superiority, and the OA of X-HH Res 1 is almost always lower than that of P-HH Res 1, especially in Fig. 15(e)–(h); moreover, the P-band seems to be more stable when all the textural features are considered. This indicates that the different texture extraction methods tend to prefer different bands: the X-band is appropriate for the gray-level histogram and the GLCM texture, and the P-band is preferred for the RILBP, GMRF, and Gabor textures.
Since the X-band image has the highest spatial resolution, and the span images generally perform best in the P-band PolSAR datasets when the input dimension is high enough, the collapse assessment results using all 122 texture dimensions in the X-band image and the P-band span images are finally presented. Fig. 21 shows the OA curves with increasing input dimensionality. As expected, the X-band curves are always at the top owing to the high spatial resolution, and all the curves reach a steady state when the input dimensionality is more than 20, which conforms with the performance in Fig. 14, because the other dimensions have relatively low importance scores. The best collapse interpretation map, with an OA of 84.7% using all the features extracted from the X-band image, is shown in Fig. 22, and the confusion matrix is displayed in Table V. The confusion matrix in Table V shows an acceptable accuracy, with a user's accuracy (UA) between 0.79 and 0.90 and a producer's accuracy (PA) between 0.73 and 0.89, and, after 20 repeats, the result remains very stable, as the maximum standard deviation in the UA and PA is only 0.008.
TABLE VI TEST RESULTS USING THE VARIOUS COMBINATIONS OF TRAINING AND TEST SETS
However, although the overall result already shows a high level of credibility, the distinction between the moderate collapse level and the other two levels still needs to be improved. This deficiency is mainly caused by the relatively difficult demarcation of the moderate level from the minor and serious levels.
Furthermore, how to use knowledge based on pre-existing information to solve a different but related problem, which is the focus of transfer learning, is a very interesting question. Although this paper is aimed at classifying collapsed buildings only, a relevant transfer test was still conducted. Three training sample sets were randomly selected from the X-band and the two P-band HH intensity images, respectively, to train the RF models, and the three generated models were then utilized to classify the pixels of the other two images. For example, the RF model built using the training samples from the X-band HH image can be used to identify the collapse levels in the two P-band HH images. The experimental parameters were the same as described in Section III: 2% of the pixels available in one of the three images were randomly picked as a training set to build the RF model, and the sample selection and classification steps were repeated 20 times to obtain an average performance. The test results are listed in Table VI, and the best collapse interpretation map (train: P1-HH, test: X-HH) is shown in Fig. 23. Although all 122 dimensions were used together, the interpretation accuracies are not satisfactory. One reason for this is the relatively difficult demarcation of the collapse levels. What is more, the key issue may be that SAR images of the same area are greatly influenced by many factors, such as the frequency, incident angle, and flight strip, which results in the training and test samples not being independent and identically distributed.
Fig. 23. Collapse interpretation result map (train: P1-HH, test: X-HH).
However, the OAs in Table VI are between 55.83% and 71.11%, so they are at least superior to some of the results in Fig. 15, which implies that correlation between the texture information of the different postevent images does exist in the Yushu case. Considering the intrinsic attributes of the different SAR images, applying dedicated transfer learning methods with appropriate modifications could further improve the performance.
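For clarity, the transfer test described above can be organised as in the following sketch: sample 2% of the pixels of one image for training, classify all pixels of another image, and average the OA over 20 repeats. The dictionaries of feature arrays, the image keys, and the RF hyperparameters are illustrative placeholders rather than the authors' actual code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def cross_image_transfer_oa(features, labels, train_key, test_key,
                            train_fraction=0.02, n_repeats=20, seed=0):
    """Train an RF model on a small random sample of one image and evaluate
    it on the pixels of another image (e.g., train on 'P1-HH', test on 'X-HH').

    features : dict mapping an image key to an (N_pixels, 122) feature array
    labels   : dict mapping the same keys to the (N_pixels,) collapse levels
    """
    rng = np.random.default_rng(seed)
    oas = []
    for _ in range(n_repeats):
        X, y = features[train_key], labels[train_key]
        idx = rng.choice(len(y), size=int(train_fraction * len(y)), replace=False)
        rf = RandomForestClassifier(n_estimators=200, n_jobs=-1)
        rf.fit(X[idx], y[idx])
        oas.append(accuracy_score(labels[test_key],
                                  rf.predict(features[test_key])))
    return float(np.mean(oas)), float(np.std(oas))
```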
V. CONCLUSION AND FUTURE WORK
Recently, with the rapid increase in spatial resolution, it has become possible to use texture information to effectively assess building collapse using only postevent SAR imagery. Inspired by this point [4], [6], [14], this paper focuses on independently applying different kinds of texture in collapse interpretation. However, this paper is very different from the previous relevant works. In the previous works, the interpretation process flows were designed by experts and based on a priori hypotheses, resulting in instability when the threshold parameters change. Moreover, these studies adopted only a single measure based on the co-occurrence matrix to divide city blocks into several damage groups, which implies that once a block is classified incorrectly, the interpretation of the buildings in this block will also be wrong. In order to overcome these shortcomings and make full use of the spatial contextual information in VHR SAR images, five common texture descriptors (gray-level histogram, GLCM, RILBP, GMRF, and Gabor filters) were investigated in this paper, to identify the pixel-based building collapse level. The conclusions are drawn from three main aspects:
1) The results based on the processing flow in Section III demonstrate that the texture variables in postevent VHR SAR images have the power to classify buildings into three collapse levels at the pixel scale. Even if just a few important features are used, as in Fig. 21, the accuracy is already good, and when all 122 texture dimensions are used together, the best OA reaches 84.7%. If rapid interpretation is required, the accuracy is still acceptable (77.2%) when independently using the RILBP features in the P-band image. However, the distinction between the moderate collapse level and the other two levels still needs to be further improved.
2) A comprehensive comparison and analysis of the influence of bands, resolution, texture, and channels was carried out, according to not only an accuracy evaluation from the macro-effectiveness point of view, but also the individual dimension importance from the micro-effectiveness point of view. It was found that the X-band and the P-band have different impacts on the different texture measures, and that, with the same level of resolution increase, the gain in accuracy is greater in the higher-resolution cases. For the influence of polarization, the span image is generally the best choice when the resolution configuration and input dimension are both high enough, but at lower resolutions, or with only a few input features, the situation is reversed.
For the influence of texture, the macro- and micro-effectiveness levels are quite different. Although the interpretation result based on the gray-level histogram features is not the best, most of their dimensions make a good contribution to the collapse assessment, and some specific dimensions of the GLCM and RILBP features also have a good collapse identification capability.
3) Via microscopic observation of each individual dimension, various spatial relationships of the gray levels, which can reveal distinctions between collapse levels, were investigated. The variance of the local gray levels and the contrast and correlation measures of the co-occurrence matrix change to some extent with the collapse of buildings. Isolated bright points originating from scattered rubble and the boundaries between buildings and the associated shadows are significant characteristics of the different collapse cases. Moreover, there are also some indications that it will be worth further exploring texture effectiveness in different directions and frequency ranges.
It has to be said that this paper is by no means aimed at convincing the reader that the texture measures used in this paper are the optimal texture descriptors for SAR images, or that the independent application of spatial contextual information is the most suitable approach for postevent assessment; it is rather aimed at revealing the potential of texture information and enabling a deeper understanding of the complex spatial behaviors of buildings at different collapse levels. Furthermore, interpretation based on remote sensing imagery is very different from in situ surveys. Much detailed information can only be observed by experts in in situ surveys, and not in remote sensing images, so it is possible that the interpretation results underestimate the real damage. For example, if many cracks exist in a building, this is not slight "damage," but it could be considered as slight "collapse" in remote sensing interpretation. Therefore, in this paper, we focus on building collapse, not damage. In addition, the conventional damage classification and grading systems in seismology [e.g., the European Macroseismic Scale 1998 (EMS-98)] always take building types and materials into account. However, there are many different kinds of buildings in Yushu County: masonry structures, frame-masonry structures, frame structures, timber structures, rural houses, and so on. The data from the Yushu earthquake are therefore not suitable for an investigation of damage grading. However, according to EMS-98, if the building types are not considered, the ground-truth map used in this paper could be roughly mapped to the D0–D1 (minor), D2–D4 (moderate), and D5 (serious) damage levels.
Based on this preliminary study, our future work will focus on exploring more effective interpretation approaches combining other information (e.g., polarization), considering the conclusions that were reached in this paper. Because different cities and towns can present very different environments, due to urbanization and building types, postevent building collapse assessment in different cities and towns is a very challenging task. It is worth noting that this paper has investigated the potential of texture in the collapse assessment of the Yushu earthquake, but considering the complexity of urbanization and building types, the proposed approach still needs to be further explored in other stricken areas. Cities and towns which have similar building types and distributions to Yushu may be suitable for the methods used in this paper. It is unfortunate that there are no matched pre-event SAR images in the Yushu dataset. As a result, the comparison of collapse assessment using only postevent images and using pre-/postevent images will be undertaken in our future work.
REFERENCES [1] S. Kaya, P. Curran, and G. Llewellyn, “Post-earthquake building collapse: A comparison of government statistics and estimates derived from SPOT HRVIR data,” Int. J. Remote Sens., vol. 26, no. 13, pp. 2731–2740, Jul. 2005. [2] C. Corbane, D. Carrion, G. Lemoine, and M. Broglia, “Comparison of damage assessment maps derived from very high spatial resolution satellite and aerial imagery produced for the Haiti 2010 earthquake,” Earthq. Spectra, vol. 27, no. S1, pp. S199–S218, Oct. 2011. [3] M. A. Lodhi, “Multisensor imagery analysis for mapping and assessment of 12 January 2010 earthquake-induced building damage in Portau-Prince, Haiti,” Int. J. Remote Sens., vol. 34, no. 2, pp. 451–467, Jan. 2013. [4] F. Dell’Acqua, C. Bignami, M. Chini, G. Lisini, D. A. Polli, and S. Stramondo, “Earthquake damages rapid mapping by satellite remote sensing data: L’Aquila April 6th, 2009 event,” IEEE J. Sel. Topics Appl. Earth Observ. Remote Sens., vol. 4, no. 4, pp. 935–943, Dec. 2011. [5] L. Dong and J. Shan, “A comprehensive review of earthquake-induced building damage detection with remote sensing techniques,” ISPRS J. Photogramm., vol. 84, pp. 85–99, Oct. 2013. [6] L. Zhao, J. Yang, P. Li, L. Zhang, L. Shi, and F. Lang, “Damage assessment in urban areas using post-earthquake airborne PolSAR imagery,” Int. J. Remote Sens., vol. 34, no. 24, pp. 8952–8966, Nov. 2013. [7] M. Matsuoka and F. Yamazaki, “Use of satellite SAR intensity imagery for detecting building areas damaged due to earthquakes,” Earthq. Spectra, vol. 20, no. 3, pp. 975–994, Aug. 2004. [8] M. Chini, N. Pierdicca, and W. J. Emery, “Exploiting SAR and VHR optical images to quantify damage caused by the 2003 Bam earthquake,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 1, pp. 145–152, Jan. 2009. [9] G. A. Arciniegas, W. Bijker, N. Kerle, and V. A. Tolpekin, “Coherenceand amplitude-based analysis of seismogenic damage in Bam, Iran, using Envisat ASAR data,” IEEE Trans. Geosci. Remote Sens., vol. 45, no. 6, pp. 1571–1581, Jun. 2007. [10] A. R. Brenner and L. Roessing, “Radar imaging of urban areas by means of very high-resolution SAR and interferometric SAR,” IEEE Trans. Geosci. Remote Sens., vol. 46, no. 10, pp. 2971–2982, Oct. 2008. [11] D. Brunner, G. Lemoine, and L. Bruzzone, “Earthquake damage assessment of buildings using VHR optical and SAR imagery,” IEEE Trans. Geosci. Remote Sens., vol. 48, no. 5, pp. 2403–2420, May 2010. [12] S. Stramondo, C. Bignami, M. Chini, N. Pierdicca, and A. Tertulliani, “Satellite radar and optical remote sensing for earthquake damage detection: results from different case studies,” Int. J. Remote Sens., vol. 27, no. 20, pp. 4433–4447, Oct. 2006. [13] T. Balz and M. Liao, “Building-damage detection using post-seismic high-resolution SAR satellite data,” Int. J. Remote Sens., vol. 31, no. 13, pp. 3369–3391, Jul. 2010. [14] D. Brunner, K. Schulz, and T. Brehm, “Building damage assessment in decimeter resolution SAR imagery: A future perspective,” in Proc. IEEE Joint Urban Remote Sens. Event, Munich, Germany, 2011, pp. 217–220. [15] M. Watanabe, T. Motohka, Y. Miyagi, C. Yonezawa, and M. Shimada, “Analysis of urban areas affected by the 2011 off the Pacific Coast of Tohoku earthquake and tsunami with L-band SAR full-polarimetric mode,” IEEE Geosci. Remote Sens. Lett., vol. 9, no. 3, pp. 472–476, May 2012. [16] S. Chen and M. Sato, “Tsunami damage investigation of built-up areas using multitemporal spaceborne full polarimetric SAR images,” IEEE Trans. Geosci. Remote Sens., vol. 51, no. 4, pp. 
1985–1997, Apr. 2013. [17] X. Li, H. Guo, L. Zhang, X. Chen, and L. Liang, “A new approach to collapsed building extraction using RADARSAT-2 polarimetric SAR imagery,” IEEE Geosci. Remote Sens. Lett., vol. 9, no. 4, pp. 677–681, Jul. 2012. [18] L. Shi, W. Sun, P. Li, J. Yang, and L. Lu, “Building collapse assessment by the use of postearthquake Chinese VHR airborne SAR,” IEEE Geosci. Remote Sens. Lett., vol. 12, no. 10, pp. 2021–2025, Oct. 2015.
[19] D. Polli, F. Dell'Acqua, P. Gamba, and G. Lisini, "Earthquake damage assessment from post-event only radar satellite data," presented at the Int. Workshop Remote Sens. Disaster Manage., Tokyo, Japan, 2010. [20] A. R. Rao, A Taxonomy for Texture Description and Identification. New York, NY, USA: Springer-Verlag, 1990. [21] E. Aptoula, "Remote sensing image retrieval with global morphological texture descriptors," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 5, pp. 3023–3034, May 2014. [22] L. Liao, J. Yang, P. Li, and F. Hua, "Polarimetric calibration of CASMSAR P-band data affected by terrain slopes using a dual-band data fusion technique," Remote Sens., vol. 7, no. 4, pp. 4784–4803, Apr. 2015. [23] S. Quegan, "A unified algorithm for phase and cross-talk calibration of polarimetric data-theory and observations," IEEE Trans. Geosci. Remote Sens., vol. 32, no. 1, pp. 89–99, Jan. 1994. [24] L. Shi, J. Yang, and P. Li, "Co-polarization channel imbalance determination by the use of bare soil," ISPRS J. Photogramm., vol. 95, pp. 53–67, Sep. 2014. [25] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," Int. J. Comput. Vision, vol. 60, no. 2, pp. 91–110, Nov. 2004. [26] A. A. Popescu, I. Gavat, and M. Datcu, "Contextual descriptors for scene classes in very high resolution SAR images," IEEE Geosci. Remote Sens. Lett., vol. 9, no. 1, pp. 80–84, Jan. 2012. [27] T. Randen and J. H. Husoy, "Filtering for texture classification: A comparative study," IEEE Trans. Pattern Anal. Mach. Intell., vol. 21, no. 4, pp. 291–310, Apr. 1999. [28] X. Xie, "A review of recent advances in surface defect detection using texture analysis techniques," Electron. Lett. Comput. Vision Image Anal., vol. 7, no. 3, pp. 1–22, 2008. [29] N. Otsu, "A threshold selection method from gray-level histograms," IEEE Trans. Syst. Man Cybern., vol. SMC-9, no. 1, pp. 62–66, Jan. 1979. [30] I. Champion, C. Germain, J. P. Da Costa, A. Alborini, and P. Dubois-Fernandez, "Retrieval of forest stand age from SAR image texture for varying distance and orientation values of the gray level co-occurrence matrix," IEEE Geosci. Remote Sens. Lett., vol. 11, no. 1, pp. 5–9, Jan. 2014. [31] R. M. Haralick, K. Shanmugam, and I. H. Dinstein, "Textural features for image classification," IEEE Trans. Syst. Man Cybern., vol. SMC-3, no. 6, pp. 610–621, Nov. 1973. [32] F. T. Ulaby, F. Kouyate, B. Brisco, and T. H. L. Williams, "Textural information in SAR images," IEEE Trans. Geosci. Remote Sens., vol. 24, no. 2, pp. 235–245, Mar. 1986. [33] T. Ahonen, A. Hadid, and M. Pietikäinen, "Face description with local binary patterns: Application to face recognition," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 12, pp. 2037–2041, Dec. 2006. [34] C. Chan, J. Kittler, and N. Poh, "State-of-the-art LBP descriptor for face recognition," Local Binary Patterns: New Variants and Applications. New York, NY, USA: Springer, 2014, pp. 191–217. [35] Z. Guo and D. Zhang, "A completed modeling of local binary pattern operator for texture classification," IEEE Trans. Image Process., vol. 19, no. 6, pp. 1657–1663, Jun. 2010. [36] T. Mäenpää and M. Pietikäinen, "Multi-scale binary patterns for texture analysis," in Proc. Scand. Conf. Image Anal., Gothenberg, Sweden, 2003, pp. 885–892. [37] T. Ojala, M. Pietikäinen, and T. Mäenpää, "Multiresolution gray-scale and rotation invariant texture classification with local binary patterns," IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 7, pp. 971–987, Jul. 2002. [38] D. A. Clausi and B. Yue, "Comparing co-occurrence probabilities and Markov random fields for texture analysis of SAR sea ice imagery," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 1, pp. 215–228, Jan. 2004. [39] R. Chellappa and S. Chatterjee, "Classification of textures using Gaussian Markov random fields," IEEE Trans. Acoust. Speech Signal Process., vol. 33, no. 4, pp. 959–963, Aug. 1985. [40] Gabor Filter. 2016. [Online]. Available: https://en.wikipedia.org/wiki/Gabor_filter [41] M. Yang and L. Zhang, "Gabor feature based sparse representation for face recognition with Gabor occlusion dictionary," in Proc. Eur. Conf. Comput. Vis., Heraklion, Crete, Greece, 2010, pp. 448–461. [42] Y. B. Jemaa and S. Khanfir, "Automatic local Gabor features extraction for face recognition," Int. J. Comput. Sci. Inform. Sec., vol. 3, no. 1, pp. 1–7, Jul. 2009. [43] J. Kamarainen, "Gabor features in image analysis," in Proc. IEEE Image Process. Theory Tools Appl., Istanbul, Turkey, 2012, pp. 13–14. [44] A. C. Bovik, M. Clark, and W. S. Geisler, "Multichannel texture analysis using localized spatial filters," IEEE Trans. Pattern Anal. Mach. Intell., vol. 12, no. 1, pp. 55–73, Jan. 1990.
[45] B. S. Manjunath and W. Y. Ma, "Texture features for browsing and retrieval of image data," IEEE Trans. Pattern Anal. Mach. Intell., vol. 18, no. 8, pp. 837–842, Aug. 1996. [46] L. Shi, L. Zhang, L. Zhao, J. Yang, P. Li, and L. Zhang, "The potential of linear discriminative Laplacian eigenmaps dimensionality reduction in polarimetric SAR classification for agricultural areas," ISPRS J. Photogramm., vol. 86, pp. 124–135, Dec. 2013. [47] L. Loosvelt, J. Peters, H. Skriver, B. De Baets, and N. E. C. Verhoest, "Impact of reducing polarimetric SAR input on the uncertainty of crop classifications based on the random forests algorithm," IEEE Trans. Geosci. Remote Sens., vol. 50, no. 10, pp. 4185–4200, Oct. 2012. [48] S. van Beijma, A. Comber, and A. Lamb, "Random forest classification of salt marsh vegetation habitats using quad-polarimetric airborne SAR, elevation and optical RS data," Remote Sens. Environ., vol. 149, pp. 118–129, Jun. 2014. [49] R. Genuer, J. M. Poggi, and T. M. Christine, "Variable selection using random forests," Pattern Recognit. Lett., vol. 31, no. 14, pp. 2225–2236, Oct. 2010. [50] L. Breiman, "Random forests," Mach. Learn., vol. 45, no. 1, pp. 5–32, Oct. 2001. [51] T. Ojala, M. Pietikäinen, and D. Harwood, "A comparative study of texture measures with classification based on featured distributions," Pattern Recognit., vol. 29, no. 1, pp. 51–59, Jan. 1996. [52] P. T. Brett and R. Guida, "Earthquake damage detection in urban areas using curvilinear features," IEEE Trans. Geosci. Remote Sens., vol. 51, no. 9, pp. 4877–4884, Sep. 2013. [53] The GLCM Tutorial. 2007. [Online]. Available: http://www.fp.ucalgary.ca/mhallbey/tutorial.htm [54] J. S. Lee and E. Pottier, Polarimetric Radar Imaging: From Basics to Applications. Boca Raton, FL, USA: CRC Press, 2009. [55] A. Lopes and F. Séry, "Optimal speckle reduction for the product model in multilook polarimetric SAR imagery and the Wishart distribution," IEEE Trans. Geosci. Remote Sens., vol. 35, no. 3, pp. 632–647, May 1997. [56] G. Vasile, F. Pascal, and J. P. Ovarlez, "Heterogeneous clutter model for high-resolution polarimetric SAR data parameter estimation," presented at the POLinSAR Workshop, Frascati, Italy, 2011. [57] G. Vasile, J. Ovarlez, F. Pascal, and C. Tison, "Coherency matrix estimation of heterogeneous clutter in high-resolution polarimetric SAR images," IEEE Trans. Geosci. Remote Sens., vol. 48, no. 4, pp. 1809–1826, Apr. 2010. [58] J. S. Lee, M. R. Grunes, and G. De Grandi, "Polarimetric SAR speckle filtering and its implication for classification," IEEE Trans. Geosci. Remote Sens., vol. 37, no. 5, pp. 2363–2373, Sep. 1999. [59] T. Eltoft, S. N. Anfinsen, and A. P. Doulgeris, "A multitexture model for multilook polarimetric synthetic aperture radar data," IEEE Trans. Geosci. Remote Sens., vol. 52, no. 5, pp. 2910–2919, May 2014. [60] R. G. Pontius and M. Millones, "Death to Kappa: Birth of quantity disagreement and allocation disagreement for accuracy assessment," Int. J. Remote Sens., vol. 32, no. 15, pp. 4407–4429, Aug. 2011. [61] L. Shi, L. Zhang, J. Yang, L. Zhang, and P. Li, "Supervised graph embedding for polarimetric SAR image classification," IEEE Geosci. Remote Sens. Lett., vol. 10, no. 2, pp. 216–220, Mar. 2013.
Weidong Sun was born in Harbin City, Heilongjiang, China, in 1989. He received the B.S. degree in surveying and mapping from Jilin University, Changchun, China, in 2012. He is currently working toward the Ph.D. degree at the State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan, China. His current research interests include the processing and interpretation of synthetic aperture radar data, machine learning, and applications of PolSAR data to agriculture and earthquake damage assessment.
Lei Shi (M'13) received the B.S. degree in sciences and techniques of remote sensing from Wuhan University, Wuhan, China, and the Ph.D. degree in photogrammetry and remote sensing from the State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, in 2008 and 2013, respectively. Since 2013, he has been working with the SAR Image Group of the State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, and he is currently a Postdoctoral Research Fellow with the Department of Land Surveying and Geo-Informatics, Hong Kong Polytechnic University, Kowloon, Hong Kong. His current interests include SAR polarimetry, polarimetric interferometry, polarimetric image classification, and natural media remote sensing. Dr. Shi received the Hong Kong Scholar Award from the Society of Hong Kong Scholars and the Luojia Young Scholar Award from Wuhan University in 2015.
Jie Yang received the Ph.D. degree in photogrammetry and remote sensing from Wuhan University, Wuhan, China, in 2004. Since 2011, he has been a Professor at the State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan, China. His current research interests include SAR image understanding.
Pingxiang Li (M'06) received the B.S., M.S., and Ph.D. degrees in photogrammetry and remote sensing from Wuhan University, Wuhan, China, in 1986, 1994, and 2003, respectively. Since 2002, he has been a Professor at the State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan, China. His research interests include photogrammetry and SAR image processing.