Bayesian Approach for Landslide Identification from High-Resolution Satellite Images

Pilli Madalasa, Gorthi R K Sai Subrahmanyam, Tapas Ranjan Martha, Rama Rao Nidamanuri and Deepak Mishra

Abstract

Landslides are among the most severe natural catastrophes; they affect thousands of lives and cause colossal damage to infrastructure at local to regional scales. Detection of landslides is a prerequisite for damage assessment. We propose a novel method based on object-oriented image analysis using bi-temporal satellite images and a DEM. The proposed methodology involves segmentation, followed by extraction of spatial and spectral features of landslides and classification with a supervised Bayesian classifier. The framework is based on change detection of spatial features that capture the spatial attributes of landslides. The methodology has been applied to the detection and mapping of landslides of different sizes at selected study sites in Himachal Pradesh and Uttarakhand, India, using high-resolution multispectral images from the IRS LISS-IV sensor and a DEM from Cartosat-1. The detected landslides are compared and validated against inventory landslide maps. The results show that the proposed methodology can identify medium- and large-scale landslides efficiently.

P. Madalasa · R. R. Nidamanuri
Department of Earth and Space Sciences, Indian Institute of Space Science and Technology, Trivandrum, India

G. R. K. Sai Subrahmanyam (B)
Department of Electrical Engineering, Indian Institute of Technology Tirupati, Tirupati, Andhra Pradesh, India
e-mail: [email protected]

D. Mishra
Department of Avionics, Indian Institute of Space Science and Technology, Thiruvananthapuram, India

T. R. Martha
Geosciences Group, National Remote Sensing Centre, Hyderabad, India

© Springer Nature Singapore Pte Ltd. 2018
B. B. Chaudhuri et al. (eds.), Proceedings of 2nd International Conference on Computer Vision & Image Processing, Advances in Intelligent Systems and Computing 704, https://doi.org/10.1007/978-981-10-7898-9_2


1 Introduction

Natural catastrophes such as landslides, avalanches and floods cause acute environmental problems, substantial loss of life and colossal damage to infrastructure in mountain regions. Landslides rank third among natural calamities in terms of deaths. A landslide is defined as the movement of a mass of earth, debris or rock down a slope under the force of gravity [1]. The regions of India most vulnerable to landslides are the Western Ghats and the Himalayas. The Western Ghats are geologically stable and lie on an uplifted plateau. The Himalayas are among the youngest mountain ranges and are tectonically active. While the Western Ghats experience intermittent landslides, the Himalayan regions frequently experience massive landslides, with impact sites spread across several northern and north-eastern states of India. Every year, one person dies for every 100 sq. km in the Himalayan regions due to landslides. Overall, ∼12.6% of the total land area of India is affected by landslides.

Spatial prediction and mapping of landslides are valuable for disaster preparedness and management. Even though landslides are highly unpredictable, there is a spatial pattern in their occurrence, and landslides can recur within the same neighbourhood. Therefore, detecting the sites of historical landslides helps to predict future occurrences and is vital for assessing the damage caused to the landscape. Conventional methods of landslide detection rely mainly on visual interpretation of aerial and satellite images, aided by moderate field measurements. Traditionally, interpretation is carried out on vertical or oblique stereoscopic aerial photographs. In [2], nearly 60,000 earthquake-induced landslide scarps were mapped by visual interpretation. Visual interpretation is subjective, labour-intensive and time-consuming. Digital methods for analysing remote sensing satellite images for landslide mapping have been evolving [3].

In satellite imagery, landslide regions are identified from their spectral, spatial and contextual information and their morphometric features [4]. In [4], object regions pertinent to landslides are identified in multispectral images based on spectral information, shape and morphometric characteristics. The resulting objects are recognised as landslides in a further classification step and are categorised as debris slides, debris flows and rock slides using adjacency and morphometric criteria. Landslide regions can also be identified by change detection analysis: landslides are mapped by tracking the differences between orthorectified, co-registered images and digital elevation models (DEMs) acquired over the same geographical position at different times [4]. Using a change detection-based Markov random field (CDMRF) technique, [5] developed a near-automatic method for pixel-level landslide mapping from bi-temporal aerial photographs. With growing computational power, and machine learning techniques in particular, remote sensing image classification has gradually shifted from visual interpretation to computer-aided methods. The authors of [6] developed a semi-automated landslide detection and mapping method that combines OBIA with an SVM classifier trained on spectral, spatial and texture features.


Highlighting the frequent revisit capability and low cost of FORMOSAT-2 data, [7] proposed a method for landslide detection. To compensate for the poor spectral resolution of FORMOSAT-2, the authors used a log-polar wavelet packet transformation and slope derived from a DEM. Finally, a context-based classifier was used to identify homogeneous objects.

In this study, a novel method is proposed to detect landslides at object level. Bi-temporal satellite images, i.e. images captured before and after the occurrence of landslides, were used. The first step in object-oriented image analysis (OOIA) is segmentation, which divides the image into homogeneous regions such that each region contains samples with similar attributes. Here, segmentation is performed with the simple linear iterative clustering (SLIC) algorithm [3]. The next step is feature extraction: based on domain knowledge about landslides, six features are selected to eliminate non-landslide candidates, namely top-of-atmosphere reflectance (ToA), green normalised difference vegetation index (GNDVI), digital elevation model (DEM), slope, principal component analysis (PCA) and brightness. Change detection analysis is then performed on those features, extracted from the pre- and post-landslide images, that change due to landslides. A multinomial Naive Bayes (MNB) classifier was trained for each feature with its respective training segments. These classifiers were applied feature by feature, in the order shown in Fig. 5, to eliminate false positives. The methodology is based on OOIA and was implemented using tools such as Visual Studio, QGIS and ArcGIS.

Fig. 1 Block diagram


2 Proposed Methodology

An overview of the proposed Bayesian methodology is shown in Fig. 1. The main stages are image segmentation, feature extraction, forming the likelihoods from the histograms of the respective feature values of all segments, and designing a multinomial Naive Bayes (MNB) classifier for each feature.

2.1 Study Area and Imagery Data Sets

The study area considered in this research is the Uttarakhand region, and test cases are performed on Himachal Pradesh. Bi-temporal orthorectified satellite images are used for these regions. Bi-temporal satellite images are co-registered images of the same geographic location captured before and after the occurrence of a landslide. In this paper, they are referred to as pre- and post-landslide images. The images were acquired by the LISS-IV sensor on the Resourcesat-2 satellite with a spatial resolution of 5.8 m. They have three spectral bands, viz. green (0.52 to 0.59 µm), red (0.62 to 0.68 µm) and near infrared (0.76 to 0.86 µm). The ground truth data provided by the NRSC, Hyderabad, was used as validation data.

2.2 Segmentation

The initial step in object-oriented image analysis is segmentation, which groups pixels with similar characteristics into segments. Many techniques exist for image segmentation, such as thresholding, clustering methods, compression-based methods, histogram-based methods, dual clustering, region growing, partial-differential-equation-based methods, variational methods, graph partitioning, watershed transformation, model-based segmentation, multiscale segmentation, semi-automatic segmentation and trainable segmentation. The algorithm used in this work to create image objects is simple linear iterative clustering (SLIC). SLIC decomposes an image into homogeneous regions using the K-means clustering algorithm. It divides the image into object primitives based on shape, size, colour and pixel topology, which can be controlled by the user. It has two parameters:

1. Region size: the nominal size of a region.
2. Regulariser: the strength of spatial regularisation.

The major steps of SLIC segmentation are as follows:

1. The image is divided into a grid of regions according to the user-defined region size.
2. The centre of each region is taken as an initial k-means centre.
3. The resulting clusters are then refined by the Lloyd algorithm [3].

The region size defines the object size; it should be small enough to resolve the objects of interest. Segmentation was performed on the post-landslide image, and each region receives a unique label. The labels of the post-landslide segmentation are assigned to all feature images, so that a feature vector can be formed for every segment and each object is tied to the same segment across features.
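The three SLIC steps listed above can be illustrated with a small sketch. This is not the implementation used in the study: it is a simplified, single-band version in plain Python, it searches all centres for every pixel (SLIC proper restricts the search to a local window), and the names `slic_sketch`, `region_size` and `regularizer` are hypothetical stand-ins for the two parameters described in the text.

```python
def slic_sketch(image, region_size, regularizer, n_iter=5):
    """Simplified SLIC-style segmentation of a 2-D list of intensities.

    Returns a 2-D list holding one integer segment label per pixel.
    """
    rows, cols = len(image), len(image[0])

    # Steps 1 and 2: lay a grid over the image and seed one centre per cell.
    centres = []
    for r0 in range(0, rows, region_size):
        for c0 in range(0, cols, region_size):
            r = min(r0 + region_size // 2, rows - 1)
            c = min(c0 + region_size // 2, cols - 1)
            centres.append([float(r), float(c), float(image[r][c])])

    labels = [[0] * cols for _ in range(rows)]
    for _ in range(n_iter):
        # Assignment: nearest centre in a joint (intensity, position) distance.
        for r in range(rows):
            for c in range(cols):
                best, best_d = 0, float("inf")
                for k, (cr, cc, cv) in enumerate(centres):
                    d_space = ((r - cr) ** 2 + (c - cc) ** 2) / region_size ** 2
                    d_value = (image[r][c] - cv) ** 2
                    d = d_value + regularizer * d_space
                    if d < best_d:
                        best, best_d = k, d
                labels[r][c] = best

        # Step 3 (Lloyd update): move each centre to the mean of its pixels.
        sums = [[0.0, 0.0, 0.0, 0] for _ in centres]
        for r in range(rows):
            for c in range(cols):
                s = sums[labels[r][c]]
                s[0] += r
                s[1] += c
                s[2] += image[r][c]
                s[3] += 1
        for k, (sr, sc, sv, n) in enumerate(sums):
            if n:
                centres[k] = [sr / n, sc / n, sv / n]
    return labels


# Toy image: dark left half, bright right half.
image = [[0, 0, 10, 10] for _ in range(4)]
labels = slic_sketch(image, region_size=2, regularizer=1.0)
for row in labels:
    print(row)
```

A larger `regularizer` favours compact, grid-like segments, while a smaller one lets segments follow intensity boundaries more closely.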

2.3 Extracting Features

Based on domain knowledge, the landslide diagnostic features ToA, GNDVI, PCA, DEM, slope and brightness are selected [3]. ToA, GNDVI and PCA are extracted from the orthorectified pre- and post-landslide satellite images, and slope is extracted from the DEM. These features help to discriminate landslides efficiently, i.e. to reduce false positives.

Top-of-atmosphere reflectance (ToA). The physical properties of the surface are measured remotely through the reflectance from the ground captured by the satellite. This reflectance is affected by clouds, atmospheric aerosols and gases. To work with the reflectance as measured above the atmosphere, the ToA reflectance is computed as

$$\rho = \frac{\pi L_\lambda d^2}{\mathrm{ESUN}_\lambda \cos\theta_s} \tag{1}$$

where

$$L_\lambda = \frac{\mathrm{LMAX}_\lambda - \mathrm{LMIN}_\lambda}{\mathrm{QCALMAX} - \mathrm{QCALMIN}}\,(\mathrm{QCAL} - \mathrm{QCALMIN}) + \mathrm{LMIN}_\lambda \tag{2}$$

and
L_λ = spectral radiance
ESUN_λ = mean solar exoatmospheric irradiance
θ_s = solar zenith angle
QCAL = digital number
LMIN_λ = spectral radiance scaled to QCALMIN
LMAX_λ = spectral radiance scaled to QCALMAX
QCALMIN = minimum quantised calibrated pixel value
QCALMAX = maximum quantised calibrated pixel value
d = Earth–Sun distance in astronomical units.

Green normalised difference vegetation index (GNDVI). GNDVI is sensitive to plant condition: small changes in the chlorophyll content of plants can be identified using GNDVI. After a landslide, vegetation is lost on the affected slopes, so GNDVI is a good measure for discriminating landslides. It is computed from the ToA reflectance as follows:

$$\mathrm{GNDVI} = \frac{\mathrm{ToA}_{NIR} - \mathrm{ToA}_{Green}}{\mathrm{ToA}_{NIR} + \mathrm{ToA}_{Green}} \tag{3}$$
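As a worked illustration of Eqs. (1) to (3), the following sketch converts a digital number to radiance, then to ToA reflectance, and finally computes GNDVI. The function names are hypothetical, and the numeric values in the example are arbitrary illustrative inputs, not LISS-IV calibration constants.

```python
import math


def spectral_radiance(qcal, lmin, lmax, qcalmin, qcalmax):
    """Eq. (2): rescale a digital number (QCAL) to spectral radiance."""
    gain = (lmax - lmin) / (qcalmax - qcalmin)
    return gain * (qcal - qcalmin) + lmin


def toa_reflectance(radiance, esun, sun_zenith_deg, earth_sun_dist_au):
    """Eq. (1): top-of-atmosphere reflectance from spectral radiance."""
    cos_theta = math.cos(math.radians(sun_zenith_deg))
    return math.pi * radiance * earth_sun_dist_au ** 2 / (esun * cos_theta)


def gndvi(toa_nir, toa_green):
    """Eq. (3): green normalised difference vegetation index."""
    total = toa_nir + toa_green
    # Guard against division by zero for pixels with no signal.
    return 0.0 if total == 0 else (toa_nir - toa_green) / total


# Illustrative inputs only (assumed values, not sensor constants).
radiance = spectral_radiance(qcal=500, lmin=0.0, lmax=50.0,
                             qcalmin=0, qcalmax=1000)
reflectance = toa_reflectance(radiance, esun=1500.0,
                              sun_zenith_deg=30.0, earth_sun_dist_au=1.0)
index = gndvi(toa_nir=0.4, toa_green=0.1)
print(radiance, reflectance, index)
```

In practice the same functions would be applied band by band to every pixel, with the calibration constants taken from the image metadata.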

Slope. Landslides mostly occur on steep slopes, but they may also occur in areas of low relief or gentle slope gradient. Slope is also used to identify false positives such as river sediments, which have the same brightness as landslides [3]. Slope is generated from the DEM using the Spatial Analyst tool in ArcGIS.

Principal component analysis (PCA). The training assessment in [3] found that the signature of newly triggered landslides was concentrated primarily in PCA4 and PCA5. Principal components are generated by stacking the pre- and post-landslide images: PCA combines all bands of the pre- and post-landslide LISS-IV images and transforms them into six uncorrelated components.
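To make the slope feature concrete, the sketch below derives a slope angle from a DEM grid with simple finite differences. It is a stand-in for illustration only: the study used the ArcGIS tool, which applies its own neighbourhood scheme, so values may differ slightly. The function name `slope_degrees` and the `cell_size` argument are assumptions.

```python
import math


def slope_degrees(dem, cell_size):
    """Slope angle in degrees for each cell of a DEM (2-D list of heights).

    Uses central differences in the interior and one-sided differences
    at the edges. The grid must have at least two rows and two columns.
    """
    rows, cols = len(dem), len(dem[0])
    slope = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            r0, r1 = max(r - 1, 0), min(r + 1, rows - 1)
            c0, c1 = max(c - 1, 0), min(c + 1, cols - 1)
            dz_dy = (dem[r1][c] - dem[r0][c]) / ((r1 - r0) * cell_size)
            dz_dx = (dem[r][c1] - dem[r][c0]) / ((c1 - c0) * cell_size)
            # Slope is the angle of the steepest gradient.
            slope[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return slope


# A plane rising 10 m per 10 m cell in the x direction.
dem = [[0.0, 10.0, 20.0] for _ in range(3)]
slope = slope_degrees(dem, cell_size=10.0)
print(slope[1][1])
```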

2.4 Change Detection

The changes that occur after a landslide can be identified by performing change detection analysis (CDA) on the pre- and post-landslide images. CDA is performed on features extracted from both images, such as ToA and GNDVI.

Difference in post-ToA and pre-ToA (DToA). After a landslide event, the affected areas appear more reflective in the post-landslide image than in the pre-landslide image because rock or soil has been exposed. The change between pre-ToA and post-ToA therefore helps to discriminate landslides from non-landslides:

$$\mathrm{DToA}(B_i) = \mathrm{ToA}_{Post}(B_i) - \mathrm{ToA}_{Pre}(B_i), \quad i = 1, 2, 3 \tag{4}$$

where B_i is the i-th spectral band.

Difference in GNDVI (DGNDVI). After a landslide is triggered, the vegetation changes. DGNDVI is used to identify these changes. Because of the vegetation cover, the GNDVI of the pre-landslide image is higher than that of the post-landslide image. The change is computed as

$$\mathrm{DGNDVI} = \mathrm{GNDVI}_{pre} - \mathrm{GNDVI}_{post} \tag{5}$$
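Eqs. (4) and (5) can be evaluated per segment as sketched below. The paper does not state how pixel values are aggregated within a segment, so taking the segment mean is an assumption made here, and the function names are hypothetical.

```python
def segment_means(values, labels):
    """Mean of a per-pixel feature inside each segment label."""
    sums, counts = {}, {}
    for value_row, label_row in zip(values, labels):
        for value, label in zip(value_row, label_row):
            sums[label] = sums.get(label, 0.0) + value
            counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def dtoa(toa_pre, toa_post, labels):
    """Eq. (4) for one band: post-ToA minus pre-ToA, per segment."""
    pre = segment_means(toa_pre, labels)
    post = segment_means(toa_post, labels)
    return {label: post[label] - pre[label] for label in post}


def dgndvi(gndvi_pre, gndvi_post, labels):
    """Eq. (5): pre-GNDVI minus post-GNDVI, per segment."""
    pre = segment_means(gndvi_pre, labels)
    post = segment_means(gndvi_post, labels)
    return {label: pre[label] - post[label] for label in pre}


# Two segments: segment 0 brightens and loses vegetation, segment 1 is stable.
labels = [[0, 0], [1, 1]]
toa_pre = [[0.1, 0.1], [0.2, 0.2]]
toa_post = [[0.3, 0.3], [0.2, 0.2]]
gndvi_pre = [[0.6, 0.6], [0.5, 0.5]]
gndvi_post = [[0.1, 0.1], [0.5, 0.5]]

d_toa = dtoa(toa_pre, toa_post, labels)
d_gndvi = dgndvi(gndvi_pre, gndvi_post, labels)
print(d_toa, d_gndvi)
```

Note the opposite sign conventions: DToA is post minus pre, while DGNDVI is pre minus post, so both are positive for a typical new landslide.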

2.5 Multinomial Naive Bayes Classification

In machine learning, Naive Bayes classifiers are a family of simple probabilistic classifiers that use Bayes' theorem with naive independence assumptions between the features. The multinomial Naive Bayes classifier is a supervised probabilistic learning algorithm suited to classifying discrete features [8].

Landslide identification using likelihoods. Consider training data in which each sample has a feature vector X = {X_1, X_2, ..., X_n}. For each feature x, compute normalised histograms separately for the two classes, assuming class labels are available. These histograms are taken as the likelihoods p(x|C_i). The classifier is designed by overlaying the class-1 (C_1) and class-2 (C_2) histograms, as shown in Fig. 2. For a test segment, find the training histogram bin into which its feature value falls, check which class likelihood dominates in that bin and assign that class. For example, consider the two test segments x_1 and x_2 in Fig. 2: x_1 falls in a bin where class 1 has the maximum likelihood and is assigned to class 1, while x_2 falls in a bin where class 2 has the maximum likelihood and is assigned to class 2. In general, if p(x|C_1) < p(x|C_2), then x ∈ C_2; otherwise x ∈ C_1.

Fig. 2 Deciding landslide or non-landslide based on likelihood
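The histogram-based decision rule can be sketched for a single feature as follows. The bin edges, the training values and the assignment of C_1 to "landslide" are assumptions chosen for illustration; ties go to C_1, as in the rule stated above.

```python
def bin_index(value, edges):
    """Index of the histogram bin for a value; out-of-range values are clamped."""
    for i in range(len(edges) - 2):
        if value < edges[i + 1]:
            return i
    return len(edges) - 2


def normalised_histogram(values, edges):
    """Fraction of training values per bin, used as the likelihood p(x|C)."""
    counts = [0] * (len(edges) - 1)
    for value in values:
        counts[bin_index(value, edges)] += 1
    total = sum(counts)
    return [count / total for count in counts] if total else [0.0] * len(counts)


def classify(value, edges, hist_c1, hist_c2):
    """Return 'C2' if p(x|C1) < p(x|C2) in the value's bin, else 'C1'."""
    b = bin_index(value, edges)
    return "C2" if hist_c1[b] < hist_c2[b] else "C1"


# Hypothetical training values of one feature (for example DToA).
edges = [0.0, 0.2, 0.4, 0.6]
landslide_train = [0.30, 0.35, 0.40, 0.45]      # class C1 (assumed: landslide)
non_landslide_train = [0.00, 0.05, 0.10, 0.30]  # class C2 (assumed: non-landslide)

hist_c1 = normalised_histogram(landslide_train, edges)
hist_c2 = normalised_histogram(non_landslide_train, edges)

print(classify(0.05, edges, hist_c1, hist_c2))
print(classify(0.50, edges, hist_c1, hist_c2))
```

Because each feature gets its own pair of histograms, the same three functions can be reused unchanged for every feature in the cascade.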

3 Results and Discussion

For the analysis of landslide detection, bi-temporal LISS-IV multispectral images of Chamoli District in the Uttarakhand region, obtained from the NRSC, Hyderabad, India, were used (Fig. 3a and 3b). The study area contains small, medium and large landslides. This section discusses the results and inferences of the proposed methodology. The pink colour in the output images marks the reference shape file provided by the NRSC, Hyderabad.


3.1 Segmentation

SLIC segmentation was applied to the post-landslide image. A heuristically fixed segmentation scale parameter of 30 was used for the study area to create the objects shown in Fig. 4a.

3.2 Extracting Features and Change Detection

For every segment in the post-landslide image, the features were extracted as described in Sect. 2.3. Then, change detection was performed.

Fig. 3 Pre- and post-landslide clippings of Chamoli District, Uttarakhand: (a) pre-landslide image, (b) post-landslide image

Fig. 4 Segmentation: (a) segmented image, (b) original image


Fig. 5 Steps in eliminating false positives

3.3 Multinomial Naive Bayes Classifier

Training segments of landslide and non-landslide are extracted manually by overlaying the post-landslide segmented image on the reference landslide map. For every training segment, all the features described in Sects. 2.3 and 2.4 are extracted. The training features of the landslide and non-landslide segments are then used to create separate normalised histograms for each feature. These histograms are shown in Figs. 6 and 8 and are taken as the likelihood p(x|C_i) of a particular feature value for a landslide or non-landslide region. For a test object, the likelihoods p(x|C_i) of its feature values are read from the trained MNB classifiers. If the likelihood of the object belonging to the landslide class is greater than that of the non-landslide class, the object is classified as a landslide; otherwise it is classified as a non-landslide, as shown in Fig. 2.

Elimination of false positives. Classification is performed in a cascade: segments are first classified with DToA, and those that do not qualify as landslide regions are excluded immediately. The remaining segments are then classified with the other features in the cascaded order shown in Fig. 5. The pink colour in the images marks the reference landslide map. The nonzero segments are the identified landslides.

Classification with difference in ToA. After a landslide, the affected region looks brighter than the non-landslide region. The Bayesian decision rule is applied with the DToA MNB classifier (Fig. 6) to every segment. The result is shown in Fig. 7a.

Classification with slope and DEM. To further reduce the number of false positives, classification was performed on the DEM and on slope, which is derived from the DEM. These features help to reduce the false positives caused by river sands. The result of this classification is shown in Fig. 7b.

Classification with PCA4 and PCA5. PCA4 and PCA5 are used to detect small changes triggered by landslides. Many research works suggest that PCA4 and PCA5 are good features for identifying new landslides (Fig. 8). PCA is applied after the DToA, slope and DEM classifiers to remove further false positives. The result is shown in Fig. 9a.

Classification with difference in GNDVI. After a landslide, vegetation is reduced, but some vegetated segments still remain in Fig. 9a. To remove these false positives, classification is performed with DGNDVI. The result is shown in Fig. 9b.
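The cascaded elimination described in this section can be sketched as a sequence of filters over the candidate segments. In the sketch below the per-feature decisions are simple thresholds that stand in for the trained histogram classifiers; the thresholds, feature values and the name `cascade` are hypothetical and not taken from the study.

```python
def cascade(segments, stages):
    """Apply per-feature decisions in order, dropping rejected segments.

    segments: dict mapping segment label -> dict of feature name -> value.
    stages:   ordered list of (feature name, decision function) pairs; the
              decision function returns True if the value looks like a landslide.
    Returns the surviving labels and the candidate count after each stage.
    """
    candidates = set(segments)
    history = []
    for feature, looks_like_landslide in stages:
        candidates = {label for label in candidates
                      if looks_like_landslide(segments[label][feature])}
        history.append((feature, len(candidates)))
    return candidates, history


# Hypothetical segments: 2 mimics river sand (bright but flat),
# 3 shows no brightening, 4 keeps its vegetation.
segments = {
    1: {"dtoa": 0.20, "slope": 35.0, "dgndvi": 0.30},
    2: {"dtoa": 0.25, "slope": 3.0, "dgndvi": 0.30},
    3: {"dtoa": -0.01, "slope": 40.0, "dgndvi": 0.20},
    4: {"dtoa": 0.20, "slope": 30.0, "dgndvi": 0.00},
}

# Placeholder thresholds standing in for the per-feature MNB classifiers.
stages = [
    ("dtoa", lambda v: v > 0.10),
    ("slope", lambda v: v > 15.0),
    ("dgndvi", lambda v: v > 0.10),
]

landslides, history = cascade(segments, stages)
print(landslides, history)
```

Ordering the stages so that the cheapest or most discriminating feature comes first keeps the number of segments passed to later stages small.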


Fig. 6 Feature (x-axis) versus likelihood (y-axis) of landslide and non-landslide

Fig. 7 Eliminating non-landslide candidates with (a) ToA difference and (b) slope angle and DEM

4 Evaluation Metrics

To evaluate the above-mentioned methods, recall (R) was calculated. Recall, also referred to as the true positive rate or sensitivity, measures the fraction of positives that are correctly identified. Let T_p, F_p and F_n denote the numbers of true positives, false positives and false negatives, respectively. Then

$$R = \frac{T_p}{T_p + F_n} \tag{6}$$
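Eq. (6) can be checked directly against the T_p and F_n counts listed in Table 1. The sketch below is only an arithmetic illustration; note that for the counts 287 and 90 it yields roughly 76.1%, which differs from the 79.12 printed in the table, and the text does not indicate which of the two figures is intended.

```python
def recall(true_positives, false_negatives):
    """Eq. (6): fraction of actual positives that were detected."""
    return true_positives / (true_positives + false_negatives)


# Counts as listed in Table 1.
recall_first_column = recall(287, 90)
recall_second_column = recall(300, 77)

print(f"{100 * recall_first_column:.2f}%")
print(f"{100 * recall_second_column:.2f}%")
```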


Fig. 8 Feature (x-axis) versus likelihood (y-axis) of landslide and non-landslide

Fig. 9 Eliminating non-landslide candidates: (a) PCA4 and PCA5, (b) classification with difference in GNDVI and brightness

The evaluation metrics for the developed method are given in Table 1. The recall of the multinomial Naive Bayes classifier with all the features is higher than that of the classifier with selected features (Table 1).

Table 1 Summary of change detection based upon multinomial Naive Bayes classifier

Evaluation metrics    Naive Bayes with all features    Naive Bayes with selected features
Tp                    287                              300
Fn                    90                               77
Recall (%)            79.12                            79.5

5 Conclusion

Landslide detection from satellite imagery using object-oriented image analysis with machine learning techniques has proved to be promising. In this study, small, medium and large landslides were identified from bi-temporal satellite images. The study shows that the features DToA, DGNDVI, slope, DEM, PCA and brightness help to discriminate landslides at object level. The contribution of each feature to the discrimination of landslides can be assessed using the multinomial Naive Bayes classifier. The results are validated against the data set provided by the NRSC, Hyderabad, and most of the landslides are correctly identified. However, the methodology needs to be improved to reduce false positives, which could be achieved by including additional parameters such as texture statistics in the classification.

References

1. Cruden, D.M.: A simple definition of a landslide. Bulletin of Engineering Geology and the Environment 43(1), 27–29 (1991).
2. Gorum, T., et al.: Distribution pattern of earthquake-induced landslides triggered by the 12 May 2008 Wenchuan earthquake. Geomorphology 133(3), 152–167 (2011).
3. Veena, V.S., Sai Subrahmanyam Gorthi, Tapas Ranjan Martha, Deepak Mishra, Rama Rao Nidamanuri: Automatic detection of landslides in object-based environment using open source tools (2016).
4. Martha, T.R., Kerle, N., Jetten, V., van Westen, C.J., Kumar, K.V.: Characterising spectral, spatial and morphometric properties of landslides for semi-automatic detection using object-oriented methods. Geomorphology 116, 24–36 (2010).
5. Li, Z., Shi, W., Lu, P., Yan, L., Wang, Q., Miao, Z.: Landslide mapping from aerial photographs using change detection-based Markov random field. Remote Sensing of Environment 187, 76–90 (2016).
6. Heleno, S., Matias, M., Pina, P., Sousa, A.J.: Semiautomated object-based classification of rain-induced landslides with VHR multispectral images on Madeira Island. Nat. Hazards Earth Syst. Sci. 16, 1035–1048 (2016).
7. Chang, L.-W., Hsieh, P.-F., Lin, C.-W.: Landslide identification based on FORMOSAT-2 multispectral imagery by wavelet-based texture feature extraction. Geoscience and Remote Sensing Symposium (2006).
8. McCallum, A., Nigam, K., et al.: A comparison of event models for naive Bayes text classification. AAAI-98 Workshop on Learning for Text Categorization 752, 41–48 (1998).
