Environ Earth Sci (2012) 66:859–877 DOI 10.1007/s12665-011-1297-0

ORIGINAL ARTICLE

Landslide susceptibility assessment: what are the effects of mapping unit and mapping method?

A. Erener • H. S. B. Düzgün

Received: 1 December 2008 / Accepted: 8 August 2011 / Published online: 20 August 2011 © Springer-Verlag 2011

Abstract Landslide susceptibility assessment forms the basis of any hazard mapping, which is one of the essential parts of quantitative risk mapping. For the same study area, different susceptibility maps can be obtained depending on the susceptibility mapping method, mapping unit, and scale. Although there are various methods of obtaining susceptibility maps, the efficiency and performance of each method should be evaluated. In this study the effect of mapping unit and susceptibility mapping method on landslide susceptibility assessment is investigated. To analyze the effect of the susceptibility mapping method, logistic regression (LR), which is widely used in landslide susceptibility mapping, and spatial regression (SR), which has not been used for landslide susceptibility mapping, are selected. The susceptibility maps with logistic and spatial regression models are obtained using two different mapping units, namely slope unit-based and grid-based mapping units. The procedure for investigating the effect of mapping unit on different susceptibility mapping methods is applied to the Kumluca watershed, in Bartin Province of the Western Black Sea Region, Turkey. Eighteen factor maps are prepared for landslide susceptibility assessment in the study region. Geographic information systems and remote sensing techniques are used to create the landslide factor maps, to obtain susceptibility maps and to compare the results. The relative operating characteristics (ROC) curve is used to compare the predictive abilities of each model and mapping unit, and the accuracy is also evaluated against the observations made during field surveys. By analyzing the area under the ROC curve for grid-based and slope unit-based mapping units, it can be concluded that the SR model provides better predictive performance (0.774 in grids and 0.898 in slope units) than the LR model (0.744 in grids and 0.820 in slope units). This result is also supported by the accuracy analysis. For both mapping units, the SR model provides more accurate results (0.55 for grids and 0.57 for slope units) than the LR model (0.50 for grids and 0.48 for slope units). The main reason for this better performance is that the spatial correlations between the mapping units are incorporated into the model in SR, whereas they are not considered in the LR model.

Keywords Susceptibility mapping • Geographic information systems (GIS) • Logistic regression (LR) • Spatial regression (SR) • Grid-based mapping unit • Slope unit-based mapping unit

A. Erener (✉)
Department of Geomatics Engineering, Faculty of Engineering and Architecture, Selcuk University, Konya, Turkey

H. S. B. Düzgün
Department of Mining Engineering, Middle East Technical University, Ankara, Turkey

Introduction

Quantitative landslide hazard assessments at the regional scale have received increased attention in recent years as they form the basis of quantitative landslide risk maps. Although there are several challenges in obtaining quantitative landslide risk maps at the regional scale (van Westen et al. 2005), recent research on the issue may lead to promising results in the near future. Quantitative risk mapping for landslides has two components, namely hazard and consequence maps (Einstein 1988). Hazard mapping should contain information about the spatial and temporal probability of occurrence of a landslide (Varnes 1984). The spatial probability of landslide occurrence for a region is obtained from the landslide susceptibility maps. Although landslide susceptibility maps do not carry any temporal implication related to landslide occurrence, they are important for delineating landslide-prone zones. For this reason they are the first stage in hazard mapping.

The choice of susceptibility mapping method depends on a variety of factors: mapping unit, scale of investigation, type of method used, type of landslide, type of data required, triggers, and purpose of mapping. In order to assess the quality of susceptibility maps, these factors have to be investigated in detail. Production of a landslide susceptibility map for a region starts with the selection of a suitable mapping unit (Guzzetti et al. 2005). Selection of the mapping unit largely influences all the subsequent analyses and modeling. A mapping unit is a division of the land surface such that it contains a set of ground conditions which differ from the adjacent units across definable boundaries (Hansen 1984). After the mapping unit is determined, a value is assigned to each unit for each factor taken into consideration, and each unit is treated as a case or sampling unit in the susceptibility mapping. Various methods have been proposed to partition the terrain (Meijerink 1988; Carrara et al. 1995; Leroi 1996) and each has certain advantages and limitations. The methods for tessellating the territory can be named simply as grid cells, terrain units, unique-condition units, slope units and topographic units (Carrara 1983; Meijerink 1988; Pike 1988; Carrara et al. 1991, 1999; van Westen 1993a; Bonham Carter 1994; Chung and Fabbri 1995; van Westen et al. 1997; Guzzetti et al. 2000, 2006; Hearn and Griffiths 2001; Lee and Min 2001; Saha et al. 2002; Xie et al. 2004; Komac 2006; Castellanos Abella and van Westen 2008).
The grids are obtained by simply overlaying regular quadrats of pre-defined size on the study region. Hence the selection of the grid size is a problematic issue: too small grids decrease the efficiency of computation, and too large grids smooth out the local details (Guzzetti et al. 1999; Carrara et al. 1995). Generally, a grid size of 10 × 10 m is preferred for 1:10,000 scale, 20 × 20 m for 1:25,000 scale, and 20 × 20 m or 50 × 50 m for 1:50,000 scale. Unique-condition units are constructed by the overlay of different categorical maps, so each map unit is defined by a unique combination of attributes (Begueria and Lorente 1999). Carrara et al. (1995) claim the technique to be fully objective; its main weakness is the inherent subjectivity in factor classification. The slope units are formed by subdividing the study region based on certain hydrological criteria, where possible slope boundaries are delineated. The main limitation of this mapping unit is the difficulty of manually identifying sub-basin boundaries. However, automatic tools developed in geographical information systems (GIS) can automate the construction of slope units by the intersection of drainage lines and divides.

In this study, grid and slope units are selected for evaluating their effect on the two selected susceptibility mapping methods (logistic regression and spatial regression). As constructing grids for a study region is easier in GIS and provides computational efficiency, the grid is one of the most frequently used mapping unit types in susceptibility mapping. However, the determination of grid size is subjective and grids may not represent a physical phenomenon (e.g. the terrain parts with no slope in the study region are also taken into account although they do not have any landslide potential). In this respect, since the slope units are formed based on hydrological considerations, the geomorphology of the study region is considered in the slope unit-based mapping unit. However, obtaining slope units requires more effort and time.

The susceptibility maps are also sensitive to the selected method, which can be classed as either qualitative or quantitative. Some examples and reviews of the concepts, principles, and techniques for landslide susceptibility mapping methods are Brabb et al. (1972), Carrara et al. (1978), Carrara (1983, 1988), Brabb (1984), Varnes (1984), Crozier (1986), Einstein (1988), Hartlen and Viberg (1988), Mulder (1991), van Westen (1993b, 1994), Hutchinson (1995), Leroi (1996), Soeters and van Westen (1996), Aleotti and Chowdhury (1999), Guzzetti et al. (1999), Miles and Ho (1999), Wu and Abdel-Latif (2000), van Westen (2004), Huabin et al. (2005), Chacon et al. (2006), Akgün and Bulut (2007), Bathrellos et al. (2009), Dahal et al. (2008), Choi et al. (2009), Yilmaz (2009), Pradhan (2010a), and Pradhan and Lee (2010). With the development of technology, GIS and remote sensing (RS) techniques are extensively used to produce landslide susceptibility maps.
RS techniques are preferable for extracting data over larger regions, and the resultant data can be efficiently used as variables in the susceptibility analysis. Additionally, GIS techniques are used to archive, process, analyze and display geo-referenced data. Recent susceptibility mapping techniques are applied in GIS (e.g. Carrara et al. 1992; van Westen 1993a; Guzzetti et al. 2000; Dai et al. 2001, 2002; Lee 2004, 2005; Lee et al. 2004; Lan et al. 2004; Erener and Düzgün 2006; Yalcin and Bulut 2007; Erener et al. 2007; Wang et al. 2008; Pandey et al. 2008; Erener and Düzgün 2008; Pradhan et al. 2010).

In this study the two different mapping units are evaluated using two different quantitative methods. The first selected method is logistic regression (LR), which is widely adopted in the literature (e.g. Menard 1995; Atkinson et al. 1998; Dai and Lee 2001, 2002; Lee and Min 2001; Ohlmacher and Davis 2003; Lee 2004; Can et al. 2005; Guzzetti et al. 2005; Yesilnacar and Topal 2005; Ayalew and Yamagishi 2005; Lee 2005; Duman et al. 2006; Zhu and Huang 2006; Mathew et al. 2008; Tunusluoglu et al. 2008; Pradhan 2010b; Yilmaz 2010). The LR method relates predictor variables to the occurrence or non-occurrence of landslides within mapping units. It uses this relationship to produce a map showing the probability of future landslide-prone areas (Ohlmacher and Davis 2003). Spatial regression (SR) is a spatial modeling technique in which spatial autocorrelation among the regression parameters is taken into account (Düzgün and Kemeç 2008; Erener and Düzgün 2010). Spatial regression has better explanatory performance than LR, especially when the phenomena have a spatial nature. The main distinction between the SR model and the traditional LR model is that SR considers the spatial dependence or autocorrelation between the mapping units, whereas the LR model completely ignores it.

In order to investigate the effect of using slope units and grids with SR and LR methods on susceptibility mapping, a case study area is selected: the Kumluca watershed, covering an area of 330 km², in Bartin Province of the Western Black Sea Region, Turkey. Eighteen factor maps (DEM, slope, aspect, curvature, plan curvature, profile curvature, wetness index, distance to hydrology network, density of hydrology network, distance to road network, density of road network, geological formations, distance to fault lines, soil type, soil effective thickness, erosion coverage, land cover and vegetation cover) are prepared for landslide susceptibility analysis in the study region. Geographic information systems (GIS) and remote sensing (RS) techniques are used to create the landslide factor maps. The modeling for SR and LR is done in the Matlab and SPSS environments. The model results are then visualized and evaluated in the GIS environment.
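To make the notion of spatial autocorrelation between mapping units concrete, the following is a minimal sketch of the global Moran's I statistic, a standard measure of spatial autocorrelation. It is offered only as an illustration: the paper does not state that Moran's I was used, and the function names, the line-shaped neighbourhood and the toy values are all assumptions.

```python
def morans_i(values, weights):
    """Global Moran's I for n units, given an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


def chain_weights(n):
    """Binary contiguity weights for n units arranged in a line.

    Hypothetical layout: real mapping units would use adjacency derived
    from the grid or the slope unit polygons.
    """
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


w = chain_weights(6)
clustered = [0, 0, 0, 1, 1, 1]    # landslide units sit next to each other
alternating = [0, 1, 0, 1, 0, 1]  # landslide units never share a border

print(morans_i(clustered, w))    # positive: neighbouring units are alike
print(morans_i(alternating, w))  # negative: neighbouring units differ
```

A clearly positive value for the landslide indicator would suggest that neighbouring units are not independent, which is the situation in which a model that ignores spatial dependence, such as ordinary LR, is expected to lose explanatory power.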

Study region and data analyses

The Black Sea region is characterized by steep topography and is subject to heavy precipitation. Due to these adverse conditions, the region is prone to extensive and severe landsliding (Gökçeoğlu and Ercanoğlu 2002). The study area, the Kumluca watershed, is located in the middle part of the Black Sea Region. The watershed has an area of 330 km² and is located 15 km to the southwest of the city center of Bartin, Turkey (Fig. 1). The study area has both natural and artificial triggers for landsliding. While heavy precipitation, stream erosion of slope toes and weathering of the bedrock are the natural triggers, steeply and improperly cut slopes, poorly controlled surface drainage, and uncontrolled settlement and agricultural activities are the artificial triggers (Akgün and Bulut 2007). The reports about the landslides in the study region prepared by the General Directorate of Disaster Affairs (GDDA) show that the main triggers are intense rainfall, indirect human action and rapid snowmelt, which produce flash floods (GDDA 2007). Long-lasting rainfall periods and slow snowmelt processes were responsible for the rise of the groundwater table. Based on the 30-year rainfall database of meteorological stations around the study region, the annual mean rainfall ranges between 900 and 1,071 mm.

Based on the geology map obtained from the General Directorate of Mineral Research and Exploration (GDMRE 2007), there are six different lithologies in the region (Fig. 2). The largest areas are represented by sandstone–mudstone (70.6%) and conglomerate (23.7%), which together constitute nearly 95% of the area. The remaining four units cover only 3% of the region. Most of the study area is covered by the Ulus Formation, which is known to be susceptible to landslides in the region. Indeed, when the landslide reports obtained from the General Directorate of Disaster Affairs Bartın Division are evaluated for the spatial distribution of slope movements, it is found that most of the landslides are located in the Upper Cretaceous Ulus Formation. The Ulus Formation is composed mostly of thick sandstone levels and of sandy, loamy schist, claystone and loamy marl alternations at higher elevations (Landslide Reports 1985). It is known as a typical flysch sequence and is highly susceptible to weathering (Deveciler 1986; Demir and Ercan 1999; Ercanoğlu et al. 2004; Ercanoğlu 2005). The fault lines overlaid on the geologic map of the study region comprise 39 segments with a total length of 24.17 km. The average length of the segments is 619 m and 52% of the fault line segments have a length of approximately 30 m.

In the soil map of the area, there are five different soil types. The soil map of the region is obtained from the General Directorate of Rural Affairs (GDRA 2007). The soil type with the largest coverage in the region is brown forest soil (78%), followed by brown forest soil without lime (20%).

Fig. 1 Study region, Kumluca watershed, in the southwestern part of Bartin City, and map showing the landslide locations with two types of slides, type 1: active with depth >5 m, type 2: dormant with depth >5 m

Fig. 2 Geologic map of the study region overlaid with road network and fault lines. The legend refers to 1: alluvial, 2: andesite, 3: sandstone–mudstone, 4: marl, 5: limestone, 6: conglomerate (GDMRE 2007)

The remaining soils are colluvial soil, grey brown podzolic soil and alluvial soil (2%). The thickness of the soil units is classified into four classes: very thick soil (>90 cm), thick soil (90–50 cm), shallow soil (50–20 cm) and very shallow soil (20–0 cm). Shallow and very shallow soils occupy 93% of the region. The thick soil, which occupies 3.8% of the region, occurs predominantly in parts of the study region where dense landslide occurrences are observed. This is mainly due to the potential of such soil classes to retain water during precipitation and consequently lose shear strength. As there is a relationship between land use/cover and slope stability, especially for shallow landslides (Gömez and Kavzoglu 2005), it is beneficial to include the land use/cover map of the region in the susceptibility mapping. There are five land use classes in the study region: bare rock and debris, flood inundation area, dry farming, settlement and forest (Fig. 3). The land use map is obtained from GDRA (2007). Landslide activity increases in regions where the original vegetation cover has been removed or altered. In the study region, the dry farming land use class has a high number of landslide occurrences (Fig. 3). Similarly, it is observed that vegetation shows a high degree of relation with the landslide occurrences. When the ratio of all landslide-containing regions to landslide occurrences in each class is analyzed, it is clearly seen that more than 75% of the landslide-containing area is in the unvegetated region. This indicates that the modification of natural conditions by human activities such as forest harvesting could have a considerable effect on landslide activity in the study region.
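The four soil-thickness classes listed earlier in this section can be written as a small lookup, sketched below. The class limits come from the text; how a depth that falls exactly on a class limit is treated, and the ordinal codes, are assumptions made only for illustration.

```python
def soil_thickness_class(depth_cm):
    """Return the soil thickness class for a depth given in centimetres.

    Assumption: a depth exactly on a class limit falls in the shallower
    class, because the text defines "very thick" as strictly more than 90 cm.
    """
    if depth_cm < 0:
        raise ValueError("depth must be non-negative")
    if depth_cm > 90:
        return "very thick"
    if depth_cm > 50:
        return "thick"
    if depth_cm > 20:
        return "shallow"
    return "very shallow"


# Hypothetical ordinal codes (1 = thinnest) for use as an ordered factor.
ORDINAL_CODE = {"very shallow": 1, "shallow": 2, "thick": 3, "very thick": 4}

for depth in (10, 35, 90, 120):
    label = soil_thickness_class(depth)
    print(depth, label, ORDINAL_CODE[label])
```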


Fig. 3 The land use map of the region showing five different land use classes overlaid with landslide locations (GDRA 2007)

In the study region, there are 184 landslide locations in total, including dormant and active slides, covering approximately 47.68 km² of the study area, and the largest slide occupies 4.5 km². Due to these severe events, approximately 537 houses are reported to have been moved to safer regions between 1975 and 2005 (GDDA 2007). Slides and flows are the two main types of mass movements in the study region. Among the slide types, rotational (Fig. 4a, d) or complex (Fig. 4c) landslides are more common. The second most common type of failure is flow, mostly earth and debris flow in highly weathered alteration zones (Fig. 4b). The relative depth of failure surfaces was classified as shallow (depth < 5 m) and deep-seated (depth > 5 m) (Gökçeoğlu et al. 2005). All slides in the study area are deep-seated, as presented in Fig. 1. For simplicity, the activities of mass movements are classified into two groups, active and inactive. In Fig. 1, type 1 represents inactive slides with a depth >5 m, and type 2 represents active slides with a depth >5 m. Active landslides are defined as those currently moving, whereas inactive ones are relict according to WP/WLI (1993). In the study area, both active and inactive slides can be seen; approximately 86.41% of all landslides are active and the rest are inactive. The morphometric landslide size parameters vary widely: the width ranges from 30 m to several hundred meters, while the length ranges from 20 m to several kilometers.

Fig. 4 Landslides in Bartın Kumluca region: a rotational slide, b debris flow, c shallow complex slide and d rotational slide

Data sources

For the landslide susceptibility assessment, several parametric maps of landslide-influencing parameters are necessary, together with the landslide inventory. For this reason the first step was data collection and construction of a spatial database from which the relevant factors were extracted. From five different organizations, six different data sets are acquired: 1:25,000 scale topographic maps from the General Command of Mapping (GCM 2007), including contour map, hydrology and transportation information; 1:25,000 scale geological maps from GDMRE, including geological formations and fault lines; and 1:25,000 scale soil maps from GDRA, including the soil depth and soil erosion as well as land use maps (Table 1). Each data set includes different information related to landslide occurrences and has a different map projection. Before production of input data from these raw data sets,
all of the data sets are converted into the same map projection system. The Universal Transverse Mercator (UTM) projection with ED50 datum is used as the common map projection system. Then the raw data were processed to obtain suitable data layers for landslide susceptibility mapping. The factors affecting landsliding in the study region can be categorized into three types (Table 1): morphological factors (slope, aspect, curvature, plan curvature, profile curvature); geological factors (lithological formations of alluvium, andesite, clay stone, limestone, clay limestone, conglomerate; distance to fault; soil types of alluvial soil, grey brown podzolic soil, colluvial soil, brown forest soil, brown forest soil without lime; soil depth classes of very thick, thick, shallow, very shallow; erosion levels of less, medium, severe, very severe); and environmental factors (topographic wetness index, normalized difference vegetation index (NDVI), distance to road, distance to hydrologic network, road density, hydrologic density, land use classes of bare rock and debris, flood inundation area, dry farming, settlement and forest). In total, 37 factors are obtained by processing the raw data in the ArcGIS 9.1 environment. The data layers for the factors affecting landslides in the region are of continuous, categorical or ordinal data types. In this study all the categorical and ordinal data are converted into raster format in the GIS environment. Hence, the factors land use, soil, geology, soil depth and erosion are converted into a discrete or nominal data type by assigning a code. As a result, each category is represented as a binary variable, coded 1/0 for the presence/absence of a class of slope instability factors in the study region.

Table 1 Data acquired from different organizations and the variables created from raw data and their abbreviations used in the paper

Data format | Scale / resolution | Type | Variables | Measurement scale | Category name
Topographic map | 1:25,000 | Morphological | Elevation; Slope; Aspect; Curvature; Plan curvature; Profile curvature | Continuous | –
Topographic map | 1:25,000 | Environmental | Topographic wetness index | Continuous | –
ASTER (14 Spectral Band, Level 3A) (22.10.2005) | VNIR (3): 15 m; SWIR (6): 30 m; TIR (5): 90 m | Environmental | NDVI (normalized difference vegetation index) | Binary | –
Topographic map | 1:25,000 | Environmental | Distance to road; Distance to hydrologic network; Road density; Hydrologic density | Continuous | –
Geology map | 1:25,000 | Geological | Distance to fault lines | Continuous | –
Geology map | 1:25,000 | Geological | Lithologic formations | Categorical | Alluvial; Andesite; Sandstone–mudstone; Limestone; Marl; Conglomerate
Soil map | 1:25,000 | Geological | Soil type | Categorical | Alluvial soil; Grey brown podzolic soil; Colluvial soil; Brown forest soil; Brown forest soil without lime
Land use map | 1:25,000 | Environmental | Land use | Categorical | Bare rock and debris; Flood inundation area; Dry farming; Settlement; Forest
Soil map | 1:25,000 | Geological | Soil depth | Ordinal | Very thick; Very shallow; Thick; Shallow
Soil map | 1:25,000 | Geological | Erosion | Ordinal | Less; Moderate; Severe; Very severe

One of the key assumptions of susceptibility mapping is that slope failures in the future will be more likely to occur under the conditions which led to past and present slope movements (Varnes 1984; Carrara et al. 1991, 1995). For this reason, a landslide inventory for the study area should be available prior to susceptibility mapping. In this study, the information about the past states of landslides was acquired from the General Directorate of Mineral Research and Exploration (MRE) at 1:25,000 scale. In the Project of Turkish Landslide Inventory Mapping (Duman et al. 2001), mass movements are classified according to the terminology of Varnes (1978), i.e. slides, creep, falls, and flows. In the study area, slide type mass movements are dominant and flows are also present. This study separates flows from slide type movements. Hence, the flow deposits present in the Kumluca watershed were not included in the landslide inventory map and only the slides were involved in further susceptibility assessment studies.

Generation of mapping units

The selection of an appropriate mapping unit is the first step in landslide susceptibility mapping. The mapping unit also affects data processing during the preparation of factor layers to be used in susceptibility mapping. A mapping unit basically partitions the land surface into homogeneous regions. It is the minimum meaningful spatial unit in the analysis, as each unit is assigned a unique susceptibility value and each unit has a set of ground conditions that are different from its adjacent units (Begueria and Lorente 1999). In this study two different mapping units, namely the grid-based mapping unit and the slope unit-based mapping unit, are selected to investigate their effect on susceptibility mapping. When handling the two mapping units, it should be highlighted that the data given in Table 1 have to be prepared differently for each. The grid-based and slope unit-based mapping units have different forms; therefore, the 37 variables given in Table 1 change their characteristics in terms of size and shape between the two mapping units. In the grid-based mapping unit, a grid mesh of the selected size is overlaid on each variable. In this study, grids have a cell size of 20 × 20 m and the total number of cells is 810,005. Then the attributes of each variable are assigned to each grid cell. Hence for this study, the dimension of the data is 810,005 × 37, which is computationally a large data set. In the slope unit case, on the other hand, each data set in Table 1 is overlaid with the slope unit map created for the Kumluca watershed. There are 138 slope units obtained for the study area. Then the 37 variables in Table 1 are assigned to each slope unit using zonal statistical functions, which perform operations on a per-zone basis. As a result the dimension of the data becomes 138 × 37, which is computationally more convenient to handle.
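The per-zone assignment described above can be sketched as a zonal mean: every grid cell carries the identifier of the slope unit it falls in, and the cell values are averaged per unit. This is only an illustrative sketch. The paper does not say which zonal statistic was used for each variable, so the choice of the mean, the function name and the toy data are assumptions.

```python
from collections import defaultdict


def zonal_mean(zone_ids, values):
    """Average the cell values that share a zone (slope unit) identifier."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for zone, value in zip(zone_ids, values):
        sums[zone] += value
        counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Hypothetical example: five grid cells falling in two slope units.
zones = [1, 1, 2, 2, 2]
slope_deg = [10.0, 20.0, 30.0, 40.0, 50.0]

unit_means = zonal_mean(zones, slope_deg)
print(unit_means)  # one value per slope unit instead of one per cell
```

Collapsing many cells into one value per unit is what shrinks the data matrix from one row per grid cell to one row per slope unit, and it is also the source of the smoothing that the slope unit-based variables undergo.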
In the grid-based approach, it is difficult to handle a data set of 810,005 × 37. Thus, multicollinearity analysis is applied to reduce redundancy and hence the dimension of the data. Because the data take different forms for the two mapping units, the number of variables retained at the end of the multicollinearity analysis differs between them. In other words, slope units are larger than grids, and assigning data values to slope units involves more smoothing because zonal statistics are used. The multicollinearity analysis for the grid-based mapping unit reduced the 37 variables to 15 (Table 2). Similarly, the 37 variables are reduced to 20 (Table 2) in the slope unit-based mapping unit after multicollinearity analysis. As can be seen from Table 2, in the grid-based approach there are three morphological parameters, namely elevation, slope and aspect, while there are four morphological parameters (elevation, slope, aspect and curvature) in the slope unit-based approach. Similarly, the number of environmental and geological factors in the grid-based approach is smaller than in the slope unit-based approach. In the slope unit-based approach, curvature from the morphological factors and forest from the environmental factors are added to the factors of the grid-based approach. For the geological factors, sandstone–mudstone, brown forest soil and grey brown podzolic soil are common to both mapping units. More soil thickness parameters are found to be significant in the slope unit-based approach, whereas distance to fault is included in the list of geological factors in the grid-based approach. As can be seen from Table 2, the models from both mapping units contain similar morphological factors. The grid-based mapping unit does not contain curvature, as curvature is not physically meaningful at the level of a grid cell. Similarly for the environmental factors, almost all of the factors are the same except for forest, which is included in the slope unit-based mapping unit but not in the grid-based mapping unit. This is also related to the aggregation in the slope unit mapping approach as compared to grid-based mapping units, which is directly related to the modifiable areal unit problem (MAUP). The same situation is observed for the geological factors, where most of the factors are common to both models but two differ due to MAUP. When the nature of the excluded parameters is investigated, it can be seen that they are mostly variables derived from the main parameter set. For example, curvature is derived from the elevation data set, and hence including elevation already provides explanatory potential for landslide occurrence.
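The screening step behind this reduction can be sketched for continuous variables as a pairwise correlation check. The threshold of 0.7 is the criterion the paper reports for its data preparation. Everything else here is an assumption for illustration: the use of the Pearson coefficient, the function names and the toy values. The sketch also does not cover the cross tabulation applied to nominal variables.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def flag_correlated(columns, threshold=0.7):
    """List variable pairs whose absolute correlation exceeds the threshold."""
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) > threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical factor values for five mapping units.
factors = {
    "elevation": [100, 200, 300, 400, 500],
    "slope": [5, 9, 16, 19, 26],
    "aspect": [90, 10, 270, 45, 180],
}

print(flag_correlated(factors))
```

Flagged pairs are only candidates for removal. As the text goes on to explain, which member of a pair is kept is decided from the physical nature of the landslides in the area rather than from the statistics alone.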
In addition, although statistically significant correlation is obtained for some of the parameters, their inclusion in the model is decided based on the nature of landslides in the area. For instance, in the slope unit-based mapping unit, the distance to fault shows a high correlation with deep soil depth. The existence of such a correlation can hardly be explained when the physical nature of the region is considered; it is probably due only to statistical reasons. For this reason, and because the area is not seismically active, there was almost no information about the activity of the faults, and the landslides were mostly in the form of slides and flows, which are closely related to soil depth, the soil depth is included in the model instead of distance to fault.

Table 2 Variables retained at the end of the multicollinearity analysis for grid-based and slope unit-based mapping units

Mapping unit | Morphological factors | Environmental factors | Geological factors
Grid-based mapping unit | Elevation, slope, aspect | Distance to road, distance to hydrologic network, NDVI, dry farming | Sandstone–mudstone, distance to fault lines, colluvial soil, brown forest soil, grey brown podzolic soil, very thick soil, moderate erosion
Slope unit-based mapping unit | Elevation, slope, aspect, curvature | Distance to road, distance to hydrologic network, NDVI, dry farming, forest | Sandstone–mudstone, conglomerate, alluvial, brown forest soil, grey brown podzolic soil, brown forest soil without lime, very shallow soil, shallow soil, thick soil

Data preparation in grid-based mapping unit

Grid units are the most widely used mapping units in landslide susceptibility mapping as they are easier to obtain: the grids are formed simply by overlaying regular quadrats on the study region. Grid data processing is fast due to its matrix form. However, it may become computationally inefficient if the selected grid size (resolution) is small, as this produces an overwhelming number of grid cells. Hence, selection of the appropriate grid size for susceptibility mapping requires an initial cost-benefit analysis to determine the optimum cell size. Moreover, it may be difficult to express the relation between the grid cells and the geological, morphological or other terrain units. For this reason it is often problematic to assign landslide occurrence to many different cells that represent a single movement (Begueria and Lorente 1999). In this study, the data layers given in Table 1 are prepared in grid format using GIS, and the factor values are then assigned to the grid cells. As quantitative susceptibility mapping requires the establishment of regression equations, the factor values in the grids have to be in ratio/interval or nominal scale. This leads to further processing of categorical factor maps such as land use, geology and soil type maps. In this study, the categorical factor maps, which are land use, soil, geology, soil depth and erosion, were converted into ordinal or nominal data format. For this reason, each category has been represented as a binary variable, coded 1/0 for the presence/absence of a class. In order to evaluate the landslide characteristics, the landslide inventory map was converted into a Boolean image using ArcMap with a grid resolution of 20 m × 20 m. The study area has 810,005 pixels in total, of which 14.7% contain landslides (a pixel value of one is assigned to landslide-containing cells).
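The presence/absence coding of categorical factors and the Boolean landslide layer can be sketched as follows. The land use class names are those given in the paper; the function names, the cell indices and the sample cells are hypothetical, and the sketch is not the ArcMap workflow the authors used.

```python
def one_hot(labels, classes):
    """Code each label as a row of 1/0 flags, one column per class."""
    unknown = set(labels) - set(classes)
    if unknown:
        raise ValueError(f"labels outside the class list: {sorted(unknown)}")
    return [[1 if label == cls else 0 for cls in classes] for label in labels]


def boolean_layer(n_cells, landslide_cells):
    """Return 1 for cells that contain a landslide and 0 for all others."""
    slide = set(landslide_cells)
    return [1 if index in slide else 0 for index in range(n_cells)]


LAND_USE = [
    "bare rock and debris",
    "flood inundation area",
    "dry farming",
    "settlement",
    "forest",
]

# Hypothetical land use of three grid cells.
cells = ["forest", "dry farming", "forest"]
encoded = one_hot(cells, LAND_USE)
print(encoded)

# Hypothetical inventory: cells 1 and 3 out of five contain landslides.
print(boolean_layer(5, [1, 3]))
```

Each categorical factor with k classes therefore contributes k binary columns, which is how a handful of thematic maps expands into the 37 variables that enter the multicollinearity analysis.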
All the other pixels were given a value of zero, thereby producing a Boolean layer representing the landslide database to be used in susceptibility mapping. To store the landslide-related information and the landslide inventory in a database, the mid points of the cells are created at 20-m intervals. This point mesh was then overlaid on all factor maps. In this way, a separate attribute table of these points is obtained for each factor map. These separate attribute tables were then merged to construct the complete relational database for all of the


parameters, which are going to be used in susceptibility mapping. Before constructing the regression models, the variables were further analyzed to fulfill the modeling assumptions of logistic regression. For this purpose, multicollinearity and normality checks have to be done before constructing the model. The data sets contain nominal and continuous data (Table 1). Hence, suitable data reduction methods should be used for the different data formats. For the nominal variables, a cross tabulation procedure is applied to investigate bivariate correlations between the variable pairs. The bivariate correlations procedure is used to study the pairwise associations for the set of continuous variables. The association between nominal and continuous data is analyzed by logistic regression, taking each nominal variable in turn as the dependent variable and the continuous variables as independent variables. As there is no established criterion for labeling variables as correlated or not, a correlation coefficient larger than 0.7 is taken as the criterion of correlation in this study. Thus the variable pairs which have a correlation higher than 0.7 are marked to be used further in the multicollinearity analysis. The results of the multicollinearity analyses are given in Table 2, and the 37 variables are reduced to 15. For normality checks, the Q–Q plot is used for each factor considered in the analysis. To ensure a normal distribution of the variables, a log transformation was found appropriate for slope, distance to hydrologic network and dry farming, whereas a power transformation was appropriate for NDVI.

Data preparation in slope unit-based mapping unit

The second selected mapping unit is the slope unit, where space is subdivided into regions based on certain hydrological criteria.
The slope units eliminate the shortcoming of grid-based mapping by representing a clear physical relationship between slope instability phenomena and the fundamental morphological elements of a hilly or mountainous terrain such as drainage and divide lines (Huabin et al. 2005). Especially large mass movements can be represented more realistically in slope units, because they mostly subdivide the region into uniform slope formations (Begueria and Lorente 1999). The main disadvantage of the


slope unit is that the same probability of landslide occurrence is assigned to the entire slope unit. Moreover, in the estimation of the probability of finding a slope failure, slope units provide no information about which part of the slope is more likely to be affected (Huabin et al. 2005). Physically, the slope unit can be considered as the left or right side of a sub-basin of any order into which a watershed can be partitioned. Therefore, a slope unit can be identified by the intersection of a ridge line and a valley line. A GIS-based hydrologic analysis and modeling tool, Arc Hydro (Maidment 2002), is used to obtain the dividing lines for identifying slope units in this study. The procedure used to extract slope units is as follows (Fig. 5): In the first step, the digital elevation model (DEM) and the inverse DEM (InvDEM) are obtained. The InvDEM is the reverse DEM, which is obtained by turning the high DEM values into low values and the low DEM values into high values (Xie et al. 2004). In the second step, the hydrological model is applied to both the DEM and the InvDEM. In the third step, the outlines of the watershed polygons are obtained for both the DEM and the InvDEM. The watershed boundaries obtained from the DEM are topologically the watershed divides or ridge lines. The watershed boundaries obtained from the InvDEM are topologically the valley lines or drainage lines. In the fourth step, the watershed boundaries obtained from the DEM and the InvDEM are combined in the GIS environment to generate slope units. The hydrological model for obtaining the watershed boundary can be described as follows: The DEM or InvDEM surface is hydrologically connected to obtain the watershed boundary. For this reason, the low elevation areas in the DEM or InvDEM which are surrounded by

Fig. 5 Flow chart showing the steps of slope unit generation


higher terrain that disrupts the flow path are filled. The flow direction is calculated by examining the eight neighbors of a cell according to the eight-direction method and by determining the neighbor in the direction of the steepest downhill slope with respect to the cell of interest. Then the associated flow-accumulation grid is computed by summing the number of uphill cells that "flow" to each cell. As a result, each cell value represents the number of uphill cells flowing to it. In addition, a grid representing a stream network is created by querying the flow-accumulation grid for cell values above a certain threshold. This threshold is defined either as a number of cells or as a drainage area in square kilometers. In general, the recommended size for the stream threshold definition is 1% of the maximum flow accumulation (Gopalan et al. 2002). A smaller threshold results in a denser stream network and usually in a greater number of delineated catchments. The watershed boundaries are determined for the DEM or InvDEM by following the flow direction grid backward. By this process all of the cells that drain through a given outlet are determined. The created grid carries a value in each cell indicating the watershed to which the cell belongs. In the last step of the hydrological procedure, these cells are converted to a polygon representing the watershed. More information about the hydrological procedure can be found in Chinnayakanahalli et al. (2002) and Gopalan et al. (2002). As a final step, the slope units are obtained by combining the watersheds deduced from the DEM and the watersheds deduced from the reverse DEM. In order to assign the factor values to the slope units, each data layer in raster format is analyzed statistically. Zonal statistics are computed on a per-zone basis, so that a single value is assigned to each slope unit: the mean value of each factor within a slope unit is calculated and assigned to that slope unit.
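The eight-direction rule and the DEM inversion described above can be illustrated with a small sketch. It is not the Arc Hydro implementation: the function names, the 3 × 3 example grid, and the inversion rule (maximum elevation minus cell elevation) are assumptions made for illustration, and pit filling is omitted.

```python
import math

# Row/column offsets of the eight neighbours, indexed 0..7.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def d8_flow_direction(dem, cell_size=20.0):
    """Return, for each cell, the index of the steepest downhill neighbour.

    A value of -1 marks a cell with no lower neighbour (a pit or outlet).
    """
    rows, cols = len(dem), len(dem[0])
    direction = [[-1] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            steepest = 0.0
            for k, (dr, dc) in enumerate(NEIGHBOURS):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    # Diagonal neighbours are sqrt(2) cell sizes away.
                    diagonal = dr != 0 and dc != 0
                    distance = cell_size * (math.sqrt(2.0) if diagonal else 1.0)
                    drop = (dem[r][c] - dem[rr][cc]) / distance
                    if drop > steepest:
                        steepest = drop
                        direction[r][c] = k
    return direction


def invert_dem(dem):
    """Assumed inversion rule: high values become low and vice versa."""
    highest = max(max(row) for row in dem)
    return [[highest - z for z in row] for row in dem]


# Hypothetical elevations in metres on a 20 m grid.
dem = [[9.0, 8.0, 7.0],
       [8.0, 6.0, 5.0],
       [7.0, 5.0, 1.0]]
flow = d8_flow_direction(dem)
inv_dem = invert_dem(dem)
```

Running the same routine on `inv_dem` gives flow directions whose watershed boundaries correspond to valley lines rather than ridge lines, which is the idea behind combining the two sets of boundaries.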
An index is created to show the percentage of landslide area in each slope unit area. A threshold index value is determined by analyzing the index values, and is selected to be 0.6. If a landslide occupies less than 60% of the slope unit, a value of zero is assigned; otherwise a value of one is assigned. As a result, a total of 91 landslide occurrences are assigned to the slope units, and 138 slope units in the study region are landslide free. A database for the slope units is thus created, which includes both the factors considered to affect landsliding and the landslide inventory. After the creation of the spatial database for the data layers, the database is rechecked for missing or invalid data. The multicollinearity and normality checks are also performed for the variables created for slope units. As a result of the analysis, the variables are reduced to the 20 variables given in Table 2. In addition, for normality checks the Q–Q plot is used for each factor considered in the analysis. It is found that elevation, curvature and distance to hydrologic


network need a log transformation and distance to road needs a power transformation for a better fit to the normal distribution.
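The 0.7 screening applied to the continuous variables in both mapping units can be sketched as follows. This is only an illustration: the factor values are invented, the helper names are hypothetical, and the use of the absolute value of the Pearson coefficient is an assumption, since the text states only that coefficients larger than 0.7 are flagged.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlated_pairs(table, threshold=0.7):
    """List variable pairs whose |r| exceeds the threshold (assumed rule)."""
    names = sorted(table)
    flagged = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(table[first], table[second])
            if abs(r) > threshold:
                flagged.append((first, second, r))
    return flagged


# Invented factor values for five mapping units.
factors = {
    "elevation": [100.0, 200.0, 300.0, 400.0, 500.0],
    "slope": [5.0, 9.0, 16.0, 19.0, 26.0],
    "ndvi": [0.2, 0.6, 0.1, 0.5, 0.3],
}
flagged = correlated_pairs(factors)
```

In this toy table only the elevation and slope pair is flagged, so one of the two would be a candidate for removal in the subsequent multicollinearity analysis.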

Quantitative susceptibility mapping models

In this study, logistic regression (LR), which is one of the most widely used quantitative landslide susceptibility mapping methods, and spatial regression (SR), which has not been used previously in landslide susceptibility mapping, are selected for comparison.

Logistic regression (LR)

Logistic regression (LR) is basically an extension of multiple regression to situations where the dependent variable is not a continuous or quantitative variable (George and Mallery 2000). In other words, the dependent variable is sampled as a binary variable (i.e. presence/absence of landslide). The advantage of logistic regression over multiple regression and discriminant analysis is that logistic regression enables analyzing predictor variables of all types (i.e. continuous, discrete and dichotomous) and allows one to produce nonlinear models (Mertler and Vannatta 2002). The LR procedure offers several methods for stepwise selection of the "best" predictors to be included in the model. In this study, 15 independent variables are initially considered for the grid-based approach and 20 independent variables for the slope unit-based approach in the LR model, and a forward stepwise procedure is used to introduce the independent variables into the analysis. Forward stepwise methods start with a model that does not include any of the predictors. At each step, the variables which are determined to be significant are added to the model while all others are withheld. Variables with a significance level larger than 0.05 at the last step are left out of the analysis. As a result, the procedure selects only the variables that significantly contribute to improving the model. Generally, LR involves fitting the dependent variable using an equation of the following form:

$$f(x) = \mathrm{logit}(p(X)) = \ln\left(\frac{p(X)}{1 - p(X)}\right) = c_0 + c_1 x_1 + \cdots + c_n x_n \qquad (1)$$

where p(X) is the probability of event X occurring (landslide occurrence), and 1 − p(X) is the probability of event X not occurring. p/(1 − p) is the odds or likelihood ratio. The natural logarithm of the odds is called the logit. c0 is the intercept, and c1, c2, …, cn are coefficients that measure the contribution of the independent factors (x1, x2, …, xn) to the variations in the dependent variable Y (landslide occurrence). The forward stepwise procedure is used to obtain LR models for both grid-based and slope unit-based


approaches. The models created for the grid-based and slope unit-based approaches are given in Eqs. 2 and 3, respectively. As seen from Eqs. 2 and 3, although approximately similar morphological, environmental and geological factors (Table 2) are included in the analyses, the slope unit-based mapping unit results in a simpler form. While eleven factors are found to be significant in Eq. 2, only six factors are significant in Eq. 3. In the grid-based mapping unit (Eq. 2), the factors of slope, aspect, distance to road, distance to hydrologic network, NDVI, colluvial soil and grey brown podzolic soil have a reducing effect on landsliding, whereas elevation, distance to fault lines, brown forest soil, dry farming, settlement, moderate erosion and sandstone–mudstone contribute to landsliding. Among the factors contributing to landslides (Eq. 2), moderate erosion, settlement and dry farming have the highest contribution. On the other hand, the colluvial and grey brown podzolic soil types have a reducing effect on landsliding in the grid-based approach, which is plausible given the spatial distribution of landslides, as very few landslides are observed in these soil types. In addition, the grid-based units do not represent the real physical slopes; the model reflects this with the negative effect of the slope parameter on landsliding. In the slope unit-based approach, the regression equation (Eq. 3) involves the factors of distance to road, forest, brown forest soil type, thick soil, sandstone–mudstone and conglomerate. Among these factors, forest has a reducing effect on landsliding while the rest of the factors in Eq. 3 contribute to landsliding. When the coefficients of the factors in Eq. 3 are examined, thick soil and distance to road have the highest effect on the occurrence of landslides in the study region. Distance to road indicates the role of human activity in the region; hence the analysis suggests that human activity has more influence than the other factors.
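$$
\begin{aligned}
f(x) ={}& 0.527\,\text{elevation} + 0.133\,\text{distance to fault lines} + 1.251\,\text{brown forest soil} \\
&+ 2.118\,\text{dry farming} + 2.481\,\text{settlement} + 3.690\,\text{moderate erosion} \\
&+ 1.496\,\text{sandstone–mudstone} - 1.340\,\text{slope} - 0.686\,\text{aspect} \\
&- 1.626\,\text{distance to road} - 0.146\,\text{distance to hydrologic network} - 0.301\,\text{NDVI} \\
&- 3.854\,\text{colluvial soil} - 2.318\,\text{grey brown podzolic soil} - 4.849
\end{aligned}
\qquad (2)
$$

$$
\begin{aligned}
f(x) ={}& 32.871\,\text{distance to road} + 5.264\,\text{brown forest soil} + 160.502\,\text{thick soil depth} \\
&+ 30.371\,\text{sandstone–mudstone} + 18.640\,\text{conglomerate} - 52.499\,\text{forest} + 10.570
\end{aligned}
\qquad (3)
$$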

In order to evaluate the significance of the equations obtained in Eqs. 2 and 3, training sets are obtained and then used for performing the Chi-square Hosmer–Lemeshow test and for evaluating the Cox and Snell R2 and Nagelkerke R2 values (Table 3). The −2 log likelihood provides an index of model fit: the lower the value, the better the model fits the data. The results indicate that the slope unit-based model fits better than the grid-based model. The Chi-square value compares the actual values of the dependent variable with the predicted values. Cox and Snell R2 and Nagelkerke R2 are essentially estimates of R2 indicating the proportion of variability in the dependent variable that may be accounted for by all predictor variables included in the model. The pseudo-R2 statistics range from 0 to 1, and larger values indicate that a larger amount of variation is explained by the model. As can be seen from Table 3, the Cox and Snell R2 and Nagelkerke R2 values are higher when the model is constructed using the slope unit-based approach.

Table 3 LR model test results for grid-based and slope unit-based mapping units

| Model | Training set no. | −2 log likelihood | Cox and Snell R2 | Nagelkerke R2 | Chi-square |
|---|---|---|---|---|---|
| Grid-based model | 1 | 132,987 | 0.17 | 0.33 | 2,867 |
| Slope unit-based model | 2 | 17.28 | 0.62 | 0.87 | 6.62 |

Type: Forward stepwise

Landslide susceptibility maps for both data sets are created after obtaining the logistic regression models. The probability P(L), which is defined by the logistic function of f(x) in Eq. 4, was calculated for all of the mapping units. As f(x) varies from −∞ to +∞, the probability varies from 0 (no susceptibility) to 1 (complete susceptibility).

$$P(L) = \frac{1}{1 + \exp(-f(x))} \qquad (4)$$
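The step from a fitted linear predictor f(x) to a landslide probability can be sketched as below. The coefficient and factor values are invented for illustration and are not those of Eqs. 2 or 3 (whose factor values are transformed and standardized in ways not fully specified here); the function names are likewise hypothetical.

```python
import math


def linear_predictor(intercept, coefficients, factors):
    """f(x) = c0 + c1*x1 + ... + cn*xn for one mapping unit."""
    return intercept + sum(coefficients[name] * factors[name]
                           for name in coefficients)


def landslide_probability(f_x):
    """Logistic function mapping f(x) in (-inf, +inf) to a probability.

    Two algebraically equivalent branches avoid overflow in exp() for
    large positive or negative f(x).
    """
    if f_x >= 0:
        return 1.0 / (1.0 + math.exp(-f_x))
    z = math.exp(f_x)
    return z / (1.0 + z)


# Invented coefficients and factor values for a single mapping unit.
coefficients = {"slope": -1.0, "settlement": 2.0}
unit = {"slope": 0.5, "settlement": 1.0}
f_x = linear_predictor(-1.0, coefficients, unit)
probability = landslide_probability(f_x)
```

Applying `landslide_probability` to every grid cell or slope unit yields the probability surface that is then classified into susceptibility classes.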

The calculated probability values of the grid-based and slope unit-based approaches are then used to produce thematic landslide susceptibility maps in GIS (Fig. 6). The susceptibility maps are classified into low, medium and high susceptibility classes. For the grid-based model, it is found that 13% of the study region lies in the high susceptibility zone, while 17 and 70% of the study region are in the medium and low susceptibility zones, respectively (Fig. 6a). For the slope unit-based approach, it is found that 36% of the study region lies in the high susceptibility zone, while 9 and 55% of the study region are in the medium and low susceptibility zones, respectively (Fig. 6b). The resultant maps indicate that the slope unit-based approach classifies larger regions as highly susceptible than the grid-based approach. This is mainly due to the higher degree of generalization in the slope unit-based mapping unit.

Fig. 6 LR model prediction map created for landslide susceptibility. a Map for grid-based approach. b Map for slope unit-based approach

Spatial regression (SR)

Spatial regression is a global spatial modeling technique in which spatial autocorrelation among the regression parameters is taken into account (Düzgün and Kemeç 2008). Spatial autocorrelation or dependence means that observations at location i depend on other observations at locations j ≠ i. When there is spatial dependence, neighboring units exhibit a higher degree of spatial correlation than units located far apart (LeSage 1999). If the phenomenon has a spatial nature, incorporating spatial correlation into the model provides better performance, which is reflected by higher R2 values. Spatial correlation can be incorporated into the model by modifying the regression model with the


contiguity matrix (proximity matrix), which contains neighborhood information for the spatial units (grids or slope units). There are a large number of ways to construct the proximity matrix. Some alternatives are: sharing a common edge (linear contiguity), sharing a common side (rook contiguity), sharing a common vertex (bishop contiguity), length of shared borders, and intercentroid distance functions. For the spatial regression model, the first task is to construct a spatial contiguity matrix. In this study, a function based on Delaunay triangulation (Matlab 7.1 help 2008) is developed to obtain the proximity matrix. The input variables are the x and y coordinates as n by 1 vectors. The Delaunay triangulation is a set of lines connecting each point to its natural neighbors (Fig. 7). The circle circumscribed about a Delaunay triangle has its center at the vertex of a Voronoi polygon. For the data points defined by the vectors x and y, the triangulation returns a set of triangles such that no data points are contained in any triangle's circumscribed circle. After creation of the Delaunay triangulation, the sum of corners (ssum) that share a common vertex is computed (LeSage 1999). These sums of corners are used to calculate the weight matrices (Eqs. 6–8) based on Eq. 5. Three different spatial weight matrices are created as output. These matrices are:

$$\mathrm{wmat} = \left(\sqrt{1/\mathrm{ssum}}\right)' \qquad (5)$$

$$W_1 = \mathrm{wmat} \qquad (6)$$

$$W_2 = \mathrm{wmat} \cdot \mathrm{wmat} \cdot A \qquad (7)$$

$$W_3 = \mathrm{wmat} \cdot A \cdot \mathrm{wmat} \qquad (8)$$

where A represents the adjacency matrix from the Delaunay triangles, which depicts the neighbors of each vertex, ssum is the sum of corners connected to the same vertex, and wmat is the standardized first-order contiguity matrix, computed based on the sum of corners connected to the same vertex. There are basically three spatial regression models depending on the formulation of spatial interaction: simultaneous autoregression (SAR), moving average (MA) and conditional spatial regression (CSR). In this study, spatial autoregressive models are extended to the case of limited dependent variables, where the logit and probit variants of the spatial autoregressive models are devised. Logit and probit models arise when the dependent variable y in the spatial autoregressive model takes the values 0 and 1, where y = 0 might represent a coding scheme indicating a lack of landslides and y = 1 denotes the presence of a landslide (LeSage 1999). Spatial autoregressive modeling of limited dependent variable data samples is interpreted in the framework of a probability model. The SAR model (Eqs. 9–11) is also called the autocorrelated error model, as the error term in the non-spatial regression model is formulated in such a way that it involves spatial autocorrelation.

$$y = X\beta + U \qquad (9)$$

$$U = \rho W y + \varepsilon \qquad (10)$$

Then

$$y = X\beta + \rho W y + \varepsilon \qquad (11)$$

where ε is the vector of errors with zero mean and constant variance σ², W is the proximity matrix, ρ is the interaction parameter or spatial autoregressive coefficient, and β is the parameter to be estimated due to the relationship between the variables. After application of spatial autoregressive modeling for the limited dependent variable in the grid-based and slope unit-based approaches, the models developed for the study region are presented in Eqs. 12 and 13, respectively.

$$
\begin{aligned}
f(x) ={}& 0.255\,\text{dry farming} + 0.112\,\text{brown forest soil} + 0.508\,\text{moderate soil erosion} \\
&+ 0.01371\,\text{sandstone–mudstone} - 0.304\,\text{elevation} - 0.820\,\text{slope} - 0.297\,\text{aspect} \\
&- 0.272\,\text{distance to road} - 0.21\,\text{NDVI} - 1.113\,\text{colluvial soil} + 0.811326
\end{aligned}
\qquad (12)
$$

$$
\begin{aligned}
f(x) ={}& 2.45\,\text{slope} + 7.648\,\text{topographic wetness index} + 0.893\,\text{distance to road} \\
&+ 3.605\,\text{brown forest soil} + 2.980\,\text{thick soil} + 1.71\,\text{conglomerate} - 2.06\,\text{NDVI} \\
&- 6.08\,\text{dry farming} - 7.766\,\text{forest} - 1.3411\,\text{very shallow soil depth} \\
&- 1.638\,\text{shallow soil depth} + 0.342
\end{aligned}
\qquad (13)
$$

Fig. 7 Delaunay triangulation approach
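The construction of a contiguity matrix from a Delaunay triangulation, as described earlier in this section, can be sketched as follows. This is an illustrative stand-in rather than the Matlab function used in the study: it assumes NumPy and SciPy are available, uses `scipy.spatial.Delaunay`, and applies plain row standardization, which is a common choice but is not claimed to reproduce Eqs. 5–8. The five points and the 0/1 landslide vector are invented.

```python
import numpy as np
from scipy.spatial import Delaunay


def delaunay_adjacency(x, y):
    """Return the 0/1 adjacency matrix A of the Delaunay triangulation."""
    points = np.column_stack([x, y])
    triangulation = Delaunay(points)
    n = len(points)
    adjacency = np.zeros((n, n))
    # Every pair of vertices of a triangle shares a Delaunay edge.
    for simplex in triangulation.simplices:
        for i in simplex:
            for j in simplex:
                if i != j:
                    adjacency[i, j] = 1.0
    return adjacency


def row_standardize(adjacency):
    """Divide each row by its neighbour count so that rows sum to one."""
    neighbour_counts = adjacency.sum(axis=1, keepdims=True)
    return adjacency / neighbour_counts


# Invented coordinates: four corners of a square and its centre.
x = np.array([0.0, 1.0, 1.0, 0.0, 0.5])
y = np.array([0.0, 0.0, 1.0, 1.0, 0.5])
A = delaunay_adjacency(x, y)
W = row_standardize(A)

# Spatial lag W y of an invented 0/1 landslide vector (the term in Eq. 10).
landslide = np.array([1.0, 0.0, 0.0, 1.0, 1.0])
spatial_lag = W @ landslide
```

Each entry of `spatial_lag` is the average landslide indicator among a unit's Delaunay neighbours, which is the quantity weighted by ρ in the SAR model.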

The developed models for grid-based data set in Eq. 12 indicate that factors of dry farming, brown forest soil,


moderate soil erosion and sandstone–mudstone contribute to landsliding, whereas elevation, slope, aspect, distance to road, NDVI and colluvial soil have a reducing effect on landsliding. When the coefficients of the factors in Eq. 12 are examined, dry farming and moderate soil erosion provide the highest contribution to landsliding, which is consistent with the observations in the field. On the other hand, slope and NDVI provide the highest effect on reducing landslides, which is also meaningful. In the slope unit-based approach, the regression equation (Eq. 13) involves the factors of slope, topographic wetness index, distance to road, brown forest soil, thick soil, conglomerate, NDVI, dry farming, forest, very shallow soil depth and shallow soil depth. Among these factors, NDVI, dry farming, forest, very shallow soil depth and shallow soil depth have a reducing effect on landsliding, while the rest of the factors in Eq. 13 contribute to landsliding. When the coefficients of the factors in Eq. 13 are examined, topographic wetness index, brown forest soil and thick soil depth have the highest effect on the occurrence of landslides in the study region, which is logical. On the other hand, forest, dry farming and NDVI have the highest effect on reducing landslides in the study region. In the SR model, the local pseudo R2 shows that nearly 67 and 87% of the variance in landslide occurrence is explained by the model for the grid-based and slope unit-based approaches, respectively. The R2 values of the SR models are considerably higher than those of the LR models. The susceptibility maps produced using Eq. 4 are given in Fig. 8. For the grid-based model, it is found that 18% of the study region lies in the high susceptibility zone, while 42 and 40% of the study region are in the low and medium susceptibility zones, respectively (Fig. 8a). For the slope unit-based approach, it is found that 40% of the study region lies in the high susceptibility zone.
16 and 44% of the study region are in the medium and low susceptibility zones, respectively (Fig. 8b). The resultant maps indicate that, as with the LR models, the slope unit-based approach classifies larger regions as highly susceptible than the grid-based approach. This may again be due to the higher degree of generalization in the slope unit-based mapping unit.

Comparison of LR and SR models

The relative operating characteristics (ROC) curve is used to compare the predictive abilities of the SR and LR models for the grid-based and slope unit-based approaches. The ROC curves illustrate how well the two models predict landslide occurrence. The plot of the curves offers a direct visual comparison of the models' performances. For the accuracy analysis, the landslide locations which are separated for


Fig. 8 Susceptibility map created with spatial regression model a for grid-based approach and b for slope unit-based data set

only accuracy assessment and which are not used for the analysis are compared with the LR and SR prediction results. The accuracy test data contain 20% of the dormant landslides among all landslides. The further the curve lies above the reference line, the more accurate the test is. Sensitivity is the probability that a "positive" case is correctly classified and is plotted on the y-axis of the ROC curve. Specificity is the probability that a "negative" case is correctly classified (Fig. 9). Based on their distances from the reference line, all models do better than guessing. This can also be seen from the area under the curve (AUC; Table 4), where the asymptotic significance of each model is less than 0.05. The SR model shows a higher predictive performance than the LR model in both the slope unit-based and grid-based approaches, as the ROC curve for SR is further away from the reference line than that of the LR model.
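Sensitivity, specificity and the area under the ROC curve can be computed from predicted probabilities and observed landslide labels as sketched below. The labels and scores are invented, the function names are hypothetical, and the AUC is obtained with the pairwise (Mann–Whitney) formulation, which equals the area under the empirical ROC curve; it is not claimed to be the routine used to produce Table 4.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity and specificity when scores >= threshold are 'positive'."""
    tp = sum(1 for l, s in zip(labels, scores) if l == 1 and s >= threshold)
    fn = sum(1 for l, s in zip(labels, scores) if l == 1 and s < threshold)
    tn = sum(1 for l, s in zip(labels, scores) if l == 0 and s < threshold)
    fp = sum(1 for l, s in zip(labels, scores) if l == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """AUC as the share of (landslide, non-landslide) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Invented validation data: 1 = landslide, 0 = no landslide.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
auc = roc_auc(labels, scores)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
```

An AUC of 0.5 corresponds to the reference line (guessing), and values closer to 1 correspond to curves lying further above it.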


Fig. 9 Prediction accuracy assessment of models using ROC curve. a Grid-based models. b Slope unit-based models

From the confidence intervals, it can be seen that the SR model has better prediction performance than the LR model for both grid-based and slope unit-based mapping units. The main reason for this better performance is that the spatial correlations between the mapping units are incorporated into the SR model, whereas they are not considered in the LR model. This is also consistent with the physical nature of the data, where neighboring mapping units are expected to have more similar characteristics than units which are further apart.

In addition, when the same models are compared across mapping unit types, it can be clearly seen that the SR and LR model performances increase when the slope unit-based models are used. This can also be seen from the increase of AUC from 0.77 to 0.90 in the SR models and from 0.74 to 0.82 in the LR models (Table 4). Moreover, the model results can be compared based on, firstly, the percentage of each susceptibility class for both mapping units in both models (Table 5) and, secondly, the overlapping susceptibility classes for both mapping units and both models (Table 6). From Table 5, it can be clearly seen that the high susceptibility zones increase in the slope unit-based mapping unit in both the SR and LR models, whereas the percentage of the medium class decreases in the slope unit-based mapping unit in both models as compared to the grid-based mapping unit. This might be due to the nature of the grid-based mapping unit, where the study region is over-partitioned. In the grid-based approach, the whole region is divided into cells, which results in too many cells with nonoccurrence of landslides. The ratio of landslide occurrence to landslide nonoccurrence is 14%, whereas in the slope unit-based mapping unit this ratio is 65%. In order to compare models with different mapping units, the spatial overlaps of the susceptibility classes should also be considered (Table 6). In the slope unit-based mapping unit, 28 and 38% of the high and low susceptibility classes, respectively, overlap in the LR and SR models, whereas the medium classes overlap by only 1%. In the grid-based mapping unit, 11, 15 and 42% of the high, medium and low classes overlap, respectively. The spatial overlay analysis (Table 6) clearly indicates that the grid-based susceptibility classes of the LR and SR models result in more overlapping regions than the slope unit-based susceptibility classes. This shows that the grid-based results of the two models are consistent with each other.
When the total percentage of the overlapping regions is considered, it can be clearly seen that in both mapping units the total overlapping regions are around 67% (Table 6). When the models with different mapping units are compared (Table 6), it can be seen that for each model the susceptibility classes overlap for both slope unit-based and

Table 4 Area under the curve

| Mapping unit type | Test result variable(s) | AUC | Std. error (a) | Asymptotic sig. (b) | Asymptotic 95% confidence interval, lower bound | Asymptotic 95% confidence interval, upper bound |
|---|---|---|---|---|---|---|
| Grid-based models | SR | 0.774 | 0.007 | 0.000 | 0.760 | 0.788 |
| Grid-based models | LR | 0.744 | 0.007 | 0.000 | 0.730 | 0.758 |
| Slope unit-based models | SR | 0.898 | 0.020 | 0.000 | 0.859 | 0.937 |
| Slope unit-based models | LR | 0.820 | 0.028 | 0.000 | 0.766 | 0.875 |

a Under the nonparametric assumption
b Null hypothesis: true area = 0.5


Table 5 The percent of susceptibility classes for each mapping unit and for each SR and LR model

| Model type | Grid-based: Low (%) | Grid-based: Medium (%) | Grid-based: High (%) | Slope unit-based: Low (%) | Slope unit-based: Medium (%) | Slope unit-based: High (%) |
|---|---|---|---|---|---|---|
| LR | 70 | 17 | 13 | 55 | 9 | 36 |
| SR | 42 | 40 | 18 | 44 | 16 | 40 |

Table 6 Comparison of overlapped susceptibility classes for both mapping units and for both model types

| Compared maps | Overlap, high (%) | Overlap, medium (%) | Overlap, low (%) | Total overlap (%) |
|---|---|---|---|---|
| LR and SR susceptibility classes for slope unit-based mapping unit | 28 | 1 | 38 | 67 |
| LR and SR susceptibility classes for grid-based mapping unit | 11 | 15 | 42 | 68 |
| LR susceptibility classes for slope and grid-based mapping units | 1 | 1 | 44 | 46 |
| SR susceptibility classes for slope and grid-based mapping units | 14 | 9 | 30 | 53 |

grid-based mapping units. The results indicate that the SR model provides more overlapping regions in both mapping units for all susceptibility classes, whereas LR provides a high degree of overlap only in the low susceptibility zones. The overlapping of high susceptibility zones is more critical for planning decisions than that of the low susceptibility zones. Hence, it can be concluded that the SR model is more robust to changes in mapping units.

Accuracy assessments

The landslide susceptibility classes for each method in each mapping unit are evaluated based on the observations made during field surveys. In the study region, there are 28 villages, 18 of which are surveyed. The susceptibility of the visited regions is evaluated visually by a group of experts. Then the ground truth is compared with the model results. To evaluate the susceptibility to landslide activity, the landslide density at each slope extent and the size of the landslides are considered in the study region. The ground truth data are spatially located using both DGPS (rms ± 60 cm) and a Magellan hand-held GPS (rms ± 3 m), with indexes indicating whether the slope is in the high, medium or low susceptibility zone (Fig. 10a). These ground data are then combined with the landslide susceptibility map of the region (Fig. 10b). To compare the model results, the density map of the combined map (ground truth added to existing landslide locations) is created (Fig. 10c) and the model results are compared using an error matrix. The overall accuracy (Eq. 14) and the kappa coefficient (Eq. 15) are obtained from the error matrix. This comparison is performed by overlaying the created density map of the ground truth data with the prediction models.

$$\text{Overall accuracy} = \frac{\sum_{i=1}^{r} x_{ii}}{N} \qquad (14)$$

$$\text{Kappa} = k = \frac{N \sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} \left(x_{i+} \times x_{+i}\right)}{N^{2} - \sum_{i=1}^{r} \left(x_{i+} \times x_{+i}\right)} \qquad (15)$$

where r is the number of rows in the error matrix, xii is the number of observations in row i and column i (on the major diagonal), xi+ is the total of observations in row i (shown as the marginal total to the right of the matrix), x+i is the total of observations in column i (shown as the marginal total at the bottom of the matrix), and N is the total number of observations included in the matrix. Overall accuracy, a descriptive measure, is computed by dividing the total number of correctly classified pixels by the total number of reference pixels. The accuracies of individual categories can also be calculated by dividing the number of correctly classified pixels in each category by the total number of pixels in either the corresponding row or column, which are called the producer's and user's accuracy, respectively. Overall accuracy does not include the errors of omission and commission (off-diagonal elements) but only includes the data along the major diagonal. On the other hand, the kappa coefficient includes both, which is desirable for the analysis. The kappa accuracies of the grid-based approach for the LR and SR models are computed to be 50 and 55%, respectively (Fig. 11). For the slope unit-based approach, the kappa accuracy of SR increased to 57% and that of LR decreased to 48%. As a result, it can be concluded that the mapping unit affects the predictive performance of the models. Besides, each model can be affected differently by different mapping units. In this case, the SR model is positively affected by the selection of the slope unit as mapping unit, whereas the LR model shows better performance in the grid-based mapping


Fig. 10 a Ground truth data overlaid on the DEM of the study region and the existing landslide location data. b Existing landslide locations overlaid with the villages and road network. c The density map of ground truth data with existing landslide locations

Fig. 11 Accuracy assessment of prediction models by kappa and overall accuracy

unit. Thus, in our case study, the LR model applied to grid units provides better results than LR applied to slope units, whereas the SR model provides better results with the slope unit-based mapping unit.
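The overall accuracy and kappa coefficient of Eqs. 14 and 15 can be illustrated with a minimal Python sketch. The function name `accuracy_and_kappa` and the 3 × 3 matrix of counts are hypothetical and chosen only for illustration; they are not the error matrices of this study.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, kappa) for a square error matrix.

    matrix[i][j] holds the number of observations in row i and column j,
    so the major diagonal holds the correctly classified observations.
    """
    r = len(matrix)
    n = sum(sum(row) for row in matrix)                # N in Eqs. 14 and 15
    diagonal = sum(matrix[i][i] for i in range(r))     # sum of x_ii
    row_totals = [sum(matrix[i]) for i in range(r)]    # x_i+
    col_totals = [sum(matrix[i][j] for i in range(r)) for j in range(r)]  # x_+i
    chance = sum(row_totals[i] * col_totals[i] for i in range(r))

    overall = diagonal / n                             # Eq. 14
    kappa = (n * diagonal - chance) / (n * n - chance) # Eq. 15
    return overall, kappa


# Hypothetical counts for the low, medium and high susceptibility classes.
example = [
    [30, 5, 5],
    [5, 20, 5],
    [5, 5, 20],
]

overall, kappa = accuracy_and_kappa(example)
print(f"overall accuracy = {overall:.3f}, kappa = {kappa:.3f}")
```

For these illustrative counts the overall accuracy is 0.70 while kappa is about 0.55, which shows how kappa discounts the agreement expected from the row and column totals alone.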

Discussions and conclusions

This study clearly demonstrates the effect of using different mapping units and model types on susceptibility maps. The area and spatial distribution of the susceptibility zones differ depending on the selected model and mapping unit. This may be due to the form of the data prepared for the analysis in the different mapping units. For the LR analysis, when the model parameters common to the grid-based and slope unit-based mapping units are considered, brown forest soil and dry farming have a positive effect on landslide occurrence in both units, and no common parameter has a negative effect in both units, whereas elevation has a positive effect in the grid-based and a negative effect in the slope unit-based mapping unit. When the common parameters are evaluated for the SR model, distance to road, brown forest soil, thick soil depth, and conglomerate have a positive effect on landslide occurrence in both units, and forest has a negative effect in both units. From the area under the ROC curve (AUC) for the grid-based and slope unit-based mapping units, it can be concluded that the SR model provides better predictive performance (0.774 in grids and 0.898 in slope units) than the LR model (0.744 in grids and 0.820 in slope units). This result is also supported by the accuracy analysis. For both mapping units, the SR model provides more accurate results


(0.55 for grids and 0.57 for slope units) than the LR model (0.50 for grids and 0.48 for slope units). The main reason for this better performance is that the spatial correlations between the mapping units are incorporated into the SR model, whereas they are not considered in the LR model. In addition, when the same model is compared across mapping unit types, the AUC of both the SR and LR models increases when slope units are used: from 0.77 to 0.90 in the SR models and from 0.74 to 0.82 in the LR models (Table 4). The spatial overlay analysis indicates that the SR model is more robust than the LR model. This result is deduced by comparing each model across both mapping units: the SR model provides more similar overlapping regions in both mapping units than the LR model. Furthermore, SR and LR yield approximately the same total percentage of similar regions across the two mapping units. The accuracy analysis also indicates that the LR model provides better accuracy with the grid-based mapping unit, whereas the SR model provides better accuracy with the slope unit-based mapping unit. This study illustrates that susceptibility mapping for any region requires careful consideration of both the mapping unit and the susceptibility mapping model, and the results obtained in this paper can be used as a preliminary guideline for making such selections.
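For readers who wish to reproduce this kind of AUC comparison, the sketch below shows one common way to compute the area under the ROC curve from predicted susceptibility scores and observed landslide presence. It uses the rank-based (Mann-Whitney) formulation, which is an assumption made here for illustration and not necessarily the procedure used in this study; the function name `roc_auc` and the scores and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from scores and 0/1 labels.

    Computed as the fraction of (landslide, non-landslide) pairs in which
    the landslide unit receives the higher score; ties count as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both landslide and non-landslide units are required")

    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical susceptibility scores for six mapping units
# (label 1 = landslide observed, 0 = no landslide observed).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

print(f"AUC = {roc_auc(scores, labels):.3f}")
```

An AUC of 0.5 corresponds to a model that ranks landslide and non-landslide units no better than chance, and 1.0 to a perfect ranking, which is the scale on which the values reported above should be read.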
