This dissertation is dedicated to
Mr & Mrs Khem Raj Adhikari-Bishnu Devi
For everything you sacrificed
To facilitate my education
"We know more about the movement of celestial bodies than about the soil underfoot." -Leonardo da Vinci
Main Supervisor
Mogens Humlekrog Greve, Senior Scientist, Department of Agroecology, Aarhus University
Co-supervisors
Rania Bou Kheir, Professor, Department of Agroecology, Aarhus University
Peder Klith Bøcher, Department of Biological Sciences, Aarhus University
Assessment Committee
Lis Wollesen De Jonge (Chairperson), Professor, Department of Agroecology, Aarhus University
Florence Carré, European Research Manager, INERIS- French National Institute of Industrial Environment and Risk, France
Rosario Napoli, Senior Scientist, CRA- Agriculture Research Council, Italy
Preface
This thesis presents a summary of my three years of research in the field of Digital Soil Mapping in Denmark. The research was carried out between January 2010 and January 2013 in Foulum. The thesis is submitted to the Faculty of Science and Technology of Aarhus University in partial fulfilment of the requirements for the degree of Doctor of Philosophy (PhD). The research was funded partly by Aarhus University and partly by the SINKS Project (2009-2012), funded by the Danish Ministry of Climate and Energy.
The thesis contains three core elements. The first part highlights the methodology adopted to model the vertical distribution of key soil properties (texture, pH and bulk density) in Danish soil profiles. The second part focuses on building a soil-landscape model to predict and map those properties at multiple soil depths, together with mapping of FAO soil classes at ~30 m resolution for the whole of Denmark. The last part discusses quality control of the products by means of validation and assessment of the uncertainty associated with each prediction and map. Four scientific papers are included, two of which have been published and two of which are working manuscripts.
Special gratitude goes to my supervisor Mogens Humlekrog Greve, without whom it would have been difficult for me to initiate and conclude the research. I am very much indebted to him for the support, guidance and constant encouragement I received throughout the research period. Now it's finito. Tusind Tak! for giving me this opportunity and for always being with me during my entire stay as a PhD student. I am also grateful to my co-supervisors, Rania Bou Kheir and Peder Klith Bøcher, both associated with Aarhus University, for their continuous supervision and invaluable suggestions. My sincere thanks also go to Mette B. Greve and René Larsen for their support related to GIS.
I am thankful to Alex B. McBratney and Budiman Minasny from the Faculty of Agriculture and Environment, The University of Sydney, for hosting me during my stay abroad. It was an absolute privilege to spend almost half a year of my PhD studies being supervised by Budi and Alex in the field of digital soil mapping. Thanks a lot for your inspiring supervision and training, which contributed significantly to my research. My best regards also go to Brendan P. Malone from the same department for essential scientific support.
Furthermore, I wish to thank all my fellow students and colleagues for the incredible support they have provided during my stay in Foulum. You all make this a wonderful place to work. I am also thankful to Mary McAfee for proof-reading the final text.
Last, but not least, my heartfelt gratitude goes to my parents for their endless love, encouragement and support during my early education. Special thanks to my wife Bimala for her care, love, sacrifice and patience. Without her, this thesis would never have been accomplished. To our son Anshuman, my love for you is eternal. I am so lucky to have you in my life.
Kabindra Adhikari
April 2013, Foulum
Summary
Soils are vital for sustaining life on earth. Proper use and management of soil resources require a deep understanding of the spatial distribution of soil types and properties, which determine the ecological and socioeconomic roles of soils and influence the quality of the services they provide to society. Soil surveying, monitoring and assessment using digital soil mapping (DSM) techniques are common practices for generating information about the types and properties of soils and their spatial distribution in a landscape. Denmark has a long history of soil resource investigations. Previous Danish work has produced a rich soil database that is readily available to different groups of end-users, including researchers. However, the database still lacks information on the continuous variation of soil properties with depth, which is very important in agricultural and environmental studies and practices such as crop production, irrigation and water management, land evaluation, identification of risk areas, and studies of carbon stocks and gas emissions. Moreover, conventionally generated data should be updated with new information through efficient tools and methods, so that the changing demands of end-users can be properly addressed.
This thesis addresses these gaps through a set of objectives. The continuous depth function of soil properties in Danish soil profiles was modelled, and the distribution of those properties down to 200 cm below the soil surface was mapped for the entire country. The autocorrelation of soil properties from different study areas was investigated to see how the variability changed over space. Different spatial techniques for predicting soil properties were compared to find methods suited to current Danish conditions. A new FAO soil map of Denmark was also compiled using existing soil and environmental information.
These objectives were addressed through state-of-the-art DSM techniques, using a large amount of soil and environmental data collected from different sources in Denmark, with predictions made at different extents. The main data sources for the entire study were analytical data from the Danish Soil Classification (~7000 points) and the Soil Profile database (~1100-1960 profiles), a digital elevation model (~30 m resolution) and its derivatives, and existing choropleth maps of geology, landscape, land use and soil.
Regression tree (RT), ordinary kriging (OK), stratified ordinary kriging (OKst) and rule-based regression kriging (RKrr) were compared for predicting topsoil (0-30 cm) clay content in a selected area of Denmark, using 20% of the data for validation. RKrr outperformed all other methods. The prediction performance of OKst improved further when the samples were divided using landscape classes as stratifying units. Due to its superior predictive capacity (R² = 0.74, RMSE = 0.28, RPD = 2.2), RKrr is recommended for future soil mapping activities in Denmark.
An equal-area quadratic spline applied to profile data modelled a continuous distribution of soil texture, pH and bulk density in Danish soil profiles down to 200 cm depth (100 cm for pH and bulk density). The spline consists of a series of connected quadratic polynomials fitted piecewise through the horizons and is mass-preserving: it maintains the average value of the attribute in each horizon, so that the measured and predicted attribute mass per horizon remain the same. Under typical Danish conditions, the topsoil in agricultural areas is a highly mixed material due to agricultural practices (Ap horizon). The splines were therefore constrained not to predict variation within this layer, so that the homogeneity of the surface mineral horizon was maintained. The continuous data obtained from the splines were aggregated to specified depth intervals (0-5, 5-15, 15-30, 30-60, 60-100 and 100-200 cm for texture components; 0-5, 5-10, 10-20, 20-30, 30-50, 50-70 and 70-100 cm for soil pH and bulk density). The values for each depth interval were predicted with regression rules generated by the Cubist data mining tool, using a number of environmental variables as predictors. Maps of all properties at all soil depths were generated at a spatial resolution of ~30 m after adding the residual surface to the regression grid. The variogram analysis suggested that the variability of all soil properties increased with depth, except for pH, for which it was higher in the topsoil.
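The aggregation step described above can be illustrated with a minimal sketch. It assumes the fitted spline has already been evaluated at 1 cm increments; the function name `aggregate_to_intervals` and the clay profile are hypothetical and are not taken from the thesis, and fitting the equal-area spline itself is not shown.

```python
import numpy as np

# Depth intervals (cm) quoted in the summary for texture components.
TEXTURE_DEPTHS = [(0, 5), (5, 15), (15, 30), (30, 60), (60, 100), (100, 200)]


def aggregate_to_intervals(values_per_cm, intervals):
    """Average a 1 cm depth function over each (top, bottom) interval.

    values_per_cm[i] is the value for the slice from i cm to i + 1 cm.
    Intervals that extend below the available depth yield NaN.
    """
    values = np.asarray(values_per_cm, dtype=float)
    means = []
    for top, bottom in intervals:
        if bottom > values.size:
            means.append(np.nan)
        else:
            means.append(values[top:bottom].mean())
    return np.array(means)


# Hypothetical clay profile (%): constant in a 0-30 cm plough layer,
# then increasing linearly down to 200 cm. Illustrative values only.
clay = np.concatenate([np.full(30, 12.0), np.linspace(12.0, 20.0, 170)])
clay_by_interval = aggregate_to_intervals(clay, TEXTURE_DEPTHS)
```

Because the intervals partition the profile, the thickness-weighted mean of the interval values equals the mean of the whole depth function, which mirrors the mass-preserving idea behind the spline.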
The increase in predicted clay content with soil depth clearly indicated a leaching effect, while a higher soil pH in surface layers than in subsurface layers suggested a possible effect of agricultural activities such as liming. Agriculture also influenced the distribution of soil bulk density in the profiles, which was lower and fairly constant in the top 20-30 cm but increased with depth. Geographically, the soils of western Denmark were rich in coarse sand throughout the profile, whereas eastern soils were rich in clay at all predicted depths. Average silt content decreased with depth, although its distribution pattern was similar to that of clay. Northern soils were mainly rich in fine sand, the content of which also decreased with depth.
Among the environmental variables used as predictors for mapping soil texture components, soil map, geology, landscape type, elevation, wetness index, slope and slope-length factor were of higher Relative Importance (RI) than other variables such as aspect and direct insolation. For example, when predicting clay content for the 0-30 cm soil depth, soil map scored 100% RI, topographic wetness index >90%, slope gradient ~90% and geology ~40%, whereas direct insolation scored 5% and slope aspect only about 1%. Interestingly, these models showed an increasing influence of geology with depth, which was most obvious for clay prediction (RI = 38% for 0-5 cm; RI = 100% for 100-200 cm). For soil pH and bulk density, as expected, land use appeared to be the most important variable, although soil map, geo-region and elevation also had relatively high importance. For example, land use scored about 100% RI in the prediction of pH and bulk density for the 0-5 cm soil depth.
A small portion of the work was related to the GlobalSoilMap project (www.GlobalSoilMap.net). Some progress was made on Danish soil mapping according to the project specifications. However, the detailed work and outcomes are still being finalized and will be presented in the near future.
As an attempt to apply DSM to national soil class mapping, a national soil map of Denmark based on the FAO Revised legend 1990 was generated. The relationship between soil classes and existing environmental variables was quantified with the decision tree model included in the See5 data mining tool. During the modelling, boosting of the trees reduced classification errors by almost 20%. The overall prediction accuracy based on 20% validation profiles was about 60%. When predictions falling in similar soil groups were counted as correct, it increased to 76%.
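The two accuracy figures quoted above (strict, and with similar soil groups counted as correct) can be computed along the following lines. This is a minimal sketch: the function names, the class-to-group mapping and the example profiles are all hypothetical and do not reproduce the grouping or data used in the thesis.

```python
def overall_accuracy(observed, predicted):
    """Fraction of validation profiles whose predicted class matches exactly."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    hits = sum(o == p for o, p in zip(observed, predicted))
    return hits / len(observed)


def grouped_accuracy(observed, predicted, groups):
    """Accuracy when a prediction in the same group as the observation counts as correct.

    `groups` maps a class name to a group label; unmapped classes form their own group.
    """
    relabel = lambda cls: groups.get(cls, cls)
    return overall_accuracy(
        [relabel(o) for o in observed],
        [relabel(p) for p in predicted],
    )


# Hypothetical grouping and validation data, for illustration only.
similar_groups = {"Luvisols": "group A", "Alisols": "group A"}
observed = ["Podzols", "Luvisols", "Alisols", "Gleysols", "Luvisols", "Arenosols"]
predicted = ["Podzols", "Alisols", "Luvisols", "Gleysols", "Luvisols", "Podzols"]

strict = overall_accuracy(observed, predicted)
relaxed = grouped_accuracy(observed, predicted, similar_groups)
```

The relaxed figure can never be lower than the strict one, since every exact match is also a match at group level.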
The predicted soil map showed that >60% of Denmark is covered by Podzols and Luvisols alone, while the remainder is covered by other classes (e.g., Gleysols 8%, Arenosols 8.6%, Cambisols 7%, Podzoluvisols 1.7%). Podzols mostly occupy western Denmark, whereas Luvisols dominate in the east. Comparing our product with the conventionally generated FAO map, we found very good agreement in most areas, especially when considering the areal coverage of a soil type in a given soil mapping unit (SMU) of the old map. One benefit of the new map is that it can identify the geographical location of the predicted soil classes within the SMU, which the old map could not. Information about the prediction uncertainty associated with mapping is another important benefit as far as the reliability or use of the new soil map is concerned.
Evaluation of prediction uncertainty is also a major part of this thesis work. Hold-back validation, standard error of prediction and kriging variance were used to assess uncertainty in soil property mapping, whereas overall accuracy and prediction confidence were used for soil classes. According to the validation indices (R², ME, RMSE), predictions for each property were better in soil surface layers than in deeper layers. The increase in prediction error with depth was mainly due to the presence of heterogeneous soils in deeper layers. For texture components, areas with higher predicted values were always associated with higher standard errors. Prediction error for pH was lower for forest soils, whereas a higher error was observed for sand dunes and for areas in Himmerland. In the prediction of soil classes, relatively low uncertainties were associated with Podzols (30%), Luvisols (35%) and Gleysols (33%), whereas Podzoluvisols (56%) and Alisols (52%) showed higher uncertainties.
Lastly, equal-area quadratic splines proved to be an efficient tool for modelling the depth function of soil attributes in Danish soil profiles. Integration of the Cubist and See5 tools in DSM has revolutionised soil mapping activities in Denmark, as reported in this thesis. The results clearly show that DSM can be a major procedure for enriching the Danish soil database, in terms of both quantity and quality, although more research is required for quantification. The scope of utilisation of DSM products also needs to be expanded to address a series of soil and environmental issues from local to national level in Denmark, where rich soil information could make a difference.
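The validation indices named in the summary (R², ME, RMSE and RPD) can be sketched as follows for a hold-back validation set. The formulas are common definitions and are an assumption here: R² is taken as 1 minus the ratio of residual to total sum of squares, ME as the mean of predicted minus observed, and RPD as the sample standard deviation of the observations divided by RMSE. The thesis may define them differently, and the numbers in the example are made up.

```python
import numpy as np


def validation_indices(observed, predicted):
    """Return ME, RMSE, R2 and RPD for paired observed and predicted values."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    error = pred - obs

    me = error.mean()                        # mean error (bias)
    rmse = np.sqrt((error ** 2).mean())      # root mean square error
    ss_res = (error ** 2).sum()              # residual sum of squares
    ss_tot = ((obs - obs.mean()) ** 2).sum() # total sum of squares
    r2 = 1.0 - ss_res / ss_tot               # assumed definition of R2
    rpd = obs.std(ddof=1) / rmse             # assumed definition of RPD

    return {"ME": me, "RMSE": rmse, "R2": r2, "RPD": rpd}


# Hypothetical validation data (e.g. clay content in %), for illustration only.
indices = validation_indices([10, 12, 14, 16, 18], [11, 12, 13, 17, 18])
```

With these definitions a lower RMSE raises both R² and RPD, so the three indices tend to rank methods in the same order on a given validation set.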
Sammendrag (Danish summary)
Soil is vital for sustaining life on earth. Sound use and management of soil resources require a deeper understanding of the spatial distribution of the soil types and properties that determine the ecological and socioeconomic characteristics of the soil. Soil surveys, soil mapping and soil monitoring by means of digital soil mapping (DSM) generate information about soil types, soil properties and their spatial distribution in a landscape. Denmark has a long history of soil resource investigations, through which national databases have been established that are readily available to different groups of end-users, including researchers. However, the database still lacks information on the continuous variation of soil properties, both horizontally and vertically. This information is very important in agricultural and environmental studies and in cultivation practice, for example when assessing crop production potential, planning irrigation, evaluating land, identifying risk areas, and studying carbon stocks and greenhouse gases. Furthermore, data generated in the conventional way should be updated with new information using DSM tools and methods, so that the varied needs of end-users can be met.
The aim of this PhD is to investigate the continuous variation of soil, both horizontally and vertically. The vertical variation of soil properties was modelled in the available Danish soil profiles down to 200 cm depth for the whole country. The horizontal autocorrelation was likewise investigated to quantify the spatial variation in the different Danish landscape types. In parallel, different DSM techniques were examined to document which method predicts best under Danish conditions. As part of this work, a new FAO soil map was compiled from the existing soil profile database, a number of existing soil and landscape maps, and a LiDAR-based elevation model.
All of the above objectives were addressed using state-of-the-art DSM techniques, applied to data from all available Danish soil, landscape and terrain databases. Soil data from the Danish Soil Classification and the Danish Soil Profile Database were used, together with existing maps of soil types, geology, landscape types and land use, the Danish elevation model and a number of its derivatives as predictors in this study.
To model the continuous vertical distribution of the soil properties texture, pH and bulk density in the available soil profiles, from the surface down to 200 cm depth, an equal-area spline was used. To account for the fact that the great majority of Danish soils have been ploughed at some point, the spline function was constrained not to predict variation in soil properties within the plough layer. The continuous data generated by the spline function were aggregated to six specific depth intervals defined by the GlobalSoilMap project. Nationwide maps were then generated for each of these depth intervals at a spatial resolution of 30 m. The interpolation was carried out with rule-based regression kriging (RKrr).
The variogram analysis shows that the variability of all soil properties increases with depth, with the exception of soil pH, for which it is highest in the topsoil. Soil clay content increases with depth, indicating leaching of clay from the upper layers. Soil pH is highest in the plough layer, an effect of liming by farmers. The influence of farming on soil bulk density is also noteworthy: bulk density is low and constant in the topsoil and increases with depth. Geographically, the coarse sandy soils are found in western Denmark and the more clayey soils in the east. The soils richest in fine sand are found in northern Jutland.
Of the variables used as predictors in the mapping, the soil type map, geology, land use, elevation, wetness indices, slope and the slope-length factor had the greatest relative importance compared with aspect, direct insolation and so on. Interestingly, the models predicting clay content showed an increasing importance of geology as a predictor with depth.
For soil pH and bulk density, land use proved, as expected, to be the most important variable, although the soil type map, geo-regions and elevation were also of relatively high importance for the predictions.
As part of the work, a comparative study was made of how well different prediction methods predicted topsoil clay content. The methods were tested on a transect across Jutland. Regression tree (RT), ordinary kriging (OK), stratified ordinary kriging (OKst) and rule-based regression kriging (RKrr) were used in this study. Point data from the Danish Soil Classification were used, with 80% of the points used for model development and 20% for validation of the modelling. RKrr was the best method and will be used in future studies.
Most of the work was devoted to predicting soil properties as continuous variables, but as an attempt to apply DSM methods to the prediction of classes, a national soil map was produced showing the distribution of FAO soil classes in Denmark (FAO Revised legend 1990). This work was based on data from the Danish Soil Profile Database in combination with a number of environmental variables (e.g. topsoil and subsoil texture, the landscape map and the digital elevation model). The correlation between the FAO soil classes and the different environmental parameters was quantified using decision trees in the See5 data mining tool. Boosting of the trees improved the prediction considerably, and the final prediction accuracy was 60%. If neighbouring classes were accepted as correct, the accuracy rose to 76%. The soil map shows that more than 60% of Denmark is covered by Podzols and Luvisols. Podzols dominate on the sandy soils, whereas Luvisols dominate on the clayey soils. Comparison of the new map with older existing maps shows good agreement in most of Denmark. The level of detail, however, is far higher in the new map. Problems were observed for a few landscape types of very limited extent.
Evaluation of uncertainty is also a major part of the work behind this thesis. Hold-back validation, the standard error of prediction and kriging variance were used to assess the uncertainty of soil property mapping, whereas overall accuracy and prediction confidence were used for the uncertainty of soil class mapping. Based on the validation indices (R², ME, RMSE), the prediction of each property is better for the surface soil layers than for the deeper layers. The increase in prediction error with soil depth can be attributed mainly to the heterogeneity of the deeper soil layers. For texture components, areas with higher predicted values are always associated with higher standard errors. The prediction error for pH was lower for forest soils, and a higher error was observed for dunes and for areas in Himmerland. In the prediction of soil classes, relatively low uncertainties were associated with Podzols (30%), Luvisols (35%) and Gleysols (33%), whereas Podzoluvisols (56%) and Alisols (52%) showed higher uncertainties.
Finally, equal-area quadratic splines proved to be an effective tool for modelling the depth function of soil attributes in Danish soil profiles. As shown in this thesis, integration of the Cubist and See5 tools in DSM has revolutionised soil mapping activities in Denmark. The results clearly show that DSM can contribute significantly to improving the Danish soil database, in terms of both quantity and quality, although further studies are needed to quantify this. The scope of use of DSM products should likewise be expanded to address a range of soil and environmental issues from local to national level in Denmark, where comprehensive knowledge of soil characteristics could make a difference.
Contents
Preface
Summary
Sammendrag
Contents
List of Figures
List of Tables
List of Equations
List of supporting papers
1 Introduction
1.1 General concept of soil and soil formation
1.2 Why is soil so important?
1.2.1 Ecosystem services/soil functions
1.2.2 Soil threats
1.2.3 Use of soil information
1.3 Concept of Soil survey, Classification and Mapping
1.4 Soil classification and mapping in Denmark
1.4.1 Earlier soil assessments
1.4.2 Danish Soil Classification (1975-1980)
1.4.3 Pedological investigations
1.4.4 Earlier FAO-Unesco soil maps of Denmark
1.5 Soil spatial variability
1.6 Prediction and mapping of soil spatial variability
1.6.1 Conventional soil mapping approach
1.6.2 Digital Soil Mapping approach
1.6.2.1 Digital Terrain Analysis
1.6.2.2 DSM methods
1.6.2.2.1 Geostatistical methods
1.6.2.2.2 Non-geostatistical methods
1.6.2.2.3 Mixed methods
1.6.3 Quality assessment of digital soil maps
1.6.4 Links to the global soil mapping activities
1.7 Objectives of the thesis
2 Study area and data preparation
2.1 Study area
2.2 Soil data
2.2.1 Point observations
2.2.2 Existing choropleth maps
2.3 Terrain data
2.3.1 Generation of a DEM and its post processing
2.3.2 Extraction of Land Surface Parameters
3 Modelling the vertical distribution of soil properties in Danish soil profiles
3.1 Fitting splines to soil texture data
3.2 Fitting splines to soil pH and bulk density data
4 Assessment of spatial autocorrelation of soil properties
5 Spatial prediction and Mapping
5.1 Geostatistical methods
5.2 Mixed methods
5.2.1 Mapping topsoil clay content of the selected area
5.2.2 Mapping texture components, pH and bulk density at multiple soil depths in Denmark
5.3 DSM for mapping soil classes
6 Prediction uncertainty assessment
7 Conclusions
8 Perspectives for future studies
9 References
10 Appendices
10.1 Appendix A
10.2 Appendix B
10.3 Appendix C
11 Supporting papers
11.1 PAPER I
11.2 PAPER II
11.3 PAPER III
11.4 PAPER IV
11.5 PAPER V
List of Figures
Fig. 1. Four major components of the ideal soil (Source: www.soilsurvey.org).
Fig. 2. Schematic representation of a pedon from a landscape and development of soil horizons (Source: www.madrimasd.org).
Fig. 3. Linkages of soil to different aspects of life (Source: Omuto et al., 2012).
Fig. 4. Geographical location of the point samples from the Danish Soil Classification Database (after Madsen et al., 1992).
Fig. 5. Soil types of Denmark according to the Danish Soil Classification (Madsen and Jensen, 1992).
Fig. 6. Soil map of Denmark according to FAO-Unesco legend 1974 (Jacobsen, 1984).
Fig. 7. Soil map of Denmark according to the revised FAO legend 1990 (Source: Madsen and Jensen, 1996). (AL- Alisols, AR- Arenosols, CM- Cambisols, FL- Fluvisols, GL- Gleysols, HS- Histosols, LP- Leptosols, LV/L- Luvisols, PD- Podzoluvisols, PZ- Podzols).
Fig. 8. Study area located in western Hungary and the soil sampling sites as marked on orthophoto.
Fig. 9. Country border representing the whole study area of Denmark and an enlarged view of the area in central Jutland selected for comparison of prediction methods.
Fig. 10. Soil sample locations from the selected area in Denmark derived from the Danish Soil Classification.
Fig. 11. Geographical location of the soil profiles used in the study (Inset- 7-km grid profiles, profiles along the pipeline trench, and specific profiles as in the south-west corner).
Fig. 12. Geological types of Denmark (simplified classes at a scale of 1:200,000).
Fig. 13. Map showing different landscape types in Denmark (scale 1:100,000).
Fig. 14. Digital elevation model of Denmark used in the study (grid resolution- 30.4 x 30.4 m).
Fig. 15. Fitted splines for the texture data from Profile no. 1921 (left), and a photo of the same profile (right).
Fig. 16. Fitted splines for the texture data from Profile no. 1255 (left), and a photo of the same profile (right).
Fig. 17. Fitted splines for the texture data from Profile no. 1138 (left), and a photo of the same profile (right).
Fig. 18. Fitted splines for the soil pH data from the six selected soil profiles.
Fig. 19. Fitted splines for the soil bulk density data from the six selected soil profiles.
Fig. 20. Experimental variogram (Gamma) and fitted variogram model (Fitted) with variogram parameters for soil clay content (reproduced from Paper I).
Fig. 21. Variogram and variogram parameters of soil clay content from ten different landscape types in the study area. The dots represent the points of the experimental variogram, whereas the continuous line is the best fitted variogram model. The colour scale of the points (red to blue) corresponds to the increasing number of point pairs used to generate each of these experimental variogram points (red = minimum and blue = maximum).
Fig. 22. Variogram of soil pH from seven depth intervals within the top 1 m of soil in Denmark. The points represent the experimental variogram and the continuous line the modelled variogram. The colour scale for the points (from red to blue) suggests an increasing number of point pairs used to generate each of these experimental variogram points.
Fig. 23. Residual variogram and its parameters for soil clay content at a depth of 0-5 cm in the study area.
Fig. 24. (a) Soil clay content as predicted by Ordinary Kriging (OK), reproduced from Paper I, and (b) the standard error of prediction.
Fig. 25. (a) Soil clay content as predicted by Stratified Ordinary Kriging (OKst), reproduced from Paper I, and (b) the standard error of prediction.
Fig. 26. (a) Scatter plot of measured clay content (%) and clay content predicted by regression rules and (b) distribution of the regression residuals.
Fig. 27. (a) Soil clay content as predicted by Rule-based Regression Kriging (RKrr), (b) regression residuals and (c) the final predicted map obtained using the RKrr approach. Part (c) reproduced from Paper I.
Fig. 28. Soil clay content in the study area as predicted by the Regression Tree (RT) method, reproduced from Paper I.
Fig. 29. Scatter plot of measured texture components (g/kg) and texture components predicted by the Rule-based Regression method from 0-5 cm depth. The continuous line represents the 1:1 line.
Fig. 30. Histograms and normal quantile plots of residuals derived for (a) clay, (b) silt, (c) fine sand and (d) coarse sand, for the 0-5 cm soil layer. The left-hand graph in each of the four diagrams represents a histogram, the middle one an outlier-box plot, and the right-hand one a normal quantile plot.
Fig. 31. (a) Map of clay content at 0-5 cm depth as predicted by Rule-based Regression and (b) residuals from regression.
Fig. 32. Predicted clay content at six standard soil depths for the whole of Denmark (reproduced from Paper II).
Fig. 33. Predicted silt content at six standard soil depths for the whole of Denmark (reproduced from Paper II).
Fig. 34. Predicted fine sand content at six standard soil depths for the whole of Denmark (reproduced from Paper II).
Fig. 35. Predicted coarse sand content at six standard soil depths for the whole of Denmark (reproduced from Paper II).
Fig. 36. Changes in R² and RMSE values with depth in the soil, as observed during the prediction of soil texture components with RKrr.
Fig. 37. Scatter plot of (a) predicted and measured soil pH from 0 to 5 cm soil depth and (b) the distribution of residuals.
Fig. 38. Scatter plot of (a) predicted and measured bulk density (g/cm³) from 0 to 5 cm soil depth and (b) the distribution of residuals.
Fig. 39. (a) Soil pH as predicted by Rule-based Regression and (b) the residuals from regression.
Fig. 40. Predicted soil pH at seven depth intervals within the top 1 m of soil for the whole of Denmark.
Fig. 41. (a) Soil bulk density from 0 to 5 cm soil depth as predicted by Rule-based Regression and (b) the residuals from regression.
Fig. 42. Predicted bulk density (g/cm³) of seven soil depth intervals in the top 1 m soil in Denmark.
Fig. 43. Geographical distribution of predicted FAO soil classes in Denmark (Inset: map of Prediction Uncertainty). These maps are reproduced from Paper III.
Fig. 44. Prediction uncertainty for clay content in the selected area of Denmark.
Fig. 45. Prediction uncertainty for coarse sand content in the 0-5 cm soil layer.
Fig. 46. Prediction uncertainty for soil pH in the 0-5 cm soil layer.
Fig. 47. Prediction confidence of FAO soil classes predicted by decision tree modelling.
List of Tables
Table 1. Areas of use for soil information globally.
Table 2. Definition of soil types in the Danish Soil Classification (Madsen et al., 1992).
Table 3. List of land surface parameters derived from the Digital Elevation Model.
Table 4. Results of the variogram analysis of soil texture components at six standard soil depths in the study area (reproduced from Paper II).
Table 5. Variogram parameters of soil bulk density from seven soil depth intervals within the top 1 m of soil depth in the study area.
Table 6. Mean and standard deviation of predicted texture maps from six standard depths.
Table 7. Most influential environmental variables with their relative importance identified during the prediction of soil texture components from six standard soil depths for the whole of Denmark.
Table 8. Residual prediction deviation and relative error of soil texture components at six standard soil depths (N = 392).
Table 9. Average predicted soil pH and bulk density values at different soil depths in Denmark.
Table 10. Evaluation of model performances in predicting soil pH at different soil depths in Denmark.
Table 11. Prediction confidence derived for different Soil Mapping Units.
Table A.12. Description of the major soil groupings of the FAO-Unesco Soil Map of the World (revised legend).
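List of Equations
Eq. (1)  $S_c[x, y, \sim t] \ \text{or} \ S_p[x, y, \sim t] = f\big(s[x, y, \sim t],\ c[x, y, \sim t],\ o[x, y, \sim t],\ r[x, y, \sim t],\ p[x, y, \sim t],\ a[x, y, \sim t],\ n\big)$
Eq. (2)  $Z(x_i) = m(x_i) + \varepsilon'(x_i) + \varepsilon''$
Eq. (3)  $\gamma(h) = \dfrac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ z(x_i) - z(x_i + h) \right]^2$
Eq. (4)  $\dfrac{1}{n} \sum_{i=1}^{n} \left( y_i - \bar{f}_i \right)^2 + \lambda \int_{x_0}^{x_n} \left[ f'(x) \right]^2 dx$
Eq. (5)  $\gamma(h) = \begin{cases} c_0 + c \left[ \dfrac{3h}{2a} - \dfrac{1}{2} \left( \dfrac{h}{a} \right)^3 \right] & \text{for } 0 < h \le a \\ c_0 + c & \text{for } h > a \\ 0 & \text{for } h = 0 \end{cases}$
Eq. (6)  $\gamma(h) = \begin{cases} c_0 + c \left[ 1 - \exp\left( -\dfrac{3h}{a} \right) \right] & \text{for } 0 < h \\ 0 & \text{for } h = 0 \end{cases}$
Eq. (7)  $Z^*_{OK}(x_0) = \sum_{i=1}^{N(x_0)} \lambda_i \cdot Z(x_i) \quad \text{with} \quad \sum_{i=1}^{N(x_0)} \lambda_i = 1$
Eq. (8)  $\sigma^2_{OK}(x_0) = \sum_{i=1}^{N(x_0)} \lambda_i \cdot \gamma(x_i - x_0)$
Eq. (9)  $R^2 = \dfrac{\sum_{i=1}^{n} \left[ z^*(x_i) - \bar{z}(x) \right]^2}{\sum_{i=1}^{n} \left[ z(x_i) - \bar{z}(x) \right]^2}$
Eq. (10)  $RMSE = \sqrt{\dfrac{1}{n} \sum_{i=1}^{n} \left[ z(x_i) - z^*(x_i) \right]^2}$
Eq. (11)  $ME = \dfrac{1}{n} \sum_{i=1}^{n} \left[ z(x_i) - z^*(x_i) \right]$
Eq. (12)  $RE = \dfrac{\frac{1}{n} \sum_{i=1}^{n} \left| z(x_i) - z^*(x_i) \right|}{\frac{1}{n} \sum_{i=1}^{n} \left| z(x_i) - \bar{z}(x) \right|}$
Eq. (13)  $RPD = \dfrac{SD}{RMSE \sqrt{n / (n - 1)}}$
Eq. (14)  $UA_i = \dfrac{X_{ii}}{\sum_{j=1}^{C} X_{ij}}$
Eq. (15)  $PA_j = \dfrac{X_{jj}}{\sum_{i=1}^{C} X_{ij}}$
Eq. (16)  $OA = \dfrac{\sum_{i=1}^{C} E_{ii}}{N}$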
List of supporting papers
I. Adhikari, K., R. Bou Kheir, M.B. Greve, and M.H. Greve. 2012. Spatial Prediction and Mapping of Soil Clay Content in a Diverse Low-relief Landscape Using Kriging and Regression Techniques. Submitted to Journal of Environmental Management (under revision). Working manuscript included.
II. Adhikari, K., R. Bou Kheir, M.B. Greve, P.K. Bøcher, B.P. Malone, B. Minasny, A.B. McBratney, and M.H. Greve. 2013. High-Resolution 3-D Mapping of Soil Texture in Denmark. Soil Sci. Soc. Am. J. 77(3): 860-876.
III. Adhikari, K., B. Minasny, M.B. Greve, and M.H. Greve. Constructing the FAO Soil Map of Denmark Using Digital Techniques. Submitted to Geoderma.
IV. Adhikari, K., R. Bou Kheir, M.B. Greve, P.K. Bøcher, B.P. Malone, B. Minasny, A.B. McBratney, and M.H. Greve. 2012. Progress towards GlobalSoilMap.net soil database of Denmark. In: Minasny, Malone, and McBratney (Eds.): Digital Soil Assessment and Beyond. CRC Press/Balkema, pp. 451-455.
V. Adhikari, K., G. Tóth, A. Makó, and M.H. Greve. 2010. Assessment of spatial variability of surface soil moisture content; A Geostatistical Perspective. In: de Jonge, L.W., Moldrup, P., and Vendelboe, A.L. (Eds.): 1st International Conference and Exploratory Workshop on Soil Architecture and Physico-Chemical Functions “CESAR”, Aarhus Universitet, Det Jordbrugsvidenskabelige Fakultet, Foulum, DK, pp. 17-24.
These papers will be referred to by their roman numerals in the text.
1 Introduction
1.1 General concept of soil and soil formation
Soil is an essentially non-renewable and vital natural resource (Glanz, 1995), which is dynamic and the most complex bio-material on planet Earth (Young and Crawford, 2004). It is an interface between the lithosphere (the rocks), atmosphere (air), hydrosphere (water) and biosphere (living organisms), and is developed from the actions and interactions of climate and living organisms on parent material over time, as conditioned by topography. Its formation on the Earth's surface is explained by a fundamental equation, s = f(cl, o, r, p, t, …) (Jenny, 1941). The variables in the equation are the soil forming factors, where cl stands for climate, o for organisms, r for relief or topography, p for parent material and t for geological time, and they are all assumed to be functionally interrelated. The dots indicate that additional factors could be included besides the listed variables. Soil is mainly composed of four major components (air, water, mineral matter and organic matter) mixed in different proportions. About 50% of the soil volume consists of solid materials (inorganic or mineral, and organic materials) and the rest consists of air and water (Fig. 1). The relative proportions of these four components greatly influence the behaviour and properties of soils at a given location.
Fig. 1. Four major components of the ideal soil (Source: www.soilsurvey.org).
Development of soil in a landscape is a product of destructive and synthetic processes. The destructive processes include weathering of parent material and mineral decay of organic residues, whereas synthetic processes include the formation of new materials and minerals such as clays and organic compounds. Such ongoing processes and activities of different types (physical, chemical, and biological) on the Earth's surface result in the formation of contrasting soil layers at different stages of development, which are called soil horizons. The development of soil horizons (horizonation) in the upper regolith is a unique characteristic of soil that sets it apart from the deeper consolidated materials. Based on their properties and the nature of horizonation, horizons are designated by horizon symbols, which consist of one or two capital letters for the master horizon and lower case letter suffixes for subordinate distinctions within master horizons. The master horizons are denoted by the capital letters H (saturated and organic), O (unsaturated and organic), A (mineral horizon below O horizon), E (bleached or eluviated horizon), B (illuviated subsurface horizon), C (unconsolidated materials) and R (hard bedrock). Sometimes two symbols are combined (e.g. BE) to describe a mixed or a transitional horizon (Fig. 2).
The individual soil unit generally used in field investigations is termed a “pedon”, which is an imaginary three-dimensional body, the smallest soil sampling unit that displays the full range of properties and characteristics of a particular soil. The pedon serves as the fundamental unit of soil classification. A group of similar pedons is called a polypedon, which represents a soil individual in a landscape.
Fig. 2. Schematic representation of a pedon from a landscape and development of soil horizons (Source: www.madrimasd.org).
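The horizon designation scheme described above (one or two master-horizon capitals followed by optional lower case suffixes) can be illustrated with a minimal Python sketch. It is an assumption-laden simplification rather than an official parser: the function name `parse_horizon` and the returned dictionary layout are hypothetical, and numeric prefixes or subdivisions used in full field descriptions are deliberately not handled.

```python
import re

# Master horizons exactly as listed in the text above.
MASTER_HORIZONS = {
    "H": "saturated and organic",
    "O": "unsaturated and organic",
    "A": "mineral horizon below O horizon",
    "E": "bleached or eluviated horizon",
    "B": "illuviated subsurface horizon",
    "C": "unconsolidated materials",
    "R": "hard bedrock",
}

# One or two master-horizon capitals, then zero or more lower case suffixes.
# Assumption: digits and other modifiers are outside the scope of this sketch.
_SYMBOL = re.compile(r"([HOAEBCR]{1,2})([a-z]*)")


def parse_horizon(symbol):
    """Split a horizon symbol into master horizon(s) and suffix letters."""
    match = _SYMBOL.fullmatch(symbol)
    if match is None:
        raise ValueError(f"not a recognised horizon symbol: {symbol!r}")
    masters, suffixes = match.groups()
    return {
        # Each master letter paired with its description from the text.
        "masters": [(letter, MASTER_HORIZONS[letter]) for letter in masters],
        # Suffix letters are returned uninterpreted.
        "suffixes": list(suffixes),
        # Two capitals denote a mixed or transitional horizon, e.g. BE.
        "transitional": len(masters) == 2,
    }


if __name__ == "__main__":
    for example in ("A", "BE", "Bt"):
        print(example, parse_horizon(example))
```

Keeping the suffix letters uninterpreted is intentional, since the text above defines only the master horizons.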
1.2 Why is soil so important?
The importance of soil can be explained in terms of the following three main components:
1.2.1 Ecosystem services/soil functions
Soil provides ecosystem services to sustain life on Earth. It provides food, fibre and a place to live, and is also the foundation for all terrestrial transformations and fluxes. Daily et al. (1997) listed six major ecosystem services supplied by the soil: (1) buffering and moderation of the hydrological cycle; (2) physical support of plants; (3) retention and delivery of nutrients to plants; (4) disposal of wastes and dead organic matter; (5) renewal of soil fertility; and (6) regulation of major element cycles. Figure 3 also highlights the importance and interaction of soils with aspects of life ranging from food production to the role of soil as a gene pool. The items listed in the diagram include the seven major functions of soils defined in a Communication from the European Commission, “Towards a Thematic Strategy for Soil Protection” (European Commission, 2002), which have subsequently been used widely in the literature (Blum and Eswaran, 2004; Bouma, 2009; Bouma and Droogers, 2007). The seven functions that the soil performs are: biomass production, carbon pool, a source of raw materials, biodiversity pool, natural heritage, physical and cultural habitat, and storing and filtering of nutrients and water. These functions need to be preserved to ensure the sustained supply of ecosystem services from soils.
Fig. 3. Linkages of soil to different aspects of life (Source: Omuto et al., 2012).
1.2.2 Soil threats
In spite of the ecosystem services they provide, soils are constantly threatened by a number of natural and anthropogenic processes that cause severe damage and reduce their capacity to provide these services effectively. The major threats from which the soil needs to be protected have also been well defined in the Thematic Strategy for Soil Protection in Europe (European Commission, 2002). This communication lists eight main threats to soils: erosion, decline in organic matter, soil contamination, soil sealing, soil compaction, decline in soil biodiversity, salinisation, and floods and landslides.
1.2.3 Use of soil information
Soils are a primary concern and, as they are directly or indirectly linked to the existence and development of society, they are back on the global agenda (Bouma, 2009; Hartemink, 2008). A thorough knowledge of soils and their formation and distribution in the landscape is of the utmost importance to maintain ecosystem service quality and to protect soil itself from the above-mentioned threats. Moreover, major global issues, such as food security and hunger eradication, water scarcity, climate change and declining biodiversity, are also related in many respects to soils and to soil management and policy concerns. Therefore, a better understanding of this vital resource is key to good ecosystem functioning and to assuring quality of life on Earth. To know more about a soil, one should begin by opening up a soil profile or making auger holes, analysing samples, and describing and reporting the results in such a way that the user community gets the right information to support them in managing their soil resources in a better and more sustainable way. Soil survey and classification is a common way to describe soils and to understand their distribution in the landscape. The soil data coming from field surveys, laboratories or other secondary sources together form soil information, which has a variety of uses, the majority of which are listed below (Omuto et al., 2012):
Agronomic assessment: Crop and land management, food and fibre production, yield estimation, fertiliser application, improving soil productivity, irrigation needs, recommendation of best management practices, crop varieties and cultivars.
Engineering applications: Urban planning, evaluation of construction materials, site selection, foundation design, irrigation dam and flood control structures.
Hydrology and hydro-geological assessments: Modelling floods and drought, groundwater pollution, flow characterisation.
Environmental assessments: Land degradation modelling, climate change studies, sediment transport and deposition in water bodies, environmental impacts, pollution control, etc.
Policy decisions: Resource allocation, nature conservation, agricultural subsidies, environmental regulations, socioeconomic developments and so on.
A recent online survey on the user needs for soil information carried out by the FAO (Food and Agriculture Organization) in 2012 in connection with the GSP (Global Soil Partnership) project showed that most soil information is used for research and academic purposes (25.8% combined), followed by land degradation assessments (16.7%) (Table 1) (Omuto et al., 2012). The survey also reported that soil physical and chemical properties are by far the most demanded soil attribute data.

Table 1. Areas of use for soil information globally.
Usage                                            % of total survey responses
Research                                         17.5
Land degradation assessment                      16.7
Agronomic decisions                              11.7
Climate change adaptation/mitigation             11.3
Environmental modelling                           9.8
Policy development                                8.6
Academic                                          8.3
Planning                                          6.8
Hydrology                                         3.9
Forestry applications                             2.1
Others, including engineering, and commercial     3.3
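The figures quoted in the text can be checked against Table 1 with a short Python sketch. The percentages are copied from the table; the variable names are hypothetical and chosen only for illustration. It shows that the 25.8% quoted for research and academic use is the sum of the "Research" and "Academic" rows, and that the listed shares add up to 100%.

```python
# Shares of survey responses (%) as listed in Table 1 (Omuto et al., 2012).
usage_share = {
    "Research": 17.5,
    "Land degradation assessment": 16.7,
    "Agronomic decisions": 11.7,
    "Climate change adaptation/mitigation": 11.3,
    "Environmental modelling": 9.8,
    "Policy development": 8.6,
    "Academic": 8.3,
    "Planning": 6.8,
    "Hydrology": 3.9,
    "Forestry applications": 2.1,
    "Others, including engineering, and commercial": 3.3,
}

# Combined share quoted in the text for research and academic use.
research_and_academic = round(
    usage_share["Research"] + usage_share["Academic"], 1
)

# All categories together should account for every response.
total = round(sum(usage_share.values()), 1)

# Categories ranked from the most to the least frequent use.
ranked = sorted(usage_share.items(), key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    print("Research + Academic:", research_and_academic)
    print("Total:", total)
    for name, share in ranked[:3]:
        print(f"{name}: {share}%")
```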
In Denmark, the national soil database has been widely used in planning rural land use at county and national levels, where the initial
intention was to protect the expanding urban settlements (Greve and Madsen, 1999). Later, the database was also used for agricultural water planning (Holst and Madsen, 1988), wind, water and tillage erosion studies (Hasholt et al., 1990; Heckrath et al., 2005), monitoring nitrate losses from farmland (Børgesen et al., 1997), and the study of marginal land (Madsen and Holst, 1987). Moreover, Danish soil data have been used as part of several European projects, including the European Monitoring for Forest Health Program in 1995, the Integrated Model for Predicting European Land Use (IMPEL) in 1997, Hydraulic Properties of European Soils (HYPRES) in 1998, and the Land Use/Cover Area frame Statistical Survey (LUCAS) in 2009. Some environmental models, such as the Danish simulation model DAISY, require depth-wise soil information for the estimation of crop yield, nitrate leaching and soil hydrological properties (Hansen et al., 1990). Several institutes and universities in Denmark have been using Danish soil data for research and academic purposes for decades.
1.3 Concept of Soil survey, Classification and Mapping
Soil survey, classification and mapping refer to the systematic examination, description, classification (or naming) and mapping of soil resources, and the production of reports, by a pedologist or a soil surveyor. Specifically, they involve all the steps from site description through soil profile observations, soil sampling and laboratory analysis, and soil-landscape characterisation, to the final production of soil maps. Very often, this entire process is termed soil mapping. There are a number of soil classification systems in practical use in different parts of the world. One of the oldest, the Russian system, which is mainly based on the factors of soil formation, is still widely used in Russia and other countries of the former Soviet Union, and also in a few other European countries. Other countries, such as China, Belgium, France and Canada, have developed their own systems of soil classification, which are very specific to the country or region in which they were developed. Such national or regional classification systems are of limited use for the rest of the world, because they lack a common language for communication and reflect different perceptions or understandings of soils and their development, which in many respects are specific to the local environment. However, a comprehensive soil classification system developed by the Soil Survey Staff
(Soil Taxonomy 1975) has long been in use in the United States, and also in more than 55 other countries in the world (Brady and Weil, 2002). In order to have a common system for understanding the soil resources of the entire globe, and in response to the recommendation of the International Society of Soil Science (ISSS) Congress held at Madison, USA, work on preparing a soil map of the world was started, led by the Food and Agriculture Organization of the United Nations together with the United Nations Educational, Scientific and Cultural Organization (FAO/UNESCO). This work produced a global soil nomenclature system (the FAO soil legend), on the basis of which a soil map of the whole world was published in 1974 at a scale of 1:5,000,000 (FAO-Unesco, 1974). Although this output was a milestone in the history of global soil nomenclature and assessment, the international soil community also realised that an update of this first legend was necessary to retain its value and to ensure that the soil map of the world contained the most up-to-date soil information. This could be achieved by incorporating new soil information and knowledge gathered from all over the world, to shed more light on the world's soil cover and to develop a revised version of the original FAO soil legend from 1974. So, in 1990, FAO published a revised legend to the Soil Map of the World (FAO-Unesco, 1990), and many countries in the world, including Denmark, have adopted this new system to create soil maps of their countries at varying scales. Brief descriptions of the FAO-Unesco soil map legend, together with a note on major changes to the 1974 legend, are given in Appendix A.
1.4 Soil classification and mapping in Denmark
1.4.1 Earlier soil assessments
Soil resource assessments and investigations in Denmark started as early as the 17th century, when the first nationwide land assessment was carried out.
The assessment is known as King Christian V's Great Danish Land Register of 1688, which was based on the systematic evaluation of soils to be used for taxation purposes. The soils were classified according to their potential yield of various crops, e.g. good soil for barley and rye or good soil for oat production, and this system existed for more than 150 years. In 1844, a new system, The Great Danish Land Register, was established, based on which the land was
evaluated on a points scale according to the fertility of its soils and related characteristics, with 24 points for optimal quality soil (Greve et al., 2001). These were the two pioneering works in the history of Denmark in which the soil received considerable attention. However, the soil research community, soil surveyors and the authorities had long recognised that a new, detailed national land assessment should be undertaken, giving more emphasis to soil conditions and soil properties in Denmark (Brink, 1926; Pedersen, 1932). To address this issue, in 1949 the Danish Ministry of Agriculture established a Land Assessment Commission, which recommended a new nationwide land assessment, but the Commission's report was never approved by Parliament. Later, with the establishment of the Regional Planning Act of 1973 and the Local Planning Act of 1975, it was decided that information on land and agriculture should be included in overall physical and environmental planning in Denmark, as the country had already experienced a huge loss of agricultural land to other land uses, mainly urbanisation, after the Second World War. This called for systematic soil mapping work in Denmark, and in 1974 the Danish Ministry of Agriculture appointed a Commission to establish a national soil classification procedure.
1.4.2 Danish Soil Classification (1975-1980)
The Commission established in 1974 consisted of eight expert members from soil research centres, universities and agricultural organisations, and they had a mandate to develop a procedure for practical soil classification in accordance with the following five basic requirements (Greve and Madsen, 1999):
The areas should be classified on the basis of permanent or stable characteristics
There should be a national standard or code of reference that would make it possible to classify soils as uniformly as possible
The results should clearly illustrate the range of fertile and infertile soils
The maps should be prepared in such a way that they could be used for future planning at all levels
The classification should be completed within a reasonable time limit of a maximum of 3 years.
Following the requirements, the experts agreed to focus on a few important and easily detectable soil properties such as texture of the plough layer, slope, overall drainage status and geological origins of the soil material as a basis for classification. The task also employed local experts who could help in soil sampling and delineating an area of similar soil types at local scale. This inclusion provided the national mapping campaign with local experience that was helpful in dealing with local soil variability. Altogether, about 36,000 sample sites were identified throughout the agricultural area in Denmark (approx. 1 sample per km2) and soil samples were taken from the topsoil (0-20 cm), and also from the subsoil (35-55 cm) at selected sites. This sampling excluded urban and forest areas (Fig. 4). Samples were analysed for texture components, soil organic carbon content and calcium carbonate and the results were stored in a database (Mathiesen, 1980). Four texture components, namely clay (10
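[Table 2. Definition of soil types in the Danish Soil Classification (Madsen et al., 1992). Surviving rows: soil type 11, Organic; soil type 12, Calcareous.]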
1.4.3 Pedological investigations
Pedological investigations in Denmark started in 1981 during the establishment of the main gas pipeline system from the North Sea across Denmark (Madsen and Jensen, 1985). About 800 soil profiles were described in detail and about 8000 soil classifications were made at a spacing of approximately 25 m along the trench, based on a pedological soil classification developed for field classification in Denmark (Madsen, 1983). In 1986, to improve the efficiency of nitrogen fertiliser use in Denmark, a nationwide 7-km soil monitoring grid was established, and at each of the 850 grid intersections a detailed profile description and soil classification were made. Soil samples were collected from all profiles according to the genetic horizons and were analysed in the laboratory for texture, organic carbon, pH and carbonates. The texture was analysed with the hydrometer method and the four components (i.e. clay, silt, fine sand and coarse sand) were expressed in g/100 g. Soil pH was determined with pH electrodes in a soil:liquid suspension made with H2O or CaCl2. The latter method used 0.001M CaCl2 in a soil:liquid ratio of 1:2.5. Based on the profile observations and analytical data, it was possible to classify soils according to the FAO-Unesco 1974 legend (Madsen and Jensen, 1996). Furthermore, profile data from 200 new locations have since been added to the database in connection with regional research activities and environmental monitoring projects (Madsen et al., 2001). As a result, a rich soil profile database has been developed in Denmark.
1.4.4 Earlier FAO-Unesco soil maps of Denmark
In the FAO-Unesco soil map (1:5,000,000 scale) and in the European soil map (1:1,000,000 scale), the distribution of Danish soil types appears only as a rough sketch because of the coarse scale of these maps. There was therefore a need to produce a soil class map of Denmark that included existing information about the bio-physical environment. In 1984 a soil map of Denmark based on the FAO-Unesco legend 1974 was generated, in which geomorphology, climate and the concept of soil units and soil development were used as soil mapping criteria. Of the 26 higher soil units in the FAO system, nine were mapped together with soil inclusions and associations. However, for easy reference and communication, they were indicated by 14 soil mapping units and the map was generated in colour (Jacobsen, 1984), as shown in Fig. 6.
In 1990, when FAO published a revised legend of the Soil Map of the World, a new soil map of Denmark based on this system was compiled at a cartographic scale of 1:1,000,000 (Madsen and Jensen, 1996). The main aim with generation of this new map was to harmonise soil maps of Europe, as many other European countries had already started using the revised legend to map the soils of their territories. It was also reported that a revision of the Danish contribution to the EC soil map was necessary to update it with new data and the new system of classification. The main inputs for this new mapping were the soil classification data from the pedological investigations in Denmark (Section 1.4.3),
texture information from the Danish Soil Classification (Section 1.4.2), maps of landscape types, wetland boundaries, and the map showing potential acid sulphate soils. The new soil map of Denmark according to FAO-Unesco Revised legend delineated 17 different soil mapping units, each of which includes a single or a combination of different soil classes in different proportions (Fig. 7).
Fig. 6. Soil map of Denmark according to FAO-Unesco legend 1974 (Jacobsen, 1984).
Fig. 7. Soil map of Denmark according to the revised FAO legend 1990 (Madsen and Jensen, 1996). (AL- Alisols, AR- Arenosols, CM- Cambisols, FL- Fluvisols, GL- Gleysols, HS- Histosols, LP- Leptosols, LV/L- Luvisols, PD- Podzoluvisols, PZ- Podzols)
1.5 Soil spatial variability Soil is a complex and dynamic system and its variability in the landscape is a well-known phenomenon which has been recognised for many years (Beckett and Webster, 1971; Burrough, 1993). Soil variability is a result of differences in soil forming factors through which the soil develops, or differences in the soil forming process, i.e. pedogenesis, which defines the soil type and governs the majority of the properties e.g. texture, horizon colour, cation exchange capacity (CEC), mineralogy and soil depth. The interactions between parent material, topography, vegetation, tillage, fertilisation and cropping history can also influence the variability of the physical and chemical properties of soils in the fields. Whelan (2003) categorises this overall variability in soil attributes as:
Soil textural and structural variability
Variability in soil organic matter
Soil moisture variability
Variability in soil nutrient content and their availability
Variability in soil pH
Spatial variability in soil properties directly or indirectly influences different soil functions. For example, variability in soil texture may contribute to differences in soil performance and crop yield (Russell, 1973; Tanji, 1996; Warrick and Gardner, 1983), soil moisture retention and availability (Crave and Gascuel-Odoux, 1997; Frenkel et al., 1978), soil aggregation and risk of splash erosion (Luk, 1979). It ultimately affects the fertility status of the soils and crop growth, and hence the yield potential of any site is affected (Davey, 1990). Variability in soil organic carbon (SOC) is also important, as SOC helps in maintaining soil physical properties, storing and releasing moisture and plant nutrients and influencing the quantity and quality of soil microbial activity. pH variability influences nutrient availability and liming requirements. The soil varies both in lateral and vertical dimensions, although spatial soil variability is generally perceived as variations in soils in the lateral dimension. During the course of soil genesis, a number of soil-landscape processes occur in soils. The magnitude and extent of these processes show a varied influence at different soil depths, e.g. the processes common at the soil surface may not be the same as those prevailing in deeper soil horizons. These
differential actions of soil forming factors and soil processes lead to soil attribute variability with depth, evident as a difference in soil colour, clay content, SOC content, etc. throughout the profile. Similarly, the geological processes also give rise to differences in e.g. texture; fluvial soils are normally finer in the topsoil depths, whereas glacial tills are normally coarser in the topsoil. The variation in soil properties with depth has already been recognised (Jenny, 1941; Russell and Moore, 1968), and it has been reported that soil properties vary more or less continuously with depth (Ponce-Hernandez et al., 1986). Such depthwise variation in soil attributes can be modelled using different functions ranging from a very simple free-hand curve that connects the midpoint values of the horizons (Jenny, 1941), to advanced statistical functions such as the exponential depth function (Minasny et al., 2006; Mishra et al., 2009), linear and polynomial functions (Moore et al., 1972) or the smoothing spline function (Ponce-Hernandez et al., 1986). Similarly, a function based on profile depth-blocks which combines general pedological knowledge with geostatistics has been developed and applied to SOC distribution in the Netherlands (Kempen et al., 2011). However, Bishop et al. (1999) suggested higher efficacy of mass-preserving equal-area splines when modelling the depth function of a number of soil properties. Recently such spline functions have been widely used in modelling different soil properties around the globe (e.g., Adhikari et al., 2013; Malone et al., 2009; Odgers et al., 2012).
1.6 Prediction and mapping of soil spatial variability
1.6.1 Conventional soil mapping approach
Methods for mapping soils in a conventional way have been described in detail by several authors including Hewitt et al. (2008) and Simonson and Bomer (1989).
Conceptually, the conventional approach is based on the tacit model of a soil surveyor (Hudson, 1992), who develops a mental model relating field soil observations to the surrounding environment as soil forming factors to infer soil variation. The model is then transformed into a choropleth map, where a boundary is drawn on the aerial photographs separating a similar soil unit from the rest (Scull et al., 2003), assuming that soil properties are homogeneous within the polygon (Heuvelink and Huisman, 2000). However, the sharp boundaries concept between the two soil units is also questionable, as soils vary continuously in geographical space. With their less cost-effective and more time-consuming nature, the soil maps thus
produced may suffer from several limitations if one considers the spatial detail and accuracy of the soil attributes they represent (McSweeney et al., 1994; Zhu et al., 1997). Some of the major drawbacks of the conventional soil maps suggested by Hartemink et al. (2010) include their inaccuracies, imprecision and inflexibility for quantitative analysis and their static nature, where the information contained in the given scale is seldom useful for a particular question. Moreover, the qualitative tacit models are never clearly described (Jafari et al., 2012) and are unverifiable in any objective sense (Hewitt, 1993; Lagacherie et al., 1995). Nevertheless, as the conventional procedure and the map outputs have a long useful history, and the approach was a sensible solution in the pre-digital era (Hartemink et al., 2010), such maps are a good source of soil data for today's digital mapping approaches.
1.6.2 Digital Soil Mapping approach
In spite of the fact that the conventional soil map has plenty of limitations, it cannot be replaced by any mechanical models. However, as an analogy of the surveyor's tacit model, the relationship between field observations and the environmental variables or soil forming factors is quantified statistically and the soils are predicted in the spatial domain, as in several studies (Bou Kheir et al., 2010; Bui et al., 1999; Carre and Girard, 2002; Dobos and Hengl, 2009; Greve et al., 2012; Grunwald, 2006; Kempen et al., 2012a; McKenzie and Ryan, 1999; Minasny et al., 2008; Minasny et al., 2013; Zhu et al., 2001). All the quantitative models used in the above-mentioned studies are collectively defined as tools of digital soil mapping (DSM), a thorough review of which has been provided by McBratney et al. (2003). Scull et al. (2003) also explain the method, but they refer to it as predictive soil mapping (PSM).
Conceptually, DSM is a pedometric method, which is the 'application of mathematical and statistical methods for the study of the distribution and genesis of soils' (Webster, 1994), to spatially predict the distribution of soils (Grunwald, 2006; McBratney et al., 2003). Lagacherie and McBratney (2007) defined DSM as the 'creation and population of spatial soil information system by use of field and laboratory observational methods coupled with spatial and non-spatial soil inference systems'. In a simple way, it is a soil mapping process where different statistical tools are used to define the relationship between observed soil data, legacy soil information and the environmental variables influencing the formation and distribution of soils to generate a
geo-referenced soil database for end-users. In addition, DSM describes the uncertainties associated with spatial prediction. The approach of DSM basically follows the clorpt model of Jenny (1941) (see Section 1.1 for explanation) not just to explain the factors, but also to describe the quantitative relationship between soils and spatially referenced environmental variables to be used in the prediction functions (McBratney et al., 2011). The model is called scorpan (McBratney et al., 2003) and is explicitly written as:
\[
S_c[x, y, \sim t] \ \text{or} \ S_p[x, y, \sim t] = f\big(s[x, y, \sim t],\ c[x, y, \sim t],\ o[x, y, \sim t],\ r[x, y, \sim t],\ p[x, y, \sim t],\ a[x, y, \sim t],\ n\big) \qquad \text{Eq. (1)}
\]
where:
S_c = soil class
S_p = soil property
s = soils, other attributes of the soil at a point
c = climate, climate properties of the environment at a point
o = organisms, vegetation, or fauna, or human activity
r = topography, landscape attributes
p = parent material, lithology
a = age, the time factor
n = space, spatial position
t = time (where ~t is defined as an approximate time)
x, y = the explicit spatial coordinates
f = function or soil spatial prediction function
Unlike the clorpt model, scorpan considers soil (s) itself as a factor because soil can be used for prediction, e.g. using prior soil information or expert knowledge on soils. The (n) factor explains the spatial location or some distance to or from some objects, such as distance from a road, from a river or from the pollution source. Due to the benefits of DSM over conventional soil mapping (Bui et al., 1999; Hewitt, 1993; Kempen et al., 2012b; McKenzie and Ryan, 1999), a huge number of DSM activities have been conducted in different parts of the world during the past decade. This exponential growth in DSM was possible due to the (free) availability of environmental variables at a
finer resolution and also to the advancement in computer technologies as required for larger data processing and analysis (Bui, 2007; Lagacherie and McBratney, 2007). Most of the DSM works reported during the past decade, together with the systematic development of this novel soil mapping approach, are addressed in books by Lagacherie et al. (2007), Hartemink et al. (2008), Boettinger et al. (2010) and Minasny et al. (2012). 1.6.2.1 Digital Terrain Analysis As discussed earlier, modelling soil and terrain relationships is a key parameter in DSM, as successful mapping requires a thorough analysis of topography and its links to soil spatial variability assessments. Digital Terrain Analysis (DTA) is a common way by which topographical information is derived from the Digital Elevation Model (DEM) which stores information on elevation, stream networks and other terrain-related attributes, together with their geographical locations (Moore et al., 1993; Wilson and Gallant, 2000a). The generation of the DEM is also a part of DTA, where several terrain attributes can be derived from the DEM. Basically, the DEM attributes can be grouped into primary and secondary attributes (Moore et al., 1993; Wilson and Gallant, 2000b): the primary attributes are directly derived from the elevation values, whereas secondary attributes use a functional combination of primary attributes. The main primary attributes are slope gradient, slope aspect, plan and profile curvatures, upslope contributing area, etc., and the secondary attributes are topographic wetness index, stream power index, incoming solar radiation, sediment transport capacity index, etc. These indices are the indicators of pedological, geomorphological, hydrological or ecological processes on the Earth's surface (Pike, 2000; Wilson and Gallant, 2000b). 1.6.2.2 DSM methods A number of tools and methods are used in DSM. A thorough review of such methods is provided by McBratney et al. (2003) and Scull et al. (2003). 
Broadly, the methods can be grouped into three main categories, geostatistical, non-geostatistical and mixed. 1.6.2.2.1 Geostatistical methods Spatial autocorrelation is a prerequisite for the application of geostatistics (Goovaerts, 1999). The geostatistical methods are based on the theory of regionalised variables
(Matheron, 1965) and assume that the spatial variation of any variable can be expressed as the sum of three major components: (a) a structural component, having constant mean or trend; (b) a spatially correlated component, which shows the autocorrelation, known as the variation of regionalised variable; and (c) a spatially uncorrelated random noise or residual error (Burrough and McDonnell, 1998):
\[
Z(x_i) = m(x_i) + \varepsilon'(x_i) + \varepsilon'' \qquad \text{Eq. (2)}
\]
where m(xi) is the deterministic function of a random variable Z at xi, ε'(xi) is a stochastic, locally varying, spatially dependent residual from m(xi), and ε'' is a spatially independent residual component having mean zero and variance σ². The main function of geostatistics is to characterise ε'(xi) by means of the semivariance γ(h) (Eq. 3). By plotting the semivariances of point pairs against lag distance h, an experimental variogram is obtained, which measures the average degree of dissimilarity between unsampled values and a nearby data value (Deutsch and Journel, 1999), providing information on the spatial autocorrelation of phenomena such as soil properties (McBratney and Pringle, 1999). Theoretical models are fitted to the experimental variogram to describe it, the most common forms being the Spherical, Exponential, Linear and Gaussian models (Burrough, 1993).
\[
\gamma(h) = \frac{1}{2\,n(h)} \sum_{i=1}^{n(h)} \left[ z(x_i) - z(x_i + h) \right]^2 \qquad \text{Eq. (3)}
\]
where γ(h) is the semivariance between the attribute values of the regionalised variable Z at spatial locations xi and xi + h, and n(h) is the number of observation pairs separated by lag h that are used to calculate γ(h). The geostatistical techniques are commonly known as kriging, named after D.G. Krige, a South African mining engineer. The kriging techniques can be broadly classified into two main groups, namely univariate (using only one variable) and multivariate (using two or more variables in the prediction). Each group has a number of kriging variants based on the type of input data and the way the kriging system is developed.
a) Univariate kriging: e.g. Simple kriging, Ordinary kriging, Block kriging, Indicator kriging, Log-normal kriging, etc.
b) Multivariate kriging: e.g. Universal kriging, Co-kriging, Kriging with external drift, Principal Component kriging, etc.
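The experimental semivariance of Eq. (3), on which the kriging variants listed above rely, can be illustrated with a minimal Python sketch. It is not the implementation used in the thesis: the function name `experimental_semivariance`, the lag tolerance `tol` used to bin point pairs, and the toy transect are assumptions made for illustration only.

```python
import math


def experimental_semivariance(coords, values, lag, tol):
    """Semivariance gamma(h) for all point pairs separated by lag +/- tol.

    coords: list of (x, y) tuples; values: attribute value at each point.
    Returns (gamma, n_pairs); gamma is None when no pair falls in the lag bin.
    """
    squared_diff_sum = 0.0
    n_pairs = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            distance = math.dist(coords[i], coords[j])
            if abs(distance - lag) <= tol:
                squared_diff_sum += (values[i] - values[j]) ** 2
                n_pairs += 1
    if n_pairs == 0:
        return None, 0
    # Eq. (3): half the mean squared difference over the n(h) pairs
    return squared_diff_sum / (2.0 * n_pairs), n_pairs


# Hypothetical transect with 1 m spacing and made-up clay contents (%)
transect = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
clay = [1.0, 2.0, 4.0, 7.0]
gamma_lag1, pairs_lag1 = experimental_semivariance(transect, clay, lag=1.0, tol=0.1)
gamma_lag2, pairs_lag2 = experimental_semivariance(transect, clay, lag=2.0, tol=0.1)
```

Repeating the call for a series of lags and plotting gamma against lag gives the experimental variogram, to which a theoretical model would then be fitted.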
1.6.2.2.2 Non-geostatistical methods
Non-geostatistical methods do not take into account the spatial autocorrelation of the attributes during modelling. Such methods comprise simple interpolation techniques that are used to derive a continuous surface of the variables based on point observations. Some examples include Inverse Distance Weighting (IDW), Triangulated Irregular Network (TIN), Nearest neighbours, Regression models, Trend surface analysis, and Classification or Decision trees. Among these methods, Regression trees were applied in this thesis to predict the spatial distribution of soil clay content in a small area in Denmark (see Paper I for more detailed information on the method). Similarly, for the prediction of soil classes, a Decision tree model was applied and a map was generated for all of Denmark (see Paper III for explanation).
1.6.2.2.3 Mixed methods
Mixed methods use the combined form of geostatistical and non-geostatistical methods for prediction. These are often called hybrid methods, as they utilise both modelling principles. Some of the data-mining induction tools such as Regression rules or Decision trees are also combined with geostatistics and used in the prediction of different soil properties or soil classes. Other examples of mixed methods are Regression kriging, Regression trees combined with kriging, Linear mixed models, etc. In this thesis, a rule-based Regression kriging method was used to predict soil texture, pH and bulk density at multiple soil depths in Denmark. A detailed explanation of the method and its application in mapping soil properties can be found in Papers II and IV.
1.6.3 Quality assessment of digital soil maps
DSM relies on the quantification of soil-landscape relations using statistical models, which in many aspects cannot be error free. Moreover, since soil is a rather dynamic and complicated biomaterial (Young and Crawford, 2004), none of the quantitative methods
can describe it completely (Webster and Oliver, 2006). Therefore, together with the predicted map, it is important that the uncertainty associated with the prediction is also reported (Minasny and Bishop, 2008). This helps the producers to assure the reliability of the digital soil map when it is transferred to the end-user community. There can be different sources of uncertainty in DSM, as suggested by Minasny and McBratney (2002). These include uncertainty in the input data (both primary and secondary variables) and uncertainty in model parameters or model structure. A number of methods have been proposed to determine uncertainties in DSM products, but a widely used approach is validation, where the value of the product at some specified locations is compared with the observed value at the same locations. Some of the examples where this strategy is applied include:
(a) Cross-validation: An observation from the training data is left out and the rest are used for model calibration. The procedure is repeated until all the observations have been left out. This is called leave-one-out cross-validation (LOOCV). In the related n-fold cross-validation, the dataset is divided into n folds and the procedure is repeated with each fold left out in turn.
(b) Hold-back validation: A random proportion of observations is held back during modelling and these points are used to compare the predicted and observed values.
(c) Independent validation: This uses additional random or probability-based samples collected from the mapping domain for validation.
In spite of the availability of a number of validation procedures and their use in DSM, Brus et al. (2011) suggested a higher efficiency of probability sampling validation over the other methods described above. For all the validation methods, certain statistical indices are derived to compare the predicted values with the observations.
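As an illustration of the cross-validation strategy and of two indices commonly derived from it (mean error and root mean square error), a minimal Python sketch follows. It is not the procedure used in the thesis: the helper names, the sign convention of the mean error (predicted minus observed) and the placeholder model that simply predicts the mean of the calibration data are all assumptions.

```python
import math
import random


def k_fold_indices(n_obs, k, seed=0):
    """Randomly split the indices 0..n_obs-1 into k folds of near-equal size."""
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def mean_error(observed, predicted):
    """ME as the mean of (predicted - observed); the sign convention is assumed."""
    return sum(p - o for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error of the predictions."""
    squared = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def cross_validate(values, k, seed=0):
    """k-fold cross-validation of a placeholder 'predict the calibration mean' model.

    With k equal to the number of observations this is leave-one-out.
    """
    observed, predicted = [], []
    for fold in k_fold_indices(len(values), k, seed):
        held_out = set(fold)
        calibration = [v for i, v in enumerate(values) if i not in held_out]
        calibration_mean = sum(calibration) / len(calibration)
        for i in fold:
            observed.append(values[i])
            predicted.append(calibration_mean)
    return mean_error(observed, predicted), rmse(observed, predicted)
```

In practice the placeholder model would be replaced by the soil prediction model being validated; the fold handling and the index calculations stay the same.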
Examples of validation indices generally used in DSM include Root Mean Square Error (RMSE), Mean Error (ME), Relative Error (RE) etc., which are common in soil property predictions, whereas Producer's Accuracy (PA) and User's Accuracy (UA) are mostly used in soil class predictions. Assessment of model confidence in predicting a given class is another way of evaluating the quality of soil class predictions. RMSE measures the average error of
prediction, while ME evaluates the tendency of a prediction model to make under or overpredictions. For an ideal model, ME should be closer to zero, and RMSE as small as possible. RE is the ratio of ME and the error that would result from always predicting the mean. For the useful models, this should always be less than 1. Likewise, UA is the probability that describes how a predicted soil group matches that being observed, whereas PA indicates how well the observed soil group is predicted by the model. 1.6.4 Links to the global soil mapping activities With the growing need for detailed and more accurate soil information to support better decisions while dealing with current global issues such as food security, climate change, soil carbon decline, bio-diversity conservation, environmental degradation and so on, soil scientists from around the globe came together and formed a consortium to map the world‟s soils using state-of-the-art and emerging technology in soil mapping (Sanchez et al., 2009). Key soil attributes such as soil texture fractions (g/kg), soil organic carbon (g/kg), soil pH (x10), bulk density (Mg/m3) and available water capacity (mm) are among those considered for mapping at multiple soil depths of 0-5, 5-15, 15-30, 30-60, 60-100, and 100-200 cm from the soil surface. For each continent, a corresponding project node has been established for coordination and to assist mapping activities within the territory. As Denmark falls within the European node, supporting the global soil mapping initiative by providing national soil information according to project specifications was realised by linking this thesis work to the global project GlobalSoilMap (Hartemink et al., 2010). Data preparation, model building and test mapping of certain soil properties were the main tasks performed during the thesis work. Full operational mapping to deliver data outputs according to the global standards will be performed afterwards. 
See www.GlobalSoilMap.net for more information on the project.
1.7 Objectives of the thesis
The overall aim of this thesis was to map soil properties/soil classes in Denmark using state-of-the-art digital soil mapping techniques, with the emphasis on modelling and mapping the vertical distribution of soil attributes in Danish soil profiles.
Specific objectives were:
a) To analyse and describe the spatial variability of soil properties.
b) To compare kriging and regression-based DSM techniques as regards their ability to predict soil properties in Denmark.
c) To model continuous depth functions of soil properties in Danish soil profiles.
d) To predict and map soil texture at multiple soil depths in Denmark.
e) To compile an FAO-soil map of Denmark using DSM.
These objectives were addressed in a set of studies, the outputs of which are presented in the form of Papers I-V. As mentioned earlier, a number of DSM tools are in use for mapping soils, but the method that performs best in the Danish context has not been reported so far. Therefore, Paper I compared geostatistical methods (Ordinary kriging and Stratified Ordinary kriging), a non-geostatistical method (Regression trees) and a mixed method (Rule-based Regression kriging) to predict soil clay content in a representative area in Denmark, with the method that performed best being recommended for further mapping work. The vertical distribution of soil properties in the soil profiles was then modelled with equal-area quadratic splines and texture components were predicted at multiple soil depths, i.e. the six standard depths specified in Section 1.6.4. Similarly, the mapping of FAO soil classes at a national scale in Denmark followed a data-mining tool based on decision tree modelling, as described in Paper III. Paper IV presents the work done within the GlobalSoilMap project, where progress in soil mapping activities in Denmark according to the global standard was reported. Based on data from a small area in Hungary, the spatial variability of soil moisture content was mapped using geostatistics in Paper V. This demonstrated how geostatistics could be used in dealing with the spatial autocorrelation of soil attributes.
2 Study area and data preparation
2.1 Study area
Based on the thesis objectives, three study areas were identified. For the general application of spatial analytical techniques in describing the soil spatial variability, an area of about 250 km² in western Hungary (Fig. 8) was selected and the spatial pattern of soil moisture content was predicted (objective a). For the comparative analysis of prediction methods (objective b), a small area in central Jutland in Denmark was selected. The area is characterised by the presence of major Danish landform types, so it was intended to represent Denmark in terms of geomorphology and landscape development. It consisted of an approximately 45 km wide strip extending from the east to the west of central Jutland, covering an area of around 7100 km². The diversity of soil types in this strip corresponded to the main Danish soil types, ranging from coarse sandy to loamy and organic soil. For the rest of the objectives (objectives c, d, and e), the whole of Denmark (excluding Greenland and the Faroe Islands) was defined as the study area (Fig. 9).
Fig. 8. Study area located in western Hungary and the soil sampling sites as marked on orthophoto.
Fig. 9. Country border representing the whole study area of Denmark and an enlarged view of the area in central Jutland selected for comparison of prediction methods.
2.2 Soil data
2.2.1 Point observations
Three sets of point soil observations were used in the studies. The first set of data was derived from the European Soil Data Centre (ESDAC) and consisted of soil moisture content (θm) information for the samples collected from the genetic horizons of 100 soil sampling sites located in western Hungary. θm was determined in the laboratory as the percentage weight loss of the soil sample after drying at 105 °C for 24 h. The other two sets of soil data originated from the Danish Soil Classification database and from soil profile observations collected during soil surveys in Denmark. A detailed explanation of these data from Danish sources is provided in Sections 1.4.2 and 1.4.3. Soil clay content data at about 6920 point observations from 0-30 cm soil depth within the
selected area were extracted from the Danish Soil Classification and used for the comparative study. For soil texture mapping at multiple depths, horizon data from 1958 soil profiles (7-km grid profiles, pipeline profiles and specific research profiles) were used. The mapping of soil pH and bulk density used the same profile sets but a smaller number of profiles (i.e. 1934 profiles for pH, 1113 profiles for bulk density). The geographical distributions of the point and profile observations in the study area are shown in Fig. 10 and Fig. 11, respectively.
Fig. 10. Soil sample locations from the selected area in Denmark derived from the Danish Soil Classification.
Fig. 11. Geographical location of the soil profiles used in the study (Inset- 7-km grid profiles, profiles along the pipeline trench, and specific profiles as in the south-west corner).
2.2.2 Existing choropleth maps Information from existing choropleth maps was also derived and used in the work. This group of data consisted of historical maps of soil types, geology (parent material), georegions, landscape types, wetland areas and land use/cover in Denmark compiled during different research/mapping activities in the past. The soil map classifies the agricultural areas of Denmark into eight soil texture classes (see Table 2) and was derived during 1975-1978 using about 36,000 point soil samples collected from the 0-20 cm plough layer. The soil types range from coarse sand to heavy clay and organic soils, as shown in Fig. 5. Apart from the soil analytical data from sample locations, other parameters such as slope gradient that define the limits of agricultural machinery use, geological origin of the surface materials, drainage conditions and local expert knowledge were also considered during soil map generation. The geology map describes the geological origin of the sediments at 1 m soil depth and was compiled at a cartographic scale of 1:25,000 in Denmark (Danmarks Geologiske Undersøgelse, 1978) (a simplified version is shown in Fig. 12). The landscape map divides Denmark into different landform types at 1: 100,000 scale. Moraine, aeolian, glacial-flood plains etc. were among the types of landscape classes found in the study area (Fig. 13). To generate wetland boundaries for Denmark, different historical polygon maps were combined. These data included wetland delineated from old topographical maps compiled in 1910 (Madsen et al., 1992), humus soils defined in the Danish Soil Classification during 1975-1978 (Madsen et al., 1992), and maps on peat and gytja based on the parent material at 1 m depth compiled during 1880-2008 (GEUS (Geological Survey of Denmark and Greenland), 2009). The land use/cover map showing the type of surface cover was derived from the CORINE 2000 database adjusted for Denmark (Stjernholm and Kjeldgaard, 2004). 
It consists of 31 classes, including open waters, coniferous forest, heath and construction sites. The map of geo-regions divides Denmark into 10 regions based on climate and geographical settings. They include East Denmark, Bornholm, Himmerland, Midtjylland, Thy and other regions. The maps of land use types, wetland boundaries and geo-regions are not shown here.
Fig. 12. Geological types of Denmark (simplified classes at a scale of 1:200,000).
Fig. 13. Map showing different landscape types in Denmark (scale 1:100,000).
2.3 Terrain data
A digital elevation model (DEM) was the main source of terrain information used in this work. Several terrain parameters were derived to capture the variability in land surface features that could explain soil variability in a landscape.
2.3.1 Generation of a DEM and its post processing
A national DEM for Denmark was generated by National Survey and Cadastre (2011) of the Danish Ministry of Environment using airborne LiDAR (Light Detection and Ranging) technology. The LiDAR points were interpolated using Delaunay triangulation to generate a TIN (Triangulated Irregular Network) surface of the topography, which was then rasterised to 1.6 m grid size, producing an original raster-based DEM of Denmark. This DEM needed to be corrected by removing any unwanted pits or peaks, which would otherwise hinder runoff processing and create problems during the extraction of flow-related land surface parameters. Processing of the DEM was performed by locating and removing pits or peaks of about 50 cm depth/height. Once the DEM was hydrologically corrected, it was resampled to 30.4 m grid size by simple aggregation considering the mean (Fig. 14). Greve et al. (2012) found better prediction performance for clay using a 24 m grid size DEM compared with a 90 m DEM. We therefore selected a grid size of 30.4 m, a multiple (19 times) of the original 1.6 m resolution of the DEM. This resolution is comparable to 24 m as far as variation in soil properties is concerned, and it also greatly reduces the number of pixels to be processed. The TerraStream algorithms (Danner et al., 2007) were applied for the creation and processing of the DEM.
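The aggregation step described above, from 1.6 m to 30.4 m cells by taking the mean, can be sketched in Python as a block average. This is only an illustration and not the TerraStream processing used in the thesis: the function name `aggregate_mean`, the list-of-rows grid representation and the choice to drop incomplete trailing blocks are assumptions, and no-data cells are not handled.

```python
def aggregate_mean(grid, factor):
    """Aggregate a 2-D grid (list of equal-length rows) by block averaging.

    Each output cell is the mean of a factor x factor block of input cells.
    Trailing rows or columns that do not fill a whole block are dropped
    (an assumption made for this sketch).
    """
    n_rows = len(grid) // factor
    n_cols = len(grid[0]) // factor
    coarse = []
    for r in range(n_rows):
        coarse_row = []
        for c in range(n_cols):
            block = [
                grid[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


FINE_CELL_M = 1.6   # original DEM grid size in metres
FACTOR = 19         # 19 x 1.6 m = 30.4 m
COARSE_CELL_M = FACTOR * FINE_CELL_M

# Small made-up elevation grid, aggregated 2 x 2 to keep the example readable
demo = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
demo_coarse = aggregate_mean(demo, 2)
```

With `FACTOR` set to 19, each 30.4 m cell summarises 361 of the original 1.6 m cells, which is where the reduction in the number of pixels comes from.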
Fig. 14. Digital elevation model of Denmark used in the study (grid resolution- 30.4 x 30.4 m). 2.3.2 Extraction of Land Surface Parameters Once the DEM was processed, the 12 Land Surface Parameters (LSP) listed in Table 3 were derived from it using ArcGIS (ESRI, 2012) and SAGA GIS (SAGA GIS) platforms. For all the flow-related calculations, multiple flow direction (MFD) or FD8 algorithms (Freeman, 1991) were adopted because their divergent nature gave a more realistic distribution of the contributing area (Hengl and Reuter, 2008). A detailed explanation and derivation of those parameters can be found in several studies (Bendix, 2004; Böhner and Antonić, 2009; Böhner et al., 2001; Desmet and Govers, 1996; Gallant and Dowling, 2003; Moore et al., 1993). Paper I used these LSP only for the selected area, whereas in Papers II, III and IV, full Danish coverage was used. The average and the range of these parameter values are given in the paper/s in which they were used.
Table 3. List of land surface parameters derived from the Digital Elevation Model.
Variable | Brief description
Slope aspect | Direction of the steepest slope gradient from the North
TWI† | Calculates the slope gradient and specific catchment area based TWI. TWI = ln(As / tan β), where As is upslope catchment area and β is slope gradient (Moore et al., 1993)
Direct Sunlight Insolation (1 yr) | Calculates potential incoming solar radiation (insolation), calculated in SAGA GIS (Böhner and Antonić, 2009)
Elevation | LiDAR† produced elevation of the land surface
Flow Accumulation | Number of upslope cells
Mid-Slope Position | Covers the warmer zones of slopes (Bendix, 2004)
MRVBF† | Identifies the depositional areas (Gallant and Dowling, 2003)
SAGA† Wetness Index | Same as TWI but uses modified catchment area instead (Böhner et al., 2001)
Slope gradient | Maximum rate of change between cells and neighbours
Slope-length factor | Calculates the slope-length as used by the Universal Soil Loss Equation (Desmet and Govers, 1996)
Vertical distance to channel network | Calculates vertical distance to the nearest channel for each cell
Valley depth | Relative position of the valley
† TWI, topographic wetness index; LiDAR, light detection and ranging; MRVBF, multi-resolution index of valley bottom flatness; SAGA, a system for automated geoscientific analyses.
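As a worked illustration of the TWI definition in Table 3, TWI = ln(As / tan β), the following minimal sketch evaluates the index for a single cell. The function name `twi` and the small slope floor `min_slope_deg` (used to avoid division by zero on flat cells) are assumptions made for this example; they do not describe how ArcGIS or SAGA GIS handle flat terrain.

```python
import math


def twi(specific_catchment_area, slope_deg, min_slope_deg=0.1):
    """Topographic wetness index, TWI = ln(As / tan(beta)).

    specific_catchment_area: upslope catchment area As (per unit contour width)
    slope_deg: slope gradient beta in degrees
    min_slope_deg: hypothetical floor so that flat cells stay finite
    """
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area / math.tan(beta))


# Same contributing area, different slopes: the gentler slope gets the higher index
gentle = twi(100.0, 2.0)
steep = twi(100.0, 10.0)
```

The example reflects the usual reading of the index: for a given contributing area, cells on gentler slopes are expected to be wetter.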
3 Modelling the vertical distribution of soil properties in Danish soil profiles
One of the main aims of this thesis was to model the vertical distribution of soil properties in the soil profile. Equal area quadratic splines were used to model the continuous depth function of soil texture components, soil pH and bulk density data from the Danish soil profile database. This specific function was chosen because of its reported higher efficacy in modelling the depth function of several soil attributes, including soil texture and soil pH (Bishop et al., 1999). The mathematical function used is given in Eq. (4). For a detailed explanation of spline function and its fit to the profile data, please refer to Paper II. During the fit, the spline passes through the midpoint of each observed horizon maintaining the average texture value of the horizons. Due to their mass preserving nature, the measured and predicted attribute mass in each horizon remain the same during this process. The quality of the spline fit of the profile data depends on the value of the spline smoothing parameter, known as lambda (λ). A higher λ produces a smooth or a loose fit, whereas a lower λ produces a very tightly fitted spline. To find the most suitable spline for a given condition, seven λ values (0.00001, 0.0001, 0.001, 0.01, 0.1, 1 and 10) were tested for the whole profiles and the one with the lowest error was selected to fit the data. All the spline fitting and required calculations were done in the R program (R Development Core Team, 2008).

$$ \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \bar{f}_i\right)^2 + \lambda \int_{x_0}^{x_n} \left[f'(x)\right]^2 \, dx $$    Eq. (4)
where x denotes depth in the soil profile, $y_i$ is the measured value of the soil attribute y for layer i, $\bar{f}_i$ is the mean of the spline function f(x) over layer i, n is the number of soil layers, λ is the smoothing parameter, and the layer boundaries are $x_0 < x_1 < \dots < x_n$. The function that minimises Eq. (4) is the spline function used for fitting the profile data.
3.1 Fitting splines to soil texture data
For the soil texture components, data from more than 1950 soil profiles, coming from the 7-km national grid, pipeline profiles and other specific research profiles (Fig. 11), were used to fit the splines. Before fitting splines, a specific treatment was applied to the profile texture data. As Denmark has a long history of mechanised agriculture, the soil materials in the
plough layers (upper 20 or 30 cm) are regularly mixed, and are believed to be less heterogeneous than those in the underlying unploughed layers. So, to maintain a homogeneous texture distribution in the plough layer, the surface horizon was sliced in such a way that a 1 cm thick artificial horizon was generated above and below the horizon with the same attribute value. This treatment forced the spline not to extrapolate on the surface to maintain a similar texture value throughout the first horizon. This treatment was found to be logical for our conditions and is well explained in Paper II and partly also in Paper IV. During the modelling, the maximum depth to fit the spline was set to 200 cm from the soil surface, and the most suitable λ to generate the best fitted spline for all texture components was found to be 0.01. Once the spline was fitted, a continuous distribution of soil texture components for the whole profile was obtained, where the value of a specific depth range could be extracted by weighted average calculations. As specified in the GlobalSoilMap specifications, the splined texture values were extracted for six depth intervals (i.e. 0-5, 5-15, 15-30, 30-60, 60-100, and 100-200 cm) to map the spatial distribution of soil texture components at these standard soil depths. The profile data showed a varying pattern of soil texture distribution in the profiles. Some profiles were observed to have a higher clay content in upper horizons, whereas others showed increased clay in the lowermost layers. The majority of the soil profiles were found to be clay eluviated, with the subsurface layers being clay-enriched due to the deposition of migrated clay from the upper horizons. Figures 15-17 illustrate the three types of profiles just described and show how the splines generated a continuous texture distribution that also followed the observed pattern in the profiles. Fig. 
15 is a representative profile where the surface soil had a lower clay content than deeper layers (i.e. depths from 150-200 cm). Fig. 16 illustrates a situation where surface soils were rich in clay content, but the clay content decreased with depth. Fig. 17 shows a typical soil profile where the leached clay from the upper soil horizons is deposited immediately below the surface horizons.
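The extraction of values for the standard depth intervals by weighted averaging can be sketched as follows. This is a minimal illustration that assumes the fitted depth function has already been evaluated in 1 cm slices; the clay values, the function name `interval_mean` and the variable names are hypothetical and do not reproduce the R code used in the thesis.

```python
# GlobalSoilMap standard depth intervals (cm), as listed in the text
GSM_INTERVALS = [(0, 5), (5, 15), (15, 30), (30, 60), (60, 100), (100, 200)]


def interval_mean(layers, top, bottom):
    """Thickness-weighted mean of piecewise-constant layers over [top, bottom].

    layers: iterable of (layer_top, layer_bottom, value) tuples in cm.
    Returns None when no layer overlaps the requested interval.
    """
    total = 0.0
    covered = 0.0
    for l_top, l_bottom, value in layers:
        overlap = min(bottom, l_bottom) - max(top, l_top)
        if overlap > 0:
            total += overlap * value
            covered += overlap
    return total / covered if covered > 0 else None


# Hypothetical clay depth function (g/kg) evaluated in 1 cm slices down to 200 cm
slices = [(d, d + 1, 100.0 + 0.5 * d) for d in range(200)]
means = [interval_mean(slices, t, b) for t, b in GSM_INTERVALS]
```

The same function also works on the original horizons (with unequal thicknesses), which is where weighting by thickness rather than taking a simple mean matters.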
Fig. 15. Fitted splines for the texture data from Profile no. 1921 (left), and a photo of the same profile (right).
Fig. 16. Fitted splines for the texture data from Profile no. 1255 (left), and a photo of the same profile (right).
Fig. 17. Fitted splines for the texture data from Profile no. 1138 (left), and a photo of the same profile (right).
The splines also preserved the compositional property of the soil texture components: the four fractions (clay, silt, fine sand and coarse sand content) at any specified depth summed to approximately 1000 g/kg after spline fitting (the average texture sum for the 15-30 cm depth was 998.9 g/kg). This was because of the mass-preserving nature of the spline used. Although it would also be possible to address the uncertainty in the spline prediction, this was not the main focus of this thesis.
3.2 Fitting splines to soil pH and bulk density data
As with the texture modelling, the distribution of soil pH and bulk density in the Danish soil profiles was predicted with equal area splines. Of the 1958 profiles from the study area, soil pH had been determined for 1934 profiles and bulk density measured for 1113 profiles, so these data were used in this thesis. For both properties, the top 1 m of soil was considered for modelling, as this is the most active, and perhaps most exploitable, layer, where most crop roots occur. Seven depth intervals (i.e. 0-5, 5-10, 10-20, 20-30, 30-50, 50-70, and 70-100 cm) were defined for mapping purposes. For both properties, a lambda value of 0.01 was found to generate the best fitted spline for all profiles.
The observed horizon soil pH of the six soil profiles and fitted splines to 1 m depth from the soil surface are shown in Fig. 18. For Profile no. 273, a reduced pH was observed between 15 and 30 cm depth, whereas for the same or a slightly higher depth interval (i.e. 20-50 cm), an increased soil pH was recorded for Profile 274. Profile 280, on the other hand, indicated decreasing pH with depth. A fairly stable pH was observed throughout Profile 1561, whereas pH increased with depth in Profile 590. The measured bulk density data from different horizons of the selected soil profiles together with fitted splines for the measured data are shown in Fig. 19. It was found that the bulk density in the profiles did not vary greatly, although most profiles had a slightly increased bulk density at depths below 20 or 25 cm (probably a plough pan). Some profiles (No. 802 and 1687 as in Fig. 19) also indicated increased bulk density at lower soil depths than the surface horizons.
Fig. 18. Fitted splines for the soil pH data from the six selected soil profiles
Fig. 19. Fitted splines for the soil bulk density data from the six selected soil profiles From the results above, we concluded that equal area quadratic splines were able to predict the vertical distribution of soil texture, pH and bulk density from the Danish soil profiles. The slicing of the first horizon from the profiles before spline fitting managed not to extrapolate the value of the soil properties at that specific depth (mostly plough layer), where a rather homogeneous texture distribution was expected. Although such a treatment also affected the non-agricultural soil profiles (forest profiles, for example), where spline restriction could be questionable, the first horizon from the forest profiles was very thin so that even if no slicing was used, the splined attribute value would be comparable. Therefore, slicing treatment was perhaps a good choice to deal with the plough layer of Danish agricultural soils. However, in the future, an approach of fitting the splines separately for agricultural and non-agricultural profiles is recommended to check whether the slicing is appropriate in forest profiles.
4 Assessment of spatial autocorrelation of soil properties
Understanding the spatial variability in soil properties in all three study areas was another main aim of this thesis. Therefore variograms were used to analyse the spatial variability of soil properties. As described in Section 1.6.2.2.1, variograms measure the average semivariance among observation points separated by some distance, often called a lag distance. The spline-predicted soil property data from the specified soil depths (six depths for texture and seven depths for soil pH and bulk density) were used to make the variogram. The VESPER program (Minasny et al., 2005) was used to calculate the experimental variogram, to which the theoretical variogram models were fitted based on weighted least squares estimation (McBratney and Webster, 1986). The different variogram models used in soil mapping are explained in the literature (McBratney and Webster, 1986). In our study, the two widely used Spherical and Exponential models were considered to fit the experimental variogram. Although rarely reported in soil science, a Gaussian model was also tested here. The selection of the best fit model among the three was based on the Akaike Information Criterion (AIC), with the lowest AIC value indicating the best fit (Akaike, 1973). After the variogram modelling, the variogram parameters that explained the spatial autocorrelation, i.e. nugget (c0), partial sill (c) and range (a), were derived. The mathematical functions for the Spherical and Exponential semi-variogram models are given in Eqs. (5) and (6). In both models, a is the range, c0 is the nugget and c is the partial sill of the variogram (so that the total sill is c0 + c), while h is the distance argument of the model, not to be confused with the lag as used in the text.
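$$ \gamma(h) = c_0 + c\left[\frac{3h}{2a} - \frac{1}{2}\left(\frac{h}{a}\right)^{3}\right] \quad \text{for } 0 < h \le a; \qquad \gamma(h) = c_0 + c \quad \text{for } h > a; \qquad \gamma(0) = 0 $$    Eq. (5)

$$ \gamma(h) = c_0 + c\left[1 - \exp\left(-\frac{3h}{a}\right)\right] \quad \text{for } h > 0; \qquad \gamma(0) = 0 $$    Eq. (6)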
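Eqs. (5) and (6) translate directly into code. The sketch below is only an illustration: the function names are hypothetical, and the parameter values used in the example are the clay values for the 0-5 cm depth reported in Table 4 (nugget 2389, partial sill 2510, range 74.4 km).

```python
import math


def spherical(h, c0, c, a):
    """Spherical semivariogram of Eq. (5): nugget c0, partial sill c, range a."""
    if h == 0:
        return 0.0
    if h <= a:
        return c0 + c * (1.5 * h / a - 0.5 * (h / a) ** 3)
    return c0 + c


def exponential(h, c0, c, a):
    """Exponential semivariogram of Eq. (6); a is the effective range."""
    if h == 0:
        return 0.0
    return c0 + c * (1.0 - math.exp(-3.0 * h / a))


# Clay, 0-5 cm (Table 4): the spherical model reaches its sill exactly at h = a
sill_at_range = spherical(74.4, 2389.0, 2510.0, 74.4)

# With the factor 3 in the exponent, the exponential model has reached
# about 95% of the partial sill at h = a
fraction_at_range = exponential(74.4, 0.0, 1.0, 74.4)
```

The factor 3 in Eq. (6) is what makes a interpretable as an effective range: 1 - exp(-3) is approximately 0.95.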
The Spherical model has a linear start up to two-thirds of the range, beyond which it levels off quite abruptly to the constant value of the sill. The Exponential model has a linear increase for the first one-third of the range, but the starting point is lower than for the
Spherical model. The sill in the Exponential model is approached asymptotically, so there is no strict range. However, for practical purposes the effective range is defined as the range at which the function reaches 95% of the sill. The use of variogram parameters for the assessment of spatial autocorrelation of soil moisture content (GSMC) using 100 topsoil samples from Hungary (Fig. 8) is briefly explained in Paper V. The nugget, sill and range of spatial dependence of GSMC were found to be 0.102%, 0.205% and 2.2 km, respectively, suggesting that about 50% of the variability was purely random and that at a distance beyond 2.2 km GSMC was not spatially auto-correlated. A similar variogram analysis was performed in Paper I, where spatial autocorrelation of soil clay distribution in a selected study area in Denmark (Fig. 9) was reported. In total, 16 lags of about 2.4 km distance were considered, which allowed the variogram of clay content up to a distance of more than 37 km in the study area (a distance slightly higher than the diagonal distance) to be calculated. The resulting variogram is shown in Fig. 20, together with the experimental variogram, for which the Spherical model gave the best fit. Figure 20 also lists the variogram parameters that best characterised the spatial autocorrelation of soil clay content. It was found that after about 35 km distance, there was no spatial relationship to the distribution of clay in the study area.
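The experimental variogram underlying such an analysis can be sketched as follows. This is a minimal illustration of the classical estimator (half the mean squared difference of all point pairs falling in a lag bin) and is not the VESPER implementation; the function name, the bin-centre convention and the three toy points are assumptions made for the example.

```python
import math


def experimental_variogram(points, lag_width, n_lags):
    """Classical semivariance estimate per lag bin.

    points: list of (x, y, z) tuples, z being the soil property.
    Returns (bin centre, semivariance, number of pairs) for non-empty bins.
    """
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            dist = math.hypot(xi - xj, yi - yj)
            k = int(dist // lag_width)  # index of the lag bin for this pair
            if k < n_lags:
                sums[k] += 0.5 * (zi - zj) ** 2
                counts[k] += 1
    return [((k + 0.5) * lag_width, sums[k] / counts[k], counts[k])
            for k in range(n_lags) if counts[k] > 0]


# Hypothetical toy data: three points on a line, unit lag width, three lags
toy_points = [(0.0, 0.0, 1.0), (1.0, 0.0, 2.0), (2.0, 0.0, 4.0)]
vg = experimental_variogram(toy_points, 1.0, 3)
```

With 16 lags of about 2.4 km, as used for the clay data in Paper I, the same function would cover separation distances up to about 38 km. The number of pairs per bin is returned because bins with few pairs give unstable semivariance estimates.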
Fig. 20. Experimental variogram (Gamma) and fitted variogram model (Fitted) with variogram parameters for soil clay content (reproduced from Paper I).
Paper I also investigated the autocorrelation of clay content from the 10 different landscape types present in the study area. Due to a lack of data points from bedrock and late glacial marine deposits, no variogram was calculated for these. The variograms for the soil clay content from each landscape type were calculated separately, where, depending on the landscape dimensions, 16 lags of about 400-800 m size were considered. The variograms from 10 landscape types together with the parameters derived after fitting the theoretical models are shown in Fig. 21. In most cases, the Spherical and Exponential models were found to be the best models to fit the experimental variogram. The variogram from Post-glacial Marine Deposits clearly followed a periodic pattern, as expressed by the wavy shape, where the Hole-effect model would fit the best. However, as no such model was considered in this thesis, the Exponential model was fitted. The analysis of these variograms showed that the most variable clay distribution was from the Reclaimed Land, followed by the clay from Marsh Areas and Post-glacial Marine Deposits (Fig. 21). Among the moraine types, clay coming from Kettled and Terminal Moraine showed a lower variability than clay from Salian and Terminal Moraine. On the other hand, Aeolian deposits were found to be the least variable of all. As already mentioned, clay from the Post-glacial Marine Deposits showed a periodic pattern of distribution, probably influenced by the type and composition of the materials deposited by the sea at different periods of time. Glacial Flood Plain clay showed the shortest range of spatial dependence (short range variability), while the Moraine Landscape had the longest range. Among all the landscape types, the maximum nugget variance of clay was from the Sub-glacial Tunnel Valley. Due to the smaller number of data points for Reclaimed Land (42 samples), the variogram from there looked very unusual (Fig. 21).
Fig. 21. Variogram and variogram parameters of soil clay content from ten different landscape types in the study area. The dots represent the points of the experimental variogram, whereas the continuous line is the best fitted variogram model. The colour scale of the points (red to blue) corresponds to the increasing number of point pairs used to generate each of these experimental variogram points (red = minimum and blue = maximum). Similarly, the variograms of clay, silt, fine sand and coarse sand content from the six standard depths as derived from spline fitting were also calculated and analysed. As there were a large number of variograms generated, only the results of the analysis are listed in Table 4. For all components at all soil depths, the best model fit to the experimental
variogram was provided by the Spherical model. For all four components, a decreasing range of values together with increasing semi-variance was observed with increasing depth in the soil. This suggests that the soil in deeper horizons was more variable than the surface soil, strengthening our assumption of more homogeneous topsoil (i.e. plough depth) due to decades of continuous agricultural activities in Denmark.
Table 4. Results of the variogram analysis of soil texture components at six standard soil depths in the study area (reproduced from Paper II). †
Texture fraction | Parameter | 0-5 cm | 5-15 cm | 15-30 cm | 30-60 cm | 60-100 cm | 100-200 cm
Clay | AIC | 257 | 250 | 247 | 237 | 214 | 225
Clay | Fitted Model | Spherical | Spherical | Spherical | Spherical | Spherical | Spherical
Clay | C0 | 2389 | 2399 | 3179 | 5623 | 7178 | 7220
Clay | C1 | 2510 | 2397 | 2385 | 1910 | 1707 | 1749
Clay | A1 | 74.4 | 72.8 | 65.3 | 66.7 | 59.6 | 35
Silt | AIC | 221 | 215 | 218 | 206 | 215 | 229
Silt | Fitted Model | Spherical | Spherical | Spherical | Spherical | Spherical | Spherical
Silt | C0 | 2209 | 2102 | 2125 | 2354 | 2847 | 2749
Silt | C1 | 1939 | 1925 | 1892 | 1490 | 1310 | 1379
Silt | A1 | 97.1 | 96.8 | 93.8 | 90.2 | 80.4 | 75.7
Fine sand | AIC | 290 | 286 | 289.2 | 297 | 303 | 305
Fine sand | Fitted Model | Spherical | Spherical | Spherical | Spherical | Spherical | Spherical
Fine sand | C0 | 16008 | 14366 | 12449 | 14072 | 18146 | 23449
Fine sand | C1 | 11650 | 11386 | 12702 | 14755 | 14460 | 13355
Fine sand | A1 | 73.6 | 72.3 | 68.2 | 67.9 | 72.2 | 76.9
Coarse sand | AIC | 282 | 278.2 | 278.1 | 285 | 294 | 289.7
Coarse sand | Fitted Model | Spherical | Spherical | Spherical | Spherical | Spherical | Spherical
Coarse sand | C0 | 35127 | 32962 | 32160 | 36494 | 43890 | 49052
Coarse sand | C1 | 16051 | 15705 | 16912 | 23055 | 26682 | 26536
Coarse sand | A1 | 36.9 | 37.8 | 44.9 | 45.1 | 38.3 | 34.4
† AIC, Akaike Information Criteria; C0, nugget (g/kg); C1, partial sill (g/kg); A1, range (km).
In another study, autocorrelation of soil pH from seven specified soil depths in Denmark was investigated. The variogram and its parameters for soil pH at all seven depths as derived from spline fitting are shown in Fig. 22. Compared with the soil material from deeper layers, soil pH in the uppermost 0-20 cm showed a higher variability and was well characterised by rather stable variograms, which normally indicates a smooth distribution pattern. Unlike the texture components, the analysis suggested the presence of a high
spatial variability in pH distribution on the surface layers. The variability decreased with increasing soil depth, where a long range variability pattern was noticed. Moreover, the variograms became increasingly unstable with increasing depth.
Fig. 22. Variogram of soil pH from seven depth intervals within the top 1 m of soil in Denmark. The points represent the experimental variogram and the continuous line the modelled variogram. The colour scale for the points (from red to blue) suggests an increasing number of point pairs used to generate each of these experimental variogram points. The variograms of soil bulk density from all seven specified soil depths were also modelled and the autocorrelation was analysed. The results are shown in Table 5. It was found that variograms of soil bulk density did not change much with depth. A slightly increased variability was observed in the top 0-10 cm soil layer, while below that it remained almost the same. Another noticeable difference among the depths was the range of spatial dependence, which was higher for the surface layers than deeper layers. Compared with soil pH, bulk density had a higher range of spatial dependence for all depths, suggesting a
long range variability pattern. A spatial influence on the distribution of bulk density was not expected beyond approx. 70-80 km in the study area.
Table 5. Variogram parameters of soil bulk density from seven soil depth intervals within the top 1 m of soil depth in the study area. †
Soil property | Parameter | 0-5 cm | 5-10 cm | 10-20 cm | 20-30 cm | 30-50 cm | 50-70 cm | 70-100 cm
Bulk density | AIC | -226 | -227 | -256 | -264 | -263 | -266 | -256
Bulk density | Fitted Model | Exp. | Exp. | Exp. | Exp. | Exp. | Exp. | Exp.
Bulk density | C0 | 0.02 | 0.02 | 0.01 | 0.01 | 0.01 | 0.01 | 0.01
Bulk density | C1 | 0.013 | 0.01 | 0.01 | 0.009 | 0.01 | 0.008 | 0.01
Bulk density | A1 | 82 | 82.4 | 79.2 | 80.6 | 82.4 | 67.4 | 70
† AIC, Akaike Information Criteria; Exp., exponential variogram model; C0, nugget (g/cm3); C1, partial sill (g/cm3); A1, range (km).
The autocorrelation of the residuals of prediction (difference between measured and predicted soil properties) from regression analysis was also investigated to check whether the residuals were still influenced by spatial phenomena, or whether their appearance was random. Variograms of the residuals were derived for all the soil properties and at all defined soil depths where the prediction was made. It was found that all the residuals had some, but rather weak, spatial connection, as shown by the corresponding variogram parameters for the soil properties concerned. As an example, while predicting topsoil clay content from the selected study area in Denmark, the variogram of the residuals was best characterised by the Spherical model (AIC -132.4). The nugget, partial sill and spatial range of this variogram were 0.035%, 0.07% and 1471 m, respectively. This suggests that the residual distribution was also spatially influenced in the study area. Another example is the variogram of residuals of soil clay content at a depth of 0-5 cm from the whole of Denmark (Fig. 23). It had a very high nugget compared with the partial sill, with the contribution of the nugget to the overall variance being more than 64%, suggesting that only about 36% of the variability was spatially structured. With the variogram analysis, it was therefore possible to model and understand the autocorrelation and spatial variability of soil properties. The variogram of clay from the Reclaimed Land was unusual (Fig. 21), probably because it was based on only 42 samples, too few for a stable and meaningful variogram. The variograms of soil texture from different soil depths suggested that its variability increased with depth, with surface soils being less heterogeneous than soils from deeper layers, especially for texture
components. Soil pH exhibited a higher variability in the top three layers (0-30 cm depth), whereas the variability in soil bulk density was more or less the same for all depths except the first two (0-5 and 5-10 cm), which showed relatively higher variability. Similarly, the residual variogram of soil properties and its modelling also highlighted the existence of spatial phenomena in the distribution of regression residuals in the study area.
Fig. 23. Residual variogram and its parameters for soil clay content at a depth of 0-5 cm in the study area.
5 Spatial prediction and Mapping 5.1 Geostatistical methods The geostatistical methods applied in the studies included Ordinary Kriging (OK) and Stratified OK (OKst). Although OK is a univariate kriging where the covariates cannot be included in the prediction model, it was chosen because it is one of the most widely used geostatistical techniques in soil science (Burgess and Webster, 1980; Goovaerts, 1999). In the case of soil properties, where the local mean may vary significantly over the study area, OK allows such local variation to be accounted for by limiting the domain of stationarity of the mean to the local neighbourhood (Goovaerts, 1997). The derivation function of OK together with the associated variance is given as:
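$$ Z^{*}_{OK}(x_0) = \sum_{i=1}^{N(x_0)} \lambda_i \, Z(x_i) \quad \text{with} \quad \sum_{i=1}^{N(x_0)} \lambda_i = 1 $$    Eq. (7)

$$ \sigma^{2}_{OK}(x_0) = \sum_{i=1}^{N(x_0)} \lambda_i \, \gamma(x_i - x_0) + \psi $$    Eq. (8)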
where λi are the weights assigned to n number of observations taken around ( x0 ) , Z ( xi ) is the attribute value at ith location, and γ( xi x0 ) is the semivariance of Z between the sampling point xi and the unvisited point x0 . The quantity ψ introduced in Eq. (8) is to minimise OK variance under the restriction that the sum of the weights must be equal to one. Paper I briefly explains the concept of OK and OKst, together with their application in soil clay mapping of the selected area in Denmark. An example of OK application is also shown in Paper V, where the spatial variability of soil moisture content from a small area in western Hungary was mapped. Similarly, an OK-based mapping procedure was applied in Papers II and IV to visualise the spatial distribution of prediction residuals of texture components in the entire Danish area. During the mapping, the whole set of data (6919 samples) from the selected study area in Denmark was first divided randomly into training (80% data) and test (20% data) sets. The training set was used for model building, while the test set was kept aside for model
validation. The data were log-transformed to normalise the distribution prior to variogram modelling and the results were back-transformed once the kriging was done to get the original values. Model performance was evaluated with the coefficient of determination (R2), root mean square error (RMSE), and residual prediction deviation (RPD) (Williams, 1987), derived according to Eqs. (9), (10) and (11), respectively. Models with RPD higher than 1.4 are generally considered useful and reliable (Chang et al., 2001).
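$$ R^2 = \frac{\sum_{i=1}^{n}\left[z^{*}(x_i) - \bar{z}(x)\right]^2}{\sum_{i=1}^{n}\left[z(x_i) - \bar{z}(x)\right]^2} $$    Eq. (9)

$$ RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left[z(x_i) - z^{*}(x_i)\right]^2} $$    Eq. (10)

$$ RPD = \frac{SD}{RMSE\,\sqrt{n/(n-1)}} $$    Eq. (11)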
where the number of point observations is denoted by n, the measured and predicted clay content at the ith location by $z(x_i)$ and $z^{*}(x_i)$, respectively, the mean of the observed values by $\bar{z}(x)$, and the standard deviation of the validation data set by SD.
After kriging, the following two predicted maps of clay content were obtained. Fig. 24a shows the map of OK-predicted soil clay content, together with the map of kriging variance (Fig. 24b) from the selected area in Denmark. Most of the area had a clay content of less than 10%, whereas the model-predicted clay content was slightly higher (10-25%) for only a few small areas towards the east. Fig. 24b suggests a higher prediction error towards the boundary and in areas with low sample density. As expected from the behaviour of OK variance, the error increased with distance from the sample locations. A similar pattern was reported in Paper V. Fig. 25 shows similar maps of clay content, but derived from the OKst procedure. As the kriging was performed for each landscape type separately, the OKst map appeared to be more detailed than the OK-predicted map, which was rather smooth in appearance. Interestingly, it was also observed that the OK-predicted map had a lower prediction error than the
OKst map. The mean clay content as predicted by OK and OKst were 6.16% and 6.26%, respectively, with corresponding standard deviations of 2.97% and 3.42%. Similarly, the mean and standard deviation of the OK prediction error map were 1.14%, and 0.014, whereas they were 1.31% and 0.14 for the OKst map.
Fig. 24. (a) Soil clay content as predicted by Ordinary Kriging (OK), reproduced from Paper I, and (b) the standard error of prediction. Comparing the validation results, both OK and OKst had RPD higher than 2 and a comparable RMSE of about 0.29. However, R2 was slightly higher for OKst than for OK, suggesting that OKst was a better predictor for mapping soil clay content of the study area. However, higher mean prediction variance was reported for OKst than for OK. The validation process adopted in Paper V was based on leave-one-out cross-validation. The correlation coefficient between predicted and measured soil moisture content was 0.68. The normality of the distribution of prediction standard errors, falling within the range of ±2 of the standard deviation, confirmed that the prediction was reasonable and acceptable.
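The three validation measures can be computed as in the sketch below. It assumes that Eq. (9) is the ratio of the sum of squares of the predictions about the observed mean to the sum of squares of the observations about that mean, that Eq. (10) is the usual root mean square error, and that Eq. (11) divides SD by RMSE multiplied by the square root of n/(n - 1). Using the sample standard deviation (n - 1 denominator) for SD is a further assumption, and the function name and toy values are hypothetical.

```python
import math


def validation_metrics(measured, predicted):
    """Return (R2, RMSE, RPD) for paired measured and predicted values."""
    n = len(measured)
    mean_obs = sum(measured) / n
    ss_pred = sum((p - mean_obs) ** 2 for p in predicted)
    ss_obs = sum((m - mean_obs) ** 2 for m in measured)
    r2 = ss_pred / ss_obs
    rmse = math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)
    sd = math.sqrt(ss_obs / (n - 1))  # assumed: sample standard deviation
    rpd = sd / (rmse * math.sqrt(n / (n - 1)))
    return r2, rmse, rpd


# Hypothetical toy validation set
r2, rmse, rpd = validation_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 3.0, 3.5])
```

Under these assumptions RPD is the ratio of the spread of the validation data to the typical prediction error, which is why values well above 1 are read as a sign of a useful model.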
Fig. 25. (a) Soil clay content as predicted by Stratified Ordinary Kriging (OKst), reproduced from Paper I, and (b) the standard error of prediction.
Based on the above analysis, it was therefore concluded that stratification of samples before prediction could improve the performance of OK, even though the OKst prediction error was slightly higher than that of OK. It is also worth noting that geostatistics proved to be a very useful tool for analysing and mapping the spatial distribution of soil properties.
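For readers who want to see the OK predictor and variance of Eqs. (7) and (8) at work, the following is a minimal sketch for a single target location. It is not the software used in the thesis: it assumes an exponential variogram with a zero nugget, uses all supplied points as the neighbourhood, and solves the kriging system with a small hand-written Gaussian elimination. All function names and the toy data are hypothetical.

```python
import math


def exponential(h, c0, c, a):
    """Exponential semivariogram (nugget c0, partial sill c, effective range a)."""
    return 0.0 if h == 0 else c0 + c * (1.0 - math.exp(-3.0 * h / a))


def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def ordinary_kriging(points, target, gamma):
    """Return (estimate, kriging variance, weights) at target.

    points: list of (x, y, z); gamma: function of separation distance.
    The last unknown of the system is the Lagrange multiplier psi, which
    enforces that the weights sum to one.
    """
    n = len(points)
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    A = [[gamma(dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    A.append([1.0] * n + [0.0])
    b = [gamma(dist(p, target)) for p in points] + [1.0]
    sol = solve(A, b)
    weights, psi = sol[:n], sol[n]
    estimate = sum(w * p[2] for w, p in zip(weights, points))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + psi
    return estimate, variance, weights


# Hypothetical example: two observations, target midway between them
gamma = lambda h: exponential(h, 0.0, 1.0, 10.0)
est, var, w = ordinary_kriging([(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)], (1.0, 0.0), gamma)
```

In this symmetric example both observations receive the same weight, so the estimate is their mean; at a sampled location the estimate reproduces the observation and the kriging variance drops to zero, consistent with the error increasing away from sample locations.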
5.2 Mixed methods 5.2.1 Mapping topsoil clay content of the selected area Two mixed methods, namely Rule-based Regression Kriging (RKrr) and Regression Trees (RT) were also applied to map topsoil clay content from the study area using the same data points as above (Fig. 10). The prediction based on RKrr followed three specific steps, where a number of regression rules were generated to model the soil-landscape relationship using a data-mining tool called Cubist (www.rulequest.com). In the second step, the difference between measured and predicted clay content from each training point location was derived (known as residuals), and its spatial distribution throughout the study area was mapped using OK. In the last step, the kriged residual surface was added to the map generated from regression rules only to get a final prediction output. This modelling approach also provided an opportunity to quantify the importance of environmental variables used to predict soil clay content in the study area. Paper I explains the underlying theory and the mapping procedure applied. As mentioned above, the model output of the rule-based regression is a large set of prediction rules, each of which is specific to certain conditions, if met; the attached regression model predicts the soil attribute in question. Although in our case the model generated 19 different rules to predict soil clay content, only one rule (Rule 1) is included here as an example. (Please refer to Appendix B for the remaining rules). Rule 1: [154 cases, mean 0.98, range 0.12 to 1.62, est err† 0.22] if georegions in (1, 3) land use in (3, 5, 8, 21, 24) mrvbf† > 0.37 soil map in (1, 2) valley depth > 1.93 valley depth 0.37 and valley depth index between 1.93 and 3.35. In addition, the rule suggested an error of 0.22 while predicting clay content in these locations. The predicted and measured clay content are plotted against each other in Fig. 26a, while Fig. 26b shows the distribution of the prediction residuals. 
The correlation between predicted and measured clay content was 0.85, and the relative error (RE) was 0.47. RE gives the ratio of average error and the error that would result from always predicting the mean. For the useful models, this should always be less than 1. The RE was derived from Eq. (12) (Minasny and McBratney, 2008). Similarly, the mean and standard deviation of the regression residuals were -0.01 and 0.32, respectively.
$$ RE = \frac{\frac{1}{n}\sum_{i=1}^{n}\left|z(x_i) - z^{*}(x_i)\right|}{\frac{1}{n}\sum_{i=1}^{n}\left|z(x_i) - \bar{z}(x)\right|} $$    Eq. (12)

where the number of point observations is denoted by n, the measured and predicted values at the ith location by $z(x_i)$ and $z^{*}(x_i)$, respectively, and the mean of the observed values by $\bar{z}(x)$.
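The relative error can be computed as in the short sketch below. It assumes that both the average error and the baseline error in Eq. (12) are mean absolute deviations; the function name and the toy values are hypothetical.

```python
def relative_error(measured, predicted):
    """Average absolute error divided by the error of always predicting the mean."""
    n = len(measured)
    mean_obs = sum(measured) / n
    avg_err = sum(abs(m - p) for m, p in zip(measured, predicted)) / n
    baseline = sum(abs(m - mean_obs) for m in measured) / n
    return avg_err / baseline


# Hypothetical toy data
obs = [1.0, 2.0, 3.0, 4.0]
re_model = relative_error(obs, [1.5, 2.0, 3.0, 3.5])
re_mean = relative_error(obs, [2.5] * 4)  # always predicting the observed mean
```

A model that always predicts the observed mean has RE equal to 1 by construction, which is why values below 1 indicate a useful model.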
Fig. 26. (a) Scatter plot of measured clay content (%) and clay content predicted by regression rules and (b) distribution of the regression residuals. The prediction model was then applied to the whole set of covariate grid data to generate the maps. The code for this, written in FORTRAN, was used to generate prediction and residual grids at a spatial resolution of 30.4 m for the whole study area. The maps generated for each of the three steps described earlier are shown in Fig. 27, where Fig. 27a is the regression output, Fig. 27b is the residual surface, and Fig. 27c is the final predicted clay map for the study area. The final predicted map showed that the eastern part of the study area was richer in clay content compared with the west. Some clay-rich patches were also present towards the north-west corner. The soils with less clay content in the west were developed on sandy parent material dominated by glacio-fluvial plains, whereas the clay-rich areas in the east were developed in a loamy parent material mostly in moraine landscapes. As Fig. 27b shows, in the areas where the predicted clay content was higher, huge differences between measured and predicted values were observed and vice versa. In some minor pockets, represented as dark-blue patches on the map, clay content was predicted badly. The positive value of the residuals for the whole study also suggested that the model under-predicted the clay distribution.
Fig. 27. (a) Soil clay content as predicted by Rule-based Regression Kriging (RKrr), (b) regression residuals and (c) the final predicted map obtained using the RKrr approach. Part (c) reproduced from Paper I.

The modelling approach based on RT involves partitioning the data into more homogeneous components (nodes) in terms of the prediction covariates used, and assigning a predicted value of the attribute in question to each of these nodes or leaves of the generated tree. The difference between regression-rules (RR) and RT is that the former builds a regression model for each of the predicting leaves, whereas the latter assigns only a single fixed value to each leaf. The map of soil clay content as predicted by the RT method is
shown in Fig. 28. Comparing Fig. 27c and Fig. 28, the pattern of clay distribution was more or less similar, but the predicted values within a given pattern always seemed higher with the RT model than the RKrr model (see Paper I for more details).
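The difference between the two leaf models can be illustrated with a short Python sketch. The leaf contents, the covariate name `slope` and the coefficients are hypothetical and serve only to show why RT returns one value per leaf while RR lets the prediction vary within a leaf.

```python
def tree_leaf_prediction(training_values_in_leaf):
    """Regression tree: one constant (the leaf mean) for every cell in the leaf."""
    return sum(training_values_in_leaf) / len(training_values_in_leaf)


def rule_leaf_prediction(intercept, coefficients, covariates):
    """Regression rules: a linear model evaluated with the cell's own covariates."""
    return intercept + sum(
        coefficient * covariates[name] for name, coefficient in coefficients.items()
    )


# Hypothetical leaf: three training profiles and two grid cells that fall in it.
leaf_clay = [4.0, 6.0, 8.0]
cells = [{"slope": 0.0}, {"slope": 4.0}]

rt_values = [tree_leaf_prediction(leaf_clay) for _ in cells]
rr_values = [rule_leaf_prediction(5.0, {"slope": 0.5}, cell) for cell in cells]
```

Both cells receive the same RT value, whereas the RR values differ with the slope of each cell.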
Fig. 28. Soil clay content in the study area as predicted by the Regression Tree (RT) method, reproduced from Paper I.

Both of these methods also provided some information about the importance of the variables used to predict soil clay content. RT found a dominant effect of soil map (RI 100%) over other predictors such as geology and geo-region (corresponding RI about 2%). However, in the case of RKrr, apart from the soil map, which had an RI of 95%, predictors such as landscape type, geology, geo-region, land use, and elevation all had an RI higher than 40%. This showed that RKrr gave more variables a substantial contribution in the prediction model, and probably ensured that the model captured more of the variability of clay than RT, which considered only the soil map as a major predictor. Of the four methods applied to map soil clay content in the study area (i.e. OK, OKst, RKrr, and RT), the highest mean clay content of 8.09% was predicted by RT and the lowest by OK (6.16%). The mean predicted by OKst (6.26%) was slightly higher than that by OK, but was still lower than that provided by RKrr. The mean and standard deviation of clay content as predicted by RKrr were 6.42% and 3.87, respectively. Comparing the prediction performance of the four methods applied, the best performance was given by RKrr, for which the highest R2 (0.74) and the lowest RMSE (0.281) were recorded. The RPD was higher than 2 for all prediction methods except RT, which had an RPD of 1.17. The lowest R2 (0.53) and the highest RMSE (0.512) were recorded for RT, suggesting that it is the weakest method among all tested here to predict soil clay
content. The prediction performance of the methods decreased in the order: RKrr > OKst > OK > RT. Based on these results, we recommend RKrr as the most suitable method for any further soil mapping activities in Denmark.

5.2.2 Mapping texture components, pH and bulk density at multiple soil depths in Denmark

The main aim of this thesis was to map the vertical distribution of soil properties at a national extent in Denmark using state-of-the-art DSM techniques. As Paper I recommended RKrr as the most suitable DSM method, we adopted it to predict soil texture components at multiple soil depths for the whole of Denmark (Paper II). Prediction of soil pH and bulk density followed a similar mapping approach. For mapping soil texture components, we selected 1958 soil profiles (Fig. 11) for which clay, silt, fine sand and coarse sand contents at six standard soil depths were derived using spline functions (Section 3). Seventeen environmental variables, many of which were derived from the DEM (Fig. 14), were used as predictors of soil texture (Table 3). As before, 80% of the data were used for model building, while the remaining 20% were kept for validation. Altogether, 24 prediction models (four texture components at six soil depths) were generated in Cubist. Each model consisted of a number of rules, and each rule specified a set of conditions which, when met, triggered prediction by the associated regression function. The examples below show one of the several rules of the prediction models for clay, silt, fine sand and coarse sand content from 0-5 cm soil depth. The full models are provided in Appendix C. According to the first rule in the box below (i.e. Rule 1 for clay from 0-5 cm), when the conditions set by the given classes of geology and soil type were met, clay in the 0-5 cm soil layer was predicted using the regression function, in which predictors such as slope, slope length factor, etc. with their corresponding coefficients were used to obtain a predicted value of clay in that specified area. This rule applied to 74 of the 1566 training profiles (80% of the total of 1958 profiles), with an estimated prediction error of 25.28 g/kg. The other examples in the box include the first rule of the models for silt, fine sand and coarse sand content.
Rule 1: [74 cases, mean 27.47, range 0 to 156.65, est err† 25.28]
    if
        geology in (1, 3, 5, 8, 11, 14)
        soil map = 11
    then
        clay_0-5 (g/kg) = 89.73 - 6.6 * slp_deg† + 10.6 * ls_factor† - 2.7 * mrvbf† - 2.5 * twi† 1.9 * saga_wi† + 0.0006 * vertdist_chn†

Rule 1: [169 cases, mean 25.93, range 0 to 163.27, est err 18.82]
    if
        landscape in (1, 2, 4, 8, 9, 10)
        landuse in (2, 4, 7, 11, 17, 18, 23, 24)
    then
        silt_0-5 (g/kg) = 75.28 - 1.60 * mrvbf - 2.80 * saga_wi - 3.10 * ls_factor - 0.09 * elevation + 0.001 * vertdist_chn - 0.50 * slp_deg

Rule 1: [36 cases, mean 98.44, range 0 to 497.52, est err 108.16]
    if
        twi > 8.49
        geology in (1, 3, 5, 10, 14)
        landuse in (2, 5, 7, 9, 11, 18, 20, 21)
        soil map = 11
    then
        finesand_0-5 (g/kg) = 131.45 - 8.8 * mrvbf - 0.76 * elevation - 7 * slp_deg + 14 * ls_factor - 5 * twi + 0.05 * aspect†

Rule 1: [63 cases, mean 25.70, range 0 to 122.08, est err 20.34]
    if
        geology in (6, 20)
    then
        coarsesand_0-5 (g/kg) = 32.44 - 3 * twi - 3 * slp_deg - 1.3 * mrvbf + 4 * ls_factor + 2 * saga_wi + 0.11 * elevation

† est err, estimation error; slp_deg, slope gradient; ls_factor, slope length factor; mrvbf, multi-resolution index of valley bottom flatness; twi, topographic wetness index; saga_wi, System for Automated Geoscientific Analyses wetness index; vertdist_chn, vertical distance to channel network.
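To illustrate how such a rule operates on a single grid cell, the sketch below encodes Rule 1 of the silt model (0-5 cm) as a Python dictionary and evaluates it. The data structure, the function `apply_rule` and the covariate values of the example cell are hypothetical; only the conditions and coefficients are taken from the box above. Cubist itself may additionally combine several matching rules, which this sketch does not attempt to reproduce.

```python
# Rule 1 for silt (0-5 cm), transcribed from the rule box.
SILT_RULE_1 = {
    "conditions": {
        "landscape": {1, 2, 4, 8, 9, 10},
        "landuse": {2, 4, 7, 11, 17, 18, 23, 24},
    },
    "intercept": 75.28,
    "coefficients": {
        "mrvbf": -1.60,
        "saga_wi": -2.80,
        "ls_factor": -3.10,
        "elevation": -0.09,
        "vertdist_chn": 0.001,
        "slp_deg": -0.50,
    },
}


def apply_rule(rule, cell):
    """Return the regression value if the cell meets every condition, else None."""
    for name, allowed_classes in rule["conditions"].items():
        if cell[name] not in allowed_classes:
            return None
    return rule["intercept"] + sum(
        coefficient * cell[name] for name, coefficient in rule["coefficients"].items()
    )


# Hypothetical grid cell (covariate values invented for illustration).
cell = {
    "landscape": 2, "landuse": 4, "mrvbf": 1.0, "saga_wi": 10.0,
    "ls_factor": 2.0, "elevation": 50.0, "vertdist_chn": 1000.0, "slp_deg": 2.0,
}
silt_g_per_kg = apply_rule(SILT_RULE_1, cell)
```

For this invented cell the rule conditions are met and the regression function gives 34.98 g/kg; a cell with a land use class outside the listed set would not be handled by this rule.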
Once the models were generated, the predicted texture values at the training profile locations were compared with the measured values from the same locations and a scatter plot was generated, as shown in Fig. 29. The correlation between measured and predicted values for clay and silt content was 0.76 and that for fine and coarse sand components was 0.64 and 0.67, respectively. The RE of prediction for clay and silt was lower than that for both the sand fractions, but all the values were lower than 1, suggesting that our predictions were useful.
Fig. 29. Scatter plot of measured texture components (g/kg) and texture components predicted by the Rule-based Regression method from 0-5 cm depth. The continuous line represents the 1:1 line.

As specified in Section 5.2.1, prediction residuals at each location for all components from their corresponding soil depths were derived, and their distribution was checked with frequency histograms and normal quantile plots. As an example, Fig. 30 presents the histograms and the normal quantile plots of the residuals for clay, silt, fine sand and coarse sand content of the 0-5 cm soil layer in the study area. In general, the residuals for all components approximately followed a normal distribution, where the mean residual values for clay and silt were close to zero (-1.2 and -1.8 g/kg, respectively) and those for fine sand and coarse sand content were -11.4 and -12 g/kg, respectively. The median value for all components was zero. For an unbiased prediction, the residuals should be normally distributed with a mean of zero, and the standardised residuals should have a standard deviation of 1.
Fig. 30. Histograms and normal quantile plots of residuals derived for (a) clay, (b) silt, (c) fine sand and (d) coarse sand, for the 0-5 cm soil layer. The left-hand graph in each of the four diagrams represents a histogram, the middle one an outlier box plot, and the right-hand one a normal quantile plot.

The regression outputs were extended over the whole study area using tools written in FORTRAN, and the grid maps were displayed in ArcGIS for further processing and analysis. The residuals were interpolated over the study area using OK with a local variogram. The two grids were then added together, as described above, to obtain the final predicted grid map of the texture components (see Papers I and II for explanation). As an example, Fig. 31 displays the map of regression rules, and the map of the residuals from the regression of clay content at 0-5 cm soil depth. The final predicted map shown in Fig. 32 is the result of adding Fig. 31a and Fig. 31b together. The full set of final predicted maps for all texture components is shown in Figs. 32-35.
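The two-step combination described above (regression surface plus kriged residuals) can be sketched in Python with NumPy. This is a simplified illustration, not the FORTRAN and ArcGIS workflow used in the thesis: it assumes an exponential variogram with hypothetical parameters, uses all observations in a single global neighbourhood rather than a local variogram, and works on invented coordinates and residuals.

```python
import numpy as np


def exponential_variogram(h, nugget, partial_sill, practical_range):
    """Exponential semivariance; zero at h = 0 and nugget + partial sill at large h."""
    h = np.asarray(h, dtype=float)
    gamma = nugget + partial_sill * (1.0 - np.exp(-3.0 * h / practical_range))
    return np.where(h > 0.0, gamma, 0.0)


def ordinary_kriging(xy_obs, values, xy_new, **variogram):
    """Ordinary kriging prediction at xy_new using every observation."""
    xy_obs = np.asarray(xy_obs, dtype=float)
    xy_new = np.asarray(xy_new, dtype=float)
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Left-hand side: semivariances between observations, bordered by the
    # unbiasedness constraint that forces the weights to sum to one.
    d_obs = np.linalg.norm(xy_obs[:, None, :] - xy_obs[None, :, :], axis=2)
    lhs = np.ones((n + 1, n + 1))
    lhs[:n, :n] = exponential_variogram(d_obs, **variogram)
    lhs[n, n] = 0.0
    # Right-hand side: semivariances between observations and targets.
    d_new = np.linalg.norm(xy_obs[:, None, :] - xy_new[None, :, :], axis=2)
    rhs = np.ones((n + 1, len(xy_new)))
    rhs[:n, :] = exponential_variogram(d_new, **variogram)
    weights = np.linalg.solve(lhs, rhs)[:n, :]
    return weights.T @ values


# Hypothetical profile locations (m), regression residuals and target cells.
xy_profiles = [(0, 0), (1000, 0), (0, 1000), (1000, 1000)]
residuals = [1.0, -0.5, 0.2, -0.7]
xy_cells = [(0, 0), (500, 500)]
regression_cells = np.array([60.0, 75.0])  # rule-based regression output

kriged_residuals = ordinary_kriging(
    xy_profiles, residuals, xy_cells,
    nugget=0.0, partial_sill=1.0, practical_range=2000.0,
)
final_cells = regression_cells + kriged_residuals  # regression kriging estimate
```

With a zero nugget the first cell, which coincides with a profile, recovers that profile's residual exactly, while the central cell receives equal weights from the four surrounding profiles.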
Fig. 31. (a) Map of clay content at 0-5 cm depth as predicted by Rule-based Regression and (b) residuals from regression.
Fig. 32. Predicted clay content at six standard soil depths for the whole of Denmark (reproduced from Paper II).
Fig. 33. Predicted silt content at six standard soil depths for the whole of Denmark (reproduced from Paper II).
Fig. 34. Predicted fine sand content at six standard soil depths for the whole of Denmark (reproduced from Paper II).
Fig. 35. Predicted coarse sand content at six standard soil depths for the whole of Denmark (reproduced from Paper II).

These maps showed that Danish soils are rich in sand. In terms of the geographical distribution of the texture components, more sand was found towards the west and in the north, whereas central and eastern areas were rich in clay at all depths. Soil clay content was found to increase with increasing depth for the whole country, suggesting a migration of clay from surface soils to deeper layers, where it accumulates. The highest average soil clay content was found at the lowest depth studied, 100-200 cm. Silt content followed a similar spatial distribution pattern to clay, but decreased with depth. The highest average silt content was found at the surface and the lowest at a depth of 100-200 cm. A higher fine sand content at all soil depths was observed towards the north, where the post-glacial and late-glacial marine materials were deposited. However, along the glacial flood plains, fine sand content was found to decrease with depth. The coarse sand fraction also increased with depth, but its maximum value was observed at 60-100 cm. The glacial flood plains in the west were found to be very rich in coarse sand at almost all predicted soil depths. Table 6 lists the mean and standard deviation of the predicted texture components at all six standard depths, as derived from the predicted maps (see Paper II for explanation).

Table 6. Mean and standard deviation of predicted texture maps (g/kg) from six standard depths.

Texture component   Parameter    Soil depth (cm)
                                 0-5      5-15     15-30    30-60    60-100   100-200
Clay                Mean         79.3     80.3     87.0     101.0    107.3    112.3
                    Std. dev.    48.6     48.8     51.8     65.7     67.5     65.1
Silt                Mean         84.4     85.0     82.3     77.9     78.3     78.1
                    Std. dev.    50.1     50.4     48.9     50.3     53.0     51.2
Fine sand           Mean         368.9    364.5    370.7    362.9    359.8    358.2
                    Std. dev.    134.6    130.6    124.9    123.3    133.8    136.2
Coarse sand         Mean         357.9    363.6    376.5    377.2    389.0    385.5
                    Std. dev.    170.0    169.4    173.6    184.8    201.8    208.5
The RKrr model was also able to identify the important variables and quantify their influence in predicting soil texture components at multiple soil depths in Denmark. Based on the relative importance of the variables used in prediction models, the most influential variables were determined and are listed in Table 7. In most cases, soil map, geology and landscape type, together with LSPs such as SAGA wetness index, slope length factor, elevation, etc. were the variables most frequently used in the models. It was also observed that for the prediction of clay content in the upper horizons, soil map made the highest contribution while setting the rule conditions. However, for the deeper layers (i.e. below 30 cm depth), the influence of geology seemed higher. Similar results were observed for silt and coarse sand content. Together with the soil map, landscape type played an influential role in silt distribution. Elevation, on the other hand, was an important variable for setting the conditions for surface clay, and also for the coarse sand distribution in the study area.
Table 7. Most influential environmental variables with their relative importance (%, in parentheses) identified during the prediction of soil texture components from six standard soil depths for the whole of Denmark.† For each depth, the top three predictors used for rule setting and the top five predictors used for model building are listed.

Clay
  0-5 cm
    Rule setting: Soil map (100), Geology (38), Elevation (16)
    Model building: Slope gradient and TWI (91), SAGA WI and Vertdist_chn (81), LS-factor (47)
  5-15 cm
    Rule setting: Soil map (100), Geology (38), Elevation (16)
    Model building: TWI (95), Slope gradient (94), Vertdist_chn (76), SAGA WI (72), LS-factor (53)
  15-30 cm
    Rule setting: Soil map (100), Geology (34), Elevation (30)
    Model building: Vertdist_chn (90), SAGA WI (88), Elevation and Valley depth (49), LS-factor (30)
  30-60 cm
    Rule setting: Geology (90), Soil map (79), Landscape (28)
    Model building: Vertdist_chn (90), LS-factor (74), SAGA WI (70), MRVBF (52), Valley depth (48)
  60-100 cm
    Rule setting: Geology (100), Soil map (74)
    Model building: SAGA WI, LS-factor and Vertdist_chn (93), MRVBF and Slope gradient (65)
  100-200 cm
    Rule setting: Geology (100), Soil map (90), Geo-regions (28)
    Model building: Slope gradient (69), SAGA WI (62), TWI (34), MRVBF and LS-factor (34)

Silt
  0-5 cm
    Rule setting: Soil map (90), Landscape (54), Land use (27)
    Model building: LS-factor (100), MRVBF (92), Elevation (86), Vertdist_chn (80), SAGA WI (81)
  5-15 cm
    Rule setting: Soil map (85), Landscape (56), MRVBF (30)
    Model building: LS-factor (100), SAGA WI (84), Vertdist_chn (79), Elevation (73), MRVBF (69)
  15-30 cm
    Rule setting: Soil map (100), Landscape (26), Geology (23)
    Model building: MRVBF (90), Slope gradient (84), Valley depth (75), SAGA WI (73), LS-factor (49)
  30-60 cm
    Rule setting: Soil map (80), Geology (51), Landscape (392)
    Model building: SAGA WI (100), LS-factor (88), Vertdist_chn (69), MRVBF (61), TWI (44)
  60-100 cm
    Rule setting: Soil map (82), Geology and Vertdist_chn (50)
    Model building: SAGA WI, LS-factor and Slope gradient (91), MRVBF (74), Vertdist_chn (43)
  100-200 cm
    Rule setting: Soil map (81), Geo-regions (55), Geology (51)
    Model building: SAGA WI (85), LS-factor (80), Slope gradient (76), Valley depth and TWI (44)

Fine sand
  0-5 cm
    Rule setting: Soil map (97), Land use (65), Geology (51)
    Model building: Elevation (73), SAGA WI (60), LS-factor (57), MRVBF (41), Slope gradient (39)
  5-15 cm
    Rule setting: Soil map (100), Geology (88), Vertdist_chn (61)
    Model building: Elevation (89), Slope gradient (88), SAGA WI (72), MRVBF (51), Valley depth (44)
  15-30 cm
    Rule setting: Soil map (97), Geology (65), Vertdist_chn (47)
    Model building: Elevation (78), SAGA WI (65), Valley depth (48), MRVBF (45), TWI (44)
  30-60 cm
    Rule setting: Soil map (98), Geology (61), Landscape (57)
    Model building: Elevation (94), SAGA WI (63), LS-factor (51), Valley depth (36), Slope gradient (35)
  60-100 cm
    Rule setting: Geology (96), Soil map (94), Vertdist_chn (48)
    Model building: SAGA WI (86), Elevation (79), LS-factor (65), TWI (20), Vertdist_chn (17)
  100-200 cm
    Rule setting: Landscape (100), Geology (78), Soil map (68)
    Model building: Elevation (100), SAGA WI (60), LS-factor (54), Valley depth (38), Slope gradient (22)

Coarse sand
  0-5 cm
    Rule setting: Soil map (96), Geology (66), Vertdist_chn (43)
    Model building: TWI (84), LS-factor (65), Elevation (55), MRVBF (52), Vertdist_chn (51)
  5-15 cm
    Rule setting: Soil map (96), Geology (59), Elevation (41)
    Model building: SAGA WI (80), TWI (68), Elevation (61), MRVBF (58), LS-factor (52)
  15-30 cm
    Rule setting: Soil map (96), Geology (79), Land use (36)
    Model building: SAGA WI (96), TWI (87), Elevation (66), Slope gradient (64), MRVBF (58)
  30-60 cm
    Rule setting: Geology (96), Soil map (95), Landscape (29)
    Model building: SAGA WI (88), Elevation (62), TWI (59), LS-factor (57), MRVBF (28)
  60-100 cm
    Rule setting: Geology (95), Soil map (72), Elevation (28)
    Model building: SAGA WI (81), TWI (70), Elevation (56), LS-factor (53), Slope gradient (52)
  100-200 cm
    Rule setting: Geology (100), Soil map (95), Elevation (30)
    Model building: Elevation (99), SAGA WI (95), LS-factor (71), MRVBF (54), Slope gradient (32)

† TWI, topographic wetness index; SAGA WI, System for Automated Geoscientific Analyses wetness index; Vertdist_chn, vertical distance to channel network; LS-factor, slope length factor; MRVBF, multi-resolution index of valley bottom flatness.
The validation of the prediction models based on the 20% of data withheld from model building clearly indicated that the clay and silt contents were better predicted (i.e. higher R2 and lower RMSE) than fine sand and coarse sand contents for the top three soil layers (0-5, 5-15 and 15-30 cm), whereas in the bottom three depth layers, prediction of coarse sand was better than that of the other texture components (Fig. 36). The RPD, RE and ME of the predictions are given in Table 8. The ME was calculated using Eq. (13). The largest deviation of ME from zero was associated with the coarse sand content, whereas the ME closest to zero was recorded for silt content.
\[
ME = \frac{1}{n}\sum_{i=1}^{n}\left[ z(x_i) - z^{*}(x_i) \right] \qquad \text{Eq. (13)}
\]
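A minimal Python sketch of Eq. (13) is given below, taking the error as measured minus predicted in line with Eq. (12). The function name and the example values are hypothetical. Unlike RE, the ME keeps the sign of each error, so positive and negative errors cancel and the result indicates bias rather than overall accuracy.

```python
def mean_error(measured, predicted):
    """Average of (measured - predicted); positive values mean under-prediction."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(z - z_hat for z, z_hat in zip(measured, predicted)) / len(measured)


# Hypothetical values (g/kg), for illustration only.
measured = [10.0, 20.0, 30.0]
predicted = [8.0, 21.0, 28.0]
me_value = mean_error(measured, predicted)
```

An unbiased model has an ME of zero even when its individual errors are large.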
It was also found that with increasing depth in the soil, model predictability decreased (i.e. increasing RMSE with decreasing R2). The textural heterogeneity at lower depths was higher than in the surface layers (Table 4) for almost all properties. A similar finding was made in Paper IV, where soil clay content in multiple soil depths was mapped for the whole of Denmark. Probable reasons behind this decreasing trend of model performance with depth in the soil profile suggested in Paper II relate to the nature of the LSPs used (Table 3), which mostly specify superficial processes and phenomena, and pedogenetic processes in soil profiles (e.g. tonguing of surface materials into the B horizon), which could be the main reason behind such heterogeneous subsoils. Moreover, the higher textural heterogeneity at deeper layers than in surface layers, as suggested by variogram analysis (Table 4), also underlined the fact that increased variability by depth reduced the model performance in deeper layers.
Fig. 36. Changes in R2 and RMSE values with depth in the soil, as observed during the prediction of soil texture components with RKrr.

Table 8. Residual prediction deviation, relative error and mean error of soil texture components at six standard soil depths (N = 392).†

Texture component   Parameter   Depth (cm)
                                0-5     5-15    15-30   30-60   60-100  100-200
Clay                RPD         2.06    1.87    2.02    1.46    1.75    1.96
                    RE          0.58    0.60    0.59    0.70    0.67    0.75
                    ME          2.2     4.2     1.8     -5.5    1.7     0.1
Silt                RPD         1.94    1.96    1.93    1.89    1.53    1.72
                    RE          0.58    0.58    0.62    0.68    0.72    0.74
                    ME          0.05    1.1     1.8     -0.3    -2.0    -1.3
Fine sand           RPD         1.86    1.74    1.74    1.74    2.03    1.78
                    RE          0.70    0.70    0.69    0.77    0.77    0.85
                    ME          3.2     2.8     -2.0    7.0     7.6     12.1
Coarse sand         RPD         1.82    1.83    1.78    1.58    1.56    1.75
                    RE          0.63    0.63    0.65    0.68    0.69    0.68
                    ME          13.7    8.0     6.8     20.7    12.1    27.8

† RPD, residual prediction deviation; RE, relative error; ME, mean error.
Mapping soil pH and bulk density at different soil depths also followed a similar prediction principle as described above. The spline predicted values for seven specified depths in the selected profiles (i.e. 1934 profiles for soil pH and 1113 for bulk density) were used to build rule-based prediction models in Cubist on 75% training data. An example of one of the rules for soil pH and the bulk density prediction model is given below.
Soil pH
Rule 1: [129 cases, mean 3.61, range 2.7 to 6.4, est err† 0.44]
    if
        landuse in (2, 4, 6, 9, 11, 13, 17, 18)
        georegions in (3, 4, 5, 6, 7, 9)
    then
        pH_CaCl2 (0-5 cm) = 4.189 + 0.024 valley_dep - 0.077 twi

Bulk density
Rule 1: [200 cases, mean 1.53, range 1.00 to 1.83, est err 0.10]
    if
        landuse in (2, 3, 4, 5, 7, 8, 11, 16, 17, 20, 23, 25, 29)
        soil types in (6, 7)
    then
        bulk density (0-5 cm) = 1.70 - 0.018 twi + 0.0005 vertdist_chn + 0.077 elevation 0.49 slp_deg - 0.0046 mrvbf

† est err, estimation error; slp_deg, slope gradient; mrvbf, multi-resolution index of valley bottom flatness; twi, topographic wetness index; vertdist_chn, vertical distance to channel network; valley_dep, valley depth.
The soil pH prediction model identified land use type (RI 100%), geo-region (RI 100%) and soil type (RI 86%) as the most important variables influencing the spatial distribution of soil pH at 0-5 cm soil depth. Of the LSPs derived from DEM, valley depth (RI 100%), elevation (RI 91%) and MRVBF (RI 91%) showed a maximum contribution. The topographic wetness index showed the lowest contribution (RI 14% only). Similarly, the bulk density model of the same soil depth identified land use (RI 99 %) and soil type (RI 92%) as the two most important variables, where the vertical distance to channel network (RI 90%), elevation (RI 89%), slope gradient (RI 89%), and the topographic wetness index (RI 85%) were extensively used in the regression function. The predicted pH is compared with measured values on the training data in Fig. 37, which also displays the distribution of the prediction residuals. The correlation between the two values was 0.76, with RE = 0.61. The variogram parameters for the residuals were recorded as nugget of 0.5, partial sill of 0.08 and spatial range of 4.1 km. Similarly, Fig. 38 displays a scatter plot of measured and predicted bulk density (correlation coefficient 0.53, RE 0.85) values together with the residuals from 0-5 cm depth. The variogram parameters of the residuals were nugget 0.01, partial sill 0.007, and range ~50 km. The average of both residuals was close to zero (pH = 0.03, bulk density = -0.005).
Fig. 37. Scatter plot of (a) predicted and measured soil pH from 0 to 5 cm soil depth and (b) the distribution of residuals.
Fig. 38. Scatter plot of (a) predicted and measured bulk density (g/cm3) from 0 to 5 cm soil depth and (b) the distribution of residuals.

Once the regression and residual grids for each of the corresponding soil depths were added, predicted maps of the soil pH and bulk density were obtained for all seven soil depths (Fig. 40, Fig. 42). The regression and residual grids from 0-5 cm depth for soil pH and bulk density are shown in Fig. 39 and Fig. 41, respectively.
Fig. 39. (a) Soil pH as predicted by Rule-based Regression and (b) the residuals from regression.

The mean soil pH for 0-5 cm depth was found to be 5.65, while at the bottom depth of 70-100 cm, it dropped to 5.51 (Table 9). Comparison of predicted soil pH from all soil depths showed that average pH did not change much within the top 1 m of soil, the main difference being in the pattern of its geographical distribution over the study area. The soils in the western part of Denmark along the glacial flood plains and in Saalian moraine landscapes were more acidic than the soils in central and eastern parts, which developed in loamy
parent materials in moraine landscapes. Such a distribution pattern was observed for almost all seven depths (Fig. 40). Soils developed in the aeolian deposits were also found to have low pH compared with soils from other areas. With increasing depth in the soil, the pH became more acidic, especially in western Denmark, whereas in the south-eastern part, an increasing pH was observed with depth. Similarly, soils developed under coniferous trees and heath were found to have a low pH compared with soils under agriculture and grassland.
Fig. 40. Predicted soil pH at seven depth intervals within the top 1 m of soil for the whole of Denmark.

Similarly, the mean bulk density of Danish soils in the top 0-30 cm also remained more or less stable at a value of 1.44 g/cm3, while an increased bulk density was found at a depth of 70-100 cm (Table 9). Looking at its distribution, soils in the north appeared to have lower density than soils of central and western Denmark, especially in the top 20 or 30 cm of soil. Below 30 cm, soils in south-east Denmark had higher bulk density, whereas soils in the south-west corner had the lowest value. It also emerged that
soils developed in marsh areas (south-west corner) and in post-glacial marine deposits (northern Denmark) had a lower bulk density than soils of moraine or glacio-fluvial plains.
Fig. 41. (a) Soil bulk density from 0 to 5 cm soil depth as predicted by Rule-based Regression and (b) the residuals from regression.
Fig. 42. Predicted bulk density (g/cm3) of seven soil depth intervals in the top 1 m soil in Denmark.

Table 9. Average predicted soil pH and bulk density values at different soil depths in Denmark.

Soil depth (cm)   Soil pH               Bulk density (g/cm3)
                  Mean    Std. dev.     Mean    Std. dev.
0-5               5.65    0.94          1.44    0.1
5-10              5.64    0.94          1.44    0.08
10-20             5.65    0.92          1.44    0.09
20-30             5.64    0.87          1.48    0.06
30-50             5.62    0.89          1.51    0.07
50-70             5.56    0.92          1.55    0.1
70-100            5.51    1.04          1.58    0.09
The results of the model performance evaluation to predict soil pH at all seven depths are shown in Table 10. The best prediction was made for 20-30 cm depth, where the highest R2 (0.58) and the lowest RMSE (0.53) were recorded. The lowermost layer had the worst
prediction. In the case of bulk density prediction, all the models were found to be very weak, but the best performance of the model was seen at 10-20 cm (R2 = 0.20, RMSE = 0.06).

Table 10. Evaluation of model performances in predicting soil pH at different soil depths in Denmark.†

Soil depth (cm)   R2      RMSE    RPD
0-5               0.58    0.61    1.56
5-10              0.57    0.61    1.55
10-20             0.55    0.61    1.51
20-30             0.58    0.53    1.57
30-50             0.52    0.58    1.45
50-70             0.56    0.56    1.54
70-100            0.49    0.71    1.41

† R2, coefficient of determination; RMSE, root mean square error; RPD, residual prediction deviation.
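The three indices in Table 10 can be computed as in the following Python sketch. Two assumptions are made that the text does not confirm: R2 is taken as the squared Pearson correlation between measured and predicted values, and RPD as the sample standard deviation of the measured values divided by the RMSE. The function names and the example pH values are hypothetical.

```python
import math
from statistics import mean, stdev


def rmse(measured, predicted):
    """Root mean square error of the predictions."""
    return math.sqrt(mean((z - p) ** 2 for z, p in zip(measured, predicted)))


def r_squared(measured, predicted):
    """Squared Pearson correlation (assumed definition of R2)."""
    mz, mp = mean(measured), mean(predicted)
    cov = sum((z - mz) * (p - mp) for z, p in zip(measured, predicted))
    ss_z = sum((z - mz) ** 2 for z in measured)
    ss_p = sum((p - mp) ** 2 for p in predicted)
    return cov ** 2 / (ss_z * ss_p)


def rpd(measured, predicted):
    """Standard deviation of the measurements divided by RMSE (assumed definition)."""
    return stdev(measured) / rmse(measured, predicted)


# Hypothetical validation values, for illustration only.
measured = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]
scores = (r_squared(measured, predicted), rmse(measured, predicted),
          rpd(measured, predicted))
```

Under these definitions, an RPD near 1 means the model error is about as large as the natural spread of the data, which is why low RPD values are read as weak predictions.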
The results of the above analysis suggest that the predicted soil properties not only vary in the horizontal dimension, but also show a varied pattern with increasing soil depth. The predicted values for deeper soil layers were found to be different from the values predicted for surface layers for almost all soil properties. Texture distribution was highly variable in deeper soil layers, whereas soil pH showed the maximum variability in surface soil layers. Although the pH and bulk density did not change greatly throughout the profiles, increased bulk density was observed below 30 cm, where soil pH values were lower. A sudden increase in bulk density below 20-30 cm probably represents the plough pan developed by agricultural machinery in the field. The application of the RKrr model to predict soil properties at multiple soil depths was found to be appropriate, as it allowed a number of environmental variables to be included in the prediction model without data reduction steps being required. This was a benefit offered by the Cubist tool. The model was also able to quantify the site-specific complex soil-landscape relationship by setting prediction rules that operated only once the conditions set by the rules were met. Moreover, the model was able to identify the most influential variables used in the prediction and to quantify their importance during modelling. The handling of the residuals by kriging was also logical, as many of the residuals had a clear spatial autocorrelation and distribution pattern in the study area. Adding the residual surface back to the regression output grid was also useful, as it compensated for the local under- or over-prediction that was generally the case with the regression models applied in this thesis.
The indices used to evaluate model performance are very common and widely used by DSM researchers. Although we observed a very weak performance for bulk density prediction, the rest of the predictions were in good agreement with several other DSM findings. For example, Stoorvogel et al. (2009) reported R2 of 0.08-0.23, while Malone et al. (2009) found R2 of 0.05-0.55 when predicting soil properties at multiple soil depths.

5.3 DSM for mapping soil classes

Although soil mapping in Denmark has a long history, and earlier investigations have already mapped the Danish soil types as described in Section 1.4.4, the aim in this thesis was to apply some advanced DSM tools to quantify the soil-landscape relations and produce a new raster-based soil class map of Denmark. Another aim was to address the errors associated with the mapping process, as one of the major issues with the conventional soil maps was the lack of information regarding their mapping error. We selected 1171 soil profiles from the whole of Denmark where soils were classified and recorded. The selected profiles included national 7-km grid profiles, profiles from pipeline construction and some specific profiles from different parts of the study area. The system of classification was the FAO-Unesco Revised Legend 1990 adopted in Denmark, which was developed by translating the earlier FAO-Unesco legend of 1974. Some of the major changes during the translation followed the procedures adopted by Madsen and Jensen (1996). Altogether, eight different soil classes (Soil Grouping levels) were recorded from the study area. They were Alisols, Arenosols, Cambisols, Fluvisols, Gleysols, Luvisols, Podzols and Podzoluvisols. Although Anthrosols, Leptosols and Rendzinas were also present, they were not considered in the mapping. Similarly, Histosols were not included in the mapping. However, the extent of this class was estimated from the recent peat map of Denmark compiled by Greve et al. (2013) (unpublished data). During the prediction, the whole set of data was divided into a training (80%) and a test (20%) data set, following a stratified random division using soil classes as strata. This simple procedure ensured the proportionate representation of each class in both training and test data sets. Sixteen environmental variables were used as predictors of soil classes in the study area. Most of the predictors originated from the DEM of the study area (Table 3) (see Paper III for more details on data preparation and the methods applied).
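The stratified random division can be sketched as follows in Python. The function, the random seed and the class counts in the example are hypothetical and do not reproduce the actual profile data; the point is only that each class is shuffled and split separately, so that both subsets keep the class proportions of the full data set.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Return (train_indices, test_indices), splitting each class separately."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    rng = random.Random(seed)
    train, test = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)  # random order within the stratum
        n_train = round(train_fraction * len(members))
        train.extend(members[:n_train])
        test.extend(members[n_train:])
    return train, test


# Hypothetical class counts, for illustration only.
labels = ["Luvisols"] * 50 + ["Podzols"] * 40 + ["Alisols"] * 10
train, test = stratified_split(labels)
test_counts = {name: sum(labels[i] == name for i in test) for name in set(labels)}
```

In this invented example the test set holds 10 Luvisols, 8 Podzols and 2 Alisols, i.e. 20% of each class, which a purely random split would not guarantee for the rare classes.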
The prediction of soil classes followed a Decision Tree modelling approach, where the See5 data-mining tool (www.rulequest.com) was used as a classifier. The tool generated a large tree model to quantify the relationship of FAO soil classes to the environmental variables used, and assigned an appropriate soil class to each predicting leaf as suggested by the model. During the process, adaptive boosting was applied, which was found to reduce the classification error compared with the error obtained without boosting. The tool also quantified the importance of the predicting variables to identify the most influential variables responsible for the distribution of FAO soil classes for the whole of Denmark. User Accuracy (UA), Producer Accuracy (PA), and Overall Accuracy (OA) were used to assess the prediction performance following Eqs. (14), (15) and (16), respectively (Taghizadeh-Mehrjardi et al., 2012). Moreover, a new assessment criterion, Accuracy to Predict Similar Soil Groups (ASSG), was introduced, which was intended to specify the model strength while predicting similar soil classes in the study area. Soil classes that have a similar pedogenetic process of profile development were placed into a single group; for example, Arenosols and Podzols were considered a similar soil group (see Paper III for other groups).
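\[
UA_i = \frac{X_{ii}}{\sum_{j=1}^{C} X_{ij}} \qquad \text{Eq. (14)}
\]
\[
PA_j = \frac{X_{jj}}{\sum_{i=1}^{C} X_{ij}} \qquad \text{Eq. (15)}
\]
\[
OA = \frac{\sum_{i=1}^{C} X_{ii}}{N} \qquad \text{Eq. (16)}
\]
where \(X_{ij}\) is the entry in row i and column j of the error (confusion) matrix, \(C\) is the number of soil classes and \(N\) is the total number of test profiles.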
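As a hedged illustration of Eqs. (14)-(16), the Python sketch below derives the three accuracy measures from a square error matrix. It assumes that UA divides each diagonal entry by its row total and PA by its column total; which of the two corresponds to observed or predicted classes depends on the orientation of the matrix, which is not stated here. The function name and the 2 x 2 example matrix are hypothetical.

```python
def accuracy_measures(matrix):
    """Return (row-based accuracies, column-based accuracies, overall accuracy)."""
    n_classes = len(matrix)
    diagonal = [matrix[k][k] for k in range(n_classes)]
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    total = sum(row_totals)
    # A class with an empty row or column has no defined accuracy.
    row_based = [d / t if t else None for d, t in zip(diagonal, row_totals)]
    col_based = [d / t if t else None for d, t in zip(diagonal, col_totals)]
    overall = sum(diagonal) / total
    return row_based, col_based, overall


# Hypothetical two-class error matrix, for illustration only.
matrix = [[8, 2],
          [1, 9]]
ua, pa, oa = accuracy_measures(matrix)
```

Here 17 of the 20 profiles lie on the diagonal, so the overall accuracy is 0.85, while the class-wise values differ depending on whether row or column totals are used.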
The results showed that about two-thirds of the Danish land area was covered by Luvisols (35%) and Podzols (30%) alone. Alisols, Podzoluvisols and Fluvisols were the classes with the smallest area coverage, at 2.22%, 1.67% and 2.12%, respectively. Considering the geographical distribution of predicted soil classes, Podzols occupied a major part of the west, whereas Luvisols were mainly present in the east and in some areas towards the north-west of Denmark (Fig. 43). Gleysols mostly covered the area of the post-glacial marine
deposits, whereas the Arenosols were the major classes found in aeolian deposits. The marsh area and reclaimed land were also covered by Gleysols. The Histosols, which were not predicted, were assumed to cover most of the peatlands of Denmark.
Fig. 43. Geographical distribution of predicted FAO soil classes in Denmark (Inset: map of Prediction Uncertainty). These maps are reproduced from Paper III.

Of the different predictors used, the maps of surface (0-30 cm) and sub-surface (60-100 cm) clay content, together with the geology of the study area, were the variables that contributed most to the spatial distribution of FAO soil classes in Denmark. The RI for these three variables was found to be 100%. Similarly, landscape type and elevation played a significant role in defining the type of soil classes present in the study area, with corresponding RI of 85% and 88%, respectively.
Validation of the predicted map based on 20% test profiles suggested an overall prediction accuracy of 60%. Boosting with 10 trials reduced the classification error by 20%. The prediction also showed that Podzols and Luvisols had the highest UA, of about 80%, whereas Fluvisols had a UA of zero. Of five predicted Alisols, only one matched the class observed in the field (PA 20%). Similarly, of seven observed Alisols, only one was predicted to the same class (UA 15%).
Moreover, the quality of the predicted map was checked against the conventionally produced soil class map of Denmark. The estimated area occupied by a soil class in a given soil mapping unit (SMU) of the conventional map was compared with the calculated area of the predicted class within the same SMU. The map with the definition of the different SMUs is given in Fig. 7. The results showed that some SMUs, e.g. SMU 7 and 8, were well predicted, while SMU 16 was poorly predicted (see Table 9 in Paper III). Both SMU 7 and 8 represent Luvisols, with an estimated area coverage of >75%, whereas SMU 16 had >90% Arenosols according to the conventional map. Although this area-based comparison showed rather weak performance for some of the SMUs, it should be noted that the prediction was compared with a map for which mapping quality or mapping errors have never been reported.
This mapping approach showed that the tree-based model (Decision Tree using the See5 tool) offered a very good platform for mapping soil classes, as it was applied in this thesis to compile an FAO soil map of Denmark at national extent. It also highlighted that boosting could drastically reduce classification errors. The distribution of soil clay content in the subsoil was very important when classifying soil types. Geology and elevation were also among the important variables for soil class variability in Denmark (see Paper III for explanation).
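The grouped accuracy criterion (ASSG) introduced earlier can be illustrated with a short sketch. The exact computation is defined in Paper III, so what follows is only one plausible reading: a prediction counts as correct when the predicted and observed classes fall in the same group. Only the Arenosols and Podzols grouping is taken from the text; every other class is treated as its own group, and the observed and predicted lists are hypothetical.

```python
# Only the Arenosols/Podzols grouping comes from the text; all other classes
# are treated as single-member groups here, which is an assumption.
SIMILAR_GROUP = {
    "Arenosols": "Arenosols+Podzols",
    "Podzols": "Arenosols+Podzols",
}


def group_of(soil_class):
    """Map a soil class to its similar-soil group (itself if ungrouped)."""
    return SIMILAR_GROUP.get(soil_class, soil_class)


def exact_accuracy(observed, predicted):
    """Share of profiles whose predicted class equals the observed class."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    hits = sum(o == p for o, p in zip(observed, predicted))
    return hits / len(observed)


def grouped_accuracy(observed, predicted):
    """Share of profiles whose predicted class is in the observed class's group."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    hits = sum(group_of(o) == group_of(p) for o, p in zip(observed, predicted))
    return hits / len(observed)


# Hypothetical validation profiles (illustration only).
observed = ["Podzols", "Arenosols", "Luvisols", "Gleysols", "Podzols"]
predicted = ["Arenosols", "Arenosols", "Luvisols", "Luvisols", "Podzols"]

exact = exact_accuracy(observed, predicted)
grouped = grouped_accuracy(observed, predicted)
print("exact accuracy:", exact)
print("grouped accuracy:", grouped)
```

In this toy case the Podzol predicted as an Arenosol is a miss under exact matching but a hit under grouped matching, which is the distinction the criterion is meant to capture.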
6 Prediction uncertainty assessment
As discussed earlier, none of the prediction methods used in DSM is free of uncertainty. When mapping soil properties or soil classes, it is of the utmost importance that each mapping procedure assesses the uncertainty associated with its prediction, in order to increase the reliability of the products. Therefore, in this thesis the reliability of the maps generated by different prediction methods was evaluated through the uncertainty or errors associated with each mapping. Validation of the maps with measured points, as discussed for each mapping above, is also a general way of assessing the error in mapping. However, we also considered information from the kriging variance map as a measure of prediction uncertainty, especially for OK and OKst. The uncertainty maps for OK and OKst relating to the predictions of topsoil clay content in the selected area of Denmark are shown in Fig. 24b and Fig. 25b, respectively. As a general rule of kriging, prediction error always increases with distance from the sample location, and such a pattern was clearly identified in glacial flood plains and Saalian moraine landscape types (Fig. 25b). The error distribution generally followed the extent of the landscape types on which the stratification of soil samples in OKst was based (Fig. 25b). The areas of post-glacial marine deposits and marsh exhibited the highest errors, whereas the aeolian deposits showed the lowest error. An intermediate error with a smooth distribution was associated with moraine landscape types, whereas a medium to higher error was observed in Saalian moraine types. Glacial flood plains, on the other hand, exhibited a mixed error distribution pattern for predicted clay content. Reviewing the variograms in Fig. 21, the clay content of marsh and post-glacial marine deposits showed the maximum variability, and the variograms from these landscapes were rather unstable. Although a lower number of points was available for variogram calculation (Table 3 in Paper I), the periodic effect suggested by the corresponding variogram models probably increased the spatial variability of clay in these landscapes. This effect probably degraded model performance during the prediction, leading to a higher mapping uncertainty for those areas. Mapping uncertainty in the sub-glacial tunnel valley, where an unstable variogram was reported, was also found to be higher than the uncertainty from the different moraine landscapes. The variograms for Saalian moraine, terminal moraine and kettled landscape types were rather smooth and stable (Fig. 21).
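The rule that kriging error grows with distance from the samples can be made concrete with a deliberately small sketch: ordinary kriging from only two samples on a line, where the kriging system can be solved by hand. The exponential variogram without nugget, the sill of 1.0, the practical range of 100 m and the sample positions are all hypothetical choices for illustration and are not the variogram models fitted in this thesis.

```python
import math


def exponential_variogram(h, sill=1.0, practical_range=100.0):
    """Exponential semivariogram without nugget, so gamma(0) = 0.

    The sill and practical range are hypothetical illustration values.
    """
    return sill * (1.0 - math.exp(-3.0 * abs(h) / practical_range))


def ok_variance_two_samples(x0, x1=0.0, x2=100.0):
    """Ordinary kriging variance at x0 from two samples located at x1 and x2.

    The 2-sample ordinary kriging system is
        lam2 * g12 + mu = g10
        lam1 * g12 + mu = g20
        lam1 + lam2     = 1
    and is solved in closed form below.
    """
    g12 = exponential_variogram(x2 - x1)  # between the two samples
    g10 = exponential_variogram(x0 - x1)  # sample 1 to target
    g20 = exponential_variogram(x0 - x2)  # sample 2 to target

    lam1 = 0.5 * (1.0 - (g10 - g20) / g12)
    lam2 = 1.0 - lam1
    mu = 0.5 * (g10 + g20 - g12)  # Lagrange multiplier
    return lam1 * g10 + lam2 * g20 + mu


variance_at_sample = ok_variance_two_samples(0.0)
variance_quarter = ok_variance_two_samples(25.0)
variance_midpoint = ok_variance_two_samples(50.0)

for x0 in (0.0, 10.0, 25.0, 50.0):
    print(f"x0 = {x0:5.1f} m  kriging variance = {ok_variance_two_samples(x0):.4f}")
```

With these settings the variance is zero at a sample location and largest midway between the two samples, which mirrors the pattern of higher uncertainty in sparsely sampled areas described above.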
Regarding the texture prediction by RKrr at multiple soil depths, the standard error associated with each prediction, which represents mapping uncertainty, was calculated. The geographical distribution of prediction error for clay and coarse sand contents in the study area is shown in Figs. 44 and 45. In both predictions, high error values were recorded where the predicted attribute values were also high. In most of the western parts of the selected area, especially in glacial flood plains and in Saalian moraine areas, the model appeared to be more confident in predicting clay content (Fig. 44). A relatively lower uncertainty was observed when predicting coarse sand content in eastern compared with western Denmark (Fig. 45). Similarly, the prediction uncertainty of soil pH for all seven specified depths was derived. As an example, the uncertainty map for 0-5 cm depth is shown in Fig. 46. Throughout the study area, the uncertainty of pH prediction was relatively low except for the soils from Himmerland, where a higher error was recorded. Small patches in the western forested areas (mixed or coniferous) were found to have less error than the rest of the Danish land. As the prediction of bulk density was found to be rather weak, it displayed a relatively higher error throughout Denmark (diagram not shown).
Fig. 44. Prediction uncertainty for clay content in the selected area of Denmark.
Fig. 45. Prediction uncertainty for coarse sand content in the 0-5 cm soil layer.
Fig. 46. Prediction uncertainty for soil pH in the 0-5 cm soil layer.
As for the soil properties mapped, the uncertainty of soil class prediction was also assessed. In this case, the confidence of the tree model in predicting a given soil class at each predicting leaf was derived and mapped to the spatial domain. The map produced represented the uncertainty associated with the prediction (Fig. 43 Inset). The model seemed very confident when predicting Podzols in the glacial flood plains, and also in mapping Fluvisols from the marsh areas, as in the south-west corner of Denmark. The overall assessment indicated that most of the Danish land had a moderate error associated with the prediction of soil classes. The average prediction confidence for each of the predicted soil classes was also calculated. It showed that Podzols and Gleysols had the highest prediction confidence, while Podzoluvisols and Alisols exhibited the highest mapping error in the study area (Fig. 47).
Fig. 47. Prediction confidence of FAO soil classes predicted by decision tree modelling.
Table 11. Prediction confidence derived for different Soil Mapping Units.
       Prediction confidence
SMU    Mean    Std. Dev.
1      0.57    0.15
2      0.72    0.16
3      0.63    0.17
4      0.64    0.18
5      0.69    0.16
6      0.68    0.18
7      0.68    0.18
8      0.63    0.18
9      0.77    0.20
10     0.62    0.19
11     0.59    0.18
12     0.64    0.17
13     0.73    0.19
14     0.60    0.18
15     0.63    0.19
16     0.66    0.18
17     0.54    0.17
While comparing our prediction with the existing FAO soil map from 1990 (Fig. 7), we also calculated the distribution of prediction confidence in each SMU. The results are included in Table 11. It was observed that our model was very confident in predicting SMU 2, 9 and 13, with mean confidence of more than 70%. The model encountered the highest uncertainty (about 50%) for SMU 17, which was a mixture of many soil classes, e.g. Arenosols, Cambisols, Gleysols, Podzols and Podzoluvisols in different proportions. As a result, the model was not very confident while making predictions, leading to very high error. It was also observed that Alisols and Podzoluvisols, which were found scattered all over the study area as small patches, had greater prediction uncertainty than the other classes.
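The reading of Table 11 given above can be checked with a few lines of code. The mean confidence values are copied from Table 11; treating uncertainty as one minus the mean confidence is an assumption made here for illustration, and the variable names are hypothetical.

```python
# Mean prediction confidence per SMU, copied from Table 11 (SMU 1 to 17).
mean_values = [0.57, 0.72, 0.63, 0.64, 0.69, 0.68, 0.68, 0.63, 0.77,
               0.62, 0.59, 0.64, 0.73, 0.60, 0.63, 0.66, 0.54]
mean_confidence = dict(zip(range(1, 18), mean_values))

# SMUs for which the model was most confident (mean confidence above 0.70).
high_confidence_smus = sorted(
    smu for smu, conf in mean_confidence.items() if conf > 0.70
)

# SMU with the lowest mean confidence, and its implied uncertainty.
# Assumption: uncertainty is taken as 1 - mean confidence.
lowest_smu = min(mean_confidence, key=mean_confidence.get)
lowest_uncertainty = 1.0 - mean_confidence[lowest_smu]

print("High-confidence SMUs:", high_confidence_smus)
print("Lowest-confidence SMU:", lowest_smu)
print("Implied uncertainty:", round(lowest_uncertainty, 2))
```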
7 Conclusions
This thesis quantified the continuous function of some important soil properties in Danish soil profiles and mapped these properties at multiple soil depths for the whole of Denmark using state-of-the-art digital soil mapping (DSM) techniques. New continuous soil data from the surface to 2 m below the ground were generated for all the profiles used in this study. A new soil map of Denmark was compiled, in which the distribution of soil classes based on the FAO Revised Legend 1990 was predicted using some advanced DSM tools. Furthermore, a suitable prediction technique to map soil properties in Denmark was identified and recommended for future use. On the basis of the data analysis and the results presented in the thesis, the following conclusions were drawn:
Variograms were able to characterise the spatial autocorrelation in soil properties. Soils from different landscape types and at different depths were modelled differently with their specific variogram parameters. For texture components, the variability increased with depth in the soil, but for pH a higher variability was observed in the surface soil layers. It was also found that a small number of sample points (42 observations) led to an unstable variogram, as seen in that for clay in Reclaimed Land.
Among the four different prediction techniques applied to map soil clay content, prediction performance decreased in the order: RKrr > OKst > OK > RT. Prediction based on Rule-based Regression (RKrr) was found to be a very promising tool for mapping soil properties. While making predictions at multiple soil depths, model performance was better for surface layers than deeper layers. In the surface soil layers (0-30 cm depth), clay and silt were better predicted than fine or coarse sand content. Among all the soil properties considered, bulk density showed the weakest performance. Addition of the residuals to the regression grid balanced the under- or over-prediction made by regression rules.
Equal-area quadratic splines were found to be a very promising tool for modelling the vertical distribution of soil properties in Danish soil profiles. Slicing the first soil horizon before fitting the splines managed to deal with the expected more homogeneous soil material in the topsoil, which mainly originated from agricultural areas. The splines also preserved the compositional property of soil texture components.
A number of environmental variables could be used to predict soil properties in Denmark. Overall, soil map, geology, landscape type, elevation, slope length factor and the SAGA wetness index appeared to be the most important variables. However, land use type became a more influential variable than others when predicting soil pH and bulk density. A dominant effect of geology was seen in deeper horizons.
Danish soils are predominantly rich in sand content. Western Denmark appeared to be very rich in coarse sand at almost all predicted soil depths. Eastern Denmark, on the other hand, appeared to have more clay, the distribution of which increased with depth due to leaching from the surface layers. Soil pH and bulk density were more or less stable for the top 30 cm, but pH decreased with increasing depth in the soil, whereas bulk density increased.
Boosting applied to Decision Tree modelling reduced the classification error drastically. Podzols and Luvisols were the major soil classes found in Denmark. These two classes, together with Gleysols, were predicted with more confidence than the other classes. The clay content of the subsoil distinguished Podzols and Luvisols from the other soil types. This prediction was also in agreement with the estimated distribution of soil types in different soil mapping units (SMU) of the conventional soil map. SMU 2 and 9 were well predicted, while a few units (SMU 11, 17) showed a high prediction uncertainty. The conventional soil map assumed the presence of 65% Luvisols, 5% Arenosols, and 5% each of Gleysols and Cambisols in SMU 9, where our prediction showed 72% Luvisols, 10% Arenosols, 1% Gleysols, and 5% Cambisols. The overall prediction uncertainty for this SMU was only 33%. On the other hand, the calculated areas for Arenosols and Podzols were 21% and 32% in SMU 17, where the old map suggested 50% Arenosols and 15% Podzols. Therefore, the overall uncertainty for this SMU was found to be very high (45%).
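The area-based comparison between the conventional map and the prediction, quoted above for SMU 9 and SMU 17, can be sketched as a simple per-class difference. The percentages are the ones given in the text; the mean absolute difference used as a one-number summary is an assumption of this sketch and is not a measure reported in the thesis.

```python
# Area shares (%) per soil class, as quoted in the text for two SMUs.
conventional = {
    9: {"Luvisols": 65, "Arenosols": 5, "Gleysols": 5, "Cambisols": 5},
    17: {"Arenosols": 50, "Podzols": 15},
}
predicted = {
    9: {"Luvisols": 72, "Arenosols": 10, "Gleysols": 1, "Cambisols": 5},
    17: {"Arenosols": 21, "Podzols": 32},
}


def class_differences(conv, pred):
    """Predicted minus conventional area share (percentage points) per class."""
    return {soil_class: pred[soil_class] - conv[soil_class] for soil_class in conv}


def mean_absolute_difference(conv, pred):
    """Hypothetical one-number summary: mean absolute difference over classes."""
    diffs = class_differences(conv, pred)
    return sum(abs(d) for d in diffs.values()) / len(diffs)


summary = {
    smu: mean_absolute_difference(conventional[smu], predicted[smu])
    for smu in conventional
}

for smu in sorted(summary):
    print(f"SMU {smu}: differences = "
          f"{class_differences(conventional[smu], predicted[smu])}, "
          f"mean |difference| = {summary[smu]:.1f} percentage points")
```

The much larger mean difference for SMU 17 than for SMU 9 is in line with the statement that SMU 9 was well predicted while SMU 17 was not.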
8 Perspectives for future studies
This thesis work generated a huge soil database for Denmark that can be useful for several purposes, including research. It is also an example of a national extent high-resolution DSM to map soil properties at multiple soil depths. However, during the course of the work, it was possible to identify some possible directions for future DSM activities in Denmark, as listed below, and the improvements needed in the methods to ensure the quality of the end-products in addressing the changing demands of end-users. The scope and applicability of the DSM products to deal with the environmental issues in Denmark are also important issues for future research.
The possibility of fitting the splines separately for agricultural and non-agricultural soils should be explored, as the common slicing treatment might not be highly representative of all soil profiles. Moreover, the profiles that have an abrupt textural change need extra attention to deal with the discontinuity in the profiles.
An expansion of the type of environmental variables used to predict soil properties/classes is necessary. Use of information from satellite remote sensing, data from proximal sensors (EM38DD, Dualem), spectroscopy data (VIS-NIR, MIR), gamma-radiometric data, etc. could probably improve the prediction and the reliability of the final output.
As carbon stocks in soils continue to be a major issue in the global debate, a national assessment should be carried out to quantify the soil carbon stocks for the whole of Denmark. Monitoring soil carbon in the Danish wetlands could also be linked to other research, as these produce a lot more greenhouse gases than the agricultural uplands.
Although this thesis mapped FAO soil classes, an attempt to correlate the Danish profile information to the World Reference Base (WRB) system and predict the corresponding classes for the whole of Denmark is also necessary to harmonise Danish soil information in a global context.
As a member country of the European Node of the GlobalSoilMap project, which aims to map the soils of the entire planet at about 90 m resolution, Denmark should provide the Danish soil maps generated according to the protocol specified in the project.
In addition to the indices used to evaluate prediction uncertainty, some advanced uncertainty assessment tools, e.g. based on the prediction interval (PI) (Malone et al., 2011), should be considered. Moreover, propagation of errors during the entire DSM process needs to be addressed. Such an investigation would demonstrate how errors in digital soil maps propagate to the results that use DSM products as inputs. Monte Carlo simulation (Heuvelink, 1998) could be used for this kind of study.
A new validation scheme to evaluate the DSM products should be developed. Independent validation using probability-based sampling (Brus et al., 2011) seems a promising suggestion.
For better communication and a user-friendly data delivery service, an effective system to visualise the DSM products should be developed in Denmark. A web mapping service (WMS) could be very useful.
Production of a number of quality soil map sheets cannot be the sole outcome of DSM in Denmark. We should start to think about different issues, from local to national level, where detailed or rich soil information can make a difference. Going a step beyond DSM, it should be possible to apply the knowledge base for the assessment, monitoring and sustainable use of Danish soils and environmental resources. One of the practical applications suggested by Carre et al. (2007) is Digital Soil Assessment (DSA), which links DSM-generated soil property thematic maps to the quantification and evaluation of soil functions and soil threats in a number of scenarios. This is probably what future DSM activities in Denmark should focus upon. Based on the findings, it should be possible to assist the authorities in developing novel soil/land policies to further improve the quality of soil information in Denmark.
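The Monte Carlo approach to error propagation suggested in the list above can be sketched as follows. The example propagates uncertainty in two mapped inputs into a derived quantity by repeated random sampling. Everything specific here is a hypothetical illustration and not part of this thesis: the derived quantity (soil organic carbon stock in t/ha, computed as SOC % times bulk density in g/cm3 times layer thickness in cm), the input means and standard deviations, and the assumption that the two input errors are independent and normally distributed.

```python
import random
import statistics


def soc_stock_t_per_ha(soc_percent, bulk_density, depth_cm):
    """Carbon stock (t/ha) = SOC (%) x bulk density (g/cm3) x thickness (cm)."""
    return soc_percent * bulk_density * depth_cm


def propagate(n_draws=10000, seed=42):
    """Monte Carlo propagation of input uncertainty to the stock estimate.

    All means and standard deviations are hypothetical values for one grid
    cell; input errors are assumed independent and Gaussian.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    draws = []
    for _ in range(n_draws):
        soc = rng.gauss(2.0, 0.3)          # SOC content in %, hypothetical
        bulk_density = rng.gauss(1.4, 0.1)  # g/cm3, hypothetical
        draws.append(soc_stock_t_per_ha(soc, bulk_density, 30.0))
    return statistics.mean(draws), statistics.stdev(draws)


mean_stock, sd_stock = propagate()
print(f"mean stock = {mean_stock:.1f} t/ha, standard deviation = {sd_stock:.1f} t/ha")
```

The spread of the simulated stocks is the propagated uncertainty. In a real application the independence assumption would need to be checked, since errors in mapped soil properties are often spatially and mutually correlated.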
9 References
Adhikari, K., Kheir, R.B., Greve, M.B., Bøcher, P.K., Malone, B.P., Minasny, B., McBratney, A.B., Greve, M.H., 2013. High-Resolution 3-D Mapping of Soil Texture in Denmark. Soil Sci. Soc. Am. J. 77, doi:10.2136/sssaj2012.0275.
Akaike, H., 1973. Information theory and an extension of the maximum likelihood principle. In: B.N. Petrov, F. Csaki (Eds.), Second International Symposium on Information Theory. Akademia Kiado, Budapest, pp. 267-281.
Beckett, P., Webster, R., 1971. Soil variability: a review. Soils and Fertilizers 34(1), 1-15.
Bendix, J., 2004. Geländeklimatologie – Gebrüder Borntraeger. Berlin, Stuttgart.
Bishop, T.F.A., McBratney, A.B., Laslett, G.M., 1999. Modelling soil attribute depth functions with equal-area quadratic smoothing splines. Geoderma 91(1-2), 27-45.
Blum, W.E.H., Eswaran, H., 2004. Soils for sustaining global food production. J. Food Sci. 69(2), R37-R42.
Boettinger, J.L., Howell, D.W., Moore, A.C., Hartemink, A.E., Kienast-Brown, S., 2010. Digital soil mapping: Bridging research, environmental application, and operation. Springer-Verlag, Dordrecht, the Netherlands.
Böhner, J., Antonić, O., 2009. Land-surface parameters specific to topo-climatology, pp. 195-226.
Böhner, J., Köthe, R., Conrad, O., Gross, J., Ringeler, A., Selige, T., 2001. Soil regionalization by means of terrain analysis and process parameterization. In: Micheli, E., Nachtergaele, F., Montanarella, L. (Eds.), Soil Classification 2001. European Soil Bureau, Research Report No. 7, EUR 20398 EN, Luxembourg, pp. 213-222.
Børgesen, C.D., Kyllingsbæk, A., Djurhuus, J., 1997. Modelberegnet kvælstofudvaskning fra landbruget. DJF SP-rapport nr. 19, Ministeriet for Fødevarer, Landbrug og Fiskeri.
Bou Kheir, R., Greve, M.H., Bøcher, P.K., Greve, M.B., Larsen, R., McCloy, K., 2010. Predictive mapping of soil organic carbon in wet cultivated lands using classification-tree based models: The case study of Denmark. Journal of Environmental Management 91(5), 1150-1160.
Bouma, J., 2009. Soils are back on the global agenda: Now what? Geoderma 150(1–2), 224-225.
Bouma, J., Droogers, P., 2007. Translating soil science into environmental policy: A case study on implementing the EU soil protection strategy in The Netherlands. Environ. Sci. Policy 10(5), 454-463.
Brady, N.C., Weil, R.R., 2002. The nature and properties of soils. Pearson Education, Inc.
Brink, A., 1926. En ny Hartkornsansættelse af Danmarks Landbrugsjord. Tidsskrift for Opmålings- og Matrikulsvæsen 12(7), 274-283.
Brus, D.J., Kempen, B., Heuvelink, G.B.M., 2011. Sampling for validation of digital soil maps. European Journal of Soil Science 62(3), 394-407.
Bui, E., 2007. A review of digital soil mapping in Australia. In: P. Lagacherie, A.B. McBratney, M. Voltz (Eds.), Digital soil mapping - An introductory perspective. Developments in Soil Science. Elsevier, Amsterdam, pp. 25-38.
Bui, E.N., Loughhead, A., Corner, R., 1999. Extracting soil-landscape rules from previous soil surveys. Australian Journal of Soil Research 37(3), 495-508.
Burgess, T.M., Webster, R., 1980. Optimal interpolation and isarithmic mapping of soil properties. 1. The semi-variogram and punctual kriging. Journal of Soil Science 31(2), 315-331.
Burrough, P., 1993. Soil variability: a late 20th century view. Soils and Fertilizers 56(5), 529-562.
Burrough, P.A., McDonnell, R.A., 1998. Principles of geographical information systems, 333. Oxford University Press, Inc., New York.
Carre, F., Girard, M.C., 2002. Quantitative mapping of soil types based on regression kriging of taxonomic distances with landform and land cover attributes. Geoderma 110(3-4), 241-263.
Carre, F., McBratney, A.B., Mayr, T., Montanarella, L., 2007. Digital soil assessments: Beyond DSM. Geoderma 142(1-2), 69-79.
Chang, C.W., Laird, D.A., Mausbach, M.J., Hurburgh, C.R., 2001. Near-infrared reflectance spectroscopy-principal components regression analyses of soil properties. Soil Science Society of America Journal 65(2), 480-490.
Crave, A., Gascuel-Odoux, C., 1997. The influence of topography on time and space distribution of soil surface water content. Hydrological Processes 11, 203-210.
Daily, G.C., Matson, P.A., Vitousek, P.M., 1997. Ecosystem services supplied by soil. Nature's Services: Societal Dependence on Natural Ecosystems. Island Press, Washington, DC, 113-132.
Danmarks Geologiske Undersøgelse, 1978. Foreløbige geologiske kort (1:25,000) over Danmark. Danmarks Geologiske Undersøgelse, Denmark.
Davey, B.G., 1990. The chemical properties of soils. In: K.O. Campbell, J.W. Bowyer (Eds.), The Scientific Basis of Modern Agriculture. Sydney University Press, Sydney, Australia.
Desmet, P.J.J., Govers, G., 1996. A GIS procedure for automatically calculating the USLE LS factor on topographically complex landscape units. Journal of Soil and Water Conservation 51(5), 427-433.
Deutsch, C.V., Journel, A.G., 1999. GSLIB: Geostatistical software library and user's guide. Oxford University Press, Oxford, UK.
Dobos, E., Hengl, T., 2009. Soil mapping applications. Developments in Soil Science, 33. pp. 461-479.
ESRI, 2012. ArcGIS Desktop: Release 10.1. Environmental Systems Research Institute, Redlands, CA.
European Commission, 2002. European Commission COM (2002) 179 final.
FAO-Unesco, 1974. Soil map of the world, legend. FAO, Rome, Italy.
FAO-Unesco, 1990. Soil map of the world, revised legend. FAO, Rome, Italy.
Freeman, T.G., 1991. Calculating catchment-area with divergent flow based on a regular grid. Computers & Geosciences 17(3), 413-422.
Frenkel, H., Goertzen, J.O., Rhoades, J.D., 1978. Effects of clay type and content, exchangeable sodium percentage, and electrolyte concentration on clay dispersion and soil hydraulic conductivity. Soil Science Society of America Journal 42, 32-39.
Gallant, J.C., Dowling, T.I., 2003. A multiresolution index of valley bottom flatness for mapping depositional areas. Water Resources Research 39(12).
GEUS (Geological Survey of Denmark and Greenland), 2009. Danmarks digitale jordartskort 1:25.000, CD-ROM. De Nationale Geologiske Undersøgelser for Danmark og Grønland, Denmark.
Glanz, J., 1995. Saving our soil: solutions for sustaining earth's vital resource. Johnson Books.
Goovaerts, P., 1997. Geostatistics for natural resources evaluation. Oxford University Press, USA.
Goovaerts, P., 1999. Geostatistics in soil science: State-of-the-art and perspectives. Geoderma 89(1-2), 1-45.
Greve, M.H., Christensen, O.F., Greve, M.B., Jensen, N.J., Balstrøm, T., Madsen, H.B., Bou Kheir, R., 2013. Change in peat coverage in Danish cultivated soils during the past 35 years.
Greve, M.H., Greve, M.B., Bocher, P.K., Balstrom, T., Breuning-Madsen, H., Krogh, L., 2007. Generating a Danish raster-based topsoil property map combining choropleth maps and point information. Geografisk Tidsskrift-Danish Journal of Geography 107(2), 1-12.
Greve, M.H., Kheir, R.B., Greve, M.B., Bøcher, P.K., 2012. Using digital elevation models as an environmental predictor for soil clay contents. Soil Science Society of America Journal 76(6), 2116-2127.
Greve, M.H., Madsen, H.B., 1999. Soil Mapping in Denmark, European Soil Bureau Research Report No. 9.
Greve, M.H., Mount, H., Hudson, B., Breuning-Madsen, H., 2001. History of Land Value Assessment and Establishment of Benchmark Soils in Denmark. Soil Survey Horizons 42(1), 19-23.
Grunwald, S., 2006. Environmental soil-landscape modeling: Geographic information technologies and pedometrics. Grunwald S. ed. CRC Press, New York.
Hansen, S., Jensen, H.E., Nielsen, N.E., Svendsen, H., 1990. NPO-research, A10: DAISY: Soil Plant Atmosphere System Model. The National Agency for Environmental Protection, Copenhagen, Denmark.
Hartemink, A.E., 2008. Soils are back on the global agenda. Soil Use and Management 24(4), 327-330.
Hartemink, A.E., Hempel, J., Lagacherie, P., McBratney, A.B., McKenzie, N.J., MacMillan, R.A., Minasny, B., Montanarella, L., Mendonça Santos, M.L., Sanchez, P., 2010. GlobalSoilMap.net – a new digital soil map of the world, Digital Soil Mapping: Bridging Research, Environmental Application, and Operation. Springer Science, Dordrecht, pp. 423-428.
Hartemink, A.E., McBratney, A.B., de Lourdes Mendonça-Santos, M. (Eds.), 2008. Digital soil mapping with limited data. Springer-Verlag, Dordrecht, the Netherlands.
Hasholt, B., Madsen, H.B., Kuhlman, H., Hansen, A., Platou, S.W., 1990. Erosion and transport of phosphorus to river and lakes, Miljøstyrelsen, Copenhagen.
Heckrath, G., Djurhuus, J., Quine, T.A., Van Oost, K., Govers, G., Zhang, Y., 2005. Tillage erosion and its effect on soil properties and crop yield in Denmark. Journal of Environmental Quality 34(1), 312-324.
Hengl, T., Reuter, H.I., 2008. Geomorphometry: concepts, software, applications. Developments in Soil Science, 33. Elsevier Science.
Heuvelink, G.B., 1998. Error propagation in environmental modelling with GIS. Taylor & Francis, London.
Heuvelink, G.B., Huisman, J.A., 2000. Choosing between abrupt and gradual spatial variation. Quantifying Spatial Uncertainty in Natural Resources: Theory and Applications for GIS and Remote Sensing. Ann Arbor Press, Chelsea, MI, 111-117.
Hewitt, A., 1993. Predictive modelling in soil survey. Soils and Fertilizers 56(3), 305-314.
Hewitt, A.E., McKenzie, N.J., Grundy, M.J., Slater, B.K., 2008. Qualitative survey. In: N.J. McKenzie, M.J. Grundy, R. Webster, A.J. Ringrose-Voase (Eds.), Guidelines for surveying soil and land resources. CSIRO PUBLISHING, Collingwood, Australia, pp. 285-306.
Holst, K.A., Madsen, H.B., 1988. Modelling the irrigation need. Acta Agriculturæ Scandinavica 38(3), 261-269.
Hudson, B.D., 1992. The soil survey as paradigm-based science. Soil Science Society of America Journal 56(3), 836-841.
ISRIC, 1997. FAO/Unesco Soil map of the World, Revised Legend, with corrections and Updates. World Soil Resources Report 60, FAO, Rome, Reprinted with updates as Technical Paper 20, ISRIC, Wageningen.
Jacobsen, N.K., 1984. Soil map of Denmark according to the FAO-UNESCO Legend. Danish Journal of Geography 84, 93-98.
Jafari, A., Finke, P.A., Van de Wauw, J., Ayoubi, S., Khademi, H., 2012. Spatial prediction of USDA- great soil groups in the arid Zarand region, Iran: comparing logistic regression approaches to predict diagnostic horizons and soil types. European Journal of Soil Science 63(2), 284-298.
Jenny, H., 1941. Factors of soil formation: A system of quantitative pedology. McGraw-Hill, New York.
Kempen, B., Brus, D.J., Stoorvogel, J.J., 2011. Three-dimensional mapping of soil organic matter content using soil type-specific depth functions. Geoderma 162(1-2), 107-123.
Kempen, B., Brus, D.J., Stoorvogel, J.J., Heuvelink, G., de Vries, F., 2012a. Efficiency Comparison of Conventional and Digital Soil Mapping for Updating Soil Maps. Soil Science Society of America Journal 76(6), 2097-2115.
Kempen, B., Brus, D.J., Stoorvogel, J.J., Heuvelink, G.B.M., de Vries, F., 2012b. Efficiency Comparison of Conventional and Digital Soil Mapping for Updating Soil Maps. Soil Science Society of America Journal 76(6), 2097-2115.
Lagacherie, P., Legros, J.P., Burrough, P.A., 1995. A soil survey procedure using the knowledge of soil pattern established on a previously mapped reference area. Geoderma 65(3-4), 283-301.
Lagacherie, P., McBratney, A., Voltz, M., 2007. Digital soil mapping: An introductory perspective, 31. Elsevier Science Limited.
Lagacherie, P., McBratney, A.B., 2007. Spatial soil information systems and spatial soil inference systems: perspectives for digital soil mapping. In: P. Lagacherie, A.B. McBratney, M. Voltz (Eds.), Digital Soil Mapping - An introductory perspective. Developments in Soil Science. Elsevier, Amsterdam, pp. 3-22.
Luk, S.H., 1979. Effect of soil properties on erosion by wash and splash. Earth Surface Processes and Landforms 4(3), 241-255.
Madsen, H.B., 1983. A pedological soil classification system for Danish soils. Pedologie 33(2), 171-197.
Madsen, H.B., Greve, M.H., Nørr, A.H., 2001. Danish soil classification and establishment of the Danish Soil Database. Soil Survey Horizons, 24-34.
Madsen, H.B., Holst, K.A., 1987. Potential marginal land (In Danish). Marginaljorder og miljøinteresser, Teknikerapport nr. 1, Skov & Naturstyrelsen, København, Danmark.
Madsen, H.B., Jensen, N.H., 1985. The establishment of pedological soil databases in Denmark. Geografisk Tidsskrift 85, 1-8.
Madsen, H.B., Jensen, N.H., 1992. Pedological regional variations in well-drained soils, Denmark. Geografisk Tidsskrift 92, 61-69.
Madsen, H.B., Jensen, N.H., 1996. Soil map of Denmark according to the revised FAO legend 1990. Danish Journal of Geography 96, 51-59.
Madsen, H.B., Nørr, A.H., Holst, K.A., 1992. The Danish soil classification: Atlas over Denmark I, 3. The Royal Danish Geographical Society, Copenhagen, Denmark.
Malone, B., McBratney, A., Minasny, B., Laslett, G., 2009. Mapping continuous depth functions of soil carbon storage and available water capacity. Geoderma 154(1), 138-152.
Malone, B.P., McBratney, A.B., Minasny, B., 2011. Empirical estimates of uncertainty for mapping continuous depth functions of soil attributes. Geoderma 160(3-4), 614-626.
Matheron, G., 1965. Les variables régionalisées et leur estimation. Paris.
Mathiesen, F.D., 1980. Soil classification in Denmark, its results and applicability, EEC-report on land resource evaluation, EUR 6875.
McBratney, A.B., Minasny, B., MacMillan, R.A., Carré, F., 2011. Digital Soil Mapping. In: P.M. Huang, Y. Li, M.E. Sumner (Eds.), Handbook of Soil Sciences: Properties and Processes. Handbook of Soil Science. CRC Press, Boca Raton, FL, pp. 1-44.
McBratney, A.B., Pringle, M.J., 1999. Estimating average and proportional variograms of soil properties and their potential use in precision agriculture. Precision Agriculture 1(2), 125-152.
McBratney, A.B., Santos, M.L.M., Minasny, B., 2003. On digital soil mapping. Geoderma 117(1-2), 3-52.
McBratney, A.B., Webster, R., 1986. Choosing functions for semi-variograms of soil properties and fitting them to sampling estimates. Journal of Soil Science 37(4), 617-639.
McKenzie, N.J., Ryan, P.J., 1999. Spatial prediction of soil properties using environmental correlation. Geoderma 89(1-2), 67-94.
McSweeney, K., Gessler, P.E., Slater, B.K., Hammer, R.D., Bell, J.C., Petersen, G.W., 1994. Towards a new framework for modeling the soil-landscape continuum. Factors of soil formation. Proc. symposium, Denver, 1991, 127-145.
Minasny, B., Bishop, T.F.A., 2008. Analysing uncertainty. In: N.J. McKenzie, M.J. Grundy, R. Webster, A.J. Ringrose-Voase (Eds.), Guidelines for surveying soil and land resources. CSIRO PUBLISHING, Collingwood, Australia, pp. 383-391.
Minasny, B., Malone, B.P., McBratney, A.B. (Eds.), 2012. Digital Soil Assessments and Beyond. CRC Press/Balkema, Leiden, the Netherlands.
Minasny, B., McBratney, A., Whelan, B., 2005. VESPER version 1.62. Australian Centre for Precision Agriculture, McMillan Building A05, The University of Sydney, NSW.
Minasny, B., McBratney, A.B., 2002. Uncertainty analysis for pedotransfer functions. European Journal of Soil Science 53(3), 417-429.
Minasny, B., McBratney, A.B., 2008. Regression rules as a tool for predicting soil properties from infrared reflectance spectroscopy. Chemometrics and Intelligent Laboratory Systems 94(1), 72-79.
Minasny, B., McBratney, A.B., Lark, R.M., 2008. Digital Soil Mapping Technologies for Countries with Sparse Data Infrastructures. Digital Soil Mapping with Limited Data. pp. 15-30.
Minasny, B., McBratney, A.B., Malone, B.P., Wheeler, I., 2013. Digital Mapping of Soil Carbon, pp. 1-47.
Minasny, B., McBratney, A.B., Mendonca-Santos, M.L., Odeh, I.O.A., Guyon, B., 2006. Prediction and digital mapping of soil carbon storage in the Lower Namoi Valley. Australian Journal of Soil Research 44(3), 233-244.
Mishra, U., Lal, R., Slater, B., Calhoun, F., Liu, D., Van Meirvenne, M., 2009. Predicting Soil Organic Carbon Stock Using Profile Depth Distribution Functions and Ordinary Kriging. Soil Science Society of America Journal 73(2), 614-621.
Moore, A., Russell, J., Ward, W., 1972. Numerical analysis of soils: A comparison of three soil profile models with field classification. Journal of Soil Science 23(2), 193-209.
Moore, I.D., Gessler, P.E., Nielsen, G.A., Peterson, G.A., 1993. Soil attribute prediction using terrain analysis. Soil Science Society of America Journal 57(2), 443-452.
National Survey and Cadastre, 2011. Produktspecifikation. Danmarks Højdemodel, DHM/Terræn. Data Version 1.0 – December 2009. In: N.S.a. Cadastre (Ed.), Copenhagen.
Odgers, N.P., Libohova, Z., Thompson, J.A., 2012. Equal-area spline functions applied to a legacy soil database to create weighted-means maps of soil organic carbon at a continental scale. Geoderma 189–190(0), 153-163.
Omuto, C., Nachtergaele, F., Rojas, R.V., 2012. State of the Art Report on Global and Regional Soil Information: Where are we? Where to go?, Food and Agriculture Organization, Rome, Italy.
Pedersen, V.E., 1932. Jordbonitering og Jordbeskatning. Tidsskrift for Opmålings- og Matrikulsvæsen 13(3), 57-69.
Pike, R.J., 2000. Geomorphometry - diversity in quantitative surface analysis. Progress in Physical Geography 24(1), 1-20.
Poncehernandez, R., Marriott, F.H.C., Beckett, P.H.T., 1986. An improved method for reconstructing a soil-profile from analyses of a small number of samples. Journal of Soil Science 37(3), 455-467.
R Development Core Team, 2008. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
Russell, E.J., 1973. Soil Conditions and Plant Growth. 10th ed. Longmans Group Ltd., London, UK.
Russell, J.S., Moore, A.W., 1968. Comparison of different depth weightings in the numerical analysis of anisotropic soil profile data, Proceedings of the 9th International Congress of Soil Science, pp. 205-213.
SAGA GIS, System for Automated Geoscientific Analyses, http://www.saga-gis.org.
Sanchez, P.A., Ahamed, S., Carre, F., Hartemink, A.E., Hempel, J., Huising, J., Lagacherie, P., McBratney, A.B., McKenzie, N.J., Mendonca-Santos, M.D., Minasny, B., Montanarella, L., Okoth, P., Palm, C.A., Sachs, J.D., Shepherd, K.D., Vagen, T.G., Vanlauwe, B., Walsh, M.G., Winowiecki, L.A., Zhang, G.L., 2009. Digital Soil Map of the World. Science 325(5941), 680-681.
Scull, P., Franklin, J., Chadwick, O.A., McArthur, D., 2003. Predictive soil mapping: a review. Progress in Physical Geography 27(2), 171-197.
Simonson, R.W., Bomer, W., 1989. Historical highlights of soil survey and soil classification with emphasis on the United States, 1899-1970. International Soil Reference and Information Centre.
Stjernholm, M., Kjeldgaard, A., 2004. CORINE landcover update in Denmark-Final report, National Environment Research Institute (NERI), Denmark.
Stoorvogel, J.J., Kempen, B., Heuvelink, G.B.M., de Bruin, S., 2009. Implementation and evaluation of existing knowledge for digital soil mapping in Senegal. Geoderma 149(1–2), 161-170.
Taghizadeh-Mehrjardi, R., Minasny, B., McBratney, A.B., Triantafilis, J., Sarmadian, F., Toomanian, N., 2012. Digital soil mapping of soil classes using decision trees in central Iran. 5th Global Workshop on Digital Soil Mapping, Sydney, NSW, pp. 197-202.
Tanji, K.K., 1996. Agricultural salinity assessment and management. ASCE, New York.
Warrick, A.W., Gardner, W.R., 1983. Crop yield as affected by spatial variations of soil and irrigation. Water Resources Research 19, 181-186.
Webster, R., 1994. The development of pedometrics. Geoderma 62(1–3), 1-15.
Webster, R., Oliver, M.A., 2006. Modelling spatial variation of soil as random functions. In: S. Grunwald (Ed.), Environmental soil-landscape modeling: Geographic information technologies and pedometrics. Taylor and Francis, Boca Raton, pp. 241-288.
Whelan, B.M., 2003. Precision Agriculture: An Introduction to concepts, analysis and interpretation. A training course for graduate and industrial professional, Australian Centre for Precision Agriculture, University of Sydney, Australia.
Williams, P.C., 1987. Variables affecting near-infrared reflectance spectroscopic analysis. Near-infrared technology in the agricultural and food industries, 143-166.
Wilson, J.P., Gallant, J.C., 2000a. Digital terrain analysis. In: J.P. Wilson, J.C. Gallant (Eds.), Terrain Analysis: Principles and Applications. John Wiley & Sons, Inc., New York, pp. 1-27.
Wilson, J.P., Gallant, J.C., 2000b. Secondary topographic attributes. In: J.P. Wilson, J.C. Gallant (Eds.), Terrain Analysis: Principles and Applications. John Wiley & Sons, Inc., New York, pp. 133-160.
Young, I.M., Crawford, J.W., 2004. Interactions and self-organization in the soil-microbe complex. Science 304(5677), 1634-1637.
Zhu, A.X., Band, L., Vertessy, R., Dutton, B., 1997. Derivation of soil properties using a soil land inference model (SoLIM). Soil Science Society of America Journal 61(2), 523-533.
Zhu, A.X., Hudson, B., Burt, J., Lubich, K., Simonson, D., 2001. Soil mapping using GIS, expert knowledge, and fuzzy logic. Soil Science Society of America Journal 65(5), 1463-1472.
10 Appendices
10.1 Appendix A
The main changes made to the 1974-legend (ISRIC, 1997)
1) 1974-legend: 26 major soil groupings with 106 soil units; Revised-legend: 28 major soil groupings with 153 soil units.
2) Major soil groupings of the 1974-legend deleted in the Revised-legend: Lithosols, Rendzinas and Rankers are now grouped within Leptosols; Yermosols and Xerosols are now incorporated in other groups, and a yermic phase is indicated where appropriate.
3) Major soil groupings added in the Revised-legend: Leptosols, Calcisols, Gypsisols, Lixisols, Alisols, Plinthosols, Anthrosols.
4) New symbols are used to avoid confusion between the 1974-legend and the Revised-legend.
Table A.12 Description of the major soil groupings of the FAO-Unesco Soil Map of the World (Revised-legend).
S. No. | Major soil groupings | A brief description
1 | Fluvisols | Water-deposited soils with little alteration, in flood plains or in tidal marshes
2 | Gleysols | Soils with mottled or reduced horizons due to fluctuating ground water effects
3 | Regosols | Relatively young soils, formed from unconsolidated materials and no significant profile development
4 | Leptosols | Thin soils over continuous hard rock or highly calcareous material, extremely gravelly soils
5 | Arenosols | Relatively young soils formed from sand
6 | Andosols | Soils formed in volcanic ash, soft and rich in allophanes
7 | Vertisols | Self-mulching, inverting soils, rich in smectitic clay
8 | Cambisols | Moderately developed soils with slight color, structure, or consistency change due to weathering
9 | Calcisols | Soils with high calcium carbonates, soft to hard calcic layer
10 | Gypsisols | Soils rich in gypsum, soft to hard gypsic layer
11 | Solonetz | Alkaline soils with high sodium content
12 | Solonchaks | Soils with soluble salt accumulation, mainly due to evaporation
13 | Kastanozems | Soils with chestnut surface color, steppe vegetation, transition to drier climate
14 | Chernozems | Soils with black surface, high humus content, under prairie vegetation
15 | Phaeozems | Soils with dark surface, more leached than Kastanozems or Chernozems, transition to humid climate
16 | Greyzems | Soils with dark surface, bleached E horizon, and textural B horizon
17 | Luvisols | Medium to high base status with argic B horizon, high activity clay
18 | Planosols | Soils with abrupt textural discontinuity or abrupt A-B horizon contact
19 | Podzoluvisols | Soils with leached horizons tonguing into argic B horizon
20 | Podzols | Soils with light-colored eluvial horizon and subsoil accumulation of iron, aluminium, and humus
21 | Lixisols | Soils with argic B horizon, high base status and low activity clay
22 | Acrisols | Low base status soils with argic B horizon, low activity clay
23 | Alisols | Low base status soils with argic B horizon, high activity clay
24 | Nitisols | Soils with low activity clay in argic horizons, strongly structured
25 | Ferralsols | Highly weathered soils with sesquioxide or kaolinite rich clays
26 | Plinthosols | Soils with high accumulation of iron under hydromorphic conditions, plinthite formation
27 | Histosols | Soils with thick organic layer, very rich in carbon
28 | Anthrosols | Soils highly influenced by long and intensive agriculture, high human influence
10.2 Appendix B
Rule 1: [154 cases, mean 0.98, range 0.12 to 1.62, est err† 0.22]
if
    georegions in (1, 3) landuse in (3, 5, 8, 21, 24) mrvbf† > 0.37 soilmap in (1, 2) vall_depth > 1.93 vall_depth 3.35 vertdist_chn 5.89 soilmap in (1, 2) vall_depth > 3.35 vertdist_chn > 1.25
then
    log_clay(%) = 0.82 - 0.58 ls_factor + 0.153 slp_deg + 0.053 mrvbf
Rule 7: [185 cases, mean 1.21, range 0.17 to 1.87, est err 0.23]
if
    georegions in (4, 5, 6, 8)
    landuse in (4, 6, 7, 11, 20, 24)
    soilmap in (1, 2)
then
    log_clay(%) = 1.21 + 0.06 ls_factor - 0.004 mrvbf
Rule 8: [581 cases, mean 1.26, range 0.14 to 3.57, est err 0.24]
if
    georegions in (1, 3)
    landscape in (2, 5, 6, 7, 8, 9, 10, 11)
    landuse in (3, 5, 8, 21, 24)
    mrvbf > 0.37
    mrvbf 1.25
then
    log_clay(%) = 1.51 - 0.018 twi - 0.017 slp_deg + 0.0006 elevation - 0.002 vall_depth - 0.003 mrvbf
Rule 9: [476 cases, mean 1.41, range 0.48 to 2.76, est err 0.21]
if
    georegions in (4, 5, 6, 8)
    landuse in (3, 5, 9, 12, 21)
    soilmap in (1, 2)
then
    log_clay(%) = 1.39 - 0.033 twi + 0.03 saga_wi - 0.006 vall_depth - 0.011 mrvbf
Rule 10: [162 cases, mean 1.42, range 0.47 to 3.36, est err 0.26]
if
    mrvbf 5.37
    geology in (1, 3, 5, 9, 11, 13, 14, 20)
    landscape in (6, 8)
    soilmap in (3, 4, 11)
then
    log_clay(%) = 1.88 + 0.0033 elevation - 0.041 ls_factor + 0.019 slp_deg + 0.003 vall_depth - 0.013 saga_wi + 0.007 twi
Rule 15: [88 cases, mean 2.02, range 0.98 to 3.26, est err 0.33]
if
    elevation 2.08
    geology in (2, 7, 10)
    soilmap in (3, 4, 11)
then
    log_clay(%) = 2.01
Rule 16: [1021 cases, mean 2.08, range 0.79 to 3.17, est err 0.22]
if
    elevation > 20.83
    geology in (2, 7, 10)
    soilmap in (3, 4, 11)
then
    log_clay(%) = 2.18 + 0.0021 elevation - 0.014 saga_wi + 0.002 vall_depth
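To illustrate how a single rule of this kind could be applied to one grid cell, the following minimal Python sketch evaluates Rule 16 as listed above. The conditions and coefficients are copied from the rule; the function names, the dictionary layout and the covariate values of the example cell are hypothetical and serve only as illustration. The back-transformation assumes that log_clay is a natural logarithm, which the listing itself does not state.

```python
import math

# Conditions transcribed from Rule 16 of the clay model.
GEOLOGY_CLASSES = {2, 7, 10}
SOILMAP_CLASSES = {3, 4, 11}
ELEVATION_THRESHOLD = 20.83


def rule16_applies(cell):
    """Return True when a grid cell satisfies every condition of Rule 16."""
    return (
        cell["elevation"] > ELEVATION_THRESHOLD
        and cell["geology"] in GEOLOGY_CLASSES
        and cell["soilmap"] in SOILMAP_CLASSES
    )


def rule16_log_clay(cell):
    """Linear model attached to Rule 16; the result is on the log scale."""
    return (
        2.18
        + 0.0021 * cell["elevation"]
        - 0.014 * cell["saga_wi"]
        + 0.002 * cell["vall_depth"]
    )


def predict_clay_percent(cell):
    """Back-transform to clay (%), or return None when the rule does not apply.

    Assumption: log_clay is a natural logarithm; replace math.exp with the
    appropriate inverse if a different transform was used.
    """
    if not rule16_applies(cell):
        return None
    return math.exp(rule16_log_clay(cell))


# Hypothetical covariate values for one grid cell (illustration only).
example_cell = {
    "elevation": 50.0,
    "geology": 7,
    "soilmap": 3,
    "saga_wi": 10.0,
    "vall_depth": 5.0,
}
print(predict_clay_percent(example_cell))
```

The sketch covers one rule in isolation. In a full rule-based model a cell may satisfy several rules, in which case their predictions would have to be combined, a step not shown here.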
Rule 17: [21 cases, mean 2.15, range 1.30 to 3.38, est err 0.23] if
then
elevation 5.29 vertdist_chn 5.23 finesand_0-5(g/kg) = 415.44
Rule 12: [82 cases, mean 405.89, range 0 to 849.78, est err 152.31] if
saga_wi > 13.95 soilmap in (0, 2, 4, 5, 6, 7) vall_depth