Sichuan University College of Hydraulic and Hydroelectric Engineering
BIG DATA and Data Mining. Developments of Models in GeoEngineering Luis Ribeiro e Sousa Sichuan University, Chengdu, China
Chengdu, July 14, 2015
Contents
1. BIG Introduction
2. Underground Hydroelectric Schemes, Portugal
o Venda Nova II
o Bemposta II
3. Models for Geomechanical Characterization of Rock Mass at DUSEL
o Geotechnical investigations at Sanford Laboratory
o Development of Geomechanical Models
o Use of Bayesian Networks (BN)
4. Models for the prediction of rockburst indexes
o Rockburst laboratory tests
o Rockburst maximum stresses
o Rockburst indexes
5. BIG Conclusions
1. BIG Introduction
Use of Data Mining techniques
• The uncertainties in underground structures are related to geotechnical conditions and construction.
• The determination of geotechnical parameters of rock masses (RMs) for underground works is subject to high uncertainty.
• A rigorous determination of the geomechanical parameters is the key to an efficient and rigorous design of the supports and of the excavation process.
• The methodologies are based on laboratory and in situ tests and on the application of empirical classification systems (RMR, Q and GSI).
• Prediction of rockburst indexes based on laboratory tests.
• Multiple Model System Identification (Clustering Multiple Models).
Scheme of the Schwandbach bridge used to illustrate the proposed methodology for iterative sensor placement
Tools for Data Mining
• Genetic algorithms
• Symbolic techniques: rules and decision trees
• Neural networks
• Clustering
• Bayesian networks
• Fuzzy logic
• Rule based systems
Machine Learning Tools and DM Techniques
Steps in the process of discovering patterns in Databases
DM and knowledge discovery
• Commercial
– SAS Enterprise Miner
– SPSS Clementine
– IBM Intelligent Miner
• Public domain
– WEKA
– R Environment
• Embedded in a DBMS
– Oracle
– SQL Server
Commercially available DM software
2. Underground Hydroelectric Schemes, Portugal
o Venda Nova II
o Bemposta II
DM application to Venda Nova II
• Access tunnel to the caverns, about 1.5 km long, with a 10.9% slope and a 58 m² cross-section.
• Hydraulic circuit, 6.3 m in diameter, with a 2.8 km headrace tunnel (14.8% slope) and a 1.4 km tailrace tunnel (2.1% slope).
• Upper surge chamber with a shaft 5.0 m in diameter and 415 m high.
• Powerhouse complex located at about 350 m depth, with two caverns (for the powerhouse and the transforming units) connected by two galleries.
Initial data of Venda Nova II
Excel file: 1230 records, 60 attributes
• Q system: RQD, Jw, Jn, Jr, Ja, SRF, Q, Q’, RQD/Jn, Jr/Ja, Jw/SRF, log Q, log Q’
• RMR system: P1, P2, P3, P4, P5, P6, RMR, P41, P42, P43, P44, P45
• GSI system: GSI, RCU, mb-D=0, mb-D=0.5, mb-D=0.8, mb-D=1, s-D=0, s-D=0.5, s-D=0.8, s-D=1, σcm-D=0, σ3max-D=0, φ-D=0, c’-D=0, σcm-D=0.2, σ3max-D=0.2, φ-D=0.2, c’-D=0.2, σcm-D=0.5, σ3max-D=0.5, φ-D=0.5, c’-D=0.5, σcm-D=0.8, σ3max-D=0.8, φ-D=0.8, c’-D=0.8, σcm-D=1, σ3max-D=1, φ-D=1, c’-D=1
• Others: N, RCR
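The derived Q-system attributes in the list (RQD/Jn, Jr/Ja, Jw/SRF, Q, Q’, log Q, log Q’) follow from the six basic ratings. A minimal sketch, assuming the standard Q-system definition Q = (RQD/Jn)·(Jr/Ja)·(Jw/SRF) and the usual Q’ = (RQD/Jn)·(Jr/Ja); the function name and the input values are hypothetical and are not taken from the Venda Nova II file:

```python
import math

def q_system_attributes(rqd, jn, jr, ja, jw, srf):
    """Derive the Q-system attributes of one record from its six ratings."""
    rqd_jn = rqd / jn    # RQD/Jn
    jr_ja = jr / ja      # Jr/Ja
    jw_srf = jw / srf    # Jw/SRF
    q = rqd_jn * jr_ja * jw_srf
    q_prime = rqd_jn * jr_ja   # Q' leaves out the Jw/SRF quotient
    return {
        "RQD/Jn": rqd_jn,
        "Jr/Ja": jr_ja,
        "Jw/SRF": jw_srf,
        "Q": q,
        "Q'": q_prime,
        "log Q": math.log10(q),
        "log Q'": math.log10(q_prime),
    }

# Hypothetical record, for illustration only
record = q_system_attributes(rqd=90, jn=9, jr=3, ja=1, jw=1, srf=2.5)
print(record["Q"], record["Q'"])
```

Computing the quotients first keeps each intermediate column available as a separate attribute, which matches the way they appear in the attribute list.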
Flowchart used for the DM tasks
Histograms of RMR, Q and GSI
Expressions used in the calculation of E
Comparison between the number of times each expression was evaluated and the number of times its result fell within the considered interval
Predicted versus real RMR: RMRpred = 0.9989 RMR, R² = 0.9539
Predicted versus real Em (GPa): Empred = 0.9944 Em, R² = 0.9759
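Both fitted lines on this slide have the form predicted = a × real, a straight line through the origin. The slide does not say how the lines were fitted; as a sketch, assuming an ordinary least-squares fit through the origin, the slope is a = Σ(x·y) / Σ(x²). The function name and the data below are made up for illustration:

```python
def slope_through_origin(real, predicted):
    """Least-squares slope of the line predicted = a * real (no intercept)."""
    sum_xy = sum(x * y for x, y in zip(real, predicted))
    sum_xx = sum(x * x for x in real)
    return sum_xy / sum_xx

# Hypothetical real and predicted values, for illustration only
real = [20.0, 40.0, 60.0, 80.0]
predicted = [21.0, 39.0, 61.0, 79.0]
slope = slope_through_origin(real, predicted)
print(round(slope, 4))
```

A slope close to 1 means that, on average, the model neither overestimates nor underestimates the real values.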
φ - Importance of variables for Dataset 1
φ - Importance of variables for Dataset 2
φ - Importance of variables for Dataset 3
Bemposta dam, Portugal
Built in 1964, H = 87 m, length 297 m
Repowering of Bemposta II, Portugal
Topology of the error function for the theoretical calculation: 3D view
Database at Bemposta II
The database consists of the following information:
• 286 lines with RMR values and their parameters (P1 to P6).
• 270 lines with Q values and their parameters (RQD, Ja, Jn, Jr, Jw and SRF).
• 686 lines with SMR values, the parameters P1 to P5 and the adjustment factor AF.
• Multiple regression (MR) models were computed for RMR and E; for E, two multilayer perceptron ANNs (with one and with two hidden layers) were also developed.
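To illustrate what a multilayer perceptron with one or two hidden layers computes, here is a minimal forward-pass sketch. It is not the network used in the study: the tanh hidden activation, the linear output, the layer sizes and all weights are assumptions chosen only for illustration.

```python
import math

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def mlp_forward(inputs, layers):
    """Forward pass with tanh hidden layers and a linear output layer."""
    values = inputs
    for index, (weights, biases) in enumerate(layers):
        is_output = index == len(layers) - 1
        activation = (lambda z: z) if is_output else math.tanh
        values = dense(values, weights, biases, activation)
    return values

# Hypothetical weights: two inputs, hidden layers of two neurons, one output
one_hidden = [
    ([[0.5, -0.2], [0.1, 0.3]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.5]),
]
two_hidden = [
    one_hidden[0],
    ([[0.4, 0.4], [-0.3, 0.2]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]
print(mlp_forward([1.0, 2.0], one_hidden))
print(mlp_forward([1.0, 2.0], two_hidden))
```

Adding a second hidden layer only adds one more (weights, biases) pair to the list; the forward pass itself is unchanged.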
Models for RMR
Obtained models for RMR with sets of 2 and 3 input parameters

Parameters | MAD | RMSE | R² | Model
P2+P4 | 3.288 | 4.035 | 0.685 | RMR = 27.134 + 0.002×P2³ + 0.851×P4
P2+P5 | 3.291 | 4.158 | 0.665 | RMR = 29.665 + 0.047×P2² + 0.053×P5²
P1+P2 | 3.367 | 4.357 | 0.632 | RMR = 38.623 + 0.004×P1³ + 0.002×P2³
P2+P4+P5 | 2.281 | 2.757 | 0.853 | RMR = 9.848 + 1.206×P5 + 0.913×P4 + 0.043×P2²
P1+P2+P4 | 2.257 | 3.094 | 0.815 | RMR = 24.771 + 0.002×P2³ + 0.004×P1³ + 0.895×P4
P1+P2+P5 | 2.894 | 3.708 | 0.734 | RMR = 25.467 + 0.804×P1 + 0.041×P2² + 0.046×P5²

Best RMR models with 2 and 3 input parameters:
RMR = 27.134 + 0.002×P2³ + 0.851×P4   (16)
RMR = 9.848 + 1.206×P5 + 0.913×P4 + 0.043×P2²
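The two best Bemposta II regression models can be evaluated directly from the ratings. A small sketch using the coefficients reported on this slide for the P2+P4 and P2+P4+P5 models; the function names and the example ratings are hypothetical:

```python
def rmr_from_p2_p4(p2, p4):
    """Best 2-input model: RMR = 27.134 + 0.002*P2^3 + 0.851*P4."""
    return 27.134 + 0.002 * p2 ** 3 + 0.851 * p4

def rmr_from_p2_p4_p5(p2, p4, p5):
    """Best 3-input model: RMR = 9.848 + 1.206*P5 + 0.913*P4 + 0.043*P2^2."""
    return 9.848 + 1.206 * p5 + 0.913 * p4 + 0.043 * p2 ** 2

# Hypothetical ratings, for illustration only
p2, p4, p5 = 10, 20, 10
print(round(rmr_from_p2_p4(p2, p4), 3))
print(round(rmr_from_p2_p4_p5(p2, p4, p5), 3))
```

With these example ratings the two models give similar estimates, which is the kind of cross-check that is possible when only some of the RMR parameters are available.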
Developed models for E
Comparison of the E values obtained by the in situ tests and by the empirical formulae.
3. Models for Geomechanical Characterization of Rock Mass at DUSEL
o Geotechnical investigations at Sanford Lab
o Development of Geomechanical Models
o Use of Bayesian Networks (BN)
Geotechnical investigations for the scientific experiments at Sanford Laboratory
• In 2000, it was announced that the Homestake Gold Mine, located in Lead, SD, would cease operation. The mine had been the site of a physics experiment, which operated for 30 years, aimed at the detection and quantification of neutrinos originating from the sun.
• The ability to install experiments underground is important because the overlying rock can shield sensitive detection experiments from cosmic radiation. Currently, experiments aimed at the detection of neutrinos from directed beam lines, generation of neutrinos from natural radioactive breakdown processes, and the detection of dark matter are either being contemplated or being installed underground at the Sanford Laboratory at Homestake.
Homestake Mine: started in 1876; depth of 2.5 km; 600 km of galleries; Nobel Prize in Physics awarded to Raymond Davis in 2002
Purposes for DUSEL
• Dark Matter (A, B, and C)
• Neutrinoless Double Beta Decay (F and G)
• Nuclear Astrophysics (I)
• Proton Decay (E)
• Long Baseline Neutrinos (D, F, and H)
• This panel indicates what these experiments mean for the physics of our universe and its history (J).
• The Laboratory includes a multidisciplinary research program in the Earth Sciences, including geomicrobiology (K), fault rupture (M and N), monitoring of excavations, coupled processes (L), and seismic monitoring systems (M).
Lesko et al., 2011
Graphical representation of DUSEL
Lesko et al., 2011
Long Baseline Neutrino Experiment (Roggenthen, 2012)
Vertical cross-section at Fermilab showing (from right to left) the inclination of the beamline, the target hall, the decay pipe, and the detector complex.
Development of concepts for DUSEL
Geotechnical investigations
1. Geology
2. Mapping of galleries
3. Boreholes and analysis of samples
4. Use of Televiewer
5. In situ stress measurements
6. Numerical modelling
7. Laboratory tests
8. Monitoring of blast-induced vibrations
9. Laser scanning
10. Use of Data Mining techniques
11. Application of Bayesian Networks (BN)
Isometric view of the geological structures in the neighbourhood of LC-1 (level 4850)
Geologic map at the 4850 Level and location of the boreholes planned in the triangle between the Ross and Yates shafts (Golder)
RESPEC, 2010
Summary of rock mass classifications for the rock mass zones
Laboratory tests (RESPEC)
• Brazilian tests (49)
• UCS tests (54)
• Triaxial tests (29)
• Shear tests on discontinuities (28)
Televiewers
Acoustic and optical techniques were used, depending on the presence of water in the borehole.
Borehole BH 3.
The in situ state of stress at the Homestake mine had been established previously by Pariseau (1985) and Johnson et al. (1993).
1 psi = 0.0069 MPa
RESPEC, 2010
3D FE modelling (Golder)
Numerical results considering 3 large cavities
Processes for risk analysis (Popielak & Weinig, 2010)
A process developed by Golder Associates and applied to DUSEL consists of the following activities:
i) Ranking of risk factors
ii) System to be modeled
iii) Conceptual models of the system
iv) Numerical analyses for the study of potential impacts of risk factors
v) Risk management plan for the different risk factors
Results obtained for RQD, RMR, GSI and Q
LF&A, 2009
Mapping location map at the 4850 level
RMR Models
RMR = 13.022 + 1.195 P2 + 1.118 P3 + 1.352 P5   (5)
RMR = 29.404 + 1.270 P2 + 1.258 P3   (6)

Metrics for the predictive models for the RMR index

Model | MAD (2 input parameters) | RMSE (2 input parameters) | MAD (3 input parameters) | RMSE (3 input parameters)
MR | 1.248 | 1.543 | 1.137 | 1.598
ANN | 1.317 | 1.833 | 1.330 | 2.279
SVM | 1.584 | 2.384 | 1.168 | 1.828
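The two error metrics reported for the MR, ANN and SVM models can be computed as follows. This is a sketch that assumes MAD denotes the mean absolute deviation between real and predicted values and RMSE the root mean squared error; the slides do not spell out the definitions, and the sample values are hypothetical.

```python
import math

def mad(real, predicted):
    """Mean absolute deviation between real and predicted values."""
    return sum(abs(r - p) for r, p in zip(real, predicted)) / len(real)

def rmse(real, predicted):
    """Root mean squared error between real and predicted values."""
    squared = sum((r - p) ** 2 for r, p in zip(real, predicted))
    return math.sqrt(squared / len(real))

# Hypothetical real and predicted RMR values, for illustration only
real_rmr = [50.0, 60.0, 70.0]
predicted_rmr = [52.0, 59.0, 70.0]
print(mad(real_rmr, predicted_rmr), rmse(real_rmr, predicted_rmr))
```

RMSE penalises large individual errors more strongly than MAD, so a model can rank better on one metric and worse on the other.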
Real versus predicted RMR values for models using two input parameters
GSI Models
GSI = 33.916 + 0.420 P3 + 0.472 RQD + 0.013 UCS

Metrics for the predictive models for GSI (3 input parameters)

Model | MAD | RMSE
MR | 4.411 | 5.559
ANN | 4.522 | 5.650
SVM | 4.278 | 5.578
Q Models
Coefficient magnitudes of the Q regression models (8) and (9)

Model | Constant | RQD | Jr | RQD/Jn
(8) | 0.656 | 0.016 | 0.408 | 0.353
(9) | 0.037 | 0.020 | | 0.390

Table 4 Metrics for the predictive models for the Q index

Model | MAD (2 input parameters) | RMSE (2 input parameters) | MAD (3 input parameters) | RMSE (3 input parameters)
MR | 0.031 | 0.054 | 0.020 | 0.028
ANN | 0.031 | 0.065 | 0.003 | 0.008
SVM | 0.040 | 0.106 | 0.009 | 0.029
Use of Bayesian Networks (BNs)
BNs are another option: they make it possible to account for the uncertainties related to geotechnical and construction aspects in risk management and decision making during construction.
BN for risk analysis of CO2 storage in the presence of an active fault (presented at the 4th Sino-German Conf.)
Several BNs were learned and tested with the cases available in the database, using the software GeNIe. In this specific study, only models that predict RMR values were developed. They were trained with about 4/5 of the cases and tested on 25 different cases. The algorithm used for learning the models was the “greedy thick thinning” with a uniform prior. For detailed information on the greedy thick thinning algorithm please refer to Heckerman
Naïve Bayesian network with 5 parameters (P1, P2, P3, P4, P5)
The figure shows the structure of the learned model obtained using five parameters (P1, P2, P3, P4 and P5).
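As an illustration of how a naïve Bayesian classifier of this kind is trained on one part of a database and tested on held-out cases, here is a minimal sketch in plain Python. It is not GeNIe and does not implement greedy thick thinning; the add-one (Laplace) smoothing, the discretised rating classes and all data are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels, alpha=1.0):
    """Count class and feature-value frequencies for a discrete naive Bayes model."""
    n_features = len(rows[0])
    class_counts = Counter(labels)
    # value_counts[j][c][v]: training cases of class c whose feature j equals v
    value_counts = [defaultdict(Counter) for _ in range(n_features)]
    levels = [set() for _ in range(n_features)]
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[j][label][value] += 1
            levels[j].add(value)
    return class_counts, value_counts, levels, alpha

def predict(model, row):
    """Return the class with the highest smoothed log-posterior score."""
    class_counts, value_counts, levels, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = math.log((n_label + alpha) / (total + alpha * len(class_counts)))
        for j, value in enumerate(row):
            count = value_counts[j][label][value]
            score += math.log((count + alpha) / (n_label + alpha * len(levels[j])))
        scores[label] = score
    return max(scores, key=scores.get)

def accuracy(model, rows, labels):
    """Fraction of cases whose predicted class matches the real class."""
    hits = sum(predict(model, row) == label for row, label in zip(rows, labels))
    return hits / len(rows)

# Hypothetical discretised (P2, P3) rating classes and RMR classes
train_rows = [("low", "low"), ("low", "mid"), ("mid", "low"),
              ("high", "high"), ("high", "mid"), ("mid", "high")]
train_labels = ["poor", "poor", "poor", "good", "good", "good"]
test_rows = [("high", "high"), ("low", "low")]
test_labels = ["good", "poor"]

model = train_naive_bayes(train_rows, train_labels)
print(accuracy(model, test_rows, test_labels))
```

Keeping the test cases out of the training set, as in the study, is what makes the reported accuracy an estimate of performance on unseen cases.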
Naïve BN with 3 parameters (P2, P3 and P5)
BN with 2 parameters (P2 and P3)
Strength of Influence
The learned networks were tested on 25 randomly selected cases from the database (not used to train the networks).
Accuracy results for RMR predictive BN models

BN Models | Accuracy (%)
a) Naïve Bayesian network with 5 parameters (P1, P2, P3, P4, P5) | 68%
b) Bayesian Network with 3 parameters (P2, P3 and P5) | 72%
c) Bayesian Network with 2 parameters (P2 and P3) | 76%
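Because the networks were tested on 25 cases, each reported accuracy corresponds to a whole number of correctly predicted cases. A short check, assuming the three accuracies are listed in the same order as models a), b) and c):

```python
TEST_CASES = 25  # number of test cases stated in the slides

# Reported accuracies, assumed to follow the order of models a), b), c)
accuracies = {"a": 0.68, "b": 0.72, "c": 0.76}

# Number of correctly predicted cases implied by each accuracy
correct = {name: round(value * TEST_CASES) for name, value in accuracies.items()}
print(correct)
```

The difference between consecutive models is therefore a single test case, which is worth keeping in mind when comparing them.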
4. Models for the Prediction of Rockburst Indexes
o Rockburst laboratory tests
o Rockburst maximum stresses
o Rockburst indexes
Fig. 6. Influence diagram of rockburst (adapted from Sousa 2010).
Nodes of the diagram:
• Type and rock strength
• Geometry (shape and equivalent diameter)
• Faults (folding)
• Construction method (support & advance rate)
• Rockburst
• Stress state (overburden & K = σh/σv)
• Severity (time delay)
• Dimensions of burst (location)
• Damage of tunnel
• Fatalities & injuries
Rockburst testing laboratory system
Rockburst testing system
Rockburst tests in different rock types.
Fields considered in the database

Field | Topics
Location of the test | Location of sample; depth (m); country; date.
Dimensions of sample | Code; length; width; height (mm); volume (cm³).
Rock material | Type of rock; RQD; UCS (MPa); specific weight (g/cm³); E, elastic modulus (GPa); ν, Poisson ratio.
Main minerals and cracks | % clay; % feldspar; % calcite; % carbon; existence of cracks.
Stresses before loading (MPa) | σv, vertical in situ stress; σh1, horizontal in situ stress; σh2, horizontal in situ stress in the face to be unloaded; I1 (first invariant of the stresses); I2 (second invariant of the stresses); I3 (third invariant of the stresses).
Stresses during tests (MPa) | Rockburst maximum stress (σRB); maximum stress axis; loading rate in MPa/s; unloading rate for vertical stresses in MPa/s; unloading times.
Characteristics of test | Type of burst position; duration of the test in minutes; time of burst delay (minutes); main shape of fragments.
Rockburst critical depth (m) | Critical depth; rockburst risk index.
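The database stores the three stress invariants I1, I2 and I3 of the stresses applied before loading. A minimal sketch of how they can be computed, assuming σv, σh1 and σh2 act as principal stresses (no shear components); the function name and the stress values are hypothetical:

```python
def stress_invariants(sigma_v, sigma_h1, sigma_h2):
    """Invariants of a stress state given by three principal stresses (MPa)."""
    i1 = sigma_v + sigma_h1 + sigma_h2
    i2 = sigma_v * sigma_h1 + sigma_h1 * sigma_h2 + sigma_h2 * sigma_v
    i3 = sigma_v * sigma_h1 * sigma_h2
    return i1, i2, i3

# Hypothetical in situ stresses in MPa, for illustration only
i1, i2, i3 = stress_invariants(sigma_v=30.0, sigma_h1=20.0, sigma_h2=10.0)
print(i1, i2, i3)
```

The invariants do not depend on the orientation of the coordinate axes, which makes them convenient single-valued descriptors of the initial stress state in a database.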
Rockburst indexes
Rockburst critical depth
Value of IRB