Article

An Efficient Neural-Network-Based Microseismic Monitoring Platform for Hydraulic Fracture on an Edge Computing Architecture

Xiaopu Zhang 1,2, Jun Lin 1,2, Zubin Chen 1,2, Feng Sun 1,2,3,*, Xi Zhu 3 and Gengfa Fang 3

1 College of Instrumentation and Electrical Engineering, Jilin University, Changchun 130061, China; [email protected] (X.Z.); [email protected] (J.L.); [email protected] (Z.C.)
2 Key Laboratory of Geophysical Exploration Equipment, Ministry of Education, Jilin University, Changchun 130061, China
3 School of Electrical and Data Engineering, University of Technology Sydney, Sydney, NSW 2007, Australia; [email protected] (X.Z.); [email protected] (G.F.)
* Correspondence: [email protected]; Tel.: +86-431-8850-2054
Received: 3 May 2018; Accepted: 3 June 2018; Published: 5 June 2018
Abstract: Microseismic monitoring is one of the most critical technologies for hydraulic fracturing in oil and gas production. To detect events in an accurate and efficient way, there are two major challenges. One challenge is how to achieve high accuracy despite a poor signal-to-noise ratio (SNR). The other is concerned with real-time data transmission. Taking these challenges into consideration, an edge-computing-based platform, namely Edge-to-Center LearnReduce, is presented in this work. The platform consists of a data center with many edge components. At the data center, a neural network model combining a convolutional neural network (CNN) and long short-term memory (LSTM) is designed, and this model is trained by using previously obtained data. Once the model is fully trained, it is sent to the edge components for event detection and data reduction. At each edge component, a probabilistic inference is added to the neural network model to improve its accuracy. Finally, the reduced data is delivered to the data center. Based on experimental results, a high detection accuracy (over 96%) with about 90% less transmitted data was achieved by using the proposed approach on a microseismic monitoring system. These results show that the platform can simultaneously improve the accuracy and the efficiency of microseismic monitoring.

Keywords: microseismic monitoring; event detection; edge computing; neural networks; probabilistic inference
1. Introduction

Hydraulic fracturing is a critical technology to improve oil and gas production that has been especially driven by the "shale gas revolution", since the propagation paths in low-permeability reservoirs, where hydrocarbons may flow, are only created by hydraulic fracturing [1–4]. According to a report provided by the Energy Information Administration (EIA) in 2016, more than 50% of crude oil production in the U.S. came from hydraulically fractured wells [5]. Hydraulic fracturing, as an essential technology for the development of unconventional resources, has also been used in over 2 million wells worldwide and in about 90% of new U.S. gas wells [6]. Before hydraulic fracturing, crews first shoot the pipe of the fracturing well underground to create some initial cracks. Then, these cracks are fractured to form propagation paths. It is therefore critical for hydraulic fracturing to be aware of the locations and growing trends of these cracks [7,8]. Usually, during the hydrofracturing process, a microseismic monitoring system is needed to map the locations of fractures by exactly sensing the induced microseismic events in order to determine, in a timely fashion, the correct fracture orientation and dimension so that the propagation paths grow efficiently [9].
As shown in Figure 1, a microseismic monitoring system mainly consists of a data center, surface monitoring units, and borehole monitoring units [10]. To monitor the changes of underground cracks as a correct reference for hydraulic fracturing, the data center collects and analyzes the data acquired by the monitoring units. To provide references for hydraulic fracturing in time, a microseismic monitoring system is required to perform at high accuracy in a real-time fashion. Therefore, the accuracy of microseismic event detection and the data collection time are two important indices of a microseismic monitoring system. In summary, to improve the efficiency of hydraulic fracturing, there are two major challenges in microseismic monitoring. The first challenge is how to identify the induced microseismic events with high accuracy. The second one is how to make the microseismic data collection as fast as possible so that the system can work in real time.
Figure 1. Microseismic monitor schematic diagram.
In the last few decades, several solutions to achieve high-quality microseismic monitoring have been proposed as a consequence of the increasing demand for petroleum and natural gas worldwide. Most of them are mainly concerned with the accuracy of microseismic event detection [11–15]. On the other hand, for a real-time system, some solutions are presented to compromise accuracy for faster data transmission [16–18]. Until now, there has been no solution that can support high event detection accuracy and a short data transmission time simultaneously. To address these two issues, we design a neural-network-based monitoring platform, named the Edge-to-Center LearnReduce Microseismic Monitoring Platform, which can be used in a microseismic monitoring system with an edge computing architecture. To improve the platform's accuracy, a neural-network-based microseismic event detection model is developed, which combines a convolutional neural network (CNN) and long short-term memory (LSTM). The model's parameters and structure are designed by considering the features of microseismic events. In addition, upon the neural network model, a probabilistic inference is applied to minimize false negative results. For the data transmission issue, an edge computing architecture is proposed based on the framework of a monitoring network so that the microseismic data that needs to be transmitted from edge components to the data center can be dramatically reduced.

The rest of this paper is organized as follows. In Section 2, related works in three main approaches to monitoring microseismic events efficiently are reviewed and their drawbacks are identified. Then, our proposed efficient microseismic monitoring platform is presented in Section 3.
In Section 4, both simulation and measurement results are presented and analyzed. Finally, the conclusion is given in Section 5.
2. Related Work

Until now, the research on microseismic monitoring has been mainly focused on detecting microseismic events. In this regard, the most extensively utilized method is the short-term average to long-term average (STA/LTA) algorithm, whose main idea is based on the differences in the energy densities of noise and a signal [13]. Based on this method, various variants of the STA/LTA method have been published, including methods that consider situations with a high level of noise [12,19]. By calculating parameters of energy, amplitude, or other entropy functions in multiple windows, instead of using average energies in a fixed-length window, the performance can be significantly improved compared with that of the initial STA/LTA method. However, these algorithms need a relatively long time to obtain the required accuracy due to the computational cost of the parameters in different windows. Another widely used approach to recognize events is based on spectrum analyses in the temporal, spatial, and frequency domains, such as the wavelet transform [20,21], the Hilbert–Huang transform [22], and other time-frequency representations [23]. The algorithms in this approach attempt to give insight into the complex structure of a noisy microseismic signal by displaying the amplitudes of different frequency components at a given time so that microseismic events can be separated from specified random noise. However, these methods are either not very effective for removing high-amplitude unknown noise or need some parameters to be tuned manually. In the past decade, with the fast development of artificial intelligence, machine learning techniques have been used for the development of microseismic event detection methods, such as the fuzzy clustering algorithm [14], the support vector machine (SVM) [24], and the Bayesian probability model [25]. Although all of these learning algorithms can obtain good results for detecting microseismic events within a short computational time, the data collection time, referred to as the second challenge, is not considered in this category of solutions, since they only pay attention to the data center and do not view the whole system as a platform.

Besides the studies on detecting microseismic events, some researchers are dedicated to compressing the microseismic data in the terminals, in an attempt to reduce the volume of data to be transmitted, so that the microseismic monitoring system can have less of a time delay in data retrieval [26,27]. However, this approach will add either some noise in fracture mapping or extra computational time for data recovery in the data center. Therefore, methods that use compression can hardly solve the two challenges simultaneously.

In summary, the drawbacks mentioned above are caused by the fact that the performance of each part of the monitoring system is optimized separately. Therefore, the aim of this work is to provide an efficient solution by taking the whole platform into consideration. The new platform consists of edge components and a data center. To achieve high accuracy for microseismic event detection, we use shoot data, which includes the main information of the underground structure and its characteristics, to train a specific model in the data center before hydraulic fracturing. After training, this learned model will be sent to an edge component for detection of microseismic events.
To reduce the traffic load, the edge component focuses on detecting microseismic events exactly and only transmits data including microseismic information.

3. Edge-to-Center LearnReduce Microseismic Monitoring Platform Design

3.1. Platform Structure

The structure of the Edge-to-Center LearnReduce Microseismic Monitoring Platform (ELMMP) is shown in Figure 2. The new platform consists of edge components and a data center. The edge components consist of the microseismic monitoring system's surface monitoring units and borehole monitoring units, while the data center of the platform is the monitoring system's data center. The data center and edge components are connected by access points. At an edge component, microseismic data is collected and processed by an embedded microchip and operating system. At the data center, a microseismic event detection model will first be learned by using shoot data. Then, the learned model will be sent to the
edge components via access points to implement data reduction. After that, the data center uses the reduced data transmitted from the edge components to locate the source of a microseismic event in the fracturing.
Figure 2. The structure of the Edge-to-Center LearnReduce Microseismic Monitoring platform.
When the platform is working, an edge component collects shoot data first and then transmits it to the data center for a learning purpose so that microseismic events can be detected exactly. Then, the edge component will download the learned microseismic event detection model from the data center before hydrofracturing work begins. During hydrofracturing, microseismic data is firstly recorded by the edge component. Secondly, the raw data is pre-processed by some filters to reduce ordinary noise. Thirdly, the data output by the pre-processing module is used as the input of the neural network model that is downloaded from the data center, whose output can be considered as the possibility of a seismic event at each sampling point in the time domain. Then, a probabilistic inference, based on the probability result yielded by the neural network model, is used to output the minimum data included by microseismic events. Finally, the edge component only transmits the data contained in microseismic events to the data center.

In the data center, the shoot data will be used to train a multiple-layer neural network model at the beginning, using a strategy named end-to-end learning. This strategy is based on moving away from hand-crafted feature detectors. The model is trained to produce the probabilistic result of a microseismic event directly from the input data. To meet the training requirement, we first label the shoot data according to the shoot time. In the shoot data, 1 or 0 is used to describe whether there is a microseismic event or not, respectively. Then, the shoot data and its label are imported to the neural network model proposed in this work, which comprises convolution and recurrent layers. It aims to extract both intrinsic features and temporal characteristics of microseismic events. After training, the learned model will be transmitted to the edge components.

3.2. The Data Center in ELMMP

3.2.1. Overview of the Data Center Model

These days, artificial neural networks, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have achieved state-of-the-art performance on several pattern recognition tasks, including image recognition [28–30]. Although prior knowledge of recognition will not be explicitly integrated into the neural network, it is important to give the network a structure that enables it to learn these dependencies from the data. Obviously, microseismic data detection is a kind of time-series data classification, which must depend on not only local features at the moment but also correlations with the past. Considering the applications of CNNs and RNNs to time-series classification, CNNs are attractive for their feature-learning ability but are not sensitive to the temporal characteristics of time-series data, while RNNs are effective for sequencing data but not good at capturing specific features locally. So, to obtain high accuracy in classification, we combine the two networks to leverage the advantages of both CNNs and RNNs. To take the common local features of microseismic data from different sensors into account, the CNN module is used to capture the main local features of microseismic events in a specific hydrofracturing process. For detecting the temporal dependencies of these features, an RNN (LSTM) layer is used in our work, since recurrent connections can store memories of past features. Generally, for the microseismic event detection case, we design a neural network with a convolution module, a recurrent module, and a softmax layer, as shown in Figure 3.
Figure 3. Architecture of the neural network for microseismic event detection. LSTM, long short-term memory; CNN, convolutional neural network.
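As a concrete illustration, the following is a minimal sketch of the Figure 3 pipeline in PyTorch: a convolution module feeding one LSTM layer and a softmax output. The layer widths and kernel sizes are placeholders (only the 72 feature maps matching 72 LSTM cells follows Section 3.2.4), and a standard LSTM is used here in place of the Elliott-activation variant described later.

```python
# Minimal sketch of the Figure 3 pipeline: convolution module -> LSTM -> softmax.
# Channel counts and kernel sizes are illustrative except for the 72 feature maps
# / 72 LSTM cells mentioned in Section 3.2.4.
import torch
import torch.nn as nn

class MicroseismicDetector(nn.Module):
    def __init__(self, in_channels=1, n_features=72, n_classes=2):
        super().__init__()
        # Convolution module: three 1-D convolution layers extracting local features.
        self.conv = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=7, padding=3), nn.ReLU(),
            nn.Conv1d(32, 48, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(48, n_features, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Recurrent module: one LSTM layer whose cell count equals the number of
        # feature maps produced by the convolution module.
        self.lstm = nn.LSTM(input_size=n_features, hidden_size=n_features,
                            batch_first=True)
        # Output layer: per-sample probability of belonging to a microseismic event.
        self.fc = nn.Linear(n_features, n_classes)

    def forward(self, x):                       # x: (batch, channels, time)
        feats = self.conv(x)                    # (batch, n_features, time)
        feats = feats.transpose(1, 2)           # (batch, time, n_features)
        out, _ = self.lstm(feats)               # (batch, time, n_features)
        return torch.softmax(self.fc(out), dim=-1)  # per-time-step class probabilities

# Example: probabilities for one 1200-sample trace.
probs = MicroseismicDetector()(torch.randn(1, 1, 1200))
```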
3.2.2. Training Set
As with traditional work in supervised learning, we must label the training data first. In this study, we use shoot data as the training data and label it manually. Because of the high-level energy of a shoot event and because the crews know the time at which the well is shot, a microseismic event in the shoot data can be found easily. Then, we can recognize a microseismic signal's arrival time and duration by analyzing the specific morphology features of the microseismic signal and seeking the shoot time. In Figure 4, there is an example of a piece of shoot data and its corresponding label sequence. The label sequence has the same length as the data sequence. It should be noted that the part of the label sequence marked "1" includes the arrival point and duration of the microseismic signal, and the remainder is marked as "0".
Figure 4. Shoot data and its label (training set).
3.2.3. CNN Module

After the training set is labeled, the neural network begins to be trained. The convolution module is at the bottom of the whole model. According to prior work in the study of the microseismic signal process, there are many different nonlinear features in microseismic events [31]. So, to improve the ability to capture the main characteristics of microseismic events, we design a convolution module by considering two main issues. One is the variety of microseismic signals, and the other one is the nonlinearity of a microseismic signal. To extract different specific features in different frequencies and mathematical morphologies, multiple channels of kernels in different sizes should be used in each layer of the convolution module [32]. Differently sized kernels can catch differently scaled features, so the smallest kernels are designed to capture the features in the highest frequency. A different channel of kernels is used to sense different features in a mathematical morphology. According to the successful examples using multiple-scale analyses of a microseismic signal and its sampling frequency, the parameters of these kernels are described in Figure 5. Because of the variety of microseismic signals, it is really hard to decide how to select the most effective kernels at each layer before obtaining some prior data, such as shoot data. So, a structure such as "Inception" is used in the convolution module, which applies a wide structure to let the convolution module decide which kernel is to be used [33]. To increase the representational power of nonlinearity, a kind of structure named "Network in Network" (NiN) [34] is used in the convolution module too. Compared with classical CNNs, the NiN replaces filters with a micro-network. Initially, the micro-networks are designed as fully connected multilayer perceptrons, which actually increases the network's depth. However, owing to information about the input or gradient passing through these layers, some of the captured characteristics may vanish when they reach the end or beginning of the network. So, we apply a densely connected structure used in DenseNet [35], which creates short paths from early to later layers. Specifically, in a micro-network, each layer obtains additional inputs from all preceding layers and passes on its own feature maps to all subsequent layers. Besides, according to the sparsity of microseismic data, the activation function in the convolution module is a rectified linear unit, since a rectified activation function performs better when the data is sparse compared to sigmoid and hyperbolic tangent neurons. As the convolution module is designed to extract local features and to be easily implemented in an edge component, there are three convolution layers in the convolution module. Finally, the convolution module is designed as shown in Figure 5.
Figure 5. The convolution model in the designed neural network.
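The sketch below illustrates the idea of one such wide, Inception-style convolution layer: parallel branches with differently sized kernels plus a 1 × 1 NiN-style micro-network, concatenated along the channel axis. The branch widths and kernel sizes are assumptions for illustration, not the exact parameters of Figure 5, and the dense (DenseNet-style) shortcuts are omitted for brevity.

```python
# Sketch of one Inception-style convolution layer: parallel branches with
# differently sized kernels sense features at different scales, and their
# outputs are concatenated so later layers can favour whichever kernels help.
import torch
import torch.nn as nn

class MultiScaleConvBlock(nn.Module):
    def __init__(self, in_ch, branch_ch=24, kernel_sizes=(3, 5, 9)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(in_ch, branch_ch, k, padding=k // 2),  # scale-specific kernel
                nn.ReLU(),                                       # rectified linear unit
                nn.Conv1d(branch_ch, branch_ch, 1),              # 1x1 NiN-style micro-network
                nn.ReLU(),
            )
            for k in kernel_sizes
        ])

    def forward(self, x):
        # Concatenate feature maps from all scales along the channel axis.
        return torch.cat([branch(x) for branch in self.branches], dim=1)

block = MultiScaleConvBlock(in_ch=1)
features = block(torch.randn(1, 1, 1200))   # -> (1, 72, 1200) feature maps
```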
3.2.4. RNN Module

As emphasized earlier, any microseismic event needs to be detected by considering temporal dependency as well. So, on top of the convolution module, there is a recurrent module in the neural network. However, there is the issue of a long-term dependency problem in microseismic detection when a microseismic event with a large energy takes place. If the energy of the microseismic event is large, the signal containing microseismic information will last a long time. So, detecting the end of a microseismic event is a problem, since it depends on some character of the arrival, which occurs a long time before the event's end.

Nowadays, the LSTM algorithm is well-known to be capable of learning the long-term dependencies problem [36]. The key property of LSTM is that it explicitly takes account of long-term information in the past, which is a limitation of the classic RNN architecture (the long-term dependency problem). Considering former studies on LSTM, none of the variants can improve upon the standard LSTM architecture significantly [37], and activation functions are the most critical components of it [38]. So, in our study, we implement the standard LSTM architecture in the recurrent module and choose the Elliott function as the activation function by considering the prior research on activation functions and the complexity of the neural network model's implementation in an edge component. Instead of using sigmoid and hyperbolic tangent functions as activation functions, the Elliott-function-based LSTM can improve the detection accuracy without increasing any computational costs. Specifically, the activation functions of the input and output are the Elliott function shown in (1), while a kind of modified Elliott function is given in (2) that is applied as the activation function in the forget gate, input gate, and output gate. The architecture of the LSTM algorithm used in this paper is illustrated in Figure 6, and the related formulas are expressed in (3) to (7). To extract the temporal correlation of local microseismic features clearly, each LSTM cell connects to a feature map generated by the convolution module. In total, there are 72 feature map outputs generated by the convolution module. So, the cell number of every LSTM unit is 72.

f(x) = x / (1 + |x|)  (1)

f(x) = 0.5x / (1 + |x|) + 0.5  (2)

it = σi(xt · Wxi + ht−1 · Whi + bi)  (3)

ft = σf(xt · Wxf + ht−1 · Whf + bf)  (4)

Ct = ft ∗ Ct−1 + it ∗ σC(xt · WxC + ht−1 · WhC + bC)  (5)

Ot = σo(xt · Wxo + ht−1 · Who + bo)  (6)

Ht = ot ∗ σh(Ct)  (7)
Figure 6. The architecture of LSTM.
The output vectors from all LSTM units are fully connected to a softmax layer to describe the raw data’s likelihood of being included by a microseismic event.
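A minimal NumPy sketch of one LSTM time step following Equations (1)–(7) is given below, with the modified Elliott function (2) on the gates and the Elliott function (1) on the cell input and output. The weights are random placeholders, and packing the four gate weight matrices into one array is a convenience of this sketch, not a statement about the hardware layout.

```python
# One LSTM time-step following Equations (1)-(7): the modified Elliott function (2)
# drives the input, forget, and output gates, and the Elliott function (1) is the
# cell input/output nonlinearity. Weights here are random placeholders.
import numpy as np

def elliott(x):                 # Eq. (1)
    return x / (1.0 + np.abs(x))

def elliott_gate(x):            # Eq. (2), maps to (0, 1) like a sigmoid
    return 0.5 * x / (1.0 + np.abs(x)) + 0.5

def lstm_step(x_t, h_prev, c_prev, W_x, W_h, b):
    # W_x: (input_dim, 4*hidden), W_h: (hidden, 4*hidden), b: (4*hidden,)
    z = x_t @ W_x + h_prev @ W_h + b
    i, f, g, o = np.split(z, 4)
    i, f, o = elliott_gate(i), elliott_gate(f), elliott_gate(o)   # Eqs. (3), (4), (6)
    c_t = f * c_prev + i * elliott(g)                             # Eq. (5)
    h_t = o * elliott(c_t)                                        # Eq. (7)
    return h_t, c_t

hidden, n_in = 72, 72           # 72 cells, one per convolution feature map
rng = np.random.default_rng(0)
h, c = np.zeros(hidden), np.zeros(hidden)
W_x = rng.normal(size=(n_in, 4 * hidden))
W_h = rng.normal(size=(hidden, 4 * hidden))
h, c = lstm_step(rng.normal(size=n_in), h, c, W_x, W_h, np.zeros(4 * hidden))
```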
3.3. Edge Component in ELMMP

3.3.1. Overview of Edge Computing in ELMMP

The edge component of our platform accomplishes microseismic event detection and related data transmission. To improve the efficiency of the microseismic monitoring system, more computing tasks are assigned to an edge component besides the ordinary work, such as data recording and transmission, seen in usual edge devices. The computing tasks in our edge component can be sorted into three major parts. The first computing task in the edge component is to reduce the noise of microseismic data in a real-time way. This part is mainly focused on reducing the noise caused by changes in the environment, such as a human walking, a vehicle running, or simply variations of the wind, in order to increase the accuracy of microseismic event detection. The second one, following environmental noise reduction, is to execute the trained neural network to calculate how likely the collected data is to be contained in a microseismic event. The trained neural network parameters are downloaded to the edge component from the data center once the training model reaches a relatively high and stable classification result. Then, the neural network is implemented in the edge component to provide the microseismic data probability. The last computing task is to infer whether each sampling point belongs to a microseismic event based on its probability given by the neural network model. Finally, only the data included in the microseismic event is sent to the data center, which is much less than the raw data recorded by the edge component.

3.3.2. Noise Reduction of Microseismic Data

Obviously, there are lots of changes in the field during hydraulic fracturing. Changes caused both by humans and nature will induce environmental noise in the microseismic signal recording as long as they are close to the edge component. There is no doubt that a considerable amount of environmental noise is contained in a raw recording. To handle this issue, some filters are designed in an edge component for increasing the signal-to-noise ratio (SNR) of the microseismic data, which is critical to both microseismic event detection and related data analyses. Specifically, after the 24-bit analog-to-digital conversion, the raw data is post-processed by filtering with a series of Butterworth zero-phase, 2nd-order filters in definite band-stop frequency bands. The start and stop frequencies of the band-stop filters are defined according to the shoot data, since the shoot data contains the geophysical information of the hydraulic fracturing monitoring region and has common features with the acquisition data in the frequency domain.

3.3.3. Implementation of the Neural Network

After the first computing task, the raw data is put into the trained neural network. To implement the trained neural network in an edge component when considering limited resources and the specific computing operations of our neural network, we use a parallel architecture. Different neural network structures have different computing operations; for example, CNNs' main operations are mostly spatial convolutions, and the LSTM's are matrix-vector multiplications.
Since the proposed neural network model has two parts, namely convolution modules and LSTM modules, there are two strategies applied in the design of an edge component’s architecture. First, for the convolution module, the feature map of every layer is reused to reduce the computing load, and each convolution layer runs different-kernel convolutions in parallel by using multiply-accumulate units. The computation architecture of the convolution layer is shown in Figure 7. Specifically, for each kernel, the filter weights are stationary inside the register file (RF) of the processing element (PE) and the input feature is streamed into the PE. The PEs are used to finish the multiply-and-accumulate (MAC) operations for each sliding window at one time. Since there are
overlaps of input features between two consecutive sliding windows, the input features can be kept in the RF and reused.
Figure 7. The convolutional computation reuse within a processing element (PE). (a) Step 1; (b) Step 2; (c) Step 3; (d) Step 4.
Secondly, for the LSTM module, many multiply-accumulate units are used to accelerate the computing speed, since 71% of the run-time of an RNN is due to matrix-vector multiplication according to related work [39,40]. Hence, it is wise to use as many multiply-accumulate units as possible so that those operations can be processed simultaneously. In our work, the LSTM is one layer with 72 hidden cells. There are 144 multiply-accumulate units used in the gates block to accomplish the matrix-vector multiplication, as shown in Figure 8. One element of the input vectors (the output of the CNN module) and one element of all weight matrix rows are fed to 72 multiply-accumulate units to compute the matrix-vector multiplication. The other 72 multiply-accumulate units are fed by one element of the last-state output vectors and one element of all weight matrix rows to compute the matrix-vector multiplication. Then, the results from the two groups of multiply-accumulate units are added together in parallel. These results, in a bus of data, are then serialized into a stream to compute the non-linear function (Elliott function or modified Elliott function). The non-linear function is an element-wise operation; thus, there is not much advantage to doing a parallel computation for non-linear mapping. Besides, there are weight and vector caches to store the parameters, and DMA ports are used to update the parameters.
Figure 8. The structure of the gates block.

For the whole neural network, its CNN module and RNN module are implemented in parallel rather than in sequence to accelerate calculation. Specifically, the two modules are executed in a two-stage pipeline, and they are processed according to their related strategies mentioned above, respectively. The whole computing architecture of the neural network in an edge component is shown in Figure 9.
Figure 9. The computing architecture of the neural network in edge components. RNN, recurrent neural network.
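The data flow of the gates block can be mimicked in NumPy as two independent matrix-vector products, one fed by the CNN output and one by the last-state vector, whose results are added before the element-wise nonlinearity; the shapes below are illustrative (72 cells as in Section 3.2.4).

```python
# Sketch of the Figure 8 gates-block dataflow: the input-side product x·W_x and
# the state-side product h·W_h are independent, so they can be computed by two
# separate groups of multiply-accumulate units and added afterwards.
import numpy as np

rng = np.random.default_rng(1)
x_t = rng.normal(size=72)               # output of the CNN module at this step
h_prev = rng.normal(size=72)            # last-state output vector
W_x = rng.normal(size=(72, 72))         # input weight matrix of one gate
W_h = rng.normal(size=(72, 72))         # recurrent weight matrix of one gate

acc_input = x_t @ W_x                   # first group of 72 MAC units
acc_state = h_prev @ W_h                # second group of 72 MAC units (in parallel)
pre_activation = acc_input + acc_state  # results added together element-wise
gate = 0.5 * pre_activation / (1.0 + np.abs(pre_activation)) + 0.5  # modified Elliott
```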
3.3.4. Probabilistic Inference Module

Usually, neural networks are used as a kind of classifier to obtain discrete classification results directly. However, in a microseismic application, there is a strong dependence in consecutive time series data, where it is impossible to have many drastic fluctuations in classification results during only a few time intervals. Although the LSTM algorithm can capture some dependence features in the time domain by mining the series data, the output of the LSTM algorithm still has some drastic fluctuations if we let the LSTM algorithm generate discrete classification results directly. This kind of result cannot be considered as a final classification decision because outputs from the LSTM are viewed independently, without unambiguous probabilistic information of the classification. This problem may be caused by many different reasons, such as higher noise levels and a different pattern in microseismic events. According to these concerns, it is hard to obtain high accuracy using neural networks alone. Therefore, in this work, a probabilistic inference-based approach is used to handle this problem, which takes into consideration the outcome from the neural network model.

To decrease false negative errors in the classification results, as well as make the best use of the neural network model's prediction, the output of the neural network model is considered as a microseismic event probability at each sampling data. Then, the probability is used to calculate the final classification result by using the following equations from (8) to (12). In the following equations, xi is the i-th sampling data, while p(xi) is the neural network's output for the i-th sampling data, which can also be thought of as the probability of being a microseismic event in our work. Pmei1 means the degree of membership of the i-th sampling data to be a part of a microseismic event, while Pmei0 means the degree of membership of the i-th sampling data not to be included by any microseismic event. Ptri1 represents the transition probability of the i-th sampling data to be a part of a microseismic event if the (i − 1)-th sampling data is not classified to the label "1". In the same way, Ptri0 stands for the transition probability of the i-th sampling data to be sorted to class "0" if the (i − 1)-th sampling data is labeled "1". C(xi) is the final result of the i-th sampling data. For the i-th sampling data, if its transition probability is larger than the membership value of the former sampling data, it will be sorted to a different class from the former one. Otherwise, it will be sorted to the same class as the former one, as (12) describes.

Pmei1 = p(xi)  (8)

Pmei0 = 1 − p(xi)  (9)

Ptri1 = p(xi) ∗ p(xi+1) ∗ p(xi+2)  (10)

Ptri0 = (1 − p(xi)) ∗ (1 − p(xi+1)) ∗ (1 − p(xi+2))  (11)

C(xi) = 1, if Ptri1 > Pmei0;  C(xi) = 0, if Ptri0 > Pmei1  (12)
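A NumPy sketch of Equations (8)–(12) is given below. The treatment of the first sample and of the last two samples (where p(xi+1) and p(xi+2) are unavailable) is an assumption of this sketch, and the carry-over of the previous class follows the textual description above.

```python
# Sketch of the probabilistic inference of Equations (8)-(12): each sample keeps
# the previous sample's class unless a transition probability exceeds the
# corresponding membership value.
import numpy as np

def probabilistic_inference(p):
    """p: per-sample event probabilities from the neural network, shape (N,)."""
    n = len(p)
    c = np.zeros(n, dtype=np.int8)
    c[0] = int(p[0] > 0.5)                                      # assumed initialisation
    for i in range(1, n - 2):
        p_tr1 = p[i] * p[i + 1] * p[i + 2]                      # Eq. (10)
        p_tr0 = (1 - p[i]) * (1 - p[i + 1]) * (1 - p[i + 2])    # Eq. (11)
        if p_tr1 > 1 - p[i]:            # Eq. (12): Ptri1 > Pmei0 -> class 1
            c[i] = 1
        elif p_tr0 > p[i]:              # Eq. (12): Ptri0 > Pmei1 -> class 0
            c[i] = 0
        else:
            c[i] = c[i - 1]             # otherwise keep the former sample's class
    c[n - 2:] = c[n - 3]                # assumed: carry the last decision forward
    return c
```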
4. Evaluation

4.1. Simulation Results and Analysis

To evaluate the performance of our platform, we first conducted a series of simulation experiments. In the simulation, several Ricker wavelets with different frequencies, maximum amplitudes, and noise levels were used to evaluate the detection accuracy of our proposed algorithm, while the STA/LTA algorithm was also used as a benchmark for performance comparison. First, 10 training sets were generated for the simulations, and each training set contained 40-channel simulation microseismic data. Considering the main frequency band of the microseismic wave recorded on the surface, the simulations focused on the frequency band from 20 to 300 Hz. In each simulated microseismic channel, several Ricker wavelets within a frequency band from 20 to 300 Hz were generated, each of which consists of a time series considering the characteristics of real microseismic data in the frequency domain. Then, in the generated series, the data that consists of Ricker wavelets was labeled with "1". Otherwise, the data between two adjacent Ricker wavelets in the series was labeled as "0". Finally, the series data and its labels were used to train our neural network. To ensure good reliability, a testing set with 40 channels was generated. Each channel includes six Ricker wavelets, whose frequencies were randomly chosen from 20 to 300 Hz. In the testing set, there was a Ricker wavelet with a particular dominant frequency in every 200 samples. So, each channel has 1200 samples. Then, these 40 channels of testing data were corrupted with different levels of Gaussian white noise. Specifically, the signal-to-noise ratios (SNRs) of the test data were 0 dB, −5 dB, −10 dB, and −15 dB, respectively.

To illustrate the relationship between the number of training samples and the model's prediction accuracy, all of the 10 training sets were involved in the simulations. Each training set had 40 channels. To be specific, in each channel, a Ricker wavelet with a particular dominant frequency was included in every 200 samples. In order to illustrate the impact of different sample numbers conveniently, the wavelet number and sample number of each channel in the same training set were set to be equal.
In the 10 different training sets, the number of Ricker wavelets contained by each channel ranged from 5 to 14, respectively. As a result, the total sample numbers of each channel in the 10 different training sets ranged from 1000 to 2800. Then, the 10 different training sets were used to train the model. After training, the models were tested by the data with an SNR of −15 dB in the testing set. The testing results are compared in Figure 10. According to the results shown in Figure 10, it can be concluded that the training set including 10 Ricker wavelets (containing 2000 samples) enables the model to achieve an acceptable and stable accuracy of over 96%.
Figure 10. Results of different sample numbers.
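A sketch of how one such simulated channel could be generated: a Ricker wavelet with a randomly chosen dominant frequency between 20 and 300 Hz is placed in every 200-sample block, the covered samples are labeled 1, and Gaussian white noise is added. The sampling rate, wavelet length, and noise scale are assumptions of this sketch.

```python
# Sketch of one simulated test channel: a Ricker wavelet with a random dominant
# frequency (20-300 Hz) every 200 samples; labels are 1 where a wavelet sits.
import numpy as np

def ricker(f, length, fs):
    """Ricker wavelet with dominant frequency f (Hz) sampled at fs (Hz)."""
    t = (np.arange(length) - length // 2) / fs
    a = (np.pi * f * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def make_channel(n_wavelets=6, block=200, fs=1000.0, wavelet_len=80, seed=0):
    rng = np.random.default_rng(seed)
    data = np.zeros(n_wavelets * block)
    labels = np.zeros_like(data, dtype=np.int8)
    for k in range(n_wavelets):
        f = rng.uniform(20.0, 300.0)                 # random dominant frequency
        start = k * block + (block - wavelet_len) // 2
        data[start:start + wavelet_len] += ricker(f, wavelet_len, fs)
        labels[start:start + wavelet_len] = 1        # samples inside the wavelet
    return data, labels

data, labels = make_channel()                        # 1200 samples, as in the testing set
noisy = data + np.random.normal(scale=3 * data.std(), size=data.size)  # add white noise
```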
In addition, to give a reasonable result of the training epochs needed by the model to reach a reliable detection, we also compared the classification accuracies achieved by the model with different training epochs. To be specific, the 10 Ricker wavelets (2000 samples) were used to train the model. The data with an SNR of −15 dB in the testing set was selected as the testing data. Then, the trained models with 13 different epochs were tested, respectively. The relationship between the classification accuracy and training epochs in the simulation is shown in Figure 11. After 80 epochs, the model achieves a relatively high and stable classification result.
Figure 11. Results of different epochs.
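The epoch sweep itself is mechanically simple. The sketch below shows one way to reproduce it with a small stand-in Keras classifier trained on window-level labels, which is a simplification of the per-sample labelling used above; the `windows` and `build_model` helpers, the layer sizes, and the list of epoch counts are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)

def windows(n, snr_db=-15, length=200, fs=1000):
    """Stand-in data set: every second 200-sample window holds a Ricker-like burst."""
    X = rng.normal(0.0, 1.0, (n, length, 1))                      # unit-variance noise
    y = np.zeros(n, dtype=np.float32)
    t = (np.arange(length) - length / 2) / fs
    for i in range(0, n, 2):
        f = rng.uniform(20.0, 300.0)
        a = (np.pi * f * t) ** 2
        w = (1.0 - 2.0 * a) * np.exp(-a)
        w *= np.sqrt(10.0 ** (snr_db / 10.0) / np.mean(w ** 2))   # scale burst to the target SNR
        X[i, :, 0] += w
        y[i] = 1.0
    return X, y

def build_model():
    # small Conv1D + LSTM binary classifier used only as a stand-in
    return tf.keras.Sequential([
        tf.keras.layers.Conv1D(16, 5, activation="relu", input_shape=(200, 1)),
        tf.keras.layers.MaxPooling1D(2),
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

X_tr, y_tr = windows(400)
X_te, y_te = windows(200)
for epochs in (10, 20, 40, 80, 120):                              # illustrative epoch counts
    model = build_model()
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.fit(X_tr, y_tr, epochs=epochs, verbose=0)
    _, acc = model.evaluate(X_te, y_te, verbose=0)
    print(f"{epochs:3d} epochs -> test accuracy {acc:.3f}")
```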
To provide further evidence of robust performance, the proposed algorithm was compared with the STA/LTA algorithm for detecting microseismic events in test series data with different SNRs. In particular, considering the situation of a real application, a 4th-order bandpass filter was applied prior to the STA/LTA algorithm, while the proposed algorithm processed the test data directly. As with the training parameters given above, the model was trained on the training set with 2000 samples in each channel for 80 training epochs. Two typical channels (A and B) of test data were chosen to illustrate the detection results of the two algorithms; these results and the real labels are shown in Figures 12–19.
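For reference, a minimal version of this STA/LTA baseline can be written as below. The short/long window lengths, the normalisation, and the default threshold are assumptions made only for illustration; as noted later, the paper selects its STA/LTA thresholds manually for each noise level (see Table 1).

```python
import numpy as np
from scipy.signal import butter, filtfilt

def sta_lta_detect(x, fs=1000, sta_win=20, lta_win=200, threshold=0.4):
    """Band-pass filtering followed by a classical STA/LTA trigger.
    Returns a 0/1 label per sample (1 = event declared)."""
    # 2nd-order low/high prototype -> 4th-order Butterworth band-pass, 20-300 Hz
    b, a = butter(2, [20.0, 300.0], btype="bandpass", fs=fs)
    xf = filtfilt(b, a, x)                       # zero-phase filtering
    energy = xf ** 2
    sta = np.convolve(energy, np.ones(sta_win) / sta_win, mode="same")
    lta = np.convolve(energy, np.ones(lta_win) / lta_win, mode="same")
    ratio = sta / (lta + 1e-12)
    ratio /= ratio.max()                         # normalise so the threshold lies in (0, 1)
    return (ratio > threshold).astype(int)
```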
Figure 12. Channel A of test data (0 dB) and its results. (a) Test data (0 dB); (b) Detection results of the proposed algorithm; (c) Detection results of the short-term average to long-term average (STA/LTA) algorithm; (d) The real label.
Figure 13. Channel B of test data (0 dB) and its results. (a) Test data (0 dB); (b) Detection results of the proposed algorithm; (c) Detection results of the STA/LTA algorithm; (d) The real label.
Figure 14. Channel A of test data (−5 dB) and its results. (a) Test data (−5 dB); (b) Detection results of the proposed algorithm; (c) Detection results of the STA/LTA algorithm; (d) The real label.
Figure 15. Channel B of test data (−5 dB) and its results. (a) Test data (−5 dB); (b) Detection results of the proposed algorithm; (c) Detection results of the STA/LTA algorithm; (d) The real label.
Figure 16. Channel A of test data (−10 dB) and its results. (a) Test data (−10 dB); (b) Detection results of the proposed algorithm; (c) Detection results of the STA/LTA algorithm; (d) The real label.
Figure 17. Channel B of test data (−10 dB) and its results. (a) Test data (−10 dB); (b) Detection results of the proposed algorithm; (c) Detection results of the STA/LTA algorithm; (d) The real label.
Figure 18. Channel A of test data (−15 dB) and its results. (a) Test data (−15 dB); (b) Detection results of the proposed algorithm; (c) Detection results of the STA/LTA algorithm; (d) The real label.
Figure 19. Channel B of test data (−15 dB) and its results. (a) Test data (−15 dB); (b) Detection results of the proposed algorithm; (c) Detection results of the STA/LTA algorithm; (d) The real label.
Based on the results of this simulation, one of the advantages of the presented detection algorithm is clearly shown: with the presented algorithm, the impact of different noise levels on detection accuracy is marginal compared with the STA/LTA-based approach. In particular, the proposed algorithm has a definite advantage over STA/LTA in the low-SNR case. For example, with an SNR of −15 dB, the proposed algorithm is able to detect every event, whereas the STA/LTA algorithm mistakes noise for an event, as shown in Figures 18 and 19. Specifically, in Figure 18, the part highlighted by a circle is a false negative classification that occurred when the STA/LTA algorithm tried to detect the third or fifth event in the series data using a threshold, which is expressed as the dashed line. A similar situation can be found in the other channel, as shown in Figure 19.

For detailed analysis, three metrics (accuracy, precision, and recall) of the two detection algorithms were calculated, and the results are listed in Table 1. Note that, in order to obtain the best result using the STA/LTA algorithm, the classification threshold has to be changed according to the noise level. In practice, it is difficult to obtain an appropriate threshold for all cases, since the SNR always changes in a random way. To be more persuasive, the proposed algorithm is compared with the best result of STA/LTA, whose thresholds are selected manually. Despite this, the proposed algorithm performs better than the STA/LTA algorithm in terms of all three metrics, especially in high-noise situations. Besides this, considering the results of precision and recall, there is another advantage of the proposed algorithm: it is hardly affected by the imbalanced data. Furthermore, although both the proposed algorithm and the STA/LTA algorithm detect events well in high-SNR data, the classification quality is different. Detection with higher quality means less redundancy in the data to be transmitted to the data center, which leads to a more efficient system. Therefore, compared with the STA/LTA algorithm, the proposed algorithm also enables the platform to be more efficient.

Table 1. The contrast of detection accuracy.
SNR (dB)   STA/LTA Threshold   STA/LTA Accuracy (%)   STA/LTA Precision (%)   STA/LTA Recall (%)   Proposed Accuracy (%)   Proposed Precision (%)   Proposed Recall (%)
0          0.41                96.92                  89.63                   84.72                98.58                   94.07                     93.38
−5         0.37                96.85                  88.14                   82.76                98.42                   93.35                     92.64
−10        0.38                96.78                  86.67                   81.81                97.45                   89.72                     88.53
−15        0.39                95.25                  80.13                   78.26                96.83                   87.36                     86.12

SNR, signal-to-noise ratio; STA/LTA, short-term average to long-term average.
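Given per-sample labels and per-sample detections, the three metrics in Table 1 can be computed directly. The sketch below uses scikit-learn and treats the event class ("1") as the positive class; the `labels` and `sta_lta_detect` names in the usage comment refer to the hypothetical helpers from the earlier sketches.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

def detection_metrics(y_true, y_pred):
    """Accuracy, precision and recall of a per-sample event detector."""
    return (accuracy_score(y_true, y_pred),
            precision_score(y_true, y_pred, zero_division=0),
            recall_score(y_true, y_pred, zero_division=0))

# e.g. acc, prec, rec = detection_metrics(labels, sta_lta_detect(channel))
```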
To illustrate the proposed algorithm's performance on data reduction, the ratio of data reduced was calculated, and the results are shown in Table 2. The ratios of data reduced at the different SNRs are all about 90%, which means that about 90% of the data transmission time can be saved and the real-time performance of the microseismic monitoring system is clearly improved.

Table 2. The ratio of data reduced in different SNRs.

SNR (dB)   Ratio of Data Reduced (%)
0          89.17
−5         89.67
−10        90.67
−15        88.75
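The reduction ratio in Table 2 can be read as the fraction of samples an edge component does not have to transmit once only the detected event windows are kept. A sketch of that bookkeeping is given below; the margin retained around each detected sample is an illustrative choice, not a parameter from the paper.

```python
import numpy as np

def reduction_ratio(pred, margin=50):
    """Fraction of samples that need not be transmitted when only samples
    flagged as events (plus a margin on each side) are sent to the data center."""
    keep = np.zeros(pred.size, dtype=bool)
    for i in np.flatnonzero(pred):
        keep[max(0, i - margin):i + margin + 1] = True
    return 1.0 - keep.mean()

# e.g. reduction_ratio(detections) -> about 0.9 in the experiments reported above
```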
4.2. Measurement Results and Analysis

The proposed monitoring platform has also been applied in a testing microseismic monitoring system developed by Jilin University [10,41]. The testing monitoring system includes surface monitoring units, borehole monitoring units, access points, and a data center. The surface monitoring units and borehole monitoring units serve as edge components, which are used to collect microseismic data and transmit them to the data center. The monitoring units are implemented on Xilinx Field-Programmable Gate Arrays (FPGAs), while the data center is based on an Intel server in a vehicle. The whole testing microseismic monitoring system with our proposed platform was used to measure real microseismic data in northeastern China during a hydraulic fracturing project.

During hydraulic fracturing monitoring, 36 edge components were deployed on the surface. Each edge component was connected to a three-axis sensor, so the total number of channels used in this measurement was 108. The 108-channel shot data from that project were used to train the neural network in the proposed detection algorithm, with 1400 samples in each channel. We labeled this training set according to the amplitude of the microseismic record and the shot time. One channel of the training data and its corresponding labels are shown in Figure 4. A testing data set was generated with 108 channels; each channel included 60,000 samples.

To verify that our platform is capable of detecting microseismic events with high accuracy even in a low SNR situation, we added additional Gaussian white noise, whose energy is three times that of the original signal, to the real data sensed by the testing system. Then, both the original data and the data with added noise were processed by the proposed detection algorithm and the STA/LTA algorithm, respectively. As in the simulations, a 4th-order bandpass filter was also used prior to the STA/LTA algorithm. One piece of these data and its classification results are shown in Figures 20 and 21.

In practical cases, microseismic data usually contain consecutive microseismic events with different energies. In Figure 20, two microseismic events with different energies are detected clearly by the proposed algorithm, while the microseismic event with lower energy is missed by the STA/LTA algorithm. This result shows that the STA/LTA algorithm easily misses low-energy events that are very close to a microseismic event with larger energy, while the proposed algorithm can reliably detect this type of event with high quality. In the low SNR case, shown in Figure 21, the proposed algorithm is capable of detecting the lower-energy microseismic event even though it is drowned in noise and can hardly be recognized by a human. Therefore, compared with the STA/LTA algorithm, the measurement results prove that the algorithm used in our platform has an advantage in detecting multiple events that take place within a very short time interval, even in a low SNR situation.
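The added-noise test above scales Gaussian white noise so that its energy is three times that of the recorded signal; a minimal version of that scaling is shown below (an energy ratio of 3 corresponds to an SNR of roughly −4.8 dB). The function name is a hypothetical helper introduced only for this illustration.

```python
import numpy as np

def add_noise_by_energy_ratio(x, ratio=3.0, rng=None):
    """Add zero-mean Gaussian white noise whose energy is `ratio` times
    the energy of the input record."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=float)
    noise_power = ratio * np.mean(x ** 2)
    return x + rng.normal(0.0, np.sqrt(noise_power), x.size)
```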
Figure 20. Original data and its classification results. (a) The original data; (b) The classification results of the STA/LTA algorithm; (c) The classification results of the proposed algorithm.
Figure 21. The data with extra noise and its classification results. (a) The data with extra noise; (b) The classification results of the STA/LTA algorithm; (c) The classification results of the proposed algorithm.
In this segment of measurement data, the ratio of transmitted data reduced is 93.1%. So, with the proposed platform, this testing microseismic monitoring system transmits data in a more efficient way, and its performance on real-time data collection is clearly improved.

5. Conclusions

In this work, the design of a new Edge-to-Center LearnReduce microseismic monitoring platform is presented, which consists of edge components and a data center. In this platform, an edge component is used not only for data transmission but also for event detection. Since microseismic signals and noise have different properties, a new event detection algorithm based on a neural network and a probabilistic inference is presented for edge computing. The proposed method was tested on both synthetic and measured microseismic signals. Through comparison with the STA/LTA method, the proposed method was found to improve the detection accuracy significantly even when the noise is strong. Moreover, after detection, the volume of data that needs to be transmitted to the data center can be reduced by about 90%. Therefore, according to the detection accuracy and the ratio of transmitted data reduced, we conclude that the presented platform is able to improve the efficiency of real-time microseismic monitoring, which means it has great potential to be used in practice.
Author Contributions: X.Z. and F.S. designed and performed the experiments; X.Z., F.S. and X.Z. analyzed the data; X.Z., F.S. and X.Z. wrote the paper; J.L., Z.C. and G.F. revised the paper.

Funding: This research was funded by the Natural Science Foundation of China (No. 41074074, No. 41304139), the Key Projects of Science and Technology Development Plan of Jilin Province (20160204065GX, SXGJSF2017-5), and the China Scholarship Council (No. 201706175023, No. 201706170171).

Conflicts of Interest: The authors declare no conflict of interest.
References

1. Maxwell, S. Microseismic hydraulic fracture imaging: The path toward optimizing shale gas production. Lead. Edge 2011, 30, 340–346. [CrossRef]
2. Le Calvez, J.; Malpani, R.; Xu, J.; Stokes, J.; Williams, M. Hydraulic fracturing insights from microseismic monitoring. Oilfield Rev. 2016, 28, 16–33.
3. Alexander, T.; Baihly, J.; Boyer, C.; Clark, B.; Waters, G.; Jochen, V.; Le Calvez, J.; Lewis, R.; Miller, C.K.; Thaeler, J.; et al. Shale gas revolution. Oilfield Rev. 2011, 23, 40–55.
4. Huang, W.; Wang, R.; Li, H.; Chen, Y. Unveiling the signals from extremely noisy microseismic data for high-resolution hydraulic fracturing monitoring. Sci. Rep. 2017, 7, 11996. [CrossRef] [PubMed]
5. Hydraulically Fractured Wells Provide Two-Thirds of U.S. Natural Gas Production. 2016. Available online: https://www.eia.gov/todayinenergy/detail.php?id=26112 (accessed on 5 June 2018).
6. Hefley, W.E.; Wang, Y. Economics of Unconventional Shale Gas Development; Springer: Cham, Switzerland, 2016.
7. Baig, A.; Urbancic, T. Microseismic moment tensors: A path to understanding FRAC growth. Lead. Edge 2010, 29, 320–324. [CrossRef]
8. Martínez-Garzón, P.; Bohnhoff, M.; Kwiatek, G.; Zambrano-Narváez, G.; Chalaturnyk, R. Microseismic monitoring of CO2 injection at the Penn West enhanced oil recovery pilot project, Canada: Implications for detection of wellbore leakage. Sensors 2013, 13, 11522–11538. [CrossRef] [PubMed]
9. Maxwell, S.C.; Urbancic, T.I. The role of passive microseismic monitoring in the instrumented oil field. Lead. Edge 2001, 20, 636–639. [CrossRef]
10. Zhu, Y.; Li, N.; Sun, F.; Lin, J.; Chen, Z. Design and application of a borehole–surface microseismic monitoring system. Instrum. Sci. Technol. 2017, 45, 233–247. [CrossRef]
11. Iqbal, N.; Al-Shuhail, A.A.; Kaka, S.I.; Liu, E.; Raj, A.G.; McClellan, J.H. Iterative interferometry-based method for picking microseismic events. J. Appl. Geophys. 2017, 140, 52–61. [CrossRef]
12. Lee, M.; Byun, J.; Kim, D.; Choi, J.; Kim, M. Improved modified energy ratio method using a multi-window approach for accurate arrival picking. J. Appl. Geophys. 2017, 139, 117–130. [CrossRef]
13. Akram, J.; Eaton, D.W. A review and appraisal of arrival-time picking methods for downhole microseismic data. Geophysics 2016, 81, KS71–KS91. [CrossRef]
14. Zhu, D.; Li, Y.; Zhang, C. Automatic Time Picking for Microseismic Data Based on a Fuzzy C-Means Clustering Algorithm. IEEE Geosci. Remote Sens. Lett. 2016, 13, 1900–1904. [CrossRef]
15. Chen, Y. Automatic microseismic event picking via unsupervised machine learning. Geophys. J. Int. 2017, 212, 88–102. [CrossRef]
16. Hogarth, L.J.; Kolb, C.M.; Le Calvez, J.H. Controlled-source velocity calibration for real-time downhole microseismic monitoring. Lead. Edge 2017, 36, 172–178. [CrossRef]
17. Li, Y.; Yang, T.H.; Liu, H.L.; Wang, H.; Hou, X.G.; Zhang, P.H.; Wang, P.T. Real-time microseismic monitoring and its characteristic analysis in working face with high-intensity mining. J. Appl. Geophys. 2016, 132, 152–163. [CrossRef]
18. Wu, F.; Yan, Y.; Yin, C. Real-time microseismic monitoring technology for hydraulic fracturing in shale gas reservoirs: A case study from the southern Sichuan Basin. Nat. Gas Ind. B 2017, 4, 68–71. [CrossRef]
19. Li, X.; Shang, X.; Wang, Z.; Dong, L.; Weng, L. Identifying P-phase arrivals with noise: An improved Kurtosis method based on DWT and STA/LTA. J. Appl. Geophys. 2016, 133, 50–61. [CrossRef]
20. Mousavi, S.M.; Langston, C.A.; Horton, S.P. Automatic microseismic denoising and onset detection using the synchrosqueezed continuous wavelet transform. Geophysics 2016, 81, V341–V355. [CrossRef]
21. Mousavi, S.M.; Langston, C.A. Adaptive noise estimation and suppression for improving microseismic event detection. J. Appl. Geophys. 2016, 132, 116–124. [CrossRef]
22. Li, X.; Li, Z.; Wang, E.; Feng, J.; Chen, L.; Li, N.; Kong, X. Extraction of microseismic waveforms characteristics prior to rock burst using Hilbert–Huang transform. Measurement 2016, 91, 101–113. [CrossRef]
23. Vera Rodriguez, I.; Bonar, D.; Sacchi, M. Microseismic data denoising using a 3C group sparsity constrained time-frequency transform. Geophysics 2012, 77, V21–V29. [CrossRef]
24. Jia, R.S.; Sun, H.M.; Peng, Y.J.; Liang, Y.Q.; Lu, X.M. Automatic event detection in low SNR microseismic signals based on multi-scale permutation entropy and a support vector machine. J. Seismol. 2017, 21, 735–748. [CrossRef]
25. Pugh, D.J.; White, R.S.; Christie, P.A. A Bayesian method for microseismic source inversion. Geophys. J. Int. 2016, 206, 1009–1038. [CrossRef]
26. Vera Rodriguez, I.; Sacchi, M.D. Microseismic source imaging in a compressed domain. Geophys. J. Int. 2014, 198, 1186–1198. [CrossRef]
27. Lin, J.; Zhang, X.; Wang, J.; Long, Y. The techniques and method for multi-hop seismic data acquisition based on compressed sensing. Chin. J. Geophys. 2017, 60, 4194–4203.
28. Zhao, R.; Yan, R.; Wang, J.; Mao, K. Learning to monitor machine health with convolutional bi-directional LSTM networks. Sensors 2017, 17, 273. [CrossRef] [PubMed]
29. He, Z.; Zhang, X.; Cao, Y.; Liu, Z.; Zhang, B.; Wang, X. LiteNet: Lightweight Neural Network for Detecting Arrhythmias at Resource-Constrained Mobile Devices. Sensors 2018, 18, 1229. [CrossRef] [PubMed]
30. Ma, X.; Dai, Z.; He, Z.; Ma, J.; Wang, Y.; Wang, Y. Learning traffic as images: A deep convolutional neural network for large-scale transportation network speed prediction. Sensors 2017, 17, 818. [CrossRef] [PubMed]
31. Mousavi, S.M.; Horton, S.P.; Langston, C.A.; Samei, B. Seismic features and automatic discrimination of deep and shallow induced-microearthquakes using neural network and logistic regression. Geophys. J. Int. 2016, 207, 29–46. [CrossRef]
32. Gu, J.; Wang, Z.; Kuen, J.; Ma, L.; Shahroudy, A.; Shuai, B.; Liu, T.; Wang, X.; Wang, G.; Cai, J.; Chen, T. Recent advances in convolutional neural networks. Pattern Recognit. 2018, 77, 354–377. [CrossRef]
33. Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
34. Lin, M.; Chen, Q.; Yan, S. Network in network. arXiv 2013.
35. Huang, G.; Liu, Z.; Weinberger, K.Q.; van der Maaten, L. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; Volume 1, p. 3.
36. Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [CrossRef] [PubMed]
37. Greff, K.; Srivastava, R.K.; Koutník, J.; Steunebrink, B.R.; Schmidhuber, J. LSTM: A search space odyssey. IEEE Trans. Neural Netw. Learn. Syst. 2017, 28, 2222–2232. [CrossRef] [PubMed]
38. Farzad, A.; Mashayekhi, H.; Hassanpour, H. A comparative performance analysis of different activation functions in LSTM networks for classification. Neural Comput. Appl. 2017, 1–5. [CrossRef]
39. Han, S.; Kang, J.; Mao, H.; Hu, Y.; Li, X.; Li, Y.; Xie, D.; Luo, H.; Yao, S.; Wang, Y.; Yang, H. ESE: Efficient speech recognition engine with sparse LSTM on FPGA. In Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Monterey, CA, USA, 22–24 February 2017; pp. 75–84.
40. Yin, S.; Tang, S.; Lin, X.; Ouyang, P.; Tu, F.; Liu, L.; Wei, S. A high throughput acceleration for hybrid neural networks with efficient resource management on FPGA. IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst. 2018. [CrossRef]
41. Gao, N.; Zheng, F.; Wang, X.; Jiang, X.X.; Lin, J. High-speed download of seismographs using private cloud technology and a proportional integral derivative controller. Instrum. Sci. Technol. 2016, 44, 12–22. [CrossRef]

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).