Embedding Intelligent Sensor Signal Change Detection into Sensor Network Protocols Leon Reznik, Gregory Von Pless, Tayeb Al Karim Department of Computer Science Rochester Institute of Technology Rochester, NY 14623 USA
Abstract - The paper develops intelligent protocols that enhance efficiency, reliability, and security through detection of sensor signal changes, and describes their possible implementation at the lower levels of sensor networks. Embedded into the communication protocols, signal change detection allows data compression that improves network efficiency; it may also enhance reliability and security. The protocol uses a neural network function prediction methodology to determine whether the sensor signals have changed. In addition to the change detection system, a modification of a standard neural network function predictor is proposed that allows the detection system to learn quickly how to predict the next sensor outputs accurately. The choice of parameters and the relationship between threshold values and false positive and false negative rates are studied. The protocol is implemented and tested in real-life environments with sensor networks built from Crossbow MICA-2 motes. The Sensor Network Anomaly Detection System, which is designed to become the protocol’s core utility, is described. The test results are analyzed and recommendations on applications are provided.
Keywords - Sensor networks, neural networks, signal change detection.
I. INTRODUCTION
Sensor networks are a new technology envisioned as the data acquisition part of complex adaptive network systems that cover the whole life of information within the system: from gathering, through communication, to storage and use. They are considered one of the most promising emerging computing technologies. Foreseeable applications of distributed sensor networks will help protect and monitor military, environmental, and safety-critical infrastructures and resources [1]. Typically, sensor networks are constrained in power consumption, computational capacity, and reliability of the input data. These constraints may make the measurements taken from each particular sensor less accurate than those of conventional standalone sensors. Reliability of result delivery is also low, because a sensor signal may fail to reach the base station or the processing unit for a variety of reasons (loss of power, a hacker’s attack, a long route). This new technology requires the development of new paradigms and concepts and their implementation in new protocols that improve overall operational efficiency, which might be evaluated as accuracy, reliability, and/or security.
Although sensor networks have many features in common with their mobile ad hoc relatives, the networking of sensor systems and the resulting ability to acquire a huge number of signals from various sensors synchronously have not yet received proper attention in system design. As such, existing network protocols, including those developed for mobile networks, might be a poor fit for many application domains. The natural advantage of combining multiple sensor systems into networks is the possibility of organizing some form of collaboration between them. Fusing signals from multiple sensors could significantly improve coverage, reliability, and security, and could enable data compression at the lower levels. Collaborative signal and information processing over a network is a new area of research related to distributed information fusion. Fusion approaches range from simple rules for picking the best result to model-based techniques that consider how the information is generated. Among recent results, one should mention the further development of data association in the detection, tracking, and classification of targets. Zhao et al. [2] developed the information-driven sensor querying approach, which enables collaboration based on resource constraints and the cost of transmitting information. This line of investigation has been followed up. Michiardi et al. [3] go further by generalizing the rating mechanism: the neighbors of any single node collaborate in rating the node according to how well it executes the functions requested of it. The difficulty of this scheme lies in how an evaluating node can evaluate the result of a function executed by the evaluated node. Many of the examples (see [4] for a review) and simple applications are based on some kind of aggregation function performed at the aggregate nodes.
Traditionally, sensor network protocol designers consider sensor networks simply as a communication system transmitting information in one direction from sensors to processing units. The fact that sensor networks are a concurrent data acquisition system is often neglected. Sensor networks bring up a wealth of information on the object or process under measurement and the environment, and this information can be applied for increasing the efficiency of sensor networks applications as well as for enhancing their own security and reliability. The key difficulty is in the
0-7803-9012-1/05/$20.00 (C) 2005 IEEE
development of decision making and control mechanisms and the associated costs when concurrent data collection takes place. Although change detection in signals has been investigated for a considerable time, there have been important new developments in recent years. The literature on change detection is growing rapidly, mainly due to applications in engineering, financial mathematics, and econometrics. Change detection methods have developed into various models, which might be classified as likelihood ratio tests [8-10], nonparametric approaches [11-13], linear model approaches [14-17], and intelligent techniques [18-20]. Csorgo and Horvath [21] provide a concise overview and rigorous mathematical treatment of change detection methods and use a number of datasets to illustrate the effectiveness of the various techniques. An important feature of sensor network (SN) applications is the opportunity to make decisions based on an association of the signals delivered by the network to an aggregate node. This capability is expected to increase the efficiency of SN applications. Applications of intelligent techniques to personal and technical systems monitoring based on SNs were reported recently [22-24]. Because SN technology is new, some fundamental design questions have not been solved yet. This paper investigates the application of intelligent techniques to change detection. It develops the Sensor Network Anomaly Detection System (SNADS), with the goal of growing it into a synergetic framework of various intelligent agents that make a detection decision. Results obtained previously by the first author [25-29] indicate that neural networks and fuzzy models are feasible for describing the kind of uncertainty anticipated in this problem.
From a generic point of view, neural network models seem appropriate as a basis for developing fundamental algorithms for software that exploits the opportunities arising from the networking of sensor systems. In recent years, neural network research has paid particular attention to predicting chaotic time series, because of the networks' universal approximation capabilities [30]. Most applications in this field are based on feed-forward neural networks, such as the Back Propagation (BP) network [31] and the Radial Basis Function (RBF) network [32]. Defining a suitable number of hidden nodes is a difficult problem in these networks. Besides, being static, these networks are limited in their ability to identify chaotic dynamical systems. Recurrent neural networks (RNNs), which include internal feedback, are more general models than feed-forward networks and are widely used for the prediction of time series [33,34]. Although the gradient descent algorithms commonly used to train RNNs exhibit certain problems during training, such as lack of stability and difficulty in dealing with long-term dependencies in the time series, RNNs are computationally more powerful than feed-forward networks [35,36], and valuable approximation results have been obtained for prediction problems. The purpose of this paper is to present a new methodology for detecting change in sensor networks. Novelty can result
from a change in a sensor’s environment, malicious activity, or sensor malfunction. This paper does not attempt to differentiate between these causes or to investigate variations arising from changes caused by different sources. The presented implementation uses prediction to determine change, in effect computing a confidence level for each sensor in the network. However, existing neural network function prediction methodologies seem inadequate for this task. A background of commonly used neural network methodologies is given in Section II. That section also presents a new function prediction neural network model that is simpler and learns and runs faster than the classical time-based multi-layer perceptron. This network excels at periodic function prediction, a problem that has traditionally been considered very difficult. To this end, an application has been created that allows any set of sensors to be used as inputs and any topology of a time-based multi-layer perceptron or a modified time-based multi-layer perceptron to be used in prediction. The application is explained in Section III. Section IV explains a set of experiments that test the implementation and investigate the parameter choice and relationship; their results are presented in Section VI. Section V describes a protocol implementation through development of the Sensor Network Anomaly Detection System, which is designed to become a synergetic collection of intelligent agents acting as core functions of a new protocol family. That section describes the SNADS modular design and implementation and gives examples of GUI implementations for different sensor network applications. Section VI provides the results of implementing the developed protocol and testing it both by computer simulation and in real-life sensor network applications that monitor different objects and environments.
Finally, Sections VII and VIII present analyses of the results and recommendations on efficient use of the system, respectively.
II. NEURAL NETWORK FUNCTION PREDICTORS
The basic idea of the methodology presented in this paper is to detect change in a set of networked sensor signals from the difference between the current signal values and their predicted values. A neural network is employed to produce these predictions from the recent history. One advantage of using neural networks for novelty detection is that they can be applied with no a priori information on the data distribution, and no data-specific parameters need to be set. This makes it possible to detect changes and then relearn the new signal values for use in further change detection. Singh and Markou [41] considered approaches to functional prediction design based on auto-associators, on Kohonen Self Organizing Networks, on multilayer perceptrons, on support vectors, and on other not widely applied
Figure 1. The data flow for the change detection application
methods. The MLP probably has the simplest linear structure among the neural networks considered above. Coupled with its high parallelism, this makes detection algorithms based on the MLP very computationally efficient, which facilitates their implementation at the lower levels of sensor networks, possibly on the motes themselves, under strong memory and power constraints. To analyze the signals with an MLP, time dependence needs to be introduced into its architecture. A time-based MLP is simply a backpropagation-trained MLP whose inputs are time-delayed values. Generally, time-based MLPs are used as predictors for functions of the form y = f(x). Our experiments (see section VII) revealed certain limitations of time-based MLPs, which required the further modifications made in this project. The modifications are explained in the subsections below, together with a comparison of the new structure with the conventional one and its advantages. The Modified Time-Based Multi-Layer Perceptron (MTBMLP) consists of multiple time-based MLPs (referred to in this paper as TimeNets), all connected to a single end MLP (the MainNet). Each TimeNet is associated with a single function. The MainNet is responsible for predicting the next value for each function. The purpose of altering the standard time-based MLP structure is threefold: the modification reduces connections, isolates knowledge for each function, and produces knowledge about the system of functions as a whole. The selection of the threshold is an important consideration. According to Li et al. [47], the best threshold should ideally maximize a function based on the log of the actual output of the network and the threshold output. They show that using 0.5 as a threshold on a given output node is not a good choice, since test data may lie outside the input data space. The output of a neural network can also be modified by training it to reduce mutual information [48].
III. CHANGE DETECTION APPROACHES
A. Change Detection Procedure The solution to the change detection problem discussed here is implemented around the concept of function prediction. The idea is that if a neural network can learn to predict the function associated with a given sensor’s outputs, then the system can
compare the knowledge generated by the network with the actual next value read from the sensor. If these are similar (the absolute value of their difference is less than a predetermined threshold), the sensor’s output can be, for the purposes of change detection, ignored. However, if the two values are dissimilar, then the sensor’s trend has changed and the system alerts the user to this. Finally, the network must re-habituate itself to the sensor, as a permanent change in trend eventually loses its novelty. Due to this requirement, the network is given an error goal and is only allowed to skip the training algorithm when all of its errors are below this value.
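The decision rule described above can be written compactly. The symbols are introduced here only for illustration and do not appear in the original text: $y_i(t)$ is the value read from sensor $i$ at iteration $t$, $\hat{y}_i(t)$ is the network's prediction of that value, and $\theta$ is the predetermined change threshold.

```latex
\[
\mathrm{novel}_i(t) =
\begin{cases}
1, & \left| y_i(t) - \hat{y}_i(t) \right| \ge \theta \\
0, & \text{otherwise}
\end{cases}
\]
```

Re-habituation follows from the same error term: writing $\varepsilon$ for the error goal (again a symbol assumed here), training may be skipped in an iteration only when every error $\left| y_i(t) - \hat{y}_i(t) \right|$ is below $\varepsilon$. When $\varepsilon < \theta$, any reading flagged as novel therefore also triggers retraining, which is how a permanent change eventually stops being reported.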
B. How the MTBMLP is Used In sensor networks the outputs of various sensors are related (e.g. one sensor is always low while another is high), so being able to correlate is very important. On the other hand, when change is detected in one sensor, the network should be robust enough that as it learns the new sensor trend, it continues to predict other unchanged sensors. The knowledge isolation in the MTBMLP provides this functionality.
C. The Change Detection Application The application developed to perform the change detection simulates sensors with mathematical functions (periodic, sawtooth, and square curves), provides their outputs to an MTBMLP, and then compares the outputs of the network with the next outputs of the sensors. The MTBMLP is implemented in such a way that the TimeNets can have any topology (they are all identical) and the MainNet may also have any topology. In addition, sensors are represented by an abstract class, so more simulation functions can be created, or classes that read from real sensors can be implemented and used in the application. This modularity makes the system practical for real-world applications. Currently, the system runs through a simple command-line interface that recognizes twelve commands. These commands allow the user to create sensors, build a network, run the network, and change sensors to test change detection.
IV. EXPERIMENTS
This section provides detailed descriptions of the experimental procedures used in testing. The results of these experiments are shown in Section VI and discussed in detail in Section VII.
A. General Procedure The change detection system uses the following procedure for training, detection, and adaptation. An MTBMLP is created with one TimeNet, with a specified number of time delays, for each sensor in the sensor network and a MainNet with an output for each sensor. An error goal and a threshold are set. The error goal defines when training can be skipped: if, for a given iteration, all of the prediction errors are below the specified error goal, backpropagation is not applied for that iteration, which speeds it up. The threshold specifies the minimum
difference between the prediction and the actual next value that is considered novel. Once all of these values are specified and the network is created, each iteration is structured as follows. Each sensor is polled for its next value. These are set aside for later use. An array of the sensor outputs from time (t – 1) is presented to the neural network. The network compares these values to its outputs for the previous iteration (i.e. the predictions are compared to the actual values). If there are any errors greater than the error goal, backpropagation is applied. Then the next predictions are obtained from the network. Finally, these current predictions are compared to the current values that had been set aside at the beginning of the iteration. If any prediction is farther from its corresponding actual value than the set threshold allows, that sensor’s output is considered novel. The current sensor values are stored as the previous values for the next iteration and the process is repeated.
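The iteration just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `LastValuePredictor` is a hypothetical stand-in for the MTBMLP (it simply predicts that each sensor repeats its last value and counts training requests), and the names `run_detection`, `poll`, `predict`, and `train` are invented for this sketch.

```python
from typing import Callable, List, Tuple


class LastValuePredictor:
    """Hypothetical stand-in for the MTBMLP: predicts that every sensor
    repeats its previous value and only counts training requests."""

    def __init__(self) -> None:
        self.train_calls = 0

    def predict(self, previous_values: List[float]) -> List[float]:
        return list(previous_values)

    def train(self, targets: List[float]) -> None:
        # A real network would run backpropagation toward `targets` here.
        self.train_calls += 1


def run_detection(predictor, poll: Callable[[int], List[float]],
                  n_iterations: int, error_goal: float,
                  threshold: float) -> List[Tuple[int, List[int]]]:
    """Return (iteration, [novel sensor indices]) for each iteration with an alert."""
    previous_values = poll(0)
    previous_predictions = None  # nothing has been predicted yet
    alerts = []
    for t in range(1, n_iterations):
        current_values = poll(t)  # poll every sensor and set the values aside
        # Compare the previous iteration's predictions with the values actually read.
        if previous_predictions is not None:
            errors = [abs(a - p) for a, p in zip(previous_values, previous_predictions)]
            if any(e > error_goal for e in errors):
                predictor.train(previous_values)
        # Predict the current values from the values at time t - 1.
        predictions = predictor.predict(previous_values)
        # A sensor is novel when its prediction misses by at least the threshold.
        novel = [i for i, (p, a) in enumerate(zip(predictions, current_values))
                 if abs(a - p) >= threshold]
        if novel:
            alerts.append((t, novel))
        # The current values become the previous values for the next iteration.
        previous_values, previous_predictions = current_values, predictions
    return alerts


def poll(t: int) -> List[float]:
    # Two simulated sensors; sensor 1 jumps from 0.2 to 0.9 at iteration 5.
    return [0.5, 0.2 if t < 5 else 0.9]


predictor = LastValuePredictor()
alerts = run_detection(predictor, poll, n_iterations=10,
                       error_goal=0.01, threshold=0.2)
print(alerts, predictor.train_calls)
```

In this toy run the jump in sensor 1 is flagged once, training is requested once in the following iteration, and the unchanged sensor 0 is never flagged, which mirrors the detect-then-re-habituate behavior described in the text.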
B. Constants and Variables In the experiments that will be described below, the following standard heuristics were applied. Each TimeNet has 3 input nodes (3 time delays), a hidden layer with 3 nodes, and a single output node. The MainNet has as its input layer the output nodes of the TimeNets; it has a single hidden layer with twice the number of sensor nodes and an output node for each sensor. Finally, the error goal is 0.01 (this value only really matters in real time application as it is supposed to act simply as a time saver). Two values were changed as part of testing: the change threshold and the number of iterations the system was given in which to detect the change.
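To make the topology concrete, the following sketch builds an MTBMLP with the sizes given above and runs one forward pass. It is an illustration only: the tanh activation, the bias terms, the weight range, and all names (`MTBMLP`, `make_layer`, `forward`, `connection_count`) are assumptions not taken from the paper, "twice the number of sensor nodes" is read here as twice the number of sensors, and backpropagation training is omitted.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """One fully connected layer: n_out units, each with n_in weights plus a bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def forward(layer, inputs):
    """Apply a layer; tanh is an assumed activation (the paper does not name one)."""
    # zip() stops at len(inputs), so the trailing bias is added separately.
    return [math.tanh(sum(w * x for w, x in zip(unit, inputs)) + unit[-1])
            for unit in layer]


class MTBMLP:
    def __init__(self, n_sensors, delays=3, time_hidden=3, seed=0):
        rng = random.Random(seed)
        # One TimeNet per sensor: `delays` inputs -> `time_hidden` hidden -> 1 output.
        self.time_nets = [
            [make_layer(delays, time_hidden, rng), make_layer(time_hidden, 1, rng)]
            for _ in range(n_sensors)
        ]
        # MainNet: TimeNet outputs -> 2 * n_sensors hidden -> one output per sensor.
        self.main_net = [
            make_layer(n_sensors, 2 * n_sensors, rng),
            make_layer(2 * n_sensors, n_sensors, rng),
        ]

    def predict(self, histories):
        """histories[i] holds the last `delays` readings of sensor i."""
        time_outputs = []
        for (hidden_layer, output_layer), history in zip(self.time_nets, histories):
            time_outputs.append(forward(output_layer, forward(hidden_layer, history))[0])
        hidden = forward(self.main_net[0], time_outputs)
        return forward(self.main_net[1], hidden)

    def connection_count(self):
        """Number of weights, excluding biases."""
        layers = [layer for net in self.time_nets for layer in net] + self.main_net
        return sum(len(unit) - 1 for layer in layers for unit in layer)


net = MTBMLP(n_sensors=4)
histories = [[math.sin(0.8 * (t + s)) for t in range(3)] for s in range(4)]
predictions = net.predict(histories)
print(len(predictions), net.connection_count())
```

With n sensors this layout has 12n weights in the TimeNets and 4n² in the MainNet (biases excluded), so the four-sensor network of Experiment 1 would have 112 connections under these assumptions.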
C. Experiment 1 – Uncorrelated data The goal of this experiment is to study how the detection and false alarm rates depend on the detection threshold and on the number of iterations given for detection. In this experiment the false alarm rate is the ratio of the number of detections occurring outside the iterations given for detection and learning to the total number of iterations outside those given for detection and learning. The detection rate is the ratio of detections to the total number of sensor changes (in each test there is a single sensor change, and any number of detections within the detection time is counted as a single successful detection, provided it is for the proper sensor). This experiment consisted of 1800 tests. In each test, 4 sine curve sensors were learned for 4000 iterations. After this, one sensor was changed and the system was run for another 4000 iterations. Each test was conducted with “training times” of 1000, 100, 10, and 1, and with thresholds of 0.1, 0.15, 0.2, 0.25, and 0.3. The training time defines both how many iterations are given for detection and how many iterations are not counted towards the false alarm rate. So, a training time of 1000 means that the initial 1000 iterations for all sensors and the 1000 iterations after the change for the
        Set 1                     Set 2
ID      Fi      Ff        ID      Fi      Ff
0       0.6     0.6       0       0.8     0.8
1       0.4     0.4       1       0.5     0.5
2       0.2     0.2       2       0.2     0.2
3       0.2     0.15      3       0.8     0
4       0.4     0.2       4       0.2     0
5       0.6     0.08      5       0       0
Figure 2. Initial and final frequencies for sensors in the correlated frequency tests. Rows in bold indicate sensors that were changed after 4000 iterations.
changed sensor are not considered when calculating the false alarm rate. Also, this 1000 means that if the changed sensor fires as novel at least once during the 1000 iterations following the change, it is considered a detection. The experiment was split up into three categories: changes in frequency, changes in amplitude, and sensors flat lining at zero. For the flat lining tests, the four initial sine sensors had high values +/- 0.1 from 0.8, low values +/- 0.1 from -0.8, frequencies +/- 0.1 from 0.8, and no phase shifts. After 4000 iterations, the first sensor was set to a horizontal line at y = 0. In the frequency tests the sensors were initialized with high equal to 0.8 and low equal to -0.8. The initial frequencies were between 0.25 and 2. The same range was used to choose the new frequency for the changed sensor. Finally, in the amplitude tests, the high values were initialized between 0.25 and 1, the low values were initialized between -1 and -0.25, and the frequencies were initialized +/- 0.1 from 0.8. The same high and low ranges were used when these values were changed after 4000 iterations for a single sensor. Within each category, the different thresholds and training times were used. Each threshold and training time combination was tested 30 times, making 600 tests per category. Each individual test involved the random initializations of sensors as described above, plus the random initialization of the weights in the network.
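The two rates defined for this experiment can be computed per test as sketched below. The function and argument names are hypothetical, and one detail is an assumption: the text does not say whether "iterations" in the false alarm denominator are counted per sensor, so this sketch counts every (iteration, sensor) pair that is not excluded by the training time.

```python
def score_test(firings, changed_sensor, change_at, training_time,
               n_sensors, n_iterations):
    """Return (false_alarm_rate, detected) for a single test.

    firings is a set of (iteration, sensor) pairs that were flagged as novel.
    """
    def in_detection_window(t, s):
        # The training_time iterations after the change, for the changed sensor only.
        return s == changed_sensor and change_at <= t < change_at + training_time

    def excluded(t, s):
        # The initial training_time iterations are excluded for all sensors.
        return t < training_time or in_detection_window(t, s)

    counted = sum(1 for t in range(n_iterations) for s in range(n_sensors)
                  if not excluded(t, s))
    false_alarms = sum(1 for (t, s) in firings if not excluded(t, s))
    # Any number of firings inside the window counts as one successful detection.
    detected = any(in_detection_window(t, s) for (t, s) in firings)
    return false_alarms / counted, detected


# Hypothetical test: 4 sensors, 8000 iterations, sensor 0 changed at iteration 4000.
firings = {(500, 2), (4003, 0), (4500, 0), (6000, 1)}
rate, detected = score_test(firings, changed_sensor=0, change_at=4000,
                            training_time=1000, n_sensors=4, n_iterations=8000)
print(rate, detected)
```

In this made-up example only the firing of sensor 1 at iteration 6000 counts as a false alarm; the firing at iteration 500 falls in the initial learning period, and the two firings of sensor 0 fall in its detection window and together count as one detection.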
D. Experiment 2 – Correlated data The goal of this experiment is to determine the effect of data correlation on the results from the previous experiment. The MTBMLP is designed to separate knowledge and then correlate it, so it is logical to believe that correlated data would be easier for the system to learn and detect novelties in. The false alarm and detection rates are defined as in the previous experiment. This experiment consisted of 2400 tests. In each test there were 6 sine sensors in pairs of equivalent sensors (i.e. sensor 0 and sensor 5 were equivalent, sensor 1 and sensor 4 were equivalent, and sensor 2 and sensor 3 were equivalent). These 6 sensors were once again learned for 4000 iterations. Then, three of the sensors were changed in some way and the system was run for another 4000 iterations. The tests in this experiment were conducted for the same training times and thresholds.
Set 1
ID      Li      Hi      Lf      Hf
0       -0.6    0.6     -0.6    0.6
1       -0.4    0.4     -0.4    0.4
2       -0.2    0.2     -0.2    0.2
3       -0.2    0.2     -0.15   0.15
4       -0.4    0.4     -0.2    0.2
5       -0.6    0.6     -0.08   0.08

Set 2
ID      Li      Hi      Lf      Hf
0       -0.8    0.8     -0.8    0.8
1       -0.5    0.5     -0.5    0.5
2       -0.2    0.2     -0.2    0.2
3       -0.8    0.8     0       0
4       -0.2    0.2     0       0
5       0       0       0       0
Figure 3. Initial and final minimums (L) and maximums (H) for sensors in the correlated amplitude tests. Rows in bold indicate sensors that were changed after 4000 iterations.
This experiment was split into two 1200-test categories: frequency changes and amplitude changes. Within each category there were two sensor sets, as shown in Figures 2 and 3. The correlated tests were conducted in the same manner as the uncorrelated ones, using the different threshold values and training times and randomly initializing the network at the start of each test.
Figure 4. The protocol’s graphical user interface design allows SnadsComponents to be “wired” together in a way that intuitively captures how data flows through the system.
V. PROTOCOL IMPLEMENTATION AND SOFTWARE DESIGN
The main part of the protocol will be the Sensor Network Anomaly Detection System (SNADS). SNADS is designed to be modular, extensible, robust, and scalable. By providing a generic sensor abstraction and sensor-definable configuration mechanisms, SNADS will allow for simple, secure management of arbitrary sensor networks. By supporting network nodes with different hardware and software configurations, SNADS will be a versatile cross-platform tool. Modularity is achieved via a central signaling system, which allows components to work together as black boxes. This setup allows for simple runtime reconfiguration of SNADS and minimizes the damage that a malfunctioning component can cause. The messaging system is also highly extensible. Components can be added, removed, or replaced on the fly by the administrator. Updating to a new version of a component can be done without shutting down SNADS or the network. A SNADS network should be scalable to an arbitrary degree: the architecture is intended to handle anything from a simple four-node test network to a thousand-node perimeter monitoring system. Because of the message passing abstraction, various
Figure 5. Tabular display of the measurement results for protocol testing
SNADS components could easily be located on separate units, and the workload of a single component could be spread across multiple computing units. The SNADS system is designed to have the following components: 1. The signal routing subsystem lies at the core of SNADS. By registering with this system, a component will be able to send events to and receive events from other SNADS components.
[Plot for Figure 6: "Detection Rates for Amplitude Changes for Different Training Times". Horizontal axis: Threshold (0.1 to 0.3); vertical axis: detection rate; one curve per training time (1000, 100, 10, 1).]
[Plot for Figure 8: "False Alarm Rates". Horizontal axis: Threshold (0.1 to 0.3); vertical axis: false alarm rate; curves: Freq Rand, Freq Corr, Amp Rand, Amp Corr.]
Figure 6. This graph demonstrates the relationship between threshold and detection rate. Also, one can see that the training time has very little effect on the detection rate.
Figure 8. This graph demonstrates the relationship between threshold and false alarm rates for correlated and uncorrelated data. One can see that both of the curves for correlated data are lower and have less drastic changes than those for uncorrelated data.
[Plot for Figure 7: "Flat Line False Alarm Rates for Different Training Times". Horizontal axis: Threshold (0.1 to 0.3); vertical axis: false alarm rate; one curve per training time (1000, 100, 10, 1).]
[Plot for Figure 9: "Detection Rates". Horizontal axis: Threshold (0.1 to 0.3); vertical axis: detection rate; curves: Freq Rand, Freq Corr, Amp Rand, Amp Corr.]
Figure 7. This graph demonstrates the relationship between threshold and false alarm rate. Also, one can see that the training time has very little effect on the false alarm rate.
2. The sensor management subsystem maintains a collection of JSensor objects, which provide the basic abstraction for sensors in the network. The sensor manifold object directly handles all incoming sensor notifications and prepares them for use by other components such as the ADS or GUI. 3. Graphical User Interface (GUI) – see figure 4. The SNADS interface allows for simple central monitoring and management of a network. Sensors can be dropped from the network or individually configured. Sensor specific configuration dialogs integrate seamlessly into the rest of the GUI. Alerts from the ADS are reported to the GUI, which notifies administrators of potential problems. Additionally, the GUI supports database, session, and ADS configuration. While the GUI elements in figure 4 are standard to the distributed SNADS management, figure 5 shows a simple tabular data display implemented by an external component. In this particular example the Data Display tab was sent from a component on a remote machine, and its content is determined entirely remotely.
Figure 9. This graph demonstrates the relationship between threshold and detection rates for correlated and uncorrelated data. One can see that both of the curves for correlated data are higher than those for uncorrelated data. Note that the curve for correlated amplitude changes drops dramatically for high thresholds because the changes in that set of tests were very small.
4. Anomaly Detection System (ADS) lies at the heart of SNADS. While a default component will be designed, arbitrary user-defined ADS implementations can easily be used instead of the default. Based on sample data, the ADS looks for potentially erroneous or malicious incoming sensor readings, marks such data as dirty, and sends notification to the messaging center. 5. The database subsystem provides a simple interface for data logging and searching. Each session is stored separately in the database and the format for sensor readings can be different across sessions. Data cataloged by the SNADS database system is easily accessible for later analysis if so desired. Depending on the specific SNADS application, however, the database may not be used at all.
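The component list above centers on a signal routing subsystem with which components register in order to exchange events. A minimal publish/subscribe sketch of that idea follows. It is written in Python for illustration and is not the SNADS API: `SignalRouter`, `register`, `unregister`, `send`, the event names, and the toy range check used as an ADS are all hypothetical.

```python
from collections import defaultdict


class SignalRouter:
    """Hypothetical central router: components register handlers per event type."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def register(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def unregister(self, event_type, handler):
        # Components can be removed or replaced while the system keeps running.
        self._handlers[event_type].remove(handler)

    def send(self, event_type, payload):
        """Deliver payload to every registered handler; return the success count."""
        delivered = 0
        for handler in list(self._handlers.get(event_type, [])):
            try:
                handler(payload)
                delivered += 1
            except Exception:
                # A malfunctioning component must not take down the others.
                # A real system would log the failure here.
                continue
        return delivered


router = SignalRouter()
gui_alerts = []  # stands in for the GUI's alert display


def simple_ads(reading):
    # Toy anomaly rule (assumed): mark readings outside [0, 1] as dirty.
    if not 0.0 <= reading["value"] <= 1.0:
        router.send("alert", {**reading, "dirty": True})


def broken_component(reading):
    raise RuntimeError("simulated malfunction")


router.register("reading", broken_component)
router.register("reading", simple_ads)
router.register("alert", gui_alerts.append)

router.send("reading", {"sensor": 3, "value": 0.4})
delivered = router.send("reading", {"sensor": 3, "value": 7.5})
print(delivered, gui_alerts)
```

Because components only see events, the failing component in this example does not prevent the ADS from receiving the reading or the GUI from receiving the alert, which is the isolation property the modular design aims for.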
Figure 10. Screenshot of the experiment with Crossbow mote kits in which the light level flickers as the lights are switched on and off.
VI. RESULTS
The results of the experiments described in Section IV demonstrate distinct trends with regard to different training times and threshold values. With respect to training times, the detection rates and false positive rates remained relatively unchanged with different training times at any given threshold. With respect to the threshold, as expected, detection rates went down with higher thresholds, as did false alarm rates. The results also demonstrated that the system performs better when presented with correlated data rather than randomly generated data, as shown in Figure 8 and Figure 9. Real-life experiments with the change detection software were carried out in sensor networks built from Crossbow Inc. mote kits (MICA type), which included
three sensors (light, temperature, ultrasound) each. The data was stored in separate Excel files according to what was collected and the type of environment in which it was collected. The data was organized by the motes that were used and the attributes that were applied. For this particular project the data was collected in three environments, using exactly the same attributes in each: Indoors, Outdoors, and the Database Lab at RIT. Both statistical and heuristic techniques were employed in the analysis of the results. A k-means clustering algorithm was used to organize the data into distinct groups, which allow the end user to analyze the end results with better precision. The k-means clustering algorithm was selected for the heuristic analysis because the data was not collected in any particular order, and the way the values were dispersed suggested that a clustering algorithm would do a better job than working with the raw data directly.
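For readers unfamiliar with k-means, the sketch below groups one-dimensional readings into k clusters. It is only an illustration of the general algorithm: the paper does not state the number of clusters, the features, or the initialization used, so the evenly spaced starting centroids, the function name `kmeans_1d`, and the sample readings are all assumptions.

```python
def kmeans_1d(values, k, max_iter=100):
    """Cluster scalar readings with Lloyd's algorithm; returns (centroids, clusters)."""
    lo, hi = min(values), max(values)
    # Assumed initialization: centroids spread evenly across the data range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: each reading joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return centroids, clusters


# Hypothetical, unordered temperature readings from two environments.
readings = [20.1, 30.2, 20.4, 29.9, 19.8, 30.5]
centroids, clusters = kmeans_1d(readings, k=2)
print(centroids, clusters)
```

The readings are deliberately given out of order to show that the grouping does not depend on collection order, which is the property the text cites as the reason for choosing a clustering approach.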
Figure 11. Screenshot of the experiment with Crossbow Inc. mote kits and changing the temperature due to opening-closing a door
The experiments shown in Figures 10 and 11 were conducted in a student dorm. The light and temperature were changed by opening and closing a building door and by switching lights on and off. The ANN is designed to predict the values of the sensor signals. A change is detected when the difference between the actual and predicted values exceeds the threshold value. In the screenshots shown in the figures, the top left region plots the actual and predicted sensor signals, and the bottom left region plots the error. In the experiment shown in Figure 10 the light is flickering. A change is detected when the light starts and stops flickering, as well as when the flickering frequency changes. The detection system is therefore able to detect changes not only in signal levels but also in sensor signal shapes and parameters, which is a more complicated problem. The same ANN is capable of handling detection in both cases, as well as adapting to the new pattern after the change has occurred and been registered. In the experiment shown in Figure 11, the temperature changes gradually due to the door opening. The system does not react to this small change.
VII. ANALYSIS
The results of the experiments conducted demonstrate that the proposed change detection system is both accurate and
reliable. The table (Figure 12) shows that for low thresholds (0.2 or smaller) the detection rate is greater than or equal to 85% (with the exception of the correlated amplitude change tests, which are addressed later). The false alarm rates for the same thresholds remained below 3.5%. Furthermore, a significant difference between the detection and false alarm rates for correlated and uncorrelated sensor sets and changes can be seen, especially in the tests related to frequency (again with the exception of the correlated amplitude tests, discussed below). These results confirm the benefits anticipated from the modified neural network and demonstrate its usefulness in sensor prediction and change detection. The low false alarm rate shows that the knowledge separation within the network does indeed make for a more robust predictor. The fact that detection rates and false alarm rates improve when the data and changes presented to the system are correlated is positive evidence that the combination of the TimeNet outputs in the MainNet of the MTBMLP allows the system to learn the inherent relationships between various data sources. Finally, the results for the correlated amplitude tests must be explained. It is known, and will be discussed in more detail in the next section on limitations of the proposed system, that detections can be missed if the changes result in new sensor
False Alarm Rates, Averaged for all Training Times
Test / Threshold          0.1       0.15      0.2       0.25      0.3
Sensor Fails – UC         2.640%    1.098%    0.569%    0.298%    0.170%
Frequency Changes – UC    3.350%    0.832%    0.725%    0.451%    0.272%
Amplitude Changes – UC    3.075%    1.307%    0.758%    0.359%    0.167%
Frequency Changes – C     1.39%     0.48%     0.29%     0.19%     0.12%
Amplitude Changes – C     0.27%     0.10%     0.06%     0.04%     0.02%

Detection Rates, Averaged for all Training Times
Test / Threshold          0.1       0.15      0.2       0.25      0.3
Sensor Fails – UC         93.35%    92.52%    85.83%    86.65%    82.48%
Frequency Changes – UC    95.00%    87.50%    85.82%    82.50%    75.82%
Amplitude Changes – UC    94.17%    90.83%    85.00%    75.00%    65.00%
Frequency Changes – C     100.00%   100.00%   100.00%   93.50%    91.16%
Amplitude Changes – C     98.16%    85.67%    74.73%    52.05%    48.03%
Figure 12. This table shows false alarm rates and detection rates for various tests at various thresholds. Each rate is an average of the rates for the various training times tested – 1000, 100, 10, and 1 iteration(s). UC stands for uncorrelated, C stands for correlated, and “Sensor Fails” means that in that test one sensor becomes a flat line at 0.0.
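The relationship between the threshold and the two rates reported in Figure 12 can be illustrated with a minimal Python sketch. It is not the authors' implementation: a simple persistence predictor (the previous sample) stands in for the MTBMLP, the signal is invented, and the definitions used here (detection rate as the fraction of true change points flagged, false alarm rate as the fraction of remaining samples flagged) are assumptions made for the example.

```python
def detect_changes(actual, predicted, threshold):
    """Return the indices where |actual - predicted| exceeds the threshold."""
    return [
        i for i, (a, p) in enumerate(zip(actual, predicted))
        if abs(a - p) > threshold
    ]


def rates(flagged, true_changes, n_samples):
    """Return (detection_rate, false_alarm_rate) under the assumed definitions."""
    flagged = set(flagged)
    true_changes = set(true_changes)
    hits = len(flagged & true_changes)
    false_alarms = len(flagged - true_changes)
    normal = n_samples - len(true_changes)  # samples where nothing changed
    detection_rate = hits / len(true_changes) if true_changes else 0.0
    false_alarm_rate = false_alarms / normal if normal else 0.0
    return detection_rate, false_alarm_rate


# Hypothetical signal: real changes at samples 4 and 7, a noise spike at sample 9.
signal = [0.50, 0.52, 0.49, 0.51, 0.85, 0.84, 0.86, 0.62, 0.61, 0.74, 0.60]
true_changes = [4, 7]

# Stand-in predictor (assumption): the next value equals the previous one.
predicted = [signal[0]] + signal[:-1]

for threshold in (0.1, 0.2, 0.3):
    flagged = detect_changes(signal, predicted, threshold)
    print(threshold, flagged, rates(flagged, true_changes, len(signal)))
```

On this toy signal, the threshold of 0.1 catches both real changes but also flags the noise spike, while the threshold of 0.3 produces no false alarms but misses the smaller of the two real changes. This is the same trade-off that the table shows across thresholds.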
functions similar enough to the original functions so as not to cause an error spike greater than the current change threshold. The detection rates for the correlated amplitude tests are low compared to the other results, and would seem to dispute the claim that correlated data are predicted more accurately by the system than uncorrelated data. However, as explained in the description of the experimental procedure, the changes in amplitude for these tests were very small, some being changes of only 0.05, which is less than the smallest tested threshold. And yet, with the threshold set to 0.1, these changes were detected at a rate of 98%, where one would expect the system to miss the detection every time. These results are therefore much better than would have been predicted, and show that the system thrives when learning and predicting correlated sensor sets.

VIII. RECOMMENDATIONS ON THE PROPOSED SYSTEM APPLICATION

The experimental results and their analysis have yielded some recommendations on the efficient application of the proposed techniques. First, the system worked well with periodic and nonperiodic signals; however, it is less efficient with long clock signals, that is, signals that cycle from value A to value B, staying at each value for longer than 3 iterations. When presented with such a signal, the system learns the flat line at each value and signals an error each time the value changes. Second, the system's need for a pre-determined error threshold requires further study. The threshold can be set too low or too high, with too low causing a notable increase in the false alarm rate (see Figures 6 and 7) as every slight change in
the environment will be considered novel, and too high causing a drop in the detection rate (see Figures 6 and 9) as small changes go unnoticed by the system. This limitation can be overcome with a small amount of a priori knowledge, as one can determine before running the system how small a change one wishes to detect. Finally, a third issue, closely related to the previous one, must be discussed. The modified neural network used in the proposed system for function prediction is capable of adapting to new trends very quickly, which makes proper threshold determination even more important. Slow changes to a sensor over long periods of time can easily be overlooked by the system, as it can adjust to minute changes in only a few iterations. Depending upon the application of the system, this may or may not be a problem. For instance, for a sensor detecting outdoor light levels, this quick adaptation would be beneficial in that the setting sun should not, and would not, be considered novel. However, a person with malicious intent could exploit this property by altering the sensor very slowly in order to disguise the change being imposed on the system.

IX. CONCLUSION

A novel sensor network protocol that improves system reliability and security through change detection in sensor networks has been developed on the basis of the Time-based Multilayer Perceptron. The development included theoretical investigation, computer simulation and practical experiments based on real life applications of sensor networks. The change detection problem in sensor networks was formalized as a
multicriteria optimization problem. The ANN design was conducted by training through simulation and practical experiments. The choice of network parameters has been investigated. The influence of the threshold value on the missed attack and false alarm rates has been studied. The software, including the Sensor Network Anomaly Detection System, has been designed and implemented. The proposed system has been shown to detect change both reliably and accurately. Neural network function prediction methodologies have been successfully applied in real life experiments and have yielded very good results. Furthermore, the proposed modifications to the standard time-based multi-layer perceptron made this particular prediction methodology more useful, as its modified form learns and runs much more quickly, is more robust, and can find inherent correlations between data. When applied to sensor networks, this allows the system to learn the relationships between various sensors as part of the process of learning to predict the outputs of those sensors. Because both the trends of the individual sensors and the trends between them are learned by the neural network, the change detection system is able to properly detect when the trends of one or more sensors change, while continuing to predict correctly the outputs of the unchanged sensors. The developed methodology of change detection was implemented and tested in real sensor network applications. It makes sensor network operation more efficient, reliable and secure.
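The slow-change limitation noted in Section VIII can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: an exponentially smoothed predictor with a hypothetical adaptation factor alpha stands in for the fast-adapting neural network, and both test signals are invented. It shows a step change being flagged while a ramp of the same total size passes unnoticed.

```python
def adaptive_detector(signal, threshold, alpha=0.5):
    """Flag samples whose error against an adaptive prediction exceeds threshold.

    The prediction is an exponentially smoothed estimate (a stand-in for the
    neural predictor) that moves toward each new sample by a factor alpha.
    """
    prediction = signal[0]
    flagged = []
    for i, value in enumerate(signal):
        if abs(value - prediction) > threshold:
            flagged.append(i)
        # Adapt toward the observed value, as a quickly learning predictor would.
        prediction += alpha * (value - prediction)
    return flagged


# Abrupt change: the level jumps from 0.2 to 0.6 at sample 10.
step = [0.2] * 10 + [0.6] * 10

# Slow change: the same rise from 0.2 to 0.6, spread over 200 small increments.
ramp = [0.2 + 0.002 * i for i in range(201)]

print(adaptive_detector(step, threshold=0.1))
print(adaptive_detector(ramp, threshold=0.1))
```

In the ramp case the prediction error settles near the per-sample increment divided by alpha (about 0.004 here), far below the 0.1 threshold, so nothing is flagged even though the signal ends at the same level as in the step case.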
REFERENCES

[1] Wood A.D. and J.A. Stankovic, “Denial of service in sensor networks”, IEEE Computer, October 2002, pp. 54-62.
[2] Zhao F., J. Shin, and J. Reich, “Information-driven dynamic sensor collaboration for tracking applications”, IEEE Signal Processing Magazine, vol. 19, pp. 61-72, Mar. 2002.
[3] Michiardi P. and R. Molva, “Simulation-based analysis of security exposures in mobile ad hoc networks”, In European Wireless 2002: Next Generation Wireless Networks: Technologies, Protocols, Services and Applications, February 25-28, 2002, Florence, Italy.
[4] Boulis A., Ganeriwal S., Srivastava M.B., “Aggregation in sensor networks: an energy-accuracy trade-off”, In: Proceedings of the First IEEE International Workshop on Sensor Network Protocols and Applications, 11 May 2003, pp. 128-138.
[5] Reznik L. and Solopchenko G.N., “Use of a priori information on functional relations between measured quantities for improving accuracy of measurement”, Measurement, 1985, vol. 3, no. 3, pp. 98-106.
[6] Reznik L. and K.P. Dabke, “Measurement models: Application of intelligent methods”, Measurement, vol. 35, no. 1, pp. 47-58, 2004.
[7] Reznik L., V. Kreinovich and S.A. Starks, “Use of fuzzy expert's information in measurement and what we can gain from its application in geophysics”, Proceedings of FUZZ '03, The 12th IEEE International Conference on Fuzzy Systems, St. Louis, May 25-28, 2003, IEEE, vol. 2, pp. 1026-1031.
[8] D.M. Hawkins, “Testing a sequence of observations for a shift in location”, J Amer Statist Assoc, 72, pp. 180-186, 1977.
[9] K.J. Worsley, “On the likelihood ratio test for a shift in location of normal populations”, J Amer Statist Assoc, 74, pp. 365-367, 1979.
[10] Y.C. Yao, “Estimating the number of change-points via Schwarz's criterion”, Statist Probab Lett, 6, pp. 181-189, 1988.
[11] B.E. Brodsky and B.S. Darkhovsky, “Nonparametric Methods in Change-point Problems”, Kluwer, Dordrecht, 1993.
[12] E. Parzen, “Comparison change analysis approach to change-point estimation”, J Appl Statist Sci, 4, pp. 379-402, 1994.
[13] A.N. Pettitt, “A non-parametric approach to the changepoint problem”, Applied Statistics, 28(2), pp. 126-135, 1979.
[14] D. Andrews, “Heteroskedasticity and autocorrelation consistent covariance matrix estimation”, Econometrica, 59, pp. 817-858, 1991.
[15] C.S.J. Chu and H. White, “A direct test for changing trend”, J Business Economic Statist, 10, pp. 289-299, 1992.
[16] D. Siegmund, Sequential Analysis: Tests and Confidence Intervals, New York: Springer-Verlag, 1985.
[17] D.L. Hawkins, “An U-I approach to retrospective testing for shift parameters in a linear model”, Comm Statist - Theory Method, 18, pp. 3117-3134, 1989.
[18] Kyong Joo Oh, Ingoo Han, “An intelligent clustering forecasting system based on change-point detection and artificial neural networks: application to financial economics”, Proceedings of the 34th Annual Hawaii International Conference on System Sciences, 3-6 Jan. 2001.
[19] Xiaolong Dai, Khorram S., “Development of a new automated land cover change detection system from remotely sensed imagery based on artificial neural networks”, IEEE International Geoscience and Remote Sensing Symposium (IGARSS '97), 'Remote Sensing - A Scientific Vision for Sustainable Development', 3-8 Aug. 1997, Singapore, vol. 2, pp. 1029-1031.
[20] Min Han, Jianhui Xi, Shiguo Xu, Fu-Liang Yin, “Prediction of chaotic time series based on the recurrent predictor neural network”, IEEE Transactions on Signal Processing, vol. 52, no. 12, Dec. 2004, pp. 3409-3416.
[21] M. Csorgo and L. Horvath, “Limit Theorems in Change-Point Analysis”, New York: John Wiley & Sons, 1997.
[22] Majeed B., Nauck D., Lee B.-S., Martin T., “Intelligent systems for wellbeing monitoring”, Proceedings of the 2004 2nd International IEEE Conference on Intelligent Systems, 22-24 June 2004, vol. 1, pp. 164-168.
[23] Shan Q., Liu Y., Prosser G., Brown D., “Wireless intelligent sensor networks for refrigerated vehicle”, Proceedings of the IEEE 6th Circuits and Systems Symposium on Emerging Technologies: Frontiers of Mobile and Wireless Communication, 31 May-2 June 2004, vol. 2, pp. 525-528.
[24] Reznik A.M., Shirshov Yu.M., Snopok B.A., Nowicki D.W., Dekhtyarenko A.K., “Associative memories for chemical sensing”, Proceedings of the 9th International Conference on Neural Information Processing (ICONIP '02), vol. 5, 18-22 Nov. 2002, pp. 2630-2634.
[25] Reznik L., K.P. Dabke, “Evaluation of uncertainty in measurement: A proposal for application of intelligent methods”, in H. Imai (Ed.), Measurement to Improve Quality of Life in the 21st Century, IMEKO-XV World Congress, June 13-18, 1999, Osaka, Japan, vol. II, pp. 93-99.
[26] Reznik L. and B. Pham, “Fuzzy models in evaluation of information uncertainty in engineering and technology applications”, The 10th IEEE International Conference on Fuzzy Systems, December 2-5, 2001, Melbourne, Australia.
[27] Reznik L. and K.P. Dabke, “Measurement models: Application of intelligent methods”, Measurement, vol. 35, no. 1, pp. 47-58, 2004.
[28] Mari L. and L. Reznik, “Uncertainty in measurement: Some thoughts about its expressing and processing”, In L. Reznik and V. Kreinovich (Eds.), Soft Computing in Measurement and Information Acquisition, Springer, Berlin-Heidelberg-New York, 2003, ISBN 3540-00246-4, pp. 1-9.
[29] Reznik L., “Which models should be applied to measure computer security and information assurance?”, Proceedings of FUZZ '03, The 12th IEEE International Conference on Fuzzy Systems, St. Louis, May 25-28, 2003, IEEE, vol. 2, pp. 1243-1248.
[30] C. Jiann-Long, I. Shafiqul, and B. Pratim, “Nonlinear dynamics of hourly ozone concentrations: nonparametric short term prediction”, Atmospheric Environment, vol. 32, no. 11, pp. 1839-1848, 1998.
[31] R. Bakker, J.C. Schouten, C.L. Giles et al., “Learning chaotic attractors by neural networks”, Neural Computation, vol. 12, pp. 2355-2383, 2000.
[32] H. Leung, T. Lo, and S. Wang, “Prediction of noisy chaotic time series using an optimal radial basis function neural network”, IEEE Trans. Neural Networks, vol. 12, pp. 1163-1172, Sept. 2001.
[33] J.T. Connor, D. Martin, and L.E. Atlas, “Recurrent neural networks and robust time series prediction”, IEEE Trans. Neural Networks, vol. 5, pp. 240-253, Mar. 1994.
[34] P.A. Mastorocostas and J.B. Theocharis, “A recurrent fuzzy-neural model for dynamic system identification”, IEEE Trans. Neural Networks, vol. 32, pp. 176-190, Mar. 2002.
[35] R. Gençay and T. Liu, “Nonlinear modeling and prediction with feedforward and recurrent networks”, Physica D, vol. 108, pp. 119-134, 1997.
[36] J. Wang and S. Hu, “Global asymptotic stability and global exponential stability of continuous-time recurrent neural networks”, IEEE Trans. Automatic Control, vol. 47, pp. 802-807, May 2002.
[37] Ma J., Perkins S., “Time-series novelty detection using one-class support vector machines”, Proceedings of the International Joint Conference on Neural Networks, 20-24 July 2003, pp. 1741-1745.
[38] Dasgupta D., Forrest S., “Novelty detection in time series data using ideas from immunology”, In Proceedings of the 5th International Conference on Intelligent Systems, Reno, Nevada, June 19-21, 1996.
[39] Keogh E., Lonardi S., Chiu W., “Finding surprising patterns in a time series database in linear time and space”, Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, Canada, July 23-26, 2002, pp. 550-556.
[40] Guralnik V., Srivastava J., “Event detection from time series data”, In Proceedings of the International Conference on Knowledge Discovery and Data Mining, San Diego, CA, 1999.
[41] Singh S., Markou M., “An approach to novelty detection applied to the classification of image regions”, IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 4, April 2004, pp. 396-407.
[42] L. Reznik, M. Negnevitsky, C. Hoffman, “Contents based security enhancement in sensor networks protocols”, CIHSPS2004 - IEEE International Conference on Computational Intelligence for Homeland Security and Personal Safety, Venice, Italy, 21-22 July 2004.
[43] Vesanto J. and E. Alhoniemi, “Clustering of the self-organising map”, IEEE Trans. Neural Networks, vol. 11, no. 3, 2000, pp. 586-600.
[44] H. Sohn, K. Worden, and C.R. Farrar, “Novelty detection using auto-associative neural network”, Proc. Symp. Identification of Mechanical Systems: Int'l Mechanical Eng. Congress and Exposition, 2001.
[45] Streifel R., R.J. Marks, and M.A. El-Sharkawi, “Detection of shorted-turns in the field of turbine-generator rotors using novelty detectors - development and field tests”, IEEE Transactions on Energy Conversion, vol. 11, no. 2, 1996, pp. 312-317.
[46] Tarassenko L., A. Nairac, N. Townsend, and P. Cowley, “Novelty detection in jet engines”, Proc. IEEE Colloquium Condition Monitoring, Imagery, External Structures and Health, 1999, pp. 41-45.
[47] Y. Li, M.J. Pont, and N.B. Jones, “Improving the performance of the radial basis function classifiers in condition monitoring and fault diagnosis applications where ‘Unknown’ Faults May Occur”, Pattern Recognition Letters, vol. 23, 2002, pp. 569-577.
[48] J.A. Parikh, “A comparative study of cloud classification strategies”, Remote Sensing of the Environment, vol. 6, 1977, pp. 67-81.