Learning Sensor Data Characteristics in Unknown Environments

Tatiana Bokareva [email protected] The University of NSW

Nirupama Bulusu [email protected] Portland State University

Sanjay Jha [email protected] The University of NSW

Abstract— Ad hoc wireless sensor networks derive much of their promise from their potential for autonomously monitoring remote or physically inaccessible locations [3]. As we begin to deploy sensor networks in real-world applications [8], concerns are being raised about the fidelity and integrity of the sensor data. In this paper, we motivate and propose an online algorithm that leverages a Competitive Learning Neural Network (CLNN) for characterization of a dynamic, unknown environment. Based on the proposed characterization, sensor networks can autonomously construct multimodal views of their environments and derive the conditions for verifying data integrity over time.

I. INTRODUCTION

For widespread adoption of sensor technology, robust and high-integrity operation of the sensor network is of paramount importance. This is challenging because sensor networks are often ad hoc, deployed in unknown environments with limited knowledge of the phenomena being observed. In previous work, we have made the case for a self-healing sensor network architecture called SASHA [2], which is inspired by the Natural Immune System (NIS). The success of the NIS largely lies in its ability to learn to distinguish between host and foreign cells in our bodies, known as the self and non-self sets respectively. In [2] we outlined that sensor data constitutes a major part of the self-set in a sensor network.

In this paper, we propose a technique for the construction of such a self-set that leverages the Competitive Learning Neural Network (CLNN). CLNN uses a clustering procedure for detecting natural groupings in data, which we call the multimodal view of data. The ability to detect such groupings is an important functionality of Wireless Sensor Networks (WSN), but providing it raises two problems. First, in unknown environments, distinguishing normal from anomalous behavior of sensor readings is a difficult task, mainly because of the lack of knowledge of normal data behavior. Second, individual sensor devices have very limited computational capacity, which bounds the amount of computation that can be performed on these devices. The approach for checking sensor data integrity must scale to large networks, minimize the energy overhead required to communicate sensor data, and adapt to system and environmental dynamics, such as node failures and the naturally changing environment.

To address the first problem, we propose to leverage CLNN for the construction of a probabilistic and multimodal view of the sensor environment. To address the second problem, we exploit in-network processing in a hierarchical sensor network, where the classification of sensor readings is performed at cluster heads, also known as monitoring nodes in SASHA, rather than on a centralized base station. Figure 1 shows the generic architecture of a hierarchical sensor network where each node communicates its sensor data to its closest cluster head. The cluster head classifies the input data and performs most of the signal processing before forwarding the data to a base station.

The strength of CLNN is that it does not require us to collect a priori data for inference of normal data behaviors, making it suitable for large-scale, remotely deployed sensor networks. Furthermore, similar procedures have been used successfully in digital image analysis as classifiers of multimodal data [9][10]. Their ability to deduce underlying patterns in sensor data is a very important aspect for the classification of sensor readings, as the number of such patterns is generally not known in advance. Another advantage is that clustering techniques can reduce the amount of information to be transmitted over the radio, which has been identified as a major source of energy consumption. Further, leveraging the hierarchical structure of a sensor network can reduce the number of hops that this information needs to travel, which can potentially lead to further energy savings.

However, the reduction in the information space requires more preprocessing of data. Another drawback is that CLNN suffers from the usual problems associated with equivalent learning procedures. It relies on parametric assumptions about the underlying structure of data classes, and it requires a large amount of training time depending on the characteristics of the phenomena being observed. It also lacks significant theoretical properties and is very heuristic in nature. Nevertheless, we believe that CLNN is well suited for the classification of sensor readings, especially if the underlying physical process is not known.
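As an illustration of the data-reduction argument above, the following minimal Python sketch shows a cluster head mapping each incoming reading to its nearest learned prototype and forwarding only per-cluster counts instead of raw readings. The prototype values, the two-field reading layout, and the function names `classify` and `summarize` are hypothetical and are not taken from SASHA.

```python
from collections import Counter

def classify(prototypes, reading):
    """Return the index of the prototype closest to `reading`
    (squared Euclidean distance; the metric is an assumption)."""
    return min(
        range(len(prototypes)),
        key=lambda j: sum((p - r) ** 2 for p, r in zip(prototypes[j], reading)),
    )

def summarize(prototypes, readings):
    """Count how many readings fall into each cluster.
    The cluster head could forward this short list rather than every reading."""
    counts = Counter(classify(prototypes, r) for r in readings)
    return [counts.get(j, 0) for j in range(len(prototypes))]

# Hypothetical prototypes and readings, e.g. (temperature, humidity).
prototypes = [[20.0, 40.0], [35.0, 15.0]]
readings = [[21.0, 42.0], [19.0, 39.0], [34.0, 14.0]]
summary = summarize(prototypes, readings)  # [2, 1]
```

Here three two-dimensional readings are reduced to two integers; how much is actually saved on the radio depends on the number of clusters and the reporting interval.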
Paper Contributions

This paper motivates and proposes an unsupervised learning procedure for modeling the inherent classes of sensor data, on which a framework for data integrity verification can be defined. We motivate the use of data clustering as a building block for modeling sensor data. The approach distinguishes high-confidence from low-confidence clusters by assigning an additional probability to each cluster, while preserving the fine-grained features within the data. We have evaluated our approach using physical sensor read-

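[Figure 1: generic architecture of a hierarchical sensor network. Labels recoverable from the figure: Base Station, Stargate.]

neurons. Once a winner is selected, the corresponding vector of weights is shifted toward the input vector:

[Equation: weight-update rule for the winning neuron]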

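The winner-take-all step described above, in which only the winning neuron's weight vector moves toward the input, can be sketched in Python as follows. This sketch uses the standard competitive-learning rule w ← w + η(x − w) with squared Euclidean distance to select the winner, which may differ in detail from the authors' formulation. The learning rate `eta`, the initialisation from sampled readings, and all function names are assumptions made for illustration.

```python
import random

def nearest(weights, x):
    """Index of the weight vector closest to x (squared Euclidean distance)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)),
    )

def update_winner(weights, x, eta=0.1):
    """Select the winning neuron for x and move only its weight vector
    a fraction `eta` of the way toward x. Returns the winner's index."""
    j = nearest(weights, x)
    weights[j] = [w + eta * (v - w) for w, v in zip(weights[j], x)]
    return j

def train(readings, n_neurons=2, eta=0.1, epochs=20, seed=0):
    """Online training: present each reading in turn and update the winner.
    Neurons start on randomly chosen readings (an assumption, not from the paper)."""
    rng = random.Random(seed)
    weights = [list(x) for x in rng.sample(readings, n_neurons)]
    for _ in range(epochs):
        for x in readings:
            update_winner(weights, x, eta)
    return weights

# One update step: neuron 0 wins and moves halfway toward the input.
w = [[0.0, 0.0], [10.0, 10.0]]
winner = update_winner(w, [1.0, 1.0], eta=0.5)  # winner == 0, w[0] == [0.5, 0.5]
```

Because each reading is processed once and then discarded, the update needs only the current weight vectors in memory, which suits a cluster head with limited storage. A fixed `eta` is the simplest choice; a decaying rate is a common alternative.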