2006 IEEE International Conference on Systems, Man, and Cybernetics October 8-11, 2006, Taipei, Taiwan
Requirements and Open Issues in Distributed Detection of Node Identity Replicas in WSN

Mauro Conti, Roberto Di Pietro, Luigi V. Mancini, and Alessandro Mei

Abstract- Wireless Sensor Networks (WSN) are often deployed in hostile environments, where an attacker can also capture some nodes. Once a node is captured, the attacker can re-program it and start replicating the node. These replicas can then be deployed in all (or a part of) the network area. The replicas can thus perform the attack they are programmed for: DoS (Denial of Service) and influencing a voting mechanism are just two examples. Detection of the node replication attack is therefore a fundamental requirement of every WSN application in which an attacker may be present. The contribution of this paper is twofold: first, we analyze the desirable properties of a distributed mechanism for the detection of replicated IDs; second, we show that the first proposal that recently appeared in the literature to realize a distributed solution for the detection of replicas does not completely fulfil the requirements. Hence, the design of efficient and distributed protocols to detect node identity replicas is still an open and demanding issue.

I. INTRODUCTION
A Wireless Sensor Network (WSN) is a collection of sensors with limited resources that collaborate to achieve a common goal. A WSN can be deployed in harsh environments to fulfill both military and civil applications [1]. WSNs are often unattended, hence prone to different kinds of novel attacks. For instance, an attacker could eavesdrop on the exchanged messages and capture nodes, acquiring all the information stored in the devices (sensors are assumed not to be tamper-proof [1]). Further, the attacker could clone captured nodes and create multiple nodes with the same identity. The clones could then be deployed in the network area and, for instance, subvert the data aggregation or the decision making in the network if it is based on some voting mechanism [4], [10], [11], [14].

A similar attack, the sybil attack [10], [14], consists in claiming multiple existing identities stolen from corrupted nodes. Both sybil and clone attacks result in identity theft. While the former can be efficiently addressed with mechanisms based on RSSI (Received Signal Strength Indicator) [7] or with authentication based on the knowledge of a fixed set of keys [4], [5], [6], [8], efficient detection of clone attacks is still an open issue. To the best of our knowledge, only centralized or localized protocols have been proposed: the former have a single point of failure, while the latter might not detect replicated nodes distributed in different areas of the network.

In this paper we analyze the desirable properties of distributed mechanisms for the detection of replication attacks. Further, we analyze the first protocol for distributed detection, recently proposed in [15], and show that the protocol does not match the identified requirements. Hence, the efficient and distributed detection of the node replication attack remains an open issue.

The remainder of this paper is organized as follows. The next section reviews related work; Section III illustrates the threat model assumed in the paper; Section IV discusses the requirements of protocols for the detection of node identity replicas in WSNs; in Section V we show some experimental results on the protocol proposed in [15] and compare these results with our requirements. Finally, Section VI presents some concluding remarks.

This work was partially supported by the WEB-MINDS project from the Italian MIUR under the FIRB program. Roberto Di Pietro is partially supported also by CNR-ISTI, Pisa, in the framework of the "SatNEx-II" NoE project (contract N. 27393). Mauro Conti, Luigi V. Mancini and Alessandro Mei are with the Dipartimento di Informatica of Università di Roma "La Sapienza", Italy (e-mail: {conti, mancini, mei}@di.uniroma1.it). Mauro Conti is the corresponding author (phone: +39-06-49918421; fax: +39-06-8541842). Roberto Di Pietro is with the Dipartimento di Matematica of Università di Roma Tre, Italy (e-mail: [email protected]).

1-4244-0100-3/06/$20.00 ©2006 IEEE

II. RELATED WORK
One of the first solutions for replicated node detection in WSNs relies on a centralized base station (BS) [11]. In this solution, each node sends a list of its neighbors and their claimed locations to the BS. The same entry in two lists sent by nodes that are not "close" to each other results in a replica detection; the BS then revokes the replicated nodes. This solution has several drawbacks, for instance a single point of failure (the BS) and a high communication cost due to the large number of exchanged messages.

Other proposed solutions rely on local detection [4]: using a localized voting mechanism, a set of neighbors can agree on the replication of a given node that has been replicated within the neighborhood. However, this kind of method fails to detect replicated nodes that are not within the same neighborhood.

A naive distributed solution to the node replication attack uses Node-To-Network Broadcasting [15]. Each node floods the network with a message containing its location information and compares the received location information with that of its neighbors. If a neighbor $s_w$ of node $s_a$ receives a location claim stating that $s_a$ is in a position not coherent with the position of $s_a$ detected by $s_w$, this results in a clone detection. However, this method is quite energy consuming, since it requires the transmission of $O(n)$ messages for each node.

To the best of our knowledge, the first globally-aware distributed node-replication detection solution was recently proposed in [15]. In particular, two distributed detection protocols leveraging emergent properties [12] were proposed. The first one, Randomized Multicast (RM), distributes node location information to randomly selected nodes. The second one, Line-Selected Multicast (LSM), uses the routing topology of the network to detect replication.

In RM, when a node announces its location, each of its neighbors sends (with probability $p$) a digitally signed copy of the location claim to a set of randomly selected nodes. If every neighbor selects $O(\sqrt{n})$ destinations for the claim, then, by the birthday paradox [13], with high probability at least one node, the witness, will receive a pair of non-coherent location claims (that is, a node is detected in two different locations in the same time-frame). The RM Protocol implies a high communication cost: each neighbor has to send $O(\sqrt{n})$ messages.
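The birthday-paradox argument behind RM can be illustrated with a small calculation. The sketch below is not the analysis given in [15]: it assumes a simplified model in which each of two conflicting claims is delivered to `k` distinct nodes chosen uniformly at random among `n`, and it computes the probability that at least one node (a witness) receives both. The function names are illustrative.

```python
import math


def witness_probability(n: int, k: int) -> float:
    """Exact probability that two independent uniform samples of k distinct
    nodes out of n share at least one node (simplified model, an assumption)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    # P(no common node) = C(n - k, k) / C(n, k); math.comb returns 0 if k > n - k.
    return 1.0 - math.comb(n - k, k) / math.comb(n, k)


def witness_probability_approx(n: int, k: int) -> float:
    """Usual birthday-paradox approximation 1 - exp(-k^2 / n)."""
    return 1.0 - math.exp(-k * k / n)


if __name__ == "__main__":
    n = 1000
    for k in (8, 16, 32, 64):  # 32 is roughly sqrt(1000)
        print(k, round(witness_probability(n, k), 3),
              round(witness_probability_approx(n, k), 3))
```

For $k = \sqrt{n}$ the approximation gives $1 - 1/e \approx 0.63$, and the probability rises quickly when $k$ grows by a constant factor, which is why $O(\sqrt{n})$ destinations per claim are enough in this simplified setting.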
To reduce this communication cost, the authors propose the LSM Protocol. In the LSM Protocol, when a node announces its location, every neighbor forwards this location claim with probability $p$ (with probability $1 - p$ no operation is performed). If the neighbor forwards the claim, it randomly selects a fixed number $g$ of destination nodes and sends the signed claim to all of them. Moreover, every node that routes this claim message stores the message and checks its coherence with the other location claims received within the same iteration of the detection protocol. If, during a check, the same node $s_a$ is present with at least two non-coherent locations, the witness triggers a revocation protocol for node $s_a$.

III. THREAT MODEL
We devise a simple yet powerful attacker: before a round of the replica detection protocol is invoked, the attacker can compromise a certain fixed number of sensors. To cope with this threat, one could assume that sensors are tamper-proof. However, consistently with a large part of the literature, we assume that sensors do not have tamper-proof components and that they can be captured. The attacker's goal is to prevent the replicated sensors under its control from being detected. Hence, we assume that the attacker will try to subvert those sensors that will possibly act as witnesses. To formalize the attacker model, we introduce the following definition.

Definition 3.1: Assume that the attacker's goal is to subvert the distributed detection protocol by compromising a possibly small subset $T$ of the sensors. The attacker has already compromised a set of sensors $\mathcal{W}$, while $\mathcal{N}$ is the initial set of sensors in the WSN. For every sensor $s$ in the WSN, the sensor appeal $S(s)$ returns the probability that $s \in \mathcal{N} \setminus \mathcal{W}$ is a witness for the next round.

We define two attackers, both of which tamper with sensors sequentially:
1) the oblivious attacker: at each step of the attack sequence, the next sensor to be tampered with is chosen randomly among the ones that have yet to be compromised;
2) the smart attacker: at each step of the attack sequence, the next sensor to be tampered with is the sensor $s$ that maximises $S(s)$, $s \in \mathcal{N} \setminus \mathcal{W}$.

Intuitively, the oblivious attacker does not take advantage of any information about the protocol used by the network. Conversely, the smart attacker greedily chooses which sensor to corrupt (the one that maximizes the appeal) in order to maximize the chance that its replicas go undetected.

IV. REQUIREMENTS FOR DISTRIBUTED DETECTION

A. Witnesses distribution

A main problem in devising a protocol to detect replica attacks is the selection of witnesses. Indeed, assume that the witnesses could be identified by the attacker before the detection takes place. In this case, the attacker could subvert these nodes, and the attack would go undetected.
Reasoning about the information that would allow the attacker to pre-compute the probability $S(s)$ for a generic sensor $s$, we have identified the following:
* ID-based prediction;
* area-based prediction.

We say that a protocol for replica detection assures ID obliviousness if the protocol does not provide any hint on which sensors will be the witnesses, given the public parameters of the protocol and the identities of the sensors in the network. To introduce the concept of area-based prediction (that is, geographical localization), assume that the probability $S(s)$ depends on the geographical position of sensor $s$ within the network. In this case, the attacker can concentrate its effort on a subset of the sensors, based on their position in the network area. We can thus introduce the concept of area obliviousness. A protocol is area oblivious if the probability $S(s)$, for every $s \in \mathcal{N} \setminus \mathcal{W}$, does not depend on the geographical position of sensor $s$ in the network.
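The two attackers of Definition 3.1 can be sketched in a few lines to show why a predictable appeal helps the smart attacker. This is only an illustration under stated assumptions: the appeal values are hypothetical, the appeal is taken as fixed during the attack sequence, and the function names and the "captured appeal" measure are not part of the original model.

```python
import random
from typing import Dict, List, Set


def oblivious_attack(appeal: Dict[int, float], compromised: Set[int],
                     budget: int, rng: random.Random) -> List[int]:
    """Pick `budget` sensors uniformly at random among those not yet compromised."""
    candidates = sorted(s for s in appeal if s not in compromised)
    return rng.sample(candidates, budget)


def smart_attack(appeal: Dict[int, float], compromised: Set[int],
                 budget: int) -> List[int]:
    """Greedily pick, at each step, the uncompromised sensor with maximum appeal."""
    taken = set(compromised)
    sequence = []
    for _ in range(budget):
        best = max((s for s in appeal if s not in taken), key=lambda s: appeal[s])
        taken.add(best)
        sequence.append(best)
    return sequence


def captured_appeal(appeal: Dict[int, float], sensors: List[int]) -> float:
    """Total appeal of the tampered sensors (a rough, assumed measure of attack strength)."""
    return sum(appeal[s] for s in sensors)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical, skewed appeal values: a few sensors are likely witnesses.
    appeal = {s: (0.2 if s < 5 else 0.01) for s in range(50)}
    already = {0}  # sensors compromised so far
    smart = smart_attack(appeal, already, budget=4)
    oblivious = oblivious_attack(appeal, already, budget=4, rng=rng)
    print("smart:", smart, captured_appeal(appeal, smart))
    print("oblivious:", oblivious, captured_appeal(appeal, oblivious))
```

When the appeal is the same for every sensor, as ID and area obliviousness require, the greedy choice gives no advantage over the random one; the gap between the two attackers appears only when some sensors are predictably more likely to be witnesses.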
B. Overhead

Designing protocols for WSNs is a challenging task due to the resource constraints sensors are subject to: any protocol is required to generate little overhead. However, this requirement alone is not enough. Indeed, assume that a detection protocol requires routing a total of $O(n\sqrt{n})$ messages, that is, $O(\sqrt{n})$ messages per node on average. Even if this overhead seems reasonable, we must also ensure that no sensor in the network has an overhead that is much larger than the average. That is, the overhead must be evenly distributed among the sensors. Indeed, if there is a group of sensors that have to forward $c\sqrt{n}$ messages (with $c > 1$), these sensors will run out of battery at least $c$ times faster than the sensors that are only required to route $\sqrt{n}$ messages.

A more subtle consideration applies to local memory. Even if the memory required to run the protocol is acceptable as an order of magnitude, it is important to assess whether some sensors could exceed the memory available. Further, it is fundamental for the correctness of the protocol to assess the consequences of exceeding the memory available on a sensor. For instance, a sensor could experience a failure, or could drop just the messages that do not fit in the available memory. In both cases, it is necessary to assess the impact on the expected properties of the protocol (that is, its effectiveness in detecting replicas).

Hence, we can summarize the above considerations in the general requirement that the overhead generated by the protocol should be small, that is, sustainable by the WSN as a whole, and (almost) evenly shared among sensors.

As a concrete example, in the LSM protocol every sensor that forwards a position claim should also store the forwarded messages. As analyzed in [15], every line-segment is of length $O(\sqrt{n})$ and every node stores $O(\sqrt{n})$ location claims. Note that this memory requirement could be impractical in real networks with thousands of nodes. Table I shows in the first row the asymptotic overhead analysis of one round of the LSM Protocol, while the second row reports the average overhead generated by one round of the LSM Protocol for a network of 1,000 sensors with sensing radius r = 0.1 (31 neighbors on average), p = 0.1 and g = 1. Finally, the third row highlights the maximum overhead experienced by a sensor. A detailed discussion of the overhead generated and of the compliance with the requirements described above is reported in the next section.
|            | Memory Occupancy | Sent Messages  | Received Messages |
| Asymptotic | $O(\sqrt{n})$    | $O(g\sqrt{n})$ | $O(g\sqrt{n})$    |
| Average    | 17.27            | 22.08          | 49.86             |
| Max        | 195              | 226            | 255               |

TABLE I
LSM OVERHEAD
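The imbalance visible in Table I can be expressed as the factor $c$ discussed in Section IV-B, that is, the ratio between the maximum and the average load of a sensor. The sketch below computes it from the reported values and also checks the stated average number of neighbors; the formula $n \pi r^2$ ignores border effects and is an approximation introduced here, and the function names are illustrative.

```python
import math

# Average and maximum per-sensor overhead of one LSM round, as reported in Table I.
TABLE_I = {
    "memory occupancy": (17.27, 195),
    "sent messages": (22.08, 226),
    "received messages": (49.86, 255),
}


def imbalance(average: float, maximum: float) -> float:
    """Factor c by which the most loaded sensor exceeds the average load."""
    return maximum / average


def expected_neighbors(n: int, r: float) -> float:
    """Expected neighbors for n sensors uniform in the unit square (border effects ignored)."""
    return n * math.pi * r ** 2


if __name__ == "__main__":
    print(f"expected neighbors: {expected_neighbors(1000, 0.1):.1f}")
    for metric, (avg, peak) in TABLE_I.items():
        print(f"{metric}: c = {imbalance(avg, peak):.1f}")
```

With the values of Table I, the most loaded sensor stores roughly eleven times the average number of claims and sends roughly ten times the average number of messages, so the asymptotic row alone does not describe the load on individual sensors.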
V. EVALUATION OF THE STATE OF THE ART

To highlight the difficulties that arise when addressing the issues described in Section IV-A, we analyze the state-of-the-art protocol for distributed replica detection, that is, the proposal in [15] introduced in Section II. We have simulated LSM in order to verify its compliance with the requirements introduced in Section IV. In the following simulations we assume the unit square as deployment area [2], [3], [9]. The LSM protocol is ID-oblivious due to the randomization technique adopted. In order to assess area-obliviousness, we studied the witness distribution as follows: we selected increasing sub-areas of the network, and for each area we counted the number of witnesses present in that area after a run of the detection protocol. Moving from the center of the unit square towards the external border, each sub-area adds 5% of the total area. Hence, 20 sub-areas were considered, as illustrated in Figure 1.
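The partition into 5% incremental sub-areas can be sketched as follows. This is one possible reading of the construction, assuming concentric squares centred in (0.5, 0.5) whose areas grow in steps of 1/20 of the unit square; the function names are illustrative and the code is not the simulator used for the experiments.

```python
import math
from typing import Iterable, List, Tuple


def area_index(x: float, y: float, num_areas: int = 20) -> int:
    """Index (1 = innermost) of the incremental sub-area containing point (x, y).

    Sub-area k is the ring between the concentric squares, centred in
    (0.5, 0.5), whose areas are (k - 1) / num_areas and k / num_areas.
    """
    half_side = max(abs(x - 0.5), abs(y - 0.5))
    enclosed = (2.0 * half_side) ** 2  # area of the smallest centred square through (x, y)
    return min(num_areas, max(1, math.ceil(enclosed * num_areas)))


def witness_share(witnesses: Iterable[Tuple[float, float]],
                  num_areas: int = 20) -> List[float]:
    """Fraction of witnesses falling in each sub-area (innermost first)."""
    counts = [0] * num_areas
    total = 0
    for x, y in witnesses:
        counts[area_index(x, y, num_areas) - 1] += 1
        total += 1
    return [c / total for c in counts] if total else [0.0] * num_areas


if __name__ == "__main__":
    print(area_index(0.5, 0.5), area_index(0.8, 0.5), area_index(0.0, 0.0))
```

Since all sub-areas have the same size, an area-oblivious protocol would place about 1/20 of the witnesses in each of them; a share that decreases from the inner to the outer sub-areas indicates that the appeal depends on position.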
Fig. 1. Example of sensors deployment and 5% incremental areas. n=1000
Fig. 2. Example of LSM Protocol iteration: n=1000, r=0.1, g=1, p=0.1
Figure 2 shows the result of an LSM protocol iteration: the filled large circles indicate the cloned sensors, the filled small circles indicate sensors that route claims of cloned sensors, and finally the circles indicate the witnesses. From this figure it is possible to note that there are many routing sensors, and that most of them are concentrated in the center of the network. Figure 3 reports the percentage of witnesses present in the incremental sub-areas obtained moving from the first one (the inner sub-area) towards the border. It is interesting to note that the central area corresponding to 20% of the network area ($A_1$) collects 49.09% of all the witnesses, while the most external area corresponding to 20% of the network area ($A_2$) contains only 1.75% of all the witnesses. LSM is therefore not area-oblivious, since $S(s_i) > S(s_j)$ for an $s_i$ selected from $A_1$ and an $s_j$ selected from $A_2$.

Fig. 3. Witness density: n=1000, r=0.1, g=1, p=0.1 (x-axis: % of network's areas, concentric squares)

In order to evaluate the distribution of the storage requirements among nodes, Figure 4 reports the number of messages sensors are required to store for the LSM protocol. For a fixed x-value of messages in memory, we show the percentage of the sensors that need to store that number of messages. The values were obtained by averaging the results of 10,000 simulations. Note that some sensors could be required to store as many as 200 messages. We decided not to report the values exceeding 100 stored messages. Despite this fact, Figure 4 shows that the LSM Protocol requires some 1.5% of the nodes to store more than 60 messages, some 6.5% of the sensors to store between 40 and 59 messages, and some 25.8% of the sensors to store between 20 and 39 messages. Since each message carries some 512 bits (a digital signature and the list of neighbors), some 1.5% of the sensors would require more than 512*60=30,720 bits. Note that the Mica2 motes can only provide 4KB of RAM [1]; that is, more than 92% of the memory would be dedicated only to storing messages related to the detection protocol.

As for the computational overhead, note that any message stored requires a digital signature verification.
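The memory figures above can be checked with a short calculation. The sketch assumes that 4KB means 4096 bytes and that every stored claim occupies exactly 512 bits, as stated in the text; the constant and function names are illustrative.

```python
MESSAGE_BITS = 512      # size of one stored location claim, from the text
RAM_BYTES = 4 * 1024    # Mica2 RAM, assuming 4KB = 4096 bytes


def ram_fraction(stored_messages: int) -> float:
    """Fraction of RAM taken by the given number of stored claims."""
    return stored_messages * MESSAGE_BITS / (RAM_BYTES * 8)


def max_messages_in_ram() -> int:
    """Largest number of claims that fits if the RAM held nothing else."""
    return (RAM_BYTES * 8) // MESSAGE_BITS


if __name__ == "__main__":
    print(f"60 messages use {ram_fraction(60):.2%} of RAM")
    print(f"at most {max_messages_in_ram()} messages fit in RAM")
```

Under these assumptions, 60 claims already take well over 90% of the RAM and no more than 64 claims fit at all, so a sensor asked to store as many as 200 messages would have to drop most of them.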
Fig. 4. Used memory: n=1000, r=0.1, g=1, p=0.1 (LSM Protocol; x-axis: number of messages in the sensor's memory)

Fig. 5. Exhausted nodes in different iterations: n=1000, r=0.1, g=1, p=0.1 (x-axis: iterations)
The number of signature verifications required is proportional to the number of messages stored, shown in Figure 4. Further, the number of messages sent is also proportional to the number of messages received. Note that transmission is quite a battery-consuming operation [16]. These considerations indicate that the implementation of LSM does not match the requirement of balancing its overhead (almost) evenly among nodes, even if the overall overhead can be considered asymptotically acceptable (see Table I).

In particular, Figure 5 shows the battery exhaustion related to the execution of LSM. After 100 iterations of the LSM protocol, some 20% of the nodes are exhausted. After 150 iterations, some 40% of the nodes are exhausted. Finally, after 200 iterations of the detection protocol, some 50% of the nodes are exhausted.

It is also interesting to note how the exhausted sensors are distributed over the network area. Figure 6 shows the distribution of exhausted nodes after 200 protocol iterations. The x-axis indicates the area intervals the network area is divided into (as plotted in Figure 1), numbered sequentially from the innermost to the outermost. The y-axis indicates the percentage of exhausted sensors in the considered area. It is interesting to note the slight increase in the percentage of exhausted nodes for the areas closer to the center (from the first to the fifth one). We explain this particular behaviour as follows: after a certain number of protocol iterations, some nodes in the central area become isolated even though they are not exhausted. The same phenomenon occurs, to a progressively lesser degree, moving from the central area (area 1) to the border. Sensor exhaustion starts after some 50 iterations; at the same iteration number, the detection probability starts decreasing.

Fig. 6. Exhausted nodes distribution after 200 iterations: n=1000, r=0.1, g=1, p=0.1 (LSM Protocol; x-axis: areas)

The experiments reported in this section indicate that the implementation of LSM does not match the requirement of balancing its overhead (almost) evenly among nodes, even if the overall overhead can be considered asymptotically acceptable (see Table I).

VI. CONCLUDING REMARKS

In this paper we presented a few basic requirements that an ideal protocol for distributed detection of node replicas should satisfy. In particular, we have introduced
the preliminary notions of ID-obliviousness and area-obliviousness, which convey a measure of the quality of a node identity replica detection algorithm, that is, its resilience to an active attacker. Moreover, we have indicated that the overhead of such a protocol should be not only small, but also evenly distributed among nodes; otherwise the protocol itself could significantly affect the network lifetime, because of the energy required by the exchanged messages and the computations performed, and the effectiveness of the protocol itself, if the memory requirements exceed the storage available to the sensor. Finally, we have analyzed the state-of-the-art solution for node identity replica detection, and we have shown that the proposed solution does not completely fulfil the requirements described above.

Open research directions are: a complete characterization of the two notions of obliviousness provided, together with a refinement of the appeal function; and the design of a protocol for node identity replica detection that complies with the indicated requirements.

REFERENCES

[1] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. Wireless sensor networks: a survey. International Journal of Computer and Telecommunications Networking - Elsevier, 38(4):393-422, March 2002.
[2] C. Bettstetter. On the minimum node degree and connectivity of a wireless multihop network. In Proceedings of the 3rd ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc '02), pages 80-91, 2002.
[3] C. Bettstetter and C. Hartmann. Connectivity of wireless multihop networks in a shadow fading environment. In Proceedings of the 6th ACM International Workshop on Modeling, Analysis and Simulation of Wireless and Mobile Systems (MSWiM '03), pages 28-32, 2003.
[4] H. Chan, A. Perrig, and D. Song. Random key predistribution schemes for sensor networks. In Proceedings of IEEE S&P '03, pages 197-213, 2003.
[5] M. Conti, R. Di Pietro, and L. V. Mancini. ECCE: Enhanced cooperative channel establishment for secure pair-wise communication in wireless sensor networks. Journal of Ad Hoc Networks - Elsevier, to appear.
[6] M. Conti, R. Di Pietro, and L. V. Mancini. Secure cooperative channel establishment in wireless sensor networks. In Proceedings of the Fourth Annual IEEE International Conference on Pervasive Computing and Communications Workshops (PERCOMW '06), pages 327-331, Washington, DC, USA, 2006. IEEE Computer Society.
[7] M. Demirbas and Y. Song. An RSSI-based scheme for sybil attack detection in wireless sensor networks. In 1st Workshop on Advanced EXPerimental activities ON WIRELESS networks and systems (EXPONWIRELESS 2006), pages 564-570, 2006.
[8] R. Di Pietro, L. V. Mancini, and A. Mei. Energy efficient node-to-node authentication and communication confidentiality in wireless sensor networks. Wireless Networks, in Press, Corrected Proof, Available Online, May 2006.
[9] R. Di Pietro, L. V. Mancini, A. Mei, A. Panconesi, and J. Radhakrishnan. Connectivity properties of secure wireless sensor networks. In Proceedings of ACM SASN '04, pages 53-58, 2004.
[10] J. R. Douceur. The sybil attack. In Proceedings of the 1st International Workshop on Peer-to-Peer Systems (IPTPS '01), pages 251-260. Springer, 2002.
[11] L. Eschenauer and V. D. Gligor. A key-management scheme for distributed sensor networks. In Proceedings of ACM CCS '02, pages 41-47, 2002.
[12] V. D. Gligor. Emergent properties in ad-hoc networks: a security perspective. In Proceedings of the 4th ACM Workshop on Wireless Security (WiSe '05), page 55, New York, NY, USA, 2005. ACM Press.
[13] A. J. Menezes, S. A. Vanstone, and P. C. van Oorschot. Handbook of Applied Cryptography. CRC Press, Inc., 1996.
[14] J. Newsome, E. Shi, D. Song, and A. Perrig. The sybil attack in sensor networks: analysis & defenses. In Proceedings of ACM IPSN '04, pages 259-268, 2004.
[15] B. Parno, A. Perrig, and V. D. Gligor. Distributed detection of node replication attacks in sensor networks. In Proceedings of IEEE S&P '05, pages 49-63, Washington, DC, USA, 2005.
[16] A. Wander, N. Gura, H. Eberle, V. Gupta, and S. C. Shantz. Energy analysis of public-key cryptography for wireless sensor networks. In PerCom, pages 324-328, 2005.