The 18th Annual IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC'07)
DERIVING EFFICIENT MOBILE AGENT ROUTES IN WIRELESS SENSOR NETWORKS WITH NOID ALGORITHM Aristides Mpitziopoulos Department of Cultural Technology and Communication, University of the Aegean Mytilene,Greece
[email protected]
Damianos Gavalas Department of Cultural Technology and Communication, University of the Aegean Mytilene,Greece
[email protected]
Charalampos Konstantopoulos Research Academic Computer Technology Institute and Dept. of Computer Engineering and Informatics, University of Patras, 26500 Rio, Greece
[email protected]
ABSTRACT

In this article, we consider the problem of calculating an appropriate number of near-optimal (subject to a certain routing objective) routes for mobile agents (MAs) that incrementally fuse data as they visit the nodes of a distributed sensor network. We propose an improved heuristic algorithm that computes an approximate solution to the problem by suggesting an appropriate number of MAs and constructing a near-optimal itinerary for each of them. The performance gain of our algorithm over alternative approaches, in terms of both cost and task completion latency, is demonstrated by a quantitative evaluation and by simulation with a Java-based tool.

I. INTRODUCTION

Data fusion methods for distributed Wireless Sensor Networks (WSNs) have been the target of active research in recent years. Data fusion applications cover areas such as environment monitoring, automatic target tracking, battlefield surveillance, remote sensing and global awareness [1]. Conventional data fusion approaches use the traditional client/server computing model, i.e., all the sensor data is sent to a central location (the processing element, PE) where it is fused. However, the transmission of large volumes of non-critical sensor data consumes scarce network resources such as battery power and bandwidth. Thus, Qi et al. [7] introduced the use of mobile agents (MAs) in WSNs for data fusion tasks as an alternative to the traditional client/server approach. The mobile agents selectively visit the sensors and incrementally fuse the appropriate data.

In this paper, we consider the problem of calculating near-optimal routes for MAs that incrementally fuse data as they visit the nodes of a distributed sensor network. We propose and experimentally evaluate an improved heuristic that suggests an appropriate number of MAs and constructs a near-optimal itinerary for each one of them.
The remainder of the paper is organized as follows: Section II reviews works related to our research. Section III discusses the design and functionality of our heuristic algorithm. Section IV presents the evaluation of the algorithm through simulation tests, while Section V concludes the paper and presents future directions of our work.

Grammati Pantziou Department of Informatics, Technological Educational Institution of Athens Athens, Greece
[email protected]
1-4244-1144-0/07/$25.00 ©2007 IEEE.

II. RELATED WORK

MAs have been proposed in a variety of applications, including e-commerce, network management and information retrieval [6]. WSN environments form a promising application area for MAs; yet, they pose new challenges, as the link bandwidth is typically much lower than that of a wired network and sensory data traffic may even exceed the network capacity. To solve the problem of the overwhelming data traffic, [7] and [8] proposed the use of MAs for scalable and energy-efficient data aggregation. By transmitting the software code (MA) to sensor nodes, a large amount of sensory data may be filtered at the source by eliminating redundancy. MAs may visit a number of sensors and progressively fuse the retrieved sensory data prior to returning to the PE to deliver the data. This scheme proves more efficient than the traditional client/server model, wherein raw sensory data are transmitted to the PE, where data fusion takes place.

MAs have also been proposed for enabling dynamically reconfigurable WSNs [10] and for multi-resolution data integration and fusion [7]. These applications involve multi-hop MAs visiting large numbers of sensors. The order in which those sensors are visited (i.e., the MA itinerary) is a critical issue that seriously affects the overall performance. Randomly selected routes may even result in performance worse than that of the conventional client/server model; yet, that issue is not addressed in these works.

To the best of our knowledge, only [8] and [11] deal with the problem of designing optimal MA itineraries in the context of WSNs. In [8], Qi and Wang proposed two heuristic algorithms to optimize the itinerary of MAs performing data fusion tasks. In the Local Closest First (LCF) algorithm, each MA starts its route from the PE and searches for the next destination with the shortest distance to its current location. In the Global Closest First (GCF) algorithm, MAs also start their itinerary from the PE node and select the node closest to the center of the surveillance region as the next-hop destination.
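The two selection rules can be illustrated with a minimal Python sketch. It is not taken from [8]: it assumes sensors are given as 2-D coordinates, that distance is Euclidean, and that ties are broken by input order. The function names `lcf_itinerary` and `gcf_itinerary` are hypothetical.

```python
import math

def lcf_itinerary(pe, sensors):
    """Local Closest First: always hop to the unvisited sensor
    nearest to the MA's current location (starting at the PE)."""
    current = pe
    remaining = list(sensors)
    route = []
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    return route

def gcf_itinerary(center, sensors):
    """Global Closest First: visit sensors in order of increasing
    distance from the center of the surveillance region."""
    return sorted(sensors, key=lambda p: math.dist(center, p))

# Example: three sensors on a line, PE at the origin.
pe = (0, 0)
sensors = [(5, 0), (1, 0), (2, 0)]
lcf_route = lcf_itinerary(pe, sensors)
gcf_route = gcf_itinerary((4, 0), sensors)
```

In this sketch the LCF order depends only on where the MA currently is, whereas the GCF order depends only on the region center, which is why GCF can produce long hops between consecutive nodes.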
The output of LCF-like algorithms depends heavily on the MA's original location, while the nodes left to be visited last are associated with high migration cost [5]; the reason is that these algorithms search for the next destination among the nodes adjacent to the MA's current location, instead of looking at the 'global' network distance matrix. On the other hand, GCF produces in most cases messier routes than LCF and repetitive MA oscillations around the region center, resulting in long route paths and undesirable performance [8][11].

Wu et al. proposed a genetic algorithm-based solution to the problem [11]. Although it provides superior performance (lower cost) compared to the LCF and GCF algorithms, this approach implies a time-expensive optimal itinerary calculation (genetic algorithms typically start their execution with a random solution 'vector' which is improved as the execution progresses), which is unacceptable for time-critical applications, e.g. target location and tracking. Also, in such applications, the group of visited sensor nodes (i.e. those with maximum detected signal level) changes frequently over time depending on the target's movement; hence, a method that guarantees fast adaptation of the MA itinerary is needed.

Most importantly, both the approaches proposed in [8] and [11] involve a single MA object launched from the PE station that sequentially visits all sensors, regardless of their physical location on the plane. Their performance is satisfactory for small WSNs; however, it deteriorates as the network size grows and the sensor distributions become more complicated. This is because the MA's round-trip delay increases linearly with network size, while the overall migration cost increases exponentially as the traveling MA accumulates into its state data from visited sensors [3]. The growing MA state size not only results in increased consumption of the limited wireless bandwidth, but also consumes the limited energy supplies of sensor nodes.

Our algorithm has been designed on the basis of three objectives: (a) MA itineraries should be derived as fast as possible and adapt quickly to changing networking conditions; (b) the energy consumption of visited sensors should be minimized; (c) the number of MAs involved in the data fusion process should depend on the number and the physical location of the sensors to be visited, and the order in which an MA visits its assigned nodes should be computed so as to minimize the overall migration cost. The initial ideas behind our approach have been presented in [4]. Herein we present an improved algorithm together with its computational complexity, a systematic experimental evaluation and the fine-tuning of its parameters ([4] did not include any simulation results).

III. THE NEAR-OPTIMAL ITINERARY DESIGN (NOID) ALGORITHM
A Wireless Sensor Network (WSN) is represented by a complete graph G=(V,E), |V|=n, where each node i in V corresponds to a sensor Si, and each edge (i,j) in E corresponds to a communication link between the sensors Si and Sj. S0 corresponds to the PE. Each link (i,j) is associated with a cost ci,j, which is a function of the path loss of the link (defined as the difference, in dB, between the effective power transmitted by Si and the power received by Sj, and itself a function of the physical distance between Si and Sj) as well as of the transmitting power and the signal energy detected by the node Si.

The Mobile Agent Routing (MAR) problem asks for a path (itinerary) in a WSN that optimizes a certain routing objective. The overall routing objective is to maximize the sum of the signal energy received at the visited sensors while minimizing the energy consumption (power needed for communication) and the path losses [11]. The MAR problem is NP-complete [11], while approximate solutions to the problem are given by heuristic approaches ([8], [11]), as discussed in the previous section.

Let us consider an extension of the problem where, given a WSN with S = {S0, S1, …, Sn-1} its set of sensors and C = {ci,j | (i,j) ∈ E} its cost matrix, instead of one itinerary we ask for a set of near-optimal itineraries I = {I0, .., Ik}, all originated and terminated at the PE (node S0), such that the sum of the costs of the itineraries in I is minimized. The total cost per polling interval over all itineraries is defined as:
\[ c_{total} = \sum_{i=1}^{|I|} \sum_{j=0}^{|I_i|-1} (d_j + s) \cdot c_j \tag{1} \]
where dj is the amount of data collected by the ith MA on the first j visited sensors, s the MA initial size and cj the cost of utilizing the link (k,l) traversed by the MA on its jth hop, i.e., the wireless link connecting sensors Sk and Sl (cj = ck,l is given by the network cost matrix). Therefore, the extended MAR problem asks for a set of itineraries I minimizing the cost function of equation (1).

Interestingly, our extended MAR problem exhibits some similarities with the Multi-point Line Topology or Constrained Minimum Spanning Tree (CMST) problem. A CMST is an MST with an additional constraint on the size of the sub-trees rooted at the 'center'. In the CMST problem, the objective is the optimal selection of the links connecting terminals to concentrators or directly to the network center, resulting in the minimum possible total cost. The CMST problem is NP-hard and, as a result, several heuristics have been proposed to deal with it efficiently [5]. The output of a CMST algorithm typically comprises topologies partitioned into several multi-point lines (or tree branches), where groups of terminals share a sub-tree to a specific node (center). Substituting the terms 'network center', 'link' and 'multipoint-line' with the terms 'processing element', 'migration' and 'itinerary' respectively, and observing that the output of CMST algorithms resembles a group of itineraries all originated at the PE, the similarity of the CMST and MA itinerary planning problems becomes apparent, although the cost functions and the routing objectives are different.

Based on these observations, in [4] we proposed a new heuristic for the solution of the extended MAR problem, called the NOID (Near-Optimal Itinerary Design) algorithm. NOID adapts some basic principles of the Esau-Williams (E-W) heuristic for the CMST problem [2] to the specific requirements of our itinerary planning problem.
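Returning to the cost function of equation (1), the following Python sketch shows one way it could be evaluated. It rests on several assumptions that are not stated in this form in the text: every sensor contributes a constant d bytes (so dj = j·d), each itinerary is a list of sensor indices that excludes the PE, the return hop to the PE is charged, and the cost matrix is a nested list. The names `itinerary_cost` and `total_cost` are hypothetical.

```python
def itinerary_cost(itinerary, c, d, s, pe=0):
    """Cost of one MA round trip PE -> sensors -> PE.

    On hop j the MA carries its code (s bytes) plus the data of the
    j sensors visited so far (j * d bytes, an assumption of this sketch).
    """
    path = [pe] + list(itinerary) + [pe]
    total = 0
    for j in range(len(path) - 1):
        total += (j * d + s) * c[path[j]][path[j + 1]]
    return total

def total_cost(itineraries, c, d, s, pe=0):
    """Sum of the costs of all itineraries, as in equation (1)."""
    return sum(itinerary_cost(it, c, d, s, pe) for it in itineraries)

# Example: PE (node 0) and two sensors, all links with unit cost.
c = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]
one_ma_small_d = total_cost([[1, 2]], c, d=100, s=1000)
two_mas_small_d = total_cost([[1], [2]], c, d=100, s=1000)
one_ma_large_d = total_cost([[1, 2]], c, d=2000, s=1000)
two_mas_large_d = total_cost([[1], [2]], c, d=2000, s=1000)
```

With unit link costs, a single MA is cheaper when d is small relative to s, while two MAs become cheaper once d is large, which is the effect the extended MAR problem is meant to exploit.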
In the remainder of this article we present the improved NOID algorithm as well as its experimental evaluation. The NOID algorithm takes into account the amount of data accumulated by MAs at each visited sensor (without loss of generality, we assume this is a constant d). Note that this factor is ignored by the LCF and GCF heuristics for solving the MAR problem. Namely, NOID recognizes that travelling MAs become 'heavier' while visiting sensors without returning to the PE to deliver their collected data. Therefore, NOID restricts the number of migrations performed by individual MAs, thereby promoting the parallel employment of multiple cooperating MAs, each visiting a subset of sensors.

Specifically, the aim of the NOID algorithm is, given a set of sensors S = {S0, S1, …, Sn-1}, the PE node S0 and the cost matrix C, to return a set of near-optimal itineraries I = {I0, .., Ik}, all originated and terminated at the PE. Initially, we assume |S| = n itineraries (as many as the WSN sensors) I0, .., In-1, each containing a single sensor (S0, S1, …, Sn-1, respectively). On each algorithm step, two sensors Si and Sj are 'connected' and the itineraries including these hosts (I(i) and I(j), respectively) are merged into a single itinerary.

LCF and GCF algorithms usually fail because they tend to leave hosts located far from the center stranded, since they prioritize the inclusion of hosts close to the last selected node or the center. As a result, relatively expensive links are left last to be included in the solution, significantly increasing the overall cost. A way of dealing with this problem is to pay more attention to sensors far from the center, giving preference to links incident upon them. The NOID algorithm accomplishes this by using the concept of a 'tradeoff function' ti,j associated with each link (i, j), defined by:
\[ t_{i,j} = c_{i,j} + p_{i,j} + \sum_{k=1}^{|I(i)|+|I(j)|} [f \cdot k \cdot d] - C_{i,S_0} \tag{2} \]
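As a side note, the summation term of equation (2) has a simple closed form, which makes its growth with the size of the merged itinerary explicit. The symbol m is introduced here only as shorthand for |I(i)| + |I(j)|, and the numbers are illustrative values:

```latex
\[
\sum_{k=1}^{m} f \cdot k \cdot d \;=\; f \cdot d \cdot \frac{m(m+1)}{2},
\qquad m = |I(i)| + |I(j)|
\]
% Illustrative values: f = 1, d = 100 bytes
%   m = 4  gives  100 * 10 = 1000
%   m = 8  gives  100 * 36 = 3600
```

Doubling the merged itinerary size from 4 to 8 sensors therefore raises this term by a factor of 3.6, so the term grows quadratically in m and increasingly penalizes merges into long itineraries.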
In equation (2), 0 ≤ f ≤ 1, and |I(j)| = 0 if j ≡ S0.

On each iteration of the algorithm, the itineraries that include the pair of sensors Si and Sj with the minimum tradeoff function value ti,j are merged into one. The concept of the tradeoff function was introduced in the E-W algorithm, where it is defined as ti,j = ci,j − Ci,S0. Equation (2) extends and adapts this function to the specific requirements of the agent itinerary planning problem. The main idea behind this equation is that the more nodes an itinerary already includes, the more difficult it is for a new host to become part of that itinerary, especially when d is large. In particular, the inclusion of a term representing the amount of data collected from the previous hosts (k·d), summed over the number of sensors already included in the itineraries considered for merging, i.e. |I(i)| and |I(j)|, obstructs the construction of large itineraries, thereby promoting the formation of multiple itineraries assigned to separate MAs. The coefficient f represents the filtering applied by the data fusion algorithm to the data collected from each sensor. Finally, pi,j is a penalty coefficient defined as follows:

\[ p_{i,j} = \begin{cases} s, & \text{if } S_j \equiv S_0 \\ 0, & \text{elsewhere} \end{cases} \tag{3} \]

The penalty coefficient discourages the creation of short itineraries, i.e., it encourages the itineraries created so far to merge rather than be connected directly to the PE. The choice of the value s for pi,j when Sj ≡ S0 denotes the cost burden of connecting an itinerary directly to the PE rather than to another itinerary. Note that equation (2) has been fine-tuned compared to the one presented in [4]; the latter did not include the filtering and the penalty coefficients.

In equation (2), Ci,S0 is the cost of connecting I(i) to the PE S0. Initially, this is simply the cost of connecting node i directly to the PE. As i becomes part of an itinerary containing other sensors, however, this changes to:

\[ C_{i,S_0} = \min_{k \in I(i)} c_{k,S_0} \tag{4} \]

On each step of the algorithm, tradeoff function values ti,j are evaluated for all pairs (i,j), except those where nodes i and j are already part of the same itinerary; the 'itineraries' including the nodes that produce the minimum ti,j value are merged. For instance, if the tradeoff function is minimized for the pair of nodes m and n, then I(m) and I(n) are merged into one itinerary. When NOID's execution finishes, one or more 'sub-trees' (groups of nodes) rooted at the PE node have been constructed. It is then a trivial task to produce the itineraries (started and terminated at the PE node) for traversing the nodes of each sub-tree; these itineraries correspond to a post-order traversal of the sub-trees. A pseudo-code implementation of the NOID algorithm follows:

NOID (n, c, d, s, S0)
    // n: total number of sensors, c: cost matrix,
    // d: data collected per host, S0: processing element
    initialize I        // I: I0, .., In-1, where I0 = {S0}, …, In-1 = {Sn-1}
    current = S0
    N_connected = 0     // N_connected: the number of sensors already included into an itinerary
    while (N_connected < n)
        // I(i) is the sequence of hosts (itinerary) where sensor i has already been included;
        // I^s(i) denotes the set corresponding to itinerary sequence I(i)
        compute t_{i,j} = c_{i,j} + p_{i,j} + Σ_{k=1}^{|I^s(i)|+|I^s(j)|} [f·k·d] − C_{i,S0},
            where I^s(i) ∩ I^s(j) = ∅ and C_{i,S0} = min_{k ∈ I^s(i)} c_{k,S0}
        merge (I(i), I(j)), for (i, j) minimizing the tradeoff function (min_{i,j} t_{i,j})
        N_connected++
    return I

With regard to the computational complexity of the NOID algorithm, the total cost is at most O(N² log N) when the tradeoff function values ti,j are kept in a heap structure. A heap is a convenient data structure for finding the smallest of the ti,j in O(1) time. The O(log N) factor in the total complexity is due to the cost of updating the heap. Specifically, each time an itinerary merging takes place, some of the ti,j values change, so the new values must replace the old ones in the heap. Each of these replacements has a cost of at most O(log N). The ti,j values affected by an itinerary merging are those of the edges that connect nodes of the just merged itineraries with nodes outside these itineraries. Most importantly, we need to perform at most N such ti,j calculations/replacements, since for each node at most one ti,j value adjustment is needed. So, for the N merging steps of the NOID algorithm, the total cost is O(N² log N).

IV. SIMULATION RESULTS
Our simulation work compares the performance of NOID against the LCF and GCF algorithms in terms of the overall itinerary length, data fusion cost and data fusion response time. Unless otherwise specified, the parameters used throughout the simulation tests are those shown in Table 1. The simulation results presented herein have been averaged over ten simulation runs (i.e., over ten different network topologies). Simulations have been conducted using a Java-based tool implemented for this purpose. The simulator allows simulation parameters to be specified easily and graphically illustrates the output of NOID, LCF and GCF, while also recording their respective overall itinerary length, data fusion cost and response time.
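The Java-based simulator itself is not reproduced here. As a rough illustration of the merging procedure of Section III only, the following Python sketch recomputes all tradeoff values on every step instead of using the heap discussed in the complexity analysis. It is not the authors' implementation and makes several assumptions: node 0 is the PE, only groups not yet linked to the PE are considered on the "i" side of a merge, ties are broken by node index, and the helper names (`noid`, `tradeoff`, `post_order`) are hypothetical.

```python
import math

def noid(c, d, s, f=1.0, pe=0):
    """Return a list of itineraries (lists of sensor indices, PE excluded)."""
    n = len(c)
    sensors = [v for v in range(n) if v != pe]
    label = {v: v for v in sensors}        # sensor -> label of its group
    members = {v: {v} for v in sensors}    # label -> sensors in the group
    attached = set()                       # labels of groups linked to the PE
    adj = {v: [] for v in range(n)}        # undirected tree edges

    def tradeoff(i, j):
        group = members[label[i]]
        size = len(group) + (0 if j == pe else len(members[label[j]]))
        penalty = s if j == pe else 0                  # equation (3)
        load = f * d * size * (size + 1) / 2           # sum of f*k*d, k = 1..size
        to_pe = min(c[k][pe] for k in group)           # equation (4)
        return c[i][j] + penalty + load - to_pe        # equation (2)

    while len(attached) < len(members):
        candidates = [(tradeoff(i, j), i, j)
                      for i in sensors if label[i] not in attached
                      for j in range(n)
                      if j == pe or label[j] != label[i]]
        _, i, j = min(candidates)
        adj[i].append(j)
        adj[j].append(i)
        a = label[i]
        if j == pe:
            attached.add(a)                # group a now hangs off the PE
        else:
            b = label[j]
            members[b] |= members.pop(a)   # merge group a into group b
            for v in members[b]:
                label[v] = b

    def post_order(v, parent):
        order = []
        for w in adj[v]:
            if w != parent:
                order += post_order(w, v)
        return order + [v]

    return [post_order(root, pe) for root in adj[pe]]

# Example: PE at the origin, two pairs of sensors on opposite sides.
points = [(0, 0), (10, 0), (11, 0), (-10, 0), (-11, 0)]
c = [[math.dist(p, q) for q in points] for p in points]
few_itineraries = noid(c, d=1, s=5)
many_itineraries = noid(c, d=100, s=5)
```

In this toy topology a small d yields one itinerary per pair of neighbouring sensors, whereas a large d makes every merge more expensive than a direct link to the PE, so each sensor gets its own MA. This mirrors the dependence of the number of itineraries on d reported in the text.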
Table 1. Simulation parameters

Parameter                                                       Value
Simulated plane (m²)                                            1500 × 1000
Number of sensors                                               100
Sensors transmission power (dBm)                                4
Sensors transmission range (m)                                  10
Network transfer rate (Kbps)                                    250
Initial sensors battery lifetime                                20-100 units
MA execution time at each sensor (processing delay, in msec)    50
MA instantiation delay (msec)                                   10
MA code size (s, in bytes)                                      1000
Bytes accumulated by the MA at each sensor (d, in bytes)        100
Data fusion coefficient (f)                                     1
Figure 1. Java-based simulation of MA-based distributed data fusion algorithms: (a) LCF output; (b) GCF output; (c) the trees constructed by NOID (four trees, each assigned to an individual MA); (d) NOID output, where the four trees created on the previous step are traversed in post-order.

Figure 1 illustrates representative screens of our Java-based simulator that draw the output of the three MA-based distributed data fusion algorithms. The ellipse in Figure 1b denotes the network center. Notably, GCF typically suggests wireless hops among distant sensor nodes which are not within mutual transmission range and, hence, must be routed through intermediate nodes, thereby frequently requiring complex routing decisions and increasing the overall latency and energy consumption. The same usually applies to the last hops of MA itineraries suggested by LCF. In contrast, NOID tends to construct itineraries with medium-distance hops (that is usually the case for relatively dense networks) wherein travelling MAs hardly ever migrate from one sensor to another through intermediate nodes.

The output of NOID (shown in Figure 1d) involves considerably shorter overall itinerary length than GCF, yet larger than that of LCF. However, the four itineraries of NOID result in smaller overall cost (the cost of the data fusion task is based on Equation (1)). It is stressed that, unlike LCF and GCF, the output of NOID is not always the same for a given network topology. For instance, when changing the amount of data collected from each sensor to d=5 and d=200 bytes, NOID proposes one and seven near-optimal agent itineraries, respectively (see Figure 2).

Figure 2. Output of NOID algorithm for: (a) d = 5 bytes; (b) d = 200 bytes.

[Figure 3(a) plot: data fusion cost (millions) versus number of sensors (20 to 200), d = 50 bytes; series: LCF, GCF, NOID.]
[Figure 3(b) plot: data fusion cost (millions) versus number of sensors (20 to 200), d = 1000 bytes; series: LCF, GCF, NOID.]
Figure 3. Comparison of LCF, GCF and NOID algorithms in terms of their overall data fusion cost for (a) s=1000 bytes, d=50 bytes, (b) s=d=1000 bytes.

Figure 3 compares the LCF, GCF and NOID algorithms in terms of their respective overall data fusion cost. For a relatively low d/s ratio (0.05 in Figure 3a), the cost saving offered by NOID over LCF and GCF is 26.3% and 466.6%, respectively, for a network size of 100 sensors and increases to 34.9% and 571.1% for 200 sensors. When the ratio increases to d/s=1 (see Figure 3b), the performance gain of NOID over LCF and GCF becomes 480.9% and 2614.3%, respectively, for 200 sensors. For d/s=10 (not shown), the gain of NOID increases to 1856.5% and 8730.2%, respectively.

A last set of experiments evaluates the overall response time of the LCF, GCF and NOID algorithms for completing data fusion tasks. Response time is calculated as the sum of the MAs instantiation delay, processing delay, MAs transmission delay and propagation delay:

\[ t_{overall} = t_{inst} + t_{proc} + t_{trans} + t_{prop} \tag{5} \]

The MAs instantiation delay is related to the number of MAs involved in the data fusion task (in our experiments it takes 10 msec to instantiate each MA object). Hence, it is constant for the LCF and GCF algorithms, which always instantiate a single MA that visits the whole set of sensor nodes, while for NOID it depends on the network scale and the d/s ratio, which dictate the number of proposed itineraries. Processing delay (time needed for the MA to complete its data fusion task on each sensor) is constant (50 msec in our experiments). Transmission delay depends on the network transfer rate and the current size of the MA (i.e. the MA's code size plus the amount of data accumulated within the MA's state). Finally, propagation delay depends on the physical distance covered in successive MA migrations (i.e. on the overall itinerary length).

Response time measurements are depicted in Figure 4. In both graphs, the response times of LCF and GCF almost coincide: LCF only involves slightly decreased propagation delay compared to GCF, since it derives shorter itinerary lengths. As the d/s ratio increases (see Figure 4b), the response time gain of NOID over LCF and GCF increases drastically, as the transmission time dominates (for LCF and GCF) over the other delay parameters. That is, although NOID dispatches a large number of MAs, thereby increasing tinst, these MAs work in parallel, while each of them visits a small set of sensors (unlike LCF and GCF, where a single MA performs a number of hops equal to the number of sensors). Hence, in NOID, by the end of their itinerary MAs have not collected large chunks of data, which considerably decreases the associated transmission delay.

[Figure 4(a) plot: overall response time (sec) versus number of sensors (20 to 200), d = 50 bytes; series: LCF, GCF, NOID.]
[Figure 4(b) plot: overall response time (sec) versus number of sensors (20 to 200), d = 1000 bytes; series: LCF, GCF, NOID.]

Figure 4. Comparison of LCF, GCF and NOID algorithms in terms of the overall response time for (a) d=50 bytes, s=1000 bytes, (b) d=1000 bytes, s=1000 bytes.

V. CONCLUSIONS

In this article we presented an improved version of NOID, an efficient heuristic algorithm that derives near-optimal itineraries for MAs performing incremental data fusion in WSN environments. Although NOID only considers spatial information for designing MA itineraries, it minimizes the energy consumption involved in agents' transmission, since the transmission power required to transmit data between pairs of sensors increases with their physical distance. NOID has been extensively evaluated through simulation tests and has been shown to outperform alternative existing MA-based approaches in terms of both data fusion cost and the associated overall response time.

As future work, we intend to investigate the applicability of NOID in object tracking applications, where MA itineraries will only include sensors with increased signal strength (high target detection accuracy) and sufficient energy availability. Another future research direction will involve the implementation of NOID in real WSN environments. In particular, a set of sensor nodes capable of hosting and providing an execution environment for MAs programmed in Java [9] will form the basis of our experimental testbed.

REFERENCES

[1] F. Akyildiz, W. Su, Y. Sankarasubramaniam, E. Cayirci, “A survey on sensor networks”, IEEE Communications Magazine, pp. 102-114, August 2002.
[2] L.R. Esau, K.C. Williams, “On teleprocessing system design. Part II- A method for approximating the optimal network”, IBM Systems Journal, 5, 142–147, 1966.
[3] A. Fuggeta, G.P. Picco, G. Vigna, “Understanding Code Mobility”, IEEE Transactions on Software Engineering 24(5), pp. 346–361, 1998.
[4] D. Gavalas, G. Pantziou, C. Konstantopoulos, B. Mamalis, “A Method for Incremental Data Fusion in Distributed Sensor Networks”, Proceedings of the 3rd IFIP Conference on Artificial Intelligence Applications & Innovations (AIAI’2006), pp. 635-642.
[5] A. Kershenbaum, “Telecommunications Network Design Algorithms”, McGraw-Hill, 1993.
[6] D. Milojicic, “Mobile agent applications”, IEEE Concurrency, 7(3), July-Sep. 1999.
[7] H. Qi, S.S. Iyengar, K. Chakrabarty, “Multi-resolution data integration using mobile agents in distributed sensor networks”, IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Rev., 31(3), pp. 383-391, August 2001.
[8] H. Qi, F. Wang, “Optimal itinerary analysis for mobile agents in ad hoc wireless sensor networks”, Proceedings of the International Conference on Wireless Communications, pp. 147-153, 2001.
[9] Sun Microsystems, Sun Spot project home page, http://www.sunspotworld.com/.
[10] T. Umezawa, I. Satoh, Y. Anzai, “A Mobile Agent-Based Framework for Configurable Sensor Networks”, Proceedings of MATA’02, pp. 128-140, October 2002.
[11] Q. Wu, N. Rao, J. Barhen, S. Iyengar, V. Vaishnavi, H. Qi, K. Chakrabarty, “On Computing Mobile Agent Routes for Data Fusion in Distributed Sensor Networks”, IEEE Transactions on Knowledge and Data Engineering, 16(6), pp. 740-753, June 2004.