Ant System based Anycast Routing in Wireless Sensor Networks
Luo Juan, Song Chen, Zhou Chao
School of Computer and Communication, Hunan University, Changsha, China
Email: [email protected], [email protected], [email protected]

Abstract
Anycast is a mechanism that delivers data packets to the nearest of a group of interfaces sharing the same anycast address. Ant colony system, a population-based algorithm, provides a natural and intrinsic way of exploring the search space when determining an optimal anycast tree. In this paper, we propose a sink selection heuristic algorithm called Minimum Ant-based Data Fusion Tree (MADFT) for energy-constrained wireless sensor networks. Different from existing schemes, MADFT not only optimizes over the data transmission cost, but also incorporates the cost of data fusion, which can be significant for emerging sensor networks with vectorial data and/or security requirements. Simulation shows that the algorithm performs well and provides a near-optimal solution.
2. System Model and The Anycast Problem Formulation
2.1 Network Model
A sensor network can be modeled as a graph G = (V, E), where V denotes the set of nodes and E the set of edges representing the communication links between pairs of sensors. We assume that a set S ⊆ V of k nodes are data sources of interest, and that the data sensed at each source must be gathered at its closest sink node t ∈ V, where it is further processed.
Keywords: wireless sensor networks, data aggregation, anycast routing, ant colony system
1. Introduction
Wireless sensor networks have attracted a plethora of research efforts due to their vast potential applications. In particular, extensive research has been devoted to energy-efficient routing algorithms for data gathering [1]. While some of these approaches assume statistically independent information and have developed shortest-path-tree-based routing strategies [2], others have considered the more realistic case of correlated data gathering [3]. For the case when multiple sinks are present, there has been little research to date on optimal sink selection for anycast routing, where anycast is defined in the sense that each source node must send all its locally generated data to only one sink. Moreover, the cost of data aggregation may not be negligible for certain applications: the energy consumption of beamforming algorithms for acoustic signal fusion has been shown to be of the same order as that of data transmission [4-5]. In this paper, we design a sink selection heuristic algorithm called Minimum Ant-based Data Fusion Tree (MADFT) for energy-constrained wireless sensor networks.
Fig 1. Anycast Network Model
In Fig. 1, gray circles represent sensor nodes generating source data, dashed lines are possible communication links among the nodes, and the solid lines form part of a possible anycast routing tree for data gathering. We assume that data aggregation can potentially take place at any intermediate node along the route: an intermediate node can exploit the redundancy among the data of multiple child nodes and aggregate it all into one compressed data stream. Since data fusion is performed by intermediate nodes to aggregate their own data with their children's, to avoid confusion we use w(x) to denote the temporary weight of a node before data fusion and w'(x) to denote the weight of a node after data fusion.
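The network model of Section 2.1 can be sketched in a few lines. The topology below is hypothetical (the edges of Fig. 1 are not recoverable here), and "closest" is taken to mean fewest hops, which is an assumption made only for this illustration; the function name `nearest_sink` is likewise invented.

```python
from collections import deque


def nearest_sink(adjacency, sinks):
    """Multi-source BFS: map every reachable node to (closest sink, hop count).

    'Closest' means fewest hops here, which is an assumption of this sketch.
    Ties are broken by the order in which the sinks are listed.
    """
    assignment = {t: (t, 0) for t in sinks}
    queue = deque(sinks)
    while queue:
        u = queue.popleft()
        sink, hops = assignment[u]
        for v in adjacency[u]:
            if v not in assignment:
                assignment[v] = (sink, hops + 1)
                queue.append(v)
    return assignment


# Hypothetical topology: an undirected graph G = (V, E) given as adjacency lists.
adjacency = {
    "sink1": ["a"],
    "a": ["sink1", "b"],
    "b": ["a", "d"],
    "c": ["d"],
    "d": ["b", "c", "sink2"],
    "sink2": ["d"],
}
assignment = nearest_sink(adjacency, ["sink1", "sink2"])
```

In this toy graph, node a is assigned to sink1 and nodes c and d to sink2; node b is equidistant from both sinks, so its assignment depends on the tie-breaking order.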
2.2 Correlation and Data Aggregation
To accommodate a variety of applications, we do not constrain ourselves to any particular model on data aggregation. The only assumption we make is that if the data of nodes u and v is fused at v, the resulting amount of data is
1-4244-1312-5/07/$25.00 © 2007 IEEE
not less than either of the component data. In other words, we assume
w'(v) ≥ max{w(u), w(v)}. (1)
Evidently we also have w'(v) ≤ w(u) + w(v); otherwise, aggregation would not be performed at all and the problem becomes trivial.
We assume that the aggregation of multiple inputs at a particular point is performed step by step, so the above formula is adequate to characterize the fusion process. The justification of this assumption lies in the resource limitations of sensor nodes. First, storing multiple inputs and fusing them at once may be difficult for sensors, as it requires large memory and additional processing power. Second, data reported from different sensors cannot arrive at the same time, either because of the shared wireless medium or because of the different intermediate nodes and processing along each route. Therefore, fusing the existing data with newly received data as it arrives is a natural solution. In other words, in step-by-step fusion, the fusion point first aggregates its own data with one input, and then fuses the aggregation result with another input. This process is repeated until all the inputs are aggregated. For example, in Fig. 1, node d fuses the data from node c with its original data and saves the result as its temporary data; node d then aggregates this again with the data from node b and sends the final result along its path to sink 2.
The transmission cost over an edge e depends on two factors: the unit cost of the link for transmitting data from u to v, and the amount of data to be transmitted. The latter factor is simply w(e), the amount of data sent over e. To simplify our model, we abstract the unit cost as c(e), and thus the transmission cost t(e) is:
t(e) = w(e)c(e). (2)
Notice that c(e) is link-dependent and hence can accommodate various conditions per link, for example, different distances between nodes and local congestion situations. The fusion cost over an edge e depends on the amount of data to be fused as well as the algorithms utilized.
In this paper, the fusion cost is expressed by a general function q(x, y), such that the cost of fusing the data of nodes u and v at node v is given as
f(e) = q(w(u), w(v)). (3)
Although both transmission and fusion costs are link-based, we remark that they cannot simply be combined into a single edge weight, so existing techniques based solely on the transmission cost cannot be relied on to solve this problem. The reason is that the fusion cost on an edge is determined by the inputs of the fusion function. The inputs include both the incoming data from other nodes and the data produced by the fusion point itself. In contrast, the transmission cost on an edge is determined only by the weight of the start point of the edge. In other words, for a fusion point, the transmission cost is determined only by the output of the fusion function. This can be seen directly from Equations (2) and (3).
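As an illustration of step-by-step fusion and of Equations (1) to (3), the sketch below fuses the inputs of one node one at a time. The aggregation model in `fuse` (the smaller input is compressed by a factor `rho`) is a hypothetical choice that merely satisfies the bounds assumed above; the paper does not commit to any particular model. The linear fusion cost q(x, y) = eta * (x + y) is the one used later in the simulation section. All numbers are in arbitrary units.

```python
def fuse(w_u: float, w_v: float, rho: float) -> float:
    """Hypothetical aggregation model (an assumption, not the paper's).

    For 0 <= rho <= 1 the result lies between max(w_u, w_v) and w_u + w_v,
    which is the only property required by Eq. (1) and the bound after it.
    """
    return max(w_u, w_v) + (1.0 - rho) * min(w_u, w_v)


def transmission_cost(w_e: float, c_e: float) -> float:
    """Eq. (2): t(e) = w(e) * c(e)."""
    return w_e * c_e


def fusion_cost(w_u: float, w_v: float, eta: float) -> float:
    """Eq. (3) with the linear choice q(x, y) = eta * (x + y)."""
    return eta * (w_u + w_v)


def fuse_step_by_step(own: float, inputs: list, rho: float, eta: float):
    """Fuse the inputs one at a time; return (final weight, total fusion cost)."""
    weight, cost = own, 0.0
    for w_in in inputs:
        cost += fusion_cost(w_in, weight, eta)
        new_weight = fuse(w_in, weight, rho)
        # Check the assumed bounds: max{w(u), w(v)} <= w'(v) <= w(u) + w(v).
        assert max(w_in, weight) <= new_weight <= w_in + weight
        weight = new_weight
    return weight, cost


# Node d fuses the data from c first, then the data from b (as in Fig. 1).
final_weight, total_fusion = fuse_step_by_step(
    own=4000.0, inputs=[4000.0, 4000.0], rho=0.5, eta=15.0
)
```

Note that the second fusion step is charged on the already aggregated weight of d, which is why the fusion cost cannot be folded into a fixed per-edge weight.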
Given the source node set S and the sink set T, our objective is to design an anycast routing algorithm that minimizes the energy consumption when delivering data from all source nodes in S to the sinks in T. Mathematically, the goal is to find a connected subgraph G* = (V*, E*) ⊆ G that contains all sources (S ⊆ V*) and the sinks (t ∈ V*), such that the following sum is minimized:
$\sum_{e \in E^*} (f(e) + t(e))$ (4)
Different from existing work, the objective function includes both transmission and fusion costs. Our objective next is therefore to find an optimal anycast routing tree that solves Equation (4), that is, one that minimizes the total energy consumption.
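The objective in Equation (4) can be evaluated for a given candidate tree as sketched below. The tree, the unit costs, and the aggregation model (the smaller input compressed by `rho`) are hypothetical inputs chosen for illustration; the fusion cost uses the linear form q(x, y) = eta * (x + y), and Eq. (3) is applied literally on every edge, including edges into relays and sinks.

```python
from collections import defaultdict


def tree_cost(parent, own_data, unit_cost, sinks, rho=0.5, eta=15.0):
    """Sum f(e) + t(e) over all edges of a routing forest rooted at the sinks.

    parent:    dict child -> parent (sinks do not appear as keys)
    own_data:  dict node -> locally generated data (missing means 0)
    unit_cost: dict (child, parent) -> c(e)
    rho, eta:  parameters of the hypothetical aggregation and fusion models
    """
    children = defaultdict(list)
    for child, par in parent.items():
        children[par].append(child)

    total = 0.0

    def outgoing_weight(node):
        nonlocal total
        weight = own_data.get(node, 0.0)  # temporary weight w(node)
        for child in children[node]:
            w_in = outgoing_weight(child)  # data arriving over e = (child, node)
            total += w_in * unit_cost[(child, node)]  # t(e), Eq. (2)
            total += eta * (w_in + weight)  # f(e), Eq. (3), linear q
            # Step-by-step fusion under the assumed aggregation model.
            weight = max(w_in, weight) + (1.0 - rho) * min(w_in, weight)
        return weight

    for sink in sinks:
        outgoing_weight(sink)
    return total


# Hypothetical tree: c -> d, b -> d, d -> sink2.
parent = {"c": "d", "b": "d", "d": "sink2"}
own_data = {"b": 4000.0, "c": 4000.0, "d": 4000.0}
unit_cost = {("c", "d"): 2.0, ("b", "d"): 1.0, ("d", "sink2"): 3.0}
cost = tree_cost(parent, own_data, unit_cost, sinks=["sink2"])
```

Comparing this value across candidate trees, for example trees rooted at different sinks, is what a sink selection heuristic has to do implicitly.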
3. MADFT Algorithm Design
In ACO (ant colony optimization), a colony of artificial ants constructs solutions guided by pheromone trails and heuristic information. Individual ants are neither strong nor very intelligent, yet together they make the colony a highly organized society. This behavior of real ant colonies is exploited in artificial ant colonies in order to solve optimization problems [6,7].
A. Ant Colony Optimization for Optimal Anycast Tree
In MADFT, given a set of source nodes, the ants construct multiple aggregation trees rooted at the sinks; each tree is evaluated with the cost defined above (transmission cost plus fusion cost) and is the local best aggregation tree of its iteration. The algorithm iterates to search for the global best, and on convergence it yields the optimal anycast routing trees from the combinatorial search space. The best anycast routing trees constructed by the ants across iterations are therefore remembered. Furthermore, giving early aggregation more weight in the cost function makes the search converge on optimal aggregation points.
The algorithm works as follows. First, ants are assigned to the source nodes. One ant constructs a route, and the other ants search for the nearest point on the previously discovered route. The next hop is chosen by a probability function composed of pheromone and cost terms, so as to find the path of minimum total cost. The points where multiple ants join are aggregation nodes. If a source node lies on a previously discovered route, the ant at that node stops searching, because the optimal route has already been found by the previous ant. If a source node does not lie on a previously discovered route, the ant at that node tries to find the shortest route to the closest sink, or the closest aggregation point on a route found by previous ants. Each discovered path is then given a weight, which serves as the heuristic for reaching the destination sinks or the nearest aggregation point, while the pheromone trail is the means of communicating the discovered route to other ants.
Ants tend to follow routes with more pheromone, so the search eventually converges to the optimal route, while the pheromone on non-optimal routes evaporates with time. We then repeat this process on the new set until the optimal anycast tree rooted at the sinks is achieved. Data aggregation may occur at any node on the optimal tree. The detailed algorithm is presented below.
An artificial ant is placed at a randomly chosen node and, during each iteration, chooses its next node according to the following rule. First, each ant located at node i checks whether any destination sink is among the neighbors of node i. If some destination sink t is a neighbor of node i, the ant stops searching, which prevents a node near one sink from taking a long route to another sink. Otherwise, the ant hops to a node j selected among the neighbors that have not yet been visited (excluding destination sinks) according to a probability. The probability that ant k at node i moves to node j is
$$p_{ij}^{k}(t) = \frac{[\tau_{ij}(t)]^{\alpha} \, [\eta_{ij}(t)]^{\beta}}{\sum_{l \in N_i^k} [\tau_{il}(t)]^{\alpha} \, [\eta_{il}(t)]^{\beta}} \qquad (5)$$
where $\eta_{ij}(t) = 1/d_{ij}$, and $d_{ij}$ is a priori known heuristic information, namely the cost function (transmission cost and fusion cost); $\tau_{ij}(t)$ is the pheromone strength of the edge between node i and node j at time t. The parameters $\alpha$ and $\beta$ determine how much the pheromone trail and the heuristic information influence the ants' behavior. $N_i^k$ represents the feasible neighborhood of ant k, that is, the set of neighbors that ant k has not yet visited. Formula (5) expresses that the transition probability from i to j increases with the pheromone and decreases with the cost: an ant prefers more pheromone and lower cost. Each ant updates the pheromone intensity of every edge, using the following formula:
$$\tau_{ij} = \tau_{ij} + \delta_{ij}, \qquad \delta_{ij} = \begin{cases} \dfrac{1}{m_k m_v} & j \notin tabu_k(t) \\ 0 & \text{otherwise} \end{cases} \qquad (6)$$
where $m_k$ is the data of the random node at which ant k was placed, and $m_v$ is the data of the last node before the sink on the path discovered by ant k. $tabu_k(t)$ is the tabu list of nodes that ant k has visited up to time t; it prevents the ant from oscillating between two nodes.
Once all ants have completed one cycle and built their tours, the global updating rule is applied: pheromone is updated on all edges according to
$$\tau_{ij} = (1 - \rho) \cdot \tau_{ij} + \rho \cdot \text{total cost} \qquad (7)$$
where $\rho$ is an ACO parameter and total cost is obtained from the first pass. For the other nodes, which are not visited in the first pass, the pheromone only evaporates, more rapidly for lower values of $1 - \rho$:
$$\tau_{ij} = (1 - \rho) \cdot \tau_{ij} \qquad (8)$$
$d_{ij}$ is a weighted function of the cost (transmission and fusion cost) of the edge e between nodes i and j, as follows:
$$d_{ij} = t(e) + f(e) \qquad (9)$$
Each ant chooses the next node with small cost in order to minimize the total energy consumption. The algorithm keeps running until the best solution is found or the defined termination condition is reached.
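The node-selection rule and the pheromone bookkeeping described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the dictionary-based data layout, and the default values of `alpha` and `beta` are invented here, and the quantity multiplied by ρ in Eq. (7) is passed in by the caller as `deposit` rather than computed.

```python
import random


def transition_probabilities(i, neighbors, tau, d, visited, alpha=1.0, beta=2.0):
    """Eq. (5): probability of moving from node i to each unvisited neighbor j.

    tau[(i, j)] is the pheromone on edge (i, j); d[(i, j)] is the edge cost
    of Eq. (9), so the heuristic term is eta_ij = 1 / d[(i, j)].
    """
    feasible = [j for j in neighbors[i] if j not in visited]
    scores = {
        j: (tau[(i, j)] ** alpha) * ((1.0 / d[(i, j)]) ** beta)
        for j in feasible
    }
    norm = sum(scores.values())
    return {j: s / norm for j, s in scores.items()} if norm > 0 else {}


def choose_next(i, neighbors, tau, d, visited, sinks, rng, alpha=1.0, beta=2.0):
    """Stop at an adjacent sink if there is one, otherwise sample from Eq. (5)."""
    for j in neighbors[i]:
        if j in sinks:
            return j
    probs = transition_probabilities(i, neighbors, tau, d, visited, alpha, beta)
    if not probs:
        return None  # dead end: no unvisited neighbor left
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[j] for j in nodes], k=1)[0]


def global_update(tau, visited_edges, rho, deposit):
    """Eq. (7) on visited edges, Eq. (8) (evaporation only) on all others."""
    return {
        e: (1 - rho) * value + (rho * deposit if e in visited_edges else 0.0)
        for e, value in tau.items()
    }


# Toy example: node a has two neighbors with equal pheromone but unequal cost.
neighbors = {"a": ["b", "c"]}
tau = {("a", "b"): 1.0, ("a", "c"): 1.0}
d = {("a", "b"): 1.0, ("a", "c"): 2.0}
probs = transition_probabilities("a", neighbors, tau, d, visited={"a"}, alpha=1.0, beta=1.0)
new_tau = global_update(tau, visited_edges={("a", "b")}, rho=0.5, deposit=2.0)
```

With equal pheromone, the cheaper edge (a, b) is chosen with probability 2/3 when alpha = beta = 1, which matches the statement that an ant prefers more pheromone and lower cost.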
4. Simulation
In this section, we compare MADFT with other algorithms, namely Greedy Incremental Tree (GIT) and Center at Nearest Source (CNS).
4.1 Simulation Environment
The algorithm is simulated in C++ on a sensor network of 100 nodes. The neighborhood is obtained from a random topology whose connectivity we ensure. We assume that each stochastically selected source node produces one 500-byte packet of original sensed data in each round and sends the data to the closest sink located in the sensor area. We instantiate the unit transmission cost on each edge, c(e), using the first order radio model presented in [8]. The transmission cost for sending J amount of information from one node to another node at distance d is given by $J(\beta_1 d^{r} + \varepsilon)$ when $d < r_c$. We set r = 2 and $\beta_1$ = 100 pJ/bit/m² to calculate the energy consumption of the transmit amplifier. The typical value of $\varepsilon$ is 10 to 100 nJ/bit, and it is set to 50 nJ/bit in our simulation. If two nodes are more than $r_s$ apart, the correlation coefficient is simply 0. Otherwise, the correlation coefficient is $\rho_1 = 1 - d/r_s$, where d denotes the distance between the nodes. By varying the correlation range $r_s$, we can control the average correlation coefficient of the network. In order to distinguish the correlation between data originating from two nodes from that among aggregated data, we use a "forgetting" factor on the correlation coefficient among aggregated data. For the fusion cost, in the simulation we assume that $q(x, y) = \eta \cdot (x + y)$, where $\eta$ denotes the fusion cost of unit data. In other words, the fusion cost is linear in the total amount of data to be fused.
4.2 Simulation result
Figure 2 shows that MADFT performs well as the number of sinks increases from 1 to 5. This occurs because the traffic load is balanced among the 5 sinks.
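The cost settings of Section 4.1 can be written down directly, as in the sketch below. The parameter values are those quoted in the text (the 15 nJ/bit unit fusion cost is the value used for the later figures); the function names and the optional `r_c` check are assumptions added for illustration, and the "forgetting" factor is not modelled because its form is not given.

```python
def unit_transmission_cost(d, beta1=100e-12, r=2, eps=50e-9, r_c=None):
    """Energy in joules to send one bit over distance d (metres).

    First order radio model as quoted in the text: beta1 * d**r + eps,
    stated for d < r_c. The range check is applied only if r_c is given.
    """
    if r_c is not None and d >= r_c:
        raise ValueError("the model is only stated for d < r_c")
    return beta1 * d ** r + eps


def correlation(d, r_s):
    """rho_1 = 1 - d / r_s within the correlation range r_s, and 0 beyond it."""
    return 1.0 - d / r_s if d <= r_s else 0.0


def fusion_cost(x_bits, y_bits, eta=15e-9):
    """q(x, y) = eta * (x + y): linear in the total amount of data fused."""
    return eta * (x_bits + y_bits)


packet_bits = 500 * 8  # one 500-byte packet per source per round
tx_energy = packet_bits * unit_transmission_cost(10.0)  # one hop of 10 m
fuse_energy = fusion_cost(packet_bits, packet_bits)  # fusing two packets
```

Under these settings, sending one packet over 10 m costs about 0.24 mJ, while fusing two packets costs about 0.12 mJ, so the two terms are of comparable size.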
Fig 2. Comparison of total cost of numbers of sinks (x-axis: numbers of sink; y-axis: total cost (mJ); curves: CNS, GIT, MADFT)

Fig 3. $r_s$ = 1000 m to simulate $\rho_1 \to 1$ (x-axis: communication range (m); y-axis: total cost (mJ); curves: CNS, GIT, MADFT)
In Figure 3, we study the impact of network connectivity on the performance of the routing algorithms. We set $\eta$, the fusion cost for unit data, to 15 nJ/bit and consider the case $r_s$ = 1000 m. As MADFT explicitly considers fusion cost, this phenomenon can be captured and exploited. In contrast, CNS and GIT result in fixed routing structures determined by the network topology, and hence cannot adapt to changes in data correlation. Therefore, when $\rho_1 \to 1$, MADFT performs better than all other algorithms. MADFT can balance between data aggregation and direct transmission and thus produces better performance. A longer transmission range, and thus better network connectivity, favors MADFT, as it can employ more direct shortest paths to avoid unnecessary fusion cost at each node.

Fig 4. Comparison of total cost of cost ratio (x-axis: correlation range (m); y-axis: total cost (mJ); curves: CNS, GIT, MADFT)
In Figure 4, we fix the transmission range of the sensor nodes and study the impact of the correlation coefficient on the anycast routing performance. Here, the unit fusion cost $\eta$ is set to 15 nJ/bit. We increase $r_s$ from 1 to 1000 m, which corresponds to varying $\rho_1$ from 0 to 1.
As described in the previous section, the fusion cost per unit data may vary widely from network to network. Our experiments show that MADFT adapts well to a wide range of fusion costs and is hence applicable to a variety of applications.

6. Conclusion
We propose Minimum Ant-based Data Fusion Tree (MADFT), a routing algorithm for gathering correlated data in sensor networks. MADFT not only optimizes over both the transmission and fusion costs, but also adopts the ant colony system to search for the optimal solution. Analytical and experimental results show that MADFT adapts well to varying network conditions and is an energy-efficient algorithm. As an ongoing effort, we are designing an algorithm based on MADFT that can be executed in dynamic sensor networks.

Acknowledgements
This work is partially supported by the National Natural Science Foundation of China, Grant No. 60673061; the National Research Foundation for the Doctoral Program of Higher Education of China, Grant No. 2006053202; the National Science Foundation of Hunan Province of China under Grant Nos. 06JJ50111 and 06JJ50113; and the Scientific and Technological Project in Changsha City under Grant No. K069015-12.
References
[1] I. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci, “A Survey on Sensor Networks,” IEEE Comm. Magazine, vol. 40, no. 8, Aug. 2002, pp. 102-114.
[2] W. Heinzelman, J. Kulik, and H. Balakrishnan, “Adaptive Protocol for Information Dissemination in Wireless Sensor Networks,” in Proc. ACM MobiCom Conf., Washington, Aug. 1999, pp. 174-185.
[3] P.V. Rickenbach and R. Wattenhofer, “Gathering Correlated Data in Sensor Networks,” in Proc. ACM Joint Workshop Foundations of Mobile Computing (DIALM-POMC ’04), Philadelphia, Oct. 2004, pp. 60-66.
[4] B. Krishnamachari, D. Estrin, and S. Wicker, “Impact of Data Aggregation in Wireless Sensor Networks,” in Proc. 22nd Int’l Conf. Distributed Computing Systems, July 2002, pp. 575-578.
[5] A. Wang, W.B. Heinzelman, A. Sinha, and A.P. Chandrakasan, “Energy-Scalable Protocols for Battery-Operated Microsensor Networks,” J. VLSI Signal Processing, vol. 29, no. 3, Nov. 2001, pp. 223-237.
[6] A. Colorni, M. Dorigo, and V. Maniezzo, “Distributed optimization by ant colonies,” in Proc. 1st European Conf. on Artificial Life, Paris, France: Elsevier Publishing, 1991, pp. 134-142.
[7] A. Colorni, M. Dorigo, and V. Maniezzo, “An investigation of some properties of an ant algorithm,” in Proc. PPSN ’92, Brussels, Belgium: Elsevier Publishing, 1992, pp. 509-520.
[8] S. Lindsey and C.S. Raghavendra, “Energy efficient broadcasting for situation awareness in ad hoc networks,” in Proc. ICPP ’01, Valencia, Spain, Sept. 2001, pp. 149-155.