CGC: centralized genetic-based clustering protocol for wireless sensor ...

Telecommun Syst DOI 10.1007/s11235-015-0102-x

CGC: centralized genetic-based clustering protocol for wireless sensor networks using onion approach

Majid Hatamian1 · Hamid Barati1 · Ali Movaghar2 · Alireza Naghizadeh3

© Springer Science+Business Media New York 2015

Abstract Wireless sensor networks consist of a large number of nodes that are distributed sporadically over a geographic area. The energy of every node in the network is limited. For this reason, providing a method of communication between the nodes and the network administrator that manages energy consumption is crucial. Clustering is one of the high-performance methods proposed for this purpose. The main challenge in clustering is dividing the network into several clusters, each managed by a cluster head (CH). In this paper, a centralized genetic-based clustering (CGC) protocol using the onion approach is proposed. The CGC protocol selects appropriate nodes as CHs according to three criteria, which ultimately increases the network lifetime. This paper investigates the genetic algorithm (GA) as a dynamic technique to find optimum CHs. Furthermore, an innovative fitness function based on the specified parameters is presented. The chromosome that minimizes the fitness function is selected by the base station (BS), and its nodes are introduced to the whole network as the proper CHs. After the selection of CHs and cluster formation, for upper-level routing between CHs, we define a novel concept called the Onion Approach. We divide the network into several onion layers in order to reduce the communication overhead among CH nodes. Simulation results show that the proposed method, implemented with GA and the onion approach, is more efficient than previous methods: the CGC protocol achieves significant improvements in terms of the running time of the algorithm, the number of nodes alive, first node death, last node death, the number of packets received by the BS, and the energy consumption of the network.

Keywords Wireless sensor network · Clustering · Graph · Genetic algorithm · Chromosome · Centralized · Onion approach

1 Department of Computer Engineering, Dezful Branch, Islamic Azad University, Dezful, Iran
2 Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
3 Department of Computer Science, Rutgers University, New Brunswick, NJ, USA

1 Introduction

In WSNs, nodes have limited battery power, limited transmission range, and limited processing and storage capabilities. Selecting a routing strategy is therefore an important issue for the efficient delivery of packets to their destination. Moreover, such a strategy, regardless of the application, must try to maximize the network lifetime and minimize the energy consumption of the overall network. Sensor nodes are typically battery-powered and should operate unattended for a relatively long period of time. In most cases, it is very difficult or even impossible to change or recharge the batteries [1–3]. In a two-tier WSN, the sensor nodes are divided into several groups called clusters. Each cluster has a manager, known as the CH. All sensor nodes sense local data and send it to their corresponding CH. The CHs then aggregate the local data and finally send it to the BS, either directly or via other CHs [4,5].


Fig. 1 Single-hop data transmission from CHs to BS (legend: member node, CH node, BS)

Routing in WSNs is very challenging due to the inherent features that distinguish these networks from other wireless networks. Given the relatively large number of sensor nodes, it is very important to implement a suitable routing mechanism. Routing mechanisms should consider the inherent features of WSNs (e.g. resource and topological constraints) along with application and architecture requirements. One of the most effective routing mechanisms for reducing energy consumption and increasing the network lifetime is the hierarchical mechanism. In hierarchical protocols, nodes are grouped into clusters and each cluster has a CH. For example, Fig. 1 shows a WSN with hierarchical routing [6–8]. The CHs are used for higher-level communication and for reducing the traffic overhead. Hierarchical routing has many advantages, including: (a) smaller routing tables stored at each individual sensor node, (b) savings in communication bandwidth, and (c) the use of optimized management strategies to prolong the battery life of the sensor nodes [9–11]. In this paper, a centralized genetic-based clustering (CGC) protocol using the onion approach is proposed. The CGC protocol performs the clustering operation and CH selection according to three criteria: (1) The residual energy of the nominated node should be higher than the average energy of the whole network. (2) A graph is formed between the nodes that are within radio communication range of each other, and a weight is assigned to each edge of this graph. Then, the sum of weights for edges (SWE) connected to each node is calculated. The SWE of a node must be maximal for it to be considered a proper candidate for being CH. SWE is used as one of the criteria for


selecting the optimal CH, because it keeps the transmission cost between the members and the CHs affordable. (3) The node should not be an outlier. We define a function that indicates whether a node is an outlier or not. This criterion is used because nodes that cover a large number of other nodes have a greater chance of being CH. Thus, the surface area covered by the CHs increases, which makes the network more stable. The genetic algorithm (GA) is used to search a complicated search space and select the optimum CHs. We propose an innovative fitness function that is to be minimized. The chromosome that minimizes the fitness function is selected by the BS, and its nodes are introduced to the whole network as the proper CHs. In fact, the innovative fitness function leads to the selection of an appropriate chromosome. Furthermore, a novel concept called the onion approach is proposed for upper-level routing (between CHs). This approach divides the network into several layers, called “Onion Layers”, and reduces the transmission costs between CHs. Simulation results show that the CGC protocol achieves significant improvements in terms of the running time of the algorithm, the number of nodes alive, first node death (FND), last node death (LND), the number of packets received by the BS, and the energy consumption of the network. Our main contributions can be summarized as follows:

• In the proposed fitness function, we use a parameter that considers the weight of the edges between the sensor nodes and their CHs. This is in contrast to the fitness functions used in traditional genetic-based clustering protocols.
• We use a new concept called the onion approach, which has not been used in any previous work, as a method of communication between CH nodes. This method reduces the communication overhead among CHs and makes energy consumption more balanced.
• Together, the proposed fitness function and the onion layering approach make the CGC protocol more favorable than the traditional genetic-based clustering protocols in terms of network lifetime, energy consumption, and packets received by the BS.

The rest of this paper is organized as follows. Section 2 describes related work in hierarchical routing. Section 3 gives a brief description of the genetic algorithm. Section 4 explains the network and energy consumption model of the proposed protocol. Section 5 presents the proposed protocol based on the genetic algorithm (CGC). Section 6 introduces a new concept called the onion approach for upper-level routing (among CHs). In Sect. 7, the proposed protocol is analyzed in terms of time complexity, and its performance is evaluated in terms of the following criteria: the number of dead nodes (or, equivalently, the number of alive nodes), first node death (FND), last node death (LND), the number of packets received by the BS, and the energy consumption of the network. Finally, Sect. 8 concludes this paper.

2 Related work

So far, various methods for clustering in WSNs and increasing the network lifetime have been proposed. The low-energy adaptive clustering hierarchy (LEACH) [12] protocol is one of the most popular hierarchical routing protocols in WSNs. The operation of the LEACH protocol consists of two phases. First, in the setup phase, the clusters are organized and the CHs are selected. In this phase, a sensor node selects a random number between 0 and 1. If this number is less than the threshold T(n), the node becomes a CH. T(n) is calculated as follows:

$$T(n) = \begin{cases} \dfrac{p}{1 - p \times \left(r \bmod \frac{1}{p}\right)} & \text{if } n \in G \\ 0 & \text{otherwise.} \end{cases} \tag{1}$$

In Eq. 1, r is the current round, p is the recommended percentage of CHs, and G is the collection of nodes that have not been selected as a CH in the last 1/p rounds. Second, in the steady state phase, the data is sent to the BS. The steady state phase lasts longer than the setup phase in order to minimize overhead. The LEACH protocol increases the network lifetime compared with earlier protocols and also supports high scalability. Despite these advantages, CHs are selected randomly, so the optimal number and distribution of CHs cannot be ensured. Nodes with low residual energy have the same priority to become a CH as nodes with high residual energy. For this reason, nodes with less remaining energy may be chosen as CHs, with the result that these nodes may die first. Although the LEACH protocol provides random, adaptive and self-organizing clustering, it does not guarantee the number and placement of CHs. Therefore, the LEACH-Centralized (LEACH-C) [13] protocol was proposed. LEACH-C utilizes the BS for cluster formation, unlike LEACH, where nodes organize themselves into clusters. Initially in LEACH-C, the BS receives information about the location and energy level of each node in the network. With this information, the BS finds a predefined number of CHs and configures the network into clusters.
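The LEACH election rule of Eq. 1 can be illustrated with a minimal Python sketch. The function names and the use of `round(1 / p)` as the rotation period are assumptions made for illustration (they presume that 1/p is close to an integer); they are not taken from [12] or from the CGC protocol.

```python
import random


def leach_threshold(p: float, r: int, in_G: bool) -> float:
    """Threshold T(n) of Eq. 1 for round r and recommended CH fraction p.

    in_G is True when the node has not been a CH in the last 1/p rounds.
    """
    if not in_G:
        return 0.0
    period = round(1 / p)  # assumed: 1/p is (close to) an integer number of rounds
    return p / (1 - p * (r % period))


def elects_itself(p: float, r: int, in_G: bool, rng=random) -> bool:
    """A node becomes CH when its random draw in [0, 1) falls below T(n)."""
    return rng.random() < leach_threshold(p, r, in_G)
```

The threshold grows over one rotation period, so a node that has not yet served as CH becomes increasingly likely to be elected towards the end of the period. Note that residual energy does not appear anywhere in the rule, which is the weakness discussed above.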
Centralized balance clustering (CBC) routing protocol based on location was proposed in [14]. In CBC, in order to keep clustering balanced through the whole lifetime of the network and adapt to the non-uniform distribution of sensor nodes, a systemic algorithm for clustering is designed. First, the algorithm determines the cluster number according to condition of the network, and adjusts the hexagonal clustering results to balance the number of nodes of each cluster.

Second, it selects CHs in each cluster based on the energy and distribution of nodes, and optimizes the clustering results to minimize energy consumption. Finally, it allocates suitable time slots for transmission to avoid collisions. In [15], the LEACH protocol was improved with a sleep mode. The authors proposed four new hierarchical clustering topology architectures: random CH and sub-CH (RCHSCH), random CH and max energy sub-CH (RCHMESCH), random CH and sub-CH with sleep mode (RCHSCHSM), and random CH and max energy sub-CH with sleep mode (RCHMESCHSM). The proposed architectures involve three layers and are based on the LEACH architecture. According to the simulation results, the RCHSCH, RCHMESCH, RCHSCHSM and RCHMESCHSM architectures perform better than the LEACH architecture. In RCHSCH, sub-cluster formation is used to mitigate the problem that CHs die quickly in LEACH. RCHSCH is further improved by forming RCHMESCH, wherein SCHs and RSCHs are elected based on the energy of the sensor nodes so that energy consumption can be balanced. Finally, the RCHSCH and RCHMESCH architectures are extended with a sleep mode, based on the correlation of sensor data within sub-clusters, to form RCHSCHSM and RCHMESCHSM. In [16], the authors improved the clustering routing protocol using GA. In their study, the BS uses GA to create energy-efficient clusters for a given number of transmissions in WSNs. Each node is represented as a bit of a chromosome. A population consists of several chromosomes, and the best chromosome is used to generate the next population. Based on the survival fitness, the population transforms into the future generation. Initially, each fitness parameter is assigned a conventional weight. After every generation, the fittest chromosome is evaluated and the weights for each fitness parameter are updated accordingly.
The proposed technique uses a GA to determine the initial set of hierarchical clusters. The authors compared their method with the LEACH protocol in different ways. Their method saves more energy than LEACH, especially as the number of nodes in the network increases. However, since the method proposed in [16] does not consider the density of nodes in different parts of the network while selecting CHs, it may select as CHs nodes that do not cover an adequate number of nodes. In [17], an optimal method for clustering homogeneous WSNs using a multi-objective two-nested GA, named M2NGA, is presented. The top-level algorithm is a multi-objective GA whose goal is to obtain clustering schemes in which the network lifetime is optimized for different delay values. The low-level GA is used in each cluster in order to get the most efficient topology for data transmission from the sensor nodes to the CH. The advantage of M2NGA compared with other heuristic clustering methods is its generality. Despite


the advantage of the method proposed in [17], the relatively large computational overhead of its implementation is considered one of its weaknesses. In [18], the authors proposed a new GA-based clustering algorithm to solve the load balancing problem in WSNs. The algorithm forms clusters in such a way that the maximum load of each gateway is minimized. In the initial population generation phase, they restricted the generation of the initial population by considering the connectivity between the sensor nodes and their CHs. Also, in the mutation phase, the mutation point is selected in such a way that it generates child chromosomes that ensure better load balancing. In [19], the authors investigated the problem of grouping the sensor nodes into clusters to enhance the overall scalability of the network. A selected set of nodes, known as gateway nodes, act as CHs for each cluster, and the objective is to balance the load among these gateways. Load-balanced clustering increases system stability and improves the communication between the various nodes in the network. Their proposed algorithm adopts a centralized approach which assumes that each node is aware of the network topology. They first showed that a special case of the load-balanced clustering problem (in which the traffic load contributed by all sensor nodes is the same) is optimally solvable in polynomial time. They then proved that the general case of the load-balanced clustering problem is NP-hard. In [20], the authors investigated the advantages and disadvantages of the LEACH protocol and then put forward a clustering routing protocol for balancing energy consumption based on simulated annealing and GA. They formed the clusters by simulated annealing and GA and then calculated the center of each cluster. If the energy of a node in the cluster is higher than the average energy of the cluster, it becomes a candidate CH; finally, the candidate CH is chosen as the CH according to its distance from the cluster center. The main operational difference between the protocol proposed in [20] and LEACH is the selection process of CHs: CH selection is performed by simulated annealing and GA. Also, it is based on a centralized control algorithm that is implemented at the BS.

3 An overview of genetic algorithm

Genetic algorithm (GA) is a meta-heuristic, evolutionary search mechanism based on natural selection and genetics. Evolutionary means that an initial population converges to an optimum solution during a specific procedure. Heuristic methods are problem-dependent techniques. As such, they are usually adapted to the problem at hand and try to take full advantage of the particularities of this problem. However, because they are often too greedy, they usually get trapped in a local optimum and thus fail, in general, to obtain the global optimum solution [21,22]. On the other hand, meta-heuristic algorithms (like GA) usually have shorter running times than heuristic algorithms and are very simple. The probabilistic nature of their search is also the reason they are not confined by local optima. The main reason we used GA as an optimization method for the clustering problem in WSNs is that meta-heuristic algorithms are problem-independent techniques. As such, they do not take advantage of any specificity of the problem. In general, they are not greedy. In fact, they may even accept a temporary deterioration of the solution, which allows them to explore the solution space more thoroughly and thus to reach a hopefully better solution (which sometimes coincides with the global optimum). In our proposed method, the simulation results showed that 1000 iterations are sufficient to find a very good solution, because the optimum solution no longer changed after that number of iterations [23,24]. In GA, an initial population of solutions consisting of n chromosomes must first be generated. This population may be generated either randomly or from another solution that is close to the optimized model. Afterwards, all chromosomes belonging to this population must be evaluated using a fitness function. Figure 2 briefly depicts the steps of GA. Some of the chromosomes belonging to the current population (current generation) are selected based on their desirability in order to generate a new population (new generation). The offspring are generated using evolutionary operators such as Selection, Crossover, and Mutation. When the new generation has been generated, the algorithm checks the termination condition. If the termination criterion is satisfied, the algorithm terminates; otherwise, the cycle repeats until the termination condition is met.

Fig. 2 The steps of GA (flowchart: Start → Initialization and Calculate Fitness → Individual Selection to Generate Offsprings → Crossover Operator → Mutation Operator → Termination? Yes: End; No: repeat the cycle)

4 Network and energy consumption model

Consider one BS and a set of sensor nodes defined as follows:

$$S = [S_1, S_2, S_3, \ldots, S_n]. \tag{2}$$

In Eq. 2, n indicates the number of nodes distributed in a geographic area. Our goal is to select a collection of CHs that cover the entire area. Each sensor node is denoted by $S_i$, where 1 ≤ i ≤ n. Also, we define the set of CHs as follows:

$$C = [C_1, C_2, C_3, \ldots, C_m]. \tag{3}$$

In Eq. 3, m indicates the number of clusters, where n ≥ m; that is, the number of sensor nodes is never smaller than the number of CHs. In the model of the studied network, the following properties are assumed:

– The sensor nodes are placed randomly and independently in a given environment and are homogeneous (in terms of computational and processing power, energy and memory).
– Since the CGC protocol is a centralized method, sensor nodes are not engaged in the CH selection and cluster formation processes. All the clustering procedures are performed by the BS and the results are transmitted to the whole network. Accordingly, the BS is assumed not to be limited in terms of energy, memory, and computational power.
– The network structure consists of a BS and a number of sensor nodes that communicate with each other.
– The network is divided into a number of clusters. Each cluster consists of several sensor nodes and is managed by its own CH. When CHs receive messages from their members, they relay them to the BS. Each sensor node periodically senses a geographical area and sends the obtained information to the BS via its CH.
– The initial energy of all nodes is the same and is denoted by $E_{primary}$. It should be noted that $E_{primary} = E_{max}$.

Figure 1 illustrates a network that is divided into a number of clusters, each consisting of several sensor nodes. A WSN node consists of several modules: the Sensor Module, the Processing Module, the Wireless Communication Module and the Power Supply Module. These modules work together to perform the sensing operation in a WSN environment. Thus, in order to evaluate the energy consumption of a WSN node, it is important to study the energy consumption of its modules. The energy consumption of the sensor module is due to many factors, such as signal sampling, AD (analogue to digital) signal conversion and signal modulation. The energy consumption of this module also depends on the sensing operation of the node (periodic, sleep/wake, etc.). The energy consumption in periodic mode is obtained as follows [25–27]:

$$E_{sensor} = E_{on\text{-}off} + E_{off\text{-}on} + E_{sensor\text{-}run}. \tag{4}$$

In Eq. 4, $E_{on\text{-}off}$ is the energy consumed by closing the sensor operation, $E_{off\text{-}on}$ is the energy consumed by opening the sensor operation, and $E_{sensor\text{-}run}$ is the energy consumed by the sensing operation. The energy consumption of the processing module is obtained as follows:

$$E_{cpu} = E_{cpu\text{-}state} + E_{cpu\text{-}change}. \tag{5}$$

The processing module supports three operation states: sleep, idle and run. $E_{cpu\text{-}state}$ is the state energy consumption and $E_{cpu\text{-}change}$ is the state transition energy consumption. The communication energy consumption of a node consists of the energy consumed for sending and receiving messages. As in [12], the energy consumed by a sensor node for sending data is calculated as follows:

$$E_{tx}(l, d) = E_{elec} \times l + \begin{cases} \varepsilon_{fs} \times l \times d^{2}, & d \le d_0 \\ \varepsilon_{mp} \times l \times d^{4}, & d > d_0. \end{cases} \tag{6}$$

In Eq. 6, $E_{tx}(l, d)$ is the energy consumed when transmitting an l-bit message over the distance d. $E_{elec}$ is the electronic energy consumed per bit for coding, modulation, filtering and spreading. The distance threshold $d_0$ is calculated as follows:

$$d_0 = \sqrt{\frac{\varepsilon_{fs}}{\varepsilon_{mp}}}. \tag{7}$$

In Eq. 7, $\varepsilon_{fs}$ represents the amplifier parameter in the free space model, used when the transmission distance is shorter than $d_0$, and $\varepsilon_{mp}$ represents the amplifier parameter in the multi-path fading channel model, used when the transmission distance is longer than $d_0$. The energy consumed for receiving data is calculated as follows:

$$E_{rx}(l) = E_{elec} \times l. \tag{8}$$

Finally, the total energy consumption for transmitting an l-bit message from a source node S to a destination node D over the distance d is obtained as follows:

$$E_{S,D}(l, d) = E_{tx}(l, d) + E_{rx}(l). \tag{9}$$

In Eq. 9, $E_{tx}(l, d)$ is the energy consumed for transmitting an l-bit message over the distance d, and $E_{rx}(l)$ is the energy consumed for receiving an l-bit message.
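The radio energy model of Eqs. 6 to 9 can be sketched in Python as follows. The sketch assumes that the free-space amplifier parameter applies at or below the threshold distance, as the explanation of Eq. 7 describes. The three numeric constants are illustrative placeholders chosen for this example; this section does not list parameter values, so they should not be read as the settings used in the simulations.

```python
import math

# Assumed, illustrative parameter values (not given in this section).
E_ELEC = 50e-9       # electronics energy per bit, J/bit
EPS_FS = 10e-12      # free-space amplifier parameter, J/bit/m^2
EPS_MP = 0.0013e-12  # multi-path amplifier parameter, J/bit/m^4

# Eq. 7: distance threshold at which both amplifier terms are equal.
D0 = math.sqrt(EPS_FS / EPS_MP)


def e_tx(l_bits: int, d: float) -> float:
    """Eq. 6: energy to transmit l_bits over distance d (metres)."""
    if d <= D0:
        return E_ELEC * l_bits + EPS_FS * l_bits * d ** 2
    return E_ELEC * l_bits + EPS_MP * l_bits * d ** 4


def e_rx(l_bits: int) -> float:
    """Eq. 8: energy to receive l_bits."""
    return E_ELEC * l_bits


def e_link(l_bits: int, d: float) -> float:
    """Eq. 9: total energy spent by sender and receiver for one message."""
    return e_tx(l_bits, d) + e_rx(l_bits)
```

Defining the threshold as in Eq. 7 makes the two branches of Eq. 6 meet at the threshold distance, so the transmission cost has no jump when a link switches from the free-space to the multi-path regime.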

5 The CGC protocol based on genetic algorithm

In this paper, GA is utilized to explore a complicated search space and carry out the desired optimization. The output is a chromosome consisting of the optimum CHs with respect to the parameters defined below. The operation of the CGC protocol is divided into rounds. As shown in Fig. 3, a certain period of time is defined as a round. Each round begins with a setup phase, in which the BS finds the optimum number of CHs and assigns the member nodes of each CH, followed by a steady state phase, in which the sensed data are transferred to the CHs and collected in frames; these frames are then transferred to the BS. Each round thus consists of two phases [28]:

Fig. 3 Working cycle of the CGC protocol (time axis divided into rounds; each round consists of a set-up phase followed by a steady-state phase made up of frames)

The setup phase This phase consists of CH selection followed by cluster formation. During each setup phase, the BS receives information on the current energy status, location, and number of neighbors from all the nodes in the network. Based on this information, the BS selects proper CHs using an evolutionary approach (GA). After the appropriate chromosome has been selected by the BS (the selection of the appropriate chromosome is explained later), the IDs of its CHs are introduced to the whole network by the BS as follows:

$$\text{ADV-Member} = [\text{IDs of proper CHs}]. \tag{10}$$

Here, ADV-Member is the advertisement message that is diffused to the whole network. Moreover, the IDs of the members are sent to the CHs by an ADV-CH message as follows:

$$\text{ADV-CH} = [\text{Node's ID}, \text{CH's ID}]. \tag{11}$$

Then, the cluster is established. The operation of the setup phase is shown in Fig. 4. Finally, each CH creates a time division multiple access (TDMA) schedule by assigning slots to its member nodes and informs these nodes of the schedule. The TDMA schedule is used to avoid intra-cluster collisions between data messages and to reduce energy consumption in the cluster. Also, to reduce inter-cluster interference, every CH selects a unique CDMA code and informs all member nodes within the cluster to transmit their data using this spreading code.

The steady state phase In the steady state phase, during each frame, every member node sends its sensed data to its CH in its respective time slot (as in [13]). Then, every CH forwards the aggregated data to the BS.
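The messages of the setup phase (Eqs. 10 and 11) and the per-cluster TDMA schedule can be sketched in Python as below. How the BS maps each member to a CH is not specified in this section, so the nearest-CH rule used here is purely an assumption for illustration, as are the function names and the data layout.

```python
import math


def build_setup_messages(positions, ch_ids):
    """Return (ADV-Member, ADV-CH) for CHs that have already been selected.

    positions: dict mapping node ID -> (x, y) location known to the BS
    ch_ids:    list of node IDs chosen as CHs
    """
    adv_member = list(ch_ids)  # Eq. 10: IDs of the proper CHs
    adv_ch = []                # Eq. 11: [node's ID, CH's ID] pairs
    for node, xy in positions.items():
        if node in ch_ids:
            continue
        # ASSUMPTION: each member joins its geometrically nearest CH.
        nearest = min(ch_ids, key=lambda c: math.dist(xy, positions[c]))
        adv_ch.append((node, nearest))
    return adv_member, adv_ch


def tdma_schedule(adv_ch):
    """Give every member one slot per frame; list index = slot number."""
    slots = {}
    for node, ch in adv_ch:
        slots.setdefault(ch, []).append(node)
    return slots
```

For instance, with four nodes on a line and two CHs at the ends, each interior node is assigned to the closer end and receives slot 0 in that cluster's frame.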

Fig. 4 The operation of setup phase in the CGC protocol (message sequence between the BS and a sensor node: the BS sends a request for ID, energy status, location, and number of neighbors and waits for the information; the sensor node sends the requested information (ID, EN_Si, (x, y), NONN_Si); the CHs are selected and their IDs are sent to the sensor nodes; the IDs of the members are sent to the CHs; the cluster is formed)

5.1 Fitness parameters for determining optimum CHs in CGC protocol

As mentioned before, clustering is performed by the BS and the results are transmitted to all nodes. The number of needed CHs determines the length of a chromosome. In the CGC protocol, a node is suitable for being CH if it satisfies the following conditions [28].

A. The residual energy of the nominated node must be higher than the average energy of the whole network. For this purpose, we define E(avr) as the function that calculates the average energy of the whole network, given by Eq. 12:

$$E(avr) = \frac{E_{S_1} + E_{S_2} + \cdots + E_{S_n}}{n} = \frac{\sum_{i=1}^{n} E_{S_i}}{n}. \tag{12}$$

Also, the function $E(res)_{S_i}$ is defined to represent the residual energy of node $S_i$. In other words:

$$E(opt) = \begin{cases} E(res)_{S_i} & \text{if } E(res)_{S_i} \ge E(avr) \\ \text{go to the next } E(res)_{S_i} & \text{otherwise.} \end{cases} \tag{13}$$

In Eq. 13, E(opt) represents the optimal energy, and $E(res)_{S_i}$ and E(avr) represent the residual energy of the sensor node $S_i$ and the average energy of the whole network, respectively.
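Criterion A (Eqs. 12 and 13) amounts to a simple filter over the residual energies reported to the BS. A minimal Python sketch follows; the dictionary layout and function names are assumptions for illustration.

```python
def average_energy(residual):
    """Eq. 12: mean residual energy over all n nodes."""
    return sum(residual.values()) / len(residual)


def energy_eligible(residual):
    """Eq. 13: keep nodes whose residual energy is at least the average;
    all other nodes are skipped ("go to the next" in Eq. 13)."""
    avg = average_energy(residual)
    return [node for node, e in residual.items() if e >= avg]


# Example with made-up residual energies (in joules).
residual = {"S1": 0.5, "S2": 0.25, "S3": 0.75, "S4": 0.5}
eligible = energy_eligible(residual)  # S2 is below the average of 0.5
```

The comparison uses "at least", following Eq. 13, so a node whose residual energy equals the network average remains a candidate.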


B. To address this issue, the network with n nodes is mapped into a graph. This graph is formed among the nodes that are within radio communication range of each other, and a weight is assigned to each edge. The graph G is defined as G = {V, E(w)}, where V is the set of vertices (the sensor nodes), E is the set of edges (the transmission links), and w is a function from E to the set of positive real numbers. The weight of an edge $(S_i, S_j)$ is represented by $w(S_i, S_j)$. The length of a path in G is defined as the total weight of the edges that make up the route. We now describe the criterion for determining and allocating weights to the edges of G. The weight of an edge $(S_i, S_j)$, illustrated in Fig. 5, is calculated as follows:

$$w(S_i, S_j) = \frac{RE_{S_i}}{D_{S_i,S_j}}. \tag{14}$$

Fig. 5 The weight of edge $(S_i, S_j)$ (nodes N_Si and N_Sj joined by an edge labeled w(Si, Sj))

In Eq. 14, $RE_{S_i}$ and $D_{S_i,S_j}$ represent the residual energy of node $S_i$ and the distance between nodes $S_i$ and $S_j$, respectively (the distance is estimated as in [29] using the log-normal shadowing radio propagation model (LNSM): the distance between two sensor nodes is estimated from the signal strength of received frames). Next, the sum of weights for edges (SWE) connected to each node must be calculated. We define $SWE(S_i)$ as the function that calculates the sum of the weights of the edges connected to node $S_i$. $SWE(S_i)$ should be maximal for a node to be considered a proper candidate for being CH. $SWE(S_i)$ is calculated by Eq. 15 as follows:

$$SWE(S_i) = w(S_i, S_j) + \cdots + w(S_i, S_h) = \sum_{d=1}^{j,h} w(S_i, S_d), \tag{15}$$

where $S_j, \ldots, S_h$ are the nodes connected to $S_i$ by an edge.

Obviously, a higher edge weight means that data transmission over that edge is more beneficial and needs less energy. Figure 6 shows a graph formed among the sensor nodes (according to the radio communication range and neighborhood relations). The weights allocated to the edges are displayed as $w(S_i, S_j)$, where $S_i$ is the source node and $S_j$ is the destination node. It is important to note that $w(S_i, S_j) \ne w(S_j, S_i)$; that is, the two directions of an edge carry different weights. In this section, we explain how SWE is calculated. The weights of all the edges are shown in Fig. 6 and are also presented in Table 1. It should be noted that all the values have been normalized to [0,1].

Fig. 6 An example of a graph and constructed edges between its nodes (eight nodes S1 to S8; the directed edge weights are listed in Table 1)

Table 1 Weight values of the edges (row: source node; column: destination node)

Nodes  S1   S2   S3   S4   S5   S6   S7   S8
S1     –    0.4  –    0.4  –    –    –    –
S2     0.8  –    0.7  0.6  0.5  –    –    –
S3     –    0.4  –    –    0.4  –    –    –
S4     0.8  0.8  –    –    –    0.7  0.9  –
S5     –    0.5  0.6  –    –    –    0.8  0.9
S6     –    –    –    0.3  –    –    0.2  –
S7     –    –    –    0.6  0.5  0.4  –    0.3
S8     –    –    –    –    0.6  –    0.4  –

SWE for each node can be calculated as follows:

$$\begin{aligned}
SWE(S_1) &= w(S_1, S_2) + w(S_1, S_4) = 0.8 \\
SWE(S_2) &= w(S_2, S_1) + w(S_2, S_3) + w(S_2, S_4) + w(S_2, S_5) = 2.6 \\
SWE(S_3) &= w(S_3, S_2) + w(S_3, S_5) = 0.8 \\
SWE(S_4) &= w(S_4, S_1) + w(S_4, S_2) + w(S_4, S_6) + w(S_4, S_7) = 3.2 \\
SWE(S_5) &= w(S_5, S_2) + w(S_5, S_3) + w(S_5, S_7) + w(S_5, S_8) = 2.8 \\
SWE(S_6) &= w(S_6, S_4) + w(S_6, S_7) = 0.5 \\
SWE(S_7) &= w(S_7, S_4) + w(S_7, S_5) + w(S_7, S_6) + w(S_7, S_8) = 1.8 \\
SWE(S_8) &= w(S_8, S_5) + w(S_8, S_7) = 1.0.
\end{aligned} \tag{16}$$

For example, if we assume that the number of needed CHs is 2, then according to the SWE values it can be argued that nodes S4 and S5, in that order, are the most appropriate candidates in terms of SWE. Therefore we have:

$$\text{Maximum } SWE = \begin{cases} SWE(S_4) = 3.2 \\ SWE(S_5) = 2.8. \end{cases}$$
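The SWE computation of Eqs. 15 and 16 can be reproduced from the directed edge weights of Table 1 with a short Python sketch. The dictionary layout and helper names are assumptions for illustration, and `edge_weight` implements Eq. 14 without the normalization to [0,1], whose exact form is not stated in this section.

```python
def edge_weight(residual_energy, distance):
    """Eq. 14: weight of the directed edge leaving a node (unnormalized)."""
    return residual_energy / distance


# Directed weights w(Si, Sj) from Table 1 (row = source, column = destination).
W = {
    "S1": {"S2": 0.4, "S4": 0.4},
    "S2": {"S1": 0.8, "S3": 0.7, "S4": 0.6, "S5": 0.5},
    "S3": {"S2": 0.4, "S5": 0.4},
    "S4": {"S1": 0.8, "S2": 0.8, "S6": 0.7, "S7": 0.9},
    "S5": {"S2": 0.5, "S3": 0.6, "S7": 0.8, "S8": 0.9},
    "S6": {"S4": 0.3, "S7": 0.2},
    "S7": {"S4": 0.6, "S5": 0.5, "S6": 0.4, "S8": 0.3},
    "S8": {"S5": 0.6, "S7": 0.4},
}


def swe(node):
    """Eq. 15: sum of the weights of the edges leaving the given node."""
    return sum(W[node].values())


def top_candidates(k):
    """The k nodes with the largest SWE, best first."""
    return sorted(W, key=swe, reverse=True)[:k]
```

With k = 2 the ranking yields S4 followed by S5, which matches the example above; S2 comes third with an SWE of 2.6.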

for i = 1; i […]

$$g_x = ID. \begin{cases} [\ldots]\ E(avr) \\ SWE(S_i) \text{ which is maximum between other } SWE\text{s} \\ O(S_i) \text{ which } O(S_i) = 0. \end{cases} \tag{22}$$


For initialization and generation of the initial population, random selection might be utilized.

5.3 Fitness function in CGC protocol

Now, the current population must be evaluated to determine the survivors. For this purpose, a fitness function is exploited. A fitness function is an objective function that is to be minimized or maximized. Since routing is a two-level procedure (the first level between nodes and CHs and the second level between CHs and the BS), this must be taken into account when defining the fitness function. The proposed fitness function (F) is a function of all three parameters described above and is defined as follows:

\[
\begin{aligned}
F = {} & \sum_{z} \mathrm{Min}\,(EC_{z-1} - EC_z) + SWE_z + BSF_z \\
& + \sum_{z} \mathrm{Min}\,(EN_{z-1} - EN_z) + WE_z.
\end{aligned}
\tag{23}
\]

In Eq. 23, $EC_z$ represents the average energy of the cluster in the current round (round $z$) and $EC_{z-1}$ represents the average energy of the cluster in the previous round (round $z - 1$). $EN_z$ and $EN_{z-1}$ are the averages of the total network energy in the current and previous rounds, respectively. Since the average energy of the cluster and the average energy of the network in round $z$ are always less than or equal to their values in round $z - 1$, it is desired that the difference between them be minimized. Also, $SWE_z$ represents the sum of the weights of the edges connected to the CH, $BSF_z$ describes the outlier or non-outlier property of the CH, and $WE_z$ represents the weight of the edge connecting the CH and the BS.

5.4 The genetic operators in CGC protocol

In this section, the genetic operators used in the proposed protocol are explained. These operators are selection, crossover, and mutation. In individual selection, some of the current chromosomes are selected, with respect to their desirability, to generate the new population. They may be selected randomly, because a good offspring may result from the combination (crossover) of a good and a bad parent, or even of two bad parents. This means that random selection is not problematic. Nevertheless, in the CGC protocol the roulette wheel model [21] is utilized. The main ideas are: (1) chromosomes with more competence have a greater chance of being selected; (2) the chance of selection is proportional to the fitness of the chromosome. In roulette wheel selection, the probability that individual $i$ will be selected is obtained as follows:

\[
p_i = \frac{Fitness_i}{\sum_{q=1}^{s} Fitness_q}.
\tag{24}
\]

In Eq. 24, $p_i$ is the probability that chromosome $i$ will be selected, $Fitness_i$ is the fitness function value of chromosome $i$, $\sum_{q=1}^{s} Fitness_q$ is the sum of the fitness values of all chromosomes in the population, and $s$ is the size of the population. The steps of individual selection using the roulette wheel method are shown in Algorithm 2.

Algorithm 2 Pseudo-code for individual selection with roulette wheel model.
ROULETTE WHEEL SELECTION()
    r = random number, where 0 ≤ r < 1
    Sum = 0
    for each individual i
        Sum = Sum + p(i)
        if (r < Sum)
        BEGIN
            return i
        END
END
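A minimal Python sketch of roulette wheel selection, following Eq. 24 and Algorithm 2, is given below. It assumes strictly positive fitness values; the function names and the optional argument `r` (which lets the caller fix the random number, for example when testing) are assumptions introduced for illustration.

```python
import random


def selection_probabilities(fitnesses):
    """Eq. 24: p_i = Fitness_i / sum of all fitness values."""
    total = sum(fitnesses)
    return [f / total for f in fitnesses]


def roulette_wheel_selection(fitnesses, r=None):
    """Return the index of the selected individual (Algorithm 2)."""
    probabilities = selection_probabilities(fitnesses)
    if r is None:
        r = random.random()  # random number with 0 <= r < 1
    cumulative = 0.0
    for i, p in enumerate(probabilities):
        cumulative += p
        if r < cumulative:
            return i
    # Guard against floating-point rounding: the cumulative sum can end
    # marginally below 1, so fall back to the last individual.
    return len(probabilities) - 1


# Fitness values 1 and 3 give selection probabilities 0.25 and 0.75.
print(selection_probabilities([1, 3]))
print(roulette_wheel_selection([1, 3]))
```

Individuals with a larger share of the total fitness occupy a wider slice of the interval [0, 1), so they are returned for a wider range of `r`.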

In Algorithm 2, a random number $r$ is generated (where $0 \le r < 1$). Sum accumulates the selection probabilities $p(i)$ of the chromosomes in the population. The loop goes through the population and adds each probability to Sum. When Sum becomes greater than $r$, the loop stops and returns the current chromosome.

When the parent chromosomes have been selected, the crossover operator is applied to them with probability $P_c$. This operator combines the parents and generates new chromosomes (offsprings). During the crossover operation, new information is usually extracted from the information existing in the current chromosomes (the chromosomes in the parent population). In the CGC protocol, one-point crossover [21] is exploited. The one-point crossover operator breaks two chromosomes at a random point and exchanges the broken parts of these chromosomes. In other words, in one-point crossover one crossover location $l \in \{1, 2, \ldots, m_v - 1\}$ is chosen at random, where $m_v$ is the number of variables of an individual (chromosome). Then, the variables are exchanged between the individuals from this point and two new offsprings are produced. Figure 8 depicts the generation of offspring chromosomes from parent chromosomes using the one-point crossover operator in the CGC protocol. Consider two parents [101011001011] and [000110101001]. If we assume that the chosen crossover location is $l = 3$, then after one-point crossover the new individuals are Offspring1 = [101|110101001] and Offspring2 = [000|011001011]. After the two new chromosomes are generated, the initial chromosomes and the generated ones are called parents and offsprings, respectively.

Fig. 8 Generating offsprings by applying one-point crossover operator

Mutation [21] occurs while genetic information is transferred between parents and offsprings. The mutation operator is applied to each chromosome resulting from the crossover operator (offspring chromosome). For each bit of the chromosome, a random number is generated. If the value of the random number is less than the mutation probability $P_m$, mutation occurs in that bit; otherwise, mutation does not happen. In the CGC protocol, a proper node which satisfies the criteria for becoming a CH (according to the criteria defined earlier) is selected randomly. If this node is not among the nodes inside the chromosome, the mutation occurs and the node is placed inside one of the cells of that chromosome.

5.5 Algorithm termination in CGC protocol

In this step, all members of the new population are evaluated. If the requirements are met, the algorithm terminates. Otherwise, the new population is utilized as the initial population of the next round and all the mentioned steps are repeated. Termination criteria in GA may differ. For example, the algorithm execution time, a limited number of iterations, or the lack of change in the optimum answer after a specific number of iterations might be used as criteria for algorithm termination. In the CGC protocol, the termination criterion is the lack of change in the optimum solution after a specific number of iterations. The pseudo-code of all steps in the CGC protocol is presented in Algorithm 3.
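Returning to the crossover and mutation operators of Sect. 5.4, the following Python sketch reproduces the one-point crossover example with $l = 3$. Chromosomes are represented as bit strings for brevity. The mutation shown is a generic per-bit flip with probability $P_m$, which is an assumption: the CGC-specific variant, which inserts an eligible CH node into a cell of the chromosome, is not modeled here. The function names are illustrative.

```python
import random


def one_point_crossover(parent1, parent2, point):
    """Exchange the parts of two equal-length chromosomes after `point`."""
    if len(parent1) != len(parent2):
        raise ValueError("parents must have the same length")
    if not 1 <= point <= len(parent1) - 1:
        raise ValueError("crossover point must lie in 1..m_v - 1")
    offspring1 = parent1[:point] + parent2[point:]
    offspring2 = parent2[:point] + parent1[point:]
    return offspring1, offspring2


def bit_flip_mutation(chromosome, pm, rng=random):
    """Flip each bit independently when a fresh random number is below pm."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < pm else bit
        for bit in chromosome
    )


# Parents and crossover location taken from the example in the text.
offspring1, offspring2 = one_point_crossover("101011001011", "000110101001", 3)
print(offspring1)  # 101110101001
print(offspring2)  # 000011001011
```

With `pm = 0` the chromosome is returned unchanged and with `pm = 1` every bit is flipped, which is a quick way to check the mutation helper.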


6 Upper level routing using onion approach

A WSN with a large number of nodes is called a large-scale WSN. Energy consumption management at such a scale is one of the most important issues. It should be noted that the strategies used to manage communications among nodes in small-scale WSNs are not efficient in large-scale WSNs. In order to use the CGC protocol in large-scale WSNs for upper level routing (between CHs), the protocol is extended by a new concept called the “Onion Approach”. After the optimal CHs are determined and the clusters are formed by the CGC protocol, the onion approach is applied. A layered network (like the layers of an onion) is formed and the CHs are placed in these layers. Consider Fig. 9. In this figure, after the optimal CHs are selected and the clusters are formed, the onion layering operation is performed. Each of the blue squares indicates a CH node. The onion approach reduces the communication overhead between CHs by dividing the network into several onion layers.

Fig. 9 How the CHs are placed in different onion layers

Algorithm 3 Pseudo-code of all steps in the CGC protocol.
Procedure CGC
START
    for i = 1; i [...] E(avr)
        // SWE(Si) is maximum
        // O(Si) = 0 (the node is not an outlier)
        s = size of population
        Pc = rate of crossover
        Pm = rate of mutation
        itr = number of iterations
        generate initial population of size s
        save these solutions in P
        Evaluate P according to fitness function
        for itr not finished
            Select P(i) from P(i − 1)
            Produce child solution by crossover and mutation operators
            Evaluate child solution according to fitness function
            add child to P(i)
        END for
    END for
END Procedure
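The overall flow of Algorithm 3, combined with the termination criterion of Sect. 5.5 (no change in the best solution for a given number of iterations), might be sketched as follows. This is not the CGC implementation: chromosomes are plain bit lists, the fitness is a caller-supplied non-negative function that is maximized, and the names `run_ga`, `patience`, and `max_iters` are assumptions introduced for illustration.

```python
import random


def run_ga(fitness, chromosome_len, pop_size=20, pc=0.8, pm=0.05,
           patience=15, max_iters=500, seed=0):
    """Toy GA loop: roulette selection, one-point crossover, bit-flip mutation."""
    if pop_size < 2 or pop_size % 2 != 0:
        raise ValueError("pop_size must be an even number >= 2")
    if chromosome_len < 2:
        raise ValueError("chromosome_len must be at least 2")
    rng = random.Random(seed)
    # Generate a random initial population of size pop_size.
    pop = [[rng.randint(0, 1) for _ in range(chromosome_len)]
           for _ in range(pop_size)]
    best = max(pop, key=fitness)
    stale = 0  # iterations since the best solution last improved
    for _ in range(max_iters):
        scores = [fitness(c) for c in pop]
        # Roulette wheel selection: chance proportional to fitness. The small
        # offset keeps the weights valid if every score is zero.
        parents = rng.choices(pop, weights=[s + 1e-9 for s in scores], k=pop_size)
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            if rng.random() < pc:  # apply crossover with probability pc
                point = rng.randint(1, chromosome_len - 1)
                a, b = a[:point] + b[point:], b[:point] + a[point:]
            children += [a, b]
        # Bit-flip mutation with probability pm per gene (copies each child).
        pop = [[1 - g if rng.random() < pm else g for g in c] for c in children]
        candidate = max(pop, key=fitness)
        if fitness(candidate) > fitness(best):
            best, stale = candidate, 0
        else:
            stale += 1
        if stale >= patience:  # optimum unchanged for `patience` iterations
            break
    return best


# Toy fitness: number of ones in the chromosome.
best = run_ga(fitness=sum, chromosome_len=12)
print(best, sum(best))
```

The `max_iters` bound is only a safety net; the loop normally ends through the stagnation check, which corresponds to the termination criterion named in the text.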


6.1 Onion layering operations

6.1.1 Primary assumptions

Assume a network with $m$ CH nodes. Onion layers are a combination of the CHs and can be defined in the form $(l_1, l_2, \ldots, l_k)$, where $k$ denotes the number of layers and $1 \le k \le o$. In the proposed onion layering method, after the selection of optimal CHs and cluster formation, each CH node placed in one of the onion layers knows only the CHs placed in the next and previous layers and communicates with them. In large-scale networks, as the number of nodes increases dramatically, the communication overhead between them also increases. Thus, providing a mechanism to reduce the communication overhead between the CH nodes is crucial. The onion approach, by dividing the network into separate layers, reduces the communication overhead between the CH nodes, since a CH node that wants to communicate with others does not deal with a huge number of CH nodes but only with the CHs placed in the next and previous layers. Also, it is sufficient for each CH node to identify the CH nodes of its neighboring layers; there is no need to identify the state of every CH node of every layer in the network. Consider layer $l_k$ and assume that it includes $m$ CH nodes. In this case, these nodes only identify the CH nodes in layers $l_{k-1}$ and $l_{k+1}$.

6.1.2 Onion layering

Assume that the procedures of selecting optimal CHs and forming clusters through an evolutionary process by the CGC protocol are already done. Now, we intend to perform the layering operation on the clustered network using the onion approach. Assume the radio communication radius of each CH node is $R_c$. If the CH nodes in each layer of the onion are to make single-hop communication with the CH nodes in the next and previous layers, the radius of the different onion layers is determined by the following proposed equation:

\[
R_{l_k} =
\begin{cases}
\dfrac{R_c}{3} & \text{if } k = 0 \\[2mm]
\dfrac{R_c}{3} + k \times \dfrac{R_c}{3} + (k - 1) \times 0.5\,\dfrac{R_c}{3} & \text{otherwise.}
\end{cases}
\tag{25}
\]

In Eq. 25, $R_{l_0}$ is the radius of the 0th layer (the innermost layer) and $R_{l_k}$ is the radius of the $k$th layer. By using Eq. 25, the onion layers are created such that the CH nodes in each layer can communicate with the CH nodes of their neighboring layers in a single hop. Consider Fig. 10. In this figure, each CH node, in its worst case (at the edges of the circles), is at a distance of $R_c$ from the CH nodes in each of its neighboring layers (i.e., if two CHs want to stay connected, the maximum distance between them should be $R_c$). For example, the distance between nodes $c_1$ and $c_4$ is $R_c$ (in the worst case). Therefore, in Eq. 25, $R_c$ must always be divided by three so that every CH node in each layer has the CH nodes in the next and previous layers (neighboring layers) within its radio communication radius.

Fig. 10 Neighboring layers and the distance between them

Assume that the radio communication radius of each node is 30 m ($R_c = 30$). Thus, the radius of the innermost layer of the onion (layer $l_0$) is 10 m ($R_{l_0} = 10$). The radii of the next layers are calculated using Eq. 25. The radius of layer $l_1$ is:

\[
R_{l_1} = \frac{R_c}{3} + 1 \times \frac{R_c}{3} + (1 - 1) \times 0.5\,\frac{R_c}{3} = 10 + 10 + 0 = 20.
\tag{26}
\]

Also, the radii of layers $l_2$ and $l_3$ are calculated in the same way by Eqs. 27 and 28, as follows:

\[
R_{l_2} = \frac{R_c}{3} + 2 \times \frac{R_c}{3} + (2 - 1) \times 0.5\,\frac{R_c}{3} = 10 + 20 + 5 = 35.
\tag{27}
\]

\[
R_{l_3} = \frac{R_c}{3} + 3 \times \frac{R_c}{3} + (3 - 1) \times 0.5\,\frac{R_c}{3} = 10 + 30 + 10 = 50.
\tag{28}
\]
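The layer radii of Eqs. 26 to 28 can be checked with a few lines of Python. The sketch below is a direct transcription of Eq. 25; the function name `layer_radius` and the argument names are assumptions introduced for illustration.

```python
def layer_radius(k, rc):
    """Radius of onion layer l_k per Eq. 25 (k = 0 is the innermost layer)."""
    if k < 0:
        raise ValueError("layer index must be non-negative")
    base = rc / 3
    if k == 0:
        return base
    return base + k * base + (k - 1) * 0.5 * base


# Rc = 30 m, as in the worked example.
radii = [layer_radius(k, 30) for k in range(4)]
print(radii)  # [10.0, 20.0, 35.0, 50.0]
```

For $k \ge 2$ the gap between consecutive layers is a constant $1.5 \times R_c/3$ (15 m for $R_c = 30$), while the first gap is $R_c/3$ (10 m).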

Figure 11 shows the created layers of the onion. As can be seen, the layers are created such that every CH node in each layer can cover the CH nodes in its neighboring layers. This procedure continues until all nodes in the network are covered by different layers. If a CH node wants to communicate with another CH node in its neighboring layers and there are no CHs in those layers, these layers are considered null layers and the first layers after them are assumed to be the new neighboring layers.

Fig. 11 The layers creation of the onion using proposed equation

6.1.3 Routing between onion layers

For performing the routing procedure between CHs (onion layers), the network is divided into two parts, “northern” and “southern”. The BS knows the coordinates of the network (in our simulations it is 200 × 200 m²) and, according to these coordinates, determines whether each node must be placed in the northern or the southern part of the network and then informs each node of its position. As a result, each node is aware of its place in either the northern or the southern part. In addition, we used [3] as the method of communication between CHs in order to select the next relay node in the routing procedure. To make an intelligent decision when dividing the network into two separate parts (northern and southern), we propose a practical solution that can be exploited in any topology. For example, in some applications of WSNs the BS is situated among the sensor nodes, whereas in other applications it is placed outside the network area (to the north, west, south or east). For this reason, a solution is needed that can be used irrespective of the BS’s location. Since our method is a centralized protocol, the BS knows the size of the network. Let $y_{network}$ be the width of the network. By dividing $y_{network}$ by 2, the BS is able to split the network into two sections. Furthermore, the BS needs to notify each sensor node of its position; this can be done in the set-up phase using the following equation:

Fig. 12 An example of routing procedure between the CHs in northern and southern parts

\[
Position_{S_i} =
\begin{cases}
\text{Put } S_i \text{ in northern part} \\
\text{Put } S_i \text{ in southern part}
\end{cases}
\quad \text{if } y_{S_i}\ [\ldots]
\]
