Cost-efficient dynamic quota-controlled routing in

0 downloads 0 Views 784KB Size Report
Delay-tolerant networks1 (DTNs) are wireless mobile net- works, in which mobile ... munity structure of human networks and social rela- tions among nodes to ...
Research Article

Cost-efficient dynamic quotacontrolled routing in multi-community delay-tolerant networks

International Journal of Distributed Sensor Networks 2018, Vol. 14(5) Ó The Author(s) 2018 DOI: 10.1177/1550147718776227 journals.sagepub.com/home/dsn

Jiagao Wu1,2 , Yue Ma1,2 and Linfeng Liu1

Abstract Delay-tolerant networks are novel wireless mobile networks, which are characterized with high latency and frequent disconnectivity. Besides, people carrying mobile devices form a lot of communities because of similar interests and social relationships. How to improve the routing efficiency in multi-community scenarios has become one of the research hot spots in delay-tolerant networks. In this article, we present a network model of the multi-community delay-tolerant networks and formulate a dynamic quota-controlled routing problem of minimizing the average number of copies of a message that satisfies the required delivery probability under the given time-to-live of the message as a nonlinear optimization problem. To solve this problem, we propose an improved genetic algorithm called genetic algorithm for delivery probability and time-to-live optimization for the dynamic quota-controlled routing scheme to reduce the routing cost further. In addition, a cost-efficient dynamic quota-controlled routing protocol based on genetic algorithm for delivery probability and time-to-live optimization is proposed, which can dynamically adjust message copies according to its assigned delivery probability and time-to-live in different communities on the shortest path. Both the numerical and simulation results show that our routing with the proposed algorithm is more cost efficient. Keywords Delay-tolerant networks, quota-controlled routing, multi-community, genetic algorithm, optimization

Date received: 29 September 2017; accepted: 10 April 2018 Handling Editor: Christos Anagnostopoulos

Introduction Delay-tolerant networks1 (DTNs) are wireless mobile networks, in which mobile carriers (e.g. human beings) communicate with each other via their short-distance and lowcost devices to share data objects.2 DTNs can be used in challenging environments such as wireless sensor networks, military networks, vehicular ad hoc networks, wildlife tracking, and space communications.3,4 Unlike conventional mobile networks such as mobile ad hoc networks, intermittent and uncertain connectivity makes data forwarding a challenging issue in DTNs, and thus the storecarry-forward strategy is often adopted in DTNs’ routing protocols in order to overcome the lack of connectivity.5 Lots of DTN routing protocols have been proposed. Direct delivery6 and epidemic7 routings are viewed as

the two extreme routing schemes, since the former scheme keeps only one copy and the latter does not limit the number of copies of a message in the network. To obtain a better tradeoff between performance and resource consumption in DTNs, routing protocols with quota-controlled scheme are proposed, in which each 1

School of Computer Science & Technology, Nanjing University of Posts and Telecommunications, Nanjing, China 2 Jiangsu Provincial Key Laboratory of Big Data Security and Intelligent Processing, Nanjing University of Posts and Telecommunications, Nanjing, China Corresponding author: Jiagao Wu, School of Computer Science & Technology, Nanjing University of Posts and Telecommunications, Nanjing 210023, Jiangsu, China. Email: [email protected]

Creative Commons CC BY: This article is distributed under the terms of the Creative Commons Attribution 4.0 License (http://www.creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/ open-access-at-sage).

2 newly generated message is associated with a quota to constrain the maximal number of copies of a message that can be duplicated during the process of message delivery. The quota-controlled routing scheme can be further divided into static (e.g. spray and wait8) and dynamic (e.g. multi-period S&W9 and delay-bounded S&W10) according to whether the quota is fixed or adjustable while routing a message. It is shown that the dynamic quota-controlled routing protocols can adapt the dynamic characteristic of DTNs and have more efficient utilization of the network resources by theoretical analysis and experimental results. Meanwhile, in DTNs, portable devices are in the majority carried or controlled by humans.11 Therefore, when we design a routing protocol, human mobility patterns are needed to be considered. Recently, a large number of routing protocols, which exploit the community structure of human networks and social relations among nodes to determine the way of message delivery, are presented.12–20 SimBet12 adopts the betweenness centrality and similarity metrics to determine the message forwarders. In Bubble Rap13, a node forwards the message to the encountered nodes with higher global or local degree centrality. Social- and mobile-aware routing strategy (SMART)14 applies the convolution of social centrality and social similarity to decide forwarding node efficiently while avoiding the blind-spot and dead-end problems. Gondaliya et al.15 proposed an approach to schedule relay nodes based on the two centralities of betweenness and degree in the absence of the community information about the destination. Nevertheless, all these routing protocols adopt the single-copy forwarding scheme, which limits the message delivery ratio. Integrating forwarding and replication (IFR)16 uses message replication and forwarding for intra- and inter-community communication, respectively. Friendship-based routing (FBR)17 exploits the periodically differentiated friendships to find the community and adopts the quota-controlled routing scheme inside a community. Although several heuristic methods were presented in IFR and FBR, the routing optimization problem in community-based DTNs was not considered yet. Homing spread (HS)18 is a zero-knowledge multi-copy routing algorithm, which utilizes community homes to spread messages. The optimization objective of HS is to minimize the delivery delay for a given number of message copies. Community-based adaptive spray (CAS)19 uses the community topology to pre-compute the multi-hop shortest path from the current community to the destination community and allocates the optimal quota at each hop adaptively. The goal of this protocol is to allocate the minimum number of message copies for a message while still achieving a high delivery ratio. Wu et al.20 developed a performance model of multicommunity epidemic routing with social selfishness and

International Journal of Distributed Sensor Networks proposed an energy-efficient copy-limit-optimized algorithm. However, to the best of our knowledge, none of them studied the cost efficiency of the dynamic quotacontrolled routing in the multi-community DTNs. Hence, the performance of community-based routing still needs to be improved considering the dynamic nature of DTNs. In order to obtain a balance between performance and cost of message delivery, in this article, we use the multi-community DTN model similar to that in CAS.19 However, instead of the static quota-controlled routing adopted in CAS, we apply the dynamic quotacontrolled routing (e.g. 2-period S&W) in each community on the shortest path to improve the cost efficiency further. In addition, we present a new cost optimization problem to minimize the dynamic quota and at the same time achieve a required delivery probability of the end-to-end multi-hop routing, which is more comprehensive than that of CAS. To solve the optimization problem, we propose an improved genetic algorithm (GA), which is more general and efficient than the enumeration algorithm used in CAS. The main contributions of this article are as follows: 1.

2.

3.

A network model of the multi-community DTNs is presented in this article. Based on this model, we formulate a dynamic quotacontrolled routing problem of minimizing the average number of copies of a message that satisfies the required delivery probability under the given time-to-live (TTL) of the message as a nonlinear optimization problem. An improved GA called genetic algorithm for delivery probability and time-to-live optimization (GAPTO) is proposed to solve this problem in the multi-community DTNs by means of several new evolutionary policies. A cost-efficient community-based dynamic quota-controlled routing protocol with our optimization algorithm is proposed, which adopts 2-period S&W and can dynamically adjust message copies according to its assigned delivery probability and TTL in different communities on the shortest path. Both the numerical and simulation results show that the routing protocol with our algorithm has more significant advantages in cost efficiency.

Related work There are various classifications of routing algorithms for DTNs, which usually are divided into two categories: forwarding based and replication based. Forwarding-based protocols, such as direct delivery,6 SimBet,12 and minimum estimated expected delay

Wu et al. (MEED),21 keep only one copy per message in the network and route message in the network hop by hop to the destination. Although they can save buffer space and network bandwidth, they often suffer from low delivery ratio in DTNs due to the single copy. On the contrary, replication-based protocols distribute multiple copies of a message into the network, which can be further divided into flooding-based and quotacontrolled protocols. Epidemic routing7 is a typical example of flooding-based routing algorithms. It distributes the copies of each message to as many nodes as possible, which makes it unpractical for resourceconstrained devices in DTNs. To address the problem of epidemic routing, quota-controlled protocols were proposed in many papers. In early times, the quota, that is, the maximal number of copies of a message, is static or fixed. However, due to the dynamic nature of DTNs, several routings with dynamic quota control schemes were proposed to achieve a better tradeoff between performance and resource consumption. Spray and wait (S&W)8 is a well-known static quota-controlled protocol. In S&W, a new message initially starts with a fixed value of quota and then it is delivered in two phases: spray and wait. In the spray phase, a binary quota allocation scheme is usually adopted, which allocates half of the quota from the current message to the replication of the message. If the destination node is not found in the spray phase, the routing process will enter the wait phase, where each node carrying a copy of the message performs direct transmission (i.e. forwards the message only to its destination). However, the fixed quota of message copies cannot suit the dynamic environment of DTNs very well. Therefore, protocols based on dynamic quota control scheme were proposed. Bulut et al.9 proposed a dynamic quota-controlled routing protocol named multi-period S&W, which partitions the delivery time deadline into several predefined, variable-length periods, each composed of a spraying phase followed by the wait for delivery. In each period, the source node sets a certain quota for the message. When the message is delivered to the destination node, the destination node will send an acknowledgment message to the source node with epidemic routing. During the current period, if the source node does not receive the acknowledgment message, an additional quota of the message is provided properly for the next period. It has been shown that the average number of copies of multi-period S&W is smaller than that of the original S&W protocol with the same required delivery ratio and deadline. Moreover, Abbas et al.10 proposed a delay-bounded S&W protocol, which dynamically adjusts the quota of messages based on the measured delivery probability and TTL of messages. In the research of the characteristics of human mobility, human movement is shown to be driven by

3 personal interests and social relationships.22 Hence, the nodes which frequently coexist in a common location are considered to consist of a community. Recently, the utilization of community structure has been proved to improve the routing performance in DTNs.23 Therefore, many community-based routing protocols were proposed. Daly and Haahr12 proposed SimBet, which adopts the betweenness centrality and similarity metrics (a measurement of the importance of a node in a network) to determine the forwarders and introduces a tunable parameter to adjust the relative importance of two metrics. Hui et al.13 evaluated the impact of community and centrality on forwarding, and proposed a hybrid algorithm, Bubble Rap, which selects highcentrality nodes and community members of destination as relays. Zhu et al.14 proposed SMART, which exploits a heuristic method for community detection and then applies a decayed routing metric convoluting social similarity and social centrality to decide forwarding node efficiently while avoiding the blind-spot and dead-end problems. Gondaliya et al.15 proposed a simple approach, which can schedule the most globally central nodes for message transmission based on the two centrality measures: betweenness and degree when community information about message’s destination is not available. Nevertheless, all these routing protocols adopted the single-copy forwarding schemes. As mentioned above, although saving network resources, the single-copy forwarding schemes often suffer from low message delivery ratio in DTNs. Hence, Li et al.16 proposed a community-based routing protocol, called IFR, which allows a node to replicate message in the same community and to forward message outside the community according to the delivery weight obtained from the centrality of the node. The difference between IFR and Bubble Rap is that IFR floods a message inside the community of the destination node rather than a single copy in Bubble Rap. Bulut et al.17 presented FBR, in which periodically differentiated friendship relations are used in detecting the community and forwarding of messages. For intra-community communication, FBR sprays several copies of messages to a number of nodes in the community. For intercommunity communication, the data are forwarded only when the destination is in the same periodical community as the relay. But the presented routing methods are only heuristic in IFR and FBR, which did not consider the routing optimization problem in community-based DTNs to improve their performance. Xiao et al.18 proposed a novel zero-knowledge multicopy routing algorithm named homing spread (HS) for mobile social networks, in which all mobile nodes share all community homes. HS mainly lets community homes spread messages with a higher priority. Theoretical optimization and analysis showed that HS can spread a given number of message copies with the

4 minimum expected delivery delay. Miao et al.19 proposed CAS routing protocol, which uses the community topology to pre-compute the multi-hop path that is the shortest path from the current community to the community of the destination node. In CAS, the optimal quota on each hop is computed under the message TTL and a predefined expected delivery ratio with an enumeration method. Wu et al.20 proposed a performance model of multi-community epidemic routing with social selfishness with ordinary differential equations (ODEs) and an energy-efficient copy-limitoptimized algorithm, which is designed to determine the optimal copy limit in multiple communities for epidemic routing. However, all of them utilized the static quota-controlled routing schemes (e.g. S&W and copylimited epidemic), which usually lead to worse cost efficiency than the dynamic ones. In this article, we study the cost-efficient dynamic quota-controlled routing in the multi-community DTNs and address the optimization problem of minimizing the average number of copies of a message that satisfies the required delivery probability under the given lifetime of the message. Although our model is similar to that in CAS,19 our work is different mainly in three ways. First, the routing protocol we adopt in the multi-community DTNs is dynamic quota controlled, while CAS adopts the static quota-controlled routing in a community though it re-allocates the quota at each hop. Second, the optimization problem we present is more comprehensive, which is able to express the minimum cost of the dynamic quota-controlled routing and the required delivery ratio of the end-to-end multi-hop routing. On the contrary, CAS is just suitable for static routing and the delivery ratio of one hop. Third, CAS uses the enumeration algorithm that is poorer in scalability, but we adopt a more general and efficient algorithm, that is, GA, to solve the optimization problem.

System model In our model of DTNs, we partition the nodes into multiple nonoverlapping communities and assume that there are n communities and any community i is denoted by Ci , i 2 ½1, n. Moreover, each community Ci has jCi j nodes in it. It needs to be pointed out that the multiple communities in practice can overlap, that is, a node can belong to more than one community. However, for simplicity and abstraction of our model, we assume in this article that these communities have no overlap, which is also assumed in some related work.13,19 We also assume that the nodes move according to the random waypoint (RWP) mobility model, which is widely applied in the research of mobile and wireless networks as a good abstraction of realistic movement scenarios,24 and the occurrence of the

International Journal of Distributed Sensor Networks

Figure 1. Model of multi-community DTNs.

contacts between any two nodes follows a Poisson process which has been validated.25 Because nodes in the same community meet each other with high frequency, while nodes belonging to different communities merely meet each other, in our model, we assume that the nodes in the same community have the same average inter-contact time with each other and the average contact rate in Ci is denoted as li . Such an assumption is a reasonable approximation in modeling and also adopted in previous work.18–20 In view of the whole network, we model that the communities Ci and Cj communicate only via a pair of nodes, which are the most closely related nodes between the two communities and called the gateway nodes of the communities Ci and Cj . Then we can denote lij as the average contact rate between the communities Ci and Cj , which is equal to the average contact rate between the gateway nodes of the communities Ci and Cj . Moreover, we note that different communities communicate via different pairs of gateway nodes. Our model of the multicommunity DTNs is shown in Figure 1. We now analyze the message forwarding process. As shown in Figure 1, when the source node S, which is in the community Cs , generates a message and wants to send the message to the destination node D in the community Cd . In our network model, it is possible that there exists more than one path from the community Cs to the community Cd . Then, the question is how to find the shortest path. From Figure 1, the multi-community DTNs can be viewed as a weighted undirected graph (WUG) in which a vertex represents a community and an edge represents a link between two communities. The weight of the link between Ci and Cj , denoted as Wij , is given by equation (1) Wij =

1 lij

ð1Þ

where Wij is the weight in the WUG, which is used to compute the shortest path, and 1=lij represents the average inter-contact time between the communities Ci

Wu et al.

5

and Cj . Considering that the average inter-contact time between different communities is far greater than that in the same community, here we use only 1=lij to represent the weight. Based on the WUG, the shortest path R from the source to the destination can be obtained by Dijkstra’s algorithm. For the sake of clarity and without loss of generality, let us assume that the communities on the path R are \C1 , C2 , . . . , Ck . sequentially, the source node S and the destination node D are in the communities C1 and Ck , respectively, and k is the total number of hops of the path R. According to the path R, at the beginning, the source node S should deliver the message to the gateway node in the community C1 , which connects the communities C1 and C2 . After that, when the gateway node in C1 encounters the gateway node in C2 , the message is forwarded to the community C2 . Then, the gateway node in C2 becomes a new source node of the message, which will be delivered to the next community on the path R. The process mentioned above will be repeated until the message is delivered to the destination node or its TTL expires. As shown in Figure 1, the whole message forwarding process consists of a series of subprocesses, each of which is a complete message delivery in a community on the path R. Therefore, we define P = (p1 , p2 , . . . , pk ) and T = (t1 , t2 , . . . , tk ) to be the delivery probabilitylimit vector and TTL-limit vector for all communities on the path R, respectively. Since different delivery probability- and TTL-limit vectors lead to different message delivery costs. If we want to deliver a message from the source node S to the destination node D with minimum cost and meet the desired delivery probability P0 before the restricted lifetime of message T0 at the same time, a reasonable allocation method of the limitation of the delivery probability and TTL for communities on the path R is necessary. Based on the considerations above, we can formulate the optimization problem for delivery probability-limit vector P and TTL-limit vector T as min Cost(P, T )  Ratio(P)  P0 s:t: Time(T )  T0

ð2Þ

where k Y

Ratio(P) =

pi

ð3Þ

i=1

Time(T ) =

k X

ti +

i=1

Cost(P, T ) =

k X i=1

k1 X

1 l i = 1 i, i + 1

Costi (pi , ti )

ð4Þ

ð5Þ

As described previously, the message will be delivered by a series of subprocesses, which are the independent events in probability theory. Hence, the total delivery probability with the delivery probability-limit vector P, that is, Ratio(P), can be given by equation (3), which must be no less than P0 as shown in equation (2). Moreover, equation (4) represents the total time, that is, Time(T ), which is made up of the sum of the TTL components of the TTL-limit vector T allocated to the message for different communities and the sum of the average inter-contact time between communities on the path R, which must be no greater than T0 as shown in equation (2). Now, let us focus on the delivery cost. We assume that the delivery cost of the message in the community Ci on the path R is Cost(pi , ti ), i 2 ½1, k, which is determined by the values of pi and ti . Hence, the total delivery cost, denoted Cost(P, T ), of the message is determined by the delivery probability-limit vector P and TTL-limit vector T , which is the sum of the delivery cost in each community on the path R as shown in equation (5). In the static quota-controlled routing protocol, such as S&W,8 if we assume that the source node is in the community Ci (i.e. the source node of the message or the gateway node which connects the communities Ci1 and Ci ) and Li is the minimum value of the quota, which can achieve the delivery ratio pi before TTL ti , Li can be obtained by the following equation9 Li =

ln (1  pi ) li ti

ð6Þ

Then we can have the minimum delivery cost in the community Ci with S&W algorithm as Costi (pi , ti ) = Li

ð7Þ

In order to optimize the cost efficiency of the routing further, in this article, we use the dynamic quotacontrolled routing protocol, such as multi-period S&W,9 since the dynamic routing protocol can achieve lower delivery cost than the static routing protocol as we discussed above. Specifically, we will use 2-period S&W algorithm in our model, because the simulation results show that 2-period S&W algorithm is very easy to be implemented and the performance of it is nearly as good as that of other multi-period algorithms in general.9 In 2-period spraying algorithm, the source node in the community Ci sets the quota of a message to be L1i at the beginning. When the gateway node in the community Ci which connects the communities Ci and Ci + 1 or the destination node receives any copy of the message, it will send an acknowledgment message with epidemic routing to all other nodes notifying them to stop

6

International Journal of Distributed Sensor Networks

the further spray. If the source node does not receive any acknowledgment of the message at time tdi , it will increase additional quota L2i  L1i of the message for the 2nd period of spray. The optimal values of L1i , L2i , and tdi can be calculated according to pi and ti under the assumption of L1i \L2i  jCi j, which is reasonable in DTNs.9 Hence, we can have the minimum delivery cost in the community Ci with 2-period S&W algorithm as Costi (pi , ti ) = L1i pdi + L2i (1  pdi )

ð8Þ

pdi = 1  eli L1i tdi

ð9Þ

where

Algorithm design The problem described in equations (2)–(5) can be regarded as a nonlinear constrained optimization problem. As discussed above, we should choose the appropriate delivery probability-limit vector P and TTL-limit vector T , with which the delivery cost is minimum and the inequality constraints are incorporated at the same time. However, most traditional optimization algorithms usually achieve a local optimal solution rather than the global optimal solution, which cannot effectively solve this kind of optimization problem. Thus, in this article, we consider an appropriate evolutionary method to solve it. During the past two decades, GAs have been successfully and broadly applied to solve constrained optimization problems.26 Hong et al.27 observed that in most GA variants only crossover and mutation operators are employed in each generation. As a result, the search ability of these algorithms could be limited. Hence, they improved the algorithm efficiency by means of adding the best preserving policy, an elitismbased immigrant policy and improving the traditional training process of the GA to overcome these drawbacks. In this article, we will use this improved GA and propose an algorithm called GAPTO to solve the optimization problem in equations (2)–(5). The details of the algorithm are described as follows.

Representation and initialization In this article, we denote the jth chromosome in the current population as Hj = (Pj , Tj ) ð10Þ = ((p1j , p2j , . . . , pkj ), (t1j , t2j , . . . , tkj )), j 2 ½1, m

where m is the maximum number of chromosomes in a generation, and pij and tij , i 2 ½1, k, represent the required delivery ratio pi and message TTL ti allocated to the community Ci in Hj , respectively.

Then the initialization operation generates m chromosomes randomly, of which the genes, pij and tij , are directly coded within their corresponding bounds 

pij = P0 + s(1  P0 ) tij = sT0

ð11Þ

where s is a uniform random number between 0 and 1. Afterward, we check whether the chromosome Hj satisfies the inequality constraints in equation (2). If not, a new chromosome Hj is regenerated by equation (11) and checked until the constraints are satisfied. After the initialization operation, we obtain m feasible chromosomes. This operation ensures that the optimization starts in a feasible region, which can predigest the calculation and decrease the search pressure.

Fitness function Although the initial chromosomes are all feasible after initialization, Hj sometimes may not satisfy the inequality constraints in equation (2) during the subsequent genetic operations, for example, crossover and mutation operations. Therefore, a penalty function is needed to convert the nonlinear constrained optimization into an unconstrained optimization problem based on equation (2). In addition, considering the difference range of Ratio(Pj ) and Time(Tj ), a normalization method is necessary to normalize them to the range between 0 and 1. In this way, all terms of the fitness function will be in the same order, which can speed up the convergence. Finally, the fitness function Fit(Hj ) with the objective function Cost(Hj ) and the penalty function G(Hj ) is defined as Fit(Hj ) = Cost(Hj ) + G(Hj )

ð12Þ

where    h P0 G(Hj ) = m max 1, 0 Ratio(Pj )   ð13Þ h T0 + max  1, 0 2T0  Time(Tj )

where m (m.0) and h (h.1) are the penalty factors. From equation (13), we can find that the ranges of P0 =Ratio(Pj ) and T0 =(2T0  Time(Tj )) are normalized between 0 and 1. The penalty factor m is used to unify the objective function and penalty function into the same order. According to the penalty factor h, when Ratio(Pj ) and Time(Tj ) do not satisfy the constraint conditions in equation (2), G(Hj ) will get a large value. Clearly, the fitness function Fit(Hj ) will reach the minimum optimization point only when G(Hj ) = 0. That is, all genes of the chromosome Hj will be in the feasible region.

Wu et al.

7

Crossover operator In order to increase the local diversity of crossover chromosomes, Ha and Hb are selected randomly with the crossover probability a. Then we introduce a linear combination of Ha and Hb , and each gene of Ha0 and Hb0 are reproduced as follows 8 0 p = spia + (1  s)pib > > < 0ia t ia = stia + (1  s)tib ð14Þ 0 p = spib + (1  s)pia > > : 0ib t ib = stib + (1  s)tia

Mutation operator Let the current best chromosome be He , which has the minimum fitness. Then, we choose randomly a chromosome Hf with the mutation probability b. If the difference of Fit(He ) and Fit(Hf ) is less than 102 , which means that the population converges, we introduce a linear combination for He and Hf to obtain the mutation chromosome Hf0 . Each gene of Hf0 is reproduced as follows (

p0 if = pie + s(pie  pif ) t0 if = tie + s(tie  tif )

ð15Þ

Otherwise, if the current chromosomes He and Hf are obviously different, to avoid premature convergence, we choose Hu and Hv in the current population randomly. Then each gene of Hf0 is reproduced as follows 

p0 if = pif + s(pie  pif ) + s(piu  piv ) t0 if = tif + s(tie  tif ) + s(tiu  tiv )

ð16Þ

In equation (16), the difference of He and Hf and the difference of Hu and Hv act as two search directions in the solution space, which can try to cover more search space when the population is in a local minimum.

Migration operator In order to greatly increase the exploration of the search space, we introduce the migration operation, in which chromosome is generated on the basis of the best chromosome He . Then we choose the wth gene and the zth gene of He with the migration probability x. Then the new genes of He0 are reproduced as follows ( p0we

=

0 = tze

we P0 pwe + s(P0  pwe ), g\ p1P 0 we P0 pwe + s(1  pwe ), g  p1P 0 ( g\ Ttze0 (1  s)tze ,

(1  s)tze + sT0 , g  Ttze0

ð17Þ

ð18Þ

where g is also a uniform random number between 0 and 1 as s. This migration operator can adjust the value of a gene properly within its bounds. If the value approaches the upper boundary, this operator can pull back it from the upper boundary. On the contrary, if it nears the lower boundary, the value will be increased.

Select operator Select m  1 members from the current population and add them to the next generation by roulette wheel election mechanism. Moreover, elitism strategy is embedded into the select operator, which means that the best chromosome of the current generation will be reserved to the next generation. The pseudo code of the proposed algorithm is shown in Algorithm 1. According to the algorithm, at first, m chromosomes are generated randomly to construct the first generation with initialization operator (lines 3–8). Then the crossover (lines 10–13), mutation (lines 14– 23), and migration (lines 24–28) operations are performed, respectively, and each of which is repeated m times. Afterward, a new generation with m chromosomes is generated using the selection operator (lines 29–32). After total generations of evolution, at last, we select the best chromosome and return the optimal delivery probability-limit vector P and TTL-limit vector T  (lines 34–35).

Implementation consideration In this section, we have a discussion about the implementation of a dynamic quota-controlled routing protocol with GAPTO. Assume that all nodes know the network parameters, including the number of communities n, average contact rate in communities fli g, average contact rate between communities flij g, desired delivery probability P0 , message TTL T0 , maximum number of chromosomes in a generation m, total number of iterations total, crossover probability a, mutation probability b, migration probability x, penalty factors m and h. When a source node wants to send a message to a destination node, at first, the source node will find the shortest path based on the model of multicommunity DTNs by Dijkstra’s algorithm, which is denoted as \C1 , C2 , . . . , Ck .. Then the source node runs Algorithm 1 to obtain the optimal delivery probability-limit vector P and TTL-limit vector T  . After this, the message is generated and the corresponding vectors P and T  are associated with it. During the routing, the source node or the gateway node in the communities Ci (1  i  k) computes the values of L1i , L2i , and tdi as described previously. Afterward, the message will be forwarded by 2-period S&W routing protocol. That means the message will be

8 Algorithm 1 Genetic algorithm for delivery probability and TTL optimization (GAPTO) Input: P0 , T0 , k, fli g, flij g, m, total, a, b, x, m, and h Output: P and T  1: gen 1 2: for j = 1 to m do 3: randomly generate a chromosome Hj 4: while Hj is not satisfied the constraints in equation (2) do 5: randomly generate a new chromosome Hj 6: end while 7: end for  Hj 8: POPgen 9: while gen  total do 10: for j = 1 to m do 11: select two chromosomes Ha and Hb with the crossover probability a 12: generate H0a and H0b with crossover operator by equation (14) 13: end for 14: for j = 1 to m do the best chromosome 15: He 16: randomly choose a chromosome Hf with the mutation probability b



17: if Fit(Hf )  Fit(He )  102 then 18: generate H0f with mutation operator by equation (15) 19: else 20: randomly choose two chromosomes Hu and Hv 21: generate H0f with mutation operator by equation (16) 22: end if 23: end for 24: for j = 1 to m do the best chromosome 25: He 26: randomly choose the wth gene and the zth gene of He with the migration probability x 27: generate H0e with migration operator by equations (17) and (18) 28: end for the best chromosome 29: He 30: add He to POPgen + 1 31: select m–1 members of POPgen to add to POPgen + 1 by using roulette wheel selection mechanism 32: gen gen + 1 33: end while the best chromosome 34: He Pe , T  Te 35: return P

forwarded in the community Ci until it is delivered to the destination node or the gateway node which connects the communities Ci and Ci + 1 . Therefore, with the optimal vectors P and T  , the routing protocol with GAPTO can maintain the desired delivery probability and minimize the delivery cost as excepted.

Performance evaluation In this section, we first analyze the effect of the parameters of GAPTO. Afterward, the proposed protocol is evaluated through both numerical computation and simulation. At last, the performance comparison of

International Journal of Distributed Sensor Networks GAPTO with the other two algorithms is made through theoretical results.

Effect of the parameters of GAPTO We now analyze the delivery cost of GAPTO affected by the migration probability x and the penalty factor h. We assume that the number of communities on the shortest path R, that is, k, ranges from 2 to 16. And we set the average contact rate li in the community Ci as follows li =

1 , i1 t 0 + (t 1  t 0 ) k1

1ik

ð19Þ

where t0 = 103 s and t1 = 104 s, which means that the average inter-contact time in the community Ci is different and between t 0 and t 1 . Moreover, the restricted lifetime of the message T0 should be long enough to let equation (4) satisfy the TTL constraint in equation (2). Thus, T0 is also set differently for different numbers of communities, that is, T0 is set at 1:5 3 104 s, 3:5 3 104 s, 6:5 3 104 s, and 1:25 3 105 s for k = 2, 4, 8, and 16, respectively. we also set the desired delivery probability P0 = 0:8 and the inter-community average contact rate lij = 1:996 3 104 =s (i 6¼ j). In this article, we choose crossover probability a = 0:8, mutation probability b = 0:6, and penalty factor m = 100 based on our experiment and experience. Figures 2 and 3 show the performance metrics in terms of the delivery cost versus migration probability x and penalty factor h with different k values, respectively. From Figure 2, we can find that the smaller the k is, the lower the delivery cost obtained. Because T0 is made up of the sum of the TTLs allocated to the message in all communities and the sum of the average inter-contact time between communities on the shortest path R according to equation (4). When k is small, the sum of the average inter-contact time between communities on R is relatively small. Then the source nodes or gateway nodes can set lower quotas with longer TTLs allocated to the message in the communities on R under the same required delivery ratio. On the contrary, when k increases, higher quotas are needed for shorter TTLs in the communities on R. Let us focus on the migration probability x. Although the population has a good stability due to no migration when x is 0, it is easy to fall into a local optimum rather than the global optimum. When x becomes larger, the stability is disrupted gradually, but the population diversity is ensured, which can avoid premature convergence. Thus, the delivery cost decreases when x changes from 0 to 0.6. However, the performance of the delivery cost becomes worse again when x is larger than 0.6 because of the slow convergence rate and the poor stability of population under larger x. At that point, the excellent chromosome is

Wu et al.

9

120

Delivery Cost

100

k =2 k =4 k =8 k =16

80

60

40

20

0

0

0.2

0.4

χ

0.6

0.8

1

Figure 2. Delivery cost versus migration probability x.

likely to become worse in the process of migration. Therefore, the delivery cost increases. Considering that an appropriate x can realize the parallel population diversity and selective pressure, we choose a value of x = 0:6 in our algorithm. It can be seen from Figure 3 that the delivery cost is the minimum when the penalty factor h is around 4.0 for different k values. When h is 0, there is no punishment on the fitness function in equation (12). As a result, those chromosomes not in the feasible domain may be reserved, and a large amount of search time will be spent in the infeasible domain. Thus, the solution may be a local optimum. When h increases, the strength of the punishment increases and more feasible domain is covered. This makes the solution jump from the local optimum and guides the search in promising directions. Therefore, the delivery cost approaches the global optimal value gradually. However, when h is more than 4.0, the punishment becomes too strong. Once a few genes of a chromosome are out of the feasible domain, the chromosome will be eliminated completely, which tends to decrease the search space and ignore the boundary of the feasible region. Thus, GAPTO cannot guarantee to find the optimal value, then the solution becomes worse again. To ensure the reasonable search space, h is chosen as 4.0 in this article.

Numerical and simulation evaluation In order to validate the proposed protocol in section ‘‘Implementation consideration,’’ we evaluate our protocol via both numerical computation and extensive simulation by MATLAB28 and the ONE simulator,29 respectively. In the simulation, the whole terrain is divided into different regions to form different communities. Besides, by changing the velocity of nodes in

Figure 3. Delivery cost versus penalty factor h.

Table 1. Simulation parameters. Parameter

Value

Mobility model Terrain size Velocity of nodes in communities on the path R Number of nodes in each community on the path R Transmission range Transmission speed Message size Acknowledgment message size Buffer size

Random waypoint 4000 m 3 4000 m 4~10 m/s 400 50 m 2 MB/s 8 KB 2 bytes 500 MB

different communities, we can obtain the different average contact rates li in different communities. The settings of the simulation parameters are shown in Table 1. Transmission speed and buffer size were sufficiently large to ignore the queuing effects, and the acknowledgment message size is so small that its cost can be ignored. Then, we set the inter-community average contact rate lij = 1:996 3 104 =s (i, j 2 ½1, k, i 6¼ j), P0 = 0:8, and k from 4 to 6. Figures 4–6 plot the performance metrics of the proposed protocol obtained by the numerical computation and simulation. It can be seen that all the results of the numerical computation are consistent with those of the simulation very well. Figure 4 shows the delivery cost versus the message TTL (T0 ) with different number of communities (k). It can be seen that the delivery cost increases with the increased k under a given T0 . The reason is the same as above, that is, the smaller the TTL allocated to the message in a community, the larger the quota that will be set in order to satisfy the requirement of delivery

10

Figure 4. Delivery cost versus T0 with different k.

International Journal of Distributed Sensor Networks

Figure 6. Delivery delay versus T0 with different k.

increases. This is because our protocol is optimized for the minimum cost. When T0 increases, our protocol can set smaller quota to satisfy the same P0 , which leads to the increase of the delivery delay. In addition, we observe that the delivery delay increases with decreasing k when T0 is the same, because the smaller the k value is, the longer the TTL and the lower the quota that are allocated to the message. Similarly, because of the aforementioned reason, the delivery delay increases.

Comparison with different algorithms

Figure 5. Delivery probability versus T0 with different k.

probability. Moreover, the simulated cost of the proposed protocol is slightly high compared to the numerical results. This is because the routing of the acknowledgment message adopts epidemic routing. In our simulation, the effect of the acknowledgment delay can be observed, which makes the process of the spray stop later and more copies of the message to be sprayed into the network. However, the gap between the numerical and simulated results will be narrowed when T0 becomes large. Figures 5 and 6 show the delivery probability and delivery delay versus T0 with different k values, respectively. It can be seen from Figure 5 that our protocol can achieve the desired delivery probability as we expect, which verifies the validity of our proposed protocol. In Figure 6, the delivery delay increases as T0

To evaluate the performance of GAPTO further, we compare GAPTO with the other two algorithms, proportional allocation algorithm (PAA) and random allocation algorithm (RAA), which are only different in the allocation algorithm for the delivery probability-limit vector P and TTL-limit vector T. Under the condition of satisfying the constraints in equation (2), in PAA, the assigned values of pi in P are directly proportional to li , while ti in T are inversely proportional to li , i 2 ½1, k. In RAA, the values of pi in P and ti in T are chosen randomly under the constraints in equation (2), i 2 ½1, k. Since the results of theory and simulation show good coherence in Figures 4–6, for convenience, the comparison is analyzed on the basis of numerical results by MATLAB. Let P0 , T0 , li , lij (i 6¼ j) be the same as those in section ‘‘Effect of the parameters of GAPTO.’’ We can calculate the delivery cost and delivery delay with different algorithms under different k and T0 values. Then the results are shown in Figures 7 and 8. From Figure 7, it can be seen that GAPTO always achieves the minimum delivery cost, because GAPTO

Wu et al.

11

Figure 7. Delivery cost versus k with different algorithms.

Figure 9. Delivery cost versus k with different protocols.

Figure 8. Delivery delay versus k with different algorithms.

Figure 10. Delivery delay versus k with different protocols.

not only retains the advantages of traditional genetic method, but also has better convergence and higher search efficiency, which is commonly used to generate high-quality solutions to optimization problems. On the contrary, RAA is always the worst in these three algorithms, because the results of RAA have the characteristics of randomness. PAA can also achieve a good result since there is a little difference of delivery cost between PAA and GAPTO. However, the difference becomes obvious when k increases. Besides better performance, GAPTO is more general and has a wider range of applications than PAA. Figure 8 shows the delivery delay versus k with different algorithms. It can be seen that the delivery delay of these three algorithms is similar to each other. Considering the delivery cost in Figure 7, we can find that GAPTO is the most efficient.

Moreover, to present the advantages in cost efficiency of the dynamic to the static quota-controlled routings, we compare GAPTO for 2-period S&W protocol against GAPTO for the original S&W protocol, which are called GAPTO and GAPTO_static, respectively. The only difference between GAPTO and GAPTO_static is the expression of the minimum delivery cost according to our system model. Figure 9 shows the delivery cost of the compared routing protocols. Clearly, the delivery cost of GAPTO is about 30% less than that of GAPTO_static. Thus, the cost efficiency of the dynamic quota-controlled routing is validated again. In addition, from Figure 10, we can see that the delivery cost of GAPTO is higher than that of GAPTO_static. This is because there usually exists a tradeoff between delivery cost and delay in DTNs. In this article, our focus is on minimizing the

12

International Journal of Distributed Sensor Networks

delivery cost and just limiting the delivery delay within the required TTL. Hence, the dynamic quota-controlled routing protocol is more cost efficient.

Conclusion In this article, we present a network model in the multicommunity DTNs and apply the dynamic quotacontrolled routing scheme into the model. Based on this, we formulate a routing problem of minimizing the delivery cost that satisfies the required delivery probability under the given TTL of message as a nonlinear optimization problem. Then we propose GAPTO and a cost-efficient dynamic quota-controlled routing protocol, which is based on GAPTO. The results of both numerical computation and simulation demonstrate the cost efficiency improvement of the proposed protocol. In the future, we are going to extend the network model and study our protocol under different mobility scenarios and heterogeneous communities. Declaration of conflicting interests The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the National Natural Science Foundation of China under Grant Nos 41571389 and 61373139, Open Research Fund from Key Laboratory of Computer Network and Information Integration (SEU), Ministry of Education, China, under Grant No. K93-9-201405B, and Scientific Research Foundation of NUPT under Grant No. NY214063.

ORCID iD Jiagao Wu

https://orcid.org/0000-0003-3109-2553

References 1. Fall K. A delay-tolerant network architecture for challenged internets. In: Proceedings of the conference on applications, technologies, architectures, and protocols for computer communications of ACM, Karlsruhe, 25–29 August 2003, pp.27–34. New York: ACM. 2. Khabbaz M and Assi Fawaz CMW. Disruption-tolerant networking: a comprehensive survey on recent developments and persisting challenges. IEEE Commun Surv Tut 2012; 14(2): 607–640. 3. Spyropoulos T, Rais RNB, Turletti T, et al. Routing for disruption tolerant networks: taxonomy and design. Wirel Netw 2010; 16(8): 2349–2370.

4. Kang H, Ahmed SH, Kim D, et al. Routing protocols for vehicular delay tolerant networks: a survey. Int J Distrib Sens N 2015; 11(3): 325027. 5. Xia F, Liu L, Li J, et al. Socially aware networking: a survey. IEEE Syst J 2015; 9(3): 904–921. 6. Spyropoulos T, Psounis K and Raghavendra CS. Efficient routing in intermittently connected mobile networks: the single-copy case. IEEE ACM T Network 2008; 16(1): 63–76. 7. Vahdat A and Becker D. Epidemic routing for partially connected ad hoc networks. Technical Report CS-200006, 2000. Durham, NC: Duke University. 8. Spyropoulos T, Psounis K and Raghavendra CS. Spray and wait: an efficient routing scheme for intermittently connected mobile networks. In: Proceedings of ACM SIGCOMM workshop, Philadelphia, PA, 22–26 August 2005, pp.252–259. New York: ACM. 9. Bulut E, Wang Z and Szymanski BK. Cost-effective multiperiod spraying for routing in delay-tolerant networks. IEEE/ACM T Network 2010; 18(5): 1530–1543. 10. Abbas A, Lee C and Kim K. Delay bounded spray and wait in delay tolerant networks. In: Proceedings of the 9th international conference on ubiquitous information management and communication (IMCOM), Bali, Indonesia, 8– 10 January 2015, pp.1–4. New York: ACM. 11. Boldrini C and Passarella A. HCMM: modelling spatial and temporal properties of human mobility driven by users’ social relationships. Comput Commun 2010; 33(9): 1056–1074. 12. Daly EM and Haahr M. Social network analysis for routing in disconnected delay-tolerant MANETs. In: Proceedings of the 8th ACM international symposium on mobile ad hoc networking and computing (MobiHoc), Montreal, QC, Canada, 9–14 September 2007, pp.32–40. New York: ACM. 13. Hui P, Crowcroft J and Yoneki E. Bubble Rap: socialbased forwarding in delay tolerant networks. In: Proceedings of the 9th international symposium on mobile ad hoc networking and computing (MobiHoc), Hong Kong SAR, China, 26–30 May 2008, pp.241–250. New York: ACM. 14. Zhu K, Li W and Fu X. SMART: a social- and mobileaware routing strategy for disruption-tolerant networks. IEEE T Veh Technol 2014; 63(7): 3423–3434. 15. Gondaliya N, Shah M and Kathiriya D. A node scheduling approach in community based routing in social delay tolerant networks. In: Proceedings of international conference on advances in computing, communications and informatics (ICACCI), Kochi, India, 10–13 August 2015, pp.594–600. New York: IEEE. 16. Li Y, Cao Y, Li S, et al. Integrating forwarding and replication in DTN routing: a social network perspective. In: Proceedings of vehicular technology conference(VTC), Yokohama, Japan, 15–18 May 2011, pp.1–5. New York: IEEE. 17. Bulut E and Szymanski BK. Exploiting friendship relations for efficient routing in mobile social networks. IEEE T Parall Distr 2012; 23(12): 2254–2265. 18. Xiao M, Wu J and Huang L. Home-based zeroknowledge multi-copy routing in mobile social networks. IEEE T Parall Distr 2015; 26(5): 1238–1250.

Wu et al. 19. Miao J, Hasan O, Mokhtar SB, et al. A delay and cost balancing protocol for message routing in mobile delay tolerant networks. Ad Hoc Netw 2015; 25: 430–443. 20. Wu J, Zhu Y, Liu L, et al. Energy-efficient routing in multi-community DTN with social selfishness considerations. In: Proceedings of IEEE global communications conference (GLOBECOM), Washington, DC, 4–8 December 2016, pp.1–7. New York: IEEE. 21. Jones EPC, Li L, Schmidtke JK, et al. Practical routing in delay-tolerant networks. IEEE T Mobile Comput 2007; 6(8): 943–959. 22. Pirozmand P, Wu G and Jedari B. Human mobility in opportunistic networks: characteristics, models and prediction methods. J Netw Comput Appl 2014; 42(3): 45–58. 23. Pirozmand P, Wu G and Jedari B. Human movement models in mobile social networks. In: Proceedings of IEEE international conference on mobile ad-hoc and sensor networks, Dalian, China, 11–13 December 2013, pp.359– 364. New York: IEEE. 24. Petz A, Enderle J and Julien C. A framework for evaluating DTNs mobility models. In: Proceedings of 2nd international conference on simulation tools and techniques (Simutools), Rome, 2–6 March 2009, pp.1–8. New York: ACM.

13 25. Gao W, Li Q, Zhao B, et al. Multicasting in delay tolerant networks: a social network perspective. In: Proceedings of the tenth ACM international symposium on mobile ad hoc networking and computing ACM (MobiHoc), New Orleans, LA, 18–21 May 2009, pp.299–308. New York: IEEE. 26. Alavi SH, Jassbi J and Serra PJA. Comparison of genetic and gradient descent algorithms for determining fuzzy measures. In: Proceedings of international conference on intelligent engineering systems, London, 26–28 June 2006, pp.139–144. New York: IEEE. 27. Hong L and Yong J. Evolutionary algorithms for several complex optimization problems and their applications. PhD Thesis, Xidian University, Xi’an, China, 2009. 28. Gilbert JR, Moler C and Schreiber R. Sparse matrices in MATLAB: design and implementation. SIAM J Matrix Anal Appl 1992; 13(1): 333–356. 29. Kera¨nen A, Ott J and Ka¨rkka¨inen T. The ONE simulator for DTN protocol evaluation. In: Proceedings of 2nd international conference on simulation tools and techniques (Simutools), Rome, 2–6 March 2009, pp.1–10. New York: ACM.