Using an Evolutionary Algorithm to Optimize the Broadcasting Methods in Mobile Ad hoc Networks

Wahabou Abdou, Adrien Henriet, Christelle Bloch, Dominique Dhoutaut, Damien Charlet and François Spies
Laboratoire d'Informatique de l'Université de Franche-Comté
1 Cours Leprince-Ringuet, 25201 Montbéliard (France)
Tel: +33 381 99 47 77  Fax: +33 381 99 47 91
{firstname.lastname}@univ-fcomte.fr

Abstract

A mobile ad hoc network (MANET) is a collection of mobile nodes communicating through wireless connections without any prior network infrastructure. In such a network, broadcasting methods are widely used for sending safety messages and routing information. Transmitting a broadcast message effectively in a large, highly mobile MANET (for instance a Vehicular Ad hoc Network) is a hard task to achieve. An efficient communication algorithm must take into account several aspects, such as the neighborhood density, the size and shape of the network, and the channel usage. Probabilistic strategies are often used because they do not involve additional latency. Some solutions have been proposed to make their parameters vary dynamically; for instance, the retransmission probability increases when the number of neighbors decreases. However, their authors do not optimize the parameters for various environments. This article aims at determining the best communication strategies for each node according to its neighborhood density. It describes a tool combining a network simulator (ns-2) and an evolutionary algorithm (EA). Five types of context are considered. For each of them, we search for the best behavior of each node by determining the right input parameters. The proposed EA is first compared to three EAs found in the literature: two well-known EAs (NSGA-II and SPEA2) and a more recent one (DECMOSA-SQP). Then, it is applied to the MANET broadcasting problem.

Keywords: Multi-objective Evolutionary Algorithm, MANET, VANET, Broadcast, Flooding, Simulation, Channel Saturation.

Preprint submitted to Journal of Network and Computer Applications, November 12, 2010

1. Introduction

A mobile ad hoc network (MANET) is a set of mobile nodes communicating through wireless connections without any pre-existing network infrastructure. The mobile nodes can be laptops, mobile phones or any other equipment with a wireless network device. MANETs were initially used to ease the establishment of communication systems during the relief operations following a disaster. Nowadays, these networks are used for various purposes: sharing data, playing games, sending road traffic information, etc.

Due to the constraints related to signal propagation (distance and/or obstacles), each node has a limited coverage area (the area delimiting the neighbors it can communicate with). In such an environment, if a node wants to send a message beyond its coverage area, at least one of the neighbors it can communicate with should agree to relay its message. There are different types of communication between nodes: unicast (one source to one destination), multicast (one source to multiple destinations) or broadcast (from a source to all the nodes in its coverage area). This last type (broadcast) is widely used in MANETs, especially for sending safety messages and routing information.

Since the radio resources (for instance bandwidth) are limited, the communication must be managed effectively, which means it must avoid unnecessary retransmissions. The bandwidth consumption is related to the number of nodes that are in the same coverage area. In a dense environment, if each node relays each message as soon as it is received, the number of collisions will quickly grow, preventing potentially highly relevant and time-critical messages from getting access to the shared wireless channel. In a sparse environment (low density), if nodes rarely relay the communications, the broadcasting chain might be broken (when a realistic propagation model is used).
Regarding message retransmissions, the behavior of the nodes must therefore depend on the context. The problem is how to communicate effectively without unnecessarily saturating the channel.

Much research focuses on message dissemination strategies in MANETs. [1] and [2] present several mechanisms to reduce redundancy, contention and collisions in MANETs. They both make the retransmission probability vary according to the number of neighbors. In [1], the authors define two classes of nodes depending on a given threshold τ. The nodes whose number of neighbors is greater than τ belong to class α. The other nodes belong to class β. All the messages sent from or received by nodes of class β are treated with higher priority. In [2], the retransmission probability P varies between a lower bound and an upper bound. P decreases if the number of neighbors is greater than a given threshold; otherwise P increases. In [3], the authors propose an altruistic communication scheme that differentiates messages by their relevancy. This method is said to be altruistic because each node in the network does not only consider its own benefit: it also takes into account the potential recipients' benefit to assign priorities to the messages. In all these approaches, choosing the values of the parameters (such as the lower and upper bounds of the probability, or the threshold values) is the main difficulty. In the best cases, some authors provide guidelines to tune or to compute them, but they do not optimize them for various contexts.

In this article, we suggest the use of four parameters to adapt the message dissemination to the context:

• the probability for each node to relay a communication;
• the number of times each node retransmits messages;
• the delay between two successive retransmissions;
• the TTL (Time To Live), denoting the maximal number of hops a message is allowed to travel.

Using these four parameters, we intend to implement a self-regulating protocol that takes the environment context into account. This article describes a methodology to identify the optimal parameters according to a given density. An evolutionary algorithm (EA) is used to explore the possible settings. Then, considering each of these settings (i.e. a set of parameters), the behavior of a set of nodes is simulated using a network simulator. Our goal is to reduce both the channel utilization and the broadcast propagation time of the message (the time until all nodes receive the message) in a given area.
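As an illustration, the forwarding decision induced by these four parameters can be sketched in a few lines (a minimal sketch; the function name and its arguments are hypothetical, not part of the protocol described in this article):

```python
import random

def plan_retransmissions(p, nr, dr, ttl, hops_travelled, rng=random):
    """Decide whether a freshly received message is relayed and, if so, when.
    p: relay probability; nr: number of retransmissions;
    dr: delay (in seconds) between retransmissions; ttl: maximal hop count.
    Returns the list of scheduled transmission times (empty if dropped)."""
    if hops_travelled >= ttl:            # TTL exhausted: never relay
        return []
    if rng.random() > p:                 # probabilistic drop
        return []
    return [i * dr for i in range(nr)]   # nr repeats spaced dr seconds apart
```

For example, with p = 1, nr = 3 and dr = 0.5 s, a relaying node schedules transmissions at 0, 0.5 and 1 s after its decision.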
The remainder of this article is organized as follows. Section 2 presents different methods of broadcasting in wireless ad hoc networks. Section 3 describes our approach, which consists in combining an EA and a network simulator to determine the optimal parameters of message dissemination in a given context. The configuration of the EA that is used and its validation by a comparative study are outlined in Section 4. In Section 5, the proposed method is evaluated and the results are discussed. Section 6 presents concluding remarks and outlines future work.

2. Related work

The goal of the optimization of broadcasting techniques in wireless ad hoc networks is to reduce the number of redundant broadcast messages without decreasing reachability or increasing latency. The existing methods can be classified into five groups [2, 4]: simple flooding, probabilistic, counter-based, area-based, and neighbor knowledge-based methods.

In the simple flooding method, each node relays every message exactly once. In case of redundancy, the duplicated copies of the message are discarded. This technique gives interesting results in a moderately dense environment. However, when the density is high, many relayed messages may be redundant and waste the channel bandwidth.

The probabilistic methods [1, 5, 6, 7] aim to improve the simple flooding technique. Upon the reception of a packet, the node forwards or drops it depending on a given probability. One challenge is to determine the appropriate value of this probability. Even if values between 0.6 and 0.8 are considered optimal by Li et al. [5], they are not likely to be globally optimal. If this probability is equal to 1, this method is equivalent to simple flooding.

The counter-based methods [2, 8] rely on the idea that the more often a message is received by a single node, the smaller the additional area that will be reached if this node relays the message. When a node receives a message for the first time, it initializes a counter with a value of 1, then sets a Random Assessment Delay (RAD), a waiting time chosen in a given interval. Before the expiration of the RAD, each time the same message is received, the counter is incremented by 1. When the RAD is over, if the counter is less than a given threshold, the message is relayed. Otherwise the message is discarded.
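At RAD expiry, the counter-based decision reduces to a simple comparison; the following sketch (hypothetical function name) captures it:

```python
def counter_based_decision(duplicates_during_rad, threshold):
    """Counter-based scheme: the counter starts at 1 on first reception
    and is incremented for each duplicate heard before the RAD expires;
    the message is relayed only if the counter stays below the threshold."""
    counter = 1 + duplicates_during_rad
    return counter < threshold
```

With a threshold of 3, a node that hears two or more duplicates of a message during its RAD stays silent.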
It should be noted that this method implies an additional latency.

When using an area-based technique, the decision to forward a message depends on the additional coverage area that would result from this retransmission. This technique does not consider whether nodes actually exist within that additional area or not [4]. To evaluate the additional coverage area, the node can use the distance between itself and each node that has previously relayed the message (distance-based scheme) or the geographical coordinates (location-based scheme). Since the signal strength can be used to estimate the distance, a GPS is not indispensable in the distance-based scheme. In both distance-based and location-based schemes, a RAD is applied before the message is relayed or dropped.

In neighbor knowledge-based approaches, the nodes use "Hello" packets to build 1-hop or 2-hop neighbor lists. These lists are appended to the broadcast packets so that the receiver (r) can compare the sender's list to its own list. This comparison determines the additional nodes that will receive the message if r forwards it. For static or low-mobility networks, this is a fair method. But when the nodes' velocity is high, the information about the neighbors quickly becomes inaccurate. Alba et al. [9] have proposed an improvement of the knowledge-based methods: the authors used a cellular multi-objective genetic algorithm to find the values of the parameters that help to decide whether to forward a message.

The Multi-point relay (MPR) [10] is another variation of the neighbor knowledge-based techniques, suggested by the Hypercom team of the INRIA (Institut National de Recherche en Informatique et en Automatique, France) lab. To reduce the number of redundant broadcasts of a packet in the network, each node chooses several nodes among its neighbors that will relay its communications. The selected nodes are called MPRs. When a node sends a packet on the radio channel, all its neighbors will receive it, but only the MPRs of the source node will relay the message. This means each node keeps a list of all the nodes that have chosen it as their "repeater" (the MPR selectors' list). The MPRs are selected among the 1-hop neighbors so that they enable the node that has chosen them to reach all its 2-hop neighbors. The goal is to have the smallest set of MPRs in the network, which optimizes communications. The MPRs require a bidirectional link.

Thus, the project described in this article deals with probabilistic broadcast.
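The MPR selection described above is commonly approximated by a greedy cover of the 2-hop neighborhood; the sketch below (hypothetical names, not the OLSR specification) illustrates the idea:

```python
def select_mprs(two_hop, coverage):
    """Greedy MPR selection: repeatedly pick the 1-hop neighbor that
    covers the most still-uncovered 2-hop neighbors.
    coverage: dict mapping each 1-hop neighbor to the set of 2-hop
    neighbors reachable through it."""
    uncovered = set(two_hop)
    mprs = []
    while uncovered:
        best = max(coverage, key=lambda n: len(coverage[n] & uncovered))
        if not coverage[best] & uncovered:
            break                      # remaining 2-hop neighbors unreachable
        mprs.append(best)
        uncovered -= coverage[best]
    return mprs
```

The greedy heuristic does not guarantee the smallest possible MPR set, but it is cheap enough to run on every topology change.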
This type of strategy was selected because it does not involve additional latency, unlike counter-based, area-based or neighbor knowledge-based approaches. Original parameters are used to adapt the behavior of nodes to the level of neighborhood density: the number of retransmissions of a message by a single node, the delay between two consecutive retransmissions and the TTL. Since this increases the combinatorial complexity of the problem (see Section 3.2.2), an EA is used to solve it. Each solution built by the EA is evaluated by simulation, with the network simulator ns-2. Moreover, this makes it possible to use an improved radio propagation model and to assess the impact of mobility on the provided results (see Section 5.5). The next section describes the proposed approach.

3. The hybrid optimization tool

3.1. Overview

The proposed approach is based on three main modules: an optimization engine, a network simulator and a log analyzer. These three sub-systems cooperate to determine the best set of parameters of a probabilistic broadcasting strategy in different given environments. Figure 1 illustrates these modules and their interactions. First, the EA generates a set of possible parameters that is transmitted to the network simulator. The latter integrates the received parameters into the simulation scripts. Thereafter, the simulations are run and log files describing the network behavior are built. In order to assess the quality of each set of simulated parameters, the log files are passed on to the third sub-system (the log analyzer), which extracts the values of the objective functions. The calculated objective values are then conveyed to the optimization engine, which ranks the solutions according to these values. The optimization tool then runs some operations to generate another set of possible solutions to be simulated (for details see Section 3.2). The loop starts again, until the stop condition is met. In Figure 1, the "P0?" condition is used to check whether the current population is the initial one.

Section 3.2 presents the EA used to solve the broadcasting problem and describes the different elements of the algorithm: the genotypic representation of each considered solution, the evaluation of its fitness according to the simulation results, the selection process, the recombination operators (crossover and mutation) and the stop criterion. The evaluation process carried out by the network simulator is described in Section 3.3.

3.2. Evolutionary algorithm module

3.2.1. Background

EAs draw an analogy between the solutions of an optimization problem and individuals in nature. They first build an initial population P0, containing n randomly generated individuals, which are a set of possible solutions to the problem. An evaluation process assigns a fitness value to each of them: the more interesting a solution is, the greater its fitness. Thereafter, in the
selection step, the probability of selecting an individual is proportional to its fitness. Thereby, the best individuals are the most likely to become parents. The recombination of two parents uses a crossover operator to generate new individuals, called children or offspring. By analogy with natural selection and reproduction, children inherit qualities from their parents. By repeating these steps to create several successive populations Pi (also called generations), the algorithm evaluates and compares many solutions while continuously increasing their quality. Some children replace parents in the population. An additional operator, called mutation, also generates new solutions to prevent the EA from being trapped in local optima. Crossover and mutation are applied with respective given rates. Finally, according to a given stop criterion, the algorithm returns the best solution(s) it has found.

Figure 1: Interactions between the sub-systems

In recent years, several algorithms have been proposed [11], including the famous Non-dominated Sorting Genetic Algorithm II (NSGA-II) [12], the Strength Pareto Evolutionary Algorithm 2 (SPEA 2) [13] and the Differential Evolution with Self-adaptation and Local Search for Constrained Multiobjective Optimization algorithm (DECMOSA-SQP) [14].

NSGA-II is a genetic algorithm based on:

• the non-dominated sorting principle: solutions are classified into Pareto-optimal fronts;
• an elitist mechanism based on the density of solutions in each Pareto front (crowding distance).

SPEA 2 uses two populations: the current population and an archive containing the non-dominated solutions found in the previous generations. The fitness associated with each solution takes into account the number of solutions it dominates, the number of solutions by which it is dominated, and the density of solutions around the one being assessed.

DECMOSA-SQP is a hybridization of the differential evolution algorithm DEMOwSA [15] and the local search algorithm SQP [16]. While DEMOwSA explores the search space widely, SQP tries to improve solutions through local exploration. For the version of DECMOSA-SQP presented in Section 4, the authors used a replacement strategy based on the SPEA 2 selection to choose the individuals to eliminate. This selection takes into account the uniformity of the distribution to discriminate between individuals for which dominance alone does not permit a conclusion.

3.2.2. Parameters and performance criteria in a multi-hop wireless network

A multi-objective genetic algorithm that applies binary mechanisms to real numbers is used in this study. Each gene is defined by its type (integer, real, binary, ...), its lower bound, its upper bound and its precision (p). The real-valued parameters with an accuracy of p decimal places are converted into integers by multiplying their real values by 10^p (the remaining decimal part of the result is truncated). This method makes it easy to encode the values of the genes. Besides, it permits adjusting the precision.
For example, a parameter that varies between 0 and 1 with a precision of 6 will have 0 as its lower bound and 1 000 000 as its upper bound. A possible value for this parameter might be 0.2454121, which will be represented by the integer 245412. This EA is called the Elitist Simulated Binary Evolutionary Algorithm (ESBEA). The addition of elitism prevents the degradation of performance [17].
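The real-to-integer gene encoding can be sketched as follows (the helper names are hypothetical):

```python
def encode(value, precision):
    """Encode a real-valued gene: multiply by 10**precision and
    truncate the remaining decimal part, as described above."""
    return int(value * 10 ** precision)

def decode(gene, precision):
    """Map the integer gene back to its real value."""
    return gene / 10 ** precision
```

encode(0.2454121, 6) yields 245412, matching the example above; decode(245412, 6) gives back 0.245412, i.e. any detail beyond six decimal places is lost.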


The proposed context-aware flooding protocol is based on a probabilistic broadcasting strategy. Each node is a smart repeater. When receiving a message, the node decides whether to forward or discard it, using four parameters to adapt its behavior to the current environment. These parameters have been chosen so that they remain as few as possible, in order to enable fast genetic computations. The considered parameters are:

• P: the probability of accepting to repeat a packet when receiving it for the first time. Inherited from probabilistic flooding algorithms [1, 5, 6, 7], this parameter enables the tuning of the contention for the radio channel, especially in high-density environments.

• Nr: the total number of repeats (applied only when a node has decided to repeat a packet). It enables the protocol to cope with low to extremely low node densities. In mobile environments, when repeating a packet, one cannot be sure that there will be a neighbor to receive and possibly repeat the message or process it further. Repeating this packet again over time maximizes the probability that a passing node will receive it, at the price of channel over-use. This parameter is also needed to limit the diffusion over time.

• Dr: the delay between repeats (also applied only if the node initially chooses to repeat the packet). This parameter is meant to tune the channel use and, in conjunction with the total number of repeats, how long a packet will be broadcast.

• TTL: the Time To Live of a packet, expressed as the number of hops the packet is allowed to travel. This parameter limits the diffusion over time. The TTL could be complemented or replaced by time and/or geographic limitations. Even if these possibilities could be explored, they would obviously require more computation.

In accordance with the variation ranges of the decision variables set in Table 5, the search space contains 12·10^15 possible combinations (9·10^16 in the rural area context). Running each of these combinations requires from a dozen seconds up to a few minutes, depending on the considered neighborhood density. Simulating all possible cases with ns-2 (Network Simulator 2) [18] would take far too long. An appropriate heuristic is needed to solve such a problem within an acceptable simulation duration: that is why an evolutionary algorithm is used to assess a subset of interesting solutions.

In the proposed approach, ESBEA and ns-2 are used to optimize these four parameters in any given environment. The EA explores the set of possible values in each environment. A given combination of values is called a setting. ns-2 is used to assess the quality of each considered setting by simulation. Several objectives are used to determine whether a solution is interesting or not: the given values of the parameters must enable efficient use of the network without saturating the wireless channel. Four criteria are used to characterize this. Three of them are to be minimized:

• NC: the number of collisions;
• PT: the propagation time, i.e. the time spent until all the nodes in the considered area receive the message;
• R: the number of retransmissions during the simulation.

As simulation is a stochastic process, five hundred simulations are run to evaluate the mean values of these objectives for each setting. In each of these simulations, if the channel is saturated, the number of collisions may prevent the message from being normally transmitted to all the nodes. In this case, the simulation is stopped and recorded as a failed simulation. This makes it possible to compute a fourth criterion for each setting, called the full reception ratio in a limited area (FR), which equals the ratio between the number of successful simulations and the total number of simulations. This criterion must be maximized. Besides, a constraint is defined: if FR is less than a given threshold, the considered setting (solution) is regarded as unfeasible. Otherwise, the setting is feasible and the successful simulations are used to compute the mean values of the three other criteria. Optimizing the broadcasting technique is thus a multi-objective problem. Evolutionary algorithms are known to solve such problems effectively in various fields of application [11, 19].
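The per-setting aggregation described above can be sketched as follows (the record field names are hypothetical, chosen here only for illustration):

```python
def evaluate_setting(sim_results, fr_threshold=0.75):
    """Aggregate the simulations of one setting.  sim_results is a list
    of per-simulation records: {'ok': bool, 'nc': ..., 'pt': ..., 'r': ...}
    (hypothetical field names).  Returns (FR, means); means is None
    when the setting is unfeasible (FR below the threshold)."""
    ok = [s for s in sim_results if s["ok"]]
    fr = len(ok) / len(sim_results)            # full reception ratio
    if fr < fr_threshold:
        return fr, None                        # unfeasible setting
    means = {k: sum(s[k] for s in ok) / len(ok) for k in ("nc", "pt", "r")}
    return fr, means
```

Only the successful simulations contribute to the mean values of NC, PT and R, exactly as stated above.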
The initial population is generated by choosing the value of each decision variable randomly within its given variation range. Each individual provides ns-2 with four input values. The simulator and the log analyzer determine the values of the objectives NC, PT, R and FR. These values are used by ESBEA to compute the fitness of individuals using the principle of non-dominated sorting, based on Pareto dominance [11]. Pareto solutions are those for which an improvement in one objective implies the worsening of at least one other objective. ESBEA uses the evaluation results returned by the log analyzer module to build three Pareto fronts. First, all the individuals for which FR is less than a given threshold are said to be unfeasible. The remaining feasible individuals are compared using Pareto dominance, and all the non-dominated individuals are added to the first Pareto front R1. This process is repeated twice with the remaining subset of individuals to build the second and third Pareto fronts (R2 and R3). Finally, the remaining subset and the unfeasible individuals are gathered in the set of dominated solutions R4. Thereafter, individuals are selected for recombination. The selection operator is based on a roulette wheel in which the share of each front depends on its cardinality. The probability of selecting each list is computed as shown in Equation 1:

P(R_i) = \frac{\delta(R_i) \cdot Card(R_i)}{\sum_{i=1}^{4} [\delta(R_i) \cdot Card(R_i)]}, \quad \forall i \ \text{s.t.} \ 1 \leq i \leq 4    (1)

where Card(Ri) is the cardinality of Ri (the size of the list Ri), and δ(Ri) is a fixed weight indicating the priority level associated with each list. Its values were determined empirically to ensure both the selection of the best solutions (e.g. solutions in the first Pareto front have priority over the others) and diversity in the new population (allowing the selection of some dominated solutions). Figure 2 gives an example of such a roulette wheel, built from the sets presented in Table 1. Using both δ(Ri) and Card(Ri) prevents the dominated solutions from having very low fitness values and preserves the diversity of the successive populations.
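Equation 1 can be checked against the values of Table 1 with a few lines (a sketch; the function name is hypothetical):

```python
def front_probabilities(delta, card):
    """Roulette-wheel probabilities of Equation 1: each front Ri is
    weighted by delta(Ri) * Card(Ri), normalised over the four fronts."""
    weights = [d * c for d, c in zip(delta, card)]
    total = sum(weights)
    return [w / total for w in weights]
```

With δ = (4, 3, 2, 1) and Card = (19, 8, 5, 17), this returns approximately (0.5984252, 0.1889764, 0.0787402, 0.1338583), i.e. the P(Ri) values of Table 1.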

Figure 2: Example of roulette wheel

The sensitivity of ESBEA with regard to δ(Ri) has been evaluated using a multi-objective optimization problem (MOP) proposed for the CEC 2009 competition [20] (see Table 2). The problem MOP1 and the performance metric Inverted Generational Distance (IGD) used for these evaluations are defined in Section 4. The best configuration corresponds to the lowest value of IGD.

i   δ(Ri)   Card(Ri)   P(Ri)
1   4       19         0.5984252
2   3       8          0.1889764
3   2       5          0.0787402
4   1       17         0.1338583

Table 1: Example of sets and associated selection probabilities

δ(R1)   δ(R2)   δ(R3)   δ(R4)   IGD
1       1       1       1       0.003945
10      1       1       1       0.003866
1       1       1       10      0.004220
4       3       2       1       0.003475

Table 2: Sensitivity of ESBEA with regard to δ(Ri)

The roulette wheel permits selecting parents for the recombination. Each parent is chosen by two random selections: a Pareto front is first selected, then an individual is randomly chosen among the individuals belonging to this front, with equal probabilities for all of them.

Each pair of selected parents is then recombined using a simulated binary 10-point crossover. The four variables of each parent are first converted into binary strings and concatenated. The resulting binary string is then separated into several parts by choosing ten crossover points randomly. Two children are built by alternately copying the odd parts and the even parts of the selected parents. Then, the resulting binary string of each child is converted back into four new integer parameter values. This crossover operator has been used because k-point crossover is a classical method of recombination. A 1-point crossover was tested first, but it did not provide enough diversity in the successive populations. After several tests, a 10-point crossover was chosen because it seemed more efficient. However, additional experiments should be done to compare various crossover methods and determine which of them is the most suitable. An example of the crossover operator is given in Figure 3. This example focuses on a 3-point crossover, but the principle remains the same if one extends the number of crossover points. Finally, during the mutation step, a gene is randomly chosen and the EA generates a new value for this variable within its variation range. This is the uniform mutation.

Figure 3: Example of a 3-point crossover
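The crossover of Figure 3 can be sketched for an arbitrary number of cut points (a hypothetical helper; in ESBEA the operator works on the concatenated, binary-encoded genes):

```python
import random

def k_point_crossover(a, b, k, rng=random):
    """k-point crossover on two equal-length binary strings: k cut
    points are drawn, and the two children are built by copying
    alternating segments from each parent."""
    points = sorted(rng.sample(range(1, len(a)), k))
    cuts = [0] + points + [len(a)]
    c1 = c2 = ""
    for i in range(len(cuts) - 1):
        seg_a = a[cuts[i]:cuts[i + 1]]
        seg_b = b[cuts[i]:cuts[i + 1]]
        if i % 2 == 0:                 # even segments: parent order kept
            c1, c2 = c1 + seg_a, c2 + seg_b
        else:                          # odd segments: parents swapped
            c1, c2 = c1 + seg_b, c2 + seg_a
    return c1, c2
```

Each child keeps the first segment of one parent, so with k = 10 both children mix material from both parents at up to ten positions.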

These operators permit the generation of a list of offspring, whose fitness is again computed using ns-2 simulations. Each offspring replaces the first parent it dominates in the population list. If the offspring does not dominate any parent, it is not added to the list of individuals of the next generation. All these steps (evaluation, selection, crossover, mutation, replacement) are repeated until a given stop criterion is met. For the moment, since simulation requires quite a large computing time, this criterion is a given number of generations, in order to keep solving durations reasonable. The EA finally returns the set R1.

4. Validation of the Elitist Simulated Binary Evolutionary Algorithm (ESBEA)

ESBEA was first validated by comparison with algorithms found in the literature before being applied to a mobile ad hoc problem. The compared algorithms were two well-known algorithms (NSGA-II and SPEA2) and a more recent one (DECMOSA-SQP). The four algorithms are compared using two constrained multi-objective problems (MOP) proposed in the CEC 2009 competition. In both MOP1 and MOP2, f1 (Equations 2 and 4) and f2 (Equations 3 and 5) are to be minimized.

MOP1:

f_1 = x_1 + \frac{2}{|J_1|} \sum_{j \in J_1} \Big( x_j - x_1^{0.5\,(1.0 + 3(j-2)/(n-2))} \Big)^2    (2)

f_2 = 1 - x_1 + \frac{2}{|J_2|} \sum_{j \in J_2} \Big( x_j - x_1^{0.5\,(1.0 + 3(j-2)/(n-2))} \Big)^2    (3)

where J_1 = {j | j is odd and 2 ≤ j ≤ n} and J_2 = {j | j is even and 2 ≤ j ≤ n}.
The constraint is C: f_1 + f_2 - |\sin[N\pi(f_1 - f_2 + 1)]| - 1 ≥ 0.
The search space is [0, 1]^n, with N = n = 10.

MOP2:

f_1 = x_1 + \frac{2}{|J_1|} \sum_{j \in J_1} \Big( x_j - \sin\big(6\pi x_1 + \tfrac{j\pi}{n}\big) \Big)^2    (4)

f_2 = 1 - \sqrt{x_1} + \frac{2}{|J_2|} \sum_{j \in J_2} \Big( x_j - \cos\big(6\pi x_1 + \tfrac{j\pi}{n}\big) \Big)^2    (5)

where J_1 and J_2 are defined as above.
The constraint is C: t / (1 + e^{4|t|}) ≥ 0, where t = f_2 + \sqrt{f_1} - \sin[N\pi(\sqrt{f_1} - f_2 + 1)] - 1.
The search space is [0, 1] × [-1, 1]^{n-1}, with N = 2 and n = 10.

The Inverted Generational Distance (IGD) is used as the performance metric to evaluate the results produced by ESBEA, NSGA-II, SPEA 2 and DECMOSA-SQP. The IGD is calculated as shown in Equation 6:

IGD(PF, PF_0) = \frac{\sum_{v \in PF_0} d(v, PF)}{|PF_0|}    (6)

where PF is the Pareto front built by the EA (ESBEA, NSGA-II, SPEA 2 or DECMOSA-SQP) and PF_0 is a set of 100 uniformly distributed points selected among the solution points given for the CEC 2009 competition. d(v, PF) is the minimum Euclidean distance between a given point v of the reference Pareto front PF_0 and the points of PF (the computed front). |PF_0| represents the size of PF_0.
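Equation 6 translates directly into code (a sketch with hypothetical names; the distance is Euclidean in objective space):

```python
import math

def igd(reference_front, computed_front):
    """Inverted Generational Distance: mean, over the reference points,
    of the distance to the closest point of the computed front."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return sum(min(dist(v, p) for p in computed_front)
               for v in reference_front) / len(reference_front)
```

A front identical to the reference gives an IGD of 0; the further the computed points drift from the reference points, the larger the value.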

The goal is to minimize the distances between the points on the PF provided by each compared EA and those on PF_0. The comparison was performed using the rules of the CEC 2009 competition. The results presented in Table 3 show that ESBEA, SPEA 2 and DECMOSA-SQP have lower values of IGD than NSGA-II for both MOP1 and MOP2, and that ESBEA outperforms the three other algorithms for MOP1. This means ESBEA's results are closer to the reference front (that is, to the optimal solutions). The gains (Equation 7) of ESBEA in comparison with the other EAs are shown in Table 4. For MOP2, the results of ESBEA are just fair, but considering both problems of the comparison, this algorithm has a positive overall gain.

       ESBEA      NSGA-II    SPEA 2     DECMOSA-SQP
MOP1   0.003475   0.555952   0.054088   0.10773
MOP2   0.188578   0.492870   0.154983   0.0946

Table 3: Calculated IGD for MOP1 and MOP2

       NSGA-II   SPEA 2    DECMOSA-SQP
MOP1   99.37%    93.57%    96.77%
MOP2   61.74%    -17.81%   -49.84%

Table 4: Gain of ESBEA in comparison to the three other EAs

Gain =

IGDEA − IGDESBEA M AX(IGDEA , IGDESBEA )

(7)

EA ∈ {N SGA − II, SP EA 2, DECM OSA − SQP } 5. Experimentation The design of protocols for wireless communications is a complex task. Therefore, the use of simulations allows to study the behavior of a protocol close to the real situations. Moreover it permits to evaluate the behavior of this protocol in a large network (with hundreds or thousands of nodes). There are lots of network simulators (ns-2, ns-3, GloMoSim, OMNeT++, etc.) but ns-2 is the most reliable in our case, the most active open-source network simulator and the most frequently used for wireless simulations. It 15

uses discrete-event simulation, in which simulated time advances from one event (and its associated state change) to the next, making it both fast and realistic. It is also packet-based, meaning that the transport of data packets over the network topology is simulated. For our experiments, we evaluate packet propagation in several wireless scenarios. However, ns-2 has some limitations regarding the radio propagation model for wireless communications; that is why we use an improved model called the Shadowing Pattern [21] to obtain a more realistic error model.

5.1. Description of the experiments

ESBEA is used to optimize the broadcasting methods in a mobile ad hoc network. This section focuses on a particular type of MANET known as a Vehicular Ad hoc Network (VANET). Inter-vehicular communications assist drivers by giving them road traffic (e.g. traffic jams) and security (e.g. accidents) information. Due to the high mobility of vehicles, unicast messages are not appropriate for such a network [3]. In the following experiments, our goal is to send an emergency message effectively through a dedicated channel (for example a security channel) of a VANET.

Table 5: Variation ranges of decision variables

Parameter          Lower bound   Upper bound
P                  0             1
Nr                 1             5 (30 for the rural area experiment)
Dr (in seconds)    0             2
TTL                10            40
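To make the role of these four decision variables concrete, the following sketch shows how a node could apply them upon receiving a message. The `Message` and `Node` classes, and the representation of scheduled repeats as (delay, message) pairs, are our own simplification for illustration, not ns-2 code:

```python
import random
from dataclasses import dataclass, field

@dataclass
class Message:
    msg_id: int
    ttl: int  # remaining hop count

@dataclass
class Node:
    seen: set = field(default_factory=set)
    outbox: list = field(default_factory=list)  # (delay_s, Message) pairs

    def on_receive(self, msg, P, Nr, Dr, rng=random):
        """Probabilistic broadcast: on the first reception of a message
        whose TTL is not exhausted, retransmit with probability P,
        repeating the transmission Nr times, Dr seconds apart."""
        if msg.msg_id in self.seen or msg.ttl <= 0:
            return  # duplicate or hop limit reached
        self.seen.add(msg.msg_id)
        if rng.random() < P:  # forwarding decision, probability P
            fwd = Message(msg.msg_id, msg.ttl - 1)
            for k in range(Nr):  # Nr repeats, spaced Dr seconds apart
                self.outbox.append((k * Dr, fwd))

# With P = 1 the node always forwards; Nr = 2, Dr = 0.72 schedules two
# copies, at t = 0 and t = 0.72 s after the forwarding decision.
node = Node()
node.on_receive(Message(1, ttl=40), P=1.0, Nr=2, Dr=0.72)
print(node.outbox)
```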

Table 6: Configuration parameters of ESBEA

Parameter               Value
Population size         52
Number of generations   100
Crossover rate          0.8
Mutation rate           0.001
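Under these settings, evaluating one individual amounts to averaging each objective over many independent simulation runs (500 in our experiments). The sketch below illustrates the idea; `run_simulation` is a hypothetical stand-in for a single ns-2 invocation and is assumed to return one value per objective:

```python
import statistics

def evaluate(individual, run_simulation, n_runs=500):
    """Average each objective of `individual` over n_runs independent
    simulation runs to obtain statistically reliable values."""
    results = [run_simulation(individual) for _ in range(n_runs)]
    return {name: statistics.mean(r[name] for r in results)
            for name in results[0]}

# Illustration with a fake, deterministic "simulator":
fake = lambda ind: {"FR": 0.96, "delay": 0.45}
print(evaluate({"P": 0.9, "Nr": 2}, fake, n_runs=10))
```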

In order to evaluate the behavior of message dissemination protocols correctly, complex simulations of the various network layers (from radio propagation and channel access control up to the application level) are necessary. The simulations are done with the all-in-one distribution of ns-2.30. Because these computations are demanding, the simulations were carried out on a

multiprocessor system: sixteen 2.66 GHz Intel CPU cores with 12 GB of Random Access Memory. For each individual, 500 independent ns-2 runs are performed to obtain statistically reliable results; this number of runs was determined empirically. Each run lasts 1000 simulated seconds. The variation ranges of the decision variables are given in Table 5. In preliminary tests, the upper bound of Nr was set to 30, but for moderately dense networks the EA chose solutions with low values (between 1 and 5), whereas in the case of very low neighborhood density it selected high values of Nr. Thus, in order to limit the search space (and therefore the simulation time), we used two intervals for this parameter depending on the density. The presented tests used the values of (Ri) given in Table 1. The configuration parameters of ESBEA are presented in Table 6, and the feasibility threshold is 0.75 (the constraint is FR ≥ 75%). The population size and the number of generations are kept low to limit computation time; the other parameters were defined empirically. When discussing the results, the TTL of each individual is not taken into account because the simulated networks are not large enough for it to matter. Four levels of density are studied in the following subsections: low density (highway), medium density (urban context), high density (dense urban context) and very low density (rural context). An additional experiment is carried out in an urban context with mobile nodes.

5.2. Tests in a highway context

To illustrate vehicle communications on a highway, the chain topology presented in Figure 4 was set up. 50 nodes were simulated, with a distance of 200 meters between two consecutive nodes. This represents cars lined up along 10 km with a low density. Each vehicle is able to communicate regularly with a dozen peers, but packets are occasionally received by vehicles up to a few thousand meters away (as observed in real experiments).
The results are presented in Figures 5 to 8. The EA provides a few dozen solutions belonging to R1 among the 12×10^15 possible combinations. All these solutions are feasible (FR > 75%). The aim of this study is mainly to check that this approach can identify suitable settings in each given context. These optimal results show the most efficient values that a distributed strategy could reach. Figure 5 presents the full reception ratio. The lower right curve shows that retransmitting a packet only once, whatever the retransmission probability

Figure 4: The simple chain topology (200 m between consecutive nodes)

Figure 5: Full reception ratio in a low-density scenario

(P), leads to incomplete coverage. Out of 500 simulations in a given setting, the Nr = 1 curve (Nr is the total number of repeats for each node) never exceeds 96% FR, and this ratio quickly decreases as P decreases. Since the current dissemination methods found in the literature [1, 7, 8] do not take the number of retransmissions into account, they can be compared to the solutions we obtained with Nr = 1. This allows us to say that even in the best case (P = 1), those methods will not deliver the sent packets to all nodes in the study area. In this scenario and for this criterion, individuals with Nr = 2, P > 0.9 and those with Nr ∈ {3, 4, 5}, P > 0.75 prevail and are kept as candidates for the next criterion. The shaded portion of the figure contains the points that are discarded. To keep the figures readable, when going from one figure to another in the same network

Figure 6: Propagation time in a low-density scenario

context, the points that were not selected in the previous step are removed from the curves of the subsequent figures. Figure 6 shows the propagation time (the time needed for a message to be successfully received by every car). Only the individuals that were not discarded in the previous step are shown. Individuals could be further eliminated based on the propagation time, but all of them are kept in order to leave some choice for the next steps. This is acceptable because even the least efficient of these individuals is already below 0.6 second, and 0.6 second to advertise safety information over a 10-kilometer highway is indeed quite good already. This behavior suggests that the second broadcast front (two retransmissions per node) is enough to make the message reach all nodes in the considered area. So, at this stage, individuals for which Nr = 2, P > 0.9 and Nr ∈ {3, 4, 5}, P > 0.75 are still selected. A peak appears for Nr = 4 at a probability of about 0.82 because the corresponding Dr for this solution is 1.62 seconds: the retransmissions are delayed and the message does not reach all the nodes quickly. The next two figures give information about channel saturation. Figures 7 and 8 show that in a one-vehicle-every-200-meter scenario, if

each message is repeated more than twice, many collisions occur. Moreover, continuing to retransmit the message unnecessarily occupies the channel. Note that in our study, only one node initiates the safety message diffusion. If one assumes that more than one message can be sent simultaneously (which would probably be the case in real networks), solutions with Nr > 2 would quickly become inefficient. So, judging from Figure 7, some individuals with higher Nr are excluded from the selected set as they cause too many collisions. Note also that the lowest point observed for Nr = 4 corresponds to the solution that leads to a peak in Figure 6. This is coherent, since a high value of Dr means that the retransmissions are spaced out, which logically reduces collisions. Judging from Figure 8, further individuals are excluded, and only one with Nr = 2 is kept: for otherwise comparable criteria, it performs substantially better in terms of channel efficiency.

Figure 7: Collisions in a low-density scenario

In this context, the best solution is the one with Nr = 2 and P > 0.9. The observed delay between repeats (Dr) for these individuals is 0.72 second. It is clear that simple flooding (P = 1 and Nr = 1) would not be efficient in this context, since it allows only one retransmission per packet, so some nodes would not receive the message.

Figure 8: Number of retransmissions in a low-density scenario
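The criterion-by-criterion winnowing used in this section (keep only the candidates close to the best value of one criterion, then move on to the next criterion) can be summarized as follows. The candidate tuples and tolerances below are hypothetical, for illustration only:

```python
def filter_best(solutions, key, tolerance):
    """Keep solutions within `tolerance` of the best (lowest)
    value of one criterion."""
    best = min(key(s) for s in solutions)
    return [s for s in solutions if key(s) <= best + tolerance]

# Hypothetical candidates: (Nr, P, FR, delay_s, collisions, tx_count)
cands = [
    (1, 1.00, 0.96, 0.30, 10, 50),
    (2, 0.95, 1.00, 0.45, 40, 95),
    (4, 0.82, 1.00, 1.90, 25, 180),
]
# Step 1: full reception ratio is a hard constraint (maximize FR)
step1 = [s for s in cands if s[2] >= 0.99]
# Step 2: propagation time; Step 3: collisions; Step 4: transmissions
step2 = filter_best(step1, key=lambda s: s[3], tolerance=2.0)
step3 = filter_best(step2, key=lambda s: s[4], tolerance=20)
step4 = filter_best(step3, key=lambda s: s[5], tolerance=0)
print(step4)  # the Nr = 2 candidate survives
```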

5.3. Tests in an urban context

Here the focus is on an increased density of vehicles. The topology used is similar to the previous chain, but with 134 nodes over the same 10-km length (one vehicle every 75 meters). Each message may be received by tens of nodes. The results are presented in Figures 9 to 12. For a probability close to 0.8, it appears that almost regardless of the number of times the message is transmitted, it reaches all nodes (Figure 9). Naturally, the more a message is repeated, the quicker it is likely to spread on the channel (as long as the channel is not saturated by this message or by other communications). Using the propagation time criterion (Figure 10), a few more individuals can be excluded from the set of good solutions. All remaining selected individuals behave in the same way and are excellent with respect to this criterion. Figures 11 and 12 make it possible to reduce the selected set further and to discern an optimal setting. Individuals still selected in Figure 11 are comparable in terms of collisions. Finally, judging from Figure 12, the best individuals correspond to Nr = 1 (as they have the lowest channel occupation) and P = 0.75, which is the minimum below which information would not always


reach all the nodes. The Dr value is meaningless here, because when Nr = 1 a node never retransmits a second time. Simple flooding would unnecessarily consume bandwidth if it were used in this context.

Figure 9: Full reception ratio in a medium density scenario

5.4. Tests in a context with mobile nodes

For this scenario, we used the double-chain topology depicted in Figure 13. Vehicles in the first line move in the opposite direction to those in the second one, with an average speed of 70 km/h. Each chain has 67 vehicles, so this topology remains similar to the previous one. The results are not shown, as they are very close to those presented in Figures 9 to 12. Thus, in such a topology, neighborhood density prevails over the mobility model from the point of view of the packet diffusion protocol.

5.5. Tests in a dense urban context

In this scenario, we increased the network density even further, to 400 vehicles with 25 meters between two consecutive vehicles. Many cars are able to receive each packet, and solutions that restrict

Figure 10: Propagation time in a medium density scenario

Figure 11: Collisions in a medium density scenario


Figure 12: Number of retransmissions in a medium density scenario

Figure 13: The double-chain topology (150 m between consecutive nodes in each chain)

the number of collisions logically prevail. Since each vehicle has many neighbors in its coverage area, very high Full Reception Ratios (FR) are possible even with a low retransmission probability (P). As depicted in Figure 14, only a very low retransmission probability (below 0.2) combined with a low number of retransmissions fails to reach an FR close to 1. For the same reason, the time needed for the information to reach all the vehicles is usually very short, all nodes receiving the information on the first transmission wave. As shown in Figure 15, only some solutions with a very low retransmission probability have to wait for the second or third retransmission. As expected, the collision rate is quite high, and solutions with lower

Figure 14: Full reception ratio in the high density scenario

Figure 15: Propagation time in the high density scenario


values of Nr perform better. A few interesting phenomena can still be seen in Figure 16. While, as expected, all individuals with Nr = 1 are in the lowest part of the figure, those with Nr = 4 perform better than most of those with Nr = 2 or Nr = 3. This is because the individuals with Nr = 4 also have a higher retransmission delay, which limits the number of collisions between two successive broadcast waves. The same phenomenon leads the two individuals with Nr = 2 and a probability of 0.45 to have different collision rates. Lastly, among the solutions selected in the previous step as not producing too many collisions, the ones needing the fewest transmissions are those with Nr = 1, as shown in Figure 17.

Figure 16: Collisions in the high density scenario

5.6. Tests in a rural area

In rural areas, traffic is generally light and flows freely, and a vehicle might have no neighbor at all in its coverage area. To simulate such an environment, we took the topology illustrated in Figure 4, but in order to reproduce a sparse vehicle distribution, the radio propagation model mimics a very intermittent presence of neighbors: a given vehicle can communicate only periodically.

Figure 17: Number of retransmissions in the high density scenario

In total, the communicating periods amount to about 20% of the simulation time; for the rest of the simulation, the vehicle is considered to have no neighbor. Figures 18 and 19 show that ESBEA selected only individuals with high Nr values. This is understandable, as only high redundancy enables sufficiently reliable hop-to-hop communications in such a sparse network. Figure 18 shows that for an FR value close to 1, Nr > 20 and P > 0.75 are necessary. This appears in Figure 19 as a high overall number of retransmissions. Fortunately, this network overhead is distributed over a long period and a large area (tens of seconds and a 10-km line in this scenario), so the collision ratio remains very low (around an average of 20 collisions for the selected individuals). In such an environment, since information only occasionally has the opportunity to jump from one vehicle to the next, the delay until all vehicles have been reached is quite high, on the order of tens of seconds, because Dr lies in [0.3; 2] seconds. Since the message has to be retransmitted many times, simple flooding is not applicable in this context.
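A back-of-the-envelope model (our own simplification, not taken from the simulations) shows why high Nr values are needed here: if a node forwards with probability P and each of its Nr repeats independently falls within a connected period (about 20% of the time), the per-hop delivery probability is roughly P · (1 − (1 − 0.2)^Nr):

```python
def p_hop_success(P, Nr, availability=0.2):
    """Rough per-hop delivery probability: the node forwards with
    probability P; each of its Nr repeats independently lands in a
    connected period with probability `availability`."""
    return P * (1 - (1 - availability) ** Nr)

# With P = 0.75, a single repeat succeeds only ~15% of the time,
# whereas Nr = 20 repeats push the per-hop probability above 0.74.
for nr in (1, 5, 20, 30):
    print(nr, round(p_hop_success(0.75, nr), 3))
```

This crude model ignores collisions and correlated connectivity, but it captures the trend seen in Figures 18 and 19: redundancy, not probability alone, drives reachability in very sparse networks.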


Figure 18: Full reception ratio in a very low density scenario

Figure 19: Number of retransmissions in a very low density scenario


6. Conclusion

The optimization of broadcasting methods in mobile ad hoc networks is a real challenge, and several studies have been conducted on this topic. In this article, we showed that the decision variables commonly used to regulate the broadcast of a message are not sufficient. New parameters have been proposed, namely the number of retransmissions of the same message and the time between successive transmissions by a single node. It has been demonstrated that in a very low node density environment, these two parameters are essential to ensure wide dissemination of the message and complete reachability within a defined area. On the whole, the proposed method provides four input parameters that are adjustable according to the network density. The size of the search space and the conflicting nature of the objectives motivated the use of an evolutionary algorithm to solve the problem; indeed, in such cases, analytical methods are difficult to apply. Based on the results presented in the previous section, we highlighted the need for an adaptive strategy when designing broadcast methods. The results allowed us to measure the differences between an appropriate strategy and an inefficient one in a particular context. Even if some results were expected, they validate the proposed approach and confirm its practicality. Now that a tool can determine the best communication strategy in each particular context, the other steps of the project can be addressed. Firstly, the nodes should be able to determine the density of their neighborhood, while sending as few control messages as possible. Then, depending on the density variations, the nodes must be able to switch from one communication strategy to another, each time choosing the most appropriate one. It will then be possible to simulate more complex and realistic network topologies (such as grid topologies).
The second main prospect concerns the proposed EA (ESBEA). Its improvement will mainly focus on the development of an adaptive distributed version, to facilitate the configuration phase and to reduce computation time. This step will build on the adaptation of concepts we have already used in other application fields [22].

References

[1] N. Karthikeyan, V. Palanisamy, K. Duraiswamy, Optimum density based model for probabilistic flooding protocol in mobile ad hoc network, European Journal of Scientific Research 39 (4) (2010) 577–588.

[2] Q. Zhang, D. P. Agrawal, Dynamic probabilistic broadcasting in MANETs, J. Parallel Distrib. Comput. 65 (2) (2005) 220–233. doi:http://dx.doi.org/10.1016/j.jpdc.2004.09.006.

[3] S. Eichler, C. Schroth, T. Kosch, M. Strassberger, Strategies for context-adaptive message dissemination in vehicular ad hoc networks, in: Mobiquitous, 3rd Annual International Conference on Mobile and Ubiquitous Systems - Workshops, 2006.

[4] B. Williams, T. Camp, Comparison of broadcasting techniques for mobile ad hoc networks, in: Proceedings of the ACM International Symposium on Mobile Ad Hoc Networking and Computing (MOBIHOC), 2002, pp. 194–205.

[5] L. Li, J. Halpern, Z. Haas, Gossip-based ad hoc routing, in: Proceedings of the IEEE INFOCOM, IEEE Computer Society, 2002.

[6] A. M. Hanashi, A. Siddique, I. Awan, M. Woodward, Dynamic probabilistic flooding performance evaluation of on-demand routing protocols in MANETs, in: CISIS '08: Proceedings of the 2008 International Conference on Complex, Intelligent and Software Intensive Systems, IEEE Computer Society, Washington, DC, USA, 2008, pp. 200–204. doi:http://dx.doi.org/10.1109/CISIS.2008.66.

[7] A. M. Hanashi, I. Awan, M. Woodward, Performance evaluation based on simulation of improving dynamic probabilistic flooding in MANETs, in: WAINA '09: Proceedings of the 2009 International Conference on Advanced Information Networking and Applications Workshops, IEEE Computer Society, Washington, DC, USA, 2009, pp. 458–463. doi:http://dx.doi.org/10.1109/WAINA.2009.78.

[8] M. B. Yassein, A. Al-Dubai, M. O. Khaoua, O. M. Al-jarrah, New adaptive counter based broadcast using neighborhood information in MANETs, in: Parallel and Distributed Processing Symposium, International, 2009, pp. 1–7. doi:http://doi.ieeecomputersociety.org/10.1109/IPDPS.2009.5161179.

[9] E. Alba, B. Dorronsoro, F. Luna, A. J. Nebro, P. Bouvry, L. Hogie, A cellular multi-objective genetic algorithm for optimal broadcasting


strategy in metropolitan MANETs, Comput. Commun. 30 (4) (2007) 685–697. doi:http://dx.doi.org/10.1016/j.comcom.2006.08.033.

[10] D. Nguyen, P. Minet, Analysis of MPR selection in the OLSR protocol, in: AINAW '07: Proceedings of the 21st International Conference on Advanced Information Networking and Applications Workshops, IEEE Computer Society, Washington, DC, USA, 2007, pp. 887–892. doi:http://dx.doi.org/10.1109/AINAW.2007.94.

[11] C. A. Coello Coello, G. B. Lamont, D. A. Van Veldhuizen, Evolutionary Algorithms for Solving Multi-Objective Problems, Springer, 2007.

[12] K. Deb, S. Agrawal, A. Pratap, T. Meyarivan, A fast and elitist multiobjective genetic algorithm: NSGA-II, IEEE Trans. Evolutionary Computation 6 (2) (2002) 182–197.

[13] E. Zitzler, M. Laumanns, L. Thiele, SPEA2: Improving the strength Pareto evolutionary algorithm, Tech. rep., Computer Engineering and Networks Laboratory (TIK), Swiss Federal Institute of Technology (ETH) Zurich (2001).

[14] A. Zamuda, J. Brest, B. Bošković, V. Žumer, Differential evolution with self-adaptation and local search for constrained multiobjective optimization, in: CEC'09: Proceedings of the Eleventh Conference on Congress on Evolutionary Computation, IEEE Press, Piscataway, NJ, USA, 2009, pp. 195–202.

[15] A. Zamuda, J. Brest, B. Bošković, V. Žumer, Differential evolution for multiobjective optimization with self adaptation, in: CEC'07: Proceedings of the Congress on Evolutionary Computation, 2007, pp. 3617–3624.

[16] P. T. Boggs, J. W. Tolle, Sequential quadratic programming for large-scale nonlinear optimization, J. Comput. Appl. Math. 124 (1-2) (2000) 123–137. doi:http://dx.doi.org/10.1016/S0377-0427(00)00429-5.

[17] V. Guliashki, H. Toshev, C. Korsemov, Survey of evolutionary algorithms used in the multi-objective optimization, in: Problems of Engineering Cybernetics and Robotics, Vol. 60, Sofia, 2008.


[18] The Network Simulator Project - ns-2. URL http://www.isi.edu/nsnam/ns

[19] A. Liefooghe, M. Basseur, L. Jourdan, E.-G. Talbi, ParadisEO-MOEO: A framework for evolutionary multi-objective optimization, in: Evolutionary Multi-Criterion Optimization (EMO 2007), Vol. 4403 of Lecture Notes in Computer Science (LNCS), Matsushima, Japan, 2007, pp. 386–400. URL http://hal.inria.fr/inria-00269972/PDF/075.pdf

[20] Q. Zhang, A. Zhou, S. Zhao, P. N. Suganthan, W. Liu, S. Tiwari, Multiobjective optimization test instances for the CEC 2009 special session and competition, Tech. rep., The School of Computer Science and Electronic Engineering (2009).

[21] D. Dhoutaut, A. Regis, F. Spies, Impact of radio propagation models in vehicular ad hoc networks simulations, in: VANET'06: Procs of the 3rd int. workshop on Vehicular ad hoc networks, ACM Press, Los Angeles, USA, 2006, pp. 40–49. doi:http://doi.acm.org/10.1145/1161064.1161072.

[22] J.-L. Hippolyte, C. Bloch, P. Chatonnay, C. Espanet, D. Chamagne, A self-adaptive multiagent evolutionary algorithm for electrical machine design, in: D. T. et al. (Ed.), GECCO’07: Proceedings of the 9th annual conference on Genetic and evolutionary computation, Vol. II, ACM Press, New York, NY, USA, 2007, pp. 1250–1255.
