Hindawi Publishing Corporation International Journal of Distributed Sensor Networks Volume 2016, Article ID 8612128, 17 pages http://dx.doi.org/10.1155/2016/8612128

Research Article

An Efficient Large-Scale Sensor Deployment Using a Parallel Genetic Algorithm Based on CUDA

Jae-Hyun Seo,1 Yourim Yoon,2 and Yong-Hyuk Kim1

1 Department of Computer Science and Engineering, Kwangwoon University, 20 Kwangwoon-ro, Nowon-gu, Seoul 139-701, Republic of Korea
2 Department of Computer Engineering, Gachon University, 1342 Sengnamdaero, Sujeong-gu, Seongnam-si, Gyeonggi-do 461-701, Republic of Korea

Correspondence should be addressed to Yong-Hyuk Kim; [email protected]

Received 5 September 2015; Revised 11 December 2015; Accepted 10 February 2016

Academic Editor: Daniel G. Reina

Copyright © 2016 Jae-Hyun Seo et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We have employed evolutionary computation to solve the optimization problem of sensor deployment in battlefield environments. A genetic algorithm has the advantage of delivering results of a higher quality than simple computational algorithms, but it has the drawback of requiring too much computing time. This study aimed not only to shorten the computing time to as close to real-time as possible by using the Compute Unified Device Architecture (CUDA) but also to maintain a solution quality that is as good as or better than that obtained without the proposed algorithm. In the proposed genetic algorithm, parallelization was applied to speed up the fitness evaluation, which requires heavy computation time. The proposed CUDA-based design approach for complex and various sensor deployments is validated by means of simulation. We parallelized two parts of the Monte Carlo simulation used for the fitness evaluation: moving a large number of test vehicles and calculating the probability of detection (POD) for each vehicle. The experiment was divided into CPU and GPU experiments depending on the type of arithmetic unit. In the GPU experiment, the detection probability was at a similar level to that of the CPU, and the computing time decreased by a factor of approximately 55 to 56.

1. Introduction

For several decades, a number of studies have been conducted on wireless sensor networks (WSNs), which have recently become an essential component of the Internet of Things (IoT) [1]. WSNs consist of a large number of miniaturized low-power sensors. Data is collected wirelessly through these sensors and stored in sinks, which then transfer the collected data to other networks. Currently, WSNs are used in many areas such as real-time monitoring and automation in industrial fields, traffic surveillance and control, continuous health care, military target tracking, and environmental monitoring. Unlike commonly used wired and wireless networks, WSNs have many limitations to consider, such as battery life, computation capability, and communications. WSNs are highly application oriented; therefore, they require a design customized to the application environment, and they require cross-layer optimization in the communication protocol stack. For this reason, WSNs require a wide range of research in multiple fields, including MAC, data routing, and transport protocols.

WSN deployment is also considered one of the major WSN design factors, and it has a significant effect on performance measures such as connectivity between sensors, efficient network coverage, and the network life cycle. Therefore, WSN deployment requires considerable research. In recent years, several well-organized survey studies on issues relating to WSN deployment have been published [1, 2]. In general, WSNs are deployed by either planned or random deployment. In random deployment, sensors are scattered randomly over the region of interest (RoI), for example, from airplanes over a disaster area or a war zone. In planned deployment, on the other hand, the locations where sensors are deployed are determined beforehand, aiming for maximum coverage, minimum power consumption, and strengthening of network connectivity. This deployment method is mainly used in border surveillance, facility intrusion detection, or procedural health care. It is also used in inaccessible RoIs into which sensors can be moved for deployment.

In practice, planned deployment involves various design factors, such as heterogeneous sensors and large numbers of sensors, and it is an NP-hard problem [3]. Thus, the computing time needed to determine optimal solutions tends to increase rapidly with the size of the problem. To reduce the computing time, we applied parallelization techniques using a GPU. There are four approaches for the planned deployment of WSNs: computational geometry, artificial potential field, genetic algorithm (GA), and particle swarm optimization. We have chosen the GA, a popular heuristic method, as our approach.

There are several fundamental design factors in WSNs, such as the sensing model, sensor mobility, and network coverage and connectivity. Sensing models are divided into binary and probabilistic sensing models. In a binary sensing model [4, 5], a sensor has a simple fixed sensing radius, and an object is detected if it is present within the sensing radius; otherwise, it is not detected. In the probabilistic sensing model [6], various factors capable of affecting the accuracy of the sensor reading, such as noise and barriers, are taken into consideration. Depending on sensor mobility, WSNs are divided into static and mobile WSNs. A mobile WSN consists of sensors that have sensing, processing, communication, and mobile capabilities. A mobile WSN offers the advantage of redeployment: sensor deployment can be controlled after the initial random deployment, and networks that were disconnected because of energy depletion or environmental changes can be reconfigured and restored. However, a static WSN is assumed for this study.
WSN coverage can be classified into three types: area coverage, point coverage, and barrier coverage [6, 7]. We have used the barrier coverage method [8], which deals with the general detection of all movement crossing over sensor barriers. Korea presents special circumstances, because the country is divided into South and North. More than 70% of the terrain in Korea is mountainous, and guerrilla-like battles would be crucial in determining the fate of a war. In mountainous terrain, it is extremely difficult to detect enemies with the naked eye at night, and it is almost impossible to precisely determine the size of enemy forces in a large area. To solve this problem, a variety of existing sensor deployment techniques can be applied, but deploying sensors efficiently and rapidly while taking topographical characteristics into consideration remains a highly complex problem [9]. We studied sensor deployment to detect movements of enemy troops and to cope with the unique circumstances presented by the mountainous terrain in Korea. As a result, we propose a parallelized sensor deployment optimization method that uses a GPU to obtain near real-time results by applying evolutionary computation.

The contributions of this paper include the following: (i) based on real environments, we used two types of sensors and three scenarios with different terrains and varied the number of sensors from 15 to 200 for comparison between CPU and GPU experiments; (ii) we not only shortened the computing time to as close to real-time as possible by using CUDA but also maintained solution qualities that are as good as the results of the CPU test; (iii) we took an elaborate parallelized approach based on CUDA for complex and various sensor deployments.

The remainder of this paper is organized as follows: Section 2 explains CUDA and the generational GA. Section 3 introduces work related to the subject of this paper. In Section 4, we present the problem definition, and in Section 5 we explain the proposed parallel GA. Section 6 describes the environments of our experiment and Section 7 analyzes the results. The paper ends with conclusions in Section 8.

2. Preliminaries

2.1. CUDA (Compute Unified Device Architecture). In November 2006, NVIDIA introduced CUDA [11], a general purpose parallel computing platform and programming model that leverages the parallel compute engine in NVIDIA GPUs to solve many complex computational problems in a more efficient way than on a CPU. CUDA comes with a software environment that allows developers to use C as a high-level programming language. Other languages, application programming interfaces, or directives-based approaches are supported, such as FORTRAN, DirectCompute, and OpenACC. The advent of multicore CPUs and many-core GPUs means that mainstream processor chips are now parallel systems. Furthermore, their parallelism continues to scale with Moore's law. The challenge is to develop application software that transparently scales its parallelism to leverage the increasing number of processor cores, much as 3D graphics applications transparently scale their parallelism to many-core GPUs with widely varying numbers of cores.

The CUDA parallel programming model is designed to overcome this challenge while maintaining a low learning curve for programmers familiar with standard programming languages such as C. At its core are three key abstractions (a hierarchy of thread groups, shared memories, and barrier synchronization) that are simply exposed to the programmer as a minimal set of language extensions. These abstractions provide fine-grained data parallelism and thread parallelism, nested within coarse-grained data parallelism and task parallelism. They guide the programmer to partition the problem into coarse subproblems that can be solved independently in parallel by blocks of threads, and each subproblem into finer pieces that can be solved cooperatively in parallel by all threads within the block. This decomposition preserves language expressivity by allowing threads to cooperate when solving each subproblem and at the same time enables automatic scalability.
Indeed, each block of threads can be scheduled on any of the available multiprocessors within a GPU, in any order, concurrently or sequentially, such that a compiled CUDA program can be executed on any number of multiprocessors, and only the runtime system needs to know the physical multiprocessor count.

Create an initial population of size n;
repeat {
    for i = 1 to k {
        choose parent1 and parent2 from the population;
        offspring_i = crossover(parent1, parent2);
        offspring_i = mutation(offspring_i);
    }
    replace(population, [offspring_1, offspring_2, ..., offspring_k]);
} until (stopping condition);
return the best solution;

Algorithm 1: Pseudocode of a typical genetic algorithm.

Figure 1: Flow chart of the proposed generational GA. [Steps shown: population creation, tournament selection, uniform crossover, random mutation, evaluation (GPU/CPU), replacement, and stop condition; the outputs are the sensor deployment and its detection rate. Steps marked ★ require an evaluation process.]

This scalable programming model allows the GPU architecture to span a wide market range by simply scaling the number of multiprocessors and memory partitions: from the high-performance enthusiast GeForce GPUs and professional Quadro and Tesla computing products to a variety of inexpensive, mainstream GeForce GPUs.

2.2. Genetic Algorithm. Algorithm 1 shows the pseudocode of a typical GA [12]. With $n$ denoting the number of solutions in the population, $n$ initial solutions are created randomly. The evolution starts from this population of completely random individuals, and the fitness of the whole population is determined. Each generation consists of several operations such as selection, crossover, mutation, and replacement. All individuals in the current population are replaced with new individuals to form a new population. This generational process is repeated until a termination condition has been reached. In a typical GA, the number of individuals in a population and the number of reproduced individuals are fixed at $n$ and $k$, respectively. The ratio of the number of new individuals to the size of the parent population, $k/n$, is termed the "generation gap."

In this study, we use a generational GA whose generation gap is 1. Figure 1 shows the process of our GA, each part of which is briefly described in Table 1.
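The operators summarized in Table 1 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation, which was written in C and CUDA-C. The population size, the number of generations, the mutation step of one grid cell, and the stand-in objective `toy_fitness` are assumptions chosen only to make the example runnable.

```python
import random

GRID = 50             # 50 x 50 grid of candidate sensor positions
MUTATION_RATE = 0.05  # per-gene mutation rate from Table 1
T = 3                 # tournament parameter t from Table 1


def random_solution(num_sensors, rng):
    """Encoding: a vector of integer pairs (grid coordinates)."""
    return [(rng.randrange(GRID), rng.randrange(GRID)) for _ in range(num_sensors)]


def tournament(population, fitness, rng, size=2 ** T):
    """Pick `size` individuals at random and return the fittest of them."""
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda i: fitness[i])]


def uniform_crossover(p1, p2, rng):
    """Copy each gene (grid point) from either parent with equal probability."""
    return [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]


def mutate(solution, rng, step=1):
    """Shift each selected gene by a small amount (assumed: at most `step` cells)."""
    def clamp(v):
        return min(GRID - 1, max(0, v))
    out = []
    for x, y in solution:
        if rng.random() < MUTATION_RATE:
            x = clamp(x + rng.randint(-step, step))
            y = clamp(y + rng.randint(-step, step))
        out.append((x, y))
    return out


def run_ga(evaluate, num_sensors, pop_size=20, generations=30, seed=0):
    """Generation gap 1: pop_size offspring are produced per generation."""
    rng = random.Random(seed)
    population = [random_solution(num_sensors, rng) for _ in range(pop_size)]
    fitness = [evaluate(s) for s in population]
    for _ in range(generations):
        for _ in range(pop_size):
            p1 = tournament(population, fitness, rng)
            p2 = tournament(population, fitness, rng)
            child = mutate(uniform_crossover(p1, p2, rng), rng)
            f = evaluate(child)
            worst = min(range(pop_size), key=lambda i: fitness[i])
            if f > fitness[worst]:  # replacement rule from Table 1
                population[worst], fitness[worst] = child, f
    best = max(range(pop_size), key=lambda i: fitness[i])
    return population[best], fitness[best]


def toy_fitness(solution):
    """Hypothetical stand-in objective: number of distinct grid rows occupied."""
    return len({y for _, y in solution})
```

A call such as `run_ga(toy_fitness, num_sensors=5)` returns the best deployment found and its fitness. In the setting of this paper, `evaluate` would be replaced by the simulated detection rate, which is the expensive step that motivates the GPU parallelization.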

Table 1: GA used in the experiment.

| GA parameter | Description |
|---|---|
| Encoding | A vector of integer pairs (2D coordinates). |
| Population initialization | Creation of $m$ random solutions. |
| Selection | $2^t$ solutions are selected randomly from the solution group, and the best-quality solution among them is chosen ($t = 3$). |
| Crossover | An offspring is created by the genetic recombination of two parents. Each integer pair (a grid point) is randomly copied from the first or the second parent. |
| Mutation | Each gene of the offspring is changed at a rate of 5 percent. Each integer pair (a grid point) in the offspring is randomly selected, and a small amount is added to or subtracted from each integer value of the pair. |
| Replacement | If the solution quality of an offspring is better than that of the worst individual in the population, the offspring replaces the worst individual. |

3. Related Work

We have studied sensor deployment using a GA, drawing on a number of studies on WSNs. Table 2 compares studies that aimed to solve WSN deployment problems by using a GA.

Table 2: Comparison between genetic-based algorithms for WSN deployment.

| Algorithm | Type of GA | Objective(s) | Type of sensors | Encoding | Sensing model |
|---|---|---|---|---|---|
| Yoon and Kim [13] | Normalized | (i) Maximize coverage | Heterogeneous | Integer | Deterministic |
| Kim et al. [14] | Generational | (i) Minimize travel time, find optimal location of sensors | Homogeneous | Binary | Probabilistic |
| Shi and Zhou [15] | Niche (real-coded) | (i) Maximize coverage | Homogeneous | Real-number | Deterministic |
| Unaldi et al. [16] | Generational, steady-state | (i) Maximize the quality of coverage of a WSN | Homogeneous | Integer | Probabilistic |
| Jourdan and de Weck [17] | Standard, multiobjective | (i) Maximize coverage; (ii) Maximize lifetime | Homogeneous | Real-number | Deterministic |
| Jourdan and de Weck [18] | Standard, multiobjective | (i) Maximize coverage; (ii) Maximize survivability; (iii) Minimize number of sensors | Homogeneous | Real-number | Deterministic |
| Carter and Ragade [19] | Microbial | (i) Cover a set of target points; (ii) Minimize number of sensors | Heterogeneous (acoustic and image) | Binary | Deterministic |
| Carter and Ragade [20] | Microbial | (i) Cover a set of target points; (ii) Minimize number of sensors | Homogeneous (infrared) | Binary | Probabilistic |
| Deif and Gadallah [21] | Variable-length | (i) Cover a set of target points; (ii) Maximize coverage; (iii) Minimize cost | Heterogeneous (nonisotropic) | Integer | Deterministic |
| Ta et al. [7] | Generational | (i) Minimize number of sensors | Homogeneous | Integer | Deterministic |
| Indhumathi and Venkatesan [22] | Generational | (i) Maximize coverage | Heterogeneous | Binary | Deterministic |

Yoon and Kim [13] proposed an efficient GA for the maximum coverage sensor deployment problem (MCSDP), in which the detection area is to be maximized. The authors devised a novel normalization method developed specifically for the MCSDP, together with efficient evaluation functions, and proposed a sensor deployment methodology suited to the characteristics of the GA. They used a binary sensing model and varied the sensor types and the radius of the detection range. In the MCSDP, the same solution can be obtained from various representations because the sizes of the genotype and phenotype spaces differ, which degrades execution performance; the proposed normalization method alleviated this problem. The evaluation functions derived results by combining the detection ranges of all sensors. During the GA run, the number of samples was set to 100,000. Once the final solution was obtained, it was recalculated with 1 million samples to increase the accuracy of the result. Furthermore, increasing the number of samples gradually as the generations of the GA progressed reduced the computing time approximately by half and achieved a considerable improvement in solution quality.

Kim et al. [14] studied methods for increasing the accuracy of travel time measurements for vehicles running on a highway. Previously, travel time was estimated from speed data measured via fixed sensors, but the estimate was inaccurate because it differed from the actual travel time. The authors took the characteristics of the highway (interchange connections, exits, and accidents) into consideration in the experiment, as well as various setup options depending on the traffic volume, time of day, and incident duration. A microscopic traffic simulation model and a GA were employed. The authors reduced the experimental time dramatically by omitting the simulation that would otherwise be required for every fitness measurement; instead, travel time was evaluated from the speed data measured at the sensor positions in the results of a simulation that used the maximum number of sensors. The maximum number of sensors was 594 and the gap between sensors was 76.2 m. The total distance of the experimental section was approximately 47.044 km. Binary encoding of 594 bits, corresponding to the maximum number of sensors, was used. An activated sensor was indicated by bit 1 and an inactivated sensor by bit 0. In the experimental results, the travel time estimation error was reduced to less than 10%, that is, an accuracy of over 90%, which was better than existing measurement methods in most traffic situations. The best performance was obtained when using 60 sensors.

Shi and Zhou [15] used a two-step strategy: the quality of the initial solution was increased up to a certain level using virtual force (VF), and then sensor deployment was optimized using a GA. They used 30 fixed sensors and 20 mobile sensors in a 100 × 100 area, and the 20 mobile sensors were positioned by means of VF and GA to optimize the deployment. The experiment was divided into VF, real-coded GA (RCGA), virtual force GA (VFGA), virtual force crossover GA (VFCGA), and virtual force mutation GA (VFMGA). The mean probability of detection (POD) of VF, RCGA, VFGA, VFCGA, and VFMGA was 83.79%, 96.53%, 97.53%, 95.83%, and 95.17%, respectively, indicating that VFGA performed the best.

Unaldi et al. [16] studied an optimal sensor deployment method in which a wavelet-transform- (WT-) based mutation operator was applied to a steady-state GA (S-GA) in three-dimensional (3D) terrain. Their proposed algorithm maximized the detection area by using a probabilistic sensing model and the line of sight (LOS) algorithm proposed by Bresenham. The LOS algorithm can perform a relatively fast computation because it does not require interpolation in 3D environments. The sensing target area is divided into subregions depending on the number of sensors; a subregion is composed of multiple pixels, and a sensor is positioned at a specific pixel. The representation of each solution consists of subregion and pixel numbers. When a specific pixel was within sensing range and there was a LOS between the two pixels, it was regarded as detected.
This means that, because of the 3D terrain characteristics, there was a height difference between the pixel where a specific sensor was located and the detected pixel, and a virtual line was drawn between the two pixels. If no pixel rose above that virtual line, the target pixel was regarded as detected. A single-point crossover operation was used, and a random mode and a WT-based attractiveness mode were compared for the mutation operation. Furthermore, the WT-based mutation operation was divided into a method allowing a sensor to move only within a subregion and a method allowing a sensor to move between subregions, and the two methods were compared. The generational GA and S-GA were also compared. The S-GA experiment in which sensors were allowed to move between subregions and a WT-based mutation operator was used had the highest average POD of 76.3%.

Jourdan and de Weck [17] employed a multiobjective GA (MOGA) to achieve the optimal deployment of $n$ fixed sensors in a 2D plain. Their objectives were to maximize the coverage and lifetime of the WSNs. They used a binary sensing model and configured all sensors to have the same communication radius and sensing radius. Each solution was represented as the set of coordinates of all sensors and was ranked according to its area coverage and lifetime. Real-number encoding and random single-point crossover were used. In [18], Jourdan and de Weck applied the results of their study to three specific surveillance scenarios. The first scenario had three objectives: maximizing coverage, minimizing the number of deployed sensors, and maximizing the distance between the deployed sensors and a hostile building under surveillance to ensure the maximum survival of the network. The objectives of the second and third scenarios were to minimize the number of deployed sensors while maximizing WSN coverage. The two scenarios differed in the coverage type: barrier coverage was maximized in the second scenario, while area coverage was maximized in the third scenario. The experimental results showed samples of a Pareto optimal set of nondominated deployments for each scenario, and the results offered guidance as to how trade-off information between competing objectives should be presented to a network designer. Their study had the advantage of being flexible enough to be applied to various sets of design objectives, but it had two drawbacks. First, it could derive inaccurate results because of the use of a binary sensing model. Second, it lacked practicality because it assumed RoIs that consisted only of plains without obstacles.

Carter and Ragade [20] addressed the problem of covering a finite set of target locations (target points) by using predetermined sensors. They attempted to meet two deployment objectives (i.e., to minimize the number of deployed sensors and to guarantee coverage of all target points) by using a microbial GA [23]. They used two types of sensors: acoustic and image sensors. For acoustic sensors, a binary sensing model was applied, whereas the field-of-view metric [24] was applied for image sensors. Their study extended a previous study [19] by adding probabilistic coverage determination methods.

Feng et al. [25] studied the optimal placement of pyroelectric infrared (PIR) sensors in developing an infrared motion sensing system for human motion localization.
They used a GA to optimize both the deployment and the modulated field of view of the PIR sensors in order to improve the localization performance, which was evaluated by the average and maximum localization errors. A numerical analysis was also presented to offer guidance on the search spaces of the design parameters when implementing GA optimization.

Watfa and Commuri [26] studied the optimal three-dimensional sensor deployment strategy. They rigorously analyzed the deployment problem in a 3D space in WSNs and tried to determine the minimum number of sensor nodes that guarantees complete coverage. A regular hexagonal lattice arrangement was applied to solve the problem.

In the present study, a real terrain was divided into a 2D grid, and we aimed to find an optimal sensor deployment capable of maximizing the detection of moving vehicles. We took the topographical characteristics into consideration, and acoustic and FLIR sensors were used. A Monte Carlo simulation method was employed, and a large performance improvement was achieved by using parallel processing with GPUs. Parallel processing was applied to the POD computation in the fitness function and to the POD computation due to movement of the vehicle. We aimed to improve the practical applicability by reflecting the real topographical information and overcoming the disadvantage of a GA, namely, slow computing speed.

4. Problem Definition

4.1. Terrain and Obstacles. Real terrain of 5 km by 5 km was divided according to a 50 × 50 grid (Figure 2), and the size of each grid point was set to 100 m × 100 m. For a grid point in hilly terrain, the detection capacity of the sensor was reduced by a certain value [9, 10]. We used three different scenarios: one of them is plain and the other two have hilly terrain. Figure 3 shows the terrain of Scenarios 1 and 2; the yellow parts represent hilly terrain. After the sensors were deployed, a POD was calculated at each grid point, and whenever a vehicle passed through a grid point, whether it was detected was decided on the basis of that POD. Figure 4 shows the approximate location of Figures 3(a) and 3(b) on a map of the Korean Peninsula. In Section 6.1, we give a detailed explanation of our modeling and simulator. In the modeling of Scenarios 1 and 2, we represented the grid model of Figure 2 as a two-dimensional integer array of size 50 × 50. If a grid point is higher than 100 meters above sea level, the corresponding element of the array has the value 1 (hill); otherwise, it has the value 0 (plain). We developed a sensor deployment simulator using the C and CUDA-C languages on NVIDIA CUDA platform 7.0.

Figure 2: Grid model of the battlefield for sensor deployment. [A 50 × 50 grid with corner coordinates (0, 0), (49, 0), (0, 49), and (49, 49).]

4.2. Type of Sensors and Detectable Range by Sensor. Seismic and FLIR sensors, as shown in Table 3, were used for the experiment [10]. For seismic sensors, if the distance between the grid point and the sensor was less than 250 m, the detection probability was set at 95%; if the distance was at least 250 m and less than 500 m, it was set at (−1.37 × ln(distance) + 8.55); and if the distance was 500 m or more, it was set at 0%. For FLIR sensors, if the distance between the grid point and the sensor was less than 150 m, the detection probability was set at 90%; if the distance was at least 150 m and less than 300 m, it was set at (−1.20 × ln(distance) + 6.90); and if the distance was 300 m or more, it was set at 0%.

Table 3: Capabilities of sensors.

| Sensor | Distance $x$ | Detection probability |
|---|---|---|
| Seismic sensor | $x < 250$ m | 0.95 |
| Seismic sensor | $x < 500$ m | $-1.37 \times \ln(x) + 8.55$ |
| FLIR sensor | $x < 150$ m | 0.90 |
| FLIR sensor | $x < 300$ m | $-1.20 \times \ln(x) + 6.90$ |

4.3. POD Calculation at Each Grid Point. After a given number of sensors are deployed on the grid, the distance between each grid point and each sensor is calculated. Equations (1) and (2) give the probabilities that a single grid point is detected by a nearby seismic sensor and a nearby FLIR sensor, respectively. These probabilities are accumulated according to (3), and the accumulated value is the POD of the grid point. An example of POD calculation is shown in Section 5.1, in Algorithm 3 and Figure 5. Equation (3) represents the POD equation for a single grid point, where $p_j$ is the POD at each grid point, $n$ is the number of deployed sensors, and $k$ is a sequence number of the grid point (0–2499). For $j \in \{1, 2, \ldots, m\}$, $p_j$ is the probability of detecting a target at grid point $j$ and $d_{ij}$ is the distance between sensor $i$ and grid point $j$. The constant $D$ is 500 in the case of seismic sensors and 300 in the case of FLIR sensors. In the POD computation, the hilly terrain was taken into consideration by applying a reduction rate of 65% when the grid point was on a hillside [9, 10]:

$$P_{ij}(\text{seismic}) = \begin{cases} 0.95, & \text{if } d_{ij} < \dfrac{D}{2}, \\ -1.37 \times \ln(d_{ij}) + 8.55, & \text{if } d_{ij} < D,\ d_{ij} \ge \dfrac{D}{2}, \\ 0, & \text{if } d_{ij} \ge D, \end{cases} \tag{1}$$

$$P_{ij}(\text{FLIR}) = \begin{cases} 0.90, & \text{if } d_{ij} < \dfrac{D}{2}, \\ -1.20 \times \ln(d_{ij}) + 6.90, & \text{if } d_{ij} < D,\ d_{ij} \ge \dfrac{D}{2}, \\ 0, & \text{if } d_{ij} \ge D, \end{cases} \tag{2}$$

$$P(p_j) = P_{1j} + \sum_{i=2}^{n} \prod_{k=1}^{i-2} \left(1 - P_{ij}\right). \tag{3}$$

Algorithm 2 carries out this accumulation one sensor at a time as POD ← POD + (1 − POD) × $P_{ij}$.
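The per-sensor probabilities of (1) and (2) and the sensor-by-sensor accumulation used in Algorithm 2 can be illustrated with a short Python sketch. It is not the authors' code. Two points are assumptions made for the example: distances are taken as Euclidean distances between grid indices scaled by the 100 m cell size, and the hillside reduction is left out. The names `p_seismic`, `p_flir`, and `pod_at` are hypothetical.

```python
import math

CELL_M = 100.0  # each grid point covers 100 m x 100 m


def p_seismic(d):
    """Equation (1) with D = 500 m."""
    if d < 250.0:
        return 0.95
    if d < 500.0:
        return -1.37 * math.log(d) + 8.55
    return 0.0


def p_flir(d):
    """Equation (2) with D = 300 m."""
    if d < 150.0:
        return 0.90
    if d < 300.0:
        return -1.20 * math.log(d) + 6.90
    return 0.0


def pod_at(cell, sensors):
    """Accumulate POD <- POD + (1 - POD) * p over all sensors.

    cell: (col, row) grid indices.
    sensors: list of (col, row, kind) with kind "seismic" or "flir".
    Assumption: Euclidean distance between grid indices, in metres.
    """
    pod = 0.0
    for sx, sy, kind in sensors:
        d = CELL_M * math.hypot(cell[0] - sx, cell[1] - sy)
        p = p_seismic(d) if kind == "seismic" else p_flir(d)
        pod += (1.0 - pod) * p
    return pod
```

With this recurrence the result equals one minus the product of the miss probabilities (1 − p) over all sensors, so it stays between 0 and 1 and does not depend on the order in which the sensors are visited.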

4.4. POD Calculation. For each moving vehicle, the initial $x$ position was selected randomly from $\{0, 1, \ldots, 49\}$, and the vehicle was moved in the direction of increasing $y$, as shown in Figure 6. Whenever a vehicle passed through a grid point, the POD at that grid point was compared with a random value, and if the POD was larger, the vehicle was regarded as detected. While $y$ was within the range 0–47, the detected vehicles count was increased if the vehicle was detected; otherwise, the undetected vehicles count was increased. The POD (fitness) of each solution was then calculated as indicated in

$$\text{Detection rate (\%)} = 100 \times \frac{\text{detected vehicles}}{\text{detected vehicles} + \text{undetected vehicles}}. \tag{4}$$

Figure 3: Map where hilly terrain information is applied. (a) Around Myonghak-josuji in North Korea (Scenario 1). (b) Around Myonghak-josuji in North Korea (Scenario 2).

Figure 4: Location of DML and experimental target area on Korean Peninsula.

Table 4: Grid and block dimensions of POD calculation.

| | Dimensions |
|---|---|
| Blocks per grid | $(x, y, z) = (50, 100, 1)$ |
| Threads per block | $(x, y, z) = (50, 1, 1)$ |
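The Monte Carlo estimate behind (4) can be sketched as follows. This is one plausible reading of the procedure, not the authors' simulator: each vehicle is counted once, as detected if the POD exceeds a fresh random value at any grid point it visits, and vehicles here simply drive straight along increasing $y$. The function names and the default of 30 vehicles per evaluation are illustrative.

```python
import random


def vehicle_detected(pods_on_path, rng):
    """Detected if, at any visited cell, the POD exceeds a fresh random value."""
    return any(pod > rng.random() for pod in pods_on_path)


def detection_rate(detected, undetected):
    """Equation (4): percentage of vehicles that were detected."""
    return 100.0 * detected / (detected + undetected)


def estimate_fitness(pod_grid, num_vehicles=30, seed=0):
    """pod_grid[y][x] holds the POD of grid point (x, y).

    Assumption: each vehicle keeps its random starting column and
    visits the rows y = 0..47 in order.
    """
    rng = random.Random(seed)
    detected = undetected = 0
    for _ in range(num_vehicles):
        x = rng.randrange(len(pod_grid[0]))
        path = [pod_grid[y][x] for y in range(48)]
        if vehicle_detected(path, rng):
            detected += 1
        else:
            undetected += 1
    return detection_rate(detected, undetected)
```

Because every vehicle draws its own random numbers and only reads the POD grid, the vehicles can be simulated independently of one another.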

5. CUDA-GA Design

5.1. Parallel POD Calculation. In the CPU experiments, the process requiring the longest computation time was the fitness evaluation, which consisted of two parts: POD calculation at each grid point and calculation of the detection rate according to vehicle movement. Table 4 and Figure 5(b) show the structure of the grid and blocks. A grid has 50 × 100 × 1 blocks and each block has 50 × 1 × 1 threads. Each thread was in charge of one POD calculation. Algorithm 2 shows the pseudocode of the POD kernel function on the NVIDIA CUDA platform. It concurrently calculates every POD per generation. The return values of the kernel function are 250,000 PODs per generation. The coordinates $(x, y)$ of all sensors are stored in two one-dimensional arrays: sensors_x is the pointer to the $x$-coordinate array and sensors_y is the pointer to the $y$-coordinate array. In the kernel function, the arrays are copied to shared memory, which is shared by all threads in a block. The shared memory is very fast and reduces memory latency. $(c, r)$ is the coordinate of the grid point whose POD is calculated. From the grid point $(c, r)$, the distances to all sensors were calculated and sorted in ascending order. A POD is calculated by accumulating the probabilities given by (1) or (2) according to the order of the sensors. It takes $O(n)$ time to copy system memory into GPU shared memory. The POD calculation at each grid point also takes $O(n)$ time. The total time complexity is therefore linear in the number of sensors.

The CPU experiment sequentially calculated the POD per generation for each solution. The calculated POD value was stored at each grid point, and when a vehicle passed through a grid point, the POD at that grid point was compared with a newly generated random value to determine whether the vehicle was detected. Figure 5(a) shows an example in which the POD is calculated at grid point $(x, y) = (4, 2)$ when two seismic sensors and two FLIR sensors are deployed. Algorithm 3 shows the procedure for calculating the POD when the grid point is (4, 2). The POD is the sum of the probability values from (a) to (d). Figure 5(b) shows the setup of blocks and threads used to calculate the PODs in parallel at all grid points within the solution group. Per solution, this amounts to 50 blocks of 50 threads each, so that 2,500 PODs are calculated simultaneously; over the 100 solutions of the population, this gives the 250,000 PODs noted above.

Figure 5: Method for parallelizing POD calculation of each cell in the GPU. (a) POD calculation of a single cell. (b) POD calculation for each thread in the GPU. [Blocks are indexed from Block (0, 0) to Block (49, 99); each thread calculates the POD of one cell.]

5.2. Calculation of Vehicle Movement in Parallel. Each vehicle was moved according to the moving direction probability in Figure 7, as shown in Figure 6. Each vehicle sequentially passes through the region of interest, and vehicles do not interfere with each other. In the GPU experiment, the vehicles of all solutions (100 solutions × 30 vehicles) were moved in parallel to calculate the POD of each solution. Whenever a vehicle moves to another grid point, its $y$-coordinate changes. When the $y$-coordinate is larger than or equal to 47, the vehicle is regarded as having arrived at the destination, as in [10].

In the fitness evaluation of a solution, each of the thirty vehicles independently passes through the region of interest in its own workspace, in parallel. Table 5 and Figure 8 show the structure of a grid and blocks of vehicle movements.

Figure 6: Probability of each possible moving direction of vehicles [9, 10]. [Values shown relative to the direction of movement: 0.02, 0.07, 0.05, 0.07, 0.22, 0.35, 0.22.]

Figure 7: Example of vehicle movement path passing through target area.


// n: # of sensors;
// x[1, ..., n], y[1, ..., n]: coordinates of sensors in system memory;
// sx[1, ..., n], sy[1, ..., n]: coordinates of sensors in GPU shared memory;
// gt_id: unique index of each thread in GPU;
// c: column number of a current grid point; r: row number of a current grid point
if threadIdx.x is zero then
    for i ← 0 to n − 1 do
        sx[i] ← x[i];
        sy[i] ← y[i];
    end for
end if
syncthreads(); // wait all threads in this block;
// sorting sensors in ascending distance order around the grid point (c, r);
POD ← 0.0; // POD: probability of detection at the grid point (c, r);
for i ← 0 to n − 1 do
    d ← distance between the grid point (c, r) and sensors[i];
    st ← kind of sensors[i]; // SEISMIC or FLIR
    if st is SEISMIC then
        if d ≤ 250 then
            POD += (1 − POD) ∗ 0.95;
        else if d < 500 then
            POD += (1 − POD) ∗ (−1.3738 ∗ ln(d) + 8.5464);
        end if
    else if st is FLIR then
        if d ≤ 150 then
            POD += (1 − POD) ∗ 0.90;
        else if d < 300 then
            POD += (1 − POD) ∗ (−1.2 ∗ ln(d) + 6.9);
        end if
    end if
end for
Algorithm 2: Pseudocode of POD calculation.
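The accumulation step of Algorithm 2 can be read as the complement of the probability that no sensor detects the grid point. The short derivation below is only a sketch: the symbols p_i and POD_i are introduced here for illustration and do not appear in the original, with p_i denoting the single-sensor probability from (1) or (2), taken as 0 for a sensor that is out of range.

```latex
\begin{align*}
\mathrm{POD}_i &= \mathrm{POD}_{i-1} + (1 - \mathrm{POD}_{i-1})\, p_i
  \;\Longrightarrow\; 1 - \mathrm{POD}_i = (1 - \mathrm{POD}_{i-1})(1 - p_i), \\
\mathrm{POD}_n &= 1 - \prod_{i=1}^{n} (1 - p_i), \qquad \mathrm{POD}_0 = 0 .
\end{align*}
```

Under this reading the final value does not depend on the order in which the sensors are visited (apart from floating-point rounding), so the ascending-distance order fixes the order of accumulation rather than the result.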

(a) Probability of cell (4, 2) being detected by sensor (5, 1), which is located at the closest distance.
(b) Probability of not being detected at (a) × probability of cell (4, 2) being detected by sensor (2, 2), which is at the next closest distance.
(c) Probability of not being detected at (b) × probability of cell (4, 2) being detected by sensor (6, 4), which is at the next closest distance.
(d) Probability of not being detected at (c) × probability of cell (4, 2) being detected by sensor (4, 6), which is at the next closest distance.
Algorithm 3: Example of POD calculation of a grid point.
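Algorithms 2 and 3 can be sketched together in a few lines of Python. This is a minimal sketch, not the simulator's code: the 100 m grid spacing is inferred from the 5 km by 5 km area and the 2,500 grid points, the function names are hypothetical, and the assignment of sensor types to the four positions of Algorithm 3 is assumed purely for illustration.

```python
import math

CELL_SIZE_M = 100.0  # assumed spacing: 5 km / 50 grid points per side


def sensor_probability(kind, d):
    """Single-sensor detection probability at distance d (metres), as in Algorithm 2."""
    if kind == "SEISMIC":
        if d <= 250:
            return 0.95
        if d < 500:
            return -1.3738 * math.log(d) + 8.5464
    elif kind == "FLIR":
        if d <= 150:
            return 0.90
        if d < 300:
            return -1.2 * math.log(d) + 6.9
    return 0.0  # out of range or unknown sensor type


def pod_at(cell, sensors):
    """Accumulate the POD at grid point `cell` over sensors sorted by distance."""
    c, r = cell
    by_distance = sorted(
        (CELL_SIZE_M * math.hypot(x - c, y - r), kind) for x, y, kind in sensors
    )
    pod = 0.0
    for d, kind in by_distance:
        pod += (1.0 - pod) * sensor_probability(kind, d)
    return pod


# Sensor positions from Algorithm 3; the sensor types are assumed for illustration.
sensors = [(5, 1, "SEISMIC"), (2, 2, "FLIR"), (6, 4, "SEISMIC"), (4, 6, "FLIR")]
pod = pod_at((4, 2), sensors)
print(f"POD at (4, 2): {pod:.4f}")
```

With this assumed type assignment the sensor at (4, 6) lies 400 m away, outside the FLIR range, so it contributes nothing to the sum of terms (a) to (d).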

Table 5: Grid and block dimensions of vehicle movement.

| | Blocks per grid | Threads per block |
|---|---|---|
| Dimensions | (x, y, z) = (100, 1, 1) | (x, y, z) = (30, 1, 1) |
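The launch dimensions of Tables 4 and 5 determine how many threads run per kernel call. The sketch below only does that arithmetic; the two index mappings (which block or thread index corresponds to which solution, row, column, or vehicle) are hypothetical, since the paper gives the dimensions but not the mapping.

```python
# Launch dimensions (x, y, z) taken from Tables 4 and 5.
POD_GRID, POD_BLOCK = (50, 100, 1), (50, 1, 1)
VEH_GRID, VEH_BLOCK = (100, 1, 1), (30, 1, 1)


def total_threads(grid, block):
    """Number of threads launched for a grid of blocks."""
    gx, gy, gz = grid
    bx, by, bz = block
    return gx * gy * gz * bx * by * bz


def pod_thread_to_cell(block_x, block_y, thread_x):
    """Hypothetical mapping: block_y -> solution, block_x -> row, thread_x -> column."""
    return block_y, block_x, thread_x


def vehicle_thread_to_id(block_x, thread_x):
    """Hypothetical mapping: block_x -> solution, thread_x -> vehicle."""
    return block_x, thread_x


print(total_threads(POD_GRID, POD_BLOCK))  # 250000 POD tasks per generation
print(total_threads(VEH_GRID, VEH_BLOCK))  # 3000 vehicles (100 solutions x 30)
```

Any one-to-one mapping would serve equally well here; the point is only that every (solution, grid point) pair and every (solution, vehicle) pair gets its own thread.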

Algorithm 4 gives the pseudocode for calculating the detection ratio according to the vehicle movements. PODgrid is an array holding the 250,000 PODs of all solutions, and it is passed as a parameter. Whenever a vehicle moves to another grid point, we check whether or not the vehicle is detected. If the vehicle is detected, the detected counter is increased by 1; otherwise, the undetected counter is increased by 1. The time complexity of Algorithm 4 is O(r), where r is the number of rows in the given grid model.
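The random walk of Algorithm 4 can be sketched for a single vehicle as follows. This is a sequential sketch under stated assumptions rather than the CUDA kernel: the start row, the clamping of the vehicle to the grid, the `pod_grid[y][x]` layout, and the `max_steps` guard are all assumptions added for illustration, while the thresholds on r and the arrival row 47 come from the text.

```python
import random

GOAL_ROW = 47  # a vehicle has arrived once y >= 47
WIDTH = 50     # assumed number of grid columns


def move_vehicle(pod_grid, x, y=0, rng=random, max_steps=10_000):
    """Move one vehicle as in Algorithm 4 and count detected/undetected steps.

    pod_grid[y][x] holds the POD of a grid point (a hypothetical layout).
    """
    detected = undetected = 0
    for _ in range(max_steps):  # guard against an endless walk
        r = rng.randint(0, 100)  # thresholds follow Algorithm 4
        if r <= 2:
            y -= 1  # back
        elif r <= 7:
            pass  # hold current position
        elif r <= 14:
            x -= 1  # right (as labelled in Algorithm 4)
        elif r <= 21:
            x += 1  # left
        elif r <= 43:
            x -= 1  # right and forward
            y += 1
        elif r <= 65:
            x += 1  # left and forward
            y += 1
        else:
            y += 1  # forward
        x = min(max(x, 0), WIDTH - 1)  # assumption: keep the vehicle on the grid
        y = max(y, 0)
        if pod_grid[y][x] > rng.random():
            detected += 1
        else:
            undetected += 1
        if y >= GOAL_ROW:
            break
    return detected, undetected, y


grid = [[0.5] * WIDTH for _ in range(50)]
print(move_vehicle(grid, x=25, rng=random.Random(1)))
```

On the GPU each of the 3,000 vehicles would run this loop in its own thread, which is why the per-vehicle work stays proportional to the number of rows.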

// detected: detected count, undetected: undetected count;
// x: column number of a current grid point;
// y: row number of a current grid point;
// r ← random value [0, ..., 100]; rf ← float random value [0.0, ..., 1.0];
// PODgrid: PODs of a grid;
// t_id: unique id of each thread in a block;
repeat
    if r ≤ 2 then y--; // moving to back;
    else if r ≤ 7 then ; // hold current position;
    else if r ≤ 14 then x--; // moving to right;
    else if r ≤ 21 then x++; // moving to left;
    else if r ≤ 43 then x--, y++; // moving to right and forward;
    else if r ≤ 65 then x++, y++; // moving to left and forward;
    else if r ≤ 100 then y++; // moving to forward;
    end if
    if PODgrid[t_id] > rf then detected++;
    else undetected++;
    end if
until (y ≥ 47);
Algorithm 4: Pseudocode of vehicle movements.

[Figure 8: Vehicle movements for each thread in the GPU. Blocks (0, 0), (1, 0), ..., (99, 0), each consisting of threads.]

6. Experimental Setup

The performance was compared and evaluated by using a CPU and a GTX970 GPU. If the number of degrees of freedom of the t distribution is 30 or larger, the approximation can be statistically meaningful in our experiments [27]. For this reason, all types of experiments were repeated 30 times independently, and the mean and standard deviation of the optimal solution of each experiment were calculated to compare the quality of the solutions. The time required to complete the experiments was measured by dividing the running time into the GA execution time, the evaluation function execution time, the POD calculation time, and the vehicle movement and POD calculation time, and the average of each execution time was calculated for the experimental results. For each solution, 30 vehicles were run to calculate the POD. The types of experiment were set as follows. First, the experiments were divided into CPU and GPU experiments. Second, the sensors were divided into seismic and FLIR sensors, with a different sensing range specified for each type. Third, the number of deployed sensors was varied over 15, 25, 50, 100, 150, and 200. Finally, three topographical scenarios were used. We used a general desktop computer to optimize the sensor deployment on a specific area of 5 km by 5 km. The price of the computer was around $500. After an NVIDIA GeForce GTX 970 and a 500 W power supply unit were installed, we could improve the computing speed using CUDA. The graphics card and the power supply unit cost about $700.

6.1. Test Environments. We developed a sensor deployment simulator using the C and CUDA-C languages on the NVIDIA CUDA platform 7.0. It has various setup options such as the type of arithmetic unit, the number and types of sensors, and the GA parameters. The simulator consists of three parts: a generational GA, the fitness function for the CPU, and the fitness function for the GPU. In the CPU experiment, a PC with Intel Core2 duo quad core 3.0 GHz processors and 4 GB memory was used. In the GPU experiment, NVIDIA GeForce GTX graphics cards were mounted in the same PC that was used in the CPU experiment. Table 6 shows the specifications of the three GPUs used in our tests. The important factors are the number of CUDA cores and the L2 cache size. L2 cache is used in shared memory and registers. The L2 cache improves the performance of CUDA applications with respect to memory access patterns. The NVIDIA GeForce GTX 970 has 3.5 times as much L2 cache as the other two GPUs (1,792 KB versus 512 KB), and it can speed up memory usage if the L2 cache is used efficiently. CUDA cores are useful for processing a lot of small tasks.

Table 6: Comparison of GPU specifications used in the experiment.

| GPU component | GTX 970 | GTX 770 | GTX 560 |
|---|---|---|---|
| GPU clock (MHz) | 1,317 | 1,189 | 1,620 |
| CUDA cores | 1,664 | 1,536 | 336 |
| Memory clock rate (MHz) | 3,505 | 3,505 | 2,010 |
| L2 cache size (KB) | 1,792 | 512 | 512 |
| Global memory (MB) | 4,096 | 4,096 | 2,048 |

6.2. GA Parameters Configuration. Table 7 shows the configuration of the GA parameters for the experiment. A solution on the 2D grid was encoded as a list of (x, y) coordinates. The first part of the list holds the coordinates of the seismic sensors, whereas the second part holds the coordinates of the FLIR sensors.

Table 7: GA parameters used in the experiment.

| GA parameters | Descriptions |
|---|---|
| Individual fitness function | Detection rate |
| Encoding | A vector of integer pairs (2D coordinates) (15, 25, 50, 100, 150, and 200 dimensions) |
| #Population | 100 |
| #Generation | 100 |
| Selection | Tournament selection (k = 3) |
| Crossover | Uniform crossover (crossover rate = 0.5) |
| Mutation | Gene-wise mutation (mutation rate = 0.05) |
| Replacement | With regard to individual fitness (detection rate), if an offspring is superior to the worst individual in the population, the offspring replaces the worst individual (the worst individual is removed from the population and the offspring is added) |

6.3. Maximum Coverage. The theoretically possible maximum coverage was calculated as follows. The POD was calculated for a single sensor, and the calculated PODs were summed, followed by division by the total number of grid points, 2,500. A single seismic sensor and a single FLIR sensor can cover at most 69 and 25 grid points, respectively, as shown in Figure 9. The sum of the PODs at the 69 grid points with a value larger than 0 for a single seismic sensor was 40.412, which corresponds to a coverage of 1.616%. The sum of the PODs at the 25 grid points with a value larger than 0 for a single FLIR sensor was 14.038, which corresponds to a coverage of 0.561%. In (5), which calculates the maximum coverage M, C_seismic was 1.616% and C_FLIR was 0.561%. The parameter #cells represents the size of the grid, which was 2,500; #seismic is the number of seismic sensors and #FLIR is the number of FLIR sensors. Table 8 shows the theoretically possible maximum coverage calculated according to the number of deployed sensors using

[Figure 9: Coverage of a sensor. (a) Coverage of a seismic sensor. (b) Coverage of an FLIR sensor.]

$$M = C_{\text{seismic}} \times \frac{\#\text{seismic}}{\#\text{cells}} + C_{\text{FLIR}} \times \frac{\#\text{FLIR}}{\#\text{cells}}. \tag{5}$$
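The single-sensor figures quoted in Section 6.3 and the entries of Table 8 can be re-derived numerically. The sketch below assumes a 100 m grid spacing (5 km divided into 50 grid points per side) and a sensor placed exactly on a grid point, and it treats the maximum coverage as the per-sensor coverage percentage multiplied by the number of sensors, with each term capped at 100% as the entries of Table 8 suggest; the function names are hypothetical.

```python
import math

CELL_SIZE_M = 100.0  # assumed grid spacing: 5 km / 50 grid points per side
NUM_CELLS = 2500


def seismic(d):
    """Seismic sensor model of Algorithm 2 (d in metres)."""
    if d <= 250:
        return 0.95
    if d < 500:
        return -1.3738 * math.log(d) + 8.5464
    return 0.0


def flir(d):
    """FLIR sensor model of Algorithm 2 (d in metres)."""
    if d <= 150:
        return 0.90
    if d < 300:
        return -1.2 * math.log(d) + 6.9
    return 0.0


def single_sensor_pods(model, reach=5):
    """PODs larger than 0 at the grid points around one sensor on a grid point."""
    pods = []
    for dx in range(-reach, reach + 1):
        for dy in range(-reach, reach + 1):
            p = model(CELL_SIZE_M * math.hypot(dx, dy))
            if p > 0:
                pods.append(p)
    return pods


def max_coverage(n_seismic, n_flir):
    """Upper bound in percent; each term is capped at 100 as in Table 8."""
    c_s = 100.0 * sum(single_sensor_pods(seismic)) / NUM_CELLS
    c_f = 100.0 * sum(single_sensor_pods(flir)) / NUM_CELLS
    cov_s = min(100.0, c_s * n_seismic)
    cov_f = min(100.0, c_f * n_flir)
    return cov_s, cov_f, min(100.0, cov_s + cov_f)


print(len(single_sensor_pods(seismic)), len(single_sensor_pods(flir)))
print([round(v, 1) for v in max_coverage(7, 8)])
```

Under these assumptions the sketch covers 69 and 25 grid points with POD sums of about 40.412 and 14.038, which matches the values stated in the text and supports the inferred 100 m spacing.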

7. Experimental Results

All values in Tables 9–13 are the averages over 30 experiments. Table 9 shows the results of comparative tests with coverage algorithms such as Multi-Start [13], VFA [5], and the proposed GA, which were obtained using the same computing time for a fair comparison. The results of Multi-Start and VFA were superior to those of the initial random deployment, which are given in the last column of Table 10. Multi-Start showed better quality than VFA when the number of sensors was 50 or under. However, when the number of sensors was above 50, the quality of VFA was better than or equal to that of Multi-Start. The proposed GA was the best among all tests.

Table 8: Maximum coverage rate according to the number and type of sensors.

| Instances | Seismic sensor | FLIR sensor | Coverage of seismic sensor (%) | Coverage of FLIR sensor (%) | Maximum coverage (%) |
|---|---|---|---|---|---|
| Instance 1 | 7 | 8 | 11.3 | 4.5 | 15.8 |
| Instance 2 | 12 | 13 | 19.4 | 7.3 | 26.7 |
| Instance 3 | 25 | 25 | 40.4 | 14.0 | 54.4 |
| Instance 4 | 50 | 50 | 80.8 | 28.1 | 100 |
| Instance 5 | 75 | 75 | 100 | 42.1 | 100 |
| Instance 6 | 100 | 100 | 100 | 56.2 | 100 |

Table 9: Comparative tests with coverage algorithms for plain (CPU).

| Instances | Multi-Start [13] Ave (%) | Multi-Start [13] SD (%) | VFA [5] Ave (%) | VFA [5] SD (%) | GA Ave (%) | GA SD (%) |
|---|---|---|---|---|---|---|
| Instance 1 | 23.3 | 1.0 | 23.6 | 0.9 | 26.6 | 1.4 |
| Instance 2 | 33.9 | 1.1 | 33.2 | 0.9 | 38.1 | 1.4 |
| Instance 3 | 53.3 | 1.0 | 52.9 | 0.9 | 61.4 | 1.5 |
| Instance 4 | 76.4 | 0.6 | 76.5 | 1.0 | 85.4 | 0.5 |
| Instance 5 | 88.4 | 0.7 | 88.5 | 0.7 | 95.1 | 0.3 |
| Instance 6 | 94.3 | 0.3 | 94.3 | 0.4 | 98.3 | 0.2 |

Tables 10–12 show the experimental results according to the terrain type, with the CPU and GPU results listed by the number of deployed sensors. Tables 10–12 show that as the number of deployed sensors increased, the POD also increased steadily, while the standard deviation decreased gradually, indicating a convergence of the solution groups. Table 13 shows the comparative results of detection rates between GPUs for plains. Figure 16 shows the computing time according to the number of vehicles in the GPU tests. The computing time increases only slightly even when the number of vehicles changes drastically, because the movement of the vehicles is processed concurrently in CUDA. Figures 17–19 show the initial deployments and the optimal deployments obtained by the proposed GA for the tested scenarios. The initial deployment is a very important factor that influences the detection rates. We initially deployed the sensors following a uniform distribution to make the tests fair.

Table 10: Experimental results for plain.

| Unit | Instances | Ave (%) | SD (%) | GA (s) | Eval (s) | POD (s) | Vehicles (s) | Random (%) |
|---|---|---|---|---|---|---|---|---|
| GTX 970 | Instance 1 | 24.4 | 0.9 | 0.433 | 0.422 | 0.088 | 0.008 | 18.6 |
| GTX 970 | Instance 2 | 36.3 | 0.8 | 0.600 | 0.582 | 0.243 | 0.007 | 28.0 |
| GTX 970 | Instance 3 | 60.2 | 1.4 | 1.321 | 1.293 | 0.915 | 0.007 | 47.2 |
| GTX 970 | Instance 4 | 85.5 | 0.7 | 3.897 | 3.844 | 3.457 | 0.007 | 71.0 |
| GTX 970 | Instance 5 | 95.0 | 0.4 | 7.786 | 7.706 | 7.318 | 0.007 | 84.3 |
| GTX 970 | Instance 6 | 98.3 | 0.2 | 13.227 | 13.123 | 12.767 | 0.007 | 91.6 |
| CPU | Instance 1 | 26.6 | 1.4 | 11.080 | 11.070 | N/A | N/A | 19.5 |
| CPU | Instance 2 | 38.1 | 1.4 | 25.950 | 25.936 | N/A | N/A | 29.1 |
| CPU | Instance 3 | 61.4 | 1.5 | 79.101 | 79.073 | N/A | N/A | 48.7 |
| CPU | Instance 4 | 85.4 | 0.5 | 237.074 | 237.022 | N/A | N/A | 72.7 |
| CPU | Instance 5 | 95.1 | 0.3 | 456.122 | 456.041 | N/A | N/A | 85.3 |
| CPU | Instance 6 | 98.3 | 0.2 | 743.208 | 743.108 | N/A | N/A | 92.7 |

Ave: average of detection rates, SD: standard deviation of detection rates, GA: computing time of genetic algorithm, Eval: fitness function computing time, POD: POD computing time, Vehicles: computing time according to vehicle movement, and Random: detection rates when random deployment is applied.

Table 11: Experimental results of Scenario 1.

| Unit | Instances | Ave (%) | SD (%) | GA (s) | Eval (s) | POD (s) | Vehicles (s) | Random (%) |
|---|---|---|---|---|---|---|---|---|
| GTX 970 | Instance 1 | 22.3 | 0.8 | 0.433 | 0.423 | 0.089 | 0.008 | 16.8 |
| GTX 970 | Instance 2 | 32.9 | 1.0 | 0.624 | 0.604 | 0.244 | 0.007 | 25.2 |
| GTX 970 | Instance 3 | 53.7 | 1.0 | 1.349 | 1.316 | 0.915 | 0.007 | 42.2 |
| GTX 970 | Instance 4 | 76.2 | 0.7 | 3.908 | 3.854 | 3.460 | 0.007 | 63.0 |
| GTX 970 | Instance 5 | 84.7 | 0.4 | 7.793 | 7.716 | 7.320 | 0.007 | 74.9 |
| GTX 970 | Instance 6 | 87.9 | 0.2 | 13.245 | 13.139 | 12.779 | 0.007 | 81.5 |
| CPU | Instance 1 | 25.0 | 1.3 | 10.527 | 10.517 | N/A | N/A | 17.7 |
| CPU | Instance 2 | 35.5 | 1.3 | 25.360 | 25.345 | N/A | N/A | 26.2 |
| CPU | Instance 3 | 55.7 | 1.2 | 77.559 | 77.533 | N/A | N/A | 43.3 |
| CPU | Instance 4 | 77.8 | 1.0 | 232.520 | 232.469 | N/A | N/A | 64.7 |
| CPU | Instance 5 | 86.3 | 0.7 | 449.369 | 449.292 | N/A | N/A | 76.4 |
| CPU | Instance 6 | 89.9 | 0.3 | 728.997 | 728.892 | N/A | N/A | 82.9 |

Table 12: Experimental results of Scenario 2.

| Unit | Instances | Ave (%) | SD (%) | GA (s) | Eval (s) | POD (s) | Vehicles (s) | Random (%) |
|---|---|---|---|---|---|---|---|---|
| GTX 970 | Instance 1 | 22.6 | 0.8 | 0.430 | 0.420 | 0.089 | 0.008 | 16.4 |
| GTX 970 | Instance 2 | 33.3 | 1.0 | 0.595 | 0.578 | 0.243 | 0.007 | 24.7 |
| GTX 970 | Instance 3 | 53.0 | 1.1 | 1.317 | 1.290 | 0.913 | 0.007 | 40.8 |
| GTX 970 | Instance 4 | 74.6 | 0.7 | 3.903 | 3.851 | 3.459 | 0.007 | 62.0 |
| GTX 970 | Instance 5 | 83.0 | 0.5 | 7.807 | 7.729 | 7.324 | 0.007 | 73.4 |
| GTX 970 | Instance 6 | 86.3 | 0.4 | 13.246 | 13.139 | 12.780 | 0.007 | 80.2 |
| CPU | Instance 1 | 25.1 | 1.3 | 10.482 | 10.472 | N/A | N/A | 17.3 |
| CPU | Instance 2 | 35.5 | 1.7 | 25.271 | 25.256 | N/A | N/A | 25.4 |
| CPU | Instance 3 | 54.7 | 1.1 | 77.486 | 77.460 | N/A | N/A | 42.6 |
| CPU | Instance 4 | 75.1 | 0.4 | 234.460 | 234.409 | N/A | N/A | 63.5 |
| CPU | Instance 5 | 84.0 | 0.2 | 449.728 | 449.652 | N/A | N/A | 75.1 |
| CPU | Instance 6 | 87.3 | 0.3 | 730.183 | 730.076 | N/A | N/A | 81.7 |

In the plain experiment of Table 10, the "Ave" values of the GPU and CPU for Instance 6 were both 98.3%, which showed a similar quality, whereas the "Eval" values revealed that the GPU experiment was about 56 times faster than the CPU experiment. Figure 10 shows the confidence intervals using 3σ values. In the CPU test, the huge computing time for fitness evaluation is caused by the POD calculation, and the computing time of the POD calculation is significantly improved by parallel processing. In the Scenario 1 experiment of Table 11, the "Ave" values of the GPU and CPU for Instance 6 were 87.9% and 89.9%, respectively, which showed a comparable quality in Figure 11, while "Eval" revealed that the GPU experiment was about 55 times faster than the CPU experiment. In the Scenario 2 experiment of Table 12, the "Ave" values of the GPU and CPU were 86.3% and 87.3%, respectively, so the CPU experiment had a slightly better result. However, the results of the CPU and GPU tests lie within the span of the error bars in Figure 12. According to "Eval," the GPU experiment was about 56 times faster than the CPU experiment.

Table 13: Comparison of detection rates between GPUs for plains.

| Unit | Instances | Ave (%) | SD (%) | GA (s) | Eval (s) | POD (s) | Vehicles (s) | Random (%) |
|---|---|---|---|---|---|---|---|---|
| GTX 970 | Instance 1 | 24.4 | 0.9 | 0.433 | 0.422 | 0.088 | 0.008 | 18.6 |
| GTX 970 | Instance 2 | 36.3 | 0.8 | 0.600 | 0.582 | 0.243 | 0.007 | 28.0 |
| GTX 970 | Instance 3 | 60.2 | 1.4 | 1.321 | 1.293 | 0.915 | 0.007 | 47.2 |
| GTX 970 | Instance 4 | 85.5 | 0.7 | 3.897 | 3.844 | 3.457 | 0.007 | 71.0 |
| GTX 970 | Instance 5 | 95.0 | 0.4 | 7.786 | 7.706 | 7.318 | 0.007 | 84.3 |
| GTX 970 | Instance 6 | 98.3 | 0.2 | 13.227 | 13.123 | 12.767 | 0.007 | 91.6 |
| GTX 770 | Instance 1 | 25.7 | 0.8 | 0.451 | 0.437 | 0.118 | 0.012 | 18.7 |
| GTX 770 | Instance 2 | 39.7 | 1.2 | 0.622 | 0.604 | 0.281 | 0.012 | 28.1 |
| GTX 770 | Instance 3 | 63.8 | 1.3 | 1.392 | 1.364 | 1.007 | 0.011 | 47.3 |
| GTX 770 | Instance 4 | 86.6 | 0.7 | 4.682 | 4.623 | 4.233 | 0.011 | 71.2 |
| GTX 770 | Instance 5 | 95.6 | 0.4 | 9.893 | 9.815 | 9.440 | 0.011 | 84.4 |
| GTX 770 | Instance 6 | 98.7 | 0.2 | 17.115 | 17.012 | 16.663 | 0.011 | 91.4 |
| GTX 560 | Instance 1 | 27.1 | 1.0 | 0.645 | 0.633 | 0.203 | 0.022 | 18.6 |
| GTX 560 | Instance 2 | 39.0 | 1.0 | 0.938 | 0.913 | 0.477 | 0.022 | 28.2 |
| GTX 560 | Instance 3 | 63.1 | 1.1 | 2.345 | 2.317 | 1.852 | 0.022 | 47.7 |
| GTX 560 | Instance 4 | 86.6 | 0.6 | 8.391 | 8.334 | 7.848 | 0.022 | 71.0 |
| GTX 560 | Instance 5 | 95.9 | 0.4 | 20.243 | 20.163 | 19.670 | 0.022 | 84.1 |
| GTX 560 | Instance 6 | 98.7 | 0.2 | 36.870 | 36.761 | 36.295 | 0.022 | 91.9 |

Table 13 presents a comparison of the results of GPU performance on a plain. The experimental results show that the computing speed of the GTX970 was faster overall than

those of the other GPUs, followed by the GTX770. In the case of Instance 6, the GTX970 was approximately 1.3 times and 2.8 times faster than the GTX770 and GTX560, respectively. The GTX770 was approximately 2.16 times faster than the GTX560. The above results seem to indicate that the number of CUDA cores was the factor that most influenced the computing speed. The PODs of all solutions were calculated in parallel, and the number of POD tasks is 250,000. The calculation is repeated every generation. Each POD calculation does not require high performance, and a large number of small POD tasks is well suited to concurrent processing on a GPU. Thus, the more CUDA cores a GPU has, the more arithmetic operations can be performed in parallel, and the shorter the computing time of the GPU becomes. Figure 13 shows a graph comparing the computing time in the GPU and CPU experiments according to the scenarios and the number of deployed sensors, and Figure 14 shows a graph comparing the computing time between GPUs for plains. Figure 15 shows an example of the convergence results among the experiments.

[Figure 10: Confidence interval graph of plains. Axes: detection rate versus instance (Instances 1–6); series: CPU, GTX 970.]

[Figure 11: Confidence interval graph of Scenario 1. Axes: detection rate versus instance (Instances 1–6); series: CPU, GTX 970.]

[Figure 12: Confidence interval graph of Scenario 2. Axes: detection rate versus instance (Instances 1–6); series: CPU, GTX 970.]

[Figure 13: Comparison of CPU and GPU computing time according to scenarios and the number of sensors. (a) CPU versus GPU (plain). (b) CPU versus GPU (Scenario 1). (c) CPU versus GPU (Scenario 2). Axes: time (s) versus instance; series: CPU, GTX 970.]

[Figure 14: Comparison of computing time between NVIDIA GeForce GPUs. Axes: computation time (s) versus instance; series: GTX 560, GTX 770, GTX 970.]

[Figure 15: An example of convergence results of the sensor deployment. Axes: detection rates versus generations (0–100).]

[Figure 16: Computing time according to the number of vehicles. Axes: time (ms) versus vehicles (10–100).]
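The speed-up factors quoted in this section can be re-derived from the "Eval" columns for Instance 6. The short sketch below copies those timings from Tables 10–13; the dictionary names and the helper `speedup` are introduced here only for illustration.

```python
# "Eval" times in seconds for Instance 6, copied from Tables 10-13.
cpu_eval = {"plain": 743.108, "scenario 1": 728.892, "scenario 2": 730.076}
gpu_eval = {"plain": 13.123, "scenario 1": 13.139, "scenario 2": 13.139}
gpu_plain = {"GTX 970": 13.123, "GTX 770": 17.012, "GTX 560": 36.761}


def speedup(slow, fast):
    """How many times shorter the second timing is than the first."""
    return slow / fast


for terrain in cpu_eval:
    ratio = speedup(cpu_eval[terrain], gpu_eval[terrain])
    print(f"CPU vs GTX 970 ({terrain}): {ratio:.1f}x")

print(f"GTX 770 vs GTX 970: {speedup(gpu_plain['GTX 770'], gpu_plain['GTX 970']):.2f}x")
print(f"GTX 560 vs GTX 970: {speedup(gpu_plain['GTX 560'], gpu_plain['GTX 970']):.2f}x")
print(f"GTX 560 vs GTX 770: {speedup(gpu_plain['GTX 560'], gpu_plain['GTX 770']):.2f}x")
```

The ratios fall between 55 and 57 for the CPU comparison and come out close to 1.3, 2.8, and 2.16 for the GPU comparison, in line with the factors stated in the text.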

8. Conclusion

This paper describes the use of a generational GA and probabilistic sensing models in WSN environments to address issues relating to barrier coverage. The POD calculation and the vehicle movements were processed in parallel to reduce the computing time of the generational GA. A comparison of the CPU and GPU experiments showed that the quality of the results of the GPU experiment approximated that of the CPU experiment, while the computing speed became approximately 55-56 times faster. By comparing several GPUs on plains, the GPU specifications were analyzed to understand which factors influenced the computing time. In the future, we will use various GA operators that are convenient for solving 2D problems, for example, [12], and aim to improve the quality of the solutions. We also intend to make the sensor deployment more realistic by taking into consideration factors such as mobile sensors [6], weather impact [28], lifetime, various deployment methods [29–32], and topographical circumstances, and to increase the parallelization level so that the computing time approaches real time.

[Figure 17: Optimization of sensor deployment on a plain. (a) Initial deployment. (b) Optimal deployment.]

[Figure 18: Optimization of sensor deployment on Scenario 1. (a) Initial deployment. (b) Optimal deployment.]

[Figure 19: Optimization of sensor deployment on Scenario 2. (a) Initial deployment. (b) Optimal deployment.]

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (no. 2015R1D1A1A01060105). This research was also supported by the Gachon University Research Fund of 2015 (GCU-20150030).

References

[1] D. S. Deif and Y. Gadallah, "Classification of wireless sensor networks deployment techniques," IEEE Communications Surveys & Tutorials, vol. 16, no. 2, pp. 834–855, 2014.
[2] B. Wang, "Coverage problems in sensor networks: a survey," ACM Computing Surveys, vol. 43, no. 4, article 32, 2011.
[3] K. Chakrabarty, S. S. Iyengar, H. Qi, and E. Cho, "Grid coverage for surveillance and target location in distributed sensor networks," IEEE Transactions on Computers, vol. 51, no. 12, pp. 1448–1453, 2002.
[4] B. Carter and R. Ragade, "An extensible model for the deployment of non-isotropic sensors," in Proceedings of the IEEE Sensors Applications Symposium (SAS '08), pp. 22–25, IEEE, Atlanta, Ga, USA, February 2008.
[5] Y. Zou and K. Chakrabarty, "Sensor deployment and target localization based on virtual forces," in Proceedings of the Twenty-Second Annual Joint Conference of the IEEE Computer and Communications (INFOCOM '03), vol. 2, pp. 1293–1303, IEEE, San Francisco, Calif, USA, March-April 2003.
[6] M. Cardei and J. Wu, "Coverage in wireless sensor networks," in Handbook of Sensor Networks, CRC Press, 2005.
[7] V. D. Ta, S. C. Huang, and H. T. T. Binh, "Covering the target objects with mobile sensors by using genetic algorithm in wireless sensor networks," Journal of Computers, vol. 10, no. 5, pp. 300–308, 2015.
[8] L. Zhao, G. Bai, Y. Jiang, H. Shen, and Z. Tang, "Optimal deployment and scheduling with directional sensors for energy-efficient barrier coverage," International Journal of Distributed Sensor Networks, vol. 2014, Article ID 596983, 9 pages, 2014.
[9] J.-H. Seo, Y.-H. Kim, H.-B. Ryou, S.-H. Cha, and M.-H. Jo, "Optimal sensor deployment for wireless surveillance sensor networks by a hybrid steady-state genetic algorithm," IEICE Transactions on Communications, vol. 91, no. 11, pp. 3534–3543, 2008.
[10] L. M. J. Lamm, Develop measures of effectiveness and deployment optimization rules for networked ground micro-sensors [M.S. thesis], Division of Systems Engineering, School of Engineering and Applied Science, University of Virginia, Charlottesville, Va, USA, 2001.
[11] CUDA-Nvidia, Compute Unified Device Architecture C Programming Guide, 2014.
[12] D. E. Goldberg, Genetic Algorithms in Search, Optimization, and Machine Learning, Addison-Wesley, 1989.
[13] Y. Yoon and Y.-H. Kim, "An efficient genetic algorithm for maximum coverage deployment in wireless sensor networks," IEEE Transactions on Cybernetics, vol. 43, no. 5, pp. 1473–1483, 2013.
[14] J. H. Kim, B. K. Park, J. Y. Lee, and J. S. Won, "Determining optimal sensor locations in freeway using genetic algorithm-based optimization," Engineering Applications of Artificial Intelligence, vol. 24, no. 2, pp. 318–324, 2011.
[15] J. Shi and C. Zhou, "Two-stage dynamic sensor deployment strategy based on virtual force and genetic algorithm in wireless sensor networks," International Journal of Education and Management Engineering, vol. 2, no. 1, pp. 1–8, 2012.
[16] N. Unaldi, S. Temel, and V. K. Asari, "Method for optimal sensor deployment on 3D terrains utilizing a steady state genetic algorithm with a guided walk mutation operator based on the wavelet transform," Sensors, vol. 12, no. 4, pp. 5116–5133, 2012.
[17] D. B. Jourdan and O. L. de Weck, "Layout optimization for a wireless sensor network using a multi-objective genetic algorithm," in Proceedings of the IEEE 59th Vehicular Technology Conference (VTC '04), pp. 2466–2470, IEEE, May 2004.
[18] D. B. Jourdan and O. L. de Weck, "Multi-objective genetic algorithm for the automated planning of a wireless sensor network to monitor a critical facility," in Sensors, and Command, Control, Communications, and Intelligence (C3I) Technologies for Homeland Security and Homeland Defense III, Proceedings of SPIE, pp. 565–575, Orlando, Fla, USA, April 2004.
[19] B. Carter and R. Ragade, "An extensible model for the deployment of non-isotropic sensors," in Proceedings of the 3rd IEEE Sensors Applications Symposium (SAS '08), pp. 22–25, Atlanta, Ga, USA, February 2008.
[20] B. Carter and R. Ragade, "A probabilistic model for the deployment of sensors," in Proceedings of the IEEE Sensors Applications Symposium (SAS '09), pp. 7–12, IEEE, New Orleans, La, USA, February 2009.
[21] D. S. Deif and Y. Gadallah, "Wireless Sensor Network deployment using a variable-length genetic algorithm," in Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC '14), pp. 2450–2455, IEEE, Istanbul, Turkey, April 2014.
[22] S. Indhumathi and D. Venkatesan, "Improving coverage deployment for dynamic nodes using genetic algorithm in wireless sensor networks," Indian Journal of Science and Technology, vol. 8, no. 16, 2015.
[23] I. Harvey, "The microbial genetic algorithm," in Advances in Artificial Life. Darwin Meets von Neumann, pp. 126–133, Springer, Berlin, Germany, 2011.
[24] Y.-J. Choi, G.-W. Jeong, Y.-H. Seo, and H. S. Yang, "Game-theoretic camera selection using inference tree method for a wireless visual sensor network," International Journal of Distributed Sensor Networks, vol. 2014, Article ID 839710, 10 pages, 2014.
[25] G. Feng, M. Liu, and G. Wang, "Genetic algorithm based optimal placement of PIR sensors for human motion localization," Optimization and Engineering, vol. 15, no. 3, pp. 643–656, 2014.
[26] M. K. Watfa and S. Commuri, "Optimal 3-dimensional sensor deployment strategy," in Proceedings of the 3rd IEEE Consumer Communications and Networking Conference (CCNC '06), vol. 2, pp. 892–896, IEEE, January 2006.
[27] R. V. Hogg and E. A. Tanis, Probability and Statistical Inference, Macmillan, New York, NY, USA, 2014.
[28] J.-H. Seo, Y. H. Lee, and Y.-H. Kim, "Feature selection for very short-term heavy rainfall prediction using evolutionary computation," Advances in Meteorology, vol. 2014, Article ID 203545, 15 pages, 2014.
[29] N. A. B. Ab Aziz, A. W. Mohemmed, and M. Y. Alias, "A wireless sensor network coverage optimization algorithm based on particle swarm optimization and voronoi diagram," in Proceedings of the IEEE International Conference on Networking, Sensing and Control (ICNSC '09), pp. 602–607, Okayama, Japan, March 2009.
[30] F. H. Khan, R. Shams, M. Umair, and M. Waseem, "Deployment of sensors to optimize the network coverage using genetic algorithm," Sir Syed University Research Journal of Engineering and Technology, vol. 2, no. 1, pp. 8–11, 2012.
[31] X. Wang, S. Wang, and J. Ma, "Dynamic deployment optimization in wireless sensor networks," in Proceedings of Intelligent Control and Automation: International Conference on Intelligent Computing, ICIC 2006 Kunming, China, August 16–19, 2006, vol. 344 of Lecture Notes in Control and Information Sciences, pp. 182–187, Springer, Berlin, Germany, 2006.
[32] C.-P. Chen, S. C. Mukhopadhyay, C.-L. Chuang et al., "A hybrid memetic framework for coverage optimization in wireless sensor networks," IEEE Transactions on Cybernetics, vol. 45, no. 10, pp. 2309–2322, 2015.
