Efficient WiFi Deployment Algorithms based on Realistic Mobility Characteristics

Tian Wang
City University of Hong Kong, Computer Science
83 Tat Chee Avenue, Kowloon, Hong Kong
Email: [email protected]

Guoliang Xing
Michigan State University, Computer Science
East Lansing, Michigan, USA
Email: [email protected]

Minming Li and Weijia Jia
City University of Hong Kong, Computer Science
83 Tat Chee Avenue, Kowloon, Hong Kong
Email: {minmli, itjia}@cityu.edu.hk

Abstract—Recent years have witnessed the emergence of numerous new Internet services for mobile users. Supporting mobile applications via public WiFi networks has received significant research attention due to the drastic increase of penetration rate of 802.11-based networks. Nevertheless, recent empirical studies showed that unplanned WiFi networks cannot provide satisfactory Quality of Service for interactive mobile applications due to intermittent network connectivity. In this paper, we exploit realistic mobility characteristics of users to deploy WiFi Access Points (APs) for continuous service for mobile users. We study two AP deployment problems that aim to maximize the continuous user coverage and to minimize the AP deployment cost, respectively. Both problems are formulated based on mobility graphs that capture the statistical mobility patterns of users. We prove that both problems are NP-hard. We develop several optimal and approximation algorithms with provable performance bounds for different topologies of mobility graphs. The effectiveness of our approaches is validated by extensive simulations using real user mobility traces.

978-1-4244-7489-9/10/$26.00 ©2010 IEEE

I. INTRODUCTION

Recent years have witnessed a drastic increase in the penetration rate of 802.11-based WiFi networks in public environments [1]. Numerous WiFi hotspots have been deployed in metropolitan areas to provide users with broadband Internet access. In addition, WiFi has become a built-in feature of a wide range of mobile devices, such as PDAs, smart phones, handheld game consoles, and laptops. These devices enable users to access the Internet through WiFi access points (APs).

Along with the increasing availability of WiFi connectivity, supporting mobile Internet applications via public WiFi networks has received significant research attention [2], [3], [4], [5]. However, many mobile applications impose stringent performance requirements, including sufficient connectivity time, short delay, and high bandwidth. Recent empirical studies showed that the current WiFi network density can only provide intermittent network connectivity. In particular, Vladimir et al. [3] studied the performance of WiFi networks based on measurements from 290 "drive hours" in and around the Boston metropolitan area. Their studies showed that the median duration of link-layer connectivity from each AP is only 13 seconds, while the mean time interval between the discovery of two adjacent APs is 75 seconds. As a result, interactive and time-sensitive mobile applications such as video-on-demand and VoIP cannot receive satisfactory quality of service.

The poor mobility support of WiFi networks is largely due to their unplanned deployment. Originally designed for private wireless LAN applications, 802.11 APs are often deployed by different owners in an uncoordinated manner, which inevitably results in insufficient coverage over large areas. However, large-scale planned WiFi deployments have become feasible due to the drastic drop in the prices of commercial-off-the-shelf 802.11 APs in the last few years. Many college campuses can provide sporadic WiFi coverage over geographic areas of a few square kilometers. Moreover, a few projects have been launched to provide WiFi access for major metropolitan areas such as Sydney [6] and Hong Kong [7]. For example, more than 2,000 APs were available for free WiFi access in Hong Kong in mid-2009.

In this paper, we study the problem of deploying WiFi APs to provide continuous service for mobile users. Our study is motivated by the critical need to satisfy the stringent performance requirements of interactive and time-sensitive mobile applications. At the same time, the deployment cost, such as the number of APs, must be minimized for large-scale WiFi networks. Our key contribution is to improve WiFi coverage and minimize the AP deployment cost by exploiting realistic mobility models that capture the statistical movement patterns of users. Although a vast number of mobility models have been proposed in the literature, most of them assume some form of random walk on a flat plane, which does not reflect real-world human movements [8], [9]. Recently, several realistic mobility models have been developed based on movement traces collected from various sources (e.g., WiFi usage [10], [11]). These models account for the speed and geographic distribution of users.
In this paper, we study efficient AP deployment based on a new graph structure called mobility graphs that can incorporate existing statistical mobility models. We make the following contributions in this paper:
• We formulate two AP deployment problems, the Maximum Continuous Coverage (MCC) and the Minimum Deployment Cost (MDC) problems, which aim to provide the maximum continuous WiFi coverage to mobile users at the minimum deployment cost. Both problems are formulated based on mobility graphs that capture the statistical movement patterns of users on a map.
• We prove that both the MCC and MDC problems are NP-hard. We also develop optimal algorithms for tree topologies and several approximation algorithms with provable performance bounds for general graph topologies.
• The effectiveness of our approaches is validated by extensive simulations using real user mobility traces. Our results show that WiFi deployment based on realistic user mobility characteristics can significantly improve the network coverage.

The rest of the paper is organized as follows. Section II reviews related work. Two formulations of the AP deployment problem are presented in Section III. The optimal and approximation solutions to the two problems are given in Sections IV and V, respectively. Section VI presents the evaluation results and Section VII concludes the paper.

II. RELATED WORK

Recently, using WiFi networks to provide Internet access for mobile users has received considerable attention. Vladimir et al. [3] studied the performance of supporting mobile users with unplanned APs. They reported the results of a measurement study carried out over 290 "drive hours" and found that the median duration of link-layer connectivity for a vehicle is 13 seconds and the median connection upload bandwidth is 30 KBytes/s. Cabernet [2] is a system for providing moving vehicles with Internet access using WiFi APs. However, Cabernet only provides intermittent network connectivity with the current WiFi deployment density. As a result, Cabernet is only viable for non-interactive or low-interactivity applications, such as email and web browsing. A recent measurement study on 802.11 wireless channels [12] shows that the mobility of users results in highly dynamic wireless links, which can significantly affect the performance of mobile applications. A channel-aware rate adaptation algorithm is developed in [12] to help select the transmission rate of a WiFi connection. The above studies have shown that unplanned WiFi networks are only viable for applications that can tolerate intermittent connectivity.

There is a vast literature on mobility models. Many existing studies in mobile ad hoc networks (MANETs) are conducted based on simplistic mobility models [8], [9], [13], [14], [15], [16], which usually assume some form of random walk on a flat plane. These models are easy to implement in simulations and can also facilitate statistical analysis of large-scale protocols and systems. However, they do not capture the way people move in reality, which renders the conclusions based on these models questionable, as has been recently shown in the literature [17], [18], [19].

To overcome this lack of realism, there have been a few recent studies on extracting mobility models from traces of moving users. Patterson et al. used GPS data to identify common destinations in a user's daily life [20]. Hui et al. used small Bluetooth devices carried by 54 participants of IEEE Infocom 2005 to study the implications of human mobility patterns [21]. Several studies have been conducted based on the traces collected on the Dartmouth campus [22], [23]. In these traces, whenever clients authenticate, associate, roam, disassociate or de-authenticate with an AP, the timestamp in seconds, the client MAC address, the AP name and the event type are recorded. Based on real traces, Kim et al. [24], [10] propose a method to estimate the physical location of users. They showed that the speed and pause time follow a log-normal distribution. Moreover, the direction of movements closely reflects the direction of roads and walkways, which conforms to intuition. Yoon et al. [11] also propose a system that is able to derive a realistic mobility model from the traces of WiFi clients. They also develop a probabilistic mobility model to produce user movement patterns that are representative of real movement.

Zheng et al. [25] propose a metric called α-coverage that provides a guarantee on the interconnection gap of vehicular Internet access. However, the mobility characteristics of users are not considered in their work. Moreover, in contrast to [25], we focus on providing continuous network coverage for mobile users, which is required by many Internet applications.

III. PROBLEM DESCRIPTION

Our goal is to (re)deploy WiFi APs based on mobility models that characterize the statistical movement patterns of users. In the following, we first describe how to build realistic mobility models and then introduce the definitions of two AP deployment problems.

A. Building Mobility Graphs

Although a vast number of mobility models have been proposed in the literature, most of them do not reflect real-world human movements [11]. For instance, the widely adopted Random Way Point models choose random destinations for users in a bounded or open area, which is not consistent with the way real users move about in urban settings.
Recently, several realistic mobility models have been developed based on movement traces collected from various sources (e.g., WiFi usage [22], [23] and people tracking systems [26]). In this work, we adopt a statistical mobility model similar to several models proposed recently [27], [10], [11], [28], [29]. For a given urban area, we build a mobility graph $G\langle N, E\rangle$ to represent the characteristics of user mobility in the area. Each edge corresponds to a road section on which users travel. Each node of the graph represents a building that is located on a road, or an intersection of two roads. Among the nodes in the graph, there are some special nodes, referred to as "popular sites", which are places frequently visited by users. Studies based on real user traces [27], [10] showed that most user traffic starts from popular sites. For example, in a campus setting, popular sites include academic buildings, libraries, and parking garages. Each popular site is associated with a set of user paths that start from it and end on other nodes in the graph. Popular sites can be identified by existing methods such as the one described in [10].
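One possible in-memory representation of such a mobility graph is sketched below. The class name `MobilityGraph`, its fields, and the toy numbers are assumptions made for illustration and do not come from the paper; each edge stores the number of APs needed to cover it, and each path starting at the popular site stores the probability that a user takes it.

```python
from dataclasses import dataclass, field


@dataclass
class MobilityGraph:
    """Hypothetical container for a mobility graph with one popular site."""
    popular_site: str
    edge_cost: dict = field(default_factory=dict)  # frozenset({u, v}) -> APs needed
    paths: dict = field(default_factory=dict)      # tuple of nodes -> probability

    def add_edge(self, u, v, cost):
        # Roads are undirected, so the edge key ignores node order.
        self.edge_cost[frozenset((u, v))] = cost

    def add_path(self, nodes, probability):
        # A path is a sequence of consecutive edges starting from the popular site.
        if nodes[0] != self.popular_site:
            raise ValueError("a path must start from the popular site")
        for u, v in zip(nodes, nodes[1:]):
            if frozenset((u, v)) not in self.edge_cost:
                raise ValueError(f"unknown edge {u}-{v}")
        self.paths[tuple(nodes)] = probability

    def path_cost(self, nodes):
        # Cost of a path = total cost of the edges on it.
        return sum(self.edge_cost[frozenset((u, v))]
                   for u, v in zip(nodes, nodes[1:]))


# Toy example (made-up numbers): popular site A with two user paths.
graph = MobilityGraph(popular_site="A")
graph.add_edge("A", "B", 3)
graph.add_edge("B", "D", 2)
graph.add_edge("A", "C", 4)
graph.add_path(("A", "B", "D"), 0.6)
graph.add_path(("A", "C"), 0.4)
```

Storing paths explicitly, rather than deriving them from the edges, mirrors the fact that path probabilities are estimated from user statistics and are not a property of the road layout alone.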


A path is defined as a sequence of consecutive edges starting from a popular site. Each path is associated with a probability $p_i$ that represents the probability that the user takes the path. Probability $p_i$ is typically calculated based on the user distribution on the path from user statistics. Therefore, it can also be interpreted as the fraction of users that take the path. Each edge is associated with a cost $l_i$, which is the number of APs needed to cover the edge. When some APs are already deployed in a region, the mobility graph excludes them from the computation of edge costs. We formally define the AP coverage model in Section III-B.

To simplify the graph representation, we focus on the case with only one popular site in the remainder of the paper. When there are multiple popular sites, we can introduce a virtual popular site that connects to the original starting popular sites. The costs of the edges between the virtual and the original popular sites are set to zero, and the weights can be calculated based on the user population distribution over the original starting popular sites. Then, the probability of each path originating from an original starting popular site is multiplied by the weight of the edge from that popular site to the virtual popular site. With this conversion, the case of multiple popular sites can be dealt with in the same way as the case of a single popular site.

The creation of the above mobility graphs requires either statistics of user volume or mobility models that are extracted from such statistics. For instance, several studies have built statistical mobility models from WiFi association logs [10], [11] and people tracking systems like Placelab [26]. A potential issue of using the user traces gathered by existing APs is that the low density of APs may affect the accuracy of the resulting mobility graphs. However, recent studies [11] showed that, by utilizing statistical techniques, a small amount of user traffic information is enough to generate accurate mobility models.

We now use an example to illustrate how to build mobility graphs from user traces collected on the Dartmouth campus [22], [23]. Fig. 1 is the map of a corner of Dartmouth College. The blocks in the map represent buildings or road intersections, and the edges represent the paths connecting them [11]. The collected data traces are composed of the timestamp, AP name, MAC address and association information. Each trace is a list of successive APs accessed by a user. Based on the timestamps and the locations of the associated APs, we can obtain the paths the users take. Based on the statistics of the path frequency, we can finally obtain the probability that a user takes each path. Fig. 2 is a mobility graph converted from the map shown in Fig. 1. The number on each edge is the cost associated with that edge, which is the number of APs needed to cover it. The probability with which the user takes each path from node A is also shown.

Fig. 1. The map of a part of Dartmouth College.

Fig. 2. The mobility graph converted from the map shown in Fig. 1. Node A is a popular site. Each path is labeled by the probability that the user takes the path. Each edge is labeled by its cost, the number of APs needed to cover it.

B. AP Coverage Model

The objective of our problem is to deploy APs to provide assured coverage for mobile users. That is, the probability that a mobile user can associate with at least one AP wherever it moves within a given region should be maximized. Such an objective is consistent with the requirements of a number of emerging mobile applications. For instance, vehicular Internet access may require sufficient continuous coverage from the APs deployed along the roadside in order to provide satisfactory Internet service to passengers.

We adopt the following coverage model for APs. An AP covers part of a road if it satisfies certain QoS requirements of users. The specific definition of QoS varies with applications. For instance, a WiFi network deployed in a populated area may require each AP to serve a given number of users with a certain bandwidth. For given QoS requirements, the number of APs needed to cover a road can be obtained by field tests.

Based on this coverage model, we make the following assumptions for our problem. First, we assume that the cost of an edge in a mobility graph is equal to the number of APs required to cover it. We note that this assumption is made mainly for ease of presentation, as our AP deployment algorithms can work with any positive edge costs. For instance, the cost of an edge may be adapted to account for AP installation and hardware costs in practice. We also assume that the nodes in a mobility graph have already been covered by APs. This is reasonable as the nodes are popular sites (e.g., buildings) where people spend most of their time and hence are often required to be covered. Our objective then becomes achieving the coverage of the edges connecting nodes on a mobility graph.

In the rest of this paper, we consider two different formulations of AP deployment for continuous coverage. In the first formulation, there is a budget on the maximum number of APs that are available. The objective is to maximize the percentage of time users can receive continuous WiFi coverage without interruption. In the second formulation, the percentage of time that users can receive continuous WiFi service must be no lower than a given threshold, while the number of APs that need to be deployed must be minimized.

IV. AP DEPLOYMENT FOR MAXIMUM COVERAGE

A typical requirement of AP deployment is to maximize the network coverage under a given budget on the maximum number of available APs. We refer to this problem as the Maximum Continuous Coverage (MCC) problem. Due to the probabilistic nature of user movement, our problem formulation must account for the probability that each path is taken by users. Specifically, our objective is to deploy the given APs to continuously cover a selected subset of paths such that the total probability that users take these paths is maximized. We now formally define the MCC problem.

Definition 1: Maximum Continuous Coverage (MCC) problem: Given a mobility graph $G\langle N, E\rangle$ and $n$ possible user paths ($\{r_k\}$, $1 \le k \le n$). The probability ($p_k$) and cost ($l_k$) of path $r_k$ are equal to the probability that users take that path and the cost of all edges on the path, respectively.
The objective is to select a subset of paths that maximizes the sum of the paths' probabilities under the constraint that the sum of the paths' costs is no larger than a given constant ($c$).

We have the following theorem regarding the complexity of the MCC problem.

Theorem 1: The MCC problem is NP-hard.

Proof: To prove that the problem is NP-hard, we reduce from the 0-1 knapsack problem, which is a well-known NP-hard problem. A special case of the MCC problem is that the mobility graph has a star topology, i.e., the paths do not share any edges with each other. Then, this special case of the MCC problem can be formulated as:

$$\text{maximize} \quad \sum_{k=1}^{n} p_k x_k \qquad (1)$$

$$\text{subject to} \quad \sum_{k=1}^{n} l_k x_k \le c \qquad (2)$$

where variable $x_k$ represents whether path $r_k$ is chosen and has value 0 or 1. The above formulation is exactly the 0-1 knapsack problem.

The key difference between the general case of our problem and the knapsack problem is that the paths may share some parts with each other. That is, there may be some overlap among different paths. As a result, the existing algorithms for the 0-1 knapsack problem cannot be applied to our problem directly.

A. Optimal algorithms for tree topologies

Although the general version of the MCC problem is NP-hard, optimal solutions exist for the special case where the user paths form a tree, i.e., there is no loop in the mobility graph. We now describe an optimal dynamic programming algorithm (referred to as MCC_opt) for this special case.

An important property of the tree topology is that a user path must end on a leaf node. Therefore, we add a variable $p_v$ to each leaf node $v$ to denote the probability that the path from the root node to the leaf node is taken by users. Suppose the total number of APs to be deployed is $N$. We define the ramification-tree of a node $v$ as the subtree rooted at $v$ plus the edge from $v$ to its parent node. The number of APs needed to cover this edge is denoted by $n_v$. When $v$ is the root node, we define $n_v = 0$.

We first transform the original tree into a binary tree where the left child of a node is its first child and the right child of a node is its next sibling. Based on the binary tree, we define $\mathrm{opt}(v, x)$ as the maximum probability that can be obtained when $x$ APs are to be deployed on the ramification-tree rooted at $v$. Denote node $A$ as the left child of the root; then $\mathrm{opt}(A, N)$ is the solution of our problem. For any non-root node $v$, suppose $v_L$ and $v_R$ are the left child and the right child of $v$, and there are $x_L$ and $x_R$ APs to be deployed on the ramification-trees rooted at $v_L$ and $v_R$, respectively. We then have the following recursive equation:

$$\mathrm{opt}(v, x) = \max_{x_L + x_R = x} \{ \mathrm{opt}(v_L, x_L - n_v) + \mathrm{opt}(v_R, x_R) \} \qquad (3)$$

In the equation above, $x_L + x_R = x$. For a node $v$ with no right child, $\mathrm{opt}(v_R, x_R) = 0$. For a node $v$ with no left child, $\mathrm{opt}(v_L, x_L - n_v)$ can be computed using the following equation:

$$\mathrm{opt}(v_L, x_L - n_v) = \begin{cases} 0 & \text{if } x_L < n_v \\ p_v & \text{otherwise} \end{cases}$$

For every node, the $x$ in $\mathrm{opt}(v, x)$ has at most $N + 1$ possible values ($0$ through $N$). If the variable $x$ is fixed, as $x_L + x_R = x$, there are at most $x + 1$ combinations of $\{\mathrm{opt}(v_L, x_L - n_v) + \mathrm{opt}(v_R, x_R)\}$. Therefore, each node requires on the order of $N \times x$ computations. As there are $m$ nodes in total and $x \le N$, the total computation time is $O(m \times N \times N) = O(mN^2)$.
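The dynamic program of Section IV-A can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the function name `mcc_opt`, the dictionary-based tree encoding, and the toy tree are assumptions. The sketch also assumes that a term with a negative AP budget ($x_L < n_v$) contributes 0 for internal nodes as well as for leaves.

```python
from functools import lru_cache


def mcc_opt(children, edge_cost, leaf_prob, root, budget):
    """Maximum total path probability coverable with `budget` APs on a tree.

    children:  dict node -> ordered list of child nodes
    edge_cost: dict non-root node v -> n_v, APs needed for the edge (parent, v)
    leaf_prob: dict leaf node v -> p_v, probability of the root-to-v path
    """
    # First-child / next-sibling view of the tree (the binary tree of Sec. IV-A).
    first_child, next_sibling = {}, {}
    for parent, kids in children.items():
        if kids:
            first_child[parent] = kids[0]
        for left, right in zip(kids, kids[1:]):
            next_sibling[left] = right

    @lru_cache(maxsize=None)
    def opt(v, x):
        if v is None:  # missing child: nothing to gain
            return 0.0
        best = 0.0
        for x_left in range(x + 1):  # split x APs: own subtree vs. siblings
            remaining = x_left - edge_cost[v]
            if remaining < 0:
                own = 0.0  # edge into v cannot be covered (assumption for non-leaves)
            elif v not in first_child:
                own = leaf_prob[v]  # leaf reached: its path is fully covered
            else:
                own = opt(first_child[v], remaining)
            best = max(best, own + opt(next_sibling.get(v), x - x_left))
        return best

    return opt(first_child.get(root), budget)


# Toy tree (made-up for illustration): root a, leaves c, d, e, f, g.
tree = {"a": ["b", "f", "g"], "b": ["c", "d", "e"]}
costs = {"b": 1, "c": 1, "d": 1, "e": 1, "f": 2, "g": 2}
probs = {"c": 0.2, "d": 0.2, "e": 0.2, "f": 0.2, "g": 0.2}
best_with_4_aps = mcc_opt(tree, costs, probs, root="a", budget=4)
```

With 4 APs, the best choice in this toy tree is to cover the shared edge into `b` and the three edges below it, which covers three of the five paths.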


B. Approximation algorithms for general topologies

The optimal algorithms described in Section IV-A are designed for tree topologies. In this section, we focus on the design of approximation algorithms for general graph topologies.

We now describe two approximation algorithms for the MCC problem. We start with Algorithm MCC_Combine. Before giving the algorithm, we introduce two definitions. 1) Independent path: a path that does not share any edge with other paths. 2) Dependent path: a path that shares at least one edge with other paths. Let $m$ represent the number of independent paths. Then there are $k = n - m$ dependent paths. Let $P = \{p_1, p_2, ..., p_m\}$ and $L = \{l_1, l_2, ..., l_m\}$ represent the sets of probabilities and costs of the independent paths. We define the corresponding profit density as $p_i / l_i$. The key idea is to try out all the combinations that are composed of only dependent paths. For the independent paths, we simply select the paths with the maximum profit density one by one. We first sort the independent paths in order of nonincreasing profit density. We then introduce a procedure MCC_Select(I, M), which computes the total probability of the paths selected in order of nonincreasing profit density, after the paths in $I$ have been selected. That is, MCC_Select(I, M) selects the paths $r_i$ one by one until the APs left cannot cover the next path.

Algorithm 1 MCC_Select(I, M)
Input: set I, composed of the paths already selected; all the independent paths {r_1, r_2, ..., r_m} in order of nonincreasing profit density; M, the number of APs available
Output: P_out, the probability achieved, which is equal to the maximum percentage of time users can receive continuous WiFi coverage
1: P_out = 0; i = 1
2: if l_i ≤ M then P_out = P_out + p_i; M = M − l_i; I = I ∪ {r_i}   /* the path r_i is selected and added into set I */
3: i = i + 1   /* prepare to handle the next independent path */
4: if i ≤ m then goto step 2
5: return P_out

Clearly, procedure MCC_Select(I, M) runs in time $O(m)$. We now describe an algorithm that uses procedure MCC_Select(I, M) to produce approximate solutions. We first introduce the following definitions. For a set of user paths $I$, the size of $I$ (denoted by $S_I$) is the number of paths in it. The cost of $I$ (denoted by $L_I$) is the total cost of the paths in it, where an edge shared by several paths is counted once, as in the example below. MCC_Combine first sets $P_{out}$ to 0 and then, for each combination $I$ composed of only dependent paths, computes the sum of the probabilities ($P_I$) of these paths. If the total cost $L_I$ is no larger than $c$, the combination can be covered and $c - L_I$ APs are still available. Procedure MCC_Select(I, M) is then called to select the independent paths. Every time we find a $P_{out}$ greater than the current one, we update the current value of $P_{out}$.

We now illustrate the basic idea of the MCC_Combine algorithm using the example shown in Fig. 3. There are 5 candidate paths, a−b−c, a−b−d, a−b−e, a−f and a−g, in Fig. 3, and each has an equal probability of 20% of being taken by users. The cost of each edge is shown on the edge and the number of APs available is 4. According to our definitions, paths a−f and a−g are independent paths, while paths a−b−c, a−b−d and a−b−e are dependent paths as they share the edge a−b. All the possible combinations of dependent paths are {a−b−c}, {a−b−d}, {a−b−e}, {a−b−c, a−b−d}, {a−b−c, a−b−e}, {a−b−d, a−b−e} and {a−b−c, a−b−d, a−b−e}. First, we choose the combination {a−b−c} and get the probability 20%. It costs 2 APs to cover the path and there are 2 APs left. Algorithm 1 is then used to select independent paths. As a result, path a−f is selected and $P_{out}$ = 40%. We apply the same process to each combination one by one and update $P_{out}$ if the current value is greater than the previous result. Finally, when the combination {a−b−c, a−b−d, a−b−e} is selected, we obtain the maximum $P_{out}$ = 60%. That is, these three paths will be covered by four APs and 60% of users receive continuous coverage.

[Figure: popular site a with edges a−b, b−c, b−d and b−e (cost 1 each) and edges a−f and a−g (cost 2 each); each of the five paths has probability 20%.]

Fig. 3. An example of MCC_Combine. Node a is a popular site. There are 5 paths in total. According to the MCC_Combine algorithm, the finally chosen paths are a−b−c, a−b−d and a−b−e.
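The greedy procedure MCC_Select and the enumeration over dependent-path combinations performed by MCC_Combine can be sketched as follows. This is an illustrative reading of the description above, not the authors' code. The function names and data layout are assumptions, the cost of a combination counts each shared edge once, and the empty combination is also tried, which is an addition so that instances without dependent paths are handled.

```python
from itertools import combinations


def mcc_select(independent, budget):
    """Greedy pass over independent paths given as (probability, cost) pairs."""
    gained = 0.0
    ranked = sorted(independent, key=lambda pc: pc[0] / pc[1], reverse=True)
    for prob, cost in ranked:  # nonincreasing profit density
        if cost <= budget:     # step 2 of Algorithm 1
            gained += prob
            budget -= cost
    return gained


def mcc_combine(edge_cost, dependent, independent, budget):
    """Try every combination of dependent paths, then fill up greedily.

    edge_cost:   dict edge -> APs needed
    dependent:   dict path name -> (probability, list of edges)
    independent: list of (probability, cost) pairs
    """
    names = list(dependent)
    best = 0.0
    for size in range(len(names) + 1):  # size 0 is the empty combination
        for combo in combinations(names, size):
            edges = set()
            for name in combo:
                edges.update(dependent[name][1])
            cost = sum(edge_cost[e] for e in edges)  # shared edges counted once
            if cost > budget:
                continue
            prob = sum(dependent[name][0] for name in combo)
            best = max(best, prob + mcc_select(independent, budget - cost))
    return best


# The example of Fig. 3: 4 APs, five paths with probability 20% each.
fig3_edge_cost = {("a", "b"): 1, ("b", "c"): 1, ("b", "d"): 1, ("b", "e"): 1}
fig3_dependent = {
    "a-b-c": (0.2, [("a", "b"), ("b", "c")]),
    "a-b-d": (0.2, [("a", "b"), ("b", "d")]),
    "a-b-e": (0.2, [("a", "b"), ("b", "e")]),
}
fig3_independent = [(0.2, 2), (0.2, 2)]  # paths a-f and a-g
fig3_result = mcc_combine(fig3_edge_cost, fig3_dependent, fig3_independent, 4)
```

On this input the best value found corresponds to covering the three dependent paths with four APs, in line with the 60% reported in the example.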

Algorithm 2 MCC_Combine
Input: the set of dependent paths
Output: P_out
1: P_out = 0
2: for all combinations I (composed of only dependent paths) whose cost L_I is no larger than c:
     P_I = Σ_{i∈I} p_i
     P_out = max{P_out, P_I + MCC_Select(I, c − L_I)}
3: return P_out

Algorithm 3 MCC_Group
Input: the set of paths
Output: P_out
1: Partition the user paths into groups that do not share any edge with each other.
2: For each group: enumerate all the possible combinations and compute their profit densities. Denote the maximum probability of all combinations as P_g*. Sort the combinations in nonincreasing order of profit density and remove the combinations with cost larger than c.
3: Select the combinations with the maximum density one by one until the total cost of the paths is greater than c. When a combination is selected, update the density of the other candidate combinations containing the paths in this combination.
4: Denote the sum of probabilities obtained so far as P_g. P_out = max{P_g, P_g*}
5: return P_out

Theorem 2: Denote $P_{opt}$ as the optimal solution of this problem and $p_m$ as the maximum probability among the independent paths. Then $P_{out} \ge P_{opt} - p_m$ and the time complexity of the algorithm is $O((n - k)2^k)$, where $k$ is the number of dependent paths.

Proof: If the optimal solution is composed of only dependent paths, then our algorithm yields exactly the optimal solution, since all combinations of dependent paths are examined in step 2 of the MCC_Combine algorithm and hence the optimal solution can be found. Otherwise, without loss of generality, we assume that the optimal solution $P_{opt}$ is obtained by selecting the following $j$ paths: $r'_1, r'_2, ..., r'_x, r^*_1, r^*_2, ..., r^*_y$, where $r'_1, ..., r'_x$ are $x$ independent paths and $r^*_1, ..., r^*_y$ are $y$ dependent paths ($x + y = j$). Suppose combination $I$ corresponds to $r^*_1, r^*_2, ..., r^*_y$ at some point in the execution of the algorithm. Let $r_z$ be the first path that cannot be selected during the execution of MCC_Select(I, M), so that $z - 1$ independent paths have been selected. Then the paths selected so far, together with $r_z$, have a total profit no lower than that of the paths $r'_1, r'_2, ..., r'_x, r^*_1, r^*_2, ..., r^*_y$. Therefore,

$$\sum_{i=1}^{y} p^*_i + \sum_{i=1}^{z} p_i \ge \sum_{i=1}^{y} p^*_i + \sum_{i=1}^{x} p'_i = P_{opt} \qquad (4)$$

As $p_m \ge p_z$,

$$P_{out} = \sum_{i=1}^{y} p^*_i + \sum_{i=1}^{z-1} p_i \ge P_{opt} - p_m \qquad (5)$$

For any given combination $I$, Algorithm 2 runs in polynomial time. There are $\binom{k}{i}$ combinations of size $i$. Hence step 2 of Algorithm 2 is executed $\sum_{i=1}^{k} \binom{k}{i} < 2^k$ times. Each execution takes at most $m = n - k$ time. Hence the total running time is bounded by $O((n - k)2^k)$.

The running time of this algorithm is exponential with respect to $k$, which may be large when many paths share edges with each other. To reduce the computation time, we now describe another approximation algorithm. The key idea is to divide these $k$ dependent paths into several groups such that the paths in one group do not share any part with the paths in another group. As a result, the size of each group may be smaller than $k$. For each group, the number of possible combinations may be much smaller than the number of combinations in algorithm MCC_Combine.

We now describe the details of the algorithm. First, we partition the paths into groups as follows. Initially, each path is a group by itself, and two groups are merged if they share some parts. The merging process is repeated until no group shares any part with another group. We define the size of a group as the number of paths in it. The path combinations in each group are then processed as follows. For each combination, we define its probability as the sum of the probabilities of all paths in it and its cost as the total cost of the paths. We then define the profit density of a combination as the ratio of its probability to its cost. For example, in Fig. 4, paths a−b−c and a−b−d form a combination. The cost of each edge is shown on the edge. That is, the costs of a−b, b−c and b−d are 1, 3 and 1, respectively. Then the density of the combination is

$$\frac{0.20 + 0.40}{1 + 3 + 1} = 0.12.$$

The pseudocode of the approximation algorithm is shown in Algorithm 3 (MCC_Group).

In the example of Fig. 4, initially there are 4 candidate paths, a−b−c, a−b−d, a−e and a−f, and suppose the number of APs available is 4. After grouping, paths a−b−c and a−b−d merge into one group (denoted by $R_{ab}$). Paths a−e and a−f each form a group of their own (denoted by $C_{ae}$ and $C_{af}$, respectively). Group $R_{ab}$ has 3 possible combinations: $C_{abc}$, $C_{abd}$ and $C_{abcd}$. Thus there are 5 combinations in total. We sort them in nonincreasing order of density: ($C_{abd}$, 0.2), ($C_{af}$, 0.15), ($C_{ae}$, 0.15), ($C_{abcd}$, 0.12), ($C_{abc}$, 0.05). According to our algorithm, we first choose the combination $C_{abd}$ as it is the one with the highest density. Then, we update the density of $C_{abcd}$ to be $\frac{0.6 - 0.4}{5 - 2} \approx 0.066$, because after edge a−b has been chosen the profit and cost of $C_{abcd}$ are changed. After this process, the combinations in nonincreasing order of density are ($C_{af}$, 0.15), ($C_{ae}$, 0.15), ($C_{abcd}$, 0.066), ($C_{abc}$, 0.05). At this point, we choose the combination $C_{af}$ and the algorithm terminates.

Fig. 4. An example of MCC_Group. Node a is a popular site. There are 4 paths in all and, according to the MCC_Group algorithm, the sequence of choosing paths is a−b−d, a−f, a−b−c, a−e.
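The grouping step of MCC_Group and the profit density of a combination can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `group_paths` and `density`, the union-find bookkeeping, and the data layout are not from the paper. The edge costs and the probabilities of a−b−c and a−b−d are the ones quoted in the Fig. 4 example.

```python
def group_paths(paths):
    """Partition paths so that paths in different groups share no edge.

    paths: dict path name -> list of edges
    Returns a sorted list of groups, each a sorted list of path names.
    """
    parent = {name: name for name in paths}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    owner = {}  # edge -> first path seen that uses it
    for name, edges in paths.items():
        for edge in edges:
            if edge in owner:
                parent[find(name)] = find(owner[edge])  # merge the two groups
            else:
                owner[edge] = name

    groups = {}
    for name in paths:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


def density(combination, paths, prob, edge_cost):
    """Profit density: total probability over total cost (shared edges once)."""
    covered = set()
    for name in combination:
        covered.update(paths[name])
    return sum(prob[n] for n in combination) / sum(edge_cost[e] for e in covered)


fig4_paths = {
    "abc": [("a", "b"), ("b", "c")],
    "abd": [("a", "b"), ("b", "d")],
    "ae": [("a", "e")],
    "af": [("a", "f")],
}
fig4_cost = {("a", "b"): 1, ("b", "c"): 3, ("b", "d"): 1}
fig4_prob = {"abc": 0.20, "abd": 0.40}
fig4_groups = group_paths(fig4_paths)
```

Here `fig4_groups` contains one group with the two paths through b and two singleton groups, and `density(["abc", "abd"], fig4_paths, fig4_prob, fig4_cost)` reproduces the value 0.12 computed in the text.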


Theorem 3: Denote $P_{opt}$ as the optimal solution of our problem. Then $\frac{P_{out}}{P_{opt}} \ge \frac{1}{2}$, and the computation time of MCC_Group is $O(n^2 2^h)$, where $h$ is the maximum size of the groups.

Proof: Denote $r_1^*, r_2^*, \ldots, r_x^*$ as the paths chosen by the optimal solution. Denote $L_i$ and $P_i$ as the cost and probability of combination $C_i$, respectively. After the algorithm terminates, suppose that combination $C_s$ is the first combination that cannot be selected and that the combinations already selected are $C_1, C_2, \ldots, C_{s-1}$. At this stage, we observe that 1) fewer than $L_s$ APs are available, and 2) the paths selected so far yield a profit no lower than that obtained using the paths from $r_1^*, r_2^*, \ldots, r_x^*$. Therefore,

$$\sum_{i=1}^{s-1} P_i + P_g^* \;\ge\; \sum_{i=1}^{s-1} P_i + P_s \;\ge\; P_{opt} \tag{6}$$

According to step 4 of our algorithm,

$$P_{out} = \max\Big\{\sum_{i=1}^{s-1} P_i,\; P_g^*\Big\} \;\ge\; \frac{1}{2}\Big(\sum_{i=1}^{s-1} P_i + P_g^*\Big) \tag{7}$$

So

$$P_{out} \ge \frac{1}{2} P_{opt} \tag{8}$$

For a group of size $t$, the number of all possible combinations is $\sum_{i=1}^{t} C_t^i \le 2^t \le 2^h$. The cost of sorting the combinations of one group is less than $2^h \times \log(2^h) = \log(2)\, h 2^h$. As there are at most $n$ groups, the total sorting cost is $\log(2)\, n h 2^h$. It takes at most $n$ time to select the combination with the maximum density, as there are at most $n$ groups. When a combination has been selected, the density update process costs at most $2^h$ time, and our algorithm executes at most $n$ times. Therefore, the total computation time of MCC_Group is $\log(2)\, n h 2^h + n^2 2^h$, which is $O(n^2 2^h)$.

We note that the performance of algorithm MCC_Combine is always no worse than that of algorithm MCC_Group. Suppose that MCC_Group has finally selected paths $(r_1, r_2, \ldots, r_s)$, which are composed of $x$ independent paths $(r_1^*, r_2^*, \ldots, r_x^*)$ and $y$ dependent paths $(r_1', r_2', \ldots, r_y')$. At some point during the execution of algorithm MCC_Combine, there must be a combination $I$ that corresponds to $r_1', r_2', \ldots, r_y'$. From then on, the total profit of the paths selected by MCC_Combine is the same as that of paths $r_1', r_2', \ldots, r_y'$. Therefore, if the number of dependent paths is not large, MCC_Combine is always preferred over MCC_Group.

V. AP DEPLOYMENT WITH MINIMUM COST

In many applications, the AP deployment must satisfy a certain QoS requirement on the coverage while the number of APs to be deployed should be minimized. We now formally formulate this problem as follows.

Definition 2: Minimum Deployment Cost (MDC) problem: Given a mobility graph $G\langle N, E\rangle$ and $n$ possible user paths ($\{r_k\}$, $1 \le k \le n$). The probability ($p_k$) and cost ($l_k$) of path $r_k$ are equal to the probability that users take that path and the cost of all edges on the path, respectively. The objective is to select a subset of user paths that minimizes the sum of the paths' costs under the constraint that the sum of the paths' probabilities is greater than a constant $\alpha \in (0, 1]$.

We have the following theorem regarding the complexity of the MDC problem.

Theorem 4: The MDC problem is NP-hard.

Proof: Similar to the proof of Theorem 1, we consider the star topology. Then, this special case of the MDC problem can be formulated as:

$$\text{minimize} \quad \sum_{k=1}^{n} l_k x_k \tag{9}$$

$$\text{subject to} \quad \sum_{k=1}^{n} p_k x_k \ge \alpha \tag{10}$$

where variable $x_k$ represents whether path $r_k$ is chosen and has value 0 or 1. The above formulation is another representation of the 0-1 knapsack problem, which is known to be NP-hard.
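As a small illustration of the star-topology formulation (9)-(10), the sketch below solves a tiny instance by exhaustive search over the 0/1 variables $x_k$. The function name `min_cost_cover`, the use of integer percentages for the probabilities (to avoid floating-point comparisons) and the sample paths are assumptions made for illustration; they do not come from the paper, and exhaustive search is used only because the instance is tiny.

```python
from itertools import combinations

def min_cost_cover(paths, alpha):
    """Cheapest subset of independent paths whose coverage reaches alpha.

    paths: list of (probability_percent, cost) pairs, one per path r_k.
    alpha: required coverage in percent.
    Returns (cost, chosen_indices), or (None, None) if alpha is unreachable.
    """
    best_cost, best_subset = None, None
    indices = range(len(paths))
    for size in range(len(paths) + 1):
        for subset in combinations(indices, size):
            coverage = sum(paths[k][0] for k in subset)  # left side of (10)
            cost = sum(paths[k][1] for k in subset)      # objective (9)
            if coverage >= alpha and (best_cost is None or cost < best_cost):
                best_cost, best_subset = cost, subset
    return best_cost, best_subset

# Hypothetical star topology: four paths leaving one popular site.
paths = [(40, 2), (30, 2), (20, 4), (10, 2)]
cost, chosen = min_cost_cover(paths, alpha=70)
print(cost, chosen)
```

The running time grows as $2^n$, which is why the following sections resort to a tree-specific optimal algorithm and to an approximation for general graphs.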

A. Optimal algorithm for tree topologies

We can use the optimal algorithm for the MCC problem in Section IV-A to obtain an optimal algorithm for the MDC problem. The method is similar to binary search, as the number of APs needed is monotonically increasing with the coverage probability. Denote the algorithm in Section IV-A as MCC_opt(x), which outputs the maximum coverage probability given x APs. Using this procedure as a building block, our optimal algorithm is shown as Algorithm 4.

Algorithm 4 MDC_opt(α)
Input: α – the probability to be guaranteed; the path set r_1, r_2, ..., r_n
Output: the number of APs needed to achieve the required coverage
1: l = 0; h = the number of APs needed to cover all the paths in the graph;
2: while (l ≤ h)
     mid = ⌊(l + h)/2⌋;
     if α ≤ MCC_opt(mid) h = mid − 1;
     else l = mid + 1;
3: return (l);
4: end

The MDC_opt algorithm performs a binary search on the number of APs needed to achieve the required coverage; when the loop exits, l is the smallest number of APs whose maximum coverage is at least α. As the running time of MCC_opt is $O(mN^2)$, the time complexity of MDC_opt(α) is $O(mN^2 \log N)$.

B. Approximation algorithm for general topologies

In this section, we describe an approximation algorithm for the MDC problem with general graphs. Suppose there are m independent paths and k = n − m dependent paths. As in Section IV-B, we let $P = \{p_1, p_2, \ldots, p_m\}$ and $L = \{l_1, l_2, \ldots, l_m\}$ represent the sets of probabilities and costs of the independent user paths. We first sort the independent paths in the order of nonincreasing densities (i.e., $p_i/l_i \ge p_{i+1}/l_{i+1}$, $n - m + 1 \le i \le n - 1$). We then introduce a procedure MDC_Select(I, Q), which computes the total cost needed to let Q more users receive continuous WiFi coverage after the paths in I have been selected. MDC_Select(I, Q) selects the paths r_i one by one until the extra coverage Q has been reached.

Algorithm 5 MDC_Select(I, Q)
Input: I – a set composed of the paths already selected; all the independent paths {r_1, r_2, ..., r_m} in the order of nonincreasing profit densities; Q – extra portion of users that need to receive continuous WiFi coverage
Output: the cost L_s
1: L_s = L_I; i = 1;
2: L_s = L_s + l_i; Q = Q − p_i; add r_i into set I; i = i + 1;
3: if Q > 0 goto step 2
4: return (L_s), end
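The two procedures of this section, the binary search of MDC_opt and the greedy selection of MDC_Select, can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: `mcc_opt` is passed in as an arbitrary nondecreasing function (here a hypothetical lookup table stands in for the tree algorithm of Section IV-A), probabilities are integer percentages, the search returns the smallest feasible number of APs, and `mdc_select` stops before picking a path when no extra coverage is needed and returns infinity when the independent paths cannot supply it.

```python
import math
from typing import Callable, List, Tuple

def mdc_opt(alpha: int, mcc_opt: Callable[[int], int], max_aps: int) -> int:
    """Smallest x in [0, max_aps] with mcc_opt(x) >= alpha.

    Assumes mcc_opt is nondecreasing and mcc_opt(max_aps) >= alpha.
    """
    lo, hi = 0, max_aps
    while lo < hi:
        mid = (lo + hi) // 2
        if mcc_opt(mid) >= alpha:
            hi = mid          # mid APs suffice, try fewer
        else:
            lo = mid + 1      # mid APs are not enough
    return lo

def mdc_select(base_cost: int, paths: List[Tuple[int, int]], q: int) -> float:
    """Cost after greedily adding independent paths until q is covered.

    base_cost: cost L_I of the paths already selected.
    paths: (probability_percent, cost) pairs in nonincreasing density order.
    q: extra coverage (percent) still required.
    """
    total = base_cost
    for prob, cost in paths:
        if q <= 0:
            break
        total += cost
        q -= prob
    return total if q <= 0 else math.inf

# Hypothetical stand-in for MCC_opt: best coverage (percent) with x APs.
coverage_table = [0, 0, 40, 40, 70, 70, 80, 80, 100]
needed = mdc_opt(70, lambda x: coverage_table[x], len(coverage_table) - 1)

# Hypothetical independent paths, already sorted by density (20, 15, 5, 5).
sorted_paths = [(40, 2), (30, 2), (10, 2), (20, 4)]
greedy_cost = mdc_select(0, sorted_paths, 70)
print(needed, greedy_cost)
```

Each call to `mdc_opt` makes a logarithmic number of calls to `mcc_opt`, matching the $\log N$ factor in the complexity stated above.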

Clearly, procedure MDC_Select(I, Q) runs in time O(m). The next algorithm uses procedure MDC_Select(I, Q) to produce approximate solutions for the MDC problem. We use the same notation defined in Section IV-B. First, we compute $L_{out}$ (with $I_{out}$ the corresponding combination), which is the minimum cost over all combinations composed of dependent paths whose probability is at least α. If no such combination exists, we set $L_{out}$ to ∞. Then, for every combination whose probability $P_I \le \alpha$, we compute its cost $L_I$ and call MDC_Select(I, α − P_I) to select the independent paths. This yields a candidate cost for the combination I. Every time we find a candidate cost that is less than the current $L_{out}$, we replace the current value of $L_{out}$.

Algorithm 6 MDC
Input: dependent paths set; α – the probability to be guaranteed
Output: L_out /* the number of APs needed */
1: L_out = the minimum cost of all combinations of dependent paths whose probability P_I ≥ α; if all these probabilities P_I < α then L_out = ∞;
2: for all combinations I (composed of only dependent paths) whose probability P_I ≤ α
     compute L_I; L_out = min{L_out, MDC_Select(I, α − P_I)}
3: return (L_out), end

Theorem 5: Denote $L_{opt}$ as the optimal solution of the MDC problem and $l_m$ as the maximum cost of all independent paths. Then $L_{out} \le L_{opt} + l_m$. The running time of this algorithm is $O((n - k)2^k)$.

Proof: If the optimal solution is composed of only dependent paths, our algorithm yields exactly the optimal solution, since all combinations are examined at step 2. Without loss of generality, we assume that the optimal solution $L_{opt}$ is obtained by selecting the following j paths: $r_1', r_2', \ldots, r_x', r_1^*, r_2^*, \ldots, r_y^*$, and suppose that there are x independent paths and y dependent paths (x + y = j). During the execution of step 2, there must be a combination composed of $r_1^*, r_2^*, \ldots, r_y^*$. Let $r_1, r_2, \ldots, r_z$ be the paths selected by procedure MDC_Select(I, Q). Suppose that currently we have selected z − 1 independent paths; then the paths selected up to now have a cost no greater than that of paths $r_1', r_2', \ldots, r_x', r_1^*, r_2^*, \ldots, r_y^*$, so we have

$$\sum_{i=1}^{y} l_i^* + \sum_{i=1}^{z-1} l_i \;\le\; \sum_{i=1}^{y} l_i^* + \sum_{i=1}^{x} l_i' = L_{opt} \tag{11}$$

As $l_m \ge l_z$,

$$L_{out} = \sum_{i=1}^{y} l_i^* + \sum_{i=1}^{z} l_i \;\le\; L_{opt} + l_m \tag{12}$$

For any given combination I, the algorithm runs in polynomial time. There are $C_k^i$ combinations of size i. Hence steps 1 and 2 of Algorithm 6 are executed $\sum_{i=1}^{k} C_k^i < 2^k$ times. Each execution takes at most m = n − k time. Hence the total running time is bounded by $O((n - k)2^k)$.

VI. PERFORMANCE EVALUATION

This section presents the evaluation of our algorithms. We extract mobility characteristics from the real data trace collected over a campus-wide wireless network at Dartmouth College [22]. As discussed in Section III, we use a method similar to [11] to obtain the path probabilities from the traces. The number of nodes required to cover a road is computed based on the average AP communication range obtained from user traces. However, as discussed in Section III-B, our algorithms only require knowledge of the number of APs that can cover each edge of the mobility graph and hence can work with any AP coverage model. We conduct simulations based on both tree topologies and general graphs. We run our algorithms on a number of topologies that have about 50 nodes in a moderate-size metro area. Fig. 5 shows part of the topology. We compare our algorithms with two baseline algorithms: Maximum Probability First (MPF) and Minimum Cost First (MCF). These algorithms are similar greedy heuristics that differ only in the criterion for choosing the next path. Specifically, at each step, MPF and MCF always deploy APs on the path with the maximum probability and the minimum cost, respectively. We also use the brute-force method to obtain the optimal solution for small graph topologies for comparison.

A. Performance of optimal tree-based algorithms

We first evaluate our optimal algorithms on tree topologies. Fig. 6 shows the probability achieved when the number of APs varies from 10 to 30. This figure shows that all algorithms yield higher probability when the number of APs is increased, because more APs can serve more users. MCC_OPT yields the best performance among all algorithms. In particular, MCC_OPT outperforms MPF and MCF by 13% to 87%, which validates the effectiveness of the optimal deployment algorithm. MCF yields the worst performance because it does not consider the probability of each path. As the number of APs increases, the difference among these algorithms becomes smaller. This is because, when more APs are available, many more paths can be covered and different algorithms are more likely to cover the same paths. An extreme case is when the number of APs is enough to cover all the paths; all algorithms then perform identically, as they all achieve 100% probability. The results of this simulation show that our algorithms make the best use of the APs. That is, with the same number of APs, our deployment approach can serve more users.

Fig. 5. A trace topology generated from a map of Dartmouth College campus

Fig. 6. Probability achieved vs. number of APs (on tree)

Fig. 8. Probability achieved vs. number of APs (on general graph)

Fig. 9. Comparison of computation time

We then evaluate the algorithms for the MDC problem. Fig. 7 shows the number of APs needed when the required probability varies from 30% to 70%. The figure shows that guaranteeing a higher service probability requires more APs. Compared with the other algorithms, our optimal algorithm uses 9% to 80% fewer APs while achieving the same service probability. As explained before, MCF yields the worst performance and MPF performs slightly better than MCF.
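For reference, the two baselines can be sketched for the MDC setting as follows. The paper only states that MPF and MCF pick the path with the maximum probability and the minimum cost at each step; the stopping rule (stop once the required coverage is reached), the restriction to independent paths, the integer-percentage probabilities and the sample data are assumptions made for this sketch.

```python
def greedy_baseline(paths, alpha, key):
    """Deploy paths in the order given by `key` until coverage reaches alpha.

    paths: list of (probability_percent, cost) pairs for independent paths.
    Returns the number of APs used, or None if alpha cannot be reached.
    """
    covered, used = 0, 0
    for prob, cost in sorted(paths, key=key):
        if covered >= alpha:
            break
        covered += prob
        used += cost
    return used if covered >= alpha else None

def mpf(paths, alpha):
    """Maximum Probability First: highest-probability path next."""
    return greedy_baseline(paths, alpha, key=lambda pc: -pc[0])

def mcf(paths, alpha):
    """Minimum Cost First: cheapest path next, ignoring its probability."""
    return greedy_baseline(paths, alpha, key=lambda pc: pc[1])

# Hypothetical instance with five independent paths.
paths = [(50, 3), (30, 3), (5, 1), (5, 1), (10, 2)]
mpf_aps = mpf(paths, alpha=80)
mcf_aps = mcf(paths, alpha=80)
print(mpf_aps, mcf_aps)
```

On this instance MCF spends APs on cheap but unpopular paths first and ends up needing more APs than MPF, in line with the explanation that MCF ignores path probabilities; a single hand-made instance is of course not evidence for the percentages reported above.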

Fig. 7. Number of APs VS. Probability achieved (on tree)

Fig. 10. Number of APs VS. Probability achieved (on general graph)

B. Performance of approximation algorithms for general topologies

In this section, we evaluate our algorithms for general graphs. Fig. 8 shows the probability achieved when the number of APs provided varies from 10 to 30. Similar to the tree-based algorithms, all algorithms yield higher coverage when the number of APs is increased. As a brute-force solution, OPT yields the best performance among all algorithms. Our algorithms, MCC_Combine and MCC_Group, perform very close to OPT and outperform MPF and MCF by roughly 6% to 60%. These results validate the effectiveness of our approximation algorithms. Fig. 9 shows the computation time of the MCC_Combine and MCC_Group algorithms. There are two different topologies in which the total numbers of paths are almost the same (21 and 20, respectively) but the numbers of independent paths are notably different (16 and 11, respectively). For ease of illustration, the computation time is normalized by that of the optimal algorithm. The figure shows that both algorithms run much faster than the brute-force optimal algorithm. Furthermore, when the number of dependent paths is large, the computation-time advantage of MCC_Group over MCC_Combine is much more pronounced. This result validates the analysis in Section IV-B.


Fig. 10 shows the performance of the MDC algorithm on general graph topologies when the service probability varies from 30% to 70%. This result is very similar to the result shown in Fig. 7. Although algorithm MDC performs slightly worse than the optimal brute-force algorithm, it outperforms MPF and MCF by 12% to 75%.

VII. CONCLUSION

In this paper, we studied the problem of deploying WiFi APs that can provide continuous service for mobile users. We formulated two AP deployment problems that aim to maximize the continuous user coverage and to minimize the AP deployment cost, respectively. Both problems are formulated based on mobility graphs that capture the statistical mobility patterns of users. Several optimal and approximation AP deployment algorithms with provable performance are developed. Our simulations using real user mobility traces showed that our algorithms can significantly improve the continuous coverage for mobile users while reducing the number of APs needed in WiFi deployments.

ACKNOWLEDGMENT

This work was supported by grants from the Research Grants Council of the Hong Kong SAR, China Nos. (CityU 114908), (CityU 114609) and CityU Applied R & D Funding (ARD-(Ctr)) No. 9681001 and ShenZhen Basic Research Grant No. JC200903170456A.

REFERENCES

[1] P. S. Henry and L. Hui, “Wifi: what’s next?” Communications Magazine, vol. 40, no. 12, 2002.
[2] J. Eriksson, H. Balakrishnan, and S. Madden, “Cabernet: vehicular content delivery using wifi,” in Proceedings of the 14th ACM international conference on Mobile computing and networking, 2008.
[3] V. Bychkovsky, B. Hull, A. Miu, H. Balakrishnan, and S. Madden, “A measurement study of vehicular internet access using in situ wi-fi networks,” in Proceedings of the 12th annual international conference on Mobile computing and networking, 2006.
[4] R. Gass, J. Scott, and C. Diot, “Measurements of in-motion 802.11 networking,” in Proceedings of WMCSA, 2006.
[5] J. Ott and D. Kutscher, “A disconnection-tolerant transport for drive-thru internet environments,” in Proceedings of IEEE INFOCOM, 2005.
[6] “Free wi-fi in sydney cbd,” 2007, http://searchnetworking.techtarget.com.au/news/23410-Free-Wi-Fi-in-Sydney-CBD.
[7] “Hong kong rolls out free wifi service,” 2008, http://www.muniwireless.com/2008/07/21/hong-kong-rolls-out-free-wi-fi-service.
[8] D. B. Johnson and D. A. Maltz, “Dynamic source routing in ad hoc wireless networks,” Mobile Computing, vol. 353, 1996.
[9] J. Broch, D. A. Maltz, D. B. Johnson, Y. C. Hu, and J. Jetcheva, “A performance comparison of multi-hop wireless ad hoc network routing protocols,” in Proceedings of Mobile Computing and Networking (MobiCom), 1998.
[10] M. Kim, D. Kotz, and S. Kim, “Extracting a mobility model from real user traces,” in Proceedings of IEEE INFOCOM, 2006.
[11] J. Yoon, B. Noble, M. Liu, and M. Kim, “Building realistic mobility models from coarse-grained traces,” in MobiSys, 2006.
[12] G. Judd, X. Wang, and P. Steenkiste, “Efficient channel-aware rate adaptation in dynamic environments,” in Proceeding of the 6th international conference on Mobile systems, applications, and services (Mobisys08), 2008.
[13] T. Camp, J. Boleng, and V. Davies, “A survey of mobility models for ad hoc network research,” in Proceedings of Wireless Communication and Mobile Computing (WCMC): Special issue on Mobile Ad Hoc Networking: Research, Trends and Applications, 2002.
[14] X. Hong, M. Gerla, G. Pei, and C. Chiang, “A group mobility model for ad hoc wireless networks,” in Proceedings of ACM/IEEE MSWiM’99, 1999.
[15] C. Bettstetter, G. Resta, and P. Santi, “The node distribution of the random waypoint mobility model for wireless ad hoc networks,” IEEE Transactions on Mobile Computing, vol. 2, no. 3, 2003.
[16] G. Lin, G. Noubir, and R. Rajaraman, “Mobility models for ad hoc network simulation,” in Proceedings of IEEE, 2004.
[17] J. Yoon, M. Liu, and B. Noble, “Sound mobility models,” in Proceedings of the 9th annual international conference on Mobile computing and networking (MobiCom), 2003.
[18] W. Navidi and T. Camp, “Stationary distributions for random waypoint models,” IEEE Transactions on Mobile Computing, vol. 3, no. 1, 2004.
[19] J. Yoon, M. Liu, and B. Noble, “Random waypoint considered harmful,” in Proceedings of IEEE INFOCOM, 2003.
[20] D. J. Patterson, L. Liao, K. Gajos, M. Collier, N. Livic, K. Olson, S. Wang, D. Fox, and H. Kautz, “Opportunity knocks: a system to provide cognitive assistance with transportation services,” in Proceedings of International Conference on Ubiquitous Computing, 2004.
[21] P. Hui, A. Chaintreau, J. Scott, R. Gass, J. Crowcroft, and C. Diot, “Pocket switched networks and human mobility in conference environments,” in SIGCOMM’05 Workshops, 2005.
[22] “The dartmouth wireless traces,” 2005, http://crawdad.cs.dartmouth.edu/data/dartmouth/.
[23] D. Kotz and K. Essien, “Analysis of a campus-wide wireless network,” in Proceedings of ACM Mobicom. ACM Press, 2002, pp. 107–118.
[24] M. Kim and D. Kotz, “Classifying the mobility of users and the popularity of access points,” in Proceedings of the International Workshop on Location- and Context-Awareness (LoCA), 2005.
[25] Z. Zheng, P. Sinha, and S. Kumar, “Alpha coverage: Bounding the interconnection gap for vehicular internet access,” in Proceedings of IEEE INFOCOM Mini-Conference, 2009.
[26] A. LaMarca, Y. Chawathe, S. Consolvo, J. Hightower, I. Smith, J. Scott, T. Sohn, J. Howard, J. Hughes, F. Potter, J. Tabert, P. Powledge, G. Borriello, and B. Schilit, “Place lab: Device positioning using radio beacons in the wild,” in Proceedings of the Third International Conference on Pervasive Computing, 2005.
[27] C. Tuduce and T. Gross, “A mobility model based on wlan traces and its validation,” in Proceedings of IEEE INFOCOM, 2005.
[28] J.-K. Lee and J. C. Hou, “Modeling steady-state and transient behaviors of user mobility: formulation, analysis, and application,” in Proceedings of the 7th ACM international symposium on Mobile ad hoc networking and computing, 2006.
[29] W. Hsu, T. Spyropoulos, K. Psounis, and A. Helmy, “Modeling time-variant user mobility in wireless mobile networks,” in Proceedings of IEEE INFOCOM, 2007.
