This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE Globecom 2010 proceedings.
Optimal Bandwidth Allocation with Dynamic Service Selection in Heterogeneous Wireless Networks

Kun Zhu, Dusit Niyato, and Ping Wang
School of Computer Engineering, Nanyang Technological University (NTU), Singapore
Email: {zhuk0001,dniyato,wangping}@ntu.edu.sg

Abstract: Bandwidth allocation for different service classes in heterogeneous wireless networks is an important issue for service providers in terms of balancing service quality and profit. It is especially challenging when considering the dynamic competition both among service providers and among users. To address this problem, a two-level game framework is developed in this paper. The underlying dynamic service selection is modeled as an evolutionary game based on replicator dynamics. An upper-level bandwidth allocation differential game is formulated to model the competition among different service providers. The service selection distribution of the underlying evolutionary game describes the state of the upper-level differential game. An open-loop Nash equilibrium is considered to be the solution of this linear state differential game. The proposed framework can be implemented with minimum communication cost since no information broadcasting is required. We also observe that the selfish behavior of service providers can maximize the social welfare.
Keywords: Bandwidth allocation, Replicator dynamics, Differential game, Optimal control, Open-loop Nash equilibrium.

I. INTRODUCTION

In recent years, the evolution of different wireless access technologies and system architectures has created a heterogeneous wireless environment in which different networks complement each other in terms of coverage area, mobility support, offered data rate, and price. Naturally, two issues arise for the users and service providers in this heterogeneous wireless environment. First, the rational users select the access network and the service class from different service providers according to the observed performance of the available service classes. The decision (i.e., strategy) of network and service selection is made dynamically so that the individual utility is maximized. Second, the service providers have to allocate the available network capacity (i.e., bandwidth) to the offered service classes. Due to the dynamic behavior of users, this bandwidth allocation has to be performed dynamically to obtain the maximum profit.

To address the issues of service selection and bandwidth allocation, a hierarchical (i.e., two-level) game framework is developed to jointly obtain the strategies of users and service providers. The dynamic decision of service selection by the users is modeled by an evolutionary game [1]. This evolutionary game takes into account the bandwidth allocation control of service providers. In turn, the service providers observe the service selection of the users and allocate the bandwidth dynamically. This bandwidth allocation of service providers is modeled as a differential game.
The novelty of this two-level game framework is the consideration of dynamic decision making. The system parameters (e.g., the number of users selecting any access service class) are naturally dynamic, and hence a steady state of the network may never be reached. Therefore, dynamic optimal control (i.e., a differential game for the noncooperative environment) is a suitable approach for analyzing the dynamic decision making process of the rational service providers in heterogeneous wireless networks.

A few works have studied the network selection and rate control problems in heterogeneous wireless networks. In [1], evolutionary game based algorithms were proposed for dynamic network selection. A Markov Decision Process (MDP) based control scheme was proposed for flow assignment among different networks in [2]. In [3], a robust rate control framework for multiple-network simultaneous access based on H∞ optimal control was developed. Differential games have also been applied to data transmission problems in wireless networks. In [4], routing in ad hoc networks was formulated as a differential game with coupling constraints. However, none of these works considered the problem of dynamic optimal bandwidth allocation in heterogeneous wireless networks in which the users can change their service selection dynamically. This constitutes the main contribution of this paper.

The rest of this paper is organized as follows. Section II presents the system model. The underlying service selection in heterogeneous wireless networks is formulated as an evolutionary game in Section III. The optimal bandwidth allocation control considering the dynamic network and service selection is formulated as a differential game in Section IV. Section V presents the numerical studies. The summary of this paper is given in Section VI.

II. SYSTEM MODEL AND ASSUMPTIONS

We consider a particular service area a in the coverage of a heterogeneous wireless environment consisting of $M$ access networks and $N(t)$ active users at time $t$, as shown in Fig. 1. Without loss of generality, each access network is owned by a different service provider.¹ Service provider $i \in \{1, 2, \ldots, M\}$ can provide $K_i$ service classes to users to satisfy different quality of service (QoS) requirements. Denote $K = \sum_{i=1}^{M} K_i$ as the total number of service classes.

¹ For the rest of this paper, "access network" and "service provider" are used interchangeably.
978-1-4244-5637-6/10/$26.00 ©2010 IEEE
Due to the characteristics of wireless channels (e.g., fading, interference) and the mobility of wireless users, the system capacity (i.e., bandwidth, denoted by $B_i$ for service provider $i$) and the number of users in area a (denoted by $N$) are generally time-varying. In this paper, we assume that they are smooth functions of $t$, i.e., $B_i(t)$ and $N(t)$. Similar to [3], we consider the full bandwidth utilization criterion. In particular, all users subscribed to the same service class share the available bandwidth equally (e.g., a WiMAX base station allocates time slots of equal size to the users). The bandwidth that user $k$ receives from service class $j$ of service provider $i$ at time $t$ is denoted as $\tau^{k}_{ij}(t) = B_{ij}(t)/N_{ij}(t)$, where $B_{ij}(t)$ represents the bandwidth allocated to service class $j$ of service provider $i$, $N_{ij}(t)$ represents the total number of users choosing service class $j$ of service provider $i$ at time $t$, and $\sum_{i=1}^{M}\sum_{j=1}^{K_i} N_{ij}(t) = N(t)$. Users with multi-mode terminals can choose different service classes from different service providers freely and independently according to the perceived instantaneous utility [5].
Fig. 1. System model of multi-class heterogeneous wireless networks (SC: service class; AN: access network).

III. EVOLUTION OF SERVICE SELECTION

Users in area a compete to select the available access networks from candidate service providers. The objective of this selection is to maximize the satisfaction (i.e., utility) from QoS performance. At any time instant, each user can adapt its service selection strategy according to the observed time-varying network performance, which depends on the current congestion condition. Similar to [1], an underlying evolutionary game is formulated to model the dynamic competition of service selection among users. This is the lower-level game in the proposed two-level game framework. In this lower-level evolutionary game model, the players are the $N(t)$ active users in area a at time $t$. In the context of an evolutionary game, the group of users constitutes the population. The strategy of a player is the choice of a particular service class from a certain service provider (i.e., available access network). The payoff of a player is the utility representing the QoS satisfaction level.

Let $x_{ij}(t) \in [0, 1]$ denote the proportion of users in area a choosing service class $j$ from service provider $i$ at time $t$.² Therefore, the bandwidth allocated to each user in this proportion at time $t$ is $\tau^{k}_{ij}(t) = B_{ij}(t)/(N(t)x_{ij}(t))$, and the payoff of user $k$ is

$$u(\tau^{k}_{ij}(t)) = \alpha \tau^{k}_{ij}(t) = \alpha \frac{B_{ij}(t)}{N(t)x_{ij}(t)}, \tag{1}$$

where $\alpha$ is a constant indicating the rate at which the utility increases with bandwidth. Then, the average payoff (utility) of the population can be derived as follows:

$$\bar{u}(t) = \sum_{i=1}^{M}\sum_{j=1}^{K_i} x_{ij}(t)\,u(\tau^{k}_{ij}(t)). \tag{2}$$

² Without loss of generality, notation for area a is omitted in the rest of the paper for simplicity of the presentation.

The replicator dynamics used to model the evolution process of the service selection strategy for all $i \in \{1, 2, \ldots, M\}$, $j \in \{1, 2, \ldots, K_i\}$ can be described by the following differential equations

$$\frac{\partial x_{ij}(t)}{\partial t} = \dot{x}_{ij}(t) = \delta x_{ij}(t)\left[u(\tau^{k}_{ij}(t)) - \bar{u}(t)\right], \qquad \sum_{i=1}^{M}\sum_{j=1}^{K_i} x_{ij}(t) = 1, \tag{3}$$

with initial condition

$$\mathbf{x}(0) = \mathbf{x}_0 \in \mathcal{X}, \tag{4}$$

where $\mathbf{x}(t) = [x_{11}(t) \cdots x_{ij}(t) \cdots x_{MK_M}(t)]^{T}$ is a vector describing the population state, $\delta$ is the learning rate of the population, and $\mathcal{X} \subseteq \mathbb{R}^{K}$ is the set of all possible states.

IV. DYNAMIC BANDWIDTH ALLOCATION

Given the dynamic service selection behavior of users, the service providers can optimally allocate the bandwidth to achieve the maximum profit. Increasing the bandwidth allocated to a certain service class is a natural way to improve its performance and to attract more users to this service class. However, with the limited capacity of the access network, increasing the bandwidth allocated to one service class decreases the bandwidth allocated to the other service classes, which may reduce the total profit of the service provider. In this section, we formulate the differential game model for the bandwidth allocation of service providers. This is the upper-level game in the proposed two-level game framework. This differential game model takes the dynamic service selection of users into account.

A. Noncooperative Bandwidth Allocation as a Differential Game

Each of the $M$ noncooperative service providers competes to maximize the present value of its objective function derived over an infinite time horizon by controlling the bandwidth allocation strategy. To achieve this, a simultaneous play differential game is formulated as follows. The set of players is composed of all service providers of the available access networks. For a service provider as a player, the strategy is the dynamic control of the proportion of bandwidth allocated to different service classes. Specifically, we denote the proportion of bandwidth of service provider $i$ allocated to service class $j$ at time $t$ as $\gamma_{ij}(t)$. The control strategy of service provider $i$ is denoted by the vector $\boldsymbol{\gamma}_i(t) = [\gamma_{i1}(t) \cdots \gamma_{ij}(t) \cdots \gamma_{iK_i}(t)]^{T} \in \mathbb{R}_{+}^{K_i}$. Naturally, $\gamma_{ij}(t) \in [0, 1]$, $\sum_{j=1}^{K_i} \gamma_{ij}(t) = 1$, and $B_{ij}(t) = B_i(t)\gamma_{ij}(t)$ for all $t \in [0, +\infty)$. Similar to the notation used in game theory, $\Phi = \{\boldsymbol{\gamma}_i(t), \boldsymbol{\gamma}_{-i}(t)\}$ denotes the strategy profile of this differential game, and $\boldsymbol{\gamma}_{-i}(t)$ is a vector of the strategies of all players except player $i$.

Depending on the informational structure assumed, the control strategies of service providers can be represented in different ways (i.e., open-loop control strategy and closed-loop control strategy). An open-loop strategy does not need any feedback information from the system, which means that the output of the control process does not need to be observed. In contrast, a closed-loop strategy uses feedback information to adjust the control process when the system deviates from the predetermined target. Therefore, the use of a closed-loop strategy requires a more complicated system structure. In this paper, we consider the open-loop control strategy of service providers due to its simplicity of implementation (i.e., a centralized controller is not required), which is suitable for the loosely coupled heterogeneous wireless network.

In the bandwidth allocation differential game, all service providers (i.e., players) choose their bandwidth allocation control strategies simultaneously, thereby influencing the evolution of the state of the differential game as well as their own and their opponents' objective functions.
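As a minimal numerical sketch of the replicator dynamics (3) that drive the state of this game, the following Python code integrates (3) with an explicit Euler scheme while the per-class bandwidths $B_{ij}$ are held fixed. The bandwidth values, the step size, and the function names are illustrative assumptions and are not taken from the paper; $N$, $\alpha$, $\delta$, and the initial proportions follow Section V-A.

```python
import numpy as np

def replicator_step(x, bandwidth, n_users, alpha, delta, dt):
    """One explicit-Euler step of the replicator dynamics (3)."""
    utility = alpha * bandwidth / (n_users * x)   # per-class payoff, as in (1)
    avg_utility = np.sum(x * utility)             # population average, as in (2)
    return x + dt * delta * x * (utility - avg_utility)

def simulate(x0, bandwidth, n_users, alpha, delta, dt=0.01, t_end=500.0):
    """Integrate (3) from x0 with constant bandwidths and user count."""
    x = np.asarray(x0, dtype=float)
    for _ in range(int(round(t_end / dt))):
        x = replicator_step(x, bandwidth, n_users, alpha, delta, dt)
    return x

# Hypothetical fixed per-class bandwidths B_ij in Mbps (assumption of this sketch).
bandwidth = np.array([4.0, 3.0, 3.0, 2.0])
x_final = simulate([0.2, 0.3, 0.1, 0.4], bandwidth,
                   n_users=20, alpha=0.2, delta=0.6)
print(x_final, bandwidth / bandwidth.sum())
```

Under the linear utility (1) and fixed bandwidths, the rest point of this system is $x_{ij} = B_{ij} / \sum_{i,j} B_{ij}$, at which every service class yields the same payoff as the population average.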
The state of the differential game is represented by the population state $\mathbf{x}(t)$ of the underlying service selection game. The replicator dynamics differential equations (3) describe how the current state $\mathbf{x}(t)$ and the service providers' controls $\boldsymbol{\gamma}_i(t)$ at time $t$ influence the rate of change of the state at time $t$. For a service provider, the problem becomes an optimal control problem subject to the constraints (e.g., the state evolution differential equations) given the control strategies of the other service providers. The instantaneous payoff of service provider $i$ choosing control strategy $\boldsymbol{\gamma}_i(t)$ is expressed as

$$J^{i}_{ins}(\boldsymbol{\gamma}_i(t), \boldsymbol{\gamma}_{-i}(t)) = \sum_{j=1}^{K_i}\left(P_{ij}N(t)x_{ij}(t) - \theta_j(\gamma_{ij}(t)B_i(t))^2\right),$$

where $\theta_j$ is a cost factor, and $P_{ij}$ denotes the price charged by service provider $i$ for service class $j$ per user per unit of time. In noncooperative bandwidth allocation, for each rational service provider $i \in \{1, 2, \ldots, M\}$, the optimal control problem can be expressed as follows:

maximize:
$$J^{i}(\boldsymbol{\gamma}_i(t), \boldsymbol{\gamma}_{-i}(t)) = \int_{0}^{\infty} e^{-\rho t}\sum_{j=1}^{K_i}\left(P_{ij}N(t)x_{ij}(t) - \theta_j(\gamma_{ij}(t)B_i(t))^2\right)dt, \tag{5}$$

subject to:
$$\dot{x}_{ij}(t) = \delta x_{ij}(t)\left[u\!\left(\frac{B_i(t)\gamma_{ij}(t)}{N(t)x_{ij}(t)}\right) - \bar{u}(t)\right], \qquad \mathbf{x}(0) = \mathbf{x}_0, \tag{6}$$

for $i \in \{1, \ldots, M\}$ and $j \in \{1, \ldots, K_i\}$, where

$$\sum_{i=1}^{M}\sum_{j=1}^{K_i} x_{ij}(t) = 1, \quad x_{ij}(t) \in [0, 1], \qquad \sum_{j=1}^{K_i}\gamma_{ij}(t) = 1, \quad \gamma_{ij}(t) \in [0, 1], \quad t \in [0, +\infty), \tag{7}$$

where $\rho$ is the discounting rate of the payoff of the service provider.

B. Nash Equilibrium

Nash equilibrium is considered to be the solution of the above bandwidth allocation differential game. First, the definition of an optimal bandwidth allocation strategy is given as follows:

Definition 1: A bandwidth allocation control path $\boldsymbol{\gamma}^{*}_i(t)$ is optimal for service provider $i$ if the inequality condition $J^{i}(\boldsymbol{\gamma}^{*}_i(t), \boldsymbol{\gamma}_{-i}(t)) \geq J^{i}(\boldsymbol{\gamma}_i(t), \boldsymbol{\gamma}_{-i}(t))$ holds for all feasible control paths $\boldsymbol{\gamma}_i(t)$ in the noncooperative bandwidth allocation differential game.

According to Definition 1, the definition of open-loop Nash equilibrium for the bandwidth allocation differential game is given as follows:

Definition 2: Denote by $\boldsymbol{\gamma}_i(t)$ the open-loop bandwidth allocation strategy of service provider $i$. The strategy profile $\Phi = \{\boldsymbol{\gamma}^{*}_i(t), \boldsymbol{\gamma}^{*}_{-i}(t)\}$ is an open-loop Nash equilibrium if for each service provider $i \in \{1, 2, \ldots, M\}$, $\boldsymbol{\gamma}^{*}_i(t)$ is an optimal control path given the other service providers' control strategies $\boldsymbol{\gamma}^{*}_{-i}(t)$.

To obtain the open-loop Nash equilibrium, each service provider needs to solve an optimal control problem. In this case, Pontryagin's maximum principle can be used [7]. First, the definitions of the Hamiltonian function $H$, the maximized Hamiltonian function $H^{*}$, and the adjoint equation $\dot{\lambda}(t)$ for the bandwidth allocation differential game are given. The Hamiltonian function of service provider $i$ is denoted by $H_i$ as

$$H_i(\mathbf{x}(t), \boldsymbol{\gamma}_i(t), \boldsymbol{\gamma}_{-i}(t), \lambda_{ij}(t), t) = \sum_{j=1}^{K_i}\left(P_{ij}N(t)x_{ij}(t) - \theta_j(\gamma_{ij}(t)B_i(t))^2\right) + \sum_{i=1}^{M}\sum_{j=1}^{K_i}\lambda_{ij}(t)\,\delta x_{ij}(t)\left[u\!\left(\frac{B_i(t)\gamma_{ij}(t)}{N(t)x_{ij}(t)}\right) - \bar{u}(t)\right], \tag{8}$$

where $\lambda_{ij}(t)$ is the co-state variable associated with $\mathbf{x}(t)$. Then, the corresponding maximized Hamiltonian function $H^{*}$ is defined as

$$H^{*}_i(\mathbf{x}(t), \lambda_{ij}(t), t) = \max\left\{H_i(\mathbf{x}(t), \boldsymbol{\gamma}_i(t), \boldsymbol{\gamma}_{-i}(t), \lambda_{ij}(t), t) \,\middle|\, \boldsymbol{\gamma}_i(t) \in [0, 1]^{K_i}\right\}. \tag{9}$$

The adjoint equation is defined as

$$\dot{\lambda}_{ij}(t) = \rho\lambda_{ij}(t) - \frac{\partial H^{*}_i(\mathbf{x}(t), \lambda_{ij}(t), t)}{\partial x_{ij}(t)}. \tag{10}$$
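The structure of the Hamiltonian (8) becomes easier to see after one intermediate step. As a short worked sketch, assuming only the linear utility (1), the relation $B_{ij}(t) = B_i(t)\gamma_{ij}(t)$, and $\sum_{j}\gamma_{ij}(t) = 1$ from (7), the state-dependent part of (8) reduces as follows:

```latex
\begin{align*}
x_{ij}(t)\,u\!\left(\frac{B_i(t)\gamma_{ij}(t)}{N(t)x_{ij}(t)}\right)
  &= \frac{\alpha B_i(t)\gamma_{ij}(t)}{N(t)}, \\
\bar{u}(t)
  &= \sum_{i=1}^{M}\sum_{j=1}^{K_i}\frac{\alpha B_i(t)\gamma_{ij}(t)}{N(t)}
   = \frac{\alpha}{N(t)}\sum_{i=1}^{M}B_i(t), \\
\delta x_{ij}(t)\left[u\!\left(\frac{B_i(t)\gamma_{ij}(t)}{N(t)x_{ij}(t)}\right)-\bar{u}(t)\right]
  &= \frac{\delta\alpha}{N(t)}\left[B_i(t)\gamma_{ij}(t) - x_{ij}(t)\sum_{i'=1}^{M}B_{i'}(t)\right].
\end{align*}
```

The right-hand side of the state equation (6) is therefore affine in $x_{ij}(t)$ and in $\gamma_{ij}(t)$, with no product of state and control.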
Based on the above Hamiltonian functions and the linear utility function, we can obtain the following derivation:

$$\frac{\partial H_i(\mathbf{x}(t), \boldsymbol{\gamma}_i(t), \boldsymbol{\gamma}_{-i}(t), \lambda_{ij}(t), t)}{\partial x_{ij}(t)} = P_{ij}N(t) - \frac{\alpha\delta B(t)\lambda_{ij}(t)}{N(t)}, \tag{11}$$

where $B(t) = \sum_{i=1}^{M} B_i(t)$. Therefore,

$$\frac{\partial^2 H_i(\mathbf{x}(t), \boldsymbol{\gamma}_i(t), \boldsymbol{\gamma}_{-i}(t), \lambda_{ij}(t), t)}{\partial x_{ij}^2(t)} = 0, \tag{12}$$

and similarly we can obtain

$$\frac{\partial^2 H_i(\mathbf{x}(t), \boldsymbol{\gamma}_i(t), \boldsymbol{\gamma}_{-i}(t), \lambda_{ij}(t), t)}{\partial x_{ij}(t)\,\partial\lambda_{ij}(t)} = \frac{\partial^2 H_i(\mathbf{x}(t), \boldsymbol{\gamma}_i(t), \boldsymbol{\gamma}_{-i}(t), \lambda_{ij}(t), t)}{\partial\lambda_{ij}(t)\,\partial x_{ij}(t)} = 0. \tag{13}$$

According to (12) and (13), we have the following property.

PROPERTY 1: The bandwidth allocation differential game defined in (5)-(7) is a linear state differential game which possesses the property that the open-loop Nash equilibria are Markovian perfect [6].

To solve for the optimal control strategy, the first order condition is defined as follows:

$$\frac{\partial H_i}{\partial\gamma_{ij}(t)} = -2\theta_j B_i^2(t)\gamma_{ij}(t) + \lambda_{ij}(t)\,\delta\alpha\frac{B_i(t)}{N(t)} = 0. \tag{14}$$

Then, we can obtain

$$\gamma^{*}_{ij}(t) = \frac{\lambda_{ij}(t)\,\delta\alpha}{2\theta_j B_i(t)N(t)}. \tag{15}$$

We can observe that the optimal control path is independent of the system state $\mathbf{x}(t)$ and only relates to the costate variable $\lambda_{ij}(t)$. This costate variable can be obtained by solving the adjoint equations as follows:

$$\dot{\lambda}_{ij}(t) = \rho\lambda_{ij}(t) - \frac{\partial H^{*}_i(\mathbf{x}(t), \lambda_{ij}(t), t)}{\partial x_{ij}(t)}, \tag{16}$$

where the maximized Hamiltonian function $H^{*}_i(\mathbf{x}(t), \lambda_{ij}(t), t)$ can be obtained by substituting (15) into the Hamiltonian function defined in (8). Denote the solution of (16) as $\lambda_{ij}(t)$. Substituting this $\lambda_{ij}(t)$ into (15), we can obtain the optimal bandwidth allocation control path $\gamma^{*}_{ij}(t)$ to service class $j$ of service provider $i$. Similarly, the optimal control path for all service classes of all service providers can be derived. Then, we obtain the strategy profile $\Phi^{*} = \{\gamma^{*}_{ij}(t) \mid i \in \{1, \ldots, M\}, j \in \{1, \ldots, K_i\}\}$.

Since the state space $\mathcal{X}$ is a convex set, the solution to the state evolution differential equation (6) exists and is unique [7]. Also, for all $t \in [0, \infty)$, the maximized Hamiltonian function $H^{*}$ is concave and continuously differentiable with respect to $\mathbf{x}$. Therefore, we can state that the obtained strategy profile $\Phi^{*}$ is a Nash equilibrium for the noncooperative bandwidth allocation differential game.

C. Cooperative Bandwidth Allocation as Optimal Control

Next, we consider the cooperation of service providers to allocate bandwidth to service classes. In particular, the service providers adjust their bandwidth control paths in a cooperative manner to maximize the social welfare in terms of aggregated profits. Similar to the noncooperative case, the optimal control problem for the cooperative bandwidth allocation can be expressed as follows:

maximize:
$$J(\boldsymbol{\gamma}_i(t), \boldsymbol{\gamma}_{-i}(t)) = \int_{0}^{\infty} e^{-\rho t}\sum_{i=1}^{M}\sum_{j=1}^{K_i}\left(P_{ij}N(t)x_{ij}(t) - \theta_j(\gamma_{ij}(t)B_i(t))^2\right)dt \tag{17}$$

with the same constraints as defined in (6) and (7). To obtain the optimal solution of cooperative bandwidth allocation, Pontryagin's maximum principle is used. In this case, the Hamiltonian function, the maximized Hamiltonian function, and the adjoint equation of service provider $i$ for the cooperative bandwidth allocation are defined as $H^{c}_i$, $H^{c*}_i$, and $\dot{\lambda}^{c}_{ij}(t)$, respectively, and can be expressed as follows:

$$H^{c}_i(\mathbf{x}(t), \boldsymbol{\gamma}_i(t), \boldsymbol{\gamma}_{-i}(t), \lambda^{c}_{ij}(t), t) = \sum_{i=1}^{M}\sum_{j=1}^{K_i}\left(P_{ij}N(t)x_{ij}(t) - \theta_j(\gamma_{ij}(t)B_i(t))^2\right) + \sum_{i=1}^{M}\sum_{j=1}^{K_i}\lambda^{c}_{ij}(t)\,\delta x_{ij}(t)\left[u\!\left(\frac{B_i(t)\gamma_{ij}(t)}{N(t)x_{ij}(t)}\right) - \bar{u}(t)\right], \tag{18}$$

$$H^{c*}_i(\mathbf{x}(t), \lambda^{c}_{ij}(t), t) = \max\left\{H^{c}_i(\mathbf{x}(t), \boldsymbol{\gamma}_i(t), \boldsymbol{\gamma}_{-i}(t), \lambda^{c}_{ij}(t), t) \,\middle|\, \boldsymbol{\gamma}_i(t) \in [0, 1]^{K_i}\right\}, \tag{19}$$

and

$$\dot{\lambda}^{c}_{ij}(t) = \rho\lambda^{c}_{ij}(t) - \frac{\partial H^{c*}_i(\mathbf{x}(t), \lambda^{c}_{ij}(t), t)}{\partial x_{ij}(t)}. \tag{20}$$
According to (18), we can verify that the cooperative bandwidth allocation is a linear state optimal control problem. With methods similar to those used in the noncooperative case, we can obtain the cooperative optimal control $\gamma^{c}_{ij}(t)$ and accordingly the cooperative strategy profile $\Phi^{c}$.

Observation 1: In the noncooperative bandwidth allocation differential game defined in (5)-(7), the selfish behavior of service providers can also maximize the social welfare.

Proof: According to the first order condition, let $\partial H^{c}_i/\partial\gamma_{ij}(t) = 0$. We can obtain

$$\gamma^{c}_{ij}(t) = \frac{\lambda^{c}_{ij}(t)\,\delta\alpha}{2\theta_j B_i(t)N(t)}. \tag{21}$$

The adjoint equation of the cooperative case is derived as

$$\dot{\lambda}^{c}_{ij}(t) = \rho\lambda^{c}_{ij}(t) - \frac{\partial H^{c*}_i(\mathbf{x}(t), \lambda^{c}_{ij}(t), t)}{\partial x_{ij}(t)} = \rho\lambda^{c}_{ij}(t) + \frac{B(t)\,\delta\alpha\,\lambda^{c}_{ij}(t)}{N(t)} - P_{ij}N(t), \tag{22}$$

which has the same form as the adjoint equation of the noncooperative case, i.e., (16) with the derivative given in (11). Accordingly, the co-state variable satisfies $\lambda^{c}_{ij}(t) = \lambda_{ij}(t)$ for all $i \in \{1, 2, \ldots, M\}$, $j \in \{1, 2, \ldots, K_i\}$. Therefore, we obtain $\gamma^{*}_{ij}(t) = \gamma^{c}_{ij}(t)$, which shows that the selfish behavior of service providers can also maximize the social welfare.
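The control (15), which by Observation 1 also characterizes the cooperative case, can be evaluated numerically. The sketch below is a hedged illustration, not the authors' procedure. It assumes that $N$ and $B_i$ are constant, so that the only bounded solution of the adjoint equation (16) with (11) is the constant $\lambda_{ij} = P_{ij}N/(\rho + \alpha\delta B/N)$. It uses the parameter values listed in Section V-A. Because the interior solution (15) need not satisfy $\sum_j \gamma_{ij} = 1$ from (7), the sketch adds a Euclidean projection onto the simplex; this projection step and all function names are assumptions of this example.

```python
import numpy as np

def stationary_costate(price, n_users, total_bw, alpha, delta, rho):
    """Constant (bounded) solution of (16) with (11), assuming constant N and B:
    d(lambda)/dt = (rho + alpha*delta*B/N) * lambda - P*N = 0."""
    return price * n_users / (rho + alpha * delta * total_bw / n_users)

def interior_share(costate, theta, provider_bw, n_users, alpha, delta):
    """Interior first-order solution (15), ignoring the simplex constraint."""
    return costate * delta * alpha / (2.0 * theta * provider_bw * n_users)

def project_to_simplex(v):
    """Euclidean projection onto {g >= 0, sum(g) = 1} (assumption of this sketch)."""
    v = np.asarray(v, dtype=float)
    u = np.sort(v)[::-1]                      # sort in descending order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    last = np.nonzero(u - css / k > 0)[0][-1]  # largest index kept positive
    shift = css[last] / (last + 1)
    return np.maximum(v - shift, 0.0)

# Parameter values from Section V-A (bandwidths in Mbps).
N = 20.0
B = np.array([7.0, 5.0])                      # B_1 (WLAN), B_2 (WiMAX)
P = np.array([[0.2, 0.1], [0.3, 0.25]])       # P_ij, rows are providers
theta = np.array([0.01, 0.01])                # theta_j, one per class
alpha, delta, rho = 0.2, 0.6, 0.1

costate = stationary_costate(P, N, B.sum(), alpha, delta, rho)
gamma_unc = interior_share(costate, theta, B[:, None], N, alpha, delta)
gamma = np.array([project_to_simplex(row) for row in gamma_unc])
print(gamma_unc)
print(gamma)
```

In this sketch, the higher-priced service class of each provider receives the larger bandwidth share both before and after the projection.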
V. NUMERICAL STUDIES

A. Parameter Setting

We consider a heterogeneous wireless network where an IEEE 802.11b access point and an IEEE 802.16 base station each provide two service classes to the 20 users in area a, as shown in Fig. 1. The maximum saturation throughput of the IEEE 802.11b-based WLAN is assumed to be 7 Mbps. The available bandwidth of the IEEE 802.16 network for area a is assumed to be 5 Mbps when considering the bandwidth sharing of other users in the same cell. For convenience, the WLAN service provider and the WiMAX service provider are denoted by service provider 1 and service provider 2, respectively. Fixed connection fees for the two service classes of the two service providers are set to $P_{11} = 0.2$, $P_{12} = 0.1$, $P_{21} = 0.3$, and $P_{22} = 0.25$, respectively. For the replicator dynamics, we set the learning rate to $\delta = 0.6$. For the utility of users, we set $\alpha = 0.2$. The discounting rate and cost factors for the objective function of service providers are set to $\rho = 0.1$ and $\theta_1 = \theta_2 = 0.01$, respectively. The initial proportions of users choosing the two service classes of the two service providers are assumed to be $x_{11}(0) = 0.2$, $x_{12}(0) = 0.3$, $x_{21}(0) = 0.1$, and $x_{22}(0) = 0.4$, respectively.

B. Numerical Results

The dynamic behavior of service selection of users under the bandwidth allocation control is investigated, and the strategy adaptation trajectory from the initial selection distribution is shown in Fig. 2. The trajectory shows that the dynamics converge to a selection distribution in which every user in area a receives the same utility as the average utility of the population. According to the optimal control strategies, we observe that both service providers 1 and 2 allocate larger bandwidth to service class 1 due to its higher price. As a result, more users select service class 1, as shown in Fig. 2.

Fig. 2. Dynamics of service selection (proportion of users choosing services versus time, for service classes 1 and 2 of SP1 and SP2).

Due to the mobility of users, the total number of users in area a is time-varying. The impact of variations in the number of users on the optimal bandwidth allocation control is shown in Fig. 3. With an increasing number of users in area a, both access networks become congested. The service providers can control the congestion by dynamically adjusting the proportion of bandwidth allocated to the service class with the higher price. Also, due to the larger price difference between the service classes of service provider 1, we observe that the proportion of bandwidth allocated to service class 1 by service provider 1 is larger than that of service provider 2.

Fig. 3. Control strategies under different number of users (proportion of bandwidth allocated to service class 1 versus the number of users, for service providers 1 and 2).
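As a rough illustration of how the provider objective (5) can be evaluated for the parameter setting of Section V-A, the following Python sketch integrates the discounted instantaneous payoff along the state trajectory. It does not reproduce Fig. 2 or Fig. 3. The sketch assumes constant $N$ and $B_i$ and hypothetical fixed bandwidth shares `gamma` (not the equilibrium controls of the paper). Under these assumptions and the linear utility (1), the state equation (6) is a linear ODE with the closed-form solution used in `state_path`. The truncation horizon, grid size, and function names are also assumptions.

```python
import numpy as np

# Parameter values from Section V-A (bandwidths in Mbps).
N = 20.0
B = np.array([7.0, 5.0])                       # B_1 (WLAN), B_2 (WiMAX)
P = np.array([[0.2, 0.1], [0.3, 0.25]])        # P_ij, rows are providers
theta = np.array([0.01, 0.01])                 # theta_j, one per class
alpha, delta, rho = 0.2, 0.6, 0.1
x0 = np.array([[0.2, 0.3], [0.1, 0.4]])        # x_ij(0)

# Hypothetical fixed shares gamma_ij; each row sums to 1 as required by (7).
gamma = np.array([[0.6, 0.4], [0.6, 0.4]])

k = delta * alpha * B.sum() / N                # decay rate of the linear ODE
x_star = gamma * B[:, None] / B.sum()          # rest point of (6)
cost = theta * (gamma * B[:, None]) ** 2       # theta_j * (gamma_ij * B_i)^2

def state_path(t):
    """Closed-form solution of (6) for constant N, B_i and gamma."""
    return x_star + (x0 - x_star) * np.exp(-k * t)

def discounted_profit(t_end=400.0, steps=200_000):
    """Trapezoidal approximation of (5), truncated at t_end, per provider."""
    t = np.linspace(0.0, t_end, steps + 1)
    x = state_path(t[:, None, None])           # shape (steps + 1, 2, 2)
    inst = (P * N * x - cost).sum(axis=2)      # instantaneous payoff per provider
    f = np.exp(-rho * t)[:, None] * inst
    dt = t[1] - t[0]
    return dt * (f[1:-1].sum(axis=0) + 0.5 * (f[0] + f[-1]))

def closed_form_profit():
    """Exact value of (5) under the same assumptions, used as a cross-check."""
    revenue = P * N * (x_star / rho + (x0 - x_star) / (rho + k))
    return (revenue - cost / rho).sum(axis=1)

numeric = discounted_profit()
exact = closed_form_profit()
print(numeric, exact)
```

Comparing `numeric` with `exact` gives a simple sanity check on the integration grid before replacing the fixed shares with a time-varying control path.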
VI. SUMMARY

We have presented a two-level game framework based on a differential game and an evolutionary game for the optimal bandwidth allocation in heterogeneous wireless networks. The dynamic service selection behavior of users has been modeled as an evolutionary game, and the strategy evolution process has been analyzed using the replicator dynamics. The bandwidth allocation among different service classes considering users' dynamic service selection has been formulated as a linear state differential game. An open-loop Nash equilibrium is considered to be the solution of this differential game. In addition, we have considered the cooperative bandwidth allocation of service providers to maximize the aggregated profit. It has been shown that the open-loop Nash equilibrium can also maximize the social welfare.

REFERENCES

[1] D. Niyato and E. Hossain, "Dynamics of networks selection in heterogeneous wireless networks: An evolutionary game approach," IEEE Transactions on Vehicular Technology, vol. 58, no. 4, pp. 2008-2017, May 2009.
[2] J. P. Singh, T. Alpcan, P. Agrawal, and V. Sharma, "An optimal flow assignment framework for heterogeneous network access," in Proc. WoWMoM, June 2007, pp. 1-12.
[3] T. Alpcan, J. P. Singh, and T. Başar, "Robust rate control for heterogeneous network access in multihomed environments," IEEE Transactions on Mobile Computing, vol. 8, no. 1, pp. 41-51, January 2009.
[4] L. Lin, X. W. Zhou, L. P. Du, and X. N. Miao, "Differential game model with coupling constraint for routing in ad hoc networks," in Proc. WiCom, September 2009, pp. 3042-3045.
[5] C. U. Saraydar, N. B. Mandayam, and D. J. Goodman, "Pricing and power control in a multicell wireless data network," IEEE Journal on Selected Areas in Communications, vol. 19, no. 10, pp. 1883-1892, October 2001.
[6] S. Jørgensen, G. Martín-Herrán, and G. Zaccour, "Agreeability and time consistency in linear-state differential games," Journal of Optimization Theory and Applications, vol. 119, no. 1, pp. 49-63, October 2003.
[7] E. J. Dockner, S. Jørgensen, N. V. Long, and G. Sorger, Differential Games in Economics and Management Science. Cambridge Univ. Press, November 2000.
ACKNOWLEDGMENT

This work was done in the Centre for Multimedia and Network Technology (CeMNet) of the School of Computer Engineering, Nanyang Technological University.