Wireless Netw (2010) 16:1277–1288 DOI 10.1007/s11276-009-0202-1

Optimal network selection in heterogeneous wireless multimedia networks

Pengbo Si · Hong Ji · F. Richard Yu

Published online: 5 August 2009 © Springer Science+Business Media, LLC 2009

Abstract The complementary characteristics of different wireless networks make it attractive to integrate a wide range of radio access technologies. Most previous work on integrating heterogeneous wireless networks uses network layer quality of service (QoS), such as blocking probability and utilization, as the design criterion. However, from a user’s point of view, application layer QoS, such as multimedia distortion, is an important issue. In this paper, we propose an optimal distributed network selection scheme for heterogeneous wireless networks that takes multimedia application layer QoS into account. Specifically, we formulate the integrated network as a restless bandit system. With this stochastic optimization formulation, the optimal network selection policy is indexable, meaning that the network with the lowest index should be selected. The proposed scheme is applicable to both tight coupling and loose coupling scenarios in the integration of heterogeneous wireless networks. Simulation results are presented to illustrate the performance of the proposed scheme.

This work was supported in part by the National Science Foundation of China under Grants 60672124 and 60832009, the Hi-Tech Research and Development Program (National 863 Program) under Grant 2007AA01Z221, and the National Key Basic Research and Development Plan of China (973 Program) under Grant 2009CB320400.

F. R. Yu
Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada

P. Si (✉) · H. Ji
Key Laboratory of Universal Wireless Communication, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing, People’s Republic of China
e-mail: [email protected]

1 Introduction

In recent years, with the rapid growth of wireless communication systems, the complementary characteristics of different wireless networks have made it attractive to integrate a wide range of radio access technology (RAT) standards, including wireless wide area networks (WWANs) like cellular networks, wireless metropolitan area networks (WMANs) like WiMAX networks, and wireless local area networks (WLANs) like IEEE 802.11 based networks. Several internetworking architectures between cellular and other RAT systems have been proposed. Loose coupling and tight coupling are the two generic approaches to the integration specified by the European Telecommunications Standards Institute (ETSI) [1]. In the loose coupling approach, data flows from different types of networks go to the external IP network directly, and only signaling is required between cellular networks and other complementary networks. In the tight coupling approach, complementary networks communicate with the external network through the cellular networks. In addition, an internetworking architecture has been developed by the Third Generation Partnership Project (3GPP) to enable radio resource reuse between the networks as well as authentication, authorization and accounting (AAA) [2].

To improve the performance of heterogeneous networks and keep users always best connected (ABC) [3], a number of schemes have been proposed to deal with the network integration problems. Analytical hierarchy process (AHP) and grey relational analysis (GRA) are used in [4] to combine multiple network selection criteria and decide the weights of the criteria according to the user preferences and service applications. In [5–7], several resource management and admission control schemes are proposed for cellular/WLAN integrated networks. Authors of [8] propose an architectural


framework for network selection and a comprehensive decision making process to rank candidate networks for the users. An optimal joint session admission control scheme is proposed in [9] for integrated cellular/WLAN systems with vertical handoff. Game theory is introduced to heterogeneous networks in [10] for radio resource management, including bandwidth allocation and admission control. Authors of [11] propose a Markovian framework for the allocation of multiple services in multiple RATs and a model to embed the evaluation of several RAT selection policies considering different allocation criteria.

Although some work has been done to integrate heterogeneous wireless networks, most of it uses network layer quality of service (QoS), such as blocking probability and utilization, as the design criterion. Consequently, application layer QoS, such as distortion for multimedia applications, is largely ignored in the integration of heterogeneous wireless networks. However, multimedia applications, such as video telephony and surveillance, are very promising services that require more radio resources than other types of services in heterogeneous wireless networks. From a user’s point of view, QoS at the application layer is more important than that at other layers. Popular video compression standards, such as MPEG-4 and H.264, can adapt their quality (i.e., distortion) to different network conditions in heterogeneous wireless networks. Therefore, application layer QoS should be taken into account in the design of network selection schemes in heterogeneous wireless networks.

To the best of our knowledge, the design of optimal network selection in heterogeneous wireless networks considering multimedia application layer QoS has not been addressed in previous work. In this paper, we present a novel distributed scheme based on the stochastic optimization formulation of this problem.
Some distinct features of the proposed scheme are as follows.

• To improve the user experience as well as to reduce the cost, we consider multimedia application layer distortion and network access price in the network selection optimization. An application layer parameter, the intra-refreshing rate, is adapted in heterogeneous wireless networks.
• We formulate the heterogeneous wireless networks as a restless bandit system [12–15], which has been successfully applied in operations research and stochastic control problems.
• With the restless bandit approach, the optimal network selection policy is indexable, meaning that it simply selects the network with the lowest index.
• It is a fully distributed and scalable scheme, which is applicable to both tight coupling and loose coupling scenarios in the integration of heterogeneous wireless networks.


The rest of the paper is organized as follows. In Sect. 2, the heterogeneous networks are introduced. Application layer distortion is presented in Sect. 3. We formulate the network selection problem as a restless bandit problem in Sect. 4 and solve this problem in Sect. 5. In Sect. 6 the process of the selection scheme is described. Simulation results are presented in Sect. 7. Finally, we conclude this study in Sect. 8.

2 Network selection in heterogeneous wireless networks

In heterogeneous networks, a total of $N$ networks of multiple types cooperate to provide seamless coverage for universal wireless access [16, 17]. In this paper, we consider an area covered by three types of networks: WLANs, WiMAX networks and cellular networks. Each new session arriving in the area is to be associated with one network. In our proposed optimal network selection scheme, which uses the restless bandit approach, the different types of networks act as the bandits and cooperate to implement joint admission control that maximizes a reward combining both distortion and cost. According to the indexable rule of the restless bandit approach, the base stations of the networks share their state information with one another and calculate their own indices to make the optimal network selection decision.

2.1 Heterogeneous wireless networks

Generally speaking, there are two different ways of integrating heterogeneous wireless networks, defined as tight coupling and loose coupling interworking [9, 18, 19]. In a tightly coupled system, a network is connected to another network in the same manner as other radio access networks. The main advantage of this approach is that the existing mechanisms for authentication, mobility and QoS can be reused directly over another network. However, this approach requires modifications of the design to accommodate the increased traffic from the other network. In a loosely coupled system, the heterogeneous wireless networks are not connected directly. Instead, they are connected to the Internet. In this approach, different mechanisms and protocols are used to handle authentication, mobility and billing, and the traffic from one network does not go through the other network. Nevertheless, as peer IP domains, they can share the same subscriber database for functions such as security, billing, and customer management.

The proposed scheme is applicable to both tight coupling and loose coupling scenarios. Since radio resource is one of the most precious resources in wireless communications, admissible sets in different networks are introduced in the


following subsections. These admissible sets will be used in our formulation later.

2.2 Admissible set in WLANs

In IEEE 802.11e based wireless local area networks [20], throughput and delay are important QoS metrics. Authors of [21] derive the throughput of IEEE 802.11e, and an optimal operating point is determined in [22]. Assume that $n$ is the network number of a WLAN, and $g(n)$ is a vector representing the numbers of sessions of the different service types in network $n$. According to these results, we adopt the following admissible set for WLAN $n$:

$$S_n = \left\{ g(n) \in \mathbb{Z}_+^J : B_l(n) \ge TB_l(n),\ E_l(n) \le TE_l(n) \right\}, \tag{1}$$

where $B_l(n)$ and $E_l(n)$ are the throughput and delay of service type $l$ in network $n$, respectively, $TB_l(n)$ is the throughput constraint and $TE_l(n)$ is the delay constraint for sessions of service type $l$ in network $n$.

2.3 Admissible set in WiMAX networks

In this paper we assume that the WiMAX networks use a TDD scheme based on OFDM/TDMA, although WiMAX supports both TDD and FDD operations [23]. Thus, all subcarriers are allocated to one session at one time. To enhance the capacity and quality, adaptive modulation and coding are assumed. We describe the receiving SNR $j$ with the general Nakagami model. The receiving SNR is a random variable distributed according to the probability density function

$$p_j(j) = \frac{i^i\, j^{\,i-1}}{\bar{j}^{\,i}\, \Gamma(i)} \exp\left(-\frac{i\, j}{\bar{j}}\right),$$

where $i$ is the Nakagami fading parameter, $\bar{j}$ is the average value of the SNR $j$, and $\Gamma(i) := \int_0^\infty t^{i-1} e^{-t}\, dt$ is the Gamma function. The receiving SNR can be divided into $H + 1$ non-overlapping intervals with boundary points $\{j_h\}_{h = 0, 1, \ldots, H}$, where $H$ is the available number of transmission modes of WiMAX ($H = 7$ in IEEE 802.16), and $0 < j_0 < j_1 < \cdots < j_{H+1}$. Thus when the current SNR $j \in [j_h, j_{h+1})$, mode $h$ will be chosen; $h$ here is the mode index. If the SNR is too low to avoid possible transmission errors, i.e., $j < j_0$, no data is transmitted. Thus, the probability that mode $h$ is active is

$$P(h) = \int_{j_h}^{j_{h+1}} p_j(j)\, dj = \frac{\Gamma\left(i,\ i\, j_h / \bar{j}\right) - \Gamma\left(i,\ i\, j_{h+1} / \bar{j}\right)}{\Gamma(i)}, \tag{2}$$

where $\Gamma(i, x) := \int_x^\infty t^{i-1} e^{-t}\, dt$ is the complementary incomplete Gamma function. We use the average transmission rate as the capacity of the WiMAX network $n$ [24]:

$$C_n = \sum_{h=0}^{H+1} P(h)\, C(h), \tag{3}$$

where $C(h)$ is the capacity with mode $h$. Thus the admissible set of WiMAX network $n$ can be derived as [25]

$$S_n = \left\{ g(n) \in \mathbb{Z}_+^J : \sum_{l=1}^{L} U^l(n, t)\, W^l(n) \le C_n(t) \right\}, \tag{4}$$

where $C_n(t)$ is the capacity of WiMAX network $n$ at time $t$, $L$ is the total number of service types, $U^l(n, t)$ is the number of sessions of service type $l$ in network $n$ at time $t$, and $W^l(n)$ is the bandwidth required by a type $l$ service in WiMAX network $n$.

2.4 Admissible set in cellular networks

We take CDMA cellular networks with matched filter receivers as one type of network in the heterogeneous networks. One important physical layer QoS requirement for sessions of service type $l$ in CDMA cellular network $n$ is the signal-to-interference ratio (SIR), which should be kept above the target value $x_{n,l}$ [26]. Denote by $W$ the total cell bandwidth, $H_{n,l}$ the average data rate, $q$ the orthogonality factor, $r$ the ratio between intercell interference and total intracell power, $K_u$ the path loss of session $u$, $P_p$ the power of common control channels, and $P_N$ the power of background noise. To guarantee the SIR requirement, the minimum base station transmission power is

$$P_T = \frac{P_p + P_N \sum_{l=1}^{L} \sum_{u=1}^{U} \dfrac{K_u}{W / (x_{n,l} H_{n,l}) + q}}{1 - \sum_{l=1}^{L} \sum_{u=1}^{U} \dfrac{q + r}{W / (x_{n,l} H_{n,l}) + q}}.$$

Then the admissible set of CDMA network $n$ with matched filter receivers is [26]

$$S_n = \left\{ g(n) \in \mathbb{Z}_+^J : P_T \le P_T^{MAX} \right\},$$

where $P_T^{MAX}$ is the maximum available base station power.
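The admissible-set tests for the WiMAX and CDMA networks can be illustrated with a short sketch. It is not the authors' implementation: the function names are hypothetical, every session of a service type is assumed to share one path loss value, and the CDMA power expression is read as numerator `P_p + P_N * sum(K_u / (W/(x*H) + q))` over denominator `1 - sum((q + r) / (W/(x*H) + q))`. All quantities are assumed to be in linear units rather than dB.

```python
from typing import Optional, Sequence


def wimax_admissible(sessions: Sequence[int],
                     bandwidth: Sequence[float],
                     capacity: float) -> bool:
    """WiMAX test: the bandwidth requested by all sessions must fit the capacity."""
    return sum(u * w for u, w in zip(sessions, bandwidth)) <= capacity


def cdma_min_power(sessions: Sequence[int],
                   target_sir: Sequence[float],
                   rate: Sequence[float],
                   path_loss: Sequence[float],
                   W: float, q: float, r: float,
                   P_p: float, P_N: float) -> Optional[float]:
    """Minimum base-station power for a session mix (linear units).

    sessions[l] is the number of type-l sessions; path_loss[l] is one value
    shared by all type-l sessions (a simplifying assumption of this sketch).
    Returns None when the load term reaches 1, i.e. no finite power suffices.
    """
    noise_term = 0.0
    load = 0.0
    for u, x, h, k in zip(sessions, target_sir, rate, path_loss):
        spread = W / (x * h) + q
        noise_term += u * k / spread
        load += u * (q + r) / spread
    if load >= 1.0:
        return None
    return (P_p + P_N * noise_term) / (1.0 - load)


def cdma_admissible(power: Optional[float], P_max: float) -> bool:
    """CDMA test: the required power must exist and stay within the budget."""
    return power is not None and power <= P_max


# Hypothetical example: one VoIP (64 kbps) and one video (240 kbps) session.
example_power = cdma_min_power(
    sessions=[1, 1], target_sir=[10.0, 10.0], rate=[64e3, 240e3],
    path_loss=[1e12, 1e12], W=3.84e6, q=0.4, r=0.55, P_p=2.0, P_N=2.5e-14)
print(cdma_admissible(example_power, P_max=20.0))
```

The WLAN test of Sect. 2.2 would follow the same pattern once throughput and delay models for IEEE 802.11e are available; they are left out here because the text only cites them.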

3 Distortion optimization for multimedia transmission in heterogeneous networks

Recent advanced coding algorithms, such as H.264 and MPEG-4, use a rate control mechanism to control the video encoder output bit rate and an error resilience mechanism for error protection [27]. Intra-refreshing, also called intra-update, of macroblocks (MBs) is an important approach for rate control and error protection. An intra-coded MB does not need information from previous frames, which may already have been corrupted by channel errors. This makes intra-coded MBs an effective way to mitigate error propagation. In contrast, with inter-coded MBs, channel errors from the previous frame may propagate to the current frame.


In heterogeneous wireless networks, different networks provide different data rates and different link quality to the mobile users. Given a data rate in a network, authors in [27] provide a closed form distortion model taking into account varying characteristics of the input video, the coding algorithm, and the intra-refreshing rate. We will use this rate-distortion model in our study. The total distortion comprises the quantization distortion introduced by the lossy video encoder to meet a target bit rate and the distortion resulting from channel errors, which are presented in the following subsections.

3.1 Source distortion

The source distortion is given by

$$D_s(H_s, \xi) = D_s(H_s, 0) + \xi\,(1 - \eta + \eta\,\xi)\,[D_s(H_s, 1) - D_s(H_s, 0)],$$

where $H_s$ is the source coding rate, $\xi$ is the intra-refreshing rate, and $\eta$ is a constant based on the multimedia sequence. $D_s(H_s, 0)$ and $D_s(H_s, 1)$ are the time-averaged distortions with all-inter and all-intra mode selection for all frames, respectively:

$$D_s(H_s, 0) = \frac{1}{T} \sum_{k=0}^{T-1} \frac{1}{Y_k} \sum_{y=1}^{Y_k} D_s(H_s, 0, y), \qquad D_s(H_s, 1) = \frac{1}{T} \sum_{k=0}^{T-1} \frac{1}{Y_k} \sum_{y=1}^{Y_k} D_s(H_s, 1, y), \tag{5}$$

where $Y_k$ is the number of inter/intra frames at epoch $t_k$.

3.2 Channel distortion

According to the rate-distortion model [27], the average channel distortion is given by

$$D_c(w, \xi) = \frac{X_1}{1 - X_2 + X_2\,\xi} \cdot \frac{w}{1 - w} \cdot E[F_d(y, y-1)], \tag{6}$$

where $w$ is the packet loss rate, $X_1$ is the energy loss ratio of the encoder filter, $X_2$ is a constant based on the motion randomness of the multimedia data, and $E[F_d(y, y-1)]$ is the average value of the frame difference $F_d(y, y-1)$ over the epochs.

3.3 Optimal intra-refreshing rate

The total distortion is $D(H_s, w, \xi) = D_s(H_s, \xi) + D_c(w, \xi)$. Thus the optimal $\xi^*$ that minimizes the total distortion is given by

$$\xi^* = \arg\min_{\xi} D(H_s, w, \xi). \tag{7}$$

To deal with the time-varying wireless connection states of the networks, we use an adaptive intra-refreshing rate $\xi$ to achieve the minimum distortion. Decreasing $\xi$ reduces the source distortion $D_s$ for a target bit rate. However, inter-coding relies on information in previous frames, so a packet loss results in error propagation until the next intra-coded macroblock is received. Thus there is a tradeoff, and our aim is to find an optimized $\xi$ that minimizes the distortion by (7).

4 Restless bandit formulation

The restless bandit problem is an extension of the classical multiarmed bandit problem [28]. Each lever provides a reward when it is pulled. It is a model of a project that is trying to balance acquiring new knowledge against optimizing its decisions based on existing knowledge. The multiarmed bandit approach maximizes the overall discounted reward by striking this balance. As a special type of stochastic control problem, the multiarmed bandit problem allows $N$ parallel projects with finite state spaces to decide, in a distributed way, which one project will be active at each discrete time instant. An active project earns a reward and changes its state. A passive project does not change state and earns a passive reward. The aim is to maximize the total discounted reward earned over the time horizon, by determining the optimal policy that identifies which project should be active at each time point. According to the indexable rule of the multiarmed bandit problem, the optimal policy can be found by simply choosing the project with the largest index.

Although this is a relatively simple solution to the multiarmed bandit problem, in our network selection problem it is not realistic to allow only the active network to change state. The restless bandit problem is proposed to deal with this issue [12–15]. At each epoch $t_1, t_2, \ldots$, one or more projects out of $N$ can be active, and the states of all $N$ networks may change. We denote by $M$ the number of active projects at each epoch. A reward is earned at each time slot by each project. There is also an indexable rule for the restless bandit problem. Projects are selected to be active according to their indices, which are calculated by linear programming (LP) relaxation [13] based on the states, transition probabilities and rewards.

This restless bandit approach has been successfully used to solve clinical trial [12], project selection [13] and aircraft surveillance [14] problems, among others. In this paper, we use this approach to solve the network selection problem in heterogeneous networks.

4.1 Decision epochs and actions

In this paper, we set the decision epochs to be the set of session arrival and departure time points, because the states change when a session arrives or departs. Denote by $t_k$, $k = 0, 1, 2, \ldots$, the decision epochs, by $t'_k$ the arrival epochs, and by $t^*_k$ the departure epochs. The time intervals between two adjacent arrival epochs and two adjacent departure epochs are $(t'_k, t'_{k+1}]$ and $(t^*_k, t^*_{k+1}]$, respectively. Their durations are both exponentially distributed random variables, with the expected number of epochs per time unit, or traffic rate, equal to $\nu = \sum_{l=1}^{L} \nu_l$ and $\sum_{l=1}^{L} \sum_{n=1}^{N} U^l(n, t_k)\, \mu_l$, respectively, where $U^l(n, t_k)$ is the number of type $l$ sessions in network $n$ at epoch $t_k$, and $\nu_l$ and $\mu_l$ are the type $l$ session arrival and departure rates, respectively. Consequently, the time intervals between epochs $(t_k, t_{k+1}]$ are exponentially distributed with the expected number of epochs per time unit $\nu + \sum_{l=1}^{L} \sum_{n=1}^{N} U^l(n, t_k)\, \mu_l$. This is called the total traffic rate.

The action is the network selection decision at the current epoch. At each epoch $t_k$, one of the networks is selected to be active, meaning that it is ready to admit a new session at the next epoch $t_{k+1}$ if a new session arrives at $t_{k+1}$. At each arrival epoch $t'_k$, only the state of the network selected at the former epoch changes; at each departure epoch $t^*_k$, only the state of the network from which the session departs changes. For each network $n$ at epoch $t_k$,

$$a_n(t_k) = \begin{cases} 1, & \text{if network } n \text{ is active at epoch } t_k, \\ 0, & \text{if network } n \text{ is passive at epoch } t_k. \end{cases} \tag{8}$$

The actions satisfy $\sum_{n=1}^{N} a_n(t_k) = 1$.

4.2 State space and transition probabilities

The state of network $n$ at epoch $t_k$ is defined as $s(n, t_k) = [U^l(n, t_k)]_{l \in \{1, 2, \ldots, L\}}$, where $L$ is the number of service types. Thus the state space of network $n$ is the admissible set $S_n$. The state of network $n$ under action $a$ evolves according to a Markov chain with the transition probability $p_{i,j}^a(n)$ from state $s_i(n) = [u_i^l(n)]_{l \in \{1, 2, \ldots, L\}}$ to $s_j(n) = [u_j^l(n)]_{l \in \{1, 2, \ldots, L\}}$. Define the expected interval duration between two epochs for the state $s_i$ to be $\tau_i = E(t_{k+1} - t_k \mid s_i(n, t_k))$, which is the inverse of the total traffic rate:

$$\tau_i = \frac{1}{\nu + \sum_{l=1}^{L} U_i^l(n)\, \mu_l}.$$

Define the transition probability matrix of network $n$ with action $a$ to be $P^a(n) = [p_{i,j}^a(n)]_{S(n) \times S(n)}$, where $S(n)$ is the number of available states $s(n)$ of network $n$. Denote by $v(l)$, $1 \le l \le L$, the $L$-element row vector whose $l$th element is one and whose other elements are zero. The transition probabilities can then be represented as

$$p_{i,j}^a(n) = \begin{cases} \nu_l\, f(s_j(n))\, a\, \tau_i, & \text{if } s_j(n) = s_i(n) + v(l), \\ U_i^l(n)\, \mu_l\, \tau_i, & \text{if } s_j(n) = s_i(n) - v(l), \\ 1 - \nu_l\, f(s_j(n))\, a\, \tau_i - U_i^l(n)\, \mu_l\, \tau_i, & \text{if } s_j(n) = s_i(n), \\ 0, & \text{otherwise}, \end{cases} \tag{9}$$

where $f(x)$ is defined as

$$f(x) = \begin{cases} 1, & \text{if } x \in S_n, \\ 0, & \text{otherwise}. \end{cases} \tag{10}$$

4.3 System reward

The optimization goal is to maximize the total discounted reward, which is defined as

$$Z = \sum_{k=0}^{T-1} \sum_{u=1}^{U(t_k)} \beta^{T-k-1} R_u(t_k), \tag{11}$$

where $T$ is the number of epochs considered, $\beta$ is a discount factor, and $R_u(t_k)$ is the reward of session $u$ at epoch $t_k$:

$$R_u(D(u), B(u)) = \left[ -c_1 \lg(D(u)) - c_2 B(u) + c_3 \right] \tau_i, \tag{12}$$

where $D(u)$ is session $u$’s distortion, $B(u)$ is the price paid by session $u$, which is related to the current serving network, and $c_1 \ge 0$, $c_2 \ge 0$ and $c_3$ are constant coefficients. By adjusting the coefficients, the balance between distortion and price can be achieved.

Since sessions of the same service type in the same network have consistent properties, the distortion minimization for them will choose the same intra-refreshing rate and achieve the same minimized distortion. Besides, the costs of these sessions are also the same. Consequently, (11) can also be written as

$$Z = \sum_{k=0}^{T-1} \sum_{n=1}^{N} \sum_{l=1}^{L} \beta^{T-k-1}\, U^l(n, t_k)\, R^l(n), \tag{13}$$

where $R^l(n)$ is the reward of a session of type $l$ in network $n$. The objective of our problem is to maximize the total reward to achieve

$$Z^* = \max_{A \in \mathcal{A}} Z(A). \tag{14}$$

4.4 Indices and policies

The restless bandit approach has an indexable rule that reduces the computational complexity dramatically. For network $n$ in state $i_n$, we denote the index by $\delta_n(i_n)$. According to the restless bandit approach, the optimal policy $A^*$ is a set of optimal actions. Let the element of $A^*$ in row $n$ and column $k$ be $a_n^*(t_k)$, which represents the optimal action for network $n$ at epoch $t_k$; thus

$$a_n^*(t_k) = \begin{cases} 1, & \text{if } \delta_n \text{ is the smallest among } \{\delta_1, \delta_2, \ldots, \delta_N\}, \\ 0, & \text{otherwise}. \end{cases} \tag{15}$$

Define the set of all available policies to be $\mathcal{A} = \{A\}$. Thus $A^* = \arg\max_{A \in \mathcal{A}} Z(A)$. In our network selection problem, at each epoch, the network with the smallest index $\delta_n$ is set to be active, while the other networks are passive. At the next epoch, if a session arrives, the active network will admit the new session; if a session departs, only the corresponding network needs to perform the deassociation.
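The transition rule of Sect. 4.2 can be made concrete with a small sketch that builds one row of the transition matrix for a single network. It rests on two assumptions that are not stated in the text: the self-transition probability is taken as one minus the sum of all arrival and departure probabilities over the service types (so that every row sums to one), and the admissible set is supplied as a hypothetical predicate `admissible`. The arrival rates, departure rates and the capacity rule in the example are illustrative only.

```python
from typing import Callable, Dict, Sequence, Tuple

State = Tuple[int, ...]  # number of sessions of each service type


def transition_row(state: State, action: int,
                   nu: Sequence[float], mu: Sequence[float],
                   admissible: Callable[[State], bool]) -> Dict[State, float]:
    """Return {next_state: probability} for one state under action 0 or 1."""
    # Expected epoch duration: inverse of the total traffic rate.
    tau = 1.0 / (sum(nu) + sum(u * m for u, m in zip(state, mu)))
    row: Dict[State, float] = {}
    for l in range(len(state)):
        # Arrival of a type-l session: only the active network may admit it,
        # and only if the resulting state is in the admissible set.
        up = state[:l] + (state[l] + 1,) + state[l + 1:]
        if action == 1 and admissible(up):
            row[up] = nu[l] * tau
        # Departure of a type-l session: possible whether active or passive.
        if state[l] > 0:
            down = state[:l] + (state[l] - 1,) + state[l + 1:]
            row[down] = state[l] * mu[l] * tau
    # Remaining probability mass stays in the current state (assumption).
    row[state] = max(0.0, 1.0 - sum(row.values()))
    return row


# Hypothetical network that can hold at most four sessions in total.
at_most_four = lambda s: sum(s) <= 4
print(transition_row((1, 2), 1, [1.6, 3.2], [0.2, 0.2], at_most_four))
```

Building the full matrices for the active and passive actions then amounts to calling `transition_row` once per admissible state and action.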

5 Solving the restless bandit problem

The standard restless bandit problem allows $M$ out of $N$ objects to be active at epoch $t_k$. The reward $R^a(n)$ is earned by each object, with its state changing according to the transition probability matrix $P^a(n)$. The total reward is time-discounted by the discount factor $\beta$. The aim is to find the optimal policy $A \in \mathcal{A}$ that maximizes the expected reward $R(A)$.

5.1 Solving the restless bandit problem by LP relaxation

To solve the restless bandit problem, a hierarchy of increasingly stronger LP relaxations, the last of which is exact, is developed based on the result of LP formulations of Markov decision chains (MDCs) [13]. To formulate the problem, we first introduce the indicator $I_j^a(t_k)$: if action $a$ is taken at epoch $t_k$ in state $j$, $I_j^a(t_k) = 1$; otherwise $I_j^a(t_k) = 0$. With $I_j^a(t_k)$, let

$$x_j^a(A) = E_A\left[ \sum_{k=0}^{T-1} I_j^a(t_k)\, \beta^{T-k-1} \right] \tag{16}$$

represent the total discounted time that action $a$ is taken in state $j$ under policy $A$, where $E_A$ denotes expectation under policy $A$. Denote by $D = \{(i, a) : i \in S,\ a \in \{0, 1\}\}$ the state-action space. Consequently, (14) can be translated into

$$Z^* = \max_{A \in \mathcal{A}} \sum_{(i,a) \in D} R_{i_n}^a\, x_i^a(A), \tag{17}$$

where $R_{i_n}^a$ is the reward of network $n$ in state $i$ with action $a$. Let us introduce the performance vector $x(A) = (x_j^a(A))_{j \in S,\ a \in \{0,1\}}$ for every $A \in \mathcal{A}$. We can rewrite (17) as $Z^* = \max_{x \in X} \sum_{(i,a) \in D} R_{i_n}^a x_i^a$, where $X = \{x(A) : A \in \mathcal{A}\}$.

We decompose (16) for the two admissible actions:

$$x_{i_n}^1(A) = E_A\left[ \sum_{k=0}^{T-1} I_{i_n}^1(t_k)\, \beta^{T-k-1} \right], \qquad x_{i_n}^0(A) = E_A\left[ \sum_{k=0}^{T-1} I_{i_n}^0(t_k)\, \beta^{T-k-1} \right]. \tag{18}$$

Thus the restless bandit problem can be formulated as the linear program

$$Z^* = \max_{x \in X} \sum_{n \in \{1, 2, \ldots, N\}} \sum_{i_n \in S_n} \sum_{a_n \in \{0,1\}} R_{i_n}^{a_n}\, x_{i_n}^{a_n}, \tag{19}$$

where $X = \{ x = (x_{i_n}^{a_n}(A))_{i_n \in S_n,\ a_n \in \{0,1\},\ n \in \{1, 2, \ldots, N\}} \mid A \in \mathcal{A} \}$. The approach to solving this problem is to construct relaxations of the polytope $X$ that yield polynomial-size relaxations of the linear program. Denote by $\hat{X} \supseteq X$ the relaxations, which are defined not on the space of the original variables $x_i^a$ but in a higher-dimensional space that includes new auxiliary variables [13]. The first-order relaxation can now be formulated as the linear program

$$\begin{aligned} Z^1 = \max\ & \sum_{n \in \{1, 2, \ldots, N\}} \sum_{i_n \in S_n} \sum_{a_n \in \{0,1\}} R_{i_n}^{a_n}\, x_{i_n}^{a_n} \\ \text{subject to}\ & x_n \in Q_n^1, \quad n \in \{1, 2, \ldots, N\}, \\ & \sum_{n \in \{1, 2, \ldots, N\}} \sum_{i_n \in S_n} x_{i_n}^1 = \frac{1}{1 - \beta}. \end{aligned} \tag{20}$$

This linear program has $O(N |S_{\max}|)$ variables and constraints, where $|S_{\max}| = \max_{n \in \{1, 2, \ldots, N\}} |S_n|$, so its size is polynomial in the problem dimensions.

5.2 Primal-dual priority-index heuristic

In this subsection, we present a heuristic for the restless bandit problem that uses the information contained in optimal primal and dual solutions to the first-order relaxation. The primal-dual heuristic can be interpreted as a priority-index heuristic as well. The dual of (20) is

$$\begin{aligned} D^1 = \min_{\lambda}\ & \sum_{n \in \{1, 2, \ldots, N\}} \sum_{j_n \in S_n} \alpha_{j_n} \lambda_{j_n} + \frac{M}{1 - \beta}\, \lambda \\ \text{subject to}\ & \lambda_{i_n} - \beta \sum_{j_n \in S_n} p_{i_n j_n}^0 \lambda_{j_n} \ge R_{i_n}^0, \quad i_n \in S_n,\ n = 1, \ldots, N, \\ & \lambda_{i_n} - \beta \sum_{j_n \in S_n} p_{i_n j_n}^1 \lambda_{j_n} \ge R_{i_n}^1, \quad i_n \in S_n,\ n = 1, \ldots, N, \\ & \lambda \ge 0, \end{aligned} \tag{21}$$

where $M = 1$ in our problem, since exactly one network is active at each epoch.

We denote by $\{x_{i_n}^{a_n}\}$ and $\{\lambda_{i_n}, \lambda\}$ the optimal primal and dual solution pair to the first-order relaxation (20) and its


dual (21). Let $\{\gamma_{i_n}^{a_n}\}$ represent the corresponding optimal reduced cost coefficients:

$$\gamma_{i_n}^0 = \lambda_{i_n} - \beta \sum_{j_n \in S_n} p_{i_n j_n}^0 \lambda_{j_n} - R_{i_n}^0, \qquad \gamma_{i_n}^1 = \lambda_{i_n} - \beta \sum_{j_n \in S_n} p_{i_n j_n}^1 \lambda_{j_n} - R_{i_n}^1, \tag{22}$$

which must be nonnegative. Furthermore, $\gamma_{i_n}^0$ and $\gamma_{i_n}^1$ can be interpreted as the rates of decrease in the objective value of linear program (20) per unit increase in the value of the variables $x_{i_n}^0$ and $x_{i_n}^1$, respectively. We define a directed graph from the transition probabilities for each network $n \in \{1, 2, \ldots, N\}$: $G_n = (S_n, A_n)$, where $A_n = \{(i_n, j_n) \mid p_{i_n j_n}^0 > 0 \text{ and } p_{i_n j_n}^1 > 0,\ i_n, j_n \in S_n\}$. Thus, under the mixing assumption that $G_n$ is connected for every $n$, every extreme point $x$ of polytope $P^1$ has the following properties [13]:

1. There is at most one network $m$ and one state $i_m \in S_m$ for which $x_{i_m}^1 > 0$ and $x_{i_m}^0 > 0$.
2. For all other networks $n$ and all other states, either $x_{i_n}^1 > 0$ or $x_{i_n}^0 > 0$.

Based on the cost coefficients computed in (22), the index of network $n$ in state $i_n$ is defined as

$$\delta_{i_n} = \gamma_{i_n}^1 - \gamma_{i_n}^0. \tag{23}$$

The priority-index rule is to select the network that has the smallest index to be active. In case of ties, the network with $x_{i_n}^1 > 0$ is set active.

6 The process of the optimal scheme

In the proposed optimal scheme, at each epoch, a request from the session is sent to all the networks. If this is an arrival epoch, the new session is associated with the current active network, and an optimal intra-refreshing rate is selected for the transmission; if this is a departure epoch, a session leaves its network. Then every network calculates its own index based on the current state and shares it with the others in a distributed way. By comparing the indices, the network with the lowest index is selected to be the active one for the new session association decision at the next epoch.

6.1 Optimal network selection

The network selection is performed in a distributed and cooperative way, and can be divided into an off-line stage and an on-line stage. In the off-line stage, indices are calculated for all states and actions, and are stored in a table. In the on-line stage, a network looks up its table to find the index corresponding to the current state and action. The off-line computation is as follows.

1. According to the admissible sets of the networks and the session arrival/departure rates, the state space and the transition probability matrices under different actions are determined.
2. For each network $n$ and each possible state $i_n \in S_n$, input the state transition probability $p_{i_n j_n}^a$, the reward $R_{i_n}^a$, the discount factor $\beta$ and the initial state probability vector $\alpha$, then compute off-line the finite set of indices $\{\delta_{i_n}\}$ according to (21)–(23). Store these indices and the corresponding $p_{i_n j_n}^a$, $R_{i_n}^a$ and $\alpha$ in a table.

After the off-line initialization, at epoch $t_k$, the on-line computation is as follows:

1. Denote by $n_a$ the current active network. If this is an arrival epoch and the active network $n_a$ is able to admit the newly arrived session according to $S_{n_a}$, $n_a$ admits the session and updates its state $s_{n_a}$. If this is an arrival epoch but the active network $n_a$ is not able to admit the new session, the new session is rejected. If this is a departure epoch, the session leaves the associated network, and the state of that network is updated.
2. Each network $n$ shares its state $s_n$, as the initial state probability vector $\alpha$, with the others.
3. With $\alpha$, each network looks up the index table to find the corresponding index $\delta_{i_n}$.
4. The networks share their indices $\delta_{i_n}$ in a distributed way.
5. Each network arranges the list of indices from the lowest to the highest. A network is set to be active if its index is in the first place.
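The off-line index table and the on-line lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the dual values `lam` are taken as given (in practice they would come from an LP solver applied to the first-order relaxation, which is not shown), the function names and the two-state example are hypothetical, and the tie-breaking rule based on the primal solution is not modeled.

```python
from typing import Dict, List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def reduced_costs(lam: Sequence[float], P: Matrix,
                  R: Sequence[float], beta: float) -> List[float]:
    """gamma_i = lam_i - beta * sum_j p_ij * lam_j - R_i for one action."""
    n = len(lam)
    return [lam[i] - beta * sum(P[i][j] * lam[j] for j in range(n)) - R[i]
            for i in range(n)]


def index_table(lam: Sequence[float], P0: Matrix, P1: Matrix,
                R0: Sequence[float], R1: Sequence[float],
                beta: float) -> List[float]:
    """Off-line stage: index delta_i = gamma1_i - gamma0_i for every state."""
    g0 = reduced_costs(lam, P0, R0, beta)
    g1 = reduced_costs(lam, P1, R1, beta)
    return [b - a for a, b in zip(g0, g1)]


def select_active(current_state: Dict[str, int],
                  tables: Dict[str, Sequence[float]]) -> Tuple[str, Dict[str, float]]:
    """On-line stage: every network looks up its index; the smallest one wins."""
    indices = {n: tables[n][s] for n, s in current_state.items()}
    return min(indices, key=indices.get), indices


# Hypothetical two-state network with made-up dual values.
wlan_table = index_table(
    lam=[1.0, 2.0],
    P0=[[1.0, 0.0], [0.0, 1.0]], P1=[[0.0, 1.0], [0.0, 1.0]],
    R0=[0.25, 1.0], R1=[0.0, 0.5], beta=0.5)
tables = {"wlan": wlan_table, "wimax": [0.1, 0.3]}
active, shared = select_active({"wlan": 1, "wimax": 0}, tables)
print(active, shared)
```

Because the table is computed once, the on-line cost per epoch is one lookup per network plus a comparison of the shared indices.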

6.2 Optimal intra-refreshing rate

Given the source-coding bit rate $H_s$ and the packet loss rate $w$ for a session of type $l$, the intra-refreshing rate $\xi$ is optimized off-line for different situations according to (7) to minimize the total distortion. Thus the minimized distortion $D^* = D(H_s, w, \xi^*)$ can be calculated as a part of the reward $R_{n,l}$. This reward is used for the policy optimization.
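A minimal sketch of this off-line optimization is a grid search over the intra-refreshing rate (written `xi` below) that minimizes the sum of the source distortion of Sect. 3.1 and the channel distortion of Sect. 3.2. The grid search itself is an assumption, since the paper does not say how (7) is solved. The all-inter and all-intra distortions `d_inter` and `d_intra` and the sequence constant `eta` are hypothetical values; the defaults for `x1`, `x2` and `frame_diff` follow the simulation settings of Sect. 7.

```python
from typing import Tuple


def source_distortion(xi: float, d_inter: float, d_intra: float, eta: float) -> float:
    """D_s = D_s(H_s,0) + xi * (1 - eta + eta*xi) * (D_s(H_s,1) - D_s(H_s,0))."""
    return d_inter + xi * (1.0 - eta + eta * xi) * (d_intra - d_inter)


def channel_distortion(xi: float, w: float, x1: float, x2: float,
                       frame_diff: float) -> float:
    """D_c = X1 / (1 - X2 + X2*xi) * w / (1 - w) * E[F_d(y, y-1)]."""
    return x1 / (1.0 - x2 + x2 * xi) * w / (1.0 - w) * frame_diff


def optimal_intra_refresh(w: float, d_inter: float, d_intra: float, eta: float,
                          x1: float = 0.001, x2: float = 1.0,
                          frame_diff: float = 100.0,
                          steps: int = 1000) -> Tuple[float, float]:
    """Return (xi*, D*) found on a uniform grid over (0, 1]."""
    def total(xi: float) -> float:
        return (source_distortion(xi, d_inter, d_intra, eta)
                + channel_distortion(xi, w, x1, x2, frame_diff))

    # xi = 0 is skipped because the channel term is singular there when X2 = 1.
    grid = [k / steps for k in range(1, steps + 1)]
    best = min(grid, key=total)
    return best, total(best)


# Hypothetical session: 10% packet loss, all-inter MSE 4, all-intra MSE 10.
xi_star, d_star = optimal_intra_refresh(w=0.1, d_inter=4.0, d_intra=10.0, eta=0.5)
print(xi_star, d_star)
```

Running the search once per service type and network yields the minimized distortion that enters the reward, so no optimization is needed at the decision epochs.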

7 Simulation results and discussions

In this section, extensive simulation results are presented to show the performance of the proposed optimal network selection scheme. Video and VoIP session arrivals are Poisson distributed with expected rates $\nu_1$ and $\nu_2$, respectively. Session departures are also Poisson distributed, with expected rate $\mu$. The area considered is covered by three networks: a WLAN, a WiMAX network and a cellular network. We adopt the parameters shown in Table 1 [7, 9, 29]. The data rate of the VoIP service is 64 Kbps, while the video service data rate varies between networks. In the WiMAX and WLAN networks, it is 1.17 Mbps. In the cellular network, it is 240 Kbps because of the relatively low bandwidth of the cellular network. For the cost of network services, we use current prices in New York City: 3G mobile Internet access provided by AT&T costs $60 per month with a limit of 5 GB; WiMAX broadband access by Sprint/Nextel costs $59.99 per month, also with a 5 GB limit; WiFi access by AT&T costs $19.95 per month. Assume $c_1 = c_2 = 0.5$ and $c_3 = 34$ in (12). Moreover, since the price differs between regions, we also change the WiMAX and cellular network access prices and present the performance with different price ratios. In the distortion model, we assume that $X_1 = 0.001$, $X_2 = 1.0$ and $E[F_d(y, y-1)] = 100$ in (6).

The existing network selection scheme ignores the optimization of application layer QoS and the optimal network selection from the viewpoint of the users, i.e., a fixed intra-refreshing rate is used, and the network selection is based solely on the signal strength, or SNR, without consideration of application layer QoS requirements. In this simulation, we compare the performance of our proposed optimal scheme with the existing scheme with a fixed intra-refreshing rate $\xi = 0.7$.

Table 1 Parameters of the networks

Parameter                               Value
Cellular
  Target SIR value                      10 dB
  Total cell bandwidth                  3.84 Mc/s
  Orthogonality factor                  0.4
  Inter/intra cell interference ratio   0.55
  Common control channel power          33 dB
  Background noise power                -106 dB
  Maximum power of BS                   43 dB
WiMAX
  System bandwidth                      7 MHz
  Number of carriers                    256
  Number of data carriers               192
  Sampling factor                       8/7
  Guard period ratio                    1/4
  Average SNR                           15 dB
WLAN
  Average channel bit rate              11 Mbps
  Propagation delay                     1 μs
  Slot time                             10 μs
  Time to transmit a PHY header         48 μs
  Time to transmit a MAC header         25 μs
  Time to transmit a RTS                15 μs
  Time to transmit a CTS                10 μs
  Time to transmit an ACK               10 μs
  AIFSN                                 1
  Maximum contention window             32

Fig. 1 The video distortion with different intra-refreshing rate $\xi$ (vertical axis: Distortion (MSE); horizontal axis: Intra-Refreshing Rate; curves: WLAN, WiMAX Network, Cellular Network)

7.1 Optimal intra-refreshing rate

These three different types of wireless networks provide different data rates and different link quality to the mobile user. From a user’s point of view, application layer
distortion is more important than the QoS at other layers. In Fig. 1, we plot the application layer distortion, which is the mean square error between the original and decoded video frames, with different intra-refreshing rate n. To minimize the distortion, the optimal intra-refreshing rate can be optimized for the WLAN, WiMAX network and cellular network, which is shown in Fig. 2. 7.2 Reward along the time line At each epoch, a network selection decision is made and the number of associated sessions is updated. To illustrate the dynamics of the system, we plot the session number, which is an average value of 2000 trials, in Fig. 3. In the initial state, there are two VoIP and one video sessions in

Wireless Netw (2010) 16:1277–1288

1285

1

300

250

0.8 0.7

200

0.6

Reward

Optimal Intra−Refreshing Rate

0.9

0.5 0.4

150

100 0.3 0.2

Optimal Scheme Existing Scheme

50

0.1

0

0 WLAN

WiMAX Network

Cellular Network

Fig. 2 The optimal intra-refreshing rate for different networks

40

60

80

100

Time (min)

shown that because of the optimized network selection scheme and intra-refreshing rate n, the optimal scheme improves the reward significantly compared to the existing scheme. Total WLAN WiMAX Network Cellular Network .

15

10

5

0

20

40

60

80

100

Time (min)

Fig. 3 The average number of sessions along the time

the networks. Assume l = 0.2, m1 = 1.6 and m2 = 3.2. From Fig. 3, we can see that the number of sessions goes up first and becomes converged after about 60 min, when the balance between the expected numbers of sessions departs and arrives is achieved, and the total session number does not change dramatically any more. We can also observe that the WLAN, which provides the highest reward is more likely to be selected when it is not saturated. After WLAN’s saturation, the WiMAX network whose reward is higher than the cellular network but lower than the WLAN becomes the first choice. Each session in one of the networks earns its reward at each epoch. In Fig. 4, we compare the total reward of the heterogeneous networks with another scheme, in which no application layer distortion is considered and each individual network is optimized separately. The rewards increase first, and converge after about 60 min in the same manner as the curve of session numbers in Fig. 3. It is

7.3 Reward with different traffic rates

In this subsection, we present the effect of the traffic rate on the expected reward. We take the average reward after 60 min (the convergence time point) on the time line as the expected reward. As shown in Fig. 5, as the session departure rate l increases, the number of sessions in the networks decreases and, consequently, the reward decreases. We set m1 = 1.6 and m2 = 3.2 in this simulation. We can see that the reward of the optimal scheme is always higher than that of the existing scheme.
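The link between the departure rate and the session number can be made concrete with a simple balance argument. The sketch below is only an illustration under assumptions that are not taken from the paper: l is treated as a per-session departure rate, m1 and m2 as session arrival rates in the same time unit, and admission limits (network saturation) are ignored. The session count at which expected arrivals equal expected departures is then (m1 + m2) / l. The function name `balance_sessions` is hypothetical.

```python
def balance_sessions(arrival_rates, departure_rate):
    """Session count at which expected arrivals equal expected departures.

    Assumes an uncapped system: total arrival rate = departure_rate * sessions.
    """
    if departure_rate <= 0:
        raise ValueError("departure_rate must be positive")
    return sum(arrival_rates) / departure_rate


# Arrival rates quoted in the text: m1 = 1.6 (video), m2 = 3.2 (VoIP)
M1, M2 = 1.6, 3.2

# Sweep the departure rate l over the range plotted in Fig. 5
for dep in (0.1, 0.2, 0.5, 1.0):
    print(dep, round(balance_sessions((M1, M2), dep), 2))
```

In this simplified model the balance point falls as l grows, which is consistent with the decreasing session number described above; the simulated system additionally limits the sessions each network can carry, so its actual session counts may differ.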

Fig. 4 The reward comparison along the time (x-axis: time (min); y-axis: reward; curves: optimal scheme, existing scheme)

Fig. 5 The expected reward comparison with different session departure rate l (x-axis: user departure rate; y-axis: expected reward; curves: optimal scheme, existing scheme)

In Fig. 6, the reward comparison for different video session arrival rates m1 is presented. Assume l = 0.2 and m2 = 3.2. Since the total session number increases as m1 increases, the total reward also goes up for the existing scheme. However, a larger m1 also means relatively more video sessions, which consume much more resources. Therefore, in the optimal scheme, the resources of the network that provides the highest reward could be used up by video sessions very quickly without obtaining a high reward. That is why the reward of the optimal scheme does not change dramatically with the video session arrival rate m1. Nevertheless, across the values of m1 considered, the optimal scheme performs much better than the existing scheme.

Fig. 6 The expected reward comparison with different video session arrival rate m1 (x-axis: video user arrival rate; y-axis: expected reward; curves: optimal scheme, existing scheme)

The situation is different in Fig. 7, in which the reward with different VoIP session arrival rates is shown. Assume l = 0.2 and m1 = 1.6. Increasing m2 with a fixed m1 is equivalent to increasing the proportion of VoIP sessions, which consume fewer resources. Thus the reward increases for both schemes. We can observe from the figures that the optimal scheme improves the reward significantly across the VoIP session arrival rates m2 considered.

Fig. 7 The expected reward comparison with different VoIP session arrival rate m2 (x-axis: VoIP user arrival rate; y-axis: expected reward; curves: optimal scheme, existing scheme)

7.4 Reward with different price ratios

The prices of network access services are decided by the operators, and thus differ dramatically across regions and over time. We show the performance of the proposed scheme with different service prices. In Fig. 8, the reward comparison is presented with fixed WLAN and cellular network prices and a variable WiMAX price. The reward is plotted against the WLAN/WiMAX price ratio. As the WiMAX price decreases (i.e., the WLAN/WiMAX price ratio increases), the total reward goes up; the optimal scheme always chooses the best network for each newly arriving session and thus obtains a higher reward.

Fig. 8 The expected reward with different WLAN/WiMAX price ratio, with fixed WLAN and cellular network price and variable WiMAX price (x-axis: price ratio WLAN/WiMAX; y-axis: expected reward; curves: optimal scheme, existing scheme)

We then vary the WiMAX/cellular network price ratio, with fixed WLAN and WiMAX prices and a variable cellular network price. As shown in Fig. 9, the expected reward also grows as the price ratio increases, i.e., as the price of the cellular network decreases. Our optimal scheme performs the optimal action at each epoch to maximize the reward under different price conditions.

Fig. 9 The expected reward with different WiMAX/cellular network price ratio, with fixed WLAN and WiMAX price and variable cellular network price (x-axis: price ratio WiMAX/cellular; y-axis: expected reward; curves: optimal scheme, existing scheme)
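For orientation, the price ratios on the horizontal axes can be related to the monthly prices quoted in the simulation setup ($60 for 3G cellular access, $59.99 for WiMAX, $19.95 for WiFi). The short sketch below computes the two baseline ratios from those figures; the dictionary keys and the helper `price_ratio` are illustrative names, not part of the paper.

```python
# Monthly access prices in USD, as quoted in the simulation setup
MONTHLY_PRICE_USD = {
    "cellular": 60.00,  # 3G mobile Internet access
    "wimax": 59.99,     # WiMAX broadband access
    "wlan": 19.95,      # WiFi access
}


def price_ratio(prices, numerator, denominator):
    """Ratio of two access prices, e.g. WLAN/WiMAX."""
    return prices[numerator] / prices[denominator]


wlan_over_wimax = price_ratio(MONTHLY_PRICE_USD, "wlan", "wimax")
wimax_over_cellular = price_ratio(MONTHLY_PRICE_USD, "wimax", "cellular")
print(round(wlan_over_wimax, 3), round(wimax_over_cellular, 3))
```

With the quoted prices, the baseline WLAN/WiMAX ratio is about 0.33 and the baseline WiMAX/cellular ratio is about 1.0. Lowering the WiMAX price raises the first ratio, and lowering the cellular price raises the second.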

8 Conclusions and future work

In this paper, we have proposed an optimal network selection scheme in heterogeneous wireless networks considering multimedia application layer QoS. An application layer parameter, the intra-refreshing rate, is adapted in heterogeneous wireless networks. The integrated network is modeled as a restless bandit system. We have presented an indexable optimal network selection policy. Simulation results show that application layer QoS has an impact on the system performance, and that the proposed scheme can improve the performance significantly.

Network selection is a complex procedure, in which a number of factors should be considered in practice. In this paper, we consider the application layer distortion and the access prices of different networks. It would be interesting to consider other factors in our framework to design indexable network selection schemes.

Acknowledgment We thank the reviewers for their detailed reviews and constructive comments, which have helped to improve the quality of this paper.

References

1. ETSI. (2001). Requirements and architectures for interworking between HIPERLAN/2 and 3rd generation cellular systems. Tech. Rep. ETSI TR 101 957, Aug.
2. 3GPP TS 23.234, v.6.2.0. (2004). Group services and system aspects; 3GPP systems to wireless local area network (WLAN) interworking; system description (release 6), Sept.
3. Gustafsson, E., & Jonsson, A. (2003). Always best connected. IEEE Wireless Communications, 10(1), 49–55.
4. Song, Q., & Jamalipour, A. (2005). Network selection in an integrated wireless LAN and UMTS environment using mathematical modeling and computing techniques. IEEE Wireless Communications, 12(3), 42–48.
5. Song, W., Jiang, H., Zhuang, W., & Shen, X. (2005). Resource management for QoS support in cellular/WLAN interworking. IEEE Network, 19(5), 12–18.
6. Song, W., Jiang, H., & Zhuang, W. (2007). Performance analysis of the WLAN-first scheme in cellular/WLAN interworking. IEEE Transactions on Wireless Communications, 6(5), 1932–1952.
7. Song, W., Cheng, Y., & Zhuang, W. (2007). Improving voice and data services in cellular/WLAN integrated networks by admission control. IEEE Transactions on Wireless Communications, 6(11), 4025–4037.
8. Bari, F., & Leung, V. (2007). Automated network selection in a heterogeneous wireless network environment. IEEE Network, 21(1), 34–40.
9. Yu, F., & Krishnamurthy, V. (2007). Optimal joint session admission control in integrated WLAN and CDMA cellular networks with vertical handoff. IEEE Transactions on Mobile Computing, 6(1), 126–139.
10. Niyato, D., & Hossain, E. (2008). A noncooperative game-theoretic framework for radio resource management in 4G heterogeneous wireless access networks. IEEE Transactions on Mobile Computing, 7(3), 332–345.
11. Gelabert, X., Pérez-Romero, J., Sallent, O., & Agustí, R. (2008). A Markovian approach to radio access technology selection in heterogeneous multiaccess/multiservice wireless networks. IEEE Transactions on Mobile Computing, 7(10), 1257–1270.
12. Whittle, P. (1988). Restless bandits: Activity allocation in a changing world. In J. Gani (Ed.), A celebration of applied probability (vol. 25 of J. Appl. Probab., pp. 287–298). Sheffield: Applied Probability Trust.
13. Bertsimas, D., & Niño-Mora, J. (2000). Restless bandits, linear programming relaxations, and a primal dual index heuristic. Operations Research, 48(1), 80–90.
14. Ny, J. L., Dahleh, M., & Feron, E. (2006). Multi-agent task assignment in the bandit framework. In Proceedings of the 45th IEEE Conference on Decision and Control (pp. 5281–5286). San Diego, California.
15. Ny, J. L., & Feron, E. (2006). Restless bandits with switching costs: Linear programming relaxations, performance bounds and limited lookahead policies. In Proceedings of the 2006 American Control Conference (pp. 1587–1592). Minneapolis, Minnesota.
16. Guo, G., Guo, Z., Zhang, Q., & Zhu, W. (2004). A seamless and proactive end-to-end mobility solution for roaming across heterogeneous wireless networks. IEEE Journal on Selected Areas in Communications, 12(5), 834–848.
17. Moon, K., Lee, Y., Son, Y., & Kim, C. (2003). Universal home network middleware guaranteeing seamless interoperability among the heterogeneous home network middleware. IEEE Transactions on Consumer Electronics, 49(3), 546–553.
18. Salkintzis, A. K. (2004). Interworking techniques and architectures for WLAN/3G integration toward 4G mobile data networks. IEEE Wireless Communications, 11, 50–61.
19. Buddhikot, M., Chandranmenon, G., Han, S., Lee, Y. W., Miller, S., & Salgarelli, L. (2003). Integration of 802.11 and third-generation wireless data networks. In Proceedings of IEEE INFOCOM'03, San Francisco, CA, Apr.
20. ANSI/IEEE Std. 802.11e, Draft 5.0 (2003). Wireless medium access control (MAC) and physical layer (PHY) specification: Medium access control (MAC) enhancement for quality of service (QoS), July.
21. Kuo, Y., Lu, C., Wu, E., & Chen, G. (2003). An admission control strategy for differentiated services in IEEE 802.11. In Proceedings of IEEE Globecom'03 (pp. 707–712). San Francisco, CA, Dec.
22. Zhu, H., & Chlamtac, I. (2006). A call admission and rate control scheme for multimedia support over IEEE 802.11 wireless LANs. Wireless Networks, 12, 451–463.
23. IEEE Std. 802.16-2004 (2004). IEEE standard for local and metropolitan area networks, part 16: Air interface for fixed broadband wireless access systems, Oct.
24. Liu, Q., Zhou, S., & Giannakis, G. B. (2005). Queuing with adaptive modulation and coding over wireless links: Cross-layer analysis and design. IEEE Transactions on Wireless Communications, 4(3), 1142–1153.
25. Elwalid, A. I., & Mitra, D. (1993). Effective bandwidth of general Markovian traffic sources and admission control of high speed networks. IEEE/ACM Transactions on Networking, 1(3), 329–343.
26. Holma, H., & Toskala, A. (2004). WCDMA for UMTS: Radio access for third generation mobile communications. NY: Wiley.
27. He, Z., Cai, J., & Chen, C. (2002). Joint source channel rate-distortion analysis for adaptive mode selection and rate control in wireless video coding. IEEE Transactions on Circuits and Systems for Video Technology, 12(6), 511–523.
28. Robbins, H. (1952). Some aspects of the sequential design of experiments. Bulletin of the American Mathematical Society, 55, 527–535.
29. Zhang, S., Yu, F., & Leung, V. (2008). Joint connection admission control and routing in IEEE 802.16-based mesh networks. In Proceedings of IEEE International Conference on Communications (ICC'08) (pp. 4938–4942), Beijing, P.R. China, May.

Author Biographies

Pengbo Si received the B.S. degree in Communications Engineering and the Ph.D. degree in Communication and Information System from Beijing University of Posts and Telecommunications, Beijing, P.R. China, in 2004 and 2009, respectively. From November 2007 to November 2008, he visited Carleton University, Ottawa, ON, Canada and the University of British Columbia, Vancouver, BC, Canada as a visiting scholar. He joined Beijing University of Technology in June 2009, where he is currently a lecturer. His research interests include heterogeneous networks, cognitive radio, radio resource management and optimal distributed scheduling algorithms. Dr. Si served on the Technical Program Committee (TPC) of the Cognitive Wireless Communications and Networking (CWCN) 2009 Workshop at the International Conference on Ultra Modern Telecommunications (ICUMT) 2009.

Hong Ji received the B.S. degree in Communications Engineering, and the M.S. and Ph.D. degrees in Information and Communications Engineering, from Beijing University of Posts and Telecommunications, Beijing, China, in 1989, 1992 and 2002, respectively. From June to December 2006, she visited the University of British Columbia, Vancouver, Canada as a visiting scholar. She is currently a professor at Beijing University of Posts and Telecommunications. She is also involved in national research projects, including the Hi-Tech Research and Development Program of China (863 Program) and the National Natural Science Foundation of China (NSFC). Her research interests include heterogeneous networks, P2P protocols and cognitive radio.

F. Richard Yu received the Ph.D. degree in Electrical Engineering from the University of British Columbia (UBC) in 2003. From 2002 to 2004, he was with Ericsson (in Lund, Sweden), where he worked on the research and development of 3G cellular networks. From 2005 to 2006, he was with a start-up in California, USA, where he worked on research and development in the areas of advanced wireless communication technologies and new standards. He joined the Carleton School of Information Technology and the Department of Systems and Computer Engineering at Carleton University in 2006, where he is currently an Assistant Professor. His research interests include cross-layer design, QoS provisioning and security in wireless networks. He has served on the Technical Program Committee (TPC) of numerous conferences and as the TPC Co-Chair of IEEE IWCMC'2009, VTC'2008F Track 4, and WiN-ITS'2007. He is a senior member of the IEEE.
