A Correlated-Equilibrium-based Subcarrier Allocation Scheme for Interference Minimization in Multi-cell OFDMA Systems
Jianchao Zheng1,2, Yueming Cai1,2, Dan Wu1,2
1. Institute of Communications Engineering, PLA University of Science and Technology, Nanjing 210007, China.
2. National Mobile Communications Research Laboratory, Southeast University, Nanjing 210096, China.
E-mail:
[email protected],
[email protected],
[email protected]
Abstract: Interference management is key to system performance in next-generation wireless networks. In this paper, a distributed cooperation policy selection scheme for interference minimization is proposed to perform subcarrier assignment in uplink multi-cell OFDMA systems. The scheme adopts the correlated equilibrium as its solution concept, which achieves better performance by allowing each user to consider the joint distribution of the users' actions. The proposed algorithm has low complexity and converges to the set of correlated equilibria with probability one. Simulation results demonstrate that the proposed algorithm converges quickly, reduces interference substantially and achieves good fairness.
Keywords: OFDMA; multi-cell; interference minimization; subcarrier allocation; correlated equilibrium
I. INTRODUCTION
Orthogonal frequency division multiple access (OFDMA) has emerged as one of the most promising multiple access techniques for high data rate transmission over wireless channels, owing to its ability to mitigate multipath fading and its efficient implementation using IFFT and FFT blocks. For example, the most recently proposed next-generation wireless wide area network (WWAN) standards, namely 3GPP2 ultra mobile broadband (UMB), IEEE 802.20 mobile broadband wireless access (MBWA), 3GPP LTE and worldwide interoperability for microwave access (WiMAX), are all OFDMA based [1]. To take advantage of the multi-user diversity resulting from the variation in channel conditions among users, resources such as subcarriers, bits and power must be allocated efficiently, which is a challenging problem [2]. In a multi-cell environment, one of the major research issues is how to maximize performance by controlling the co-channel interference among neighboring cells [3].

Future wireless network evolutions are envisioned to employ full frequency reuse [1]. Hence, co-channel interference originating from neighboring cells is a major impairment that limits the system throughput [4]. Since the co-channel interference is largely affected by the subcarrier assignment, any change of subcarrier allocation in a specific cell will affect the performance of the nearby cells [5]. Consequently, joint subcarrier allocation for interference management over a cluster of neighboring cells via base-station coordination is a promising solution. However, most recent works focus on downlink systems, and the design of distributed uplink subcarrier allocation algorithms that require limited information exchange among base stations is still in its infancy [4].

Moreover, most of the existing game-theoretic works are based on the concept of Nash equilibrium [6]. However, the Nash equilibrium might not be system efficient, and the performance of the game outcome can still be improved. Unlike the Nash equilibrium, in which each user only considers its own strategy, the correlated equilibrium, which was first proposed by Robert J. Aumann in [7] and further studied in [8–11], achieves better performance by allowing each user to consider the joint distribution of the users' actions. In other words, each user needs to consider the others' behaviors to see if there are mutual benefits to explore [12]. The correlated equilibrium proves to be a better solution than the non-cooperative Nash equilibrium in many cases [12].

This paper studies subcarrier allocation games through correlated equilibrium in uplink multi-cell OFDMA systems. We show that, by adopting interference minimization as the users' utility function, a new subcarrier allocation algorithm based on correlated equilibrium can solve the interference problem well. Here we emphasize that the network is of the distributed type rather than the centralized cellular type.

The rest of this paper is organized as follows. In Section II, we present the system model and a novel utility function that accounts for the interference. In Section III, we study the correlated equilibrium. In Section IV, we construct a no-regret learning algorithm and show that the algorithm converges to the set of correlated equilibria. Simulation results are shown in Section V, and conclusions are drawn in Section VI.

This work is supported by the Major National Science & Technology Specific Projects (No. 2010ZX03006-002-04), the Project of Natural Science Foundation of Jiangsu (No. BK2010101), the National Natural Science Foundation of China (No. 61001107 and 60972051) and the open research fund of National Mobile Communications Research Laboratory, Southeast University (No. 2010D09).
II. SYSTEM MODEL AND PROBLEM FORMULATION
A. System Model
978-1-4577-1010-0/11/$26.00 ©2011 IEEE
Fig. 1 Illustration of frequency reuse schemes. (a) Conventional fixed frequency reuse (1/3). (b) Full frequency reuse. Grey scale represents the frequency band $\Omega_i$ that a sector uses, where $\Omega = \Omega_1 \cup \Omega_2 \cup \Omega_3$ and $\Omega_i \cap \Omega_j = \emptyset$ for $i \neq j$.

Since the high data rate demands of next-generation wireless communications require a highly efficient exploitation of the available spectrum, OFDMA systems employing the conventional frequency reuse (Fig. 1-a) no longer meet the spectral efficiency requirement [1]. Therefore, we consider the latter case (Fig. 1-b), in which all subcarriers are made available to each cell, i.e., full frequency reuse. Interference then exists between any two cells and is especially severe among neighboring cells.

In a multi-cell OFDMA system, the subcarrier allocation problem is very complicated. However, co-channel interference only exists among users who share the same frequency band, because different subcarriers are mutually orthogonal. Consequently, the problem can be decomposed into a simpler one in which each cell serves only one randomly located user, similar to [13]. We consider an OFDMA system with $N$ cells, where each user comprises a pair of transmit and receive nodes and is allocated a given number of subcarriers. When performing subcarrier allocation, static channel conditions are assumed.

We define the subcarrier assignment matrix $A = (a_{ik})_{N \times K}$ with elements $a_{ik} \in \{0,1\}$: $a_{ik} = 1$ if subcarrier $k$ is assigned to user $i$ (user $i$ and the user in the $i$th cell have the same meaning because all users are located in different cells) and $a_{ik} = 0$ otherwise. The $i$th row vector of $A$, denoted by $a_i^T$, is the strategy chosen by user $i$. Since the all-zero vector $0$ is an invalid choice, $a_i^T \in \Omega_i$, where $\Omega_i = \{0,1\}^K \setminus \{0\}$.

B. Problem Formulation
We first define the utility function and construct a cooperation policy selection game model based on it. The utility function of each user is defined as the negative of the total interference, part of which is generated by this user to the environment and part of which is received from the environment. It is expressed as [13]:

$$U_i(S_i, S_{-i}) = -\sum_{k=1}^{K} \left( \sum_{j=1}^{N} \delta_{ji}^{k} p_{jk} g_{ji}^{k} + \sum_{j=1}^{N} \delta_{ij}^{k} p_{ik} g_{ij}^{k} \right) \tag{1}$$

where $S_i$ and $S_{-i}$ denote the actions of user $i$ and of his/her opponents, respectively. $g_{ij}^{k}$ is the channel gain between the transmitter of user $i$ and the receiver of user $j$ when transmission is made through subcarrier $k$; in general $g_{ij}^{k} \neq g_{ji}^{k}$. $p_{ik}$ is the transmit power of user $i$ on subcarrier $k$, which must be non-negative, and the total power transmitted by user $i$ must not exceed $P_{\max}$. $\delta_{ij}^{k}$ is an indicator variable showing whether two users $i$ and $j$ transmit on a common subcarrier $k$:

$$\delta_{ij}^{k} = \begin{cases} 1, & i, j\ (i \neq j) \text{ both transmit via } k \\ 0, & \text{otherwise} \end{cases} \tag{2}$$

It is easy to verify that $\delta_{ij}^{k} = \delta_{ji}^{k} = a_{ik} a_{jk}$ $(i \neq j)$, and $\delta_{ii}^{k} = 0$ because a user does not cause any interference to itself.
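The utility in (1) can be evaluated directly from the assignment matrix, the powers and the gains. The following is a minimal sketch, assuming nested Python lists indexed as `A[i][k]`, `p[i][k]` and `g[i][j][k]`; the function name `utility` and all numbers in the example are hypothetical and do not come from the paper.

```python
def utility(i, A, p, g):
    """Utility of user i per Eq. (1): minus the interference that user i
    receives from, and causes to, every co-channel user.

    A[i][k] is 1 if subcarrier k is assigned to user i, else 0.
    p[i][k] is the transmit power of user i on subcarrier k.
    g[i][j][k] is the gain from the transmitter of i to the receiver of j on k.
    """
    n_users, n_subcarriers = len(A), len(A[0])
    total = 0.0
    for k in range(n_subcarriers):
        for j in range(n_users):
            if j == i:
                continue  # delta_ii^k = 0: no self-interference
            delta = A[i][k] * A[j][k]  # Eq. (2): 1 only if both use k
            received = p[j][k] * g[j][i][k]
            caused = p[i][k] * g[i][j][k]
            total += delta * (received + caused)
    return -total


# Hypothetical two-user, two-subcarrier example (all numbers are made up).
A = [[1, 0], [1, 1]]
p = [[1.0, 0.0], [0.5, 0.5]]
g = [
    [[0.0, 0.0], [0.2, 0.3]],  # g[0][0][k], g[0][1][k]
    [[0.4, 0.1], [0.0, 0.0]],  # g[1][0][k], g[1][1][k]
]
u0 = utility(0, A, p, g)
u1 = utility(1, A, p, g)
```

In this two-user example both utilities coincide, because the interference one user causes is exactly what the other receives; with three or more users the utilities generally differ.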
All users compete selfishly for the most suitable subcarrier assignment in order to minimize their interference or, equivalently, to maximize their utility. This is a distributed optimization problem given by [13]:

$$\max_{a_i^T} \; U_i(S_i, S_{-i}), \quad \forall i \in \mathcal{N} \qquad \text{s.t.} \quad \begin{cases} a_i^T \in \{0,1\}^K \setminus \{0\} \\ p_{ik} \ge 0, \quad \sum_k p_{ik} \le P_{\max} \end{cases} \tag{3}$$

which can be addressed by modeling it as a game

$$G = [\mathcal{N}, \{\Omega_i\}_{i \in \mathcal{N}}, \{U_i\}_{i \in \mathcal{N}}] \tag{4}$$

where the components of the game are as follows:
1) $\mathcal{N} = \{1, 2, \ldots, N\}$ is the index set of the players (we use "player" and "user" interchangeably).
2) $\Omega_i = \{0,1\}^K \setminus \{0\}$ is the strategy space of player $i$. The space of joint strategy profiles is therefore $\mathcal{S} = \Omega_1 \times \Omega_2 \times \cdots \times \Omega_N$.
3) $U_i : \mathcal{S} \to \mathbb{R}$ is the individual utility, mapping the joint strategy space to the set of real numbers.

III. CORRELATED EQUILIBRIUM FOR COOPERATION POLICY SELECTION
In order to analyze the outcome of the proposed game $G$, we focus on an important generalization of the Nash equilibrium, known as the correlated equilibrium. Here a strategy profile is chosen randomly according to a certain distribution and is given to the players by some "device" or "referee". Each player is privately given instructions for his own play only, and the joint distribution is known to all of them. If it is in the players' best interests to conform to the recommended strategy, the distribution is called a correlated equilibrium [14].

A. Correlated Equilibrium
Definition 1 [10]: For the proposed game $G$, a probability distribution $p$ over the pure strategy profiles $\mathcal{S} = \Omega_1 \times \Omega_2 \times \cdots \times \Omega_N$ is a correlated equilibrium if and only if, for all $i \in \mathcal{N}$ and all $S_i, S_i' \in \Omega_i$,

$$\sum_{S_{-i} \in \Omega_{-i}} p(S_i, S_{-i}) \left[ U_i(S_i', S_{-i}) - U_i(S_i, S_{-i}) \right] \le 0 \tag{5}$$

The inequality means that when the recommendation to user $i$ is to choose action $S_i$, choosing any other action $S_i'$ instead cannot yield a higher expected utility.
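Condition (5) can be checked mechanically for any small finite game. The sketch below is illustrative only: the function name `is_correlated_equilibrium`, the dictionary-based representation, and the two-player "Chicken" payoffs are assumptions chosen for the example and are not taken from the paper's subcarrier game.

```python
def is_correlated_equilibrium(dist, payoffs, strategy_sets, tol=1e-12):
    """Check inequality (5) for every player i and every pair (S_i, S_i').

    dist maps joint profiles (tuples) to probabilities.
    payoffs maps every joint profile to a tuple of utilities, one per player.
    strategy_sets[i] lists the pure strategies of player i.
    """
    for i, strategies in enumerate(strategy_sets):
        for s_i in strategies:
            for s_alt in strategies:
                if s_alt == s_i:
                    continue
                gain = 0.0
                for profile, prob in dist.items():
                    if profile[i] != s_i:
                        continue  # only profiles where i is told to play s_i
                    deviated = profile[:i] + (s_alt,) + profile[i + 1:]
                    gain += prob * (payoffs[deviated][i] - payoffs[profile][i])
                if gain > tol:
                    return False  # deviating from s_i to s_alt pays off
    return True


# Hypothetical game of Chicken: C = yield, D = dare.
chicken = {
    ("C", "C"): (6, 6), ("C", "D"): (2, 7),
    ("D", "C"): (7, 2), ("D", "D"): (0, 0),
}
sets = [["C", "D"], ["C", "D"]]
third = 1.0 / 3.0
correlated = {("C", "C"): third, ("C", "D"): third, ("D", "C"): third}
all_cc = {("C", "C"): 1.0}

print(is_correlated_equilibrium(correlated, chicken, sets))  # True
print(is_correlated_equilibrium(all_cc, chicken, sets))      # False
```

The first distribution never recommends (D, D), so a player told to yield cannot gain in expectation by daring; putting all mass on (C, C) fails because either player gains by deviating to D.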
Theorem 1 [15]: A correlated equilibrium always exists in the cooperation policy selection game $G$.
Proof: The result from [16] shows that every finite game has a correlated equilibrium. Since $G$ has a finite number of players and finite strategy sets, Theorem 1 follows, which enables the application of the proposed game.

Remark: The set of correlated equilibria of $G$ is nonempty, closed and convex. In fact, every Nash equilibrium is a correlated equilibrium, and Nash equilibria correspond to the special case where the plays of the different players are independent, i.e., $p(S_i, S_{-i}) = p(S_1) \times \cdots \times p(S_i) \times \cdots \times p(S_N)$. Moreover, the set of correlated equilibrium distributions of $G$ is a convex polytope that includes all the NE distributions.

B. Optimal Correlated Equilibrium
The correlated equilibrium defines a set of solutions that can outperform the Nash equilibrium, but which element of the set is the most suitable should be carefully considered in practical design. The criteria for an optimal correlated equilibrium are discussed in [17] and [18], and [10] proposes two refinements. The first is the maximum-sum correlated equilibrium, which maximizes the sum of the players' utilities. The second is the max-min fair correlated equilibrium, which aims to improve the situation of the worst-off player. The optimal correlated equilibrium can be formulated as:

$$\max_{p} \sum_{i \in \mathcal{N}} E_p(U_i) \quad \text{or} \quad \max_{p} \min_{i \in \mathcal{N}} E_p(U_i)$$
$$\text{s.t.} \quad \sum_{S_{-i} \in \Omega_{-i}} p(S_i, S_{-i}) \left[ U_i(S_i', S_{-i}) - U_i(S_i, S_{-i}) \right] \le 0, \quad \forall i \in \mathcal{N},\ \forall S_i, S_i' \in \Omega_i \tag{6}$$

where $E_p(\cdot)$ is the expectation over $p$. The constraints guarantee that the solution lies within the correlated equilibrium set.
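Since the expected utility is linear in $p$, the maximum-sum version of (6) is a linear program once the probability-simplex constraints are written out. The expansion below is a sketch under that assumption: the simplex constraints and the auxiliary variable $t$ are not stated in the original formulation and are added here only to make the structure explicit.

```latex
\begin{aligned}
\max_{p}\quad & \sum_{i \in \mathcal{N}} \sum_{S \in \mathcal{S}} p(S)\, U_i(S) \\
\text{s.t.}\quad & \sum_{S_{-i} \in \Omega_{-i}} p(S_i, S_{-i})
   \left[ U_i(S_i', S_{-i}) - U_i(S_i, S_{-i}) \right] \le 0,
   \quad \forall i \in \mathcal{N},\ \forall S_i, S_i' \in \Omega_i, \\
& \sum_{S \in \mathcal{S}} p(S) = 1, \qquad p(S) \ge 0 \quad \forall S \in \mathcal{S}.
\end{aligned}
```

The max-min variant becomes a linear program in the same way by maximizing an auxiliary scalar $t$ subject to $\sum_{S \in \mathcal{S}} p(S)\, U_i(S) \ge t$ for all $i \in \mathcal{N}$, together with the constraints above.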
IV. DECENTRALIZED NO-REGRET ALGORITHM FOR COOPERATIVE POLICY SELECTION
A. Algorithm Description
In this section, we present a decentralized learning algorithm that always leads to the set of correlated equilibria. Using it, each player can independently determine its own cooperative policy. Concretely, the proposed algorithm is based on the no-regret procedure of [14]. In this procedure, players may depart from their current play with probabilities that are proportional to measures of regret for not having used other strategies in the past. Our algorithm for cooperative policy selection is executed independently by each player and is summarized as follows.

Suppose that the proposed game $G$ is played repeatedly through time $n = 1, 2, \ldots$. At time $n + 1$, given a history of play $H^n = (S^\tau)_{\tau=1}^{n} \in \prod_{\tau=1}^{n} \mathcal{S}$, each player $i \in \mathcal{N}$ chooses $S_i^{n+1} \in \Omega_i$ according to the average regret value.

1) Initialization: At the initial time $n = 1$, each player initializes his/her strategy arbitrarily.
2) Iterative Update Process:
∙ Utility Update: At time $n$, each player $i$ calculates the utility of the current strategy $S_i \in \Omega_i$ and the utility of choosing a different strategy $S_i' \in \Omega_i$.
∙ Average Regret Update: If player $i$ had replaced strategy $S_i$, every time it was played in the past, by the different strategy $S_i'$, the resulting difference in $i$'s average utility up to time $n$ would be

$$D_i^n(S_i, S_i') = \frac{1}{n} \sum_{\tau \le n:\, S_i^\tau = S_i} \left[ U_i(S_i', S_{-i}^\tau) - U_i(S_i^\tau, S_{-i}^\tau) \right] \tag{7}$$

$$R_i^n(S_i, S_i') = \max \left\{ D_i^n(S_i, S_i'),\, 0 \right\} \tag{8}$$

where $R_i^n(S_i, S_i')$ represents the average regret at time $n$ for not having played the different strategy $S_i'$ every time that $S_i$ was played in the past.
∙ Strategy Decision: Let $S_i \in \Omega_i$ be the strategy last chosen by player $i$, i.e., $S_i^n = S_i$. Then at time $n + 1$, player $i$ updates his/her strategy according to the probability distribution

$$\begin{cases} p_i^{n+1}(S_i') = \dfrac{1}{\mu} R_i^n(S_i, S_i'), & \forall S_i' \neq S_i \\[2mm] p_i^{n+1}(S_i) = 1 - \displaystyle\sum_{S_i' \neq S_i} p_i^{n+1}(S_i') \end{cases} \tag{9}$$

where $\mu$ is a normalization factor, which must be large enough for (9) to define a valid probability distribution.

In the proposed algorithm, each player does not need to know the individual strategies and utilities of the other players, the global system structure, etc. Each player only needs to know the effect of the other players on its own utility function. In addition, each player views its current strategy as a reference point and makes a decision for the next period according to its propensities to depart from it. A player switches only to strategies that would have improved its individual utility relative to the current choice.
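The update steps (7)-(9) can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names, the history-as-list representation (which recomputes the regrets from scratch at every step), the value `mu=20.0` and the two-player Chicken payoffs are all assumptions made for the example.

```python
import random


def average_regrets(history, i, s_i, strategies, utility):
    """R_i^n(s_i, s') for every s' != s_i, following Eqs. (7) and (8)."""
    n = len(history)
    regrets = {}
    for s_alt in strategies:
        if s_alt == s_i:
            continue
        diff = 0.0
        for profile in history:
            if profile[i] != s_i:
                continue  # Eq. (7) sums only over periods where s_i was played
            deviated = profile[:i] + (s_alt,) + profile[i + 1:]
            diff += utility(i, deviated) - utility(i, profile)
        regrets[s_alt] = max(diff / n, 0.0)
    return regrets


def next_strategy_distribution(regrets, s_i, mu):
    """Eq. (9): switch with probability R/mu, otherwise keep s_i."""
    probs = {s: r / mu for s, r in regrets.items()}
    probs[s_i] = 1.0 - sum(probs.values())
    return probs


def play(strategy_sets, utility, mu, steps, seed=0):
    """Run the procedure for all players and return the history of play."""
    rng = random.Random(seed)
    profile = tuple(rng.choice(s) for s in strategy_sets)  # arbitrary start
    history = [profile]
    for _ in range(steps - 1):
        nxt = []
        for i, strategies in enumerate(strategy_sets):
            regrets = average_regrets(history, i, profile[i], strategies, utility)
            probs = next_strategy_distribution(regrets, profile[i], mu)
            options = list(probs)
            weights = [probs[s] for s in options]
            nxt.append(rng.choices(options, weights=weights)[0])
        profile = tuple(nxt)
        history.append(profile)
    return history


# Hypothetical payoffs (game of Chicken), used only to exercise the code.
chicken = {
    ("C", "C"): (6, 6), ("C", "D"): (2, 7),
    ("D", "C"): (7, 2), ("D", "D"): (0, 0),
}


def u(i, profile):
    return chicken[profile][i]


history = play([["C", "D"], ["C", "D"]], u, mu=20.0, steps=200)
```

Here `mu=20.0` exceeds the largest possible regret in the example (7), so every probability in (9) stays in [0, 1]. A practical implementation would update running sums instead of rescanning the history at each step.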
B. Convergence Analysis
At every time $n + 1$, we define $z^n \in \Delta(\mathcal{S})$ as the empirical distribution of the $N$-tuples of strategies chosen up to time $n$. Its element $z^n(S)$, $\forall S \in \mathcal{S}$, is the relative frequency with which $S$ has been played up to time $n$, i.e.,

$$z^n(S) = \frac{1}{n} \left| \left\{ \tau \le n : S^\tau = S \right\} \right| \tag{10}$$

Moreover, the empirical distribution $z^n$ can be obtained by the recursion

$$z^{n+1} = z^n + \frac{1}{n+1} \left( e_{S^{n+1}} - z^n \right) \tag{11}$$

where $e_{S^{n+1}} = [0, 0, \ldots, 1, 0, \ldots, 0]$ denotes the $|\mathcal{S}|$-dimensional unit vector with the one in the position of $S^{n+1}$.

Theorem 2: If every player follows the proposed algorithm, the empirical distributions of play $z^n$ converge almost surely as $n \to \infty$ to the set of correlated equilibria of our game.

Proofs that $z^n$ converges to the set of correlated equilibria are presented in [14] and [19]. Here, we only summarize and compare the two proofs:
1) In [14], the proof relies on a recursive formula for the distance of the vector of regrets to the negative orthant. In particular, in order to satisfy the conditions of Blackwell's approachability theorem, a multi-period recursion, in which a large block of periods is combined together, substitutes for a one-period recursion.
2) In [19], the proof is based on stochastic approximation. A continuous-time random process $z^n(t)$ is constructed by interpolating $z^n$. The tail behavior of the sequence $\{z^n\}$ is captured by the behavior of $z^n(t)$ for large $t$. Moreover, the trajectory of $z^n(t)$ converges almost surely to a trajectory whose dynamics are given by a differential inclusion. The asymptotic stability properties of this differential inclusion then tell us what we wish to know about the tail behavior of $\{z^n\}$.

C. Computational Complexity Analysis
At each iteration, each player $i$ keeps a record of the time $n$, together with the utility of the current strategy and the utilities of switching to the other strategies. Beyond this, the proposed algorithm requires one table lookup, no more than $n + N$ additions and $N + 1$ multiplications to update the regret value, and one comparison to choose the next strategy.

V. SIMULATION RESULTS AND ANALYSIS
In this section, we conduct simulations to study the performance of the proposed interference minimization subcarrier allocation algorithm. We consider a 3-cell OFDMA system sharing a total of $K = 8$ subcarriers, where each cell has a radius of 100 m and the cells are separated by $100\sqrt{3}$ m from each other. The base station is located at the center of each cell, and one user is placed uniformly at random within the corresponding cell.

Fig. 2 The 3-cell OFDMA system structure

For simplicity and fairness, we fix the number of subcarriers for each user to $K_i = 4$. In order to focus only on the subcarrier allocation, we assume that the maximum power per user is the same and that the power budget is distributed equally among the $K_i$ subcarriers, so the maximum power per subcarrier is $P_{\max} / K_i$. The maximum power constraint of each user is set to $P_{\max} = 2$ W. The channel gain between two users is expressed as $h_{ij} = 0.097 / d_{ij}^{\upsilon}$, where $\upsilon = 4$ and $d_{ij}$ is the distance between the transmitter of user $i$ and the receiver of user $j$ ($d_{ij} = d_{ji}$ generally). Then, for users $i$, $j$ and subcarrier $k$, the channel gain is $g_{ij}^{k} = h_{ij} \beta_k$, where $\beta_k$ is a random variable.

We initialize the game with a random subcarrier assignment for each player. The players then act to improve their utility by looking for the best response strategy after observing the opponents' actions. The total number of strategies per player in this case is $K! / (K_i! (K - K_i)!)$.

Fig. 3 The utility vs the number of iterations (x-axis: iterations; y-axis: the value of utility; curves: user1, user2, user3)
Fig. 3 shows the utility of each player against the number of iterations. The individual utility is updated at each iteration. The algorithm converges within about 20 iterations. The results verify three important facts: (i) the improvement of all users' utilities, (ii) the existence of correlated equilibria, and (iii) the convergence of the proposed algorithm. Similar simulation results can be obtained when more users participate. It should be noted that the speed of convergence can change with $\mu$ and with the initial strategies of the users.
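The simulation parameters stated at the start of this section can be checked with a few lines of arithmetic. The sketch below only evaluates the formulas given in the text for $K = 8$, $K_i = 4$, $P_{\max} = 2$ W and $\upsilon = 4$; the variable and function names are hypothetical, and the 100 m distance is an illustrative value equal to the cell radius.

```python
from math import comb

K, K_i = 8, 4   # total subcarriers and subcarriers per user
P_max = 2.0     # maximum power per user, in watts

# Number of ways to pick K_i of K subcarriers: K! / (K_i! (K - K_i)!).
num_strategies = comb(K, K_i)

# Equal split of the power budget over the K_i assigned subcarriers.
power_per_subcarrier = P_max / K_i


def path_gain(d, upsilon=4):
    """h_ij = 0.097 / d_ij^upsilon for a distance d in metres."""
    return 0.097 / d ** upsilon


print(num_strategies)        # 70
print(power_per_subcarrier)  # 0.5
print(path_gain(100.0))      # about 9.7e-10
```

Each player therefore chooses among 70 pure strategies, and a gain of roughly $10^{-9}$ at 100 m matches the order of magnitude of the utilities reported in Fig. 3.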
Fig. 4 $p$ vs the number of iterations (player 2) (x-axis: iterations; y-axis: probability distribution for the finite strategy set; curves: probability of strategy 1 to strategy 7)

Fig. 4 plots the evolution of the probability distribution over the finite strategy set of one of the players. The probability of one strategy converges to 1 while the others converge to 0, which is consistent with the convergence of the empirical distributions of play $z^n$ to the set of correlated equilibria of our game as $n \to \infty$.

Fig. 5 The interference vs $K_i$ (x-axis: number of subcarriers per user; y-axis: interference (W); curves: Algorithm 1: Correlated Equilibrium Algorithm, Algorithm 2: Modified Correlated Equilibrium Algorithm, Algorithm 3: Exhaustive Search Algorithm)

In Fig. 5 we plot the interference against the number of subcarriers per user and compare three algorithms. Algorithm 1 is the decentralized learning algorithm achieving the correlated equilibrium proposed in Section IV. Algorithm 2 is a modification of Algorithm 1, which aims to find a better correlated equilibrium point through a number of trials. Algorithm 3 obtains the optimal solution (the minimum interference) via exhaustive search. From this figure we can draw the following conclusions: (i) the interference becomes more serious as the number of subcarriers per user increases, due to the higher spectrum utilization; (ii) Algorithm 3 obtains the optimal solution at the expense of complexity, while the proposed decentralized learning algorithm finds a sub-optimal solution with low complexity, and it can be modified to search for a better correlated equilibrium point.

Fig. 6 The initial interference of each user (x-axis: user; y-axis: interference (W))

Fig. 7 Final interference of each user (x-axis: user; y-axis: interference (W); bars: Exhaustive Search Algorithm, Correlated Equilibrium Algorithm)
We plot the initial interference of each user in Fig. 6 and the final interference of each user in Fig. 7. The results in Fig. 7 are obtained by two algorithms, i.e., the exhaustive search algorithm and the decentralized learning algorithm. Comparing the interference values in the two figures, we conclude that both algorithms greatly decrease the total interference, by about 67.6% and 46.9% respectively. Moreover, the decentralized learning algorithm achieves noticeably better fairness among users, as a consequence of the competition.
VI. CONCLUSION
In this work, we present a new distributed subcarrier allocation algorithm for multi-cell OFDMA systems that relies on the correlated equilibrium. The goal is to minimize the co-channel interference. Concretely, we model a cooperation policy selection game and use the set of correlated equilibria to analyze the outcome of the proposed game. We then develop an algorithm based on the no-regret procedure to learn the correlated equilibrium. The simulation results show that the proposed scheme achieves good performance, namely quick convergence, substantial interference mitigation and good fairness. However, our work studies only the subcarrier assignment; further study could consider joint power and subcarrier allocation to maximize the overall utility of the system.

REFERENCES
[1] M. M. Wang and T. Ji, "Dynamic resource allocation for interference management in orthogonal frequency division multiple access cellular communications," IET Communications, vol. 4, iss. 6, pp. 675-682, 2010.
[2] Zhenyu Liang, Yong Huat Chew and Chi Chung Ko, "On the modeling of a non-cooperative multicell OFDMA resource allocation game with integer bit-loading," in Proc. of IEEE GLOBECOM, 2009.
[3] Hojoong Kwon and Byeong Gi Lee, "Distributed resource allocation through noncooperative game approach in multi-cell OFDMA systems," in Proc. of IEEE ICC, 2006.
[4] Kai Yang, Narayan Prasad, and Xiaodong Wang, "An auction approach to resource allocation in uplink OFDMA systems," IEEE Transactions on Signal Processing, vol. 57, no. 11, pp. 4482-4496, 2009.
[5] Zhu Han, Zhu Ji, and K. J. Ray Liu, "Non-Cooperative Resource Competition Game by Virtual Referee in Multi-Cell OFDMA Networks," IEEE Journal on Selected Areas in Communications, vol. 25, no. 6, pp. 1079-1090, 2007.
[6] G. Owen, Game Theory, 3rd ed. New York: Academic, 2001.
[7] R. J. Aumann, "Subjectivity and Correlation in Randomized Strategies," Journal of Mathematical Economics, vol. 1, no. 1, pp. 67-96, 1974.
[8] V. Krishnamurthy, M. Maskery, and G. Yin, "Decentralized adaptive filtering algorithms for sensor activation in an unattended ground sensor network," IEEE Transactions on Signal Processing, vol. 56, no. 12, pp. 6086-6101, 2008.
[9] M. Maskery, V. Krishnamurthy, and Q. Zhao, "Decentralized dynamic spectrum access for cognitive radios: cooperative design of a noncooperative game," IEEE Transactions on Communications, vol. 57, no. 2, pp. 459-469, 2009.
[10] Z. Han, C. Pandana, and K. Liu, "Distributive opportunistic spectrum access for cognitive radio using correlated equilibrium and no-regret learning," in Proc. of IEEE WCNC, 2007.
[11] Z. Lin and M. van der Schaar, "On the correlated equilibrium selection for two-user channel access games," IEEE Signal Processing Letters, vol. 16, no. 3, pp. 156-159, 2009.
[12] Mohamad Charafeddine, Zhu Han, Arogyaswami Paulraj and John Cioffi, "Crystallized rates region of the interference channel via correlated equilibrium with interference as noise," in Proc. of IEEE ICC, 2009.
[13] Quang Duy La, Yong Huat Chew, and Boon-Hee Soong, "An interference minimization game theoretic subcarrier allocation algorithm for OFDMA-based distributed systems," in Proc. of IEEE GLOBECOM, 2009.
[14] S. Hart and A. Mas-Colell, "A simple adaptive procedure leading to correlated equilibrium," Econometrica, vol. 68, no. 5, pp. 1127-1150, 2000.
[15] Dan Wu, Jianchao Zheng, and Yueming Cai, "Cooperation Policy Selection for Energy-constrained Ad Hoc Networks Using Correlated Equilibrium," unpublished.
[16] S. Hart and D. Schmeidler, "Existence of correlated equilibria," Mathematics of Operations Research, vol. 14, no. 1, pp. 18-25, 1989.
[17] E. Altman, N. Bonneau, and M. Debbah, "Correlated equilibrium in access control for wireless communications," Lecture Notes in Computer Science, no. 3976, pp. 173-183, 2006.
[18] Z. Han and K. J. R. Liu, Resource Allocation for Wireless Networks: Basics, Techniques, and Applications, Cambridge University Press, 2008.
[19] M. Benaïm, J. Hofbauer, and S. Sorin, "Stochastic approximations and differential inclusions, Part II: Applications," Mathematics of Operations Research, vol. 31, no. 3, pp. 673-695, 2006.