This full text paper was peer reviewed at the direction of IEEE Communications Society subject matter experts for publication in the IEEE ICC 2010 proceedings
Distributed Relay Selection and Power Control in Cognitive Radio Networks with Cooperative Transmission
Changqing Luo1,2, F. Richard Yu3, Hong Ji1 and Victor C.M. Leung2
1 Key Laboratory of Universal Wireless Communication, Ministry of Education, Beijing University of Posts and Telecommunications, Beijing, P.R. China
2 Department of Electrical and Computer Engineering, The University of British Columbia, Vancouver, BC, Canada
3 Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada
Email: [email protected], richard [email protected], [email protected] and [email protected]
Abstract: In this paper, we present a distributed relay selection and power allocation scheme for cooperative transmission in cognitive radio (CR) networks that jointly considers the channel states of all related links and the residual energy state of the relay nodes. Specifically, we formulate the CR network with cooperative transmission as a restless bandit system, a model that has been widely applied in operations research and stochastic control. The channel state and the residual energy state are represented by finite-state Markov chains. With this stochastic optimization formulation, the optimal policy for relay selection and power allocation is indexable, meaning that the relay with the highest index should be selected. The proposed scheme achieves a tradeoff between achievable rate and network lifetime. Simulation results are presented to illustrate the performance of the proposed scheme.
I. INTRODUCTION
Cognitive radio (CR) [1] has been suggested as a promising technology for handling the coexistence of primary and secondary users. In CR networks, unlicensed users can use available spectrum as long as the interference they cause to licensed users stays below a tolerable level, which results in high spectrum utilization. Cooperative communication, on the other hand, is well known as a powerful technique for mitigating signal fading due to multipath propagation in a wireless medium [2]. In a multi-user environment, multiple users cooperate with each other to form a virtual antenna array. As a consequence, cooperation can exploit space diversity to combat fading and improve data rate and reliability. Motivated by these two promising techniques, cooperative transmission in CR networks has attracted much interest, since it may significantly improve spectrum utilization and data rate.

In most previous work, cooperation is used as a method to improve sensing capability [3], [4], while cooperative transmission is not considered in detail. In [5], [6], the data from the primary user is relayed by secondary users to improve throughput in the primary network at the physical layer. Apart from the cooperation between secondary users and primary users, cooperation among secondary users is also considered in the literature. In [7], the secondary source is assisted by a group of relays at different locations, and the authors focus on the outage performance when all these relays simultaneously transmit their received information to the destination.

Although some work has been done on cooperative transmission in CR networks, the joint consideration of relay selection and power allocation to achieve a tradeoff between achievable rate and network lifetime has not been addressed. It is nevertheless an important problem that directly affects transmission performance in CR networks: the achievable rate can be improved by increasing the power consumption, but doing so shortens the network lifetime. Therefore, the achievable rate and the network lifetime should be considered jointly.

In this paper, we propose a distributed algorithm for relay selection and power allocation in an underlay-paradigm-based CR network with cooperative transmission, which achieves a tradeoff between the achievable rate and the network lifetime. We formulate the CR network with cooperative transmission as a restless bandit system [8], a model that has been widely applied in operations research and stochastic control. The proposed scheme can optimally select a relay node and allocate power for data transmission in CR networks with cooperative transmission by concurrently taking into account the channel conditions of all related links and the energy state of each potential relay node. In particular, it is a fully distributed and scalable scheme in which an intermediate node can freely join and leave the potential relay set. The relay is selected and the power is allocated without a centralized coordinator.

The rest of the paper is organized as follows. Section II describes the system model. Section III formulates the relay selection and power allocation problem in CR networks as a restless bandit problem. Section IV describes the process of the selection scheme. Simulation results are given in Section V. Finally, conclusions are given in Section VI.

This work was supported in part by the National Natural Science Foundation of China under Grant 60832009, Natural Science Foundation of Beijing under Grant 4102044, Hi-Tech Research and Development Program (National 863 Program) under Grants 2009AA01Z246 and 2009AA01Z211, and the Natural Sciences and Engineering Research Council of Canada (NSERC).

978-1-4244-6404-3/10/$26.00 ©2010 IEEE

II. SYSTEM MODEL
In this section, we describe the models used in this paper for data transmission in cognitive radio cooperation networks.

A. Cooperative Transmission in an Underlay Paradigm based Cognitive Radio Network
In this paper, we consider an underlay-paradigm-based CR network with cooperative transmission, in which a source node communicates with a destination node with the aid of a relay node, and the relaying is based on the decode-and-forward (DF) cooperation protocol, as shown in Fig. 1. In Fig. 1, $h_{SD}$, $h_{SR_n}$, $h_{SP}$, $h_{R_nD}$ and $h_{R_nP}$ denote the channel gains of the Source/Destination, Source/Relay, Source/Primary, Relay/Destination and Relay/Primary links, respectively. Full channel state information (CSI) is assumed to be available at the transmitter and receiver of the secondary user.

Fig. 1. A scenario for cognitive radio networks with cooperative transmission. (Nodes shown: secondary source S, secondary destination D, relays R1, ..., Rn, ..., RN, and primary receiver P; links labeled hS,D, hS,P, hS,Rn, hRn,D and hRn,P.)

When the source node transmits information to an intended destination, a set of intermediate nodes dispersed over a short-range geographic area may overhear the information because of the broadcast nature of the wireless channel. These intermediate nodes are considered potential relay nodes. The data transmission between the source and destination nodes is half-duplex, and one transmission process is executed in two phases. In the first phase, the source sends information to its intended destination node and to the N intermediate nodes. In the second phase, a selected relay decodes the information and transmits it to the destination node over the acquired spectrum, using the same codebook as the source node.

Time is divided into slots of equal length $T$, and slot $k$ refers to the discrete time period $[kT, (k+1)T]$. One transmission process consists of two orthogonal time slots [9]. In the first phase, the received signals at the destination and at each potential relay node are, respectively,
$$y_{SD} = h_{SD} x_S + n_{SD}, \tag{1}$$
$$y_{SR_n} = h_{SR_n} x_S + n_{SR_n}, \tag{2}$$
where $x_S$ is the signal transmitted by the source node, and $n_{SD}$ and $n_{SR_n}$ are the additive white Gaussian noise at the respective receivers. The transmission power $P_S$ of the source node is limited by the tolerable interference level $P_{th}$ at the primary user and by the maximum available transmission power $P_{S,\max}$. Therefore, the transmission power satisfies the following constraint:
$$P_S \le \begin{cases} P_{S,\max}, & \text{if the channel is idle,} \\ \min\{P_{S,\max},\ P_{th}/|h_{SP}|^2\}, & \text{otherwise.} \end{cases} \tag{3}$$
In the second phase, after a decode and re-encode process, the selected relay node forwards the received information to the destination. The received signal at the destination is
$$y_{R_nD} = h_{R_nD} x_{R_n} + n_{R_nD}, \tag{4}$$
where $x_{R_n}$ is the re-encoded data transmitted by the relay node and $n_{R_nD}$ is the additive white Gaussian noise on the Relay/Destination link. In this phase, the transmission power $P_{R_n}$ of relay $R_n$ is likewise constrained by the maximum transmission power of the relay node and by the interference caused to the primary user over the Relay/Primary link:
$$P_{R_n} \le \begin{cases} P_{R_n,\max}, & \text{if the channel is idle,} \\ \min\{P_{R_n,\max},\ P_{th}/|h_{R_nP}|^2\}, & \text{otherwise.} \end{cases} \tag{5}$$
The achievable rate $C_{DF}$ for three-node cooperative communication using the DF cooperation protocol is
$$C_{DF} = \min\left\{ \frac{1}{2}\log\left(1 + |h_{SR_n}|^2 \frac{P_S}{N_0}\right),\ \frac{1}{2}\log\left(1 + |h_{SD}|^2 \frac{P_S}{N_0} + |h_{R_nD}|^2 \frac{P_{R_n}}{N_0}\right) \right\}, \tag{6}$$
where $N_0$ denotes the noise power spectral density, i.e., the noise power per unit of bandwidth. In this paper, for simplicity, we assume that the noise power spectral density is the same for all links.

B. Channel Model
Throughout this paper, we assume that all channels are Rayleigh fading but constant during the transmission of one block. The finite-state Markov channel (FSMC) model has been widely accepted as an effective approach to characterize the structure of the block fading process [10]. In general, an FSMC model is constructed by first partitioning the range of the channel gain into discrete levels. In addition, for a CR network with the underlay paradigm, the state in which a channel is not in use by primary users also evolves stochastically, so we include this state in the FSMC model. A finite-state Markov channel is therefore characterized by a state set $\mathcal{S} = \{1, 2, \ldots, S\}$. Here, we assume that state $S$ represents the idle channel state, for which the corresponding channel gain is very large. The channel state in time slot $k$, $k \in \{1, 2, \ldots, K\}$, is $s_C(k)$. The transition probability of the channel state is
$$p_c(i,j) = \Pr(s_C(k+1) = j \mid s_C(k) = i), \quad i, j \in \mathcal{S}. \tag{7}$$
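As an illustration of Eq. (7), the following minimal sketch samples a state trajectory from a finite-state Markov channel. The three-state transition matrix `P_C`, the function names and the seed are hypothetical values chosen only for this example; they are not taken from the paper. States are indexed from 0 in the code, whereas the text numbers them from 1 to S.

```python
import random

# Hypothetical 3-state FSMC; row i holds p_c(i, j) for j = 0, 1, 2.
# The last state plays the role of the idle state S in the text.
P_C = [
    [0.6, 0.3, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]


def check_stochastic(matrix, tol=1e-9):
    """Return True if every row is a valid probability distribution."""
    return all(
        all(p >= 0.0 for p in row) and abs(sum(row) - 1.0) <= tol
        for row in matrix
    )


def next_state(matrix, state, rng):
    """Draw s_C(k+1) given s_C(k) = state."""
    return rng.choices(range(len(matrix)), weights=matrix[state], k=1)[0]


def simulate(matrix, start, num_slots, rng):
    """Return the trajectory [s_C(0), ..., s_C(num_slots)]."""
    path = [start]
    for _ in range(num_slots):
        path.append(next_state(matrix, path[-1], rng))
    return path


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed only for reproducibility
    assert check_stochastic(P_C)
    print(simulate(P_C, start=0, num_slots=10, rng=rng))
```

Each of the five links in the system model would carry its own chain of this kind, possibly with different transition matrices.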
C. Energy Model
In wireless networks, most mobile devices are powered by batteries with limited energy, so the residual energy is also one of the important factors that indicate the performance of potential relay nodes. We assume that the residual energy of each wireless mobile device remains unchanged when it is in sleep mode. In a practical system, the residual energy of each potential relay can be detected locally. For simplicity, the continuous battery residual energy is divided into discrete levels, denoted by $\mathcal{E} = \{E_1, E_2, \ldots, E_{|\mathcal{E}|}\}$, where $|\mathcal{E}|$ is the number of energy levels. The residual energy state in time slot $k$ is denoted by $s_E(k)$. According to the results of [11], the residual energy also evolves as a Markov chain when the relay is active. The transition probability is
$$p_e(i,j) = \Pr(s_E(k+1) = j \mid s_E(k) = i), \quad i, j \in \mathcal{E}. \tag{8}$$
Because the residual energy does not increase after a transmission action, the energy state either remains the same or changes to the lower state adjacent to the previous one. Therefore, $p_e(i,j)$ can be nonzero only when $j = i$ or when $j$ is the level immediately below $i$, and these two probabilities sum to 1.

III. RESTLESS BANDIT FORMULATION
In this section, we formulate the relay selection problem in CR networks with cooperative transmission as a restless bandit system, from which an optimal policy for relay selection and power allocation can be determined.
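The energy model of Section II-C, which enters the formulation as one component of each relay's state, can be sketched as follows. This is only an illustration: the ordering of the levels (0 for empty, the largest index for full), the single drop probability `p_drop` shared by all levels, and the function names are assumptions made here, not details given in the paper.

```python
import random


def energy_transition_matrix(num_levels, p_drop):
    """Build p_e(i, j) for levels 0 (empty) .. num_levels - 1 (full).

    From level i the chain stays at i with probability 1 - p_drop and
    moves to the adjacent lower level i - 1 with probability p_drop.
    Level 0 is absorbing because the energy never increases.
    """
    matrix = [[0.0] * num_levels for _ in range(num_levels)]
    matrix[0][0] = 1.0
    for i in range(1, num_levels):
        matrix[i][i] = 1.0 - p_drop
        matrix[i][i - 1] = p_drop
    return matrix


def slots_until_empty(matrix, start, rng, max_slots=100_000):
    """Count the slots an always-active relay needs to reach level 0."""
    state, slots = start, 0
    while state != 0 and slots < max_slots:
        state = rng.choices(range(len(matrix)), weights=matrix[state], k=1)[0]
        slots += 1
    return slots


if __name__ == "__main__":
    p_e = energy_transition_matrix(num_levels=5, p_drop=0.1)
    print(slots_until_empty(p_e, start=4, rng=random.Random(0)))
```

In a fuller model the drop probability would depend on the allocated transmission power, so that a higher power level depletes the battery faster.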
A. Action Space
At the beginning of the second phase, all potential relays start the sensing process and obtain the channel state information for both the Relay/Destination link and the Relay/Primary link. Based on the sensed outcomes, the relay node and the allocated power are decided jointly. Therefore, for each potential relay, the action space consists of two actions: the relay selection $a_R^k$ and the corresponding power allocation $a_P^k$. In this phase, $M$ potential relays are sequentially selected to be active, which means that $M$ relays are selected to forward information to the destination node. When a potential relay is selected to be passive, its power level is set to zero. The remaining relays stay passive and transition to a new state independently in the next slot. For each relay $R_n$ in slot $k$,
$$a_{R_n}(k) = (a_R^k, a_P^k), \tag{9}$$
where the relay selection action satisfies $\sum_{R_n=1}^{N} a_R^k = M$, and $a_P^k$ must satisfy the power constraint in Eq. (5).

B. State Space and Transition Probabilities
The state of a relay node is characterized by the residual energy state information and the channel state information. In practice, the channel states of the different wireless links and the energy state evolve independently. According to Eq. (6), the channel gains of the Source/Relay, Source/Destination and Relay/Destination links directly affect the achievable rate, while those of the Source/Primary and Relay/Primary links affect it indirectly. The state of potential relay $R_n$ in slot $k$, denoted by $s(R_n, k)$, can be modeled as
$$s(R_n, k) = [s^C_{SR_n}(k),\ s^C_{SD}(k),\ s^C_{R_nD}(k),\ s^C_{SP}(k),\ s^C_{R_nP}(k),\ s^E_{R_n}(k)],$$
where $s^C_{SR_n}(k)$, $s^C_{SD}(k)$, $s^C_{R_nD}(k)$, $s^C_{SP}(k)$, $s^C_{R_nP}(k)$ and $s^E_{R_n}(k)$ correspond to the channel states of the different links and the energy state, respectively. If action $a$ is taken for potential relay $R_n$ in time slot $k$, then the state $s(R_n, k)$ evolves according to a $U$-state Markov chain with transition probability matrix
$$P^a(R_n) = \left[\left(p_c^{SR_n}(i,j),\ p_c^{SD}(i,j),\ p_c^{R_nD}(i,j),\ p_c^{SP}(i,j),\ p_c^{R_nP}(i,j),\ p_e^{R_n}(i,j)\right)\right]_{U \times U}, \tag{10}$$
where $p_c^{SR_n}(i,j)$, $p_c^{SD}(i,j)$, $p_c^{R_nD}(i,j)$, $p_c^{SP}(i,j)$, $p_c^{R_nP}(i,j)$ and $p_e^{R_n}(i,j)$ are defined in Eqs. (7) and (8), and $U = |\mathcal{S}|^5 \times |\mathcal{E}|$.
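Because the five channel states and the energy state are stated to evolve independently, one way to obtain a single $U \times U$ matrix is to take the Kronecker product of the six component matrices, so that each joint transition probability is the product of the component probabilities. The sketch below illustrates this reading; the use of the Kronecker product, the two-state example matrices and the function names are assumptions made for illustration and are not spelled out in the paper.

```python
from functools import reduce


def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    nb = len(b)
    size = len(a) * nb
    return [
        [a[r // nb][c // nb] * b[r % nb][c % nb] for c in range(size)]
        for r in range(size)
    ]


def joint_transition_matrix(components):
    """Combine independent per-component chains into one joint chain."""
    return reduce(kron, components)


# Hypothetical 2-state channel chain and 2-level energy chain.
P_CHANNEL = [[0.7, 0.3], [0.4, 0.6]]
P_ENERGY = [[1.0, 0.0], [0.2, 0.8]]

if __name__ == "__main__":
    # Five links (S/R, S/D, R/D, S/P, R/P) followed by the energy state.
    joint = joint_transition_matrix([P_CHANNEL] * 5 + [P_ENERGY])
    print(len(joint))  # U = |S|^5 * |E| = 2**5 * 2 = 64
```

The example also shows how quickly $U$ grows with the number of channel levels, which is why the index table of Section IV is computed off-line.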
C. System Reward
After executing each transmission process, the system obtains an immediate reward. The reward function depends on the received data rate and the power consumption, and is defined as
$$R_{R_n}(k) = \alpha \cdot C_{DF}(R_n, k) - \beta \cdot C_{EN}(R_n, k), \tag{11}$$
where $R_{R_n}(k)$ is the immediate reward obtained when relay $R_n$ is selected to forward the information in slot $k$, and $\alpha$ and $\beta$ ($0 \le \alpha, \beta \le 1$) are weighting coefficients. $C_{DF}(R_n, k)$ and $C_{EN}(R_n, k)$ are the immediate reward and cost, respectively, for relay $R_n$ when action $a_{R_n}(k)$ is taken in slot $k$. The cost $C_{EN}$ is related to the residual energy state and the power consumption, and is denoted by $C_{EN} = f(s_E, P)$. The optimization goal of our problem is to maximize the total discounted reward, defined as
$$Z = \sum_{k=0}^{T-1} \sum_{R_n=1}^{N} \beta^{T-k-1} R_{R_n}(k), \tag{12}$$
where $T$ is the number of slots considered. Therefore, we need to find the optimal policy $A^*$ that achieves the optimization objective. The optimal policy is
$$A^* = \arg Z^*, \tag{13}$$
where $Z^* = \max_{A \in \mathcal{A}} Z(A)$ and $\mathcal{A}$ is the set of all admissible policies.
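To make the reward of Eq. (11) and the objective of Eq. (12) concrete, the sketch below evaluates them together with the power caps of Eqs. (3) and (5) and the DF rate of Eq. (6). Several details are assumptions made only for illustration: the logarithm is taken to base 2, the cost function `energy_cost` is a hypothetical choice for $f(s_E, P)$ (the paper does not specify its form), and the discount factor is passed as a separate argument `discount` rather than sharing the symbol $\beta$ with the cost weight.

```python
import math


def power_cap(p_max, p_th, gain_to_primary_sq, channel_idle):
    """Upper bound on the transmit power, following Eqs. (3) and (5)."""
    if channel_idle:
        return p_max
    return min(p_max, p_th / gain_to_primary_sq)


def df_rate(g_sr, g_sd, g_rd, p_s, p_r, n0):
    """Achievable DF rate of Eq. (6); g_* are squared gains |h|^2."""
    source_relay = 0.5 * math.log2(1.0 + g_sr * p_s / n0)
    combined = 0.5 * math.log2(1.0 + g_sd * p_s / n0 + g_rd * p_r / n0)
    return min(source_relay, combined)


def energy_cost(energy_level, num_levels, power, p_max):
    """Hypothetical f(s_E, P): grows with power and with depletion.

    energy_level runs from 0 (empty) to num_levels - 1 (full), and
    num_levels must be at least 2.
    """
    depletion = 1.0 - energy_level / (num_levels - 1)
    return (power / p_max) * (1.0 + depletion)


def reward(alpha, beta, rate, cost):
    """Immediate reward of Eq. (11)."""
    return alpha * rate - beta * cost


def total_discounted_reward(rewards_per_slot, discount):
    """Eq. (12): rewards_per_slot[k] lists R_{R_n}(k) over the relays."""
    horizon = len(rewards_per_slot)
    return sum(
        discount ** (horizon - k - 1) * sum(slot)
        for k, slot in enumerate(rewards_per_slot)
    )


if __name__ == "__main__":
    p_r = power_cap(p_max=0.15, p_th=0.005, gain_to_primary_sq=0.1,
                    channel_idle=False)
    rate = df_rate(g_sr=3.0, g_sd=1.0, g_rd=2.0, p_s=0.15, p_r=p_r, n0=0.01)
    cost = energy_cost(energy_level=2, num_levels=4, power=p_r, p_max=0.15)
    print(reward(alpha=0.5, beta=0.5, rate=rate, cost=cost))
```

The `power_cap` function also shows how the Source/Primary and Relay/Primary gains affect the rate only indirectly, through the admissible power.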
D. The Solution to the Problem
The restless bandit approach has an indexable rule that reduces the computational complexity dramatically. For relay $R_n$ in state $i_{R_n}$, we denote the index by $\delta_{R_n}(i_{R_n})$. According to the restless bandit approach, the optimal policy $A^*$ is a set of optimal actions. Let the element of $A^*$ in row $R_n$ and column $k$ be $a^*_{R_n}(k)$, which represents the optimal action for relay $R_n$ in slot $k$. Thus, if $\delta_{R_n}$ is the largest in the set $\{\delta_1, \delta_2, \ldots, \delta_N\}$, then $a^*_R(k) = 1$ and $a^*_P(k) > 0$; otherwise $a^*_R(k) = 0$. Define the set of all available policies to be $\mathcal{A} = \{A\}$. Thus $A^* = \arg\max_{A \in \mathcal{A}} Z(A)$.

To solve the restless bandit problem, a linear programming (LP) relaxation is developed based on the LP formulations of Markov decision chains (MDCs) [8], and a primal-dual heuristic is applied to derive the index for each relay node. The priority-index rule is to set the relays with the largest indices to be active.

IV. THE PROCESS OF THE OPTIMAL RELAY NODE SELECTION AND POWER ALLOCATION
In our proposed scheme, for each transmission process, an optimal relay node is selected to assist the data transmission with an appropriate power. The process of relay selection and power allocation is divided into an off-line part and an on-line part.

During the off-line part, the priority indices are computed. According to the channel condition, the state space and the transition probability matrices under different actions are determined. For each potential relay node and each possible state $i_{R_n}$ among the $U$ states, the inputs are the state transition probability $p^a_{i_{R_n}, j_{R_n}}$, the reward $R^a_{i_{R_n}}$, the discount factor $\beta$ and the initial state probability vector $\alpha$; the finite set of indices $\{\delta_{i_{R_n}}\}$ is then computed off-line. The indices and the corresponding $R^a_{i_{R_n}}$ are stored in an index table.
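The structure of such an index table can be sketched as a mapping from each joint state to an (index, reward) pair. The sketch below only illustrates the data structure: `toy_reward` and `myopic_index` are hypothetical stand-ins, and in particular `myopic_index` is not the primal-dual index of [8], whose computation requires solving the LP relaxation.

```python
from itertools import product


def build_index_table(num_channel_states, num_energy_levels, index_fn, reward_fn):
    """Map every joint state to an (index, reward) pair.

    A joint state is a tuple of five channel states (S/R, S/D, R/D,
    S/P, R/P) followed by one energy level.
    """
    table = {}
    channel_range = range(num_channel_states)
    for state in product(*([channel_range] * 5), range(num_energy_levels)):
        table[state] = (index_fn(state), reward_fn(state))
    return table


def toy_reward(state):
    """Hypothetical reward: a better R/D channel and more energy pay more."""
    _, _, s_rd, _, _, s_e = state
    return float(s_rd + s_e)


def myopic_index(state):
    """Stand-in index: active reward minus a passive reward of zero."""
    return toy_reward(state)


if __name__ == "__main__":
    table = build_index_table(2, 3, myopic_index, toy_reward)
    print(len(table))  # one entry per joint state
```

Since the table is built once off-line, the on-line part reduces to a dictionary lookup per relay and per slot.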
This index table is used to select the best relay node and allocate power in the on-line part. After the off-line initialization, at the beginning of time slot $k$, the on-line process is as follows.
1) At the beginning of the first phase, a node that receives information from the source node joins the potential relay set. The source node senses the states of the links in its surroundings: the Source/Destination, Source/Relay and Source/Primary links.
2) At the beginning of the second phase, each potential relay node senses the states of the links in its surroundings: the Relay/Destination and Relay/Primary links.
3) With the residual energy state and the link states, each potential relay looks up the index table to find the corresponding index $\delta_{i_{R_n}}$, and all indices are arranged in a list from highest to lowest. A relay node is set to be active if its index is the highest among the potential relay nodes.

V. SIMULATION RESULTS AND DISCUSSIONS
In this section, we illustrate the performance of the proposed scheme by simulation examples. We adopt the DF cooperation protocol for the cooperative transmission. The simulation area is a 50 m × 50 m square. We assume that the users move slowly in free space. The block Rayleigh flat-fading wireless channel model [10] is adopted. Relay nodes with two state variables (wireless channel and residual energy) are considered. The battery capacity is 1000 mAh with an output voltage of 1 V. The maximum power is $P_s = 150$ mW, and the tolerable interference level at the receiver of the primary user is set to $P_{th} = 5$ mW. Each node acting as a potential relay has two action choices: active and passive. Within the same slot, only one potential relay can be set to be active. We consider the system performance in the following two cases: (1) our proposed scheme, which takes both the wireless channel and the residual energy into account, and (2) a scheme that considers only the wireless channel.
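The on-line steps of Section IV, under the simulation setting of one active relay per slot, can be sketched as follows. The function names, the dictionary-based data layout and the tie-breaking rule (smallest relay identifier wins) are assumptions made for this illustration.

```python
def select_relay(index_table, sensed_states):
    """Return the id of the relay with the highest index.

    index_table maps a state to its priority index; sensed_states maps
    a relay id to the state that relay sensed in the current slot.
    Ties are broken in favour of the smallest relay id (an assumption).
    """
    if not sensed_states:
        return None
    return min(
        sensed_states,
        key=lambda relay: (-index_table[sensed_states[relay]], relay),
    )


def slot_actions(index_table, sensed_states, power_levels):
    """Build a_{R_n}(k) = (a_R, a_P) for every potential relay."""
    active = select_relay(index_table, sensed_states)
    return {
        relay: (1, power_levels[relay]) if relay == active else (0, 0.0)
        for relay in sensed_states
    }


if __name__ == "__main__":
    table = {"good": 2.5, "bad": 0.4}          # hypothetical indices
    states = {1: "bad", 2: "good"}              # two potential relays
    print(slot_actions(table, states, {1: 0.10, 2: 0.05}))
```

Passive relays receive a power level of zero, in line with the action space of Section III-A, and each relay needs only its own sensed state and the shared table to evaluate its index.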
We use three metrics, average reward, achievable rate and network lifetime, to evaluate the performance of the proposed algorithm.

A. Dynamic Optimal Policy and Received Reward
Fig. 2 illustrates the optimal policy and the corresponding immediate reward received in each time slot. The following parameter settings are used in this example. For the channel between the relay node and the primary node, $\Pr\{X_{k+1} = v \mid X_k = v\} = 0.5$, $\Pr\{X_{k+1} = v \mid X_k = z\} = 0.3$ and $\Pr\{X_{k+1} = z \mid X_k = z\} = 0.1$. We assume that there are two potential relay nodes in the area. From the figure, we can see that the proposed scheme works well, and that the optimal relay is selected according to the criterion of maximum reward rather than the achievable rate in each time slot. According to the sensed outcomes, each potential node searches the index table to get the corresponding index. Then, the relay node selection and optimal power allocation decisions are made jointly. After a transmission process, the residual energy is updated. This indicates that the restless bandit approach performs well in designing the optimal relay node selection and power allocation scheme in CR networks.

Fig. 2. The optimal relay selection and corresponding received reward along the time. (Panels: "Reward of Restless Bandit Scheme", received reward versus time slot number; "Policy of Restless Bandit Scheme", selected relay node versus time slot number.)

B. Optimal Power Allocation
In our proposed scheme, the power is allocated based on the optimal reward. Although increasing the power consumption improves the achievable rate, it also reduces the network lifetime. Hence, the optimal power allocation has to strike a balance between the achievable rate and the network lifetime. Fig. 3 shows the dynamic optimal power allocation over time. The parameter settings are the same as those in Fig. 2. From the figure, we can see that the power consumption of our proposed scheme is lower than that of the rate based scheme. The rate based scheme always consumes the highest allowed power to improve the achievable rate, which inevitably reduces the network lifetime. In contrast, our proposed scheme makes a tradeoff between achievable rate and network lifetime.

Fig. 3. The optimal power allocation along the time. (Axes: power consumption (mw/Hz) versus time slot number; curves: "Our Proposed Scheme" and "Rate Based Scheme, no Energy Consideration".)

VI. CONCLUSIONS AND FUTURE WORK
In this paper, we have proposed a distributed relay selection scheme that jointly considers optimal power allocation in CR networks with cooperative transmission. With the goal of making a tradeoff between the achievable rate and the network lifetime, we have studied the problem of selecting among multiple potential relay nodes and allocating power optimally. In particular, the CR network with cooperative transmission is formulated as a restless bandit system, and the optimal policy is determined by an indexable algorithm. Simulation results illustrate that our proposed scheme can significantly improve the network lifetime and reduce the power consumption. Other aspects, such as security in cognitive radio cooperative communication networks, will be considered in future work.

REFERENCES
[1] J. Mitola, Cognitive radio: an integrated agent architecture for software defined radio. PhD thesis, Royal Inst. Technol., Stockholm, Sweden, 2000.
[2] N. Laneman, D. Tse, and G. Wornell, “Cooperative diversity in wireless networks: Efficient protocols and outage behavior,” IEEE Trans. Inform. Theory, vol. 50, pp. 3062–3080, Dec. 2004.
[3] G. Ganesan and Y. Li, “Cooperative spectrum sensing in cognitive radio - part II: multiuser networks,” IEEE Trans. Wireless Commun., vol. 6, June 2007.
[4] K. Letaief and W. Zhang, “Cooperative communications for cognitive radio networks,” Proceedings of the IEEE, vol. 97, pp. 878–893, May 2009.
[5] O. Simeone, Y. Bar-Ness, and U. Spagnolini, “Stable throughput of cognitive radios with and without relaying capability,” IEEE Trans. Computers, vol. 55, pp. 2351–2360, Dec. 2007.
[6] P. Gong, J. Park, J. Yoo, B. Yu, and D. Kim, “Throughput maximization with multiuser non-selfish cognitive relaying in CR networks,” in Proc. IEEE ISWPC’09, (Melbourne, Australia), Feb. 2009.
[7] K. Lee and A. Yener, “Outage performance of cognitive wireless relay networks,” in Proc. IEEE GLOBECOM’06, (CA, USA), 2006.
[8] D. Bertsimas and J. Niño-Mora, “Restless bandits, linear programming relaxations, and a primal dual index heuristic,” Operations Research, vol. 48, no. 1, pp. 80–90, 2000.
[9] J. Mietzner, L. Lampe, and R. Schober, “Performance analysis for a fully decentralized transmit power allocation scheme for relay-assisted cognitive-radio systems,” in Proc. IEEE GLOBECOM’08, (LA, USA), Nov. 2008.
[10] H. S. Wang and N. Moayeri, “Finite-state Markov channel - A useful model for radio communication channels,” IEEE Trans. Veh. Tech., vol. 44, pp. 163–171, Feb. 1995.
[11] P. Hu, Z. Zhou, Q. Liu, and F. Li, “The HMM-based modeling for the energy level prediction in wireless sensor networks,” in Proc. IEEE 2nd Conf. on Industrial Electronics and Applications, (Harbin, P.R. China), pp. 2253–2258, May 2007.