Optimal Primary-Secondary User Cooperation Policies in Cognitive Radio Networks

Evangelia Matskani∗, Nestor Chatzidiamantis∗, Leonidas Georgiadis∗, Iordanis Koutsopoulos† and Leandros Tassiulas‡
∗ Department of Electrical & Computer Engineering, Aristotle University of Thessaloniki, Thessaloniki, Greece, Emails: {matskani,nestoras,leonid}@auth.gr
† Athens University of Economics and Business (AUEB), Athens, Greece, Email: [email protected]
‡ Department of Computer and Communication Engineering, University of Thessaly, Email: [email protected]

arXiv:1307.5613v1 [cs.NI] 22 Jul 2013

Abstract—In cognitive radio networks, secondary users (SUs) may cooperate with the primary user (PU), so that the success probability of PU transmissions is improved, while the SUs obtain more transmission opportunities. Thus, the SUs have to take intelligent decisions on whether to cooperate or not, and with what power level, in order to maximize their throughput subject to average power constraints. Cooperation policies in this framework require the solution of a constrained Markov decision problem with infinite state space. In our work, we restrict attention to the class of stationary policies that take randomized decisions in every time slot based only on spectrum sensing. The proposed class of policies is shown to achieve the same set of SU rates as the more general policies, and to enlarge the stability region of the PU queue. Moreover, algorithms for the distributed calculation of the set of probabilities used by the proposed class of policies are presented.

I. INTRODUCTION

Cognitive radio networks (CRNs) have received considerable attention due to their potential for improving spectral efficiency [1]–[3]. The main idea behind CRNs is to allow unlicensed users, also known as secondary users (SUs), to identify temporal and/or spatial spectrum “holes”, i.e., vacant portions of licensed spectrum, and transmit opportunistically, thus gaining access to the underutilized shared spectrum while keeping the interference caused to the licensed users, also known as primary users (PUs), limited. This communication paradigm has been referred to as “Dynamic Spectrum Access” (DSA) in the technical literature [4], [5].
Much prior work on DSA CRNs has focused on the problem of optimal spectrum assignment to multiple SUs [6]–[8]. Several resource allocation algorithms have been proposed, based either on the knowledge of PU transmissions obtained from perfect spectrum sensing mechanisms [6] or on a probabilistic maximum collision constraint with the PUs [7]. Of particular interest is the opportunistic scheduling policy for SUs suggested in [8], which, based on Lyapunov optimization theory [9], [10], maximizes the SUs’ throughput utility while also guaranteeing a low number of collisions with the PU. In all these works it is assumed that no interaction between PUs and SUs exists.
Recently, the concept of cooperation between PUs and SUs in DSA CRNs emerged as a means for providing benefits to both types of users. These benefits have their origin in the fact that, by exploiting the power resources of the SUs, the performance

of PU transmissions can be substantially improved, while in return the SUs obtain more transmission opportunities for their own data, since the chances that the PU queue will be empty and the primary channel free to use are increased. From an information-theoretic perspective, cooperation between SUs and PUs at the physical layer has been investigated in many works (see the survey paper [11] and the references therein). Queueing-theoretic aspects and spectrum leasing strategies for cooperative CRNs have been investigated in [12]–[15]. Specifically, spectrum leasing strategies where the PU leases a portion of its spectrum to SUs in return for cooperative relaying were suggested in [12]. A protocol where an SU relays the PU packets that have not been correctly received by their destination was suggested and investigated in terms of SU stable throughput in [13], while similar protocols were suggested and compared in [14], considering various physical layer relaying strategies.
The work in the current paper builds on the model presented in [15]. In the latter work, a dynamic decision policy for the SUs’ activities (i.e., whether to relay PU transmissions and with what power level) was suggested. Taking into account the fact that the PU channel occupancy process is influenced by the SUs’ activities due to the interaction between PUs and SUs, this policy aimed at maximizing an SU throughput utility subject to the SUs’ average power constraints. The proposed policy is proved to be optimal; however, its basic requirement is that the PU packet arrival rates must be lower than a threshold value which guarantees that the primary queue is stable even when the SUs never cooperate. This regime places significant limitations on the achievable PU stability region, since the sustainable arrival rates of the PU may be much larger than this limit. In fact, SU cooperation aims precisely at increasing the PU transmission rates, thus emptying the PU queue at a faster rate. Hence, SU cooperation has the potential to increase the range of PU traffic arrival rates for which stability of the PU queue can be guaranteed.
In this paper we overcome this limitation and investigate transmission policies for cooperative CRNs that can be applied even when PU arrival rates are above the threshold set by [15]. Taking into account the fact that the decision options and success probabilities are different during the idle and busy PU periods, while the size of these periods is in turn affected by the cooperation decisions, such policies

require, in general, the solution of a non-trivial constrained Markov decision problem with infinite state space, where the state is the size of the PU queue. The solutions of such Markov decision problems suffer from large convergence times, and their implementation in general requires knowledge of the PU queue size [16]. In our work, we follow an alternative approach and propose a restricted class of stationary policies which take random decisions on the secondaries’ activities in every time slot based only on a set of probabilities, calculated through a linear programming optimization problem, and on the primary channel spectrum sensing result, i.e., PU channel busy or idle. Our approach is proven to achieve the same set of SU rates as the more general policies; hence the policies in the restricted class can be employed when optimization of any utility function is required. In addition, the stability region of the PU queue that is supported by the proposed class of policies is characterized and is shown to be larger than the stability region supported by the algorithm in [15]. Implementation issues are further discussed, and an algorithm for the distributed calculation, among the SUs, of the set of probabilities that the proposed class of randomized policies uses is presented. This distributed version offers a robust alternative to a centralized implementation, trading additional messaging among the SUs for lower per-node complexity.
The remainder of the paper is organized as follows. In Section II, we introduce the system model and the available controls, and we define the admissible policies and their performance region. In Section III, we describe the mode of operation of the proposed restricted class of randomized policies and show their optimality. Furthermore, the supported stability region of the PU queue is calculated and the distributed implementation algorithm is described. Section IV presents simulation results which show the optimality of the proposed restricted class of policies, and, finally, concluding remarks are provided in Section V.

II. SYSTEM MODEL

We consider a network with one PU and multiple SUs. The PU is the licensed owner of the channel and transmits whenever it has data to send. On the other hand, the SUs do not have any licensed spectrum and seek transmission opportunities on the primary channel. We assume that one of the SUs can cooperate with the PU in order to improve the success probability of the latter’s transmissions. This can be achieved by allocating some of the SU’s power resources for relaying the primary traffic. Furthermore, the transmission of the SUs is coordinated so that, after sensing the PU channel, it is decided which SU will cooperate (if any) and at what power level (if the primary channel is busy), or which SU will transmit and at what power level (if the primary channel is idle). In what follows we describe the parameters of the system model under investigation as well as the available controls.

A. System Model Parameters

We consider a time-slotted model, where time slot t = 0, 1, ... occupies the time interval [t, t + 1); t and t + 1 are

called the “beginning” and “end” of slot t, respectively. The PU receives new packets in time slot t, i.e., in the interval [t, t+1), according to an independent and identically distributed (i.i.d.) arrival process A_p(t) with mean rate λ_p packets/slot. We assume that each SU has an infinite queue of packets. We denote by S the set of secondary users. Each secondary user s ∈ S can transmit using one of I_s power levels, P_s(i), i = 1, ..., I_s, where P_s(i) < P_s(i+1). To simplify the description that follows we set P_s(0) = 0. Secondary user s uses these power levels to either transmit its own data, or to assist the transmission of packets by the primary user. At each time slot only a single packet transmission can take place. Furthermore, when transmission of packets from the primary user takes place, at most one of the secondaries can cooperate. Finally, there is a constraint on the long-term average power P̂_s consumed by every s ∈ S. Hence, for every s ∈ S, if i_s(t) is the power level used by s at slot t, it must hold
$$\limsup_{t\to\infty}\ \frac{1}{t}\sum_{\tau=0}^{t}\mathbb{E}\left[P_s(i_s(\tau))\right]\le \hat{P}_s,\qquad i_s(\tau)\in \mathcal{I}_s^0, \tag{1}$$

where E[·] denotes expectation, I_s = {1, 2, ..., I_s} and I_s^0 = I_s ∪ {0}.
We assume an erasure channel model, i.e., each transmission (by the primary or one of the secondaries) is either received correctly or erased. When SU s transmits one of its own packets with power level i ∈ I_s^0, the probability of success is r_s(i), where r_s(0) = 0, i.e., the success probability is zero if no power is used for transmission. When s cooperates with the PU, which means that it assists in the transmission of the PU packets, at power level i, the success probability of the transmitted packet is r_p(s, i). When i = 0 (i.e., the secondary “cooperates” with zero transmission power), in effect no cooperation takes place; therefore it is natural and useful in the discussion that follows to assume that r_p(s, 0) = r_p(0) ≥ 0 for all s ∈ S, where r_p(0) denotes the probability of successful transmission by the PU when the SUs do not cooperate. In addition, we make the natural assumption that r_p(s, i) ≤ r_p(s, i+1), i.e., the probability of successful reception is a nondecreasing function of the transmission power.

B. Available Controls

At the beginning of time slot t the following controls are available, depending on the status of the primary queue Q_p(t). In case Q_p(t) > 0, the available controls are the following:
• A packet from the primary queue is transmitted, i.e., transmission of SU packets is excluded. We refer to this constraint as the “PU Priority Constraint”.
• An SU s(t) is selected for cooperation with the primary in order to assist the transmission of its packet.
• A power level i(t) ∈ I_s^0 is selected, so that s(t) cooperates with the primary using power level P_{s(t)}(i(t)). It should be noted that when i(t) = 0 is selected, in effect no cooperation takes place.
On the other hand, when Q_p(t) = 0, the available controls are the following:

• An SU s(t) is selected to transmit its own packet.
• A power level i(t) ∈ I_s^0 is selected, so that s(t) transmits its own packets using power level P_{s(t)}(i(t)). It should be noted that if i(t) = 0 is used for transmission, in effect no transmission takes place in slot t.
The control decisions are summarized in Eq. (2):
$$\gamma(t)=\begin{cases}(s(t),i(t)), & \text{if } Q_p(t)>0\ \text{(i.e., } s(t)\text{ cooperates with the primary at power level } i(t)\text{)}\\(s(t),i(t)), & \text{if } Q_p(t)=0\ \text{(i.e., } s(t)\text{ transmits its own packets at power level } i(t)\text{)}\end{cases} \tag{2}$$



C. Admissible Policies, Performance Space, and Performance Objective

Admissible policies must operate in such a manner that the following constraints are satisfied:

Policy Constraints
• The primary queue must be mean rate stable, i.e., the long-term output rate of the PU queue should be equal to its input rate, or equivalently [10]
$$\lim_{t\to\infty}\frac{\mathbb{E}\left[Q_p(t)\right]}{t}=0. \tag{3}$$
• The power constraints of Eq. (1) are satisfied.

Under an admissible policy, each SU s ∈ S obtains an average transmission rate
$$\bar{r}_s=\liminf_{t\to\infty}\ \frac{\sum_{\tau=0}^{t-1}\mathbb{E}\left[r_s\left(P_s(i_s(\tau))\right)\right]}{t}, \tag{4}$$
where i_s(t) is the power level at which s transmits in slot t. The performance space for the problem under consideration is defined as the set of {r̄_s}_{s∈S} that can be obtained by all admissible policies. The selection of an admissible policy is usually effected through an optimization problem involving the quantities of interest, which in our case are {r̄_s}_{s∈S}. Generally, the optimization objective is of the form
$$\text{maximize: } f\left(\{\bar{r}_s\}_{s\in S}\right), \tag{5}$$

where {r̄_s}_{s∈S} belongs to the performance space and f(·) is a function of the average transmission rates achieved by each SU. In the simplest case f is a linear function of these rates; however, fairness considerations may require f to be a nonlinear (usually separable) function [17].
According to the formulation of the system model, and taking into account the available controls in Section II-B, the primary queue size Q_p(t) can be seen as the state of a constrained Markov Decision Process [16], where the constraints are imposed by the Policy Constraints described above. Let C_1 be the class of admissible policies of this Markov Decision Process. This class contains policies that employ controls based on system history and includes the class of randomized stationary policies, which act as follows:
• When Q_p(t) = m, select secondary user s to cooperate with the primary at power level i with a certain probability that depends on m.
• When Q_p(t) = 0, select secondary user s to transmit its own packets at power level i with a certain probability.

III. A CLASS OF RESTRICTED STATIONARY POLICIES

In this work we restrict attention to a subset of the policies in C_1, denoted by C_0, whose decisions are based solely on whether the queue of the primary user is zero or non-zero. Thus, in every time slot, a policy in C_0 acts as follows:
• When Q_p(t) > 0, or equivalently the channel is sensed busy, select secondary user s to cooperate at power level i with a probability q(s, i | b).
• When Q_p(t) = 0, or equivalently the channel is sensed idle (empty), select secondary user s to transmit its own data at power level i with probability q(s, i | e).
Since the class C_0 imposes severe restrictions on the manner in which controls are selected, it might seem that employing a policy in this class will lead to suboptimal performance. However, as will be shown, adopting a policy in C_0 entails no loss in performance, even if we consider the richer class of policies that allow secondary users to transmit when the primary queue is nonempty, i.e., that do not respect the PU Priority Constraint. In order to substantiate this claim we next evaluate the performance space of policies in C_0.

A. Performance Space of Admissible Policies in C_0

Consider a policy π in class C_0. Under π the average packet service rate of the primary queue is given by
$$\bar{r}_p=\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, q(s,i\,|\,b). \tag{6}$$
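To make the per-slot operation of a policy in C_0 concrete, the following minimal sketch (our own illustration, not code from the paper) draws one decision from the probability tables q(s, i | b) and q(s, i | e); the table layout and function names are assumptions made for the example.

```python
import numpy as np

def c0_policy_action(channel_busy, q_busy, q_empty, rng):
    """Sample one slot's decision for a policy in class C0.

    q_busy[s, i]  : probability of selecting SU s to cooperate at power level i
                    when the PU channel is sensed busy  (q(s, i | b)).
    q_empty[s, i] : probability of selecting SU s to transmit its own packet at
                    power level i when the channel is sensed idle (q(s, i | e)).
    Each table is a 2-D array whose entries sum to one.
    """
    q = q_busy if channel_busy else q_empty
    flat = q.ravel()
    # Draw one (SU, power level) pair according to the stationary probabilities.
    choice = rng.choice(len(flat), p=flat)
    s, i = np.unravel_index(choice, q.shape)
    return s, i  # SU index and power-level index used in this slot

# Example with two SUs and power levels {0, 1} (numbers are illustrative only).
rng = np.random.default_rng(0)
q_busy  = np.array([[0.1, 0.4], [0.1, 0.4]])
q_empty = np.array([[0.0, 0.5], [0.0, 0.5]])
print(c0_policy_action(channel_busy=True, q_busy=q_busy, q_empty=q_empty, rng=rng))
```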

Standard results from queueing theory show that the stability region (i.e., the closure of the set of PU arrival rates λ_p for which the PU queue is mean rate stable under policy π) of the primary queue under π is the interval [0, r̄_p]. Assume next that λ_p ∈ [0, r̄_p) and let q_b be the steady state probability that the primary queue is busy under π. Viewing the transmitter at the primary user as a queueing system holding 0 packets (if the primary queue is empty) or 1 packet (i.e., the packet whose transmission is attempted if the queue is nonempty) and applying Little’s formula to this system, we have
$$q_b=\Pr\{\text{PU is busy}\}=\frac{\lambda_p}{\bar{r}_p}. \tag{7}$$

Hence the steady state probability that the primary queue is empty is q_e = 1 − q_b. Due to the PU Priority Constraint, secondary users may transmit their own data only when the primary user queue is empty. Hence, the steady state average packet transmission rate of SU s traffic is equal to
$$\bar{r}_s=\left(\sum_{i\in \mathcal{I}_s} r_s(i)\, q(s,i\,|\,e)\right) q_e. \tag{8}$$
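For instance, under the hypothetical values $\bar{r}_p = 0.8$ packets/slot and $\lambda_p = 0.6$ packets/slot, Eq. (7) gives $q_b = 0.6/0.8 = 0.75$ and hence $q_e = 0.25$, i.e., the SUs obtain transmission opportunities for their own data in one quarter of the slots.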

Furthermore, regarding the average power consumed by SU s, considering the cases where the primary queue is empty or nonempty, we can write
$$\bar{P}_s=q_e\sum_{i\in \mathcal{I}_s} P_s(i)\, q(s,i\,|\,e)+q_b\sum_{i\in \mathcal{I}_s} P_s(i)\, q(s,i\,|\,b). \tag{9}$$



$$q_b\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, q(s,i\,|\,b)=\lambda_p \tag{10}$$
$$q_e\sum_{i\in \mathcal{I}_s} P_s(i)\, q(s,i\,|\,e)+q_b\sum_{i\in \mathcal{I}_s} P_s(i)\, q(s,i\,|\,b)\le \hat{P}_s,\quad s\in S \tag{11}$$
$$q_b+q_e=1 \tag{12}$$
$$\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} q(s,i\,|\,b)=1 \tag{13}$$
$$\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} q(s,i\,|\,e)=1 \tag{14}$$
$$q(s,i\,|\,b)\ge 0,\quad q(s,i\,|\,e)\ge 0,\quad s\in S,\ i\in \mathcal{I}_s^0 \tag{15}$$
$$q_b\ge 0,\quad q_e\ge 0 \tag{16}$$

Consider now an admissible policy π in C_0, i.e., π stabilizes the primary queue and satisfies the power constraints of Eq. (1). The discussion above shows that the constraints that need to be satisfied by the set of probabilities {q_b, q(s,i|b), q(s,i|e), q_e}_{s∈S, i∈I_s^0}, according to Eqs. (1), (7), are given by Eqs. (10)-(16) above. Conversely, given a set of probabilities {q_b, q(s,i|b), q(s,i|e), q_e}_{s∈S, i∈I_s^0} that satisfies the constraints (10)-(16), with q_b < 1, an admissible policy in C_0 can be defined. Hence, the performance space of these policies is the set of {r̄_s}_{s∈S} defined by Eq. (8), where the set of probabilities {q_b, q(s,i|b), q(s,i|e), q_e}_{s∈S, i∈I_s^0} satisfies the constraints (10)-(16). While the constraints (10)-(16) are nonlinear with respect to the parameters {q_b, q(s,i|b), q(s,i|e), q_e}, they can be easily transformed into linear constraints by the transformation
$$q(b,s,i)=q_b\, q(s,i\,|\,b),\qquad q(e,s,i)=q_e\, q(s,i\,|\,e), \tag{17}$$
and considering the parameters {q(b,s,i), q(e,s,i)}_{s∈S, i∈I_s^0}.

Note that q(b,s,i) is the probability that the PU is busy and SU s is selected for cooperation at power level i, while q(e,s,i) is the probability that the PU is idle and SU s packets are transmitted in a slot at power level i. With this transformation the constraints become
$$\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, q(b,s,i)=\lambda_p \tag{18}$$
$$\sum_{i\in \mathcal{I}_s} P_s(i)\, q(e,s,i)+\sum_{i\in \mathcal{I}_s} P_s(i)\, q(b,s,i)\le \hat{P}_s,\quad s\in S \tag{19}$$
$$\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} q(e,s,i)+\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} q(b,s,i)=1 \tag{20}$$
$$q(e,s,i)\ge 0,\quad q(b,s,i)\ge 0,\quad s\in S,\ i\in \mathcal{I}_s^0. \tag{21}$$

In addition, the achievable secondary user rates, given by Eq. (8), can be rewritten as
$$\bar{r}_s=\sum_{i\in \mathcal{I}_s} r_s(i)\, q(e,s,i). \tag{22}$$

It should be noted that the number of parameters in the constraints depicted by Eqs. (18)-(22) is 2 Σ_{s∈S} |I_s^0|, i.e., Σ_{s∈S} |I_s^0| parameters {q(b,s,i)}_{s∈S, i∈I_s^0} plus Σ_{s∈S} |I_s^0| parameters {q(e,s,i)}_{s∈S, i∈I_s^0}. Hence, a policy in C_0 that achieves Eq. (5) can be obtained by solving a deterministic optimization problem with 2 Σ_{s∈S} |I_s^0| unknowns satisfying the constraints of Eqs. (18)-(22).
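To illustrate the structure of this deterministic problem, the sketch below sets up the linear program defined by Eqs. (18)-(21) with a weighted linear objective built from Eq. (22) and solves it with an off-the-shelf LP solver; the parameter values, the variable packing, and the scipy-based implementation are our own illustrative assumptions, not part of the paper.

```python
import numpy as np
from scipy.optimize import linprog

# Illustrative (hypothetical) parameters: |S| = 2 SUs, power levels {0, 1}.
r_p   = np.array([[0.4, 0.8], [0.4, 0.8]])   # r_p(s, i): PU success prob. with SU s cooperating at level i
r_s   = np.array([[0.0, 1.0], [0.0, 1.0]])   # r_s(i):    SU success prob. at level i
P     = np.array([[0.0, 1.0], [0.0, 1.0]])   # P_s(i):    power spent at level i
P_hat = np.array([0.5, 0.5])                 # average-power budgets
lam_p = 0.6                                  # PU arrival rate (packets/slot)
w     = np.array([1.0, 1.0])                 # weights of a linear utility f = sum_s w_s * rbar_s

S, I = r_p.shape
n = 2 * S * I                                # unknowns: all q(b,s,i) first, then all q(e,s,i)
qb = lambda s, i: s * I + i                  # index helpers into the flat variable vector
qe = lambda s, i: S * I + s * I + i

c = np.zeros(n)                              # linprog minimizes, so negate the utility (Eq. (22))
for s in range(S):
    for i in range(I):
        c[qe(s, i)] = -w[s] * r_s[s, i]

A_eq = np.zeros((2, n)); b_eq = np.array([lam_p, 1.0])
A_ub = np.zeros((S, n)); b_ub = P_hat.copy()
for s in range(S):
    for i in range(I):
        A_eq[0, qb(s, i)] = r_p[s, i]                       # Eq. (18)
        A_eq[1, qb(s, i)] = A_eq[1, qe(s, i)] = 1.0         # Eq. (20)
        A_ub[s, qb(s, i)] = A_ub[s, qe(s, i)] = P[s, i]     # Eq. (19)

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * n, method="highs")       # Eq. (21) via the bounds
print(res.status, -res.fun)                  # status 0 means solved; -res.fun is the achieved utility
```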

B. Sufficiency of Policies in C_0

In this subsection, we show that the performance space of C_0 is the same as the performance space of C_1 (in fact, as will be shown below, a stronger statement holds); thus it is sufficient for optimality to select policies in C_0. Towards this end, we consider an extended class of policies C_2 where the PU Priority Constraint does not exist, i.e., the SUs may transmit even when the primary queue is nonempty. For such policies, the available controls at the beginning of each time slot are of the form
$$\tilde{\gamma}=(u,s,i),\quad u\in\{1,0\},\ s\in S,\ i\in \mathcal{I}_s^0, \tag{23}$$

where
• Control (1, s, i) dedicates the slot for transmission of PU traffic and assigns SU s at power level i to cooperate with the primary. Note that this control can be assigned even if the PU queue is empty, in which case no packet is transmitted.

• Control (0, s, i) dedicates the slot for transmission of SU traffic, and selects SU s to transmit at power level i.
Since the policies in C_2 do not impose the PU Priority Constraint, and C_2 includes even nonstationary policies, it follows that C_0 ⊆ C_1 ⊆ C_2.

Hence the performance spaces R_0, R_1, R_2, obtained while satisfying the Policy Constraints in Section II-C under the classes of policies C_0, C_1, C_2, respectively, satisfy the relation
$$R_0\subseteq R_1\subseteq R_2. \tag{24}$$
We will show next that
$$R_2\subseteq R_0, \tag{25}$$
which together with (24) implies R_0 = R_1 = R_2. Hence, under any optimization objective, it suffices to restrict attention to policies in C_0, even when one has the freedom of not adhering to the PU Priority Constraint.
Contrary to the available controls when the PU Priority Constraint is imposed, the available controls for policies in C_2 do not depend on the PU queue size (a slot may be allocated to PU packet transmission even if the PU queue is empty). Hence, this class of policies falls in the framework of policies studied in [9], [10] and its performance space can be characterized again by the performance space of stationary policies. In the latter framework, a stationary policy selects at the beginning of each time slot the control (u, s, i) with probability p(u, s, i). Thus, under such a policy, the probability of successful transmission of SU s packets is
$$\bar{r}_s=\sum_{i\in \mathcal{I}_s} r_s(i)\, p(0,s,i). \tag{26}$$

Similarly, the probability of successful transmission of PU packets is
$$\bar{r}_p=\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, p(1,s,i) \tag{27}$$
(a slot may be allocated to a user whose queue is empty, in which case no packet is transmitted), and stability of the PU queue requires that
$$\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, p(1,s,i)\ge \lambda_p. \tag{28}$$
Also, the average power constraint requirement implies that
$$\sum_{i\in \mathcal{I}_s} P_s(i)\, p(0,s,i)+\sum_{i\in \mathcal{I}_s} P_s(i)\, p(1,s,i)\le \hat{P}_s,\quad s\in S. \tag{29}$$
Finally, since the p(u,s,i) represent probabilities, we must have
$$\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} p(0,s,i)+\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} p(1,s,i)=1 \tag{30}$$
$$p(0,s,i)\ge 0,\quad p(1,s,i)\ge 0,\quad s\in S,\ i\in \mathcal{I}_s^0. \tag{31}$$
The constraints of Eqs. (28)-(31), together with Eq. (26), define the performance region R_2. The similarity of these constraints to Eqs. (18)-(22) should be noted. From a purely mathematical point of view, the only difference is that there is equality in Eq. (18), in contrast with the inequality in Eq. (28). However, the physical meaning of the probability parameters is different. Specifically,
• q(b,s,i) is the probability that the PU is busy and SU s is selected for cooperation at power level i, while p(1,s,i) is the probability of selecting a PU packet for transmission and SU s for cooperation at power level i.
• q(e,s,i) is the probability that the PU is idle and secondary user s packets are transmitted in a slot at power level i, while p(0,s,i) is the probability of selecting secondary user s packets for transmission at power level i.
As expected, it is immediate from the above that R_0 ⊆ R_2. The next theorem shows that R_2 = R_0.

Theorem 1. It holds R_2 ⊆ R_0, hence R_0 = R_1 = R_2.

Proof: Let {r̄_s}_{s∈S} ∈ R_2. We will show that {r̄_s}_{s∈S} ∈ R_0, which proves the theorem. If λ_p = Σ_{s∈S} Σ_{i∈I_s^0} r_p(s,i) p(1,s,i), then clearly {r̄_s}_{s∈S} ∈ R_0. Assume next that
$$\lambda_p<\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, p(1,s,i).$$
We distinguish the following cases.

Case 1. λ_p ≥ r_p(0) p(1), where p(1) ≜ Σ_{s∈S} Σ_{i∈I_s^0} p(1,s,i). Note that since
$$r_p(0)\, p(1)\le \lambda_p<\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, p(1,s,i), \tag{32}$$
the parameter α obtained as a solution of the equation
$$\lambda_p=\alpha\left(\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, p(1,s,i)\right)+(1-\alpha)\, r_p(0)\, p(1) \tag{33}$$
satisfies
$$0\le \alpha<1. \tag{34}$$
We now define the set of parameters q(b,s,i) and q(e,s,i) by setting q(e,s,i) = p(0,s,i) for all s ∈ S and i ∈ I_s^0, and
$$q(b,s,i)=\begin{cases}\alpha\, p(1,s,i), & \text{if } i\in \mathcal{I}_s\\ \alpha\, p(1,s,0)+(1-\alpha)\, p(1,s), & \text{if } i=0\end{cases} \tag{35}$$
for all s ∈ S, where p(1,s) ≜ Σ_{j∈I_s^0} p(1,s,j). Due to Eq. (34), these parameters are nonnegative. Note also that
$$\sum_{i\in \mathcal{I}_s^0} q(b,s,i)=\sum_{i\in \mathcal{I}_s^0} p(1,s,i). \tag{36}$$
Hence the new set of parameters satisfies Eq. (20). Also, since P_s(0) = 0, after some basic algebraic manipulations it can be seen that the new set of parameters satisfies Eq. (19). Finally, due to (33), it holds that
$$\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, q(b,s,i)=\lambda_p. \tag{37}$$
Hence the new set of parameters satisfies Eqs. (18)-(21). Also, since q(e,s,i) = p(0,s,i) and the rates {r̄_s}_{s∈S} ∈ R_2 depend only on p(0,s,i), it follows that the SU rates under the new set of parameters do not change, which implies that {r̄_s}_{s∈S} ∈ R_0.

Case 2. λ_p < r_p(0) p(1). Define the new set of parameters as follows:
$$q(b,s,i)=\begin{cases}0, & \text{if } i\in \mathcal{I}_s\\ \dfrac{\lambda_p}{r_p(0)\, p(1)}\, p(1,s), & \text{if } i=0\end{cases} \tag{38}$$
and
$$q(e,s,i)=\begin{cases}p(0,s,i), & \text{if } i\in \mathcal{I}_s\\ \beta \sum_{i\in \mathcal{I}_s^0} p(0,s,i)+p(0,s,0), & \text{if } i=0\end{cases} \tag{39}$$
for all s ∈ S, where
$$\beta=\frac{1-\frac{\lambda_p}{r_p(0)}}{1-p(1)}-1.$$
Since λ_p < r_p(0) p(1) and p(1) ≤ 1, it follows that
$$\beta>0, \tag{40}$$
hence all the defined parameters are nonnegative. Also, due to (30), it holds that
$$\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} q(b,s,i)+\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} q(e,s,i)=1 \tag{41}$$
and Eq. (20) is satisfied. Next, after some basic algebraic manipulations,
$$\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, q(b,s,i)=\lambda_p, \tag{42}$$
thus satisfying Eq. (18). Furthermore, regarding the average power constraints, due to (29),
$$\sum_{i\in \mathcal{I}_s} P_s(i)\, q(e,s,i)+\sum_{i\in \mathcal{I}_s} P_s(i)\, q(b,s,i)\le \hat{P}_s \tag{43}$$
and hence (19) is also satisfied. Finally, regarding the rates, since again we have q(e,s,i) = p(0,s,i) for all s ∈ S and i ∈ I_s, and P_s(0) = 0, we conclude that {r̄_s}_{s∈S} ∈ R_0.

C. Stability Region of the PU Queue

The stability region of the PU queue is the closure of the set of PU arrival rates for which there is a policy in C_0 (hence, according to the above, in C_2 as well) that can stabilize the PU queue. According to the above, this is equivalent to the requirement that there exists a set of probabilities {q(b,s,i), q(e,s,i)}_{s∈S, i∈I_s^0} satisfying Eqs. (18)-(22). Based on this observation and the previous discussion we have the following corollary.

Corollary 2. The stability region of the PU queue under the class of policies in C_0 is the interval [0, λ̂], where λ̂ is the value of the following linear optimization problem, OPT1:
$$\text{maximize: } \sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, x(b,s,i) \tag{44}$$
$$\text{u.c. } \sum_{i\in \mathcal{I}_s} P_s(i)\, x(b,s,i)\le \hat{P}_s,\quad s\in S \tag{45}$$
$$\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} x(b,s,i)\le 1 \tag{46}$$
$$x(b,s,i)\ge 0,\quad s\in S,\ i\in \mathcal{I}_s^0. \tag{47}$$

Proof: Note that problem OPT1 always has a feasible solution: set x(b,s,i) = 0 for s ∈ S, i ∈ I_s and select arbitrarily x(b,s,0) ≥ 0 so that Σ_s x(b,s,0) = 1, thus
$$\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, x(b,s,i)=r_p(0). \tag{48}$$
Since λ̂ is the optimal value of OPT1, it follows that r_p(0) ≤ λ̂, as expected. Physically, this choice of parameters corresponds to the case where the SUs never cooperate. Moreover, since it is easily seen that the value of the optimization problem is bounded, it follows that there are parameters x̂(b,s,i), s ∈ S, i ∈ I_s^0, such that
$$\hat{\lambda}=\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, \hat{x}(b,s,i), \tag{49}$$
$$\sum_{i\in \mathcal{I}_s} P_s(i)\, \hat{x}(b,s,i)\le \hat{P}_s,\quad s\in S, \tag{50}$$
$$\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} \hat{x}(b,s,i)\le 1, \tag{51}$$
$$\hat{x}(b,s,i)\ge 0,\quad s\in S,\ i\in \mathcal{I}_s^0. \tag{52}$$
If λ_p belongs to the stability region of the system, then Eqs. (18)-(21) are satisfied. But then Eqs. (44)-(47) are also satisfied by choosing x(b,s,i) = q(b,s,i), which implies that λ_p ≤ λ̂. Conversely, given any λ_p ≤ λ̂, the choice of
$$q(b,s,i)=\left(\lambda_p/\hat{\lambda}\right)\hat{x}(b,s,i),\quad s\in S,\ i\in \mathcal{I}_s^0,\qquad q(e,s,i)=0,\quad s\in S,\ i\in \mathcal{I}_s,$$
and q(e,s,0) ≥ 0 arbitrarily chosen so that Σ_{s∈S} q(e,s,0) = 1 − Σ_{s∈S} Σ_{i∈I_s^0} q(b,s,i), satisfies Eqs. (19)-(21). In addition,
$$\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, q(b,s,i)=\frac{\lambda_p}{\hat{\lambda}}\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, \hat{x}(b,s,i)=\lambda_p,$$
proving that λ_p belongs to the stability region of the PU queue. This concludes the proof.

Remark 3. It can be easily seen that the value of the optimization problem OPT1 in Corollary 2 does not change if the inequality in Eq. (46) is replaced by equality. This implies what is intuitively expected, i.e., when λ_p = λ̂, no idle slots are left by the PU, i.e., q_b = 1 and q_e = 0, and the available power of every SU is allocated only to cooperation (or not) with the PU.
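As an illustration of Corollary 2, λ̂ can be obtained by feeding OPT1 to any LP solver; the sketch below uses hypothetical two-SU parameters and a scipy-based solver, both our own choices rather than code from the paper.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical parameters: 2 SUs, power levels {0, 1}.
r_p   = np.array([[0.4, 0.8], [0.4, 0.8]])   # r_p(s, i)
P     = np.array([[0.0, 1.0], [0.0, 1.0]])   # P_s(i)
P_hat = np.array([0.5, 0.5])

S, I = r_p.shape
c = -r_p.ravel()                             # maximize Eq. (44): total PU service rate

# Eq. (45): per-SU average-power constraints; Eq. (46): probabilities sum to at most one.
A_ub = np.zeros((S + 1, S * I))
b_ub = np.concatenate([P_hat, [1.0]])
for s in range(S):
    A_ub[s, s * I:(s + 1) * I] = P[s]
A_ub[S, :] = 1.0

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * (S * I), method="highs")
print("lambda_hat =", -res.fun)              # 0.8 for these numbers: each SU cooperates in half of the slots
```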

D. Distributed Implementation

In this section we assume that the policy in C_0 is designed as a solution to the following optimization problem, OPT2:
$$\text{maximize } \sum_{s\in S} f_s(\bar{r}_s)\qquad \text{u.c. Eqs. (18), (19), (20), (21).} \tag{53}$$
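For reference, a centralized prototype of OPT2 can be written down directly with a convex-optimization modeling tool; the sketch below assumes a logarithmic (proportional-fairness) choice for f_s and hypothetical two-SU parameters — both are our own illustrative assumptions, not part of the paper.

```python
import numpy as np
import cvxpy as cp

# Hypothetical parameters: 2 SUs, power levels {0, 1}; f_s chosen as log for
# proportional fairness (one possible concave choice, not prescribed by the paper).
r_p   = np.array([[0.4, 0.8], [0.4, 0.8]])
r_s   = np.array([[0.0, 1.0], [0.0, 1.0]])
P     = np.array([[0.0, 1.0], [0.0, 1.0]])
P_hat = np.array([0.5, 0.5])
lam_p = 0.6

S, I = r_p.shape
q_b = cp.Variable((S, I), nonneg=True)                   # q(b, s, i), Eq. (21)
q_e = cp.Variable((S, I), nonneg=True)                   # q(e, s, i)
r_bar = cp.sum(cp.multiply(r_s, q_e), axis=1)            # SU rates, Eq. (22)

constraints = [
    cp.sum(cp.multiply(r_p, q_b)) == lam_p,              # Eq. (18)
    cp.sum(cp.multiply(P, q_b + q_e), axis=1) <= P_hat,  # Eq. (19)
    cp.sum(q_b) + cp.sum(q_e) == 1,                      # Eq. (20)
]
problem = cp.Problem(cp.Maximize(cp.sum(cp.log(r_bar))), constraints)
problem.solve()
print(problem.status, r_bar.value)
```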

The functions {f_s}_{s∈S} are usually selected so that certain fairness criteria for the SU rate allocation are satisfied, and they are assumed to be concave with respect to the variable r̄_s. Thus, due to the fact that r̄_s, ∀s ∈ S, is a linear function of the variables {q(e,s,i)}_{i∈I_s^0}, {q(b,s,i)}_{i∈I_s^0}, f_s(r̄_s) is also a concave function of these variables. Hence, OPT2 is a convex optimization problem and can be solved efficiently in a centralized manner via modern interior point methods.
In an operational environment where parameters may change, problem OPT2 will have to be solved whenever significant changes to the parameters occur. Thus, a centralized solution requires the assignment of a specific “Leader” node, which will be responsible for the collection of the parameters, the solution of OPT2, and the appropriate scheduling of packet transmissions. Since the priority requirements of the PU may exclude the use of this node as a Leader, one of the SUs may need to be assigned for this purpose. Such a solution creates a “single point of failure” in case there is a fault at the Leader node. There is also a scaling problem with this approach, since the number of unknowns is of the order 2|S|I, where I is the maximum number of power levels at the SU nodes. Hence, if the number of SUs increases the computational requirement increases and, depending on the type of nodes, may render the solution of OPT2 by a single node infeasible.
In this section, we investigate the possibility of obtaining the solution to OPT2 in a distributed fashion, without the need of a Leader. The main features of this approach are the following.
a) The PU’s involvement in the algorithm is only to announce its arrival rate λ_p at the beginning of the algorithm; no further participation is required.
b) An SU node does not need to know the parameters (i.e., r_s(i), r_p(s,i), i ∈ I_s) of the rest of the SU nodes.
c) The distributed solution requires each SU node s ∈ S to solve optimization problems with I_s variables, hence the computational complexity for each node does not depend on the size of the SU network.
d) Two messages are broadcast by each SU node per iteration. The number of iterations until convergence depends on the SU network size, but this seems unavoidable.
e) Once convergence of the algorithm is reached for a given arrival rate, the SUs need only observe the state of the PU channel (busy or idle); they can then decide which SU will cooperate with the primary (if the primary is busy) or transmit its own packets (if the primary is idle), and at what power level, distributively, without the need of a scheduler or the exchange of control messages.
We assume that there is a separate low-rate channel which is used by the SUs for control message exchange [18].

In particular, we assume that control messages may be broadcast among the SUs, either because the low-rate channel is broadcast in nature, or through the establishment of broadcast trees such as those usually employed in ad-hoc network formation [19], [20].
Towards a distributed solution to problem OPT2, we would ideally like to decompose this global problem into |S| parallel subproblems, so that each one could be solved by one of the SU nodes, involving only its local variables and parameters. Yet this is not directly possible, although the objective function is already decomposable across the SUs, due to the coupling of the SUs’ local variables in the constraints. A traditional technique to circumvent this difficulty is the dual decomposition method using the Lagrange function corresponding to problem OPT2. However, this method can be very slow in terms of convergence [21], [22]. Indeed, the dual decomposition based algorithm we initially tried in order to solve OPT2 appeared to be prohibitively slow, i.e., it failed to converge within a tolerable number of iterations. Among all the alternatives we tried, the best numerical results, in terms of convergence, were obtained by the algorithm based on the Alternating Direction Method of Multipliers (ADMoM). ADMoM has superior convergence properties over its precursor, the dual ascent method; see the textbooks [22]–[24] and references therein for a complete overview of the method.
To apply ADMoM to OPT2, we first turn the average power inequality constraints given by Eq. (19) into equalities, by introducing auxiliary variables {y_s}_{s∈S}, where y_s is associated with the respective s-th constraint and is nonnegative. Hence, OPT2 can be equivalently written as OPT3, given by
$$\text{minimize } -\sum_{s\in S} f_s(\bar{r}_s) \tag{54}$$
$$\text{u.c. Eqs. (18), (20), (21)},$$
$$x_{1,s}+x_{2,s}+y_s=\hat{P}_s,\quad s\in S, \tag{55}$$
$$y_s\ge 0,\quad s\in S, \tag{56}$$

where
$$x_{1,s}\triangleq \sum_{i\in \mathcal{I}_s} P_s(i)\, q(e,s,i) \tag{57}$$
and
$$x_{2,s}\triangleq \sum_{i\in \mathcal{I}_s} P_s(i)\, q(b,s,i). \tag{58}$$

Since the variables {q(b,s,i)}_{i∈I_s^0} are not involved in the objective, we can assume without loss of generality that a zero function of these variables is added to the objective. OPT3 consists of |S| + 2 equality constraints. Let ν denote the dual variable associated with the constraint of Eq. (18), µ_s the dual variable associated with the s-th constraint of Eq. (55), and ξ the dual variable associated with the constraint of Eq. (20). ADMoM iteratively minimizes a properly defined augmented Lagrange function, parametrized by the penalty parameter ρ > 0, in a similar way as the traditional dual ascent method minimizes the standard Lagrange function [22], [23]. For our specific problem, the augmented Lagrange function

with penalty parameter ρ is defined by Eq. (59):
$$\begin{aligned} L_\rho\Big(\big\{\{q(e,s,i),q(b,s,i)\}_{i\in\mathcal{I}_s^0},y_s\big\}_{s\in S},\nu,\{\mu_s\}_{s\in S},\xi\Big)=&-\sum_{s\in S} f_s\Big(\sum_{i\in \mathcal{I}_s} r_s(i)\, q(e,s,i)\Big)+\nu\Big(\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, q(b,s,i)-\lambda_p\Big)\\ &+\sum_{s\in S}\mu_s\big(x_{1,s}+x_{2,s}+y_s-\hat{P}_s\big)+\xi\Big(\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} q(e,s,i)+\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} q(b,s,i)-1\Big)\\ &+\frac{\rho}{2}\Big(\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, q(b,s,i)-\lambda_p\Big)^2+\frac{\rho}{2}\sum_{s\in S}\big(x_{1,s}+x_{2,s}+y_s-\hat{P}_s\big)^2\\ &+\frac{\rho}{2}\Big(\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} q(e,s,i)+\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} q(b,s,i)-1\Big)^2 \end{aligned} \tag{59}$$
The augmented Lagrangian is the standard Lagrange function plus the penalty term, which is the squared Euclidean norm of the vector formed from the residuals of the equality constraints, multiplied by ρ/2. The penalty parameter ρ is the step size used for the dual variable updates and plays a key role in the convergence of the method, since it is used for the satisfaction of the necessary and sufficient optimality conditions (primal and dual feasibility) of the original problem to which ADMoM is applied [22]–[24]. Since {q(e,s,i)}_{i∈I_s^0}, {q(b,s,i)}_{i∈I_s^0}, y_s, µ_s are local variables for each user s, while {ν, ξ} are global dual variables, it can be easily seen that the standard Lagrangian is already decomposable across the |S| SUs, while the quadratic term is not. This means that every user will need to know the evaluated primal variables of the rest of the users in order to update its local variables. This can be accomplished via message broadcasts by each SU whenever necessary.
Next, the optimization tasks and updates that need to be carried out at each user s are given, in the particular variable update order required by ADMoM [22]–[24]. Denoting by k the iteration index, the first optimization step is given by Eq. (60) below. This optimization problem, for each user s ∈ S with respect to its local primal variables {q(e,s,i)}_{i∈I_s^0}, can be solved via modern interior point methods or standard methods such as Newton’s method [21]. Notice that in this step all variables other than the optimization variables are evaluated in the previous or in the current iteration, and are constants assumed known to user s. This implies that user s has received the messages {Σ_{i∈I_s^0} q(b,s,i)^k}_{s∈S} of all users evaluated in the previous iteration k. Furthermore, user s needs the already updated (in iteration k+1) messages Σ_{i∈I_m^0} q(e,m,i)^{k+1} from users m ∈ {1, ..., s−1}, as well as the messages Σ_{i∈I_m^0} q(e,m,i)^k evaluated at the previous iteration k from users m ∈ {s+1, ..., |S|} that have not updated their variables yet. To simplify this we can assume that there

is a prespecified order set for all users, according to which they broadcast their updated messages sequentially throughout the network via the control channel.
The next optimization step in order is given by Eq. (61) below, where now each user s already has knowledge of the updated variables {Σ_{i∈I_s^0} q(e,s,i)^{k+1}}_{s∈S} in the current iteration, due to the message broadcasts assumed in the previous step. Similarly to the previous step, user s needs to know Σ_{i∈I_m^0} r_p(m,i) q(b,m,i)^{k+1} and Σ_{i∈I_m^0} q(b,m,i)^{k+1} from the users m ∈ {1, ..., s−1} that have already updated their variables, as well as Σ_{i∈I_m^0} r_p(m,i) q(b,m,i)^k and Σ_{i∈I_m^0} q(b,m,i)^k from the rest, m ∈ {s+1, ..., |S|}, that have not updated their variables yet. We assume again that each user updates its variables and broadcasts the values of these two functions of its variables in one message, according to the prespecified order.
Next, each user s updates its primal auxiliary variable y_s according to
$$y_s^{k+1}=\arg\min_{y_s\ge 0}\ \mu_s^k y_s+\frac{\rho}{2}\left(x_{1,s}^{k+1}+x_{2,s}^{k+1}+y_s-\hat{P}_s\right)^2, \tag{62}$$
which results in the closed-form update rule
$$y_s^{k+1}=\left[\frac{\rho \hat{P}_s-\mu_s^k-\rho\left(x_{1,s}^{k+1}+x_{2,s}^{k+1}\right)}{\rho}\right]^+, \tag{63}$$
where [·]^+ = max(·, 0),

and is implemented by all users in parallel, since it involves only local variables. The update rules for the dual variables, using the updated primal variables, are given by Eqs. (64)-(66) below. As noted earlier, it is crucial that the step size for the updates of the dual variables be strictly equal to the penalty parameter ρ > 0, in order for the convergence properties of ADMoM to hold. Also, the updates of the global dual variables in (64) and (65) can be computed at each user in parallel and independently, without the necessity of a central controller, since all the required information is already known to all users. Furthermore, the last update involves only local variables of user s.

$$\begin{aligned} q(e,s,i)^{k+1}=\arg\min_{\{q(e,s,i)\}_{i\in\mathcal{I}_s^0}\ge 0}\ &-f_s\Big(\sum_{i\in \mathcal{I}_s^0} r_s(i)\, q(e,s,i)\Big)+\xi^k\sum_{i\in \mathcal{I}_s^0} q(e,s,i)+\mu_s^k\sum_{i\in \mathcal{I}_s} P_s(i)\, q(e,s,i)\\ &+\frac{\rho}{2}\Big(\sum_{i\in \mathcal{I}_s} P_s(i)\, q(e,s,i)+x_{2,s}^k+y_s^k-\hat{P}_s\Big)^2\\ &+\frac{\rho}{2}\Big(\sum_{i\in \mathcal{I}_s^0} q(e,s,i)+\sum_{m=1}^{s-1}\sum_{i\in \mathcal{I}_m^0} q(e,m,i)^{k+1}+\sum_{m=s+1}^{|S|}\sum_{i\in \mathcal{I}_m^0} q(e,m,i)^{k}+\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} q(b,s,i)^{k}-1\Big)^2 \end{aligned} \tag{60}$$

$$\begin{aligned} q(b,s,i)^{k+1}=\arg\min_{\{q(b,s,i)\}_{i\in\mathcal{I}_s^0}\ge 0}\ &\xi^k\sum_{i\in \mathcal{I}_s^0} q(b,s,i)+\nu^k\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, q(b,s,i)+\mu_s^k\sum_{i\in \mathcal{I}_s} P_s(i)\, q(b,s,i)\\ &+\frac{\rho}{2}\Big(x_{1,s}^{k+1}+\sum_{i\in \mathcal{I}_s} P_s(i)\, q(b,s,i)+y_s^k-\hat{P}_s\Big)^2\\ &+\frac{\rho}{2}\Big(\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} q(e,s,i)^{k+1}+\sum_{i\in \mathcal{I}_s^0} q(b,s,i)+\sum_{m=1}^{s-1}\sum_{i\in \mathcal{I}_m^0} q(b,m,i)^{k+1}+\sum_{m=s+1}^{|S|}\sum_{i\in \mathcal{I}_m^0} q(b,m,i)^{k}-1\Big)^2\\ &+\frac{\rho}{2}\Big(\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, q(b,s,i)+\sum_{m=1}^{s-1}\sum_{i\in \mathcal{I}_m^0} r_p(m,i)\, q(b,m,i)^{k+1}+\sum_{m=s+1}^{|S|}\sum_{i\in \mathcal{I}_m^0} r_p(m,i)\, q(b,m,i)^{k}-\lambda_p\Big)^2 \end{aligned} \tag{61}$$

$$\xi^{k+1}=\xi^k+\rho\left(\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} q(e,s,i)^{k+1}+\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} q(b,s,i)^{k+1}-1\right) \tag{64}$$
$$\nu^{k+1}=\nu^k+\rho\left(\sum_{s\in S}\sum_{i\in \mathcal{I}_s^0} r_p(s,i)\, q(b,s,i)^{k+1}-\lambda_p\right) \tag{65}$$
$$\mu_s^{k+1}=\mu_s^k+\rho\left(x_{1,s}^{k+1}+x_{2,s}^{k+1}+y_s^{k+1}-\hat{P}_s\right),\quad s\in S. \tag{66}$$
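For concreteness, a minimal numpy sketch of the inexpensive part of each ADMoM iteration — the slack update of Eq. (63) and the dual updates of Eqs. (64)-(66) — is given below; the quadratic subproblems (60)-(61) are assumed to be solved by a local solver and are not shown, and all names are illustrative rather than part of the paper.

```python
import numpy as np

def local_slack_and_dual_updates(x1, x2, mu, nu, xi, rho, P_hat,
                                 prob_sum_residual, rate_residual):
    """One round of the simple ADMoM updates, after the quadratic subproblems
    (60)-(61) have produced the new x1 = sum_i P_s(i) q(e,s,i) and
    x2 = sum_i P_s(i) q(b,s,i) for every SU s (arrays indexed by s).

    prob_sum_residual : sum of all q(e,.,.) + q(b,.,.) minus 1           (Eq. (64))
    rate_residual     : sum of r_p(s,i) q(b,s,i) over all s,i minus lambda_p  (Eq. (65))
    Both residuals are computable by every SU from the broadcast messages.
    """
    # Eq. (63): closed-form, projected update of the local slack variable y_s.
    y_new = np.maximum((rho * P_hat - mu - rho * (x1 + x2)) / rho, 0.0)
    # Eqs. (64)-(66): dual ascent steps with step size exactly rho.
    xi_new = xi + rho * prob_sum_residual
    nu_new = nu + rho * rate_residual
    mu_new = mu + rho * (x1 + x2 + y_new - P_hat)
    return y_new, mu_new, nu_new, xi_new
```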

As discussed earlier in the main points of the distributed approach, the computational burden is distributed across the SU nodes. Thus, instead of solving the problem OPT2 involving 2|S|I variables at a single node, each SU s ∈ S is assigned to solve the quadratic optimization problems in Eqs. (60) and (61), of the order of I_s variables, plus the update steps in Eqs. (63)-(66), which are simple gradient steps involving a single variable. The resulting distributed algorithm is summarized in Algorithm 1.
Since convergence of the algorithm needs to be determined in a decentralized manner, each SU should keep track of a local metric and determine local convergence with respect to it, within a prespecified accuracy. Then, as soon as all SUs reach convergence, the algorithm terminates. For the termination of the algorithm, each SU may keep a binary variable, taking the value 1 once local convergence is reached and 0 otherwise. At the end of each iteration, each SU broadcasts this flag to all other users in the network and keeps iterating for as long as it receives zero flags from the rest of the SUs. Regarding the local metric used by each SU s ∈ S for determining convergence, this can be the successive differences of the SU’s local objective function under optimization, i.e., f_s(r̄_s). Once the local metric that each user keeps track of drops below a given accuracy, local convergence is declared. The initialization of the algorithm can be arbitrary, as long as the variables are positive valued. Also, we assume that the PU broadcasts its arrival rate λ_p at the beginning of the algorithm. The communication overhead of the proposed distributed protocol is 2|S| message broadcasts per iteration.

Algorithm 1 Distributed solution to problem OPT2
• PU: Broadcast the packet arrival rate λ_p.
• Initialization: For iteration k = 0, set ρ > 0, {q(e,s,i)^0}_{i∈I_s^0} > 0 ∀s, {q(b,s,i)^0}_{i∈I_s^0} > 0 ∀s, {µ_s^0}_{s∈S} > 0, ξ^0 > 0, ν^0 > 0.
• ∀s ∈ S: Broadcast the initial values Σ_{i∈I_s^0} q(e,s,i)^0, Σ_{i∈I_s^0} q(b,s,i)^0, Σ_{i∈I_s^0} r_p(s,i) q(b,s,i)^0.
• Repeat:
  1) ∀s ∈ S: update the primal variables {q(e,s,i)^{k+1}}_{i∈I_s^0} by solving (60) and broadcast the message Σ_{i∈I_s^0} q(e,s,i)^{k+1}, according to a prespecified order.
  2) ∀s ∈ S: update the primal variables {q(b,s,i)^{k+1}}_{i∈I_s^0} by solving (61) and broadcast Σ_{i∈I_s^0} q(b,s,i)^{k+1} and Σ_{i∈I_s^0} r_p(s,i) q(b,s,i)^{k+1} in one message, according to a prespecified order.
  3) ∀s ∈ S: update the primal variable y_s^{k+1} according to (63).
  4) ∀s ∈ S: update the dual variable ξ^{k+1} according to (64).
  5) ∀s ∈ S: update the dual variable ν^{k+1} according to (65).
  6) ∀s ∈ S: update the dual variable µ_s^{k+1} according to (66).
  Set k = k + 1 and go to step 1.
• Until convergence within the desired tolerance ∀s ∈ S. Set {q(e,s,i)^opt}_{i∈I_s^0} = {q(e,s,i)^k}_{i∈I_s^0}, ∀s ∈ S, and {q(b,s,i)^opt}_{i∈I_s^0} = {q(b,s,i)^k}_{i∈I_s^0}, ∀s ∈ S.

At this point it should be clear why our claim holds that, once the protocol terminates for a given arrival rate, the network evolves without the need of a scheduler or any further communication among the SU nodes. Note that once convergence of the algorithm is reached, all SUs have knowledge of the sums of probabilities {Σ_{i∈I_s^0} q(e,s,i)^opt, Σ_{i∈I_s^0} q(b,s,i)^opt}, ∀s ∈ S. Thus, as long as all the parameters remain unchanged and the SUs observe the state of the PU channel, they can all independently produce the same result about which SU is scheduled in every time slot, and whether it cooperates or not, provided that they are supplied with the same randomization algorithm and a common seed. Then, the scheduled SU determines the power level for its activity based on its own probability parameters. The algorithm is run again only when some parameters of the operational environment change significantly. Thus, when the arrival rate changes by more than a prespecified percentage of its previous value, the PU informs the SUs via the control channel about the new value of λ_p. Also, in case the channel characteristics of some SU change by more than a certain percentage, the corresponding SU may announce a rerun of the algorithm. In such cases, however, the algorithm can adapt to the changes in the operational environment, in the sense that the problem is not solved from scratch: the algorithm is initialized at the optimal point of the previous system state, which speeds up its convergence and reduces the overall communication overhead, as will be shown in the simulation results that follow.
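One simple way to realize this common-seed scheduling is sketched below; the slot-indexed seeding scheme is an illustrative assumption, since the paper only requires that all SUs share the same randomization algorithm and seed.

```python
import numpy as np

def slot_schedule(slot, seed, channel_busy, agg_busy, agg_empty):
    """Pick the SU scheduled in this slot from the aggregate per-SU probabilities
    sum_i q(b,s,i) (busy) or sum_i q(e,s,i) (idle), using a slot-indexed RNG.

    Every SU runs this same function with the same seed and the same broadcast
    aggregates, so all SUs agree on the schedule with no extra control messages.
    """
    rng = np.random.default_rng([seed, slot])     # identical stream at every SU
    p = np.asarray(agg_busy if channel_busy else agg_empty, dtype=float)
    # Normalizing by the total yields the conditional distribution over SUs
    # given the sensed channel state.
    return int(rng.choice(len(p), p=p / p.sum()))

# The scheduled SU then draws its own power level locally from its private
# conditional distribution q(s, . | b) or q(s, . | e).
```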

IV. SIMULATION AND NUMERICAL RESULTS

In this section we evaluate the performance of the presented class of policies and the proposed distributed algorithm,

through simulation and numerical results. For deriving these results, we consider as the objective optimization function f(·) the weighted sum of the transmission rates of the SUs, i.e.,
$$f\left(\{\bar{r}_s\}_{s\in S}\right)=\sum_{s\in S}\bar{r}_s. \tag{67}$$
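The following sketch shows the kind of slot-level Monte Carlo harness that such an evaluation can use for a given policy in C_0; the Bernoulli arrival model and all function names are our own assumptions, not a description of the paper's exact simulator.

```python
import numpy as np

def simulate_c0(q_busy, q_empty, r_p, r_s, lam_p, slots=200_000, seed=1):
    """Slot-level Monte Carlo of one PU queue served under a C0 policy.

    q_busy[s, i] = q(s, i | b),  q_empty[s, i] = q(s, i | e).
    Returns the time-average SU sum rate and the average PU backlog.
    (Illustrative harness under assumed Bernoulli arrivals.)
    """
    rng = np.random.default_rng(seed)
    S, I = r_p.shape
    Qp, backlog_sum, su_success = 0, 0, 0
    for _ in range(slots):
        Qp += rng.random() < lam_p                 # Bernoulli arrivals with mean lam_p (assumption)
        table = q_busy if Qp > 0 else q_empty
        k = rng.choice(S * I, p=table.ravel())
        s, i = divmod(k, I)
        if Qp > 0:                                 # SU s cooperates on the PU packet
            if rng.random() < r_p[s, i]:
                Qp -= 1
        else:                                      # SU s transmits its own packet
            su_success += rng.random() < r_s[s, i]
        backlog_sum += Qp
    return su_success / slots, backlog_sum / slots
```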

We initially investigate the performance of an optimal policy in C_0 in comparison with an optimal dynamic policy from the C_2 class, constructed through the well-known Lyapunov optimization techniques [9], [10] and described in the Appendix, in terms of SU throughput and average PU queue size. This is achieved by providing some indicative simulation results from experiments in various setups. We also present the performance of the transmission algorithm presented in [15]. Next, for the same setups, we examine the convergence and the adaptivity capability of the distributed algorithm through numerical results.
Initially, we consider a setup where there are |S| = 2 SUs with two levels of transmission power available. More specifically, s ∈ {1, 2}, while the set of available power levels, common for the two users, is I_s^0 = {0, 1}, where P(0) = 0 and P(1) = 1 = P_max. We also assume r_p(0) = 0.4, r_p(s,1) = 0.8, r_s(1) = 1 and, finally, P̂_s = 0.5 for s ∈ S. For such a scenario, the performance of the system is depicted in Figs. 1-2 in terms of f({r̄_s}_{s∈S}) and the average backlog of the PU queue, respectively. It can be seen in Fig. 1 that, as expected, the sum rate achieved by SUs that employ an optimal policy from the restricted class C_0 is identical to the sum rate achieved under the optimal policy in C_2. It should be noted that, as seen in Fig. 2, under the optimal policy in C_0 the average backlog of the PU queue remains very low. On the contrary, the dynamic policy from C_2 induces a large PU queue backlog even at small arrival rates. Furthermore, when compared with the transmission algorithm presented in [15], the C_0 class of policies extends the range of λ_p that can be supported by the system, providing mutual benefits to both the PU and the SUs from their cooperation. In particular, transmission rates above the service rate without cooperation are supported for the PU, while transmission opportunities for their own data are given to the SUs. It should be noted that the policy in [15] was shown to be optimal for λ_p < 0.4, and this is confirmed in Fig. 1, where it is seen that all three policies achieve the same sum rate for λ_p < 0.4. However, as seen in this experiment, the policy in [15] renders the PU queue unstable for λ_p > 0.4 and reduces the SU sum rate to zero. The reason is the following. In [15], decisions are taken at the end of busy periods of the PU queue. When λ_p > 0.4, whenever a decision of not cooperating is taken, there is a nonzero probability that the primary queue never becomes empty and hence there is no possibility for the SUs to take corrective actions.

[Fig. 1. The sum of the secondary users' rates for the first scenario (curves: C_0 class of policies, algorithm presented in [15], C_2 class of policies; x-axis: PU arrival rate λ_p).]
[Fig. 2. The average backlog of the primary user's queue for the first scenario (same three policies; x-axis: PU arrival rate λ_p).]

Figs. 3-4 depict the second simulation scenario, which consists of |S| = 5 SUs with five available power levels for each one. In particular, I_s^0 = {0, 1, 2, 3, 4}, where P(0) = 0, P(1) = 0.25, P(2) = 0.5, P(3) = 0.75, P(4) = 1 = P_max, ∀s ∈ S. Furthermore, we assume in this case that r_p(0) = 0.4, r_p(s,1) = 0.5, r_p(s,2) = 0.6, r_p(s,3) = 0.7, r_p(s,4) = 0.8, ∀s ∈ S. On the other hand, r_s(1) = 0.3, r_s(2) = 0.5, r_s(3) = 0.8, r_s(4) = 1, ∀s ∈ S. Finally, the average power constraint is P̂_s = 0.15, ∀s. From these figures we make the same observations as with the previous simulation scenario that involved two users and two power levels.

[Fig. 3. The sum of the secondary users' rates for the second scenario (curves: C_0 class of policies, algorithm presented in [15], C_2 class of policies; x-axis: PU arrival rate λ_p).]
[Fig. 4. The average backlog of the primary user's queue for the second scenario (same three policies; x-axis: PU arrival rate λ_p).]

The convergence of the proposed distributed algorithm is examined in Table I, where both system setups are considered. Specifically, this table depicts the number of iterations and the broadcast messages required for convergence, when the arrival rate λ_p is varied inside the stability region and the proposed algorithm begins from scratch (arbitrary initialization). Regarding the algorithm's parameters, we set the desired accuracy for convergence equal to ε = 10^−5, while the penalty parameter is taken equal to ρ = 0.1. For the initialization of the algorithm, we used {q(e,s,i)^0}_{i∈I_s^0} = 0.01, ∀s ∈ S, {q(b,s,i)^0}_{i∈I_s^0} = 0.03, ∀s ∈ S, {µ_s^0}_{s∈S} = 1, ξ^0 = 1, ν^0 = 1. As the local metric for convergence of each node s, the function f_s(r̄_s) was used. The distributed algorithm was tested against the centralized solution to the problem of Eq. (53), in terms of the value of the objective, i.e., the weighted sum rates of the SUs, in both setups and for all cases of the arrival rate λ_p considered. The numerical results obtained from both algorithms were identical, showing that our proposed algorithm keeps up with its centralized counterpart. We skip these results for brevity, but provide some detailed results concerning the convergence of the algorithm in Table I. It can easily be seen from the results there that the algorithm converges within a tolerable number of iterations at lower PU transmission rates, and even faster at higher ones.
Finally, the adaptivity of the distributed algorithm to small or even larger changes in the arrival rate λ_p is investigated

in Table II. In particular, we begin with an initial rate equal to λ_p^0 = 0.5, i.e., well within the stability region, and run the algorithm from scratch, i.e., with the arbitrary initialization described above. For all values of λ_p different from λ_p^0, we use as warm initialization for the algorithm the optimal point found at λ_p^0, and record the number of iterations required for convergence within the given accuracy. Clearly, for both setups considered, there is a remarkable reduction in the total number of iterations required for convergence compared with the arbitrary initialization. This justifies our claim that the proposed algorithm can efficiently adapt not only to slight but even to larger changes in the arrival rate.

V. CONCLUSIONS

In this work we put forward novel primary-secondary user cooperation policies for cognitive radio networks that orchestrate a primary user and co-existing secondary users in a wireless channel. The objective is to maximize a function of the average SU transmission rates, while maintaining stability of the PU queue. The policies operate as follows: when the primary queue is non-empty (the channel is busy), a secondary user is selected for cooperation, together with a transmit power, with a certain probability. When the primary queue is empty (the channel is idle), a secondary user and a transmit power are selected for transmission, again with a certain probability, different from the one above. The major finding and contribution to the state of the art is that, although requiring only the sensing of the state of the PU channel (busy or empty) for their implementation, these policies: a) achieve a substantial augmentation of the stability region of the PU queue, and b) can obtain any long-term SU rates achievable by policies for which the restriction of always giving priority to PU traffic is removed. An important feature of the algorithm is that the optimal transmission probabilities can be computed offline and communicated to the users. A centralized and a distributed version of the problem are presented, both of which are applicable depending on the setup.
The key idea is that secondary users increase the service rate of the primary queue and therefore they increase the range of arrival rates of the primary user for which its queue is stable. At the same time, the primary queue empties more often, the channel becomes idle, and the secondary users obtain more transmission opportunities. This in turn improves their own transmission opportunities and therefore their throughput region.
There exist several directions for future study. First, in this work we assumed that channel sensing is error-free; imperfect sensing introduces several new features and alters the structure of the problem. Second, the issue of designing a dynamic, online version of this algorithm is open. Finally, the coordination of multiple primary and secondary users gives rise to game-theoretic models that warrant further investigation.

APPENDIX

In this appendix we present an optimal dynamic algorithm from the C_2 class of policies, which is based on the Min Drift-Plus-

Penalty algorithm [9], [10]. After defining the virtual queues X_s(t) for all s ∈ S and i ∈ I_s^0 as
$$X_s(t+1)=\max\left[X_s(t)-\hat{P}_s,\,0\right]+P_s(i), \tag{68}$$
and the Lyapunov function of the system's backlog Θ(t) = {Q_p(t), X_s(t)} as
$$L(\Theta(t))=\frac{1}{2}Q_p^2(t)+\frac{1}{2}\sum_{s\in S} X_s^2(t), \tag{69}$$

and minimizing the drift-plus-penalty upper bound in every time slot t and for all s ∈ S, the following algorithm is derived, where V is a design parameter that provides a tradeoff between the accuracy of the utility optimization and the PU queue size:
1) Calculate the maximum of the following expression based on the current size of the primary user's queue and the virtual queues,
$$c_p=\max_{s\in S,\, i\in \mathcal{I}_s^0}\ Q_p(t)\, r_p(s,i)-X_s(t)\, P_s(i) \tag{70}$$
and
$$(s_1^*,i_1^*)=\arg\max_{s\in S,\, i\in \mathcal{I}_s^0}\ c_p.$$
2) Calculate for each s ∈ S the following expressions
$$c_s=\max_{i\in \mathcal{I}_s^0}\ V\, r_s(i)-X_s(t)\, P_s(i) \tag{71}$$
and
$$i_s=\arg\max_{i\in \mathcal{I}_s^0}\ c_s.$$
Furthermore, determine
$$s_0^*=\arg\max_{s\in S}\ c_s.$$

3) Based on the calculated values of c_p and c_s, determine the control decision γ̃ defined by Eq. (23) as
$$\tilde{\gamma}=\begin{cases}(1,s_1^*,i_1^*), & \text{if } c_p\ge \max_{s\in S} c_s\\ (0,s_0^*,i_{s_0^*}), & \text{otherwise.}\end{cases} \tag{72}$$
4) Update the PU queue and the virtual queues accordingly.

TABLE I
NUMBER OF ITERATIONS AND TOTAL MESSAGE BROADCASTS FOR THE DISTRIBUTED PROTOCOL.

λp                   0.2   0.3   0.4   0.5   0.6   0.7   0.8
2 SUs / iterations   164   119   103    92    79    76    55
5 SUs / iterations   263   172   129   119   105    74     -

TABLE II
NUMBER OF ITERATIONS AND TOTAL BROADCASTS FOR THE DISTRIBUTED PROTOCOL, λp^0 = 0.5.

λp                   0.35  0.4   0.45  0.52  0.55  0.58  0.6   0.65  0.7
2 SUs / iterations   32    37    32    33    37    39    38    43    43
5 SUs / iterations   44    34    39    29    39    40    45    31    16

REFERENCES
[1] "Report of the spectrum efficiency working group," FCC Spectrum Policy Task Force, Tech. Rep. 02-135, 2002.
[2] J. Mitola, "Cognitive radio: An integrated agent architecture for software defined radio," Ph.D. dissertation, KTH, Stockholm, Sweden, 2000.
[3] S. Haykin, "Cognitive radio: Brain-empowered wireless communications," IEEE J. Sel. Areas Commun., vol. 23, no. 2, pp. 201-220, Feb. 2005.
[4] I. F. Akyildiz, W.-Y. Lee, M. C. Vuran, and S. Mohanty, "Next generation/dynamic spectrum access/cognitive radio wireless networks: A survey," Comput. Netw., vol. 50, no. 13, pp. 2127-2159, Sept. 2006.
[5] Q. Zhao and B. Sadler, "A survey of dynamic spectrum access," IEEE Signal Processing Magazine, vol. 24, no. 3, pp. 79-89, May 2007.
[6] C. Peng, H. Zheng, and B. Y. Zhao, "Utilization and fairness in spectrum assignment for opportunistic spectrum access," ACM/Springer MONET, vol. 11, no. 4, Aug. 2006.
[7] Y. Chen, Q. Zhao, and A. Swami, "Joint design and separation principle for opportunistic spectrum access in the presence of sensing errors," IEEE Trans. Inf. Theory, vol. 54, pp. 2053-2071, Jan. 2008.
[8] R. Urgaonkar and M. J. Neely, "Opportunistic scheduling with reliability guarantees in cognitive radio networks," IEEE Trans. Mobile Comput., vol. 8, pp. 766-777, Jan. 2009.
[9] L. Georgiadis, M. J. Neely, and L. Tassiulas, Resource Allocation and Cross-Layer Control in Wireless Networks. Foundations and Trends in Networking, vol. 1, no. 1, 2006.
[10] M. J. Neely, Stochastic Network Optimization with Application to Communication & Queueing Systems. Morgan & Claypool, Aug. 2010.
[11] A. Goldsmith, S. A. Jafar, I. Maric, and S. Srinivasa, "Breaking spectrum gridlock with cognitive radios: An information theoretic perspective," Proc. IEEE, vol. 97, pp. 894-914, Jan. 2009.
[12] O. Simeone, I. Stanojev, S. Savazzi, Y. Bar-Ness, U. Spagnolini, and R. Pickholtz, "Spectrum leasing to cooperating secondary ad hoc networks," IEEE J. Sel. Areas Commun., vol. 26, pp. 203-213, Jan. 2008.
[13] O. Simeone, Y. Bar-Ness, and U. Spagnolini, "Stable throughput of cognitive radios with and without relaying capability," IEEE Trans. Commun., vol. 55, pp. 2351-2360, Jan. 2007.
[14] I. Krikidis, J. Laneman, J. Thompson, and S. McLaughlin, "Protocol design and throughput analysis for multi-user cognitive cooperative systems," IEEE Trans. Wireless Commun., vol. 8, pp. 4740-4751, Jan. 2009.
[15] R. Urgaonkar and M. Neely, "Opportunistic cooperation in cognitive femtocell networks," IEEE J. Sel. Areas Commun., vol. 30, no. 3, pp. 607-616, Apr. 2012.
[16] E. Altman, Constrained Markov Decision Processes. Chapman & Hall/CRC, 1999.
[17] J. Mo and J. Walrand, "Fair end-to-end window-based congestion control," IEEE/ACM Trans. Netw., vol. 8, no. 5, pp. 556-567, 2000.
[18] B. F. Lo, "A survey of common control channel design in cognitive radio networks," Phys. Commun., vol. 4, no. 1, pp. 26-39, 2011.
[19] I. Papadimitriou and L. Georgiadis, "Minimum energy broadcasting in multihop wireless networks using a single broadcast tree," Mob. Networks and Appl., vol. 11, pp. 361-375, 2006.
[20] J. E. Wieselthier, G. D. Nguyen, and A. Ephremides, "Energy-efficient broadcast and multicast trees in wireless networks," Mob. Networks and Appl., vol. 7, pp. 481-492, 2002.
[21] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, UK: Cambridge Univ. Press, 2004.
[22] D. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods, 2nd ed. Belmont, MA: Athena Scientific, 1996.
[23] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, "Distributed optimization and statistical learning via the alternating direction method of multipliers," Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1-122, 2011.
[24] D. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, 2nd ed. Belmont, MA: Athena Scientific, 1999.
