Scheduling and Resource Allocation in Downlink Multiuser MIMO

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 64, NO. 5, MAY 2016


Scheduling and Resource Allocation in Downlink Multiuser MIMO-OFDMA Systems

Guillem Femenias, Senior Member, IEEE, and Felip Riera-Palou, Senior Member, IEEE

Abstract—Multiuser MIMO (MU-MIMO) in general, and block diagonalization (BD) in particular, are playing a prominent role toward the achievement of higher spectral efficiencies in modern OFDMA-based wireless networks. The utilization of such techniques necessarily has implications in the scheduling and resource allocation processes taking care of assigning subcarriers, power, and transmission modes to the different users. In this paper, a framework for channel- and queue-aware scheduling and resource allocation for BD-based MU-MIMO-OFDMA wireless networks is introduced. In particular, using an SNR-based abstraction of the physical layer, the proposed design is able to cater for different BD-MU-MIMO processing schemes [co-ordinated Tx-Rx (CTR) or receive antenna selection (RAS)], uniform or adaptive power allocation (UPA/APA), continuous or discrete rate allocation (CRA/DRA), and many different scheduling rules. Additionally, the different strategies are complemented by a new greedy user/stream selection algorithm that is shown to perform very close to the optimal user/stream selection policy at a much lower complexity. Results using system parameters typically found in 4G networks reveal that, in most cases, low-complexity solutions (RAS-, UPA-based) achieve a performance close to the one attained by their more complex counterparts (CTR-, APA-based).

Index Terms—Multiuser MIMO, OFDMA, block diagonalization, scheduling, resource allocation.

I. INTRODUCTION

IN THE downlink of a multiple-input multiple-output orthogonal frequency division multiple access (MIMO-OFDMA) system, the scheduling and resource allocation (SRA) unit at the base station (BS) obtains channel state information (CSI) from the physical (PHY) layer of all the mobile stations (MSs) in the system and collects queue state information (QSI) by observing the backlogged data at the data link control (DLC) layer to be transmitted to these MSs. Based on this information, the SRA unit can then make SRA decisions allowing a good trade-off between frequency, space, and multiuser diversity exploitation, provision of fairness, and delivery of quality of service (QoS) to the wide range of applications supported by emerging wireless networks [1].

Manuscript received October 2, 2015; revised January 20, 2016 and March 17, 2016; accepted March 21, 2016. Date of publication March 28, 2016; date of current version May 13, 2016. This work was supported by the Ministerio de Economia y Competitividad (MINECO), Spain, under projects AM3DIO (TEC2011-25446) and ELISA (TEC2014-59255-C3-2-R). The associate editor coordinating the review of this paper and approving it for publication was M. Tao. The authors are with the Mobile Communications Group, University of the Balearic Islands (UIB), Mallorca 07122, Spain (e-mail: guillem.femenias@uib.cat; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCOMM.2016.2547424

When the CSI is known at the transmitter (CSIT), channel sum capacity can be achieved using non-linear multiuser MIMO (MU-MIMO) precoding techniques based on dirty paper coding (DPC) [2] combined with an implicit user scheduling and power loading algorithm. Such non-linear techniques, however, are difficult to implement in practical systems due to the high computational complexity of successive encodings and decodings, especially when the number of users in the system is large. Based on implementation considerations for current wireless standards, suboptimal but less complex MU-MIMO schemes based on linear precoding have been considered as an alternative. A comprehensive discussion of classic MU-MIMO techniques showing the key advantages they offer over single-user MIMO (SU-MIMO) communications can be found, for instance, in [3], [4]. Among many different linear MU-MIMO approaches, zero-forcing beamforming (ZFBF) [5] and block diagonalization (BD) [6] are particularly appealing as they have been shown to asymptotically approach the sum-rate capacity of DPC while relying on low-complexity linear processing. A thorough comparison between BD-based techniques and ZFBF presented by Chen et al. in [7] reveals the absence of major performance differences among these schemes. Nonetheless, the BD approach, which can indeed be regarded as a generalization of the ZFBF scheme, will be the MU-MIMO technique considered henceforth, notwithstanding that ZFBF or other linear MU-MIMO approaches could also be incorporated into the proposed framework. Linear MU-MIMO processing can achieve the channel sum capacity when the number of active users in the system is large [4], [5], [8].
Furthermore, as the number of spatial data streams that can be simultaneously transmitted in MU-MIMO systems is upper bounded by the number of transmit antennas at the BS, the sum throughput can be optimized by scheduling the optimum subset of users and the optimum number of spatial data streams to each selected user [5], [7], [9]–[11]. As the number of active users in the system and the number of receive antennas per user increase, however, the joint optimal selection of a subset of users and the corresponding number of allocated spatial streams over a given resource using an exhaustive search over all possible combinations becomes computationally prohibitive, and thus low-complexity suboptimal user/mode selection algorithms must be devised. Different scheduling and user/mode selection algorithms have been proposed in the literature leveraging features of different MU-MIMO strategies [5], [7], [9]–[11]. Two algorithms that are particularly suited to BD-based MU-MIMO schemes are the capacity-based greedy user/multimode selection algorithm [6] and the capacity-based greedy user/multiantenna selection algorithm [11]. The former

0090-6778 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.


is based on the dominant left singular vectors of the channel of each user in the selected group, and the latter relies on the selection of an optimal subset of receive antennas for each selected user in the group.

Scheduling and resource allocation can be regarded as a multi-objective optimization problem taking into account not only the channel capacity but also, among many others, the transmitted power, the QoS constraints, the priority levels of different traffic classes, and the amount of backlogged data in the queues. In general, there is not a single optimal solution to a multi-objective optimization problem; however, using tools from information theory, queueing theory, convex optimization, and stochastic approximation, a rich literature in the context of MIMO-OFDMA reports optimal and suboptimal SRA algorithms. These techniques take into consideration the PHY layer channel conditions jointly with the DLC layer bursty packet arrivals and queueing behavior [12]–[21]. Despite the large body of knowledge regarding SRA in general, and in the context of MIMO-OFDMA architectures in particular, contributions contemplating the effects the MU-MIMO component has on the SRA problem are scarce, with notable exceptions being [22]–[24], addressing the margin adaptive (power minimization subject to users' rate constraints) problem, which is out of the scope of our work, and [25]–[30], targeting the rate adaptive (sum-rate maximization subject to a power constraint) problem addressed in this work. Ho and Liang in [25] propose a solution to minimize the total transmit power subject to per-user minimum rate constraints without exploring the effect that the user/mode selection process may have on the SRA process. Maciel et al. in [26] and Papoutsis et al. in [27] tackle this specific SRA problem but only in the MISO-OFDMA context, thus oversimplifying the SRA problem given the current and envisaged capabilities of state-of-the-art mobile stations. In [28], Yen et al.
propose a greedy, heuristic, utility-based suboptimal SRA scheme for throughput maximization and complexity reduction in downlink multiuser MIMO-OFDMA systems, but their approach lacks the generality needed to address designs based on different power/subband/rate allocation strategies. Chen and Swindlehurst [29] explore the problem of SRA in the context of MIMO-OFDMA from a game-theoretic bargaining perspective. The proposed bargaining solutions achieve a useful trade-off between overall system efficiency and user fairness; however, even though they consider neither the possibility of users with heterogeneous QoS requirements nor the joint selection of optimal multiuser/multimode groups, the corresponding algorithms have reasonable complexity only for situations involving a relatively small number of users, with oversimplified approaches being required for practical implementation. Very recently, the authors of [30] have investigated a setup similar to the one considered in this work; however, they restrict themselves to a uniform power allocation in both the spatial and frequency domains, thus simplifying the problem, and, moreover, the scheduling technique employed does not consider queue state information at the DLC layer. For completeness, we also note that, in order to address the statistical delay requirements of multimedia traffic, the link-layer model termed effective capacity by Wu and Negi in [31] has been used to formulate resource allocation optimization problems in multicarrier


systems. Unfortunately, most of them only consider designs that strive for energy efficiency (see, for example, [32]–[34]) and, furthermore, do not consider the use of MU-MIMO-OFDMA techniques.

In this paper, channel- and queue-aware rate adaptive SRA algorithms addressing some of the shortcomings of previous approaches are proposed. These algorithms are able to optimally exploit MIMO-OFDMA wireless systems using MU-MIMO schemes based on BD processing.¹ Our main contributions in this paper can be summarized as follows:

1) Based on the availability of PHY layer CSI and DLC layer QSI, a novel optimal design framework suitable for MU-MIMO scenarios with an arbitrary number of transmit/receive antennas is introduced, integrating the steps of BD-based precoding design, multiuser/multimode selection, and power/subband allocation while taking into account heterogeneous traffic classes and supporting different scheduling policies. Our approach relies on the derivation of a signal-to-noise ratio (SNR) expression general enough to encompass different BD precoding schemes as well as different power/rate allocation strategies such as adaptive and uniform power allocation, along with discrete and continuous rate allocation. As far as we know, our proposal is the first one addressing the unified treatment of this complex optimization problem in the context of SRA for multiuser MIMO-OFDMA-based systems.

2) In order to simplify the optimal algorithms, which are based on a computationally prohibitive exhaustive search approach, a new low-complexity multiuser/multimode selection algorithm is proposed. Based on greedy principles grounded on both the suboptimal user/mode selection algorithms hinted at by Yoo and Goldsmith in [5, Sect. VIII] and the capacity-based user selection algorithm proposed by Shen et al. in [10, Sect. III.A], this novel algorithm is shown to perform very close to the optimal solution in practical scenarios.

3) Reinforcing the practical applicability of the proposed framework, results for the discrete data-rate schemes presented in Section VI are obtained using block error probability (BLER) models for state-of-the-art adaptive modulation and coding (AMC) schemes standardized by the 3GPP [35]–[37]. This allows the quantification of the role played by the MU-MIMO component in terms of realistic QoS metrics.

¹ The proposed framework could easily be extended to encompass other MU-MIMO techniques such as ZFBF.

The rest of the paper is organized as follows. Section II introduces the system model under consideration along with the key assumptions made in the formulation of the optimization problem. A thorough description of the scenario, the channel model employed, as well as of the MU-MIMO transmitter and receiver architectures is provided. Section III presents a description of the variables involved in the optimization problem. Section IV formally states the unified framework for constrained channel- and queue-aware SRA for heterogeneous multi-service MU-MIMO-OFDMA

wireless networks. Both discrete (AMC-based) and continuous (Shannon-capacity-based) strategies are considered, and solutions based on dual optimization are provided. A utility-based suboptimal greedy multiuser/multimode selection algorithm is proposed in Section V. Extensive numerical results are obtained in Section VI. Finally, the main outcomes of this paper are recapped in Section VII.

This introduction ends with a notational remark. In this paper, vectors and matrices are denoted by lower- and uppercase bold letters, respectively. The $K$-dimensional identity matrix is represented by $\mathbf{I}_K$. The symbols $\mathbb{C}$ and $\mathbb{R}_+$ serve to denote the sets of complex and non-negative real numbers, respectively. The operator $\operatorname{rank}(\mathbf{X})$ denotes the rank of matrix $\mathbf{X}$, whereas $\operatorname{diag}(\mathbf{x})$ represents a diagonal matrix with the components of vector $\mathbf{x}$ in its main diagonal. Superscripts $(\cdot)^T$ and $(\cdot)^H$ are used to denote the transpose and the conjugate transpose (Hermitian) of a matrix. Table I summarizes the notation for the most commonly used parameters in this paper.

TABLE I. NOTATION GLOSSARY

II. SYSTEM MODEL AND ASSUMPTIONS

Let us consider the downlink of a time-slotted BD-based MU-MIMO-OFDMA wireless packet access network similar to those currently found in LTE/LTE-A deployments. In this setup, a BS with a total transmit power $P_T$ and equipped with $N_T$ transmit antennas provides service to $N_m$ active MSs, each equipped with $N_R^{(m)}$ receive antennas. Active MSs are indexed by the set $\mathcal{N}_m = \{1, \dots, N_m\}$, and it is assumed in this paper that $N_T \ge N_R^{(m)}$ for all $m \in \mathcal{N}_m$. Transmission between the BS and active MSs is organized in space-time-frequency resource allocation units, also known as resource blocks (RBs). Each RB is formed by a slot in the time axis, a subband in the frequency axis, and a spatial mode in the space axis:

• In the time axis, each RB occupies a time slot of a fixed duration $T_s$, assumed to be less than the channel coherence time. Thus, the channel fading can be considered constant over the whole slot and only varies from slot to slot. Each of these slots consists of a fixed number $N_o$ of OFDM symbols of duration $T_o + T_{CP} = T_s/N_o$, where $T_o$ denotes the duration of payload data and $T_{CP}$ is the cyclic prefix duration. Slot duration and scheduling time interval will be used interchangeably throughout the paper.

• Slotted transmissions take place over a bandwidth $B$, which is divided into $N_f$ orthogonal subcarriers, out of which $N_d$ are used to transmit data and $N_p$ are used to transmit pilots and to set guard frequency bands. The $N_d$ data subcarriers are split into $N_b$ orthogonal subbands, each consisting of $N_{sc}$ adjacent subcarriers and with a bandwidth $B_b = B N_d/(N_f N_{sc})$ small enough to assume that all subcarriers in a subband experience frequency-flat fading. Frequency subbands in a given slot are indexed by the set $\mathcal{N}_b = \{1, \dots, N_b\}$.

• The BS, according to the allocation criteria and using BD-based MU-MIMO techniques, distributes the available spatial modes in a given slot $t$ over subband $b$ to a group of users selected from the pool of $N_m$ active MSs. The potential user and mode subsets, also known as multiuser/multimode groups, on a given slot $t$ and subband $b$ are indexed by the set $\mathcal{G}_b(t) = \{1, \dots, G_b(t)\}$, where $G_b(t)$ denotes the number of possible multiuser/multimode groups $g \in \mathcal{G}_b(t)$. The number of spatial modes allocated to a specific group of selected MSs will be denoted as $S_{b,g}(t)$, and the number of spatial modes simultaneously allocated to a given MS $m$ in multiuser group $g$ will be denoted as $S_{b,g,m}(t)$.

Since optimization is performed on a slot-by-slot basis, from this point onwards the time dependence (i.e., $(t)$) of all the variables will be omitted to simplify notation. Without loss of generality, and in order to simplify the mathematical notation of the problem, only one service data flow (also known as a connection or session) per active MS will be assumed. Depending on the traffic type, three classes of service must be accounted for in wireless communications² [39]: Best Effort (BE), Non-Real-Time (nRT), and Real-Time (RT). Traffic flows arriving from higher layers are buffered into the corresponding $N_m$ first-in first-out (FIFO) queues at the DLC layer. At the beginning of each scheduling time interval, based on the available joint CSI/QSI, the SRA unit selects some bits in the queues for transmission, which are then forwarded to the OFDM transmitter, where they are adaptively modulated and channel encoded and are allocated power and subbands based on the BD MU-MIMO processing.

² Using LTE terminology, real-time services would be associated with guaranteed bit rate (GBR) evolved packet system (EPS) bearers, and the non-GBR bearers would be suitable for best-effort and non-real-time services [38].

A. PHY Layer Modeling

Let us assume that, in an arbitrary time slot, the BS has decided to simultaneously serve the multiuser/multimode group $g$ over subband $b$. Let us also assume that this group contains the subset of users $\mathcal{M}_{b,g} = \{m_{b,g,1}, \dots, m_{b,g,M_{b,g}}\}$, and that a specific user $m \in \mathcal{M}_{b,g}$ is allocated $S_{b,g,m}$ spatial modes. The total number of antennas at all receivers in this group is defined to be $N_R^{(b,g)} = \sum_{m \in \mathcal{M}_{b,g}} N_R^{(m)}$, and the corresponding MU-MIMO channel is represented by $\{N_R^{(m_{b,g,1})}, \dots, N_R^{(m_{b,g,M_{b,g}})}\} \times N_T$, as opposed to writing $N_R^{(b,g)} \times N_T$ as in a single-user MIMO channel. In this case, bits from the queues of MSs $m \in \mathcal{M}_{b,g}$ are channel encoded and mapped onto a sequence of symbols drawn from the allocated normalized unit-energy complex constellation (e.g., BPSK, QPSK, 16QAM, 64QAM). Furthermore, before the usual OFDM modulation steps on each transmit antenna (IFFT, cyclic prefix appending, and up-conversion), the symbols are allocated power and are processed in accordance with the linear preprocessing scheme dictated by the BD procedure.

Denoting by $\mathbf{d}_{b,g,m}^{(c,o)}$ the $S_{b,g,m} \times 1$ vector of symbols to be sent to an arbitrary MS $m \in \mathcal{M}_{b,g}$ over subcarrier $c \in \{1, \dots, N_{sc}\}$ of subband $b$ and OFDM symbol $o \in \{1, \dots, N_o\}$ during an arbitrary time slot, the corresponding $N_T \times 1$ transmitted vector can be expressed as

$$\mathbf{x}_{b,g,m}^{(c,o)} = \mathbf{A}_{b,g,m} \mathbf{d}_{b,g,m}^{(c,o)}, \tag{1}$$

where $\mathbf{A}_{b,g,m} \in \mathbb{C}^{N_T \times S_{b,g,m}}$ is the downlink transmit precoding matrix of MS $m$. The vector of transmitted samples from the BS to MSs in set $\mathcal{M}_{b,g}$ over subband $b$ can then be expressed as

$$\mathbf{x}_{b,g}^{(c,o)} = \mathbf{A}_{b,g} \mathbf{d}_{b,g}^{(c,o)}, \tag{2}$$

where $\mathbf{A}_{b,g} = \left[\mathbf{A}_{b,g,m_{b,g,1}} \,\cdots\, \mathbf{A}_{b,g,m_{b,g,M_{b,g}}}\right]$, and the vector of data symbols can be expressed as $\mathbf{d}_{b,g}^{(c,o)} = \left[\mathbf{d}_{b,g,m_{b,g,1}}^{(c,o)\,T} \,\cdots\, \mathbf{d}_{b,g,m_{b,g,M_{b,g}}}^{(c,o)\,T}\right]^T$. Note that the transmission of $S_{b,g,m}$ spatial streams to each MS $m \in \mathcal{M}_{b,g}$ over each subband $b \in \mathcal{N}_b$ is only possible if the following constraints are met:

$$\sum_{m \in \mathcal{M}_{b,g}} S_{b,g,m} \le N_T \;\; \forall\, b, g, \quad \text{and} \quad S_{b,g,m} \le N_R^{(m)} \;\; \forall\, b, g, m. \tag{3}$$

At the receiver side, as usual, ideal synchronization and sampling processes, and an OFDM cyclic prefix duration greater than the maximum delay spread of the channel impulse response are assumed. In this case, the received samples at the output of the $N_R^{(m)}$ FFT processing stages are given by the $N_R^{(m)} \times 1$ complex-valued vector

$$\begin{aligned} \mathbf{y}_{b,g,m}^{(c,o)} &= \mathbf{H}_{b,m} \mathbf{x}_{b,g}^{(c,o)} + \boldsymbol{\nu}_{b,m}^{(c,o)} \\ &= \mathbf{H}_{b,m} \sum_{l \in \mathcal{M}_{b,g}} \mathbf{A}_{b,g,l} \mathbf{d}_{b,g,l}^{(c,o)} + \boldsymbol{\nu}_{b,m}^{(c,o)} \\ &= \mathbf{H}_{b,m} \mathbf{A}_{b,g,m} \mathbf{d}_{b,g,m}^{(c,o)} + \mathbf{H}_{b,m} \tilde{\mathbf{A}}_{b,g,m} \tilde{\mathbf{d}}_{b,g,m}^{(c,o)} + \boldsymbol{\nu}_{b,m}^{(c,o)}, \end{aligned} \tag{4}$$

where $\mathbf{H}_{b,m} = \mathbf{R}_{RX,m}^{1/2} \breve{\mathbf{H}}_{b,m} \mathbf{R}_{TX}^{T/2}$ is a complex-valued $N_R^{(m)} \times N_T$ matrix modeling the spatially correlated MIMO channel between the BS and MS $m$, for subband $b$ and over the whole time slot period [40], with $\mathbf{R}_{RX,m} \in \mathbb{C}^{N_R^{(m)} \times N_R^{(m)}}$ denoting the receive correlation matrix at MS $m$, $\mathbf{R}_{TX} \in \mathbb{C}^{N_T \times N_T}$ denoting the transmit correlation matrix at the BS, and $\breve{\mathbf{H}}_{b,m} \in \mathbb{C}^{N_R^{(m)} \times N_T}$ denoting a matrix of independent and identically distributed (i.i.d.) channels between the $N_T$ transmit antennas at the BS and the $N_R^{(m)}$ antennas at the $m$th MS. Furthermore, $\boldsymbol{\nu}_{b,m}^{(c,o)} \in \mathbb{C}^{N_R^{(m)} \times 1}$ is a noise vector modeled as a vector of zero-mean complex circularly symmetric Gaussian random variables with covariance matrix $E\{\boldsymbol{\nu}_{b,m}^{(c,o)} \boldsymbol{\nu}_{b,m}^{(c,o)\,H}\} = \sigma_\nu^2 \mathbf{I}_{N_R^{(m)}}$, and $\tilde{\mathbf{A}}_{b,g,m}$ and $\tilde{\mathbf{d}}_{b,g,m}^{(c,o)}$ are defined, respectively, as the preprocessing matrix and transmit vector for all users in $\mathcal{M}_{b,g}$ other than user $m$.

The estimate of the symbol data vector $\mathbf{d}_{b,g,m}^{(c,o)}$ is obtained from $\mathbf{y}_{b,g,m}^{(c,o)}$ by using the postprocessing matrix $\mathbf{F}_{b,g,m}^H \in \mathbb{C}^{S_{b,g,m} \times N_R^{(m)}}$, that is,

$$\hat{\mathbf{d}}_{b,g,m}^{(c,o)} = \mathbf{F}_{b,g,m}^H \mathbf{y}_{b,g,m}^{(c,o)} = \mathbf{F}_{b,g,m}^H \mathbf{H}_{b,m} \mathbf{A}_{b,g,m} \mathbf{d}_{b,g,m}^{(c,o)} + \mathbf{F}_{b,g,m}^H \mathbf{H}_{b,m} \tilde{\mathbf{A}}_{b,g,m} \tilde{\mathbf{d}}_{b,g,m}^{(c,o)} + \mathbf{F}_{b,g,m}^H \boldsymbol{\nu}_{b,m}^{(c,o)}. \tag{5}$$
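As an illustration of the channel model in (4) and of the stream constraints in (3), the following minimal NumPy sketch draws one spatially correlated channel realization and checks whether a stream allocation is admissible. The exponential correlation model and all function names (`psd_sqrt`, `exp_corr`, `correlated_channel`, `streams_feasible`) are assumptions made for this example only; the paper does not prescribe them.

```python
import numpy as np

def psd_sqrt(R):
    """Hermitian square root of a positive semidefinite matrix (via eigendecomposition)."""
    w, V = np.linalg.eigh(R)
    w = np.clip(w, 0.0, None)              # guard against tiny negative eigenvalues
    return (V * np.sqrt(w)) @ V.conj().T

def exp_corr(n, rho):
    """Exponential correlation matrix with entries rho^|i-j| (assumed model, for illustration)."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def correlated_channel(n_r, n_t, rho_rx, rho_tx, rng):
    """One realization of H = R_rx^{1/2} H_iid R_tx^{T/2}, the structure used in (4)."""
    H_iid = (rng.standard_normal((n_r, n_t))
             + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
    R_rx_half = psd_sqrt(exp_corr(n_r, rho_rx))
    R_tx_half_T = psd_sqrt(exp_corr(n_t, rho_tx)).T
    return R_rx_half @ H_iid @ R_tx_half_T

def streams_feasible(streams, n_rx, n_t):
    """Constraints (3): total streams <= N_T and per-user streams <= N_R^(m)."""
    return sum(streams) <= n_t and all(s <= n for s, n in zip(streams, n_rx))

rng = np.random.default_rng(0)
H = correlated_channel(n_r=2, n_t=4, rho_rx=0.5, rho_tx=0.3, rng=rng)
print(H.shape, streams_feasible([2, 2], [2, 2], 4))
```

The Hermitian square root is computed through an eigendecomposition so that the sketch depends only on NumPy; any other valid matrix square root would serve equally well here.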

In a MU-MIMO system based on BD [6] with a joint Tx-Rx beamforming design [41], the pre- and postprocessing matrices $\mathbf{A}_{b,g,m}$ and $\mathbf{F}_{b,g,m}^H$ can be written as the product of two matrices, that is, $\mathbf{A}_{b,g,m} = \mathbf{A}_{b,g,m}^{(M)} \mathbf{A}_{b,g,m}^{(S)}$ and $\mathbf{F}_{b,g,m}^H = \mathbf{F}_{b,g,m}^{(S)\,H} \mathbf{F}_{b,g,m}^{(M)\,H}$, where $\mathbf{A}_{b,g,m}^{(M)}$ and $\mathbf{F}_{b,g,m}^{(M)\,H}$ are used to cancel all multiuser interference [6], [11], and $\mathbf{A}_{b,g,m}^{(S)}$ and $\mathbf{F}_{b,g,m}^{(S)\,H}$ are used to decompose the equivalent single-user MIMO channel into parallel spatial layers (channel diagonalization), with the objective of optimizing some specific design criteria [41].

1) Block Diagonalization: To eliminate all multiuser interference, $\mathbf{A}_{b,g,m}^{(M)}$ and $\mathbf{F}_{b,g,m}^{(M)\,H}$ must be designed such that

$$\mathbf{F}_{b,g,m'}^{(M)\,H} \mathbf{H}_{b,m'} \mathbf{A}_{b,g,m}^{(M)} = \mathbf{0}, \quad \forall\, m' \ne m. \tag{6}$$

Let us define $\mathbf{H}_{b,g,m'}^{(F_{m'})} = \mathbf{F}_{b,g,m'}^{(M)\,H} \mathbf{H}_{b,m'}$ as the equivalent MIMO channel for user $m'$ after postprocessing with $\mathbf{F}_{b,g,m'}^{(M)\,H}$. Using this definition, let us also define $\tilde{\mathbf{H}}_{b,g,m}$ as the equivalent MU-MIMO channel matrix for all MSs in $\mathcal{M}_{b,g}$ other than MS $m$ on subband $b$. The zero multiuser interference constraint can then be rewritten as

$$\tilde{\mathbf{H}}_{b,g,m} \mathbf{A}_{b,g,m}^{(M)} = \mathbf{0}, \tag{7}$$

and a sufficient condition to satisfy this constraint is to force $\mathbf{A}_{b,g,m}^{(M)}$ to lie in the null space of $\tilde{\mathbf{H}}_{b,g,m}$. To guarantee that the transmissions of all active MSs in group $g$ on subband $b$ can be accommodated under the zero-forcing constraint, the null space of $\tilde{\mathbf{H}}_{b,g,m}$ must have a dimension greater than or equal to $S_{b,g,m}$ for all $m \in \mathcal{M}_{b,g}$, and this is only satisfied when

$$\tilde{S}_{b,g,m} = \operatorname{rank}\left(\tilde{\mathbf{H}}_{b,g,m}\right) \le N_T - S_{b,g,m}. \tag{8}$$

Assuming that the dimension condition is satisfied for all MSs, the singular value decomposition (SVD) of $\tilde{\mathbf{H}}_{b,g,m}$ can be expressed as

$$\tilde{\mathbf{H}}_{b,g,m} = \tilde{\mathbf{U}}_{b,g,m} \tilde{\boldsymbol{\Sigma}}_{b,g,m} \left[\tilde{\mathbf{V}}_{b,g,m}^{(1)} \;\; \tilde{\mathbf{V}}_{b,g,m}^{(0)}\right]^H, \tag{9}$$

where $\tilde{\mathbf{V}}_{b,g,m}^{(1)}$ and $\tilde{\mathbf{V}}_{b,g,m}^{(0)}$ hold, respectively, the first $\tilde{S}_{b,g,m}$ and the last $N_T - \tilde{S}_{b,g,m}$ right singular vectors of $\tilde{\mathbf{H}}_{b,g,m}$. The columns of $\tilde{\mathbf{V}}_{b,g,m}^{(0)}$ form an orthonormal basis for the null space of $\tilde{\mathbf{H}}_{b,g,m}$ and, thus, any matrix $\mathbf{A}_{b,g,m}^{(M)}$ providing zero multiuser interference will have columns that are linear combinations of the columns of $\tilde{\mathbf{V}}_{b,g,m}^{(0)}$.

For a given multiuser/multimode group $g$ over subband $b$, the optimal design of the pre- and postprocessing matrices $\mathbf{A}_{b,g,m}^{(M)}$ and $\mathbf{F}_{b,g,m}^{(M)}$ requires an iterative process in which, assuming an initial set of postprocessing matrices, the pre- and postprocessing filters are iteratively computed, given the known receiver structure. To avoid the computational complexity of the iterative strategy, two suboptimal approaches have been proposed in the literature:

a) Block diagonalization with receiver antenna selection (BD-RAS): Proposed by Shen et al. in [11], this strategy selects the postprocessing matrix $\mathbf{F}_{b,g,m}^{(M)\,H}$ out of all the $S_{b,g,m} \times N_R^{(m)}$ matrices that are formed by taking $S_{b,g,m}$ rows from $\mathbf{I}_{N_R^{(m)}}$. That is,

$$\mathbf{F}_{b,g,m}^{(M)\,H} = \mathbf{I}_{N_R^{(m)}}\left(\alpha_1, \dots, \alpha_{S_{b,g,m}}\right), \tag{10}$$

where $\mathbf{I}_{N_R^{(m)}}(\alpha_1, \dots, \alpha_{S_{b,g,m}})$ is used to denote an $S_{b,g,m} \times N_R^{(m)}$ matrix whose $i$th row is equal to the $\alpha_i$th row of $\mathbf{I}_{N_R^{(m)}}$. Using one of these postprocessing matrices, MS $m$ effectively selects a subset of receive antennas and disables the remaining ones. Furthermore, the dimension condition (8), assuming full-rank matrices, can be rewritten as

$$S_{b,g,m} \le N_T - \sum_{\substack{m' \in \mathcal{M}_{b,g} \\ m' \ne m}} S_{b,g,m'}. \tag{11}$$

If $S_{b,g,m} < N_R^{(m)}$, the use of the postprocessing matrix $\mathbf{F}_{b,g,m}^{(M)\,H}$ reduces the dimensionality of the equivalent MIMO channel for user $m$, thus decreasing its own mutual information. However, this in turn increases the dimensionality of its null space, making it easier for the other users in group $g$ to satisfy the zero multiuser interference constraint and achieve higher throughput [11]. If $\mathbf{F}_{b,g,m}^{(M)\,H} = \mathbf{I}_{N_R^{(m)}}$, then the BD-RAS scheme reduces to the conventional BD algorithm proposed by Spencer et al. in [6, Sect. III.A].

b) Block diagonalization with coordinated Tx-Rx processing (BD-CTR): Proposed by Spencer et al. in [6, Sect. V], it is based on the use of the dominant left singular vectors of the channel of each user $m \in \mathcal{M}_{b,g}$. Let us define the SVD of $\mathbf{H}_{b,m}$ as

$$\mathbf{H}_{b,m} = \mathbf{U}_{H_{b,m}} \boldsymbol{\Sigma}_{H_{b,m}} \mathbf{V}_{H_{b,m}}^H, \tag{12}$$

where the columns of $\mathbf{U}_{H_{b,m}}$ and $\mathbf{V}_{H_{b,m}}$ are the left and right singular vectors of $\mathbf{H}_{b,m}$, and $\boldsymbol{\Sigma}_{H_{b,m}}$ is a rectangular diagonal matrix containing the $N_R^{(m)}$ singular values. The suboptimal solution proposed by Spencer et al. is to design

$$\mathbf{F}_{b,g,m}^{(M)\,H} = \mathbf{I}_{N_R^{(m)}}\left(1, \dots, S_{b,g,m}\right) \mathbf{U}_{H_{b,m}}^H, \tag{13}$$

whose rows contain the $S_{b,g,m}$ dominant left singular vectors of $\mathbf{H}_{b,m}$. In this case, the dimension condition (8) is equivalent to (3). Furthermore, note that the postprocessing matrix constructed using BD-CTR is an orthogonal basis of a subspace of the channel matrix's range space, whereas this is not the case when employing BD-RAS. In channels exhibiting large spatial correlation, if antenna selection were conducted randomly, BD-RAS might potentially lead to an overall rank-deficient channel matrix, thus compromising ZF-based detection. However, this is not a major issue in practice because a properly designed BD-RAS algorithm takes care to avoid such situations.

2) Joint Tx-Rx Beamforming: Once the multiuser interference has been eliminated, the estimate of the symbol data vector $\mathbf{d}_{b,g,m}^{(c,o)}$ in (5) can be simplified to

$$\hat{\mathbf{d}}_{b,g,m}^{(c,o)} = \mathbf{F}_{b,g,m}^{(S)\,H} \mathbf{H}_{b,g,m}^{(M)} \mathbf{A}_{b,g,m}^{(S)} \mathbf{d}_{b,g,m}^{(c,o)} + \mathbf{F}_{b,g,m}^{(S)\,H} \boldsymbol{\nu}_{b,m}^{(c,o)}, \tag{14}$$

where $\mathbf{H}_{b,g,m}^{(M)} = \mathbf{F}_{b,g,m}^{(M)\,H} \mathbf{H}_{b,m} \mathbf{A}_{b,g,m}^{(M)}$ is the equivalent channel obtained after BD. Decomposing now the equivalent channel $\mathbf{H}_{b,g,m}^{(M)}$ by means of the SVD as

$$\mathbf{H}_{b,g,m}^{(M)} = \mathbf{U}_{b,g,m}^{(M)} \boldsymbol{\Sigma}_{b,g,m}^{(M)} \mathbf{V}_{b,g,m}^{(M)\,H}, \tag{15}$$

it can be shown that, when the objective of the joint Tx-Rx beamforming design is the maximization of the channel capacity, the optimal transmitter and receiver filters are given, respectively, by [41]

$$\mathbf{A}_{b,g,m}^{(S)} = \frac{1}{\sqrt{N_{sc}}} \mathbf{V}_{b,g,m}^{(M)} \mathbf{P}_{b,g,m}^{1/2} \tag{16}$$

and

$$\mathbf{F}_{b,g,m}^{(S)\,H} = \mathbf{U}_{b,g,m}^{(M)\,H}. \tag{17}$$
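The chain formed by (7), (9), (10), and (15) can be illustrated with a short NumPy sketch for the BD-RAS case. This is only a sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, at least two users are assumed to be in the group, and the particular choice of $\mathbf{A}^{(M)}$ within the null space (the subspace best aligned with the user's own channel) is an assumption of this example, since the text only requires its columns to be linear combinations of the columns of $\tilde{\mathbf{V}}^{(0)}$.

```python
import numpy as np

def null_space_basis(H_tilde, tol=1e-10):
    """Orthonormal basis of null(H_tilde): the trailing right singular vectors in (9)."""
    _, s, Vh = np.linalg.svd(H_tilde, full_matrices=True)
    rank = int(np.sum(s > tol))
    return Vh[rank:].conj().T              # columns span the null space

def bd_ras_design(H_list, sel_list):
    """BD-RAS sketch for a group of at least two users.

    H_list[m]  : N_R^(m) x N_T channel of user m
    sel_list[m]: indices of the receive antennas kept by user m, as in (10)
    """
    # Equivalent channels after antenna selection, F^(M)H H
    H_eq = [H[sel, :] for H, sel in zip(H_list, sel_list)]
    designs = []
    for m, H_m in enumerate(H_eq):
        # Stacked equivalent channel of all other users in the group
        H_tilde = np.vstack([H for l, H in enumerate(H_eq) if l != m])
        V0 = null_space_basis(H_tilde)
        S = H_m.shape[0]                   # needs null-space dimension >= S, cf. (11)
        # Assumed choice: S-dimensional subspace of null(H_tilde) best matched to user m
        _, _, Wh = np.linalg.svd(H_m @ V0, full_matrices=False)
        A_M = V0 @ Wh[:S].conj().T
        # SVD of the S x S equivalent channel after BD, as in (15)
        U, sigma, Vh = np.linalg.svd(H_m @ A_M)
        designs.append({"A_M": A_M, "U": U, "sigma": sigma, "V": Vh.conj().T})
    return H_eq, designs

rng = np.random.default_rng(0)
N_T = 4
H_list = [(rng.standard_normal((2, N_T)) + 1j * rng.standard_normal((2, N_T))) / np.sqrt(2)
          for _ in range(2)]
H_eq, designs = bd_ras_design(H_list, sel_list=[[0, 1], [1]])
# Residual interference that user 0's precoder causes at user 1 (numerically zero)
print(np.abs(H_eq[1] @ designs[0]["A_M"]).max())
```

For BD-CTR, the same routine could be reused by replacing the row selection `H[sel, :]` with a projection onto the dominant left singular vectors of each user's channel, following (13).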

2024

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 64, NO. 5, MAY 2016

respectively, where P b,g,m = diag



pb,g,m,1 . . . pb,g,m,Sb,g,m

T 

(18)

contains the vector of power values allocated to the Sb,g,m spatial streams in its main diagonal (in a given subband, and for a given data stream, power is uniformly allocated to subcarriers). In this case, the instantaneous SNR experienced by every subcarrier in subband b and spatial stream s, allocated to user m ∈ Mb,g , is given by γb,g,m,s

pb,g,m,s σb,g,m,s = , σν2 Nsc

(19) (M)

where σb,g,m,s is the sth element in the diagonal of  b,g,m . Complexity considerations: Given the downlink nature of the considered setup, complexity might be an issue at the receiver as the MS may have energy and/or computational resources limitations. To this end, notice that the two considered BD schemes, BD-RAS and BD-CTR, entail significant differences in terms of computational complexity. Whereas determining the receiver multiuser cancellation matrix for the case of BD-RAS simply consists of choosing the best rows of an identity matrix, typically a straightforward step as in most cases MSs are equipped with a small number of antennas, for BD-CTR the multiuser cancellation matrix involves performing an SVD. The SVD has a computational cost of O(min{N T (N R(m) )2 , N T2 N R(m) }) that becomes O(N T (N R(m) )2 ) (m) under the reasonable assumption of N T ≥ N R . The subsequent Tx-Rx beamforming brings along another SVD but now 3 ) on a Sb,g,m -squared matrix, and thus requiring of O(Sb,g,m (m)

operations. Given that both $N_R$ and $S_{b,g,m}$ are likely to be small, both BD-based schemes will result in a rather modest computational complexity. Nonetheless, note that the precoder/receiver design has to be conducted on a per-subband basis and, therefore, if complexity reduction is a priority, BD-RAS greatly simplifies the receiver processing.

B. DLC Layer Modeling

At the beginning of a given time slot $t$, MS $m$ is assumed to have $Q_m(t)$ bits in the queue. If there are $A_m(t)$ bits arriving during time slot $t$, the queue length at the end of this time slot, assuming queues of infinite capacity, can then be expressed as

$$Q_m(t+1) = Q_m(t) + A_m(t) - R_m(t) N_o T_o, \tag{20}$$

where

$$R_m(t) = \min\left\{ r_m(t), \frac{Q_m(t)}{N_o T_o} \right\}, \tag{21}$$

with $r_m(t)$ denoting the data rate allocated to user $m$ during this time slot. A resource allocation strategy that, in order to avoid the waste of resources, selects a transmission rate

$$r_m(t) \le \frac{Q_m(t)}{N_o T_o} \tag{22}$$

is said to fulfill the frugality constraint (FC) [17]. It was shown in [19] that, assuming the use of queue-aware scheduling rules, negligible performance improvements can be expected upon the activation of the FC. Thus, for the sake of exposition simplicity, this constraint will not be taken into account when performing the SRA optimization. Nonetheless, note that (21) does indeed reflect the real data rate used by user $m$ and determines the queue dynamics for this user given by (20).

III. OPTIMIZATION VARIABLES

A. Power Allocation

Let $\mathbf{p}_{b,g,m} = \left[ p_{b,g,m,1} \,\cdots\, p_{b,g,m,S_{b,g,m}} \right]^T$ denote the vector of power allocation values for user $m \in \mathcal{M}_{b,g}$ over subband $b$. Furthermore, let

$$\mathbf{p}_{b,g} = \left[ \mathbf{p}^T_{b,g,m_{b,g,1}} \,\cdots\, \mathbf{p}^T_{b,g,m_{b,g,M_{b,g}}} \right]^T \tag{23}$$

denote the vector of powers allocated to all the spatial streams transmitted by the BS to the users in $\mathcal{M}_{b,g}$ over subband $b$. In this case, the vector of powers allocated to all the multiuser groups over subband $b$ can be denoted by

$$\mathbf{p}_b = \left[ \mathbf{p}^T_{b,1} \,\cdots\, \mathbf{p}^T_{b,G_b} \right]^T. \tag{24}$$

Then, for a given set of constraints, the scheduling and resource allocation algorithm will be in charge of determining the power allocation vector

$$\mathbf{p} = \left[ \mathbf{p}_1^T \,\cdots\, \mathbf{p}_{N_b}^T \right]^T \tag{25}$$

optimizing a prescribed objective function. In addition to determining the power allocation, the resource allocation algorithms should also assign resource blocks and transmission rates. Nevertheless, as shown next, the power allocation vector $\mathbf{p}$ can also represent the allocation of all these resources, thus simplifying the formulation of the optimization problem [42].

B. Subband and Resource Block Allocation

As usual, it is assumed that a given subband $b$ is exclusively allocated to a multiuser group $g$, and a given RB in subband $b$ is exclusively allocated to a user $m$ belonging to multiuser group $g$. Hence, the RB allocation constraints can be captured by constraining the power allocation vectors as $\mathbf{p}_b \in \mathcal{P}_b$ where

$$\mathcal{P}_b \triangleq \left\{ \mathbf{p}_b \in \mathbb{R}_+^{S_b} : \begin{array}{l} \mathbf{p}_{b,g} \neq \mathbf{0} \Rightarrow \mathbf{p}_{b,g'} = \mathbf{0} \ \ \forall\, g' \neq g, \\ p_{b,g,m,s}\, p_{b,g,m',s} = 0, \ \ \forall\, m' \neq m \end{array} \right\}, \tag{26}$$

with

$$S_b = \sum_{g=1}^{G_b} S_{b,g} = \sum_{g=1}^{G_b} \sum_{m \in \mathcal{M}_{b,g}} S_{b,g,m}. \tag{27}$$

Hence, the power allocation vector satisfies $\mathbf{p} \in \mathcal{P} = \mathcal{P}_1 \times \cdots \times \mathcal{P}_{N_b} \subset \mathbb{R}_+^S$, where $\times$ denotes the Cartesian product (or product set), and $S = \sum_{b=1}^{N_b} S_b$.


C. Rate Allocation

In the downlink of multi-rate systems based on adaptive modulation and coding (AMC), a channel estimate obtained at the receiver of each MS is fed back to the BS so that it can select a modulation and coding scheme (MCS), comprising a modulation format and a channel coding rate, which is adapted to the channel characteristics. Let us assume that the MCS allocated to spatial stream $s$ of MS $m$ in group $g$ over subband $b$ can be characterized by a transmission rate $\rho_{b,g,m,s}$ (measured in bits per second per subcarrier). As each subband contains $N_{sc}$ subcarriers, the aggregated data rate allocated to MS $m$ over time slot $t$ will be given by

$$r_m = N_{sc} \sum_{b=1}^{N_b} \sum_{g \in \mathcal{G}_b^{(m)}} \sum_{s=1}^{S_{b,g,m}} \rho_{b,g,m,s}, \tag{28}$$

where $\mathcal{G}_b^{(m)}$ is used to denote the subset of multiuser/multimode groups over subband $b$ that contain user $m$ as one of its members.

1) Discrete-Rate AMC: Realistic AMC strategies only use a discrete set $\mathcal{N}_k^{(m)} = \{0, 1, \ldots, N_k^{(m)}\}$ of MCSs that, in a general setup, can differ for different MSs. Each MCS is characterized by a particular transmission rate $\varrho_m^{(k)}$, with $\varrho_m^{(1)} < \ldots < \varrho_m^{(N_k^{(m)})}$. The data rate $\varrho_m^{(0)} = 0$ corresponds to the no-transmission mode, that is, the mode selected when the channel is so bad that no bits can be transmitted to MS $m$ while guaranteeing the prescribed target error probability. Transmission rate $\varrho_m^{(k)}$ can be related to the block error rate (BLER) observed by MS $m$ in group $g$ on its allocated spatial stream $s$ on subband $b$, denoted as $\epsilon_{m,s}$, and SNR $\gamma_{b,g,m,s}$ as [43, Chapter 9] (see also [39])

$$\epsilon_{m,s}\left(\gamma_{b,g,m,s}, \varrho_m^{(k)}\right) = \begin{cases} 1, & \gamma_{b,g,m,s} < \gamma_m^{(k)} \\ \kappa_1^{(k)} \exp\left(-\kappa_2^{(k)} \gamma_{b,g,m,s}\right), & \text{otherwise}, \end{cases} \tag{29}$$

where $\kappa_1^{(k)}$ and $\kappa_2^{(k)}$ are modulation- and code-specific constants that can be accurately approximated by exponential curve fitting applied to actual BLER measures or numerical simulation results, and $\gamma_m^{(k)} = \frac{1}{\kappa_2^{(k)}} \ln \kappa_1^{(k)}$. This expression is general enough to obtain the BLER performance of any transmission system for which the joint effects of preprocessing filters, channel coefficients and postprocessing equalizers can be represented through an instantaneous SNR $\gamma_{b,g,m,s}$, which for the special case at hand is given by (19). Given $\gamma_{b,g,m,s}$ and assuming a maximum allowable BLER $\check{\epsilon}_m$, (29) can be used to select the most adequate MCS scheme as the one with transmission rate

$$\rho_{b,g,m,s} = \max\left\{ \varrho_m^{(k)} : \epsilon_{m,s}\left(\gamma_{b,g,m,s}, \varrho_m^{(k)}\right) \le \check{\epsilon}_m \right\}, \tag{30}$$

which can also be expressed using the staircase function

$$\rho_{b,g,m,s} = \begin{cases} \varrho_m^{(0)}, & 0 \le \gamma_{b,g,m,s} < \Gamma_m^{(1)} \\ \varrho_m^{(1)}, & \Gamma_m^{(1)} \le \gamma_{b,g,m,s} < \Gamma_m^{(2)} \\ \ \vdots \\ \varrho_m^{(N_k)}, & \Gamma_m^{(N_k)} \le \gamma_{b,g,m,s} < \infty \end{cases} \tag{31}$$

where $\left\{\Gamma_m^{(k)}\right\}_{k=1}^{N_k - 1}$, with $\Gamma_m^{(k)} \le \Gamma_m^{(k+1)}$, are the instantaneous SNR boundaries defining the MCS selection intervals, which can be obtained from (29) as

$$\Gamma_m^{(k)} = \frac{1}{\kappa_2^{(k)}} \ln \frac{\kappa_1^{(k)}}{\check{\epsilon}_m}. \tag{32}$$

2) Continuous-Rate AMC: A useful abstraction when exploring transmission rate limits is to assume that each user's set of MCSs is infinite. In this case, the maximum allowable transmission rate fulfilling the prescribed BLER constraint with equality can be obtained from (29) as

$$\rho_{b,g,m,s} = \frac{1}{T_o} \log_2\left(1 + \frac{\gamma_{b,g,m,s}}{\Lambda_m}\right), \tag{33}$$

where $\Lambda_m \ge 1$ represents the coding gap due to the utilization of a practical (rather than ideal) coding scheme. With $\Lambda_m = 1$ this expression results in Shannon's capacity limit and allows the comparison of practical AMC-based schemes against fundamental capacity-oriented designs.

IV. UNIFIED OPTIMIZATION FRAMEWORK

Scheduling and resource allocation algorithms in a wireless communications network aim at obtaining an effective trade-off among spectral/energy efficiency and fairness, while providing prescribed QoS. Utility theory, widely used in economics to quantify the benefit obtained from the usage of certain resources, can also be used in wireless communication networks to evaluate the degree up to which a given network configuration can satisfy service requirements of users' applications. In [12], [13], Song et al. show that the cross-layer optimization of scheduling and resource allocation, defined as the one that maximizes the aggregate utility subject to physical layer constraints, can be formulated as the constrained weighted sum-rate optimization problem

$$\begin{aligned} \max_{\mathbf{p} \in \mathcal{P}} \quad & \sum_{b=1}^{N_b} \sum_{g=1}^{G_b} \sum_{m \in \mathcal{M}_{b,g}} \sum_{s=1}^{S_{b,g,m}} w_m N_{sc}\, \rho_{b,g,m,s} \\ \text{subject to} \quad & \sum_{b=1}^{N_b} \sum_{g=1}^{G_b} \sum_{m \in \mathcal{M}_{b,g}} \sum_{s=1}^{S_{b,g,m}} p_{b,g,m,s} \le P_T, \end{aligned} \tag{34}$$

where $w_m$ denotes the weighting (prioritization) coefficient for MS $m$, which can be used to implement different scheduling rules. Weights following prescribed scheduling rules can be found in [19]; important examples include:


• Proportional fair (PF) rule [44] is based on a channel-aware scheduling rule aiming at maximizing the logarithmic-sum-throughput of the system with

$$w_m = 1/\bar{r}_m, \quad \forall\, m, \tag{35}$$

where $\bar{r}_m$ is the average effective data rate actually allocated to user $m$, which is calculated using a moving average over a relatively long sliding window [45].

• Modified largest weighted delay first (MLWDF) [46] is based on a channel- and queue-aware scheduling rule. The MLWDF scheduler aims at choosing the best combination of queueing delay and potential transmission rate by using weights [46],

$$w_m = \phi_m W_{HOL,m} / \bar{r}_m, \quad \forall\, m, \tag{36}$$

where $\phi_m$ are arbitrary positive constants that can be used to set different priority levels between traffic flows, and $W_{HOL,m}$ denotes the head-of-line (HOL) delay experienced at the BS buffer of MS $m$. In order to support users with absolute delay requirement $\check{D}_m$ and maximum outage delay probability requirement $\check{\xi}_m$, if this can be done at all with any scheduling rule, the authors of [46] propose to properly set the values of $\phi_m$ as

$$\phi_m = -\frac{\log(\check{\xi}_m)}{\check{D}_m}, \tag{37}$$

providing in this way QoS differentiation among flows and approximately "balancing" different users' probabilities of QoS constraints violation.

• Exponential (EXP) rule [47] is also based on a channel- and queue-aware throughput optimal scheduling rule that considers the waiting time in the queues, the instantaneous potential transmission rates and the maximum tolerable delay requirements. The weights in this case can be shown to be defined by

$$w_m = \frac{\phi_m}{\bar{r}_m} \exp\left( \frac{\phi_m W_{HOL,m} - \overline{\phi W}}{1 + \sqrt{\overline{\phi W}}} \right) \tag{38}$$

for all $m$, with $\overline{\phi W} = \frac{1}{N_m} \sum_{m=1}^{N_m} \phi_m W_{HOL,m}$.

The optimization problem formulated in (34) is general enough to account for different power and rate allocation strategies: uniform power allocation (UPA), adaptive power allocation (APA), continuous rate allocation (CRA) and discrete rate allocation (DRA), which are treated next. Note that the MLWDF and EXP scheduling rules would remain throughput optimal if for all or some queues, the HOL delay $W_{HOL,m}$ was replaced by the queue length $Q_m$ [46].

A. Adaptive Power Allocation (APA)

Although the objective function in (34) is concave, $\mathcal{P}$ is a highly non-convex discrete constraint space. Fortunately, this optimization problem can be approached by using Lagrange duality principles and, as stated in [42], [48], in the dual domain it can be made separable across the subbands. With $\mu$ denoting the Lagrange multiplier associated with the power constraint, the Lagrangian of (34) can be expressed as

$$\mathcal{L}(\mathbf{p}, \mu) = \sum_{b=1}^{N_b} \sum_{g=1}^{G_b} \sum_{m \in \mathcal{M}_{b,g}} \sum_{s=1}^{S_{b,g,m}} w_m N_{sc}\, \rho_{b,g,m,s} + \mu \left( P_T - \sum_{b=1}^{N_b} \sum_{g=1}^{G_b} \sum_{m \in \mathcal{M}_{b,g}} \sum_{s=1}^{S_{b,g,m}} p_{b,g,m,s} \right), \tag{39}$$

and the dual problem $g(\mathbf{p}, \mu) = \min_{\mu \ge 0} \max_{\mathbf{p} \in \mathcal{P}} \mathcal{L}(\mathbf{p}, \mu)$ can then be written as [49]

$$g(\mathbf{p}, \mu) = \min_{\mu \ge 0} \left\{ \max_{\mathbf{p} \in \mathcal{P}} \left\{ \sum_{b=1}^{N_b} \sum_{g=1}^{G_b} \sum_{m \in \mathcal{M}_{b,g}} \sum_{s=1}^{S_{b,g,m}} \left( w_m N_{sc}\, \rho_{b,g,m,s} - \mu\, p_{b,g,m,s} \right) \right\} + \mu P_T \right\}. \tag{40}$$

Now, using the RB exclusive allocation constraints and the separability of power variables across RBs, the dual problem can be simplified as^3

$$g(\mathbf{p}, \mu) = \min_{\mu \ge 0} \left\{ \sum_{b=1}^{N_b} \max_{g \in \mathcal{G}_b} \left\{ \sum_{m \in \mathcal{M}_{b,g}} \sum_{s=1}^{S_{b,g,m}} \max_{p_{b,g,m,s} \ge 0} \left( w_m N_{sc}\, \rho_{b,g,m,s} - \mu\, p_{b,g,m,s} \right) \right\} + \mu P_T \right\}. \tag{41}$$

3 Note that this assumption does not hold for more complex problems involving per-user data rate constraints as in [30].

The solution to the simplified dual problem is given by optimizing (41) over all $(\mathbf{p}, \mu) \succeq 0$. This optimization can be done iteratively and coordinate-wise, starting with the $\mathbf{p}$ variables and continuing with $\mu$.

1) Optimizing the Dual Function Over $\mathbf{p}$:

a) Continuous rate allocation (CRA): In case of using $\rho_{b,g,m,s}$ as defined in (33), and for a given value of $\mu$, the innermost maximization in (41) provides a multilevel waterfilling closed-form expression for the optimal power allocation given by

$$p^*_{b,g,m,s} = \left[ \frac{N_{sc}\, w_m}{\mu T_o \ln 2} - \frac{N_{sc} \Lambda_m \sigma_\nu^2}{\sigma_{b,g,m,s}} \right]^+, \tag{42}$$

where $[x]^+ \triangleq \max\{0, x\}$. Now, using (42) in (41) yields

$$g(\mathbf{p}, \mu) = \min_{\mu \ge 0} \left\{ \sum_{b=1}^{N_b} \max_{g \in \mathcal{G}_b} \left\{ \sum_{m \in \mathcal{M}_{b,g}} \sum_{s=1}^{S_{b,g,m}} \left( w_m N_{sc}\, \rho^*_{b,g,m,s} - \mu\, p^*_{b,g,m,s} \right) \right\} + \mu P_T \right\}, \tag{43}$$

with

$$\rho^*_{b,g,m,s} = \frac{1}{T_o} \log_2\left( 1 + \frac{\sigma_{b,g,m,s}\, p^*_{b,g,m,s}}{N_{sc} \Lambda_m \sigma_\nu^2} \right). \tag{44}$$
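As a minimal sketch of the closed-form expressions (42) and (44), the following Python fragment computes the per-stream multilevel waterfilling power and the resulting continuous rate for a fixed dual variable. The function and parameter names are illustrative assumptions rather than part of the original design: `sigma` stands for $\sigma_{b,g,m,s}$, `noise_var` for $\sigma_\nu^2$, and `gap` for the coding gap in (33).

```python
import math


def waterfill_power(w, mu, sigma, noise_var, n_sc, t_o, gap=1.0):
    """Per-stream power from (42): [water level - noise floor]^+."""
    level = n_sc * w / (mu * t_o * math.log(2))   # water level set by weight and mu
    floor = n_sc * gap * noise_var / sigma        # effective noise floor of the stream
    return max(0.0, level - floor)


def continuous_rate(p, sigma, noise_var, n_sc, t_o, gap=1.0):
    """Rate from (44), in bits per second per subcarrier."""
    return math.log2(1.0 + sigma * p / (n_sc * gap * noise_var)) / t_o


if __name__ == "__main__":
    # Illustrative (hypothetical) values only.
    p_star = waterfill_power(w=1.0, mu=1e5, sigma=2.0, noise_var=0.1,
                             n_sc=12, t_o=66.67e-6)
    print(p_star, continuous_rate(p_star, 2.0, 0.1, 12, 66.67e-6))
```

A larger weight $w_m$ raises the water level of that user's streams, whereas a larger $\mu$ (a higher price on power) lowers the level for every stream until weak streams are switched off.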

Hence, for a fixed dual variable $\mu$, the subband $b$ will be allocated to multiuser/multimode group $g_b^*$ satisfying

$$g_b^* = \arg\max_{g \in \mathcal{G}_b} \left\{ \sum_{m \in \mathcal{M}_{b,g}} \sum_{s=1}^{S_{b,g,m}} \left( w_m N_{sc}\, \rho^*_{b,g,m,s} - \mu\, p^*_{b,g,m,s} \right) \right\}. \tag{45}$$

b) Discrete rate allocation (DRA): In this case $\rho_{b,g,m,s}$ is a non-differentiable discontinuous function. However, the approach proposed in [42, Chapter 3] can be applied to arrive at the optimal solution. Using (31), the set of non-negative real numbers (i.e., $\mathbb{R}_+$) can be subdivided, for each MS $m$ and subband $b$, into $N_k$ segments

$$\mathbb{R}^{(k)}_{+,b,g,m,s} = \left[ \frac{N_{sc} \sigma_\nu^2\, \Gamma_m^{(k)}}{\sigma_{b,g,m,s}}, \frac{N_{sc} \sigma_\nu^2\, \Gamma_m^{(k+1)}}{\sigma_{b,g,m,s}} \right), \quad k \in \mathcal{N}_k. \tag{46}$$

Furthermore, given that $\mu$ and $p_{b,g,m,s}$ belong to $\mathbb{R}_+$, if a power allocation $p_{b,g,m,s}$ is used such that $\Gamma_m^{(k)} \le \gamma_{b,g,m,s} < \Gamma_m^{(k+1)}$, then

$$w_m N_{sc}\, \rho_{b,g,m,s} - \mu\, p_{b,g,m,s} = w_m N_{sc}\, \varrho_m^{(k)} - \mu\, p_{b,g,m,s} \le w_m N_{sc}\, \varrho_m^{(k)} - \mu \frac{N_{sc} \sigma_\nu^2\, \Gamma_m^{(k)}}{\sigma_{b,g,m,s}}. \tag{47}$$

As a consequence, there only exist $N_k$ candidate power allocations

$$p^*_{b,g,m,s} \in \left\{ \frac{N_{sc} \sigma_\nu^2\, \Gamma_m^{(0)}}{\sigma_{b,g,m,s}}, \ldots, \frac{N_{sc} \sigma_\nu^2\, \Gamma_m^{(N_k - 1)}}{\sigma_{b,g,m,s}} \right\} \tag{48}$$

from which the one maximizing $w_m N_{sc}\, \varrho_m^{(k)} - \mu N_{sc} \sigma_\nu^2\, \Gamma_m^{(k)} / \sigma_{b,g,m,s}$ must be selected, that is,

$$p^*_{b,g,m,s} = N_{sc} \sigma_\nu^2\, \Gamma_m^{(k^*_{b,g,m,s})} / \sigma_{b,g,m,s}, \tag{49}$$

where

$$k^*_{b,g,m,s} = \arg\max_{k \in \mathcal{N}_k} \left( N_{sc}\, w_m\, \varrho_m^{(k)} - \mu N_{sc} \sigma_\nu^2\, \Gamma_m^{(k)} / \sigma_{b,g,m,s} \right). \tag{50}$$

Furthermore, as in the CRA case, given $\mu$ and $p^*_{b,g,m,s}$, the subband $b$ must be allocated to the multiuser/multimode group $g_b^*$ satisfying (45).

2) Optimizing the Dual Function Over $\mu$: Once the optimal vector $\mathbf{p}^*$ is known for a given $\mu$, the dual optimization problem (41) reduces to

$$g(\mu) = \min_{\mu \ge 0} \left\{ \sum_{b=1}^{N_b} \sum_{m \in \mathcal{M}_{b,g_b^*}} \sum_{s=1}^{S_{b,g_b^*,m}} \left( w_m N_{sc}\, \rho^*_{b,g_b^*,m,s} - \mu\, p^*_{b,g_b^*,m,s} \right) + \mu P_T \right\}. \tag{51}$$

Using standard properties of dual optimization problems [42], [48], it can be shown that this problem is convex with respect to $\mu$, and thus, derivative-free line search methods such as golden-section or Fibonacci search can be used to determine $\mu^*$. Once $\mu^*$ has been found, it can be used to obtain the optimal power, subband and rate allocation for each of the data flows in the system.

B. Uniform Power Allocation (UPA)

Optimization problem (34) can be further simplified if we assume that the BS transmit power $P_T$ is uniformly allocated among the $N_b$ system subbands, while adaptive power allocation is still performed among spatial subchannels in each RB. In this case, the whole problem can be expressed as $N_b$ independent optimization subproblems, one for each subband $b$, of the form

$$\begin{aligned} \max_{\mathbf{p}_b \in \mathcal{P}_b} \quad & \sum_{g=1}^{G_b} \sum_{m \in \mathcal{M}_{b,g}} \sum_{s=1}^{S_{b,g,m}} w_m N_{sc}\, \rho_{b,g,m,s} \\ \text{subject to} \quad & \sum_{g=1}^{G_b} \sum_{m \in \mathcal{M}_{b,g}} \sum_{s=1}^{S_{b,g,m}} p_{b,g,m,s} \le P_T / N_b. \end{aligned} \tag{52}$$

Exclusive allocation constraints (i.e., $\mathbf{p}_b \in \mathcal{P}_b$) dictate that subband $b$ must be exclusively allocated to a multiuser/multimode group $g$ and each RB must be exclusively allocated to one of the users in this group. Hence, for a given group $g$, it is straightforward to show that, for the CRA case, the optimal powers that must be allocated to users and spatial modes can be obtained through waterfilling [50], that is,

$$p^*_{b,g,m,s} = \left[ \frac{N_{sc}\, w_m}{\mu_b T_o \ln 2} - \frac{N_{sc} \Lambda_m \sigma_\nu^2}{\sigma_{b,g,m,s}} \right]^+, \tag{53}$$

where $\mu_b$ is the waterfilling level necessary to fulfil the power constraint in (52) with equality. For the DRA case, the set of $N_k$ candidate power allocation values is given by (48), from which the one maximizing $w_m N_{sc}\, \varrho_m^{(k)}$ must be selected, that is,

$$p^*_{b,g,m,s} = N_{sc} \sigma_\nu^2\, \Gamma_m^{(k^*_{b,g,m,s})} / \sigma_{b,g,m,s}, \tag{54}$$

where

$$k^*_{b,g,m,s} = \arg\max_{k \in \mathcal{N}_k} \left( N_{sc}\, w_m\, \varrho_m^{(k)} \right). \tag{55}$$
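The discrete-rate selections in (49)-(50) and (54)-(55) both reduce to a small search over the candidate modes. A minimal sketch for the APA case in (50) is given below; it assumes, for illustration only, that the MCS rates and SNR boundaries are supplied as two equally long Python lists indexed by $k$, with index 0 being the no-transmission mode (zero rate, zero boundary and therefore zero power). All names are hypothetical.

```python
def select_mode(w, mu, sigma, noise_var, n_sc, rates, thresholds):
    """Return (k_star, p_star) following (49)-(50).

    rates[k]      -- transmission rate of mode k (bits/s per subcarrier)
    thresholds[k] -- SNR boundary at which mode k becomes usable
    """
    if len(rates) != len(thresholds):
        raise ValueError("rates and thresholds must have the same length")

    def power(k):
        # Minimum power that reaches the SNR boundary of mode k, as in (48).
        return n_sc * noise_var * thresholds[k] / sigma

    def net_utility(k):
        # Weighted rate minus the price paid for the power, as in (50).
        return n_sc * w * rates[k] - mu * power(k)

    k_star = max(range(len(rates)), key=net_utility)
    return k_star, power(k_star)
```

As $\mu$ grows, the selected mode moves down the MCS ladder and eventually reaches the no-transmission mode, which mirrors the role of $\mu$ as a price on transmit power.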


TABLE II APA-BASED JOINT SCHEDULING-RESOURCE ALLOCATION

TABLE III PER SUB-BAND UTILITY-BASED SUBOPTIMAL USER SELECTION ALGORITHM

Given the formal expressions for the optimal power allocation, subband $b$ must be allocated to group $g_b^*$ satisfying

$$g_b^* = \arg\max_{g \in \mathcal{G}_b} \left\{ \sum_{m \in \mathcal{M}_{b,g}} \sum_{s=1}^{S_{b,g,m}} w_m\, \rho^*_{b,g,m,s} \right\}, \quad \forall\, b. \tag{56}$$
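To illustrate how the outer search over $\mu$ in (51) can be organized for the APA/CRA case, the sketch below evaluates the dual function with the closed-form waterfilling of (42) and (44) and the per-subband group selection of (45), and then minimizes it with a golden-section search. It is only a sketch under stated assumptions: the constants, the data layout (a list of subbands, each a list of candidate groups, each a list of `(w, sigma)` stream tuples) and the search bracket are hypothetical choices.

```python
import math

# Illustrative (hypothetical) system constants.
N_SC, T_O, NOISE_VAR, GAP = 12, 66.67e-6, 1.0, 1.0


def stream_term(w, sigma, mu):
    """Return (w*N_sc*rho* - mu*p*, p*) for one stream, using (42) and (44)."""
    level = N_SC * w / (mu * T_O * math.log(2))
    p = max(0.0, level - N_SC * GAP * NOISE_VAR / sigma)
    rho = math.log2(1.0 + sigma * p / (N_SC * GAP * NOISE_VAR)) / T_O
    return w * N_SC * rho - mu * p, p


def dual(mu, subbands, p_total):
    """Dual function: best group per subband as in (45), plus mu * P_T."""
    total = mu * p_total
    for groups in subbands:
        total += max(sum(stream_term(w, s, mu)[0] for w, s in grp)
                     for grp in groups)
    return total


def golden_section_min(f, lo, hi, tol=1e-6, max_iter=200):
    """Derivative-free minimization of a convex (unimodal) f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    for _ in range(max_iter):
        if hi - lo <= tol * max(1.0, abs(lo)):
            break
        if fc < fd:            # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
        else:                  # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)
```

For a single subband with one group and one stream, the minimizer is the value of $\mu$ at which the waterfilling power equals the power budget, which provides a convenient sanity check of the search.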

V. UTILITY-BASED SUBOPTIMAL GREEDY MULTIUSER/MULTIMODE SELECTION ALGORITHM

For each scheduling interval, the BS must determine the optimum power allocation vector $\mathbf{p}^*$ in (25), recalling that, as already stated in Section III, power allocation implicitly results in the allocation of resource blocks and transmission rates. Based on $\mathbf{p}^*$, the QSI and average allocated rate per user are updated accordingly. For the sake of clarity, this procedure is summarized in algorithmic form in Table II for the APA case (the UPA case avoids the Fibonacci-related iterations in Step 2). Note that a crucial process in Table II is Step 2.b, whose algorithmic details are fully described in Table III. This step is in charge of determining the per-subband optimal multiuser/multimode group $g_b^*$ in (45) and (56). To this end, the multiuser/multimode group maximizing a utility function, defined as $U_b$ in Table III, over all the groups in the set $\mathcal{G}_b$ must be found. Hence, it is necessary to determine the number $G_b$ of multiuser/multimode groups in $\mathcal{G}_b$, and the composition, in terms of users and spatial modes per user, of each of these groups. The answer to these questions depends not only on the number of active MSs in the system, the number of transmit antennas at the BS and the number of receive antennas at each of the active MSs, but also on the strategy used to implement the MU-MIMO block diagonalization (i.e., BD-RAS or BD-CTR). As an illustrative example, Table IV shows all the BD-CTR MU-MIMO configurations for a BS with $N_T = 4$ transmit antennas, and the corresponding possibly optimal distributions of per-user allocated spatial streams. In this case, assuming, for example, that every MS has and uses $N_R^{(m)} = 2$ receive antennas, then, using the corresponding results in Table IV, yields

$$G_b = 2\binom{N_m}{1} + 4\binom{N_m}{2} + 4\binom{N_m}{3} + \binom{N_m}{4}.$$

Irrespective of the MU-MIMO block diagonalization approach, a brute-force exhaustive search over $\mathcal{G}_b$, for each of the subbands in $\mathcal{N}_b$ and for each iteration in the optimization algorithm, is computationally prohibitive even for modest values of $N_m$ and $N_T$. In fact, as stated by Shen et al. in [10], there are BD strategies for which the exhaustive search

method over each of the OFDMA subbands and each iteration would need to consider roughly $O\left(N_m^{\lceil N_T / N_R \rceil}\right)$ possible user sets, where $\lceil \cdot \rceil$ denotes the ceiling operator and, for simplicity, it has been assumed that $N_R^{(m)} = N_R$ for all $m \in \mathcal{N}_m$. In order to simplify the search for the optimal multiuser/multimode group, and grounded on both the suboptimal user/mode selection algorithms hinted at by Yoo and Goldsmith in [5, Sect. VIII] and the capacity-based user selection algorithm proposed by Shen et al. in [10, Sect. III.A], a suboptimal approach is presented in this section whose global complexity increases linearly with $N_T$ and $N_R^{(m)}$ $\forall\, m \in \mathcal{N}_m$. Our utility-based suboptimal user selection algorithm is described in Table III. In words, the algorithm first initializes a set $\mathcal{U}_b$ containing all the potential user/mode pairs over subband $b$. In the first iteration it selects the user/mode pair $(m', \alpha')$ providing the highest utility, removes it from the set $\mathcal{U}_b$, and updates the set of selected users with $m'$ and the corresponding set of selected modes with $\alpha'$. In the next iterations, and from


TABLE IV BD-CTR MU-MIMO CONFIGURATIONS WITH $N_T = 4$

the remaining unselected user/mode pairs in $\mathcal{U}_b$, it finds the user/mode pair that provides the highest total utility together with those already selected user/mode pairs. Note that the addition of a new user/mode pair modifies the null space of all the other users in the system and, hence, the zero multiuser interference constraint requirement may result in a decrease in the total utility. As a consequence, the algorithm terminates either when the number of allocated user/mode pairs reaches its maximum (i.e., $\min\{N_T, \sum_{m \in \mathcal{N}_m} N_R^{(m)}\}$) or when the total utility drops if more user/mode pairs are selected. The proposed algorithm needs to search over no more than $N_T \sum_{m \in \mathcal{N}_m} N_R^{(m)}$ user/mode pairs, thus greatly reducing its complexity when compared with that of the exhaustive search method. Note that the use of a greedy algorithm such as the one just described does not compromise the fairness among users, as this is solely dependent on the weighting coefficients determined by the scheduling policy in use.
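The greedy procedure described above can be sketched in a few lines of Python. This is not a reproduction of Table III: the `utility` callback is a hypothetical placeholder for the per-subband utility, which in the actual design would involve recomputing the BD precoders and the weighted rates for every tentative set of user/mode pairs.

```python
def greedy_select(candidates, utility, max_pairs):
    """Greedy user/mode selection.

    candidates -- iterable of (user, mode) pairs available on the subband
    utility    -- callable mapping a list of pairs to their total utility
                  (hypothetical stand-in for the utility of Table III)
    max_pairs  -- upper limit on the number of selected pairs
    """
    remaining = list(candidates)
    selected = []
    best_total = float("-inf")
    while remaining and len(selected) < max_pairs:
        # Pair that gives the highest total utility with those already chosen.
        pair = max(remaining, key=lambda c: utility(selected + [c]))
        total = utility(selected + [pair])
        if total < best_total:
            break              # adding any further pair lowers the utility
        selected.append(pair)
        remaining.remove(pair)
        best_total = total
    return selected, best_total
```

Each iteration evaluates the utility once per remaining pair, so the number of evaluations grows with the number of candidate pairs times the number of selected pairs rather than with the number of possible groups.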

VI. NUMERICAL RESULTS

The numerical results shown in this section serve three different purposes. First, to illustrate the flexibility of the proposed optimization design while showing the performance of different designs. Secondly, to demonstrate the merits of the proposed greedy algorithm to perform the user/stream selection process. Lastly, to highlight the benefits of the MU-MIMO mechanism in terms of realistic network metrics such as throughput, delay, Jain's fairness index (JFI) or service coverage, the latter defined as the percentage of users achieving their QoS requirements. To this end, the downlink of a single-cell MIMO-OFDMA wireless network with a cell radius of 500 m has been simulated with a BS equipped with $N_T = 4$ antennas serving a set of $N_m$ MSs uniformly distributed over the whole coverage area. The entire system bandwidth is $B = 10$ MHz (9 MHz occupied bandwidth), and is divided into $N_b = 50$ orthogonal subbands, each with a bandwidth $B_b = 180$ kHz and consisting of $N_{sc} = 12$ adjacent subcarriers. Transmission


between the BS and active MSs is organized in time slots of duration $T_s = 0.5$ ms, and each of these slots consists of $N_o = 7$ OFDM symbols of duration (without considering the cyclic prefix) $T_o = 66.67$ μs, that is, the reciprocal of the 15 kHz subcarrier spacing implied by 12 subcarriers per 180 kHz subband. Thus, an RB is formed by 12 adjacent subcarriers and 7 OFDM symbols. A wireless channel including the path-losses, shadowing effects and frequency-, time- and space-selective fading experienced by the transmitted signal on its way from the BS to the MSs has been generated conforming to the Extended Typical Urban (ETU) channel model defined within LTE [51] with a shadow fading standard deviation of 6 dB. We note that these parameters, except for minor implementation details, are illustrative of the typical values found in LTE/LTE-Advanced [38]. The maximum BLER has been set to $\check{\epsilon}_m = 0.1$ for all scenarios. For the simulation results in which discrete rate allocation (DRA) is used, MCSs defined within LTE are employed whose transmission rates $\varrho_m^{(n)}$ are specified in [52, Table 7.2.3-1] and lie in the range (0.15, 5.55) bits/s/Hz. The parameters $\kappa_1^{(n)}$, $\kappa_2^{(n)}$ and $\gamma_m^{(n)}$ in (29) have been extracted from [53, Table I]. Without loss of generality, it is assumed that correlation matrices $\mathbf{R}_{RX,m}$ (or $\mathbf{R}_{TX,m}$) are constructed by setting the $(i,j)$-entry to $\rho_{RX}^{|i-j|}$ (or $\rho_{TX}^{|i-j|}$), where $0 < \rho_{RX}, \rho_{TX} \le 1$ serve to characterize the correlations of the receive and transmit antenna arrays. Unless otherwise noted, the correlation of the Tx and Rx antenna arrays is defined by setting $\rho_{TX} = \rho_{RX} = 0.25$. To demonstrate the ability of the proposed framework to schedule and allocate resources to service flows with different QoS requirements, RT, nRT and BE traffic classes are considered. Without loss of generality, traffic arrivals have been modeled as Poisson random variables, with a mean that depends on the average arrival rate per flow (measured in bits per second).
RT, nRT and BE traffic flows have been characterized by maximum allowable delays of 50, 100 and 300 ms, and delay outage probabilities of 0.01, 0.1 and 0.1, respectively, in line with LTE/LTE-A specifications (see [38, Table 13.1]). For the first set of numerical results, and without loss of generality, BD-CTR with CRA and UPA across subbands and using PF-based scheduling has been chosen. Figures 1a and 1b present the attained average throughput as a function of the incoming traffic when comparing the performance of BD-based MU-MIMO with that of SU-MIMO based on maximum ratio transmission-combining (MRT-MRC) for the case of N R = 1 and N R = 2 receive antennas, respectively, and with the number of users in the system as parameter. In the case of MU-MIMO, both greedy and optimal user selection algorithms are tested. Note that a throughput performance curve with a slope of 1 indicates that all incoming traffic is able to be served whereas the point where a given curve departs from the slope-1 line is indicative that a particular scheme/configuration is starting to accumulate the incoming traffic in the queues. It can be observed that irrespective of the technique employed, increasing the number of users in the system leads to a per-flow system degradation as the limited space-frequency resources need to be shared among a larger number of incoming flows. Nonetheless, some important facts are worth mentioning regarding the particular performance of the different techniques. First, the more users in the system, the larger the advantage MU-MIMO offers in comparison to SU-MIMO. This is because a larger number


Fig. 1. MU-MIMO vs SU-MIMO

Fig. 2. Performance comparison of different MU-MIMO configurations

of users makes it possible for the MU-MIMO scheme to find a set of users which, after the block diagonalization process, ends up having advantageous SNR levels in comparison to SU-MIMO, thanks to a proper exploitation of both the multiuser and spatial diversities. Second, although the advantage of MU-MIMO is significant for both $N_R = 1$ (Fig. 1a) and $N_R = 2$ (Fig. 1b), it is in the case of single-antenna receivers that the advantage becomes more noticeable. This is in line with the general trend of moving most of the complexity to the BS side with the aim of simplifying the MS architecture [4]. Finally, it is important to recognize in both figures that the performance of the proposed greedy algorithm to select users/streams almost coincides with that of the optimal selection^4. Note that this is consistent with the performance obtained when using greedy algorithms in the context of single-carrier systems [9], thus reinforcing the efficacy of the greedy approach introduced here to exploit the frequency diversity dimension provided by the use of multicarrier architectures.

4 Further experiments, not shown here due to lack of space, have shown that the optimal user selection can indeed provide a somewhat clearer advantage over the greedy approach when using maximum SNR (MSNR) scheduling. However, MSNR is seldom used in practical deployments given the total lack of fairness it induces among the users in the system.

Focusing now on the MU-MIMO performance, Fig. 2 presents results for different antenna configurations and different power and bit allocation strategies. In particular, Fig. 2a shows results for various antenna configurations when employing either BD-CTR or BD-RAS. Remarkably, in most situations the simpler RAS processing attains almost the same throughput performance as the CTR scheme except for the scenarios where the MSs have more antennas than the BS, as in this case the receiver processing becomes more significant given the limited capabilities of the transmitter. Note, however, that this is a rare situation since it is generally desirable to move most of the processing burden (i.e., more antennas, more complexity) to the BS side with the objective of simplifying the MSs' operation. Results in this figure have been obtained for the default correlation values ($\rho_{TX} = \rho_{RX} = 0.25$); however,


Fig. 3. Heterogeneous throughput, JFI and service coverage for PF, EXP and MLWDF scheduling.

for completeness, results for the BD-RAS setup with N T = 4, N R = 2 are also shown (square markers) for the case of larger correlations (ρT X = ρ R X = 0.85) suggesting that, except for very rare pathological situations, correlation has a small effect

on the throughput performance of the overall system. Figure 2b explores the performance of discrete and continuous rate allocation (DRA vs CRA), and also uniform and adaptive power allocation (UPA vs APA). Logically, CRA schemes, actually


approaching the Shannon capacity bounds, clearly outperform their DRA-based counterparts, an effect already observed in SU-MIMO architectures [19]. Note that in both rate allocation strategies, CRA and DRA, using APA leads to a gain in throughput with respect to UPA, but the gain is much more noticeable for the case of DRA. This remarkable result suggests that, to optimize throughput performance in practical (DRA-based) systems, either APA must be conducted or the use of UPA must be combined with a rich set of transmission modes to guarantee a negligible loss with respect to optimal power allocation. The higher influence of APA in the DRA-based scheme stems from the fact that APA assigns to each user/stream the minimum amount of power needed to operate on the most adequate transmission mode (see (48)-(50)), hence avoiding the power waste that UPA inevitably incurs; this saved power potentially allows one or more users/streams to use a higher transmission mode, thus eventually leading to an increased throughput performance. The second simulation scenario is defined by a collection of users with different QoS requirements and it serves to illustrate the merits of the proposed framework to assess the overall network performance from different points of view. In particular, it is assumed that there are $N_m^{RT} = N_m^{nRT} = N_m^{BE} = 4$ users, thus totalling $N_m = 12$ different MSs in the system. Without loss of generality, this section assumes the utilization of discrete rate allocation with uniform power allocation (DRA+UPA). In order to further highlight the benefits of MU-MIMO when looking at practical network metrics, results for the SU-MIMO scheme are also shown. Figure 3 shows the throughput (left column), JFI (central column) and service coverage (right column) for the PF, EXP and MLWDF scheduling rules. Focusing first on the throughput results, these plots present the average throughput per flow as a function of the incoming arrival rate for the different traffic classes (RT, nRT and BE).
Note that while PF does not differentiate among the three classes of users, since scheduling decisions are taken only on the basis of previously allocated throughputs, EXP and MLWDF do indeed cause the three user classes to behave markedly differently. Notably, in both EXP and MLWDF, when the arrival rate increases, the performance of nRT and BE users drops dramatically as the scheduler tries to satisfy the tight delay constraints of RT users, whose throughput is in turn favoured. In fact, these schedulers try to accommodate all user classes but, when running short of resources, BE users first and nRT users next begin to starve. Remarkably, it can be seen that the EXP scheduler ends up completely neglecting BE users. An obvious trend regardless of the scheduling technique in use is the advantage offered by the MU-MIMO component. Complementing throughput outcomes, the plots in the middle column of Fig. 3 present the JFI results regarding both the throughput and delay performance among the different users. For the sake of presentation clarity, and without loss of generality, the results shown here correspond to the RT traffic class. Notice how the EXP scheduler (Fig. 3b) can be regarded as the fairest one both in terms of throughput and delay, with the JFI for both metrics approaching one regardless of the arrival rate and MIMO technique employed. In contrast, PF and MLWDF can be seen to suffer a considerable degradation in fairness for low to medium arrival rates for the case of delay (e.g. users


within the RT traffic class experience rather different delays) whereas in terms of throughput, fairness can be seen to moderately decay with increasing arrival rate. Further insight is obtained from jointly looking at the throughput and JFI results. Concentrating on the RT traffic class, note that although MLWDF outperforms EXP in terms of throughput, this advantage comes at the cost of sacrificing fairness. Comparing now MU-MIMO with SU-MIMO and focusing on the PF scheduler (Fig. 3a), it might appear that SU-MIMO is advantageous when only considering the fairness point of view. However, the corresponding throughput plot reveals that fairness is achieved at the cost of a very significant drop in transmitted rate (i.e., throughput is equally poor for all users, thus increasing the fairness among them). Finalizing our study on heterogeneous traffic, the right column in Fig. 3 presents the service coverage for the three considered schedulers. Clearly, regardless of the scheduling policy, MU-MIMO is found to significantly outperform SU-MIMO. Nevertheless, note that for EXP and MLWDF, when operating at low average arrival rates for which a service coverage of 100% can be guaranteed (target operating zone), the MU-MIMO scheme supports roughly double the maximum average incoming rate supported by the SU-MIMO system. This is not the case for the PF scheduler, for which in the target operating zone the use of MU-MIMO barely provides additional service coverage guarantees when compared to SU-MIMO. That is, the PF scheduling rule violates the QoS restrictions even for the lowest rates, thus supporting the claim that QSI-unaware schedulers are ineffective in supporting any prescribed delay constraint. In fact, whereas PF provides service to all users, irrespective of their traffic class, at the cost of neglecting the specific QoS requirements, EXP and MLWDF sacrifice overall throughput when pursuing the fulfilment of QoS constraints.

VII. CONCLUSION

This work has proposed an analytical scheduling and resource allocation design for MU-MIMO-OFDMA networks based on BD considering the steps of subcarrier and power allocation (in both spatial and frequency domains), scheduling policy, rate allocation and user/stream selection. The framework is general enough to allow the evaluation of multiple techniques for each processing step while assuming heterogeneous users with different traffic requirements. Moreover, the design could easily be extended to incorporate other MU-MIMO techniques such as ZFBF. Remarkably, the possibility of using either continuous or discrete rate allocation allows the assessment of the different techniques in terms of capacity or more realistic throughput-related metrics. Building on the scheduling and resource allocation design, a novel user/stream selection technique has been introduced that, relying on greedy principles, has been shown to perform close to optimality under all configurations at a substantially lower complexity. Overall, numerical results serve to demonstrate the advantage offered by MU-MIMO processing over SU-MIMO, both from a capacity point of view and in terms of practical network metrics such as throughput or service coverage. Focusing on MU-MIMO

FEMENIAS AND RIERA-PALOU: SCHEDULING AND RESOURCE ALLOCATION

results, it has been observed that RAS performs very close to BD-CTR without the need to coordinate the MU-MIMO processing at both transmission ends, thus making RAS the preferred configuration when dealing with multiple-antenna MSs. Simple uniform power allocation has been found to attain a performance similar to that of optimally adaptive power allocation provided that a rich set of transmission modes is available. In fact, this difference nearly vanishes when considering continuous (e.g., capacity-based) transmission. Results using different schedulers have shown the different performance trade-offs that can be achieved in terms of throughput, fairness and service coverage. Remarkably, it has been clearly shown that the use of optimal/suboptimal SRA strategies on the downlink of an MU-MIMO-OFDMA physical layer allows a proper exploitation of the various diversity degrees (i.e., multiuser, space and frequency) present in the system. Further research work will concentrate on the extension of the framework to multicell scenarios where intercell interference is taken into account in the design.
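
As an illustrative aside, the throughput/fairness trade-off discussed above can be made concrete with Jain's fairness index (JFI), computed here with its standard definition J = (sum of x_i)^2 / (n * sum of x_i^2). The following is a minimal sketch only: the function name and the per-user throughput values are hypothetical and are not taken from the paper's simulations. It shows how an allocation in which throughput is equally poor for all users attains a perfect JFI while delivering far less total throughput than a less fair allocation.

```python
def jain_fairness_index(throughputs):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2), ranging from 1/n to 1."""
    n = len(throughputs)
    sum_sq = sum(x * x for x in throughputs)
    if n == 0 or sum_sq == 0:
        raise ValueError("need at least one nonzero throughput value")
    total = sum(throughputs)
    return total * total / (n * sum_sq)


# Hypothetical per-user throughputs in Mbit/s (illustrative assumptions only).
equal_but_poor = [0.5, 0.5, 0.5, 0.5]    # every user served equally badly
unequal_but_high = [4.0, 3.0, 2.0, 1.0]  # higher rates, unevenly shared

jfi_poor = jain_fairness_index(equal_but_poor)
jfi_high = jain_fairness_index(unequal_but_high)

print(f"equal/poor:   JFI = {jfi_poor:.3f}, total = {sum(equal_but_poor):.1f} Mbit/s")
print(f"unequal/high: JFI = {jfi_high:.3f}, total = {sum(unequal_but_high):.1f} Mbit/s")
```

Under these assumed values the first allocation is perfectly fair but carries one fifth of the aggregate throughput of the second, which is why fairness results are best read jointly with the throughput results.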

Guillem Femenias (M’91–SM’11) received the telecommunication engineer degree and the Ph.D. degree in electrical engineering from the Technical University of Catalonia (UPC), Barcelona, Spain, in 1987 and 1991, respectively. From 1987 to 1994, he worked as a Researcher with UPC, where he became an Associate Professor in 1992. In 1995, he joined the Department of Mathematics and Informatics, University of the Balearic Islands (UIB), Mallorca, Spain, where he became a Full Professor in 2010. He is currently leading the Mobile Communications Group at UIB, where he has been the Project Manager of projects ARAMIS, DREAMS, DARWIN, MARIMBA, COSMOS, and ELISA, all funded by the Spanish and Balearic Islands Governments. In the past, he was also involved with several European projects (ATDMA, CODIT, and COST). His research interests include digital communications theory and wireless communication systems, with particular emphasis on cross-layer transceiver design, resource management, and scheduling strategies applied to fourth- and fifth-generation wireless networks. On these topics, he has authored more than 90 journal and conference papers, as well as some book chapters. Dr. Femenias has served for various IEEE conferences as a Technical Program Committee Member, and as the Publications Chair for the IEEE 69th Vehicular Technology Conference (VTC-Spring, 2009). He was the recipient of the Best Paper Award at the 2007 IFIP International Conference on Personal Wireless Communications and the 2009 IEEE Vehicular Technology Conference (Spring).

Felip Riera-Palou (M’99–SM’11) was born in Palma, Mallorca, Spain, in 1973. He received the B.S./M.S. degrees in computer engineering from the University of the Balearic Islands (UIB), Mallorca, Spain, in 1997, the M.Sc. and Ph.D. degrees in communication engineering from the University of Bradford, Bradford, U.K., in 1998 and 2002, respectively, and the M.Sc. degree in statistics from the University of Sheffield, Sheffield, U.K., in 2006. From May 2002 to March 2005, he was with Philips Research Laboratories, Eindhoven, The Netherlands, first as a Marie Curie Postdoctoral Fellow (European Union) and later as a member of technical staff. While at Philips, he worked on research programs related to wideband speech/audio compression and speech enhancement for mobile telephony. From April 2005 to December 2009, he was a Research Associate (Ramon y Cajal program, Spanish Ministry of Science) in the Mobile Communications Group, Department of Mathematics and Informatics, UIB. Since January 2010, he has been an Associate Research Professor (I3 Program, Spanish Ministry of Education) with UIB. His research interests include signal processing and wireless communications.