IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS,
VOL. 24,
NO. 9,
SEPTEMBER 2013
1727
Coding-Aware Proportional-Fair Scheduling in OFDMA Relay Networks
Bin Tang, Student Member, IEEE, Baoliu Ye, Member, IEEE, Sanglu Lu, Member, IEEE, and Song Guo, Senior Member, IEEE
Abstract—In recent years, OFDMA relay networks have become a key component in the 4G standards (e.g., IEEE 802.16j, 3GPP LTE-Advanced) for broadband wireless access. When numerous bidirectional flows pass through the relay stations in an OFDMA relay network that supports various interactive applications, plenty of network coding opportunities arise and can be leveraged to enhance the throughput. In this paper, we study the proportional-fair scheduling problem in the presence of network coding in OFDMA relay networks. Considering the tradeoff between performance and overhead, we propose two models, the global approach (GA) and the local approach (LA), under which the corresponding problems are both shown to be NP-hard. For the GA model, we show that the problem cannot be approximated within some constant factor. Hence, we propose a heuristic algorithm with low time complexity. For the LA model, we propose a theoretical polynomial time approximation scheme (PTAS), and also present a practical greedy algorithm with an approximation factor of 1/2. Simulation results show that our algorithms can achieve significant throughput improvement over a state-of-the-art noncoding scheme.
Index Terms—Network coding, OFDMA relay networks, proportional-fair scheduling, approximation algorithm
1 INTRODUCTION
Orthogonal Frequency Division Multiplexing (OFDM) is a digital modulation scheme that can combat multipath fading/interference robustly, achieve high spectral efficiency, and be easily implemented using the Fast Fourier Transform (FFT) algorithm [1]. In the last decade, as a multiuser version of OFDM, Orthogonal Frequency Division Multiple Access (OFDMA) has become a promising technology for supporting broadband wireless access and has been adopted by fourth generation standards including IEEE 802.16e and 3GPP Long Term Evolution (LTE). Generally, a typical OFDMA-based cellular network has a high bandwidth and is expected to support various bandwidth-intensive applications; however, it usually has a limited communication range, and often suffers from coverage holes. A popular cost-effective approach for extending the range and filling up the holes is to add relay stations (RSs) between the base station (BS) and mobile stations (MSs, or users), thereby making the network multihop in essence. Recently, two-hop OFDMA relay networks (see, e.g., Fig. 1a) have become a dominant component in some emerging fourth generation standards such as IEEE 802.16j [2] and 3GPP LTE-Advanced [3].

In two-hop OFDMA relay networks, the prescribed frequency band is divided into multiple narrow subchannels, and time is partitioned into frames of multiple time slots. The scheduling, carried out by the BS, allocates subchannels across the two hops over time slots on a frame basis, such that all MSs can be served in an efficient and fair manner. The two-hop nature, on one hand, makes the scheduling problem difficult to deal with [4]; on the other hand, it also brings opportunities to employ network coding, which has been proven to be an effective approach to improve the network throughput [5], [6]. As various interactive applications (e.g., online gaming, peer-to-peer streaming, and so on) are supported by OFDMA relay networks, numerous bidirectional flows involved in these applications pass through RSs, resulting in plenty of network coding opportunities that can be leveraged to enhance the throughput. This benefit of network coding is illustrated in Figs. 1b and 1c via a simple packet exchange scenario. However, most existing solutions simply schedule downlink and uplink traffic separately, thereby missing such coding opportunities.

Fig. 1. Illustration of network model and the benefit of network coding. (a) A two-hop network model. (b) Without network coding, the packet exchange requires four transmissions. (c) With network coding, RS combines the two received packets by XOR (exclusive or) operation, and broadcasts the coded packet. Both BS and MS can recover each other’s packet by XORing again with their own packet. This process only takes three transmissions. The saved transmission can be used for new data to achieve an increased throughput.

While network coding could potentially improve the network throughput, it would hurt the throughput if used blindly. Taking Fig. 1c as an example, we consider the coded data to be broadcast to both the BS and MS over a given subchannel. To ensure that both the BS and MS can receive the data successfully, the achieved broadcast rate is bounded by the lower of the rates over links (RS, BS) and (RS, MS), leading to a diminished throughput if one of the link rates is very low. Therefore, the scheduling decision should be made under a well-devised network coding mechanism.

In this paper, we address the scheduling problem in the presence of network coding in two-hop OFDMA relay networks. We adopt the widely used proportional-fair scheduling policy [7], which provides a good balance between network throughput and system fairness. Given the tradeoff between performance and overheads, we consider two models for supporting coding-aware scheduling decisions, both of which have been accepted by the community. One is called the global approach (GA), which has been comprehensively studied in coding-oblivious scheduling problems, for example, in [4], [7]. Our paper extends this model to coding-aware scheduling in OFDMA relay networks. The other is named the local approach (LA), which is essentially the classic round-robin scheduling scheme and has also been widely adopted, for example, in [8]. We study it with the target of proportional fairness in this paper. Both models exploit the benefits of network coding and frequency selectivity. In particular, the GA model achieves a higher performance by exploiting the multiuser diversity gain, but incurs a significant overhead due to the collection of link rates over the whole network; in contrast, the LA model processes the scheduling in a simplified and local fashion, and thus introduces a much lower overhead at the cost of sacrificing the multiuser diversity.

For the proportional-fair scheduling problems under both models, we establish their hardness and propose efficient polynomial time algorithms. Our main contributions are summarized as follows:
1. Under the GA model, we prove that the coding-aware scheduling problem is NP-hard and that no polynomial time approximation scheme (PTAS)¹ exists. Then, we propose a heuristic algorithm with low time complexity.
2. Under the LA model, we show its NP-hardness and present a theoretical PTAS. We also propose a practical greedy algorithm with an approximation factor of 1/2.
3. Our algorithms are evaluated via simulations to highlight the benefit of network coding. Simulation results show that our algorithms are close to optimal and can achieve about 10-30 percent throughput improvement over a state-of-the-art noncoding scheme.

• B. Tang, B. Ye, and S. Lu are with the National Key Laboratory for Novel Software Technology, Nanjing University, Xianlin Campus, 163 Xianlin Avenue, Qixia District, Nanjing 210046, China. E-mail: [email protected], {yebl, sanglu}@nju.edu.cn.
• S. Guo is with Performance Evaluation Lab, School of Computer Science and Engineering, The University of Aizu, Tsuruga, Ikki-machi, Aizu-Wakamatsu City, Fukushima 965-8580, Japan. E-mail: [email protected].
Manuscript received 25 Oct. 2011; revised 8 June 2012; accepted 2 Sept. 2012; published online 11 Sept. 2012. Recommended for acceptance by M. Guo. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TPDS-2011-10-0789. Digital Object Identifier no. 10.1109/TPDS.2012.269. 1045-9219/13/$31.00 © 2013 IEEE. Published by the IEEE Computer Society.
1. For a maximization problem, a PTAS is an algorithm which takes a parameter $\epsilon > 0$ as an input, and produces a solution that is within a factor $1 - \epsilon$ of being optimal in polynomial time.
The remainder of the paper is organized as follows: Section 2 discusses related work, and Section 3 describes the system model. Coding-aware scheduling problems under the GA model and the LA model are investigated in Sections 4 and 5, respectively. Performance evaluation of our algorithms is presented in Section 6. Finally, Section 7 concludes the paper.
2 RELATED WORK
In the context of OFDMA-based single-hop networks, the scheduling problem has been extensively studied, for example, in [9], [10], [11], [12], [13]. However, these approaches cannot be easily applied to OFDMA relay networks, where RSs are introduced to improve network coverage and capacity [14], [15], making the network essentially multihop. Therefore, great efforts have been made in designing efficient scheduling algorithms for OFDMA relay networks. In [4], the authors propose several scheduling algorithms to exploit multiuser diversity, frequency selectivity, and spatial reuse in OFDMA-based two-hop relay networks. They also present algorithms for low overhead scheduling in [16]. In [17], the authors study the downlink scheduling problem to exploit the multiuser diversity and frequency selectivity in multihop OFDMA relay networks, with an emphasis on IEEE 802.16j-based networks. In [18], the authors investigate the QoS-aware scheduling problem for multihop relay networks. As these efforts focus on the downlink scheduling and consider the uplink scheduling as a symmetric case, none of them exploits the benefit of network coding when bidirectional traffic is taken into account. In contrast, we propose network coding-aware scheduling algorithms based on a framework in [4] that we extend to accommodate network coding in OFDMA relay networks. Its significant benefit is demonstrated by the superior performance of our algorithms over the state-of-the-art coding-oblivious scheduling algorithm DIV1 [4].

Network coding, first proposed by Ahlswede et al. [5], has become a promising approach to improve the performance of wireless networks. In [6], the authors propose a practical XOR-based network coding scheme named COPE to improve the throughput of multiple unicast flows in 802.11-based wireless mesh networks. Compared to the extensive research on coding algorithms (e.g., [19]), coding-aware routing (e.g., [20], [21]), and so on, in 802.11-based wireless networks, only a few works show the potential benefits of network coding in OFDMA-based networks. In [23], the authors study the network coding-aware scheduling problem while making use of the multiuser diversity and frequency selectivity in OFDMA-based single-hop cellular networks, where coding is performed by the BS when messages are exchanged between MSs. This work cannot be directly extended to two-hop networks, where coding is used for reducing traffic at RSs. The authors in [22] and [25] show that network coding can achieve a significant throughput gain when the cooperative diversity is considered as well in relay-assisted OFDMA networks. The authors in [24] study the joint routing and resource allocation problem in the presence of network coding with the objective of maximizing the scaling factor for some given traffic pattern. In [26], the authors also propose a network coding-based opportunistic scheduling scheme for OFDMA relay networks, but they allow data to be stored at RSs over time, which is inconsistent with the simplicity assumption of RSs. All these proposals in [25], [24], [26] focus on maximizing the aggregate throughput over all MSs without taking fairness into account. Besides, they assume that each scheduling unit can be continuously partitioned with no constraints imposed by the frame structure in OFDMA relay networks. In contrast, we adopt proportional fairness as the objective metric for the frame-based scheduling problem such that MSs can be served in an efficient and fair manner.
3 SYSTEM DESCRIPTION
3.1 Network Model
We consider a two-hop OFDMA relay network as shown in Fig. 1a, in which the network consists of one BS, a set of RSs $\mathcal{R}$ ($|\mathcal{R}| = R$), and a set of MSs $\mathcal{M}$ ($|\mathcal{M}| = M$). Each MS connects to the BS directly or via an RS according to some predetermined routing scheme, such as using the transmission time based metric [27]. The prescribed frequency band is divided into a set of multiple orthogonal subchannels $\mathcal{C}$ ($|\mathcal{C}| = C$) that are allowed to be used by all stations. We consider a synchronized, time-slotted, frame-based OFDMA relay network, where each station has only one transceiver and, hence, cannot transmit and receive concurrently.

As shown in Fig. 2, each frame consists of $T$ time slots and $C$ subchannels. The scheduling, usually performed by the BS at the beginning of each frame, allocates the two-dimensional resources to users in an efficient and fair manner. To leverage the benefit of network coding, the frame is partitioned into three sequential subframes named downlink subframe, uplink subframe, and relay subframe. As the names suggest, the downlink subframe is used by the BS to transmit data to RSs, the uplink subframe is used by MSs to upload data to RSs, and the relay subframe is used by RSs to forward the received data to the respective MSs and BS with or without network coding. Unlike the scheme proposed in [25], which partitions the whole frame into equal-sized subframes, our partition allows the subframes to have different sizes. The optimal partition can be found based on the proposed algorithms by exhaustive search, at the cost of increasing the time complexity by a factor of $O(T^2)$. The XOR-based network coding operations are performed as illustrated in Fig. 1c, in which each coded packet is attached with the required information to guarantee the decoding at its intended receivers.

Fig. 2. Illustration of frame structure in a two-hop OFDMA relay network. In this example, both the downlink and uplink subframes have four time slots, while the relay subframe has six time slots.

Similar to [12], [4], for ease of presentation, we use the following equivalent technical treatments:
• We consider that each MS connects to the BS via an RS. For those MSs with direct connections with the BS, we can separate their traffic scheduling from others by assigning different frames to them.
• We consider only one time slot for each subframe. The subchannels of the same subframe in other time slots can be seen as additional subchannels available to the considered time slot. In such a way, all subchannels for downlink, uplink, and relay subframes are denoted as $\mathcal{C}_i$ ($|\mathcal{C}_i| = C_i$) with $i = 1, 2$, and $3$, respectively.

Finally, to indicate the subchannel allocation, we introduce binary variables $I_i(m, c)$, each of which is equal to one if subchannel $c \in \mathcal{C}_i$ is allocated to MS $m$, and equal to zero otherwise, in the downlink subframe ($i = 1$) or uplink subframe ($i = 2$). If $I_i(m, c) = 1$, the corresponding link rate is denoted as $r_i(m, c)$. Similarly, $I_3(m, c, k)$ represents a working mode $k$, under which subchannel $c \in \mathcal{C}_3$ is allocated to forward downlink traffic ($k = 1$), to forward uplink traffic ($k = 2$), or to broadcast traffic with coded data ($k = 3$) for MS $m$ with the corresponding rate $r_3(m, c, k)$. To make sure both BS and MS $m$ can receive the broadcast data successfully, the broadcast rate is constrained by the lower achievable rate for unicasts, i.e.,

$$r_3(m, c, 3) = \min\{r_3(m, c, 1),\, r_3(m, c, 2)\}. \tag{1}$$

Note that all rates mentioned above are given by the system configuration and known beforehand.
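The XOR exchange of Fig. 1c and the broadcast-rate rule in (1) can be illustrated with a minimal Python sketch. The helper names `xor_bytes` and `broadcast_rate`, the packet contents, and the numeric rates are hypothetical and chosen only for illustration; they are not part of the system specification.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length packets byte by byte (illustrative helper)."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def broadcast_rate(r_down: float, r_up: float) -> float:
    """Eq. (1): the coded broadcast is limited by the slower relay link."""
    return min(r_down, r_up)


# Packet exchange as in Fig. 1c (assumed toy payloads of equal length).
bs_packet = b"from-BS!"   # sent by the BS to the RS (transmission 1)
ms_packet = b"from-MS!"   # sent by the MS to the RS (transmission 2)

coded = xor_bytes(bs_packet, ms_packet)  # RS broadcasts this (transmission 3)

# Each end XORs the coded packet with its own packet to recover the other's.
recovered_at_ms = xor_bytes(coded, ms_packet)
recovered_at_bs = xor_bytes(coded, bs_packet)

# Assumed unicast rates on the relay links (mode 1 and mode 2).
rate_mode3 = broadcast_rate(6.0, 2.0)
```

With these assumed rates the coded broadcast runs at the slower link rate, which is why coding can reduce throughput when the two relay links are strongly unbalanced.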
3.2 Proportional-Fair Scheduling
Generally, the BS makes the scheduling decision according to some optimization objective, and then disseminates the result in the preamble of each frame. To maintain a good balance between network throughput and system fairness, we adopt proportional-fair scheduling, a popular scheduling policy that has been widely used in OFDMA systems [28], [29], [17]. Under this policy, the optimization objective in the long run is to maximize

$$\sum_{m \in \mathcal{M}} \left( \log DT_m + \log UT_m \right), \tag{2}$$

where $DT_m$ and $UT_m$ are the long-term downlink and uplink throughput of MS $m$, respectively. As proved in [7], such optimization can be achieved by maximizing the following objective function in each frame²:

$$\sum_{m \in \mathcal{M}} \left( \frac{d_m}{D_m} + \frac{u_m}{U_m} \right), \tag{3}$$

where $d_m$ ($u_m$) is the total downlink (uplink) data amount in bits for MS $m$ to be allocated in the current frame, and $D_m$ ($U_m$) denotes the average downlink (uplink) rate for MS $m$ till the previous frame. In the following, we propose two approaches to maximize objective (3) in terms of a tradeoff between performance and overheads:
• GA: It aims at fully exploiting multiuser diversity and frequency selectivity, as well as network coding gain, by acquiring all link rate information over the whole network. It works as follows: First, all data rates in bits/slot captured through channel state information (CSI) at all links for each subchannel are reported to the BS. Then, the BS runs some algorithm to make the scheduling decision. Finally, the BS broadcasts the scheduling decision during the transmission of the preamble. Such a model can be seen as a direct extension of the model for coding-oblivious scheduling [4], [17].
• LA: Inspired by the simplest round-robin scheduling [8], this approach introduces a restriction that each frame can only be used to serve one MS. With this restriction, the scheduling can be processed in a simplified and local fashion as described below. At the beginning, the link rates are fed back to their corresponding RSs. Each RS then calculates the maximum attainable value $v_m$ of $\frac{d_m}{D_m} + \frac{u_m}{U_m}$ for each associated MS $m$, under the assumption that the whole frame is used to serve $m$. Subsequently, the largest $v_m$ from its MSs is reported to the BS. Finally, the BS picks the MS $\hat{m}$ with the largest $v_{\hat{m}}$, and broadcasts the decision that $\hat{m}$ will be served during the next whole frame.
Compared to round-robin scheduling, LA exploits the benefit of network coding and targets proportional fairness while maintaining almost the same simplicity. Comparing the above two approaches, we have the following observations. First, in the GA model all CSI must be further forwarded to the BS, which doubles the CSI overhead relative to the LA model; CSI is the dominant component of the whole feedback overhead. Furthermore, LA only introduces a constant decision dissemination overhead, which is significantly less than that of GA. On the other hand, the multiuser diversity gain is not exploited in LA, which leads to some performance degradation.
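The per-frame objective in (3) and the final selection step of the LA model can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `frame_utility` and `la_select`, the MS identifiers, and all numeric values are hypothetical, and the per-MS amounts $d_m$, $u_m$ are assumed to have already been maximized by the corresponding RS.

```python
def frame_utility(d: float, u: float, D: float, U: float) -> float:
    """Per-MS term of objective (3): d_m / D_m + u_m / U_m."""
    return d / D + u / U


def la_select(values: dict) -> str:
    """BS-side step of LA: pick the MS with the largest reported v_m."""
    return max(values, key=values.get)


# Assumed per-MS inputs: (d_m, u_m, D_m, U_m), with d_m and u_m taken as the
# best amounts achievable if the whole frame were dedicated to that MS.
candidates = {
    "ms1": (4.0, 2.0, 2.0, 1.0),
    "ms2": (6.0, 1.0, 1.0, 2.0),
    "ms3": (3.0, 3.0, 3.0, 1.0),
}

v = {ms: frame_utility(*params) for ms, params in candidates.items()}
served = la_select(v)  # MS that is granted the next whole frame
```

Dividing by the historical averages $D_m$ and $U_m$ is what favors MSs that have been served less so far, which is the fairness mechanism of the proportional-fair policy.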
4 CODING-AWARE SCHEDULING UNDER GA MODEL
In this section, we study the Coding-Aware proportional-fair scheduling problem under the GA model, denoted as CA-GA. We first formulate CA-GA as an integer linear program, and then establish its hardness. A low-complexity heuristic algorithm is presented at the end of this section.
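4.1 CA-GA Formulation
Based on the indicator variables defined earlier, we can formulate CA-GA as an integer linear programming problem as follows:

$$\text{CA-GA:} \quad \max \sum_{m \in \mathcal{M}} \left( \frac{d_m}{D_m} + \frac{u_m}{U_m} \right), \tag{4}$$

s.t.

$$d_m \le \sum_{c \in \mathcal{C}_1} r_1(m, c)\, I_1(m, c), \quad \forall m \in \mathcal{M} \tag{5}$$

$$d_m \le \sum_{c \in \mathcal{C}_3} \sum_{k \in \{1, 3\}} r_3(m, c, k)\, I_3(m, c, k), \quad \forall m \in \mathcal{M} \tag{6}$$

$$u_m \le \sum_{c \in \mathcal{C}_2} r_2(m, c)\, I_2(m, c), \quad \forall m \in \mathcal{M} \tag{7}$$

$$u_m \le \sum_{c \in \mathcal{C}_3} \sum_{k \in \{2, 3\}} r_3(m, c, k)\, I_3(m, c, k), \quad \forall m \in \mathcal{M} \tag{8}$$

$$\sum_{m \in \mathcal{M}} I_i(m, c) \le 1, \quad \forall i \in \{1, 2\},\ c \in \mathcal{C}_i \tag{9}$$

$$\sum_{m \in \mathcal{M}} \sum_{k=1}^{3} I_3(m, c, k) \le 1, \quad \forall c \in \mathcal{C}_3 \tag{10}$$

$$I_i(m, c) \in \{0, 1\}, \quad \forall m \in \mathcal{M},\ i \in \{1, 2\},\ c \in \mathcal{C}_i \tag{11}$$

$$I_3(m, c, k) \in \{0, 1\}, \quad \forall m \in \mathcal{M},\ c \in \mathcal{C}_3,\ k \in \{1, 2, 3\}. \tag{12}$$

2. When the data flows have different priorities, we can modify this objective function by associating each addition term with a multiplier which indicates the corresponding flow priority. In this case, our results still hold by integrating these priority multipliers with the denominators.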
For the subchannel allocation of each frame, (5), (6), and (7), (8) characterize the achievable downlink and uplink data amounts for each MS m, respectively. The standard network configuration [4], [17], where each RS has no per-user buffer due to its simplicity, requires the flow conservation to be strictly guaranteed over the whole frame duration. Constraint (9) states that each subchannel in the downlink/uplink subframes can be allocated to only one MS. Similarly, constraint (10) represents that each subchannel in the relay subframe can be assigned to only one MS in a specific working mode. Finally, (11) and (12) are indicator constraints.
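To make the role of constraints (5)-(10) concrete, the following sketch evaluates objective (4) for a given allocation. It is only an illustration under stated assumptions: the function name `ca_ga_objective` and the data layout are hypothetical. An allocation is encoded as one entry per subchannel (an MS index, or an (MS, mode) pair for the relay subframe, or `None`), so constraints (9)-(12) hold by construction, and $d_m$, $u_m$ are set to the largest values permitted by (5)-(8).

```python
def ca_ga_objective(r1, r2, r3, D, U, a1, a2, a3):
    """Evaluate objective (4) for a given allocation (illustrative sketch).

    r1[m][c], r2[m][c]: downlink/uplink subframe rates.
    r3[m][c][k]: relay subframe rate in mode k (dict with keys 1, 2, 3).
    a1[c], a2[c]: MS index served on subchannel c, or None.
    a3[c]: (MS index, mode) served on relay subchannel c, or None.
    """
    M = len(D)
    down_in = [0.0] * M   # right-hand side of (5)
    down_out = [0.0] * M  # right-hand side of (6)
    up_in = [0.0] * M     # right-hand side of (7)
    up_out = [0.0] * M    # right-hand side of (8)

    for c, m in enumerate(a1):
        if m is not None:
            down_in[m] += r1[m][c]
    for c, m in enumerate(a2):
        if m is not None:
            up_in[m] += r2[m][c]
    for c, choice in enumerate(a3):
        if choice is None:
            continue
        m, k = choice
        if k in (1, 3):
            down_out[m] += r3[m][c][k]
        if k in (2, 3):
            up_out[m] += r3[m][c][k]

    total = 0.0
    for m in range(M):
        d = min(down_in[m], down_out[m])  # flow conservation, (5) and (6)
        u = min(up_in[m], up_out[m])      # flow conservation, (7) and (8)
        total += d / D[m] + u / U[m]
    return total


# Assumed toy instance: one MS, one subchannel per subframe.
r1 = [[3.0]]
r2 = [[1.0]]
r3 = [[{1: 4.0, 2: 2.0, 3: 2.0}]]  # mode 3 rate follows Eq. (1)
D, U = [1.0], [0.5]

coded = ca_ga_objective(r1, r2, r3, D, U, [0], [0], [(0, 3)])
downlink_only = ca_ga_objective(r1, r2, r3, D, U, [0], [0], [(0, 1)])
```

In this toy instance, using the relay subchannel in coded mode serves both directions and yields a higher objective than dedicating it to downlink forwarding.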
4.2 Hardness Analysis
In this section, we analyze the hardness of the CA-GA problem. The theoretical results are given in the following theorem.
Theorem 1. The CA-GA problem is NP-hard and no PTAS exists, i.e., for some positive constant $\epsilon > 0$, it does not admit any $(1 - \epsilon)$-approximation algorithm unless $P = NP$.
Proof. When there is only one MS in the network, the CA-GA problem degenerates into the CA-LA problem, which is NP-hard as shown in the proof of Theorem 3 in Section 5.2. Therefore, the CA-GA problem is also NP-hard. To prove the stronger result that CA-GA does not admit any PTAS, we make an approximation factor preserving reduction [30] from a scheduling problem under queuing model (SUQ) [12] to CA-GA, because no PTAS exists for SUQ [12]. We first give the definition of SUQ for the sake of completeness.

SUQ: Consider a one-hop OFDMA-based cellular network where the BS transmits data to $M$ MSs through $C$ subchannels directly. Each MS $m$ has a finite backlog $Q_m$ at the BS, i.e., the data transmitted to $m$ cannot exceed $Q_m$. Then, SUQ is to find a feasible subchannel assignment such that $\sum_m Q_m \min\{\sum_c r(m, c) I(m, c),\, Q_m\}$ is maximized, where $r(m, c)$ denotes the rate of subchannel $c$ for MS $m$ and $I(m, c)$ indicates whether $c$ is assigned to $m$.

Given any instance of SUQ as shown above, we will construct an instance of CA-GA with downlink traffic only, such that they have the same optimal solution. Consider an OFDMA relay network consisting of $M$ MSs and the corresponding associated $M$ RSs, in which $C$ common subchannels and $M$ private subchannels are available. A common subchannel can be used by the BS to transmit downlink data to any RS, while a private subchannel is dedicated to a pair of associated RS and MS. More specifically, for each subchannel $c$ and MS $m$, we set $r_1(m, c) = r(m, c)$ if $c$ is a common subchannel, and set $r_3(m, c, 1) = Q_m$ if $c$ is a private subchannel. All other rates are set to zero. Finally, we set $D_m = 1/Q_m$ and $U_m = 1$ for each MS $m$. Under this construction, a frame is partitioned into a downlink subframe and a relay subframe, each having a single time slot, and no benefit of network coding can be obtained.

Note that during the relay subframe, only the associated private subchannel is available for each MS $m$, which forms a nominal backlog size of $Q_m$ for assigning subchannels in the downlink subframe due to the flow conservation constraints in (5) and (6). It is a straightforward exercise to show that both instances have the same optimal solution. Besides, any feasible assignment of subchannels in the downlink subframe for the CA-GA problem provides a feasible assignment for the corresponding instance of SUQ with equal objective value. In summary, such a reduction preserves the approximation factor. □
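The construction used in the proof of Theorem 1 can be checked numerically on a small example. The sketch below builds the CA-GA instance from an SUQ instance and compares the two objective values for the same downlink assignment. The function names and the toy numbers are hypothetical; the uplink term is omitted because all uplink rates are zero in the construction.

```python
def suq_value(r, Q, assign):
    """SUQ objective: sum_m Q_m * min(sum_c r(m,c) I(m,c), Q_m)."""
    served = [0.0] * len(Q)
    for c, m in enumerate(assign):
        if m is not None:
            served[m] += r[m][c]
    return sum(Q[m] * min(served[m], Q[m]) for m in range(len(Q)))


def build_ca_ga_instance(r, Q):
    """Construct the downlink-only CA-GA instance of the proof."""
    M = len(Q)
    r1 = [row[:] for row in r]  # common subchannels keep the SUQ rates
    # Private relay subchannel p is dedicated to MS p and carries rate Q_p in
    # mode 1; every other relay rate is zero.
    r3_mode1 = [[Q[m] if p == m else 0.0 for p in range(M)] for m in range(M)]
    D = [1.0 / q for q in Q]    # D_m = 1 / Q_m
    return r1, r3_mode1, D


def ca_ga_value(r1, r3_mode1, D, assign):
    """CA-GA objective (4) when each private subchannel serves its own MS."""
    M = len(D)
    down_in = [0.0] * M
    for c, m in enumerate(assign):
        if m is not None:
            down_in[m] += r1[m][c]
    down_out = [r3_mode1[m][m] for m in range(M)]
    # d_m = min of the bounds in (5) and (6); u_m = 0 in this construction.
    return sum(min(down_in[m], down_out[m]) / D[m] for m in range(M))


# Assumed toy SUQ instance: 2 MSs, 3 subchannels.
r = [[2.0, 1.0, 4.0], [3.0, 2.0, 1.0]]
Q = [4.0, 2.0]
assign = [1, 0, 0]  # subchannel 0 -> MS 1, subchannels 1 and 2 -> MS 0

r1, r3_mode1, D = build_ca_ga_instance(r, Q)
suq = suq_value(r, Q, assign)
caga = ca_ga_value(r1, r3_mode1, D, assign)
```

Both values coincide for every assignment because the private relay subchannel caps $d_m$ at $Q_m$ and the choice $D_m = 1/Q_m$ turns $d_m / D_m$ into $Q_m d_m$.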
4.3 Heuristic Algorithm
Although the GA model provides a general approach to fully exploit multiuser diversity and frequency selectivity, Theorem 1 reveals that no efficient algorithm for the CA-GA problem exists in a technical sense. In other words, it is hard to find exact or even approximate solutions for CA-GA instances. In the following, we propose a practical heuristic algorithm with low time complexity, the Emulated MaxWeight algorithm (EMW), which can be seen as an emulation of the well-known MaxWeight algorithm [31], [32], [33]. Note that the data amount $d_m$ ($u_m$) is decided by a joint subchannel assignment for the downlink/uplink and relay
subframes. To make CA-GA easy to deal with, this heuristic algorithm decouples the correlation in joint optimization by allocating the subchannels in the relay subframe first. Let $w_c^r(m, k)$ denote the weight of assigning relay subchannel $c \in \mathcal{C}_3$ to MS $m$ in mode $k$. It is defined as

$$w_c^r(m, k) = \begin{cases} \dfrac{r_3(m, c, 1)}{D_m} & k = 1, \\[2mm] \dfrac{r_3(m, c, 2)}{U_m} & k = 2, \\[2mm] \dfrac{r_3(m, c, 3)}{D_m} + \dfrac{r_3(m, c, 3)}{U_m} & k = 3. \end{cases} \tag{13}$$

Then, each subchannel $c \in \mathcal{C}_3$ is assigned to MS $\hat{m}$ in mode $\hat{k}$, where $(\hat{m}, \hat{k}) = \arg\max_{m,k} w_c^r(m, k)$, such that the utility of the assignment is maximized. Once the assignment of all subchannels in the relay subframe is determined, according to the flow conservation constraints, the upper bounds of data amounts $d_m$ and $u_m$ for each MS $m$ are also known, denoted as $Q_m^d$ and $Q_m^u$, respectively. In the subsequent subchannel allocation of downlink/uplink subframes, we apply a greedy technique that has been used in [12] as well. The main idea is to assign subchannels in a sequential manner. Let $Q_m^d$ be the residual data amount for downlink traffic of MS $m$. Then, considering the flow conservation, we define the weight $w_c^d(m)$ for assigning downlink subchannel $c$ to MS $m$ as

$$w_c^d(m) = \frac{\min\{Q_m^d,\, r_1(m, c)\}}{D_m}. \tag{14}$$

Our greedy algorithm will assign subchannel $c$ to MS $\hat{m}$, where $\hat{m} = \arg\max_m w_c^d(m)$. After that, the residual data volume will be updated as

$$Q_{\hat{m}}^d \leftarrow \max\{0,\, Q_{\hat{m}}^d - r_1(\hat{m}, c)\}, \tag{15}$$

and

$$Q_m^d \leftarrow Q_m^d, \quad \forall m \ne \hat{m}. \tag{16}$$
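The relay-first decoupling and the residual-driven greedy step in (13)-(16) can be sketched in Python as follows; the uplink subframe is handled by the same greedy routine with $Q_m^u$, $r_2$, and $U_m$. This is a minimal sketch, not a reference implementation: the function name `emw`, the data layout (each `r3[m][c]` holds the assumed pair of unicast rates for modes 1 and 2, with the mode 3 rate derived from (1)), the tie-breaking rule (first maximum wins), and the toy rates are all assumptions.

```python
def emw(r1, r2, r3, D, U):
    """Sketch of the EMW heuristic; returns (relay, downlink, uplink) allocations.

    r1[m][c], r2[m][c]: downlink/uplink subframe rates.
    r3[m][c]: (mode-1 rate, mode-2 rate) on relay subchannel c.
    """
    M = len(D)
    Qd = [0.0] * M  # upper bounds on d_m fixed by the relay subframe
    Qu = [0.0] * M  # upper bounds on u_m fixed by the relay subframe
    relay = []

    # Step 1: assign each relay subchannel by the weight in (13).
    for c in range(len(r3[0])):
        best = None
        for m in range(M):
            down, up = r3[m][c]
            coded = min(down, up)  # Eq. (1)
            weights = {1: down / D[m], 2: up / U[m],
                       3: coded / D[m] + coded / U[m]}
            for k, w in weights.items():
                if best is None or w > best[0]:
                    best = (w, m, k)
        _, m, k = best
        relay.append((m, k))
        rate = min(r3[m][c]) if k == 3 else r3[m][c][k - 1]
        if k in (1, 3):
            Qd[m] += rate
        if k in (2, 3):
            Qu[m] += rate

    # Step 2: greedy pass over one subframe using (14)-(16).
    def greedy(rates, residual, avg):
        alloc = []
        for c in range(len(rates[0])):
            m = max(range(M),
                    key=lambda i: min(residual[i], rates[i][c]) / avg[i])
            alloc.append(m)
            residual[m] = max(0.0, residual[m] - rates[m][c])
        return alloc

    return relay, greedy(r1, Qd[:], D), greedy(r2, Qu[:], U)


# Assumed toy instance: 2 MSs, 2 subchannels in each subframe.
D = [1.0, 1.0]
U = [1.0, 1.0]
r3 = [[(4.0, 3.0), (1.0, 1.0)], [(1.0, 1.0), (2.0, 6.0)]]
r1 = [[2.0, 2.0], [5.0, 5.0]]
r2 = [[2.0, 4.0], [3.0, 1.0]]

relay, downlink, uplink = emw(r1, r2, r3, D, U)
```

In this toy instance MS 1 has the best downlink rates but receives no downlink subchannel, because the relay subframe left it no downlink budget; this illustrates how fixing the relay assignment first shapes the later greedy steps.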
The assignment of subchannels in the uplink subframe is conducted in a similar way. The formal description of the whole algorithm is given in Algorithm 1. The following result is rather straightforward from Algorithm 1.

Proposition 2. The computational complexity of Algorithm EMW is $O(MC)$.

Algorithm 1. Algorithm EMW for CA-GA
1: for $c \in \mathcal{C}_3$ do
2:   $(\hat{m}, \hat{k}) \leftarrow \arg\max_{m,k} w_c^r(m, k)$.
3:   $I_3(\hat{m}, c, \hat{k}) \leftarrow 1$.
4:   $I_3(m, c, k) \leftarrow 0$, $\forall m \ne \hat{m}$, or $k \ne \hat{k}$.
5: end for
6: for $m \in \mathcal{M}$ do
7:   $Q_m^d \leftarrow \sum_{c \in \mathcal{C}_3} \sum_{k \in \{1,3\}} r_3(m, c, k) I_3(m, c, k)$.
8:   $Q_m^u \leftarrow \sum_{c \in \mathcal{C}_3} \sum_{k \in \{2,3\}} r_3(m, c, k) I_3(m, c, k)$.
9: end for
10: for $c = 1$ to $C_1$ do
11:   $\hat{m} \leftarrow \arg\max_m \min\{Q_m^d, r_1(m, c)\} / D_m$.
12:   $I_1(\hat{m}, c) \leftarrow 1$; $I_1(m, c) \leftarrow 0$, $\forall m \ne \hat{m}$.
13:   $Q_{\hat{m}}^d \leftarrow \max\{0, Q_{\hat{m}}^d - r_1(\hat{m}, c)\}$.
14: end for
15: for $c = 1$ to $C_2$ do
16:   $\hat{m} \leftarrow \arg\max_m \min\{Q_m^u, r_2(m, c)\} / U_m$.
17:   $I_2(\hat{m}, c) \leftarrow 1$; $I_2(m, c) \leftarrow 0$, $\forall m \ne \hat{m}$.
18:   $Q_{\hat{m}}^u \leftarrow \max\{0, Q_{\hat{m}}^u - r_2(\hat{m}, c)\}$.
19: end for

5 CODING-AWARE SCHEDULING UNDER LA MODEL
In this section, we consider the Coding-Aware scheduling problem under the LA model (CA-LA). We first formulate it as an integer linear program and show that it is NP-hard. Then, we propose a practical $\frac{1}{2}$-approximation algorithm.

5.1 CA-LA Formulation
Under the LA model, the optimizations are conducted for each MS $m$. For this reason, we omit the index $m$ from all symbols in this section without incurring any confusion. The corresponding problem formulation can, thus, be rewritten as follows:

$$\text{CA-LA:} \quad \max \frac{d}{D} + \frac{u}{U}, \tag{17}$$

s.t.

$$d \le \sum_{c \in \mathcal{C}_1} r_1(c) \triangleq Q^d \tag{18}$$

$$d \le \sum_{c \in \mathcal{C}_3} \sum_{k \in \{1, 3\}} r_3(c, k)\, I_3(c, k) \tag{19}$$

$$u \le \sum_{c \in \mathcal{C}_2} r_2(c) \triangleq Q^u \tag{20}$$

$$u \le \sum_{c \in \mathcal{C}_3} \sum_{k \in \{2, 3\}} r_3(c, k)\, I_3(c, k) \tag{21}$$

$$\sum_{k=1}^{3} I_3(c, k) \le 1, \quad \forall c \in \mathcal{C}_3 \tag{22}$$

$$I_3(c, k) \in \{0, 1\}, \quad \forall c \in \mathcal{C}_3,\ k \in \{1, 2, 3\}. \tag{23}$$

5.2 Hardness Analysis
While this problem seems to be simplified under the LA model, the following analysis reveals that CA-LA is still NP-hard.

Theorem 3. The CA-LA problem is NP-hard.

Proof. The proof follows by reducing the partition problem, which is a well-known NP-hard problem, to an instance of CA-LA. In the partition problem, we are given a finite set $A$ and a size $s(a) \in \mathbb{Z}^+$ for each $a \in A$. For any subset $Y \subseteq A$, we use $s(Y)$ to denote $\sum_{a \in Y} s(a)$. The partition problem is to decide whether there exists a subset $A' \subseteq A$ such that $s(A') = s(A - A')$.

An instance of CA-LA with three time slots is, thus, constructed as follows: For each element $a \in A$, the instance includes a subchannel $c_a$ such that $r_1(c_a) = 3s(a)$, $r_2(c_a) = s(a)$, $r_3(c_a, 1) = 4s(a)$, and $r_3(c_a, 2) = 2s(a)$. The last two settings lead to $r_3(c_a, 3) = \min\{r_3(c_a, 1), r_3(c_a, 2)\} = 2s(a)$. Finally, we set $D = 1$ and $U = \frac{1}{2}$. Then, the objective is to maximize $obj = d + 2u$.

We list all possible scheduling strategies as follows:
• Strategy 1: The whole frame is used for downlink traffic. It is a straightforward exercise to show that the allocation scheme in which the first two slots are used by link (BS, RS) and the last one by link (RS, MS) maximizes the objective to be $obj = d + 2u = 4s(A)$, where $d = \min\{6s(A), 4s(A)\} = 4s(A)$ and $u = 0$.
• Strategy 2: The whole frame is used for uplink traffic. Similarly, we have the maximum $obj = d + 2u = 0 + 4s(A) = 4s(A)$.
• Strategy 3: The frame is partitioned into three subframes, with one time slot for each. Due to the flow conservation, we have $d \le \sum_{a \in A} r_1(c_a) = 3s(A)$ and $u \le \sum_{a \in A} r_2(c_a) = s(A)$. Therefore, we obtain $obj \le 5s(A)$.

Denote $OPT$ as the optimal value of the instance of CA-LA. The above analysis shows $OPT \le 5s(A)$. In the following, we prove that $A$ has a feasible partition if and only if $OPT = 5s(A)$.

If there exists a subset $A' \subseteq A$ such that $s(A') = s(A - A') = \frac{1}{2} s(A)$, then we adopt the third strategy. In the last time slot, subchannels corresponding to elements in $A'$ are used for broadcasting with network coding, and other subchannels are assigned only for downlink traffic. Then, we have

$$d = \min\left\{ \sum_{a \in A} r_1(c_a),\ \sum_{a \in A - A'} r_3(c_a, 1) + \sum_{a \in A'} r_3(c_a, 3) \right\} = \min\{3s(A),\ 4s(A - A') + 2s(A')\} = \min\{3s(A),\ 3s(A)\} = 3s(A),$$

and

$$u = \min\left\{ \sum_{a \in A} r_2(c_a),\ \sum_{a \in A'} r_3(c_a, 3) \right\} = \min\{s(A),\ 2s(A')\} = \min\{s(A),\ s(A)\} = s(A).$$

This leads to $obj = 5s(A) \le OPT$. Combining this result with $OPT \le 5s(A)$ obtained earlier, we conclude $OPT = 5s(A)$.

Conversely, if $OPT = 5s(A)$, then the third strategy must be used, and furthermore $d = 3s(A)$ and $u = s(A)$. Let $B_k$ ($k = 1, 2, 3$) be the disjoint sets of elements corresponding to subchannels working on mode $k$ in the last time slot (i.e., the relay subframe). According to the flow conservation, we can conclude $4s(B_1) + 2s(B_3) \ge d = 3s(A)$ and $2s(B_2) + 2s(B_3) \ge u = s(A)$. Combining $s(B_1) + s(B_2) + s(B_3) = s(A)$ and $s(\cdot) \ge 0$, we can derive $s(B_1) = s(B_3) = \frac{1}{2} s(A)$ and $s(B_2) = 0$. In other words, $B_1$ (or $B_3$) forms a feasible partition of set $A$. □

Further investigation of the hardness of the CA-LA problem leads to an important discovery: a theoretical PTAS exists for the problem, as stated in Theorem 4. This result differs from that of the previous section and implies that CA-LA is easier than CA-GA in some technical sense.

Theorem 4. The CA-LA problem admits a PTAS.
Proof. The proof follows by constructing a PTAS for CA-LA. The details are given in the supplemental material, which can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TPDS.2012.269. □
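The reduction in the proof of Theorem 3 can be checked by brute force on small inputs. The sketch below builds the three-subframe CA-LA instance from a list of sizes, enumerates every mode assignment of the relay subchannels, and tests whether the best objective reaches $5s(A)$. The function names are hypothetical, and the exhaustive search is exponential in $|A|$, so it is meant only as a sanity check, not as a solution method.

```python
from itertools import product


def best_three_subframe_objective(sizes):
    """Best obj = d + 2u under Strategy 3 for the constructed instance."""
    total = sum(sizes)
    q_down = 3 * total  # sum of r1(c_a) = 3 s(a), bound (18)
    q_up = total        # sum of r2(c_a) = s(a), bound (20)
    best = 0
    for modes in product((1, 2, 3), repeat=len(sizes)):
        # Relay rates: mode 1 -> 4 s(a), mode 2 -> 2 s(a), mode 3 -> 2 s(a).
        down_out = sum(4 * s if k == 1 else 2 * s
                       for s, k in zip(sizes, modes) if k in (1, 3))
        up_out = sum(2 * s for s, k in zip(sizes, modes) if k in (2, 3))
        d = min(q_down, down_out)  # constraints (18) and (19)
        u = min(q_up, up_out)      # constraints (20) and (21)
        best = max(best, d + 2 * u)  # D = 1, U = 1/2
    return best


def has_partition_via_ca_la(sizes):
    """A has a feasible partition iff OPT = 5 s(A) (Theorem 3)."""
    return best_three_subframe_objective(sizes) == 5 * sum(sizes)


# Assumed toy inputs.
yes_instance = has_partition_via_ca_la([3, 1, 1, 2, 2, 1])  # 3 + 2 = 1 + 1 + 2 + 1
no_instance = has_partition_via_ca_la([2, 2, 3, 5])         # no subset sums to 6
```

Leaving a relay subchannel unused is never better than assigning it some mode, so enumerating the three modes per subchannel is sufficient for this check.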
5.3 Half-Approximation Greedy Algorithm (HAG)
Although the PTAS can achieve almost optimal performance in polynomial time, it is still too complex to be implemented in practical systems. Alternatively, we propose a practical greedy algorithm, HAG, with guaranteed performance. In HAG, the subchannels in the relay subframe, numbered $1, 2, \ldots, C_3$, are assigned in sequence. Let $Q^d_c$ and $Q^u_c$ be the residual data amounts for downlink and uplink traffic, respectively, before subchannel $c$ is allocated. Initially, we have $Q^d_1 = Q^d$ and $Q^u_1 = Q^u$. The subchannel allocation is based on a weight $w_c(k)$ that is defined for subchannel $c$ working in mode $k$ as follows:

$$w_c(k) = \begin{cases} \dfrac{\min\{Q^d_c, r_3(c,1)\}}{D}, & k = 1, \\[2mm] \dfrac{\min\{Q^u_c, r_3(c,2)\}}{U}, & k = 2, \\[2mm] \dfrac{\min\{Q^d_c, r_3(c,3)\}}{D} + \dfrac{\min\{Q^u_c, r_3(c,3)\}}{U}, & k = 3. \end{cases} \qquad (24)$$

The working mode $\hat{k}$ is chosen for each subchannel $c$ such that $\hat{k} = \arg\max_k w_c(k)$. After $c$ is assigned, the residual data amounts become

$$Q^d_{c+1} = \begin{cases} \max\{0, Q^d_c - r_3(c,1)\}, & \hat{k} = 1, \\ Q^d_c, & \hat{k} = 2, \\ \max\{0, Q^d_c - r_3(c,3)\}, & \hat{k} = 3, \end{cases} \qquad (25)$$

and

$$Q^u_{c+1} = \begin{cases} Q^u_c, & \hat{k} = 1, \\ \max\{0, Q^u_c - r_3(c,2)\}, & \hat{k} = 2, \\ \max\{0, Q^u_c - r_3(c,3)\}, & \hat{k} = 3. \end{cases} \qquad (26)$$
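The weight rule (24) and the residual updates (25) and (26) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the representation of the rates as tuples `(r3(c,1), r3(c,2), r3(c,3))`, the tie-breaking rule (lowest mode index wins), and the brute-force comparison are all assumptions made here. The sample instance is patterned on a two-subchannel worst case, with a hypothetical parameter `eps`.

```python
from itertools import product

def weight(qd, qu, rates, k, D, U):
    """Weight w_c(k) of Eq. (24); rates = (r3(c,1), r3(c,2), r3(c,3))."""
    if k == 1:
        return min(qd, rates[0]) / D
    if k == 2:
        return min(qu, rates[1]) / U
    return min(qd, rates[2]) / D + min(qu, rates[2]) / U

def hag(r3, Qd, Qu, D, U):
    """Assign a mode to each subchannel in sequence, greedily by weight."""
    qd, qu, modes, value = Qd, Qu, [], 0.0
    for rates in r3:
        # Ties are broken in favor of the lowest mode index (an assumption).
        k = max((1, 2, 3), key=lambda m: weight(qd, qu, rates, m, D, U))
        value += weight(qd, qu, rates, k, D, U)
        if k in (1, 3):                       # downlink residual, Eq. (25)
            qd = max(0.0, qd - rates[k - 1])
        if k in (2, 3):                       # uplink residual, Eq. (26)
            qu = max(0.0, qu - rates[k - 1])
        modes.append(k)
    return modes, value

def objective(r3, modes, Qd, Qu, D, U):
    """Objective value of a complete mode assignment."""
    dl = sum(r[k - 1] for r, k in zip(r3, modes) if k in (1, 3))
    ul = sum(r[k - 1] for r, k in zip(r3, modes) if k in (2, 3))
    return min(Qd, dl) / D + min(Qu, ul) / U

def brute_force(r3, Qd, Qu, D, U):
    """Exhaustive optimum; only practical for a handful of subchannels."""
    return max(objective(r3, m, Qd, Qu, D, U)
               for m in product((1, 2, 3), repeat=len(r3)))

# Hypothetical two-subchannel instance on which the greedy choice is poor.
eps = 0.1
r3 = [(1.0, eps / 2, eps / 2), (1 - eps / 2, 0.0, 0.0)]
U = eps / (2 * (1 - eps))
modes, val = hag(r3, 1.0, 1.0, 1.0, U)
opt = brute_force(r3, 1.0, 1.0, 1.0, U)
```

On this instance the greedy rule puts subchannel 1 in mode 1 and then has no downlink data left for subchannel 2, whereas the exhaustive search uses mode 3 on subchannel 1 and mode 1 on subchannel 2.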
The formal description of HAG is given in Algorithm 2. It is apparent that the running time of Algorithm HAG is linear in the number of subchannels.

Proposition 5. The time complexity of Algorithm HAG is $O(C)$.

Algorithm 2. Algorithm HAG for CA-LA
  Initialization:
    $Q^d_1 \leftarrow \sum_{c \in C_1} r_1(c)$.
    $Q^u_1 \leftarrow \sum_{c \in C_2} r_2(c)$.
  Sequential Assignment:
    for $c = 1$ to $C_3$ do
      $\hat{k} \leftarrow \arg\max_k w_c(k)$.
      $I_3(c, \hat{k}) \leftarrow 1$; $I_3(c, k) \leftarrow 0$, $\forall k \ne \hat{k}$.
      Update $Q^d_{c+1}$ and $Q^u_{c+1}$ according to Eqs. (25) and (26), respectively.
    end for

Finally, we show the theoretical performance of the proposed algorithm HAG in terms of approximation ratio. The basic idea is to show that HAG can be viewed as a special
case of a greedy algorithm for maximizing a nondecreasing submodular function over a partition matroid. For the sake of completeness, we first introduce the definitions of partition matroids and nondecreasing submodular functions:

• A matroid is an ordered pair $(S, \mathcal{I})$, where $S$ is a finite nonempty set and $\mathcal{I}$ is a set of subsets of $S$, satisfying $\emptyset \in \mathcal{I}$; if $B \in \mathcal{I}$ and $A \subseteq B$, then $A \in \mathcal{I}$; and if $A, B \in \mathcal{I}$ and $|A| < |B|$, then there is some element $x \in B \setminus A$ such that $A \cup \{x\} \in \mathcal{I}$.
• A matroid $(S, \mathcal{I})$ is said to be a partition matroid if there exists some partition of $S$ into components $\Phi_1, \Phi_2, \ldots$, such that $A \in \mathcal{I}$ if and only if $|A \cap \Phi_k| \le 1$ for all $k$.
• A function $f(\cdot)$ on sets in $\mathcal{I}$ is said to be submodular and nondecreasing if it satisfies
  - $f(\emptyset) = 0$, and
  - for all $a \in S$ and $A, B \in \mathcal{I}$, if $A \cup \{a\} \in \mathcal{I}$ and $B \subseteq A$, then $f(A) \le f(A \cup \{a\})$ and $f(A \cup \{a\}) - f(A) \le f(B \cup \{a\}) - f(B)$.

For the problem of maximizing a nondecreasing submodular function over a partition matroid, the greedy algorithm proposed in [34] works as follows: set $A$ is initialized as empty at the beginning. In each subsequent iterative step, an element $a$ from component $\Phi_k$ is picked such that $f(A \cup \{a\}) - f(A)$ is maximized, and then $A$ is updated as $A \cup \{a\}$. It has been shown that this greedy algorithm achieves an approximation guarantee of $\frac{1}{2}$ [34]. The formal result as well as its proof are given below.
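As a minimal sketch of the greedy procedure just described, the Python snippet below visits the components of a partition matroid one by one and, in each, keeps the element with the largest marginal gain. Processing the components in a fixed order is an assumption made here for simplicity, and the capped-sum function and the element names are hypothetical stand-ins for a nondecreasing submodular function, not anything taken from the paper.

```python
from itertools import product

def greedy_partition_matroid(components, f):
    """Pick one element per component, maximizing the marginal gain of f."""
    chosen = []
    for component in components:
        base = f(chosen)
        best = max(component, key=lambda a: f(chosen + [a]) - base)
        chosen.append(best)
    return chosen

def capped_sum(cap, weights):
    """Hypothetical nondecreasing submodular function: min(cap, total weight)."""
    return lambda A: min(cap, sum(weights[a] for a in A))

# Two components with two candidate elements each (illustrative values only).
weights = {"a1": 3, "a2": 2, "b1": 4, "b2": 1}
components = [("a1", "a2"), ("b1", "b2")]
f = capped_sum(5, weights)

greedy_set = greedy_partition_matroid(components, f)
best_value = max(f(list(choice)) for choice in product(*components))
```

Here the greedy choice happens to be optimal; in general, the guarantee quoted above only promises at least half of `best_value`.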
Theorem 6. Algorithm HAG achieves an approximation factor of $\frac{1}{2}$ for the CA-LA problem.

Proof. We construct a partition matroid for a given CA-LA problem. We define $S = \{(c, k) \mid 1 \le c \le C_3, 1 \le k \le 3\}$, and $\mathcal{I}$ to be a set of subsets of $S$ as follows: for each $A \subseteq S$, $A \in \mathcal{I}$ if and only if, for any $(c, k) \in A$, $(c, k') \notin A$ for any $k' \ne k$. Furthermore, $S$ is partitioned into components $\Phi_i = \{(i, k) \mid 1 \le k \le 3\}$, $i = 1, 2, \ldots, C_3$. It is a straightforward exercise to show that $(S, \mathcal{I})$ is a partition matroid. We then define the function $f(\cdot)$ on sets in $\mathcal{I}$ as

$$f(A) = \frac{\min\left\{Q^d, \sum_{k \in \{1,3\}, (c,k) \in A} r_3(c,k)\right\}}{D} + \frac{\min\left\{Q^u, \sum_{k \in \{2,3\}, (c,k) \in A} r_3(c,k)\right\}}{U}.$$

It can be verified directly that maximizing this function corresponds to our scheduling objective. The remaining work is to show that this function is submodular and nondecreasing. According to the definition of $f(\cdot)$, it is straightforward to see that $f(\emptyset) = 0$. Besides, for all $a \in S$ and $A, B \in \mathcal{I}$, if $A \cup \{a\} \in \mathcal{I}$ and $B \subseteq A$, then $f(A) \le f(A \cup \{a\})$ holds. To see

$$f(A \cup \{a\}) - f(A) \le f(B \cup \{a\}) - f(B), \qquad (27)$$

we assume, without loss of generality, $a = (c_a, k_a) \notin A$. We will prove (27) for the case $k_a = 1$ in the following. For the cases $k_a = 2$ and $k_a = 3$, the proof is similar. According to the definition of $f(\cdot)$ and the fact that, for any $x, y, z \ge 0$,

$$\min\{x, y + z\} - \min\{x, y\} = \max\{0, \min\{x - y, z\}\},$$

we have

$$\begin{aligned} f(A \cup \{a\}) - f(A) &= \frac{1}{D}\left( \min\left\{Q^d, \sum_{k \in \{1,3\}, (c,k) \in A \cup \{a\}} r_3(c,k)\right\} - \min\left\{Q^d, \sum_{k \in \{1,3\}, (c,k) \in A} r_3(c,k)\right\} \right) \\ &= \frac{1}{D}\left( \min\left\{Q^d, \sum_{k \in \{1,3\}, (c,k) \in A} r_3(c,k) + r_3(c_a, 1)\right\} - \min\left\{Q^d, \sum_{k \in \{1,3\}, (c,k) \in A} r_3(c,k)\right\} \right) \\ &= \frac{1}{D}\max\left\{0, \min\left\{Q^d - \sum_{k \in \{1,3\}, (c,k) \in A} r_3(c,k),\; r_3(c_a, 1)\right\}\right\}. \end{aligned} \qquad (28)$$

Similarly, we can derive

$$f(B \cup \{a\}) - f(B) = \frac{1}{D}\max\left\{0, \min\left\{Q^d - \sum_{k \in \{1,3\}, (c,k) \in B} r_3(c,k),\; r_3(c_a, 1)\right\}\right\}. \qquad (29)$$

Since $B \subseteq A$, we have

$$\sum_{k \in \{1,3\}, (c,k) \in B} r_3(c,k) \le \sum_{k \in \{1,3\}, (c,k) \in A} r_3(c,k). \qquad (30)$$

By combining (28), (29), and (30), we can see that (27) holds. Finally, recalling that HAG goes through each subchannel and assigns a mode to it such that the increment of the objective is maximized, HAG is exactly the greedy algorithm for maximizing a nondecreasing submodular function over the matroid $(S, \mathcal{I})$ that we have just constructed. This completes the proof. □

We end this section by showing that the approximation factor $\frac{1}{2}$ of HAG is tight.

Theorem 7. For any constant $0 < \epsilon < 1$, there exists an instance of CA-LA on which HAG achieves at most a $1/(2 - \epsilon)$ fraction of the optimal value.

Proof. We prove it by constructing a tight example. In the example, only two subchannels are available. The rates for subchannel 1 are $r_3(1,1) = 1$ and $r_3(1,2) = \epsilon/2$. The rates for subchannel 2 are $r_3(2,1) = 1 - \epsilon/2$ and $r_3(2,2) = 0$. Thus, $r_3(1,3) = \epsilon/2$ and $r_3(2,3) = 0$. Finally, we set $Q^d = Q^u = 1$, $D = 1$, and $U = \frac{\epsilon}{2(1 - \epsilon)}$. Using exhaustive search, we can obtain the following optimal value by setting subchannel 1 to work in mode 3 and subchannel 2 in mode 1:

$$\frac{\min\{Q^d, r_3(1,3) + r_3(2,1)\}}{D} + \frac{\min\{Q^u, r_3(1,3)\}}{U} = 2 - \epsilon.$$

On the other hand, since $\frac{r_3(1,1)}{D} > \frac{r_3(1,3)}{D} + \frac{r_3(1,3)}{U}$, HAG makes subchannel 1 work in mode 1, and subchannel 2 becomes useless. Therefore, it achieves a value of $\frac{r_3(1,1)}{D} = 1$, resulting in an approximation factor of $\frac{1}{2 - \epsilon}$, which approaches arbitrarily close to $\frac{1}{2}$. □

6 PERFORMANCE EVALUATION
In this section, we evaluate our proposed coding-aware scheduling algorithms through extensive simulations. Our goal is two-fold. One is to demonstrate the potential benefit of network coding; the other is to show the efficiency of our algorithms.
6.1 Methodology
To demonstrate that network coding is indeed helpful in OFDMA relay networks, we would like to compare our proposed algorithms EMW and HAG against an optimal coding-oblivious scheduling scheme. However, the coding-oblivious scheduling problem is NP-hard [4], implying that there is no efficient algorithm to find its optimal solution unless P = NP. To cope with this, we compare our algorithms in our simulations with 1) DIV1 [4], which is a state-of-the-art noncoding scheduling algorithm and has been shown to be close to the optimum via simulations, and 2) OPT-NO-NC, which represents the optimal fractional solution of the CA-GA formulation obtained by setting all $I_3(m, c, 3) = 0$ and relaxing all integer constraints. The resulting LP can be solved in a timely manner by the GNU Linear Programming Kit (GLPK) [38]. Note that OPT-NO-NC provides a natural upper bound on the performance of any coding-oblivious scheduling scheme, which implies that if our algorithms perform better than OPT-NO-NC, then network coding must be beneficial.

To show the efficiency of our algorithms, we quantify their performance gap to the optimal solution of the coding-aware scheduling problem. Similarly, we use its optimal fractional solution instead, denoted by OPT-NC, which is obtained by solving the relaxed CA-GA using GLPK. Clearly, OPT-NC is a natural upper bound on the performance of any coding-aware scheduling algorithm for either the GA or the LA model. Thus, if the performance of our algorithm is within some ratio of OPT-NC, then it must also be within the same ratio of the optimum.

All our comparisons are based on the metrics of 1) long-term utility, which is defined as the sum of the logarithm of the downlink throughput and the uplink throughput of all MSs as given in (2), and 2) MS throughput, which is the sum of both downlink throughput and uplink throughput.
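The two metrics can be illustrated with a short Python sketch. It assumes that the utility is the sum, over all MSs, of the natural logarithm of the downlink throughput plus the natural logarithm of the uplink throughput, and that all throughputs are strictly positive; the function names and the sample numbers are hypothetical.

```python
import math

def long_term_utility(downlink, uplink):
    """Sum over MSs of log(downlink throughput) + log(uplink throughput)."""
    return sum(math.log(d) + math.log(u) for d, u in zip(downlink, uplink))

def ms_throughput(downlink, uplink):
    """Per-MS throughput: downlink plus uplink."""
    return [d + u for d, u in zip(downlink, uplink)]

# Two allocations with the same total throughput (illustrative numbers only).
balanced = long_term_utility([2.0, 2.0], [1.0, 1.0])
skewed = long_term_utility([3.5, 0.5], [1.0, 1.0])
```

Because the logarithm is concave, the balanced allocation scores a higher utility than the skewed one even though both carry the same aggregate throughput, which is the sense in which this metric rewards fairness.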
While some other coding-aware scheduling algorithms have been proposed in the literature [24], [25], [26], they are not included in this paper for performance comparison for the following reasons. First, existing coding-aware schemes are proposed under the assumption that each scheduling unit can be continuously partitioned with no constraint imposed by the frame structure in OFDMA relay networks and, thus, cannot be applied to the frame-based scheduling studied in this paper. Second, it is unfair to compare with them because they are designed for maximizing the aggregate throughput rather than proportional fairness. Finally, the result of a theoretical upper bound of the coding-aware proportional-fair scheduling is included for performance comparison. The fact that our proposed scheme performs close to the upper bound, and hence even closer to the optimal solution, is sufficient to show its superior performance.
6.2 Simulation Setup
We simulate a single-cell OFDMA relay network with a 1-km cell radius. A number of RSs are randomly and uniformly located within the annulus with inner and outer radii of 350 and 550 m, respectively, and a number of MSs are distributed randomly over the whole cell coverage. We adopt a simple distance-based metric for routing, in which each MS connects to the nearest RS or to the BS directly. To focus on the benefit of network coding, we only consider those MSs that connect to the BS via some RS. We consider a bandwidth of 10 MHz at a carrier frequency of 5 GHz. The whole bandwidth is equally split into 1,080 subcarriers. The number of subchannels is 30, and each subchannel is made up of 36 contiguous subcarriers. For links between the BS and RSs, the channel impairment due to large-scale fading is characterized by log-normal shadowing with a path loss exponent of 2.4 and a standard deviation of 5.4 dB [35]. For the RS-MS links, the small-scale fading effects caused by the movement of MSs are also incorporated using the Rayleigh fading model. The inherent frequency-selective fading is captured by an exponential power delay profile with a delay spread of 15 μs. Each subcarrier also has a Doppler spread under the random waypoint model, where each MS moves at a pedestrian speed of 5.0 km/h with zero pause period [36]. The combined complex gain is generated using the modified Jakes-like method [37]. We set the transmission power of the BS, RS, and MS at 41.7 dBm (15 watts), 39.5 dBm (9 watts), and 37.8 dBm (6 watts), respectively. The noise power is set at −174 dBm/Hz. Once the SNR of each subcarrier is determined, the modulation and coding scheme, which decides the rate of the subcarrier, is chosen according to the SNR/modulation mapping from the IEEE 802.16j standard [2]. Thus, the rate of a subchannel is obtained by summing up the rates of its constituent subcarriers. In the experiments, we consider that each frame lasts 10 ms with 60 time slots.
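The power and bandwidth figures above can be cross-checked with a small Python sketch. It assumes the noise density is the usual thermal floor of −174 dBm/Hz and that noise is integrated over one subcarrier of width 10 MHz / 1,080; the helper names and the `path_loss_db` argument are hypothetical and stand in for whatever channel model produces the link attenuation.

```python
import math

BANDWIDTH_HZ = 10e6
NUM_SUBCARRIERS = 1080
NOISE_DENSITY_DBM_PER_HZ = -174.0   # assumed thermal noise floor

def dbm_to_watts(p_dbm):
    """Convert a power level from dBm to watts."""
    return 10 ** ((p_dbm - 30) / 10)

def noise_power_dbm(bandwidth_hz):
    """Noise power collected over the given bandwidth, in dBm."""
    return NOISE_DENSITY_DBM_PER_HZ + 10 * math.log10(bandwidth_hz)

def snr_db(tx_dbm, path_loss_db, bandwidth_hz):
    """Received SNR in dB for a given (hypothetical) total path loss."""
    return tx_dbm - path_loss_db - noise_power_dbm(bandwidth_hz)

subcarrier_bw = BANDWIDTH_HZ / NUM_SUBCARRIERS      # about 9.26 kHz
tx_watts = [dbm_to_watts(p) for p in (41.7, 39.5, 37.8)]
subcarrier_noise = noise_power_dbm(subcarrier_bw)   # about -134.3 dBm
```

The converted powers come out at roughly 14.8 W, 8.9 W, and 6.0 W, consistent with the rounded wattages quoted for the BS, RS, and MS.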
For each network setting, i.e., a given number of MSs and RSs, the simulation results are obtained from 100 randomly generated network topologies by running over 1,000 frames on each. Important simulation parameters are summarized in Table 1.

TABLE 1 Simulation Parameters

6.3 Simulation Results
We first evaluate these algorithms under simulation settings with 5 RSs and 40 MSs over 100 randomly generated instances, which are classified into four categories according to the network coding gain in terms of mean throughput (the performance ratio of our algorithms to DIV1). Let $I_{HAG}$ and $I_{EMW}$ be the instance sets sorted by the network coding gain obtained by HAG and EMW, respectively, both in decreasing order. Category 1 consists of the instances that are in the top 50 percent of both $I_{HAG}$ and $I_{EMW}$, category 2 consists of the instances that are in the top 50 percent of $I_{EMW}$ but not of $I_{HAG}$, category 3 consists of the instances that are in the top 50 percent of $I_{HAG}$ but not of $I_{EMW}$, and category 4 consists of all other instances, i.e., the instances that are in the top 50 percent of neither $I_{EMW}$ nor $I_{HAG}$. We then randomly select one instance from each category as a representative case. These four concrete topologies are shown in Fig. 3, where Topology $i$, $i = 1, 2, 3, 4$, is selected from category $i$.

Fig. 3. Distribution of BS, RSs, and MSs over a plane in each of four representative topologies.

The long-term utilities of the evaluated algorithms on the four topologies are plotted in Fig. 4. From this figure, we have the following observations:

1. OPT-NC has higher utilities than OPT-NO-NC, showing that network coding does improve the network performance in terms of throughput and fairness.
2. Both HAG and EMW perform better than DIV1.
3. EMW also performs better than OPT-NO-NC in all cases, while HAG does so in most cases, including the topologies given in Fig. 3. Note that even a lower performance than OPT-NO-NC does not discredit HAG, since OPT-NO-NC only provides an upper bound on the optimal solution of noncoding scheduling, as explained in Section 6.1.
4. HAG achieves about 95.5 percent of OPT-NC, while EMW achieves about 97 percent. Both of them are very close to the optimum.

The cumulative distributions of MS throughput under the four representative topologies are also plotted in Figs. 5a, 5b, 5c, and 5d, respectively. Compared to the noncoding scheme DIV1, both HAG and EMW achieve higher mean throughput (the area between the corresponding curve and the y-axis), as well as higher median throughput (the x-coordinate at which the y-coordinate equals 0.5). In particular, HAG improves the mean throughput by 15-20 percent and the median throughput by about 10-23 percent. The corresponding improvements offered by EMW are both between 15 and 30 percent.

We also counted the percentage of subchannels used for network coding in the four representative topologies, as shown in Table 2. By comparing the table and figures, we have the following observations. 1) Both algorithms have substantially exploited network coding. For example, HAG used more than 80 percent of subchannels for network coding in all cases and EMW used more than 70 percent. 2) HAG creates more coding opportunities than EMW. This is because HAG only focuses on exploiting the network coding gain, while EMW additionally exploits the multiuser diversity, which leads to some compromise on the coding efficiency. 3) The coding chance correlates with the instance category (i.e., the network coding gain). For example, both HAG and EMW use the fewest subchannels for network coding in Topology 4, which belongs to category 4 with low coding-gain instances. Here is an intuitive explanation. By looking into the concrete layout of Topology 4 as shown in Fig. 3d, we find that quite a few MSs are located very close to their associated RSs. For these MSs, the rates over links (RS, BS) and (RS, MS) must differ greatly due to their significant fading difference. Thus, according to the broadcast rate constraint shown in (1), there is little coding gain for these MSs, i.e., few subchannels are used for network coding in this topology. In general, however, it is difficult to characterize the topologies on which HAG or EMW achieves a higher coding gain, as the gain is influenced by too many factors, including network topologies, link rates, and so on.

The overall simulation results obtained from all 100 topologies are also analyzed. The cumulative distributions of network coding gains in terms of mean throughput and median throughput are plotted in Figs. 6a and 6b, respectively. As revealed by the dotted lines, HAG achieves a network coding gain of at least 1.17 for both mean and median throughput in half of all 100 network topologies, while EMW achieves about 1.23. Besides, HAG can achieve a network coding gain of up to 1.22 for mean throughput and up to 1.3 for median throughput. Similarly, the coding gains achieved by EMW are up to 1.28 and 1.38, respectively.
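The network coding gain and the four-category split used to pick representative topologies can be sketched as follows. This is only an illustration under stated assumptions: the gain is taken as the ratio of an algorithm's mean throughput to that of DIV1 on the same instance, "top 50 percent" is taken as the first half of the instances after sorting by gain in decreasing order, and the function names and sample gains are hypothetical.

```python
from statistics import median

def coding_gains(alg_throughput, div1_throughput):
    """Per-instance gain: throughput of the algorithm divided by that of DIV1."""
    return [a / b for a, b in zip(alg_throughput, div1_throughput)]

def top_half(gains):
    """Indices of the instances in the top 50 percent by gain."""
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    return set(order[: len(gains) // 2])

def categorize(gain_hag, gain_emw):
    """Category 1-4 of each instance, following the split described above."""
    top_hag, top_emw = top_half(gain_hag), top_half(gain_emw)
    categories = []
    for i in range(len(gain_hag)):
        if i in top_hag and i in top_emw:
            categories.append(1)
        elif i in top_emw:
            categories.append(2)
        elif i in top_hag:
            categories.append(3)
        else:
            categories.append(4)
    return categories

# Hypothetical gains for four instances.
gain_hag = [1.2, 1.1, 1.3, 1.0]
gain_emw = [1.3, 1.25, 1.1, 1.0]
cats = categorize(gain_hag, gain_emw)
median_gain_hag = median(gain_hag)   # the gain reached in half of the instances
```

The median of the per-instance gains is the value read off a cumulative distribution plot at a y-coordinate of 0.5.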
Fig. 4. Comparison of long-term utilities in four representative topologies.

Fig. 5. Cumulative distribution of MS throughput in four representative topologies.

TABLE 2 Percentage of Subchannels Used for Network Coding-Based Broadcast

Now, we evaluate the network coding gains achieved by our algorithms under network settings with various numbers of MSs and RSs. We first study the impact of the number of MSs $M$. For a fixed number of RSs ($R = 2$, 4, and 6), we vary $M$ from 20 to 100 with an incremental step of 20. The results of mean throughput are shown in Figs. 7a, 7b, and 7c, where data points represent the average network coding gains, and error bars indicate the standard deviation (the results of median throughput are similar and are thus omitted in the interest of space). We note the following facts:

1. The average network coding gain achieved by either EMW or HAG is little affected by the number of MSs. This implies that the network coding opportunities do not increase much when more MSs are involved under a fixed number of RSs.
2. The error bars of both HAG and EMW become shorter when the number of MSs increases, indicating that HAG and EMW perform more steadily when more MSs are included in the network.
3. HAG is steadier than EMW in all cases.
4. Another interesting phenomenon is that the error bars of EMW are quite long for any number of MSs when $R = 2$. When only two RSs ($R = 2$) are available as relays, the distance of each MS to its associated RS varies a lot. Such distance variance (i.e., link rate variance) correlates with the error bars of EMW because, in each scheduling step, whether a subchannel is used for coding or not is determined by the rates of all participating links (RS, BS) and (RS, MS). When more RSs are in the network, the distance variance decreases, resulting in shorter error bars. On the other hand, the distance variance affects HAG less, giving it shorter error bars, because the coding rate is constrained by the lower rate of only a single pair of links (RS, BS) and (RS, MS) at each scheduling step.

Then, we study the impact of the number of RSs $R$. For a fixed number of MSs ($M = 20$, 60, and 100), we vary $R$ from 3 to 9 with an incremental step of 2. The corresponding results of mean throughput are plotted in Figs. 8a, 8b, and 8c. We have the following observations: 1) As the number of RSs increases, the network coding gain achieved by EMW improves considerably. This is because more network coding opportunities are introduced. However, the performance of HAG varies little due to its local scheme, where only the traffic through one RS is scheduled in each cycle. 2) When the number of RSs is very small, HAG performs better than EMW; on the other hand, EMW performs better when more RSs are introduced in the network. 3) Both HAG and EMW perform more steadily when the number of RSs becomes larger.

In summary, both HAG and EMW can significantly improve the mean as well as the median throughput over the noncoding scheme. EMW has a slightly better performance than HAG. On the other hand, we recall that HAG has lower overheads than EMW. They are both practical and effective proposals, and can be used in different applications according to various tradeoff requirements.
Fig. 6. Cumulative distribution of network coding gains in terms of mean throughput and median throughput in one hundred topologies.
Fig. 7. The network coding gain of the mean throughput under various numbers of MSs. Data points represent the average coding gain and the error bars represent the standard deviation.
Fig. 8. Network coding gain of the mean throughput under various numbers of RSs.
7 CONCLUSION
In this paper, we study the network coding-aware scheduling problem in OFDMA relay networks under the proportional-fair scheduling policy. We propose two approaches, GA and LA, to solve the problem under a tradeoff between performance and overhead. For each model, we establish its hardness and propose efficient algorithms with low time complexity. The theoretical performance of our proposals is also studied. To highlight the efficiency of our algorithms, as well as the benefit of network coding, extensive simulations have been conducted. The experimental results show that our proposals outperform one of the best existing schemes in terms of both fairness and throughput.
ACKNOWLEDGMENTS
This work was partially supported by the National Basic Research Program of China under Grant No. 2009CB320705 and by the National Natural Science Foundation of China under Grants No. 61170069, 61073028, 61021062, 91218302, and 60903025. Baoliu Ye and Sanglu Lu are the corresponding authors.
REFERENCES
[1] H. Yin and S. Alamouti, “OFDMA: A Broadband Wireless Access Technology,” Proc. IEEE Sarnoff Symp., pp. 1-4, 2006.
[2] 802.16: Air Interface for Broadband Wireless Access Systems, IEEE Standard, 2009.
[3] 3GPP, LTE Release 10 and Beyond (LTE Advanced), RP-090939, http://www.3gpp.org/LTE-Advanced, 2013.
[4] K. Sundaresan and S. Rangarajan, “On Exploiting Diversity and Spatial Reuse in Relay-Enabled Wireless Networks,” Proc. ACM MobiHoc, 2008.
[5] R. Ahlswede, N. Cai, S.Y. Li, and R.W. Yeung, “Network Information Flow,” IEEE Trans. Information Theory, vol. 46, no. 4, pp. 1204-1216, July 2000.
[6] S. Katti, H. Rahul, W. Hu, D. Katabi, M. Medard, and J. Crowcroft, “XORs in the Air: Practical Wireless Network Coding,” Proc. ACM SIGCOMM, 2006.
[7] H.J. Kushner and P.A. Whiting, “Convergence of Proportional-Fair Sharing Algorithms under General Conditions,” IEEE Trans. Wireless Comm., vol. 2, no. 6, pp. 1150-1158, Nov. 2003.
[8] O. Oyman, “OFDM2A: A Centralized Resource Allocation Policy for Cellular Multi-Hop Networks,” Proc. IEEE Asilomar Conf. Signals, Systems and Computers, 2006.
[9] Y.W. Cheong, R.S. Cheng, K.B. Latief, and R.D. Murch, “Multiuser OFDM with Adaptive Subcarrier, Bit, and Power Allocation,” IEEE J. Selected Areas in Comm., vol. 17, no. 10, pp. 1747-1758, Oct. 1999.
[10] D. Kivane, G. Li, and H. Liu, “Computationally Efficient Bandwidth Allocation and Power Control for OFDMA,” IEEE Trans. Wireless Comm., vol. 2, no. 6, pp. 1150-1158, Nov. 2003.
[11] M. Ergen, S. Coleri, and P. Varaiya, “QoS Aware Adaptive Resource Allocation Techniques for Fair Scheduling in OFDMA Based Broadband Wireless Access Systems,” IEEE Trans. Broadcasting, vol. 49, no. 4, pp. 362-370, Dec. 2003.
[12] M. Andrews and L. Zhang, “Scheduling Algorithms for Multi-Carrier Wireless Data Systems,” Proc. ACM MobiCom, 2007.
[13] S. Bodas, S. Shakkottai, L. Ying, and R. Srikant, “Low-complexity Scheduling Algorithms for Multi-Channel Downlink Wireless Networks,” Proc. IEEE INFOCOM, 2010.
[14] S. Mengesha and H. Karl, “Relay Routing and Scheduling for Capacity Improvement in Cellular WLANs,” Proc. Modeling and Optimization in Mobile, Ad Hoc and Wireless Networks (WiOpt), 2003.
[15] A. So and B. Liang, “Effect of Relaying on Capacity Improvement in Wireless Local Area Networks,” Proc. IEEE Wireless Comm. Networking Conf. (WCNC), 2005.
[16] K. Sundaresan, X. Wang, and M. Madihian, “Low-overhead Scheduling Algorithms for OFDMA Relay Networks,” Proc. Fourth Ann. Int’l Conf. Wireless Internet (WiCON), 2008.
[17] S. Deb, V. Mhatre, and V. Ramaiyan, “WiMAX Relay Networks: Opportunistic Scheduling to Exploit Multiuser Diversity and Frequency Selectivity,” Proc. ACM MobiCom, 2008.
[18] C.Y. Hong and A.C. Pang, “Link Scheduling with QoS Guarantee for Wireless Relay Networks,” Proc. IEEE INFOCOM, 2009.
[19] S. Rayanchu, S. Sen, J. Wu, S. Banerjee, and S. Sengupta, “Loss-Aware Network Coding for Unicast Wireless Sessions: Design, Implementation, and Performance Evaluation,” Proc. ACM SIGMETRICS Int’l Conf. Measurement and Modeling of Computer Systems, 2008.
[20] S. Sengupta, S. Rayanchu, and S. Banerjee, “An Analysis of Wireless Network Coding for Unicast Sessions: The Case for Coding-Aware Routing,” Proc. IEEE INFOCOM, 2007.
[21] J.L. Le, J.C.S. Lui, and D.M. Chiu, “DCAR: Distributed Coding-Aware Routing in Wireless Networks,” Proc. IEEE 28th Int’l Conf. Distributed Computing Systems (ICDCS), 2008.
[22] H. Xu and B. Li, “XOR-Assisted Cooperative Diversity in OFDMA Wireless Networks: Optimization Framework and Approximation Algorithms,” Proc. IEEE INFOCOM, 2009.
[23] X. Zhang and B. Li, “Network Coding Aware Dynamic Subcarrier Assignment in OFDMA Wireless Networks,” Proc. IEEE Int’l Conf. Comm. (ICC), 2008.
[24] Y. Xu, J.C.S. Lui, and D.M. Chiu, “Analysis and Scheduling of Practical Network Coding in OFDMA Relay Networks,” Computer Networks, vol. 53, pp. 2120-2139, 2009.
[25] Y. Liu, M. Tao, B. Li, and H. Shen, “Optimization Framework and Graph-Based Approach for Relay-Assisted Bidirectional OFDMA Cellular Networks,” IEEE Trans. Wireless Comm., vol. 9, no. 11, pp. 3490-3500, Nov. 2010.
[26] B.G. Kim and J.W. Lee, “Opportunistic Subchannel Scheduling for OFDMA Networks with Network Coding at Relay Stations,” Proc. IEEE GlobeCom, 2010.
[27] J. Padhye, R. Draves, and B. Zill, “Routing in Multi-Radio, Multi-Hop Wireless Mesh Network,” Proc. ACM MobiCom, 2004.
[28] T. Nguyen and Y. Han, “A Proportional Fairness Algorithm with QoS Provision in Downlink OFDMA Systems,” IEEE Comm. Letters, vol. 10, no. 11, pp. 760-762, Nov. 2006.
[29] Y. Ma, “Rate Maximization for Downlink OFDMA with Proportional Fairness,” IEEE Trans. Vehicular Technology, vol. 57, no. 5, pp. 3267-3274, Sept. 2008.
[30] V.V. Vazirani, Approximation Algorithms. Springer, 2001.
[31] L. Tassiulas and A. Ephremides, “Stability Properties of Constrained Queueing Systems and Scheduling Policies for Maximum Throughput in Multihop Radio Networks,” IEEE Trans. Automatic Control, vol. 37, no. 12, pp. 1936-1948, Dec. 1992.
[32] M. Andrews, K. Kumaran, K. Ramanan, A. Stolyar, R. Vijayakumar, and P. Whiting, “Providing Quality of Service over a Shared Wireless Link,” IEEE Comm. Magazine, vol. 39, no. 2, pp. 150-154, Feb. 2001.
[33] M. Neely, E. Modiano, and C. Rohrs, “Power and Server Allocation in a Multi-Beam Satellite with Time Varying Channels,” Proc. IEEE INFOCOM, 2002.
[34] M. Fisher, G. Nemhauser, and L. Wolsey, “An Analysis of Approximations for Maximizing Submodular Set Functions-II,” Math. Programming Study, vol. 14, pp. 265-294, 1978.
[35] J. Gross, H. Geerdes, H. Karl, and A. Wolisz, “Performance Analysis of Dynamic OFDMA Systems with Inband Signaling,” IEEE J. Selected Areas in Comm., vol. 24, no. 3, pp. 427-436, Mar. 2006.
[36] J. Broch, D.A. Maltz, D.B. Johnson, Y.-C. Ju, and J. Jetcheva, “A Performance Comparison of Multi-Hop Wireless Ad Hoc Network Routing Protocols,” Proc. ACM MobiCom, 1998.
[37] J.K. Cavers, Mobile Channel Characteristics. Kluwer Academic Publishers, 2000.
[38] GLPK (GNU Linear Programming Kit), version 4.8, http://www.gnu.org/s/glpk/, 2013.
Bin Tang received the BS degree in computer science from Nanjing University, Nanjing, China, in 2007, where he is currently working toward the PhD degree with the Department of Computer Science and Technology. His research interests lie in the area of communications, network coding, and distributed computing, with a focus on the application of network coding to file distribution in various networking environments. He is a student member of the IEEE.
Baoliu Ye received the PhD degree in computer science from Nanjing University, China, in 2004. He is currently an associate professor at the Department of Computer Science and Technology, Nanjing University, China. He served as a visiting researcher at the University of Aizu, Japan, from March 2005 to July 2006. His current research interests include peer-to-peer (P2P) computing, online/mobile social networking, and wireless networks. He has published more than 40 technical papers in the above areas. He served as the TPC cochair of HotPOST’12, HotPOST’11, and P2PNet’10. He is the regent of CCF and a member of the IEEE and the ACM.
Sanglu Lu received the BS, MS, and PhD degrees from Nanjing University in 1992, 1995, and 1997, respectively, all in computer science. She is currently a professor in the Department of Computer Science & Technology and the State Key Laboratory for Novel Software Technology. Her research interests include distributed computing, wireless networks, and pervasive computing. She has published more than 80 papers in refereed journals and conferences in the above areas. She is a member of the IEEE.
Song Guo received the PhD degree in computer science from the University of Ottawa, Canada. He is currently an associate professor with the School of Computer Science and Engineering, the University of Aizu, Japan. His research interests are mainly in the areas of protocol design and performance analysis for computer and telecommunication networks. He has published more than 150 papers in refereed journals and conferences in these areas. He is currently an associate editor of IEEE Transactions on Parallel and Distributed Systems, Wiley Wireless Communications and Mobile Computing, Ad Hoc & Sensor Wireless Networks, and so on. He is a senior member of the IEEE.
APPENDIX A
PROOF OF THEOREM 4
We prove this theorem by explicitly constructing a PTAS for the CA-LA problem. In the following, we first introduce another integer linear program (CA-LA-c) that is correlated with CA-LA. Then we present a PTAS for CA-LA-c. Based on this result, we finally derive a PTAS for CA-LA.

A.1 Correlated Integer Linear Programming
By restricting $Q^d$ ($Q^u$) to be the maximum downlink (uplink) data amount that can be transmitted by the sub-channels in the relay sub-frames, CA-LA-c is defined as follows:

CA-LA-c:
$$\max \sum_{c \in C_3} \sum_{k=1}^{3} \beta_k r_3(c,k) I_3(c,k) \qquad (1)$$
$$\text{s.t.} \quad \sum_{c \in C_3} \sum_{k \in \{1,3\}} r_3(c,k) I_3(c,k) \le Q^d \qquad (2)$$
$$\sum_{c \in C_3} \sum_{k \in \{2,3\}} r_3(c,k) I_3(c,k) \le Q^u \qquad (3)$$
$$\sum_{k=1}^{3} I_3(c,k) \le 1, \quad \forall c \in C_3 \qquad (4)$$
$$I_3(c,k) \in \{0,1\}, \quad \forall c \in C_3,\ k \in \{1,2,3\} \qquad (5)$$

where $\beta_1 = \frac{1}{D}$, $\beta_2 = \frac{1}{U}$, and $\beta_3 = \frac{1}{D} + \frac{1}{U}$.

Before building the relationship between CA-LA and CA-LA-c, we have to introduce some useful concepts first. We denote the set of sub-channels used in mode $k$ as $C_3^k$, and for each $c \in C_3^k$, we define $\beta_k r_3(c,k)$ as its virtual profit $v(c)$ in CA-LA, and as its profit $p(c)$ in CA-LA-c. As the names suggest, the objective value of CA-LA may be smaller than the sum of $v(c)$, while the objective value of CA-LA-c is exactly the sum of $p(c)$. Denote the optimal values of CA-LA and CA-LA-c by $OPT$ and $OPT_1$, respectively. The following lemma shows that there is only a small gap between $OPT$ and $OPT_1$.

Lemma A.1: For any optimal assignment strategy of CA-LA, there must be two sub-channels $c_1$ and $c_2$, such that $OPT \le OPT_1 + v(c_1) + v(c_2)$.

Proof: Consider an arbitrary optimal assignment strategy $C_3^1 \cup C_3^2 \cup C_3^3$ of CA-LA. We will adjust it to obtain a feasible solution of CA-LA-c with a performance guarantee. We number the sub-channels in $C_3$ as $1, 2, \ldots, C_3$. Without loss of generality, we assume that $\sum_{k \in \{1,3\}} \sum_{c \in C_3^k} r_3(c,k) > Q^d$ and $\sum_{k \in \{2,3\}} \sum_{c \in C_3^k} r_3(c,k) > Q^u$. … and $\sum_{k \in \{2,3\}} \sum_{c \in C_3^k :\, c \cdots} r_3(c,k) \le Q^u$ … $> Q^u$. After removing all sub-channels $c \ge i_3$, i.e., $C_3^2 \leftarrow C_3^2 - \{c : c \ge i_3\}$, …
… $\sum_{k \in \{1,3\}} \sum_{c \in C_3^k :\, c \cdots}$ … $0$ and $x \ge 0$ is repeatedly used. Now we analyze the time complexity of PLA-c($\epsilon$). There are $\sum_{i=1}^{\eta} 3^i \binom{C_3}{i} \le \sum_{i=1}^{\eta} 3^i C_3^i = O(3^{\eta} C_3^{\eta})$ combinations, and for each combination, the running time is dominated by solving an LP with $4C_3$ variables. By using the ellipsoid method [3], the LP can be solved in $O(C_3^4)$ time. Thereby, the running time of PLA-c($\epsilon$) is $O(C_3^4 3^{\eta} C_3^{\eta}) = O(C_3^4 (3C_3)^{\lceil 3/\epsilon \rceil})$. The proof is accomplished.
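The counting step in the running-time argument can be checked numerically. The Python sketch below counts the ways of choosing at most $\eta$ sub-channels and giving each one of three modes, and compares the count with the looser bound obtained by replacing each binomial coefficient with a power of $C_3$. The function names and sample sizes are hypothetical.

```python
from math import comb

def num_combinations(C3, eta):
    """Ways to pick i <= eta sub-channels (i >= 1) and give each one of 3 modes."""
    return sum(3**i * comb(C3, i) for i in range(1, eta + 1))

def loose_bound(C3, eta):
    """Bound obtained from comb(C3, i) <= C3**i, i.e. the sum of (3*C3)**i."""
    return sum((3 * C3) ** i for i in range(1, eta + 1))

exact = num_combinations(30, 2)   # 3*30 + 9*435
bound = loose_bound(30, 2)        # 90 + 8100
```

With 30 sub-channels and $\eta = 2$, the exact count is 4,005 against a bound of 8,190; the gap widens quickly as $\eta$ grows, which is why the scheme is of theoretical rather than practical interest.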
A.3 PTAS for CA-LA
As Lemma A.1 implies, the gap between the solutions of CA-LA and CA-LA-c is at most the sum of the virtual profits of two sub-channels. Because the virtual profits could be extremely large compared to $p_{\min}$, PLA-c($\epsilon$) may not be a PTAS for CA-LA by just adjusting the parameter $\eta$. Fortunately, we can derive a PTAS for CA-LA using a similar approach and the theoretical results in previous sections. At the beginning, the most virtually profitable assignment with $\eta = \lceil 4/\epsilon \rceil$ sub-channels is obtained by guessing. Rather than …
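Algorithm PLA()
1: $\eta \leftarrow \min\{C_3, 4/\epsilon\}$.
2: $z_A \leftarrow 0$.
3: for any disjoint subsets $S_1$, $S_2$ and $S_3$ of $C_3$ with $|S_1| + |S_2| + |S_3| = \eta$ do
4:   $Q_A^d \leftarrow Q^d - \sum_{k \in \{1,3\}} \sum_{c \in S_k} r_3(c, k)$.
5:   $Q_A^u \leftarrow Q^u - \sum_{k \in \{2,3\}} \sum_{c \in S_k} r_3(c, k)$.
6:   if $Q_A^d \le 0$ then
7:     Set remaining sub-channels in mode 2.
8:     $x \leftarrow \sum_{c \in S_3} r_3(c, 3) + \sum_{c \in C_3 - S_1 - S_3} r_3(c, 2)$.
9:     $z \leftarrow \frac{Q^d}{D} + \frac{\min\{Q^u, x\}}{U}$.
10:  else if $Q_A^u \le 0$ then
11:    Set remaining sub-channels in mode 1.
12:    $x \leftarrow \sum_{c \in S_3} r_3(c, 3) + \sum_{c \in C_3 - S_2 - S_3} r_3(c, 1)$.
13:    $z \leftarrow \frac{\min\{Q^d, x\}}{D} + \frac{Q^u}{U}$.
14:  else
15:    Construct an instance of CA-LA-c by replacing $Q^d$, $Q^u$ and $C_3$ with $Q_A^d$, $Q_A^u$ and $C_3 - \cup_{i=1}^{3} S_i$, respectively.
16:    Solve the instance using PLA-c($\epsilon/2$) with solution $z'$.
17:    $z \leftarrow z' + \frac{Q^d - Q_A^d}{D} + \frac{Q^u - Q_A^u}{U}$.
18:  end if
19:  if $z \ge z_A$ then
20:    $z_A \leftarrow z$.
21:  end if
22: end for
23: return $z_A$.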
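The enumeration performed by PLA() can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `rates[c][k-1]` is an assumed layout standing in for $r_3(c, k)$, `solve_cala_c` is a hypothetical callable standing in for PLA-c($\epsilon/2$), which is not reproduced here, and rounding $4/\epsilon$ up to an integer is also an assumption.

```python
from itertools import combinations, product
from math import ceil


def pla(rates, Qd, Qu, D, U, eps, solve_cala_c):
    """Sketch of the PLA() enumeration loop.

    rates[c][k-1]  : stands in for r_3(c, k), k in {1, 2, 3} (assumed layout)
    solve_cala_c   : hypothetical stand-in for PLA-c(eps/2); called as
                     solve_cala_c(rest, QdA, QuA) and returns a value z'
    """
    n = len(rates)
    eta = min(n, ceil(4 / eps))  # rounding up is an assumption
    best = 0.0
    for chosen in combinations(range(n), eta):
        rest = [c for c in range(n) if c not in chosen]
        # Assign each guessed sub-channel to one of the three modes.
        for modes in product((1, 2, 3), repeat=eta):
            S = {1: [], 2: [], 3: []}
            for c, k in zip(chosen, modes):
                S[k].append(c)
            QdA = Qd - sum(rates[c][k - 1] for k in (1, 3) for c in S[k])
            QuA = Qu - sum(rates[c][k - 1] for k in (2, 3) for c in S[k])
            if QdA <= 0:
                # Downlink demand already met: everything else goes to mode 2.
                x = (sum(rates[c][2] for c in S[3])
                     + sum(rates[c][1] for c in S[2] + rest))
                z = Qd / D + min(Qu, x) / U
            elif QuA <= 0:
                # Uplink demand already met: everything else goes to mode 1.
                x = (sum(rates[c][2] for c in S[3])
                     + sum(rates[c][0] for c in S[1] + rest))
                z = min(Qd, x) / D + Qu / U
            else:
                # Neither demand is met: defer to the CA-LA-c solver.
                z1 = solve_cala_c(rest, QdA, QuA)
                z = z1 + (Qd - QdA) / D + (Qu - QuA) / U
            best = max(best, z)
    return best
```

Passing a trivial solver such as `lambda rest, qd, qu: 0.0` exercises only the enumeration and the two saturated branches; a real run would need an actual CA-LA-c solver in its place.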
Theorem A.4: The algorithm PLA() is a PTAS for CA-LA with an approximation factor of at least $1 - \epsilon$ and a running time of $O(C_3^4 (3C_3)^{10/\epsilon})$.

Proof: Consider an optimal solution of CA-LA, where the set of sub-channels working in mode $k$ is denoted by $C_3^k$. If $\sum_{k \in \{1,3\}} \sum_{c \in C_3^k : c \le \eta} r_3(c, k) \ge Q^d$ or $\sum_{k \in \{2,3\}} \sum_{c \in C_3^k : c \le \eta} r_3(c, k) \ge Q^u$, then PLA() returns an optimal solution (steps 6-13). Otherwise, without loss of generality, we assume $v(1) \ge v(2) \ge \ldots \ge v(C_3)$. When the enumerated combination consists of sub-channels $1$ to $\eta$, PLA() solves an instance of CA-LA-c by executing PLA-c($\epsilon/2$) with sub-channel set $\{c \in C_3 : c > \eta\}$ and data amount bounds $Q_A^d$ and $Q_A^u$ (steps 15-17). Let $z^*$ and $z_1^*$ be the optimal values of the CA-LA and CA-LA-c problems on the remaining sub-channels, respectively. According to Lemma A.1, there must be some sub-channels $i_1 > \eta$ and $i_2 > \eta$ such that

$$z^* \le z_1^* + v(i_1) + v(i_2). \tag{17}$$
It is easily seen that the optimal value $OPT$ of the original CA-LA is

$$OPT = \sum_{c \le \eta} v(c) + z^*. \tag{18}$$
On the other hand, because PLA() traverses all subsets of $\eta$ sub-channels in $C_3$, the solution returned by PLA() satisfies

$$SOL \ge \sum_{c \le \eta} v(c) + \left(1 - \frac{\epsilon}{2}\right) z_1^*. \tag{19}$$
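By combining (17), (18) and (19), the approximation factor satisfies

$$\begin{aligned}
\frac{SOL}{OPT} &\ge \frac{\sum_{c \le \eta} v(c) + (1 - \frac{\epsilon}{2}) z_1^*}{\sum_{c \le \eta} v(c) + z^*} \\
&\ge \frac{\sum_{c \le \eta} v(c) + (1 - \frac{\epsilon}{2}) z_1^*}{\sum_{c \le \eta} v(c) + z_1^* + v(i_1) + v(i_2)} \\
&\ge \frac{\sum_{c \le \eta} v(c) + z_1^*}{\sum_{c \le \eta} v(c) + z_1^* + v(i_1) + v(i_2)} - \frac{\epsilon}{2} \\
&\ge \frac{\sum_{c \le \eta} v(c)}{\sum_{c \le \eta} v(c) + v(i_1) + v(i_2)} - \frac{\epsilon}{2}.
\end{aligned}$$

Recalling that $v(1) \ge v(2) \ge \cdots \ge v(C_3)$, we finally have

$$\begin{aligned}
\frac{SOL}{OPT} &\ge \frac{\eta v(\eta)}{\eta v(\eta) + v(i_1) + v(i_2)} - \frac{\epsilon}{2} \\
&\ge \frac{\eta v(\eta)}{\eta v(\eta) + v(\eta) + v(\eta)} - \frac{\epsilon}{2} \\
&= \frac{\eta}{\eta + 2} - \frac{\epsilon}{2} \\
&\ge 1 - \epsilon.
\end{aligned}$$

Based on the time complexity of PLA-c(), it is straightforward to show that the time complexity of PLA() is $O(3^{\eta} C_3^{\eta} C_3^4 (3C_3)^{3/(\epsilon/2)}) = O(C_3^4 (3C_3)^{10/\epsilon})$.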
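The final inequality and the exponent in the running time both appear to follow from the choice of $\eta$. The check below is a sketch that assumes $\eta$ takes the value $4/\epsilon$ exactly (that is, $C_3 \ge 4/\epsilon$, ignoring integer rounding):

```latex
\[
\frac{\eta}{\eta + 2}
  = \frac{4/\epsilon}{4/\epsilon + 2}
  = \frac{2}{2 + \epsilon}
  = 1 - \frac{\epsilon}{2 + \epsilon}
  \ge 1 - \frac{\epsilon}{2},
\qquad\text{hence}\qquad
\frac{\eta}{\eta + 2} - \frac{\epsilon}{2} \ge 1 - \epsilon .
\]
\[
3^{\eta} C_3^{\eta} \, (3C_3)^{3/(\epsilon/2)}
  = (3C_3)^{4/\epsilon} \, (3C_3)^{6/\epsilon}
  = (3C_3)^{10/\epsilon} .
\]
```

For instance, with $\epsilon = 0.5$ this gives $\eta = 8$ and $\eta/(\eta+2) - \epsilon/2 = 0.8 - 0.25 = 0.55 \ge 0.5$.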
REFERENCES
[1] A.M. Frieze and M.R.B. Clarke, "Approximation Algorithms for the m-Dimensional 0-1 Knapsack Problem: Worst-Case and Probabilistic Analyses," European Journal of Operational Research, vol. 15, no. 1, pp. 100-109, 1984.
[2] D.B. Shmoys and É. Tardos, "An Approximation Algorithm for the Generalized Assignment Problem," Mathematical Programming, vol. 62, pp. 461-474, 1993.
[3] D. Alevras and M.W. Padberg, Linear Optimization and Extensions: Problems and Solutions. Springer-Verlag, 2001.