2015 IEEE INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, SEPT. 17-20, 2015, BOSTON, USA
AN ONLINE LEARNING APPROACH TO THROUGHPUT OPTIMIZATION IN WIRELESS NETWORKS UNDER DYNAMIC AND UNKNOWN INTERFERENCE CONDITIONS

Ramesh Annavajjala, Rami S. Mangoubi, Christopher C. Yu and James M. Zagami
The Charles Stark Draper Laboratory
555 Technology Square, Cambridge, MA, 02139

ABSTRACT

In this paper, we consider a multi-user communication system with dynamically varying interference on block fading channels. We focus on a multi-antenna receiver, single-antenna transmitters, and the case in which the receiver has no knowledge of the channel state information, interference dynamics, and the variance of the additive noise. Pilot-assisted transmission techniques are employed to enable channel estimation at the receiver. For a given channel coherence length, increasing the number of pilots improves the estimation accuracy, with the tradeoff of reduction in data throughput. Thus, we propose to optimize the pilot content within the data frame to maximize the average data throughput. We employ well-known cross-validation techniques from the machine learning literature to simultaneously improve the estimation accuracy as well as the average throughput. Simulation results with the proposed approach suggest that even when the average number of active interferers is larger than the number of degrees of freedom, at least 85% of the ideal throughput can be achieved with the optimum pilot overhead.

Index Terms: multiple-access communication, dynamic interference, optimum combining, diagonal loading, cross-validation

1. INTRODUCTION

1.1. Background

The recent decade has witnessed near-exponential growth in the wireless data usage [1], and we are marching towards a fifth generation wireless technology to provide data rates of the order of multi-gigabits [2]. With network densification via small cells, and due to spatial-division multiple access (SDMA), interference in wireless networks has become one of the major bottlenecks to provide reliable communication.
A large body of research has focused on characterizing the effects of interference on the system performance [3]. A well-known approach to signal detection in the presence of interference is the use of the optimum combining receiver [4]. The optimum combining receiver is essentially a linear minimum-mean-square error (L-MMSE) receiver that maximizes the signal-to-interference-plus-noise ratio (SINR) at the output of the combiner. This is to be contrasted against the maximal ratio combining (MRC) receiver, which maximizes the signal-to-noise ratio (SNR) and is optimal when there is no interference. While MRC requires knowledge of the instantaneous channel state information (CSI) and the noise variance, the L-MMSE receiver requires additional information about the instantaneous interference covariance matrix. Also, in a dynamic interference environment, the number of active interferers can be random, in which case neither the MRC nor the L-MMSE receiver is optimal. To enable practical implementation of these algorithms, known (or pilot) symbols are typically inserted within the data frame so that the receiver can estimate the CSI of the desired and interfering users. Although pilot symbols can improve the estimation accuracy, they lead to a reduction in the data throughput.

978-1-4673-7454-5/15/$31.00 ©2015 IEEE

Traditionally, interference mitigation algorithms in the literature assume a fixed number of interferers, and focus on detection performance with either known or estimated CSI [5]. On the other hand, using random set theory and approximate Bayesian recursions, optimal joint detection of multiple users is formulated in [6, 7]. Since sample covariance matrix inversion (SMI) requires at least as many samples as the number of receive antennas, the impact of diagonal loading on the SMI-based L-MMSE receiver performance is studied in [8]. Also, the robustness of the Capon beamformer is studied in [9], wherein the authors employ the Lagrangian multiplier methodology to precisely compute the diagonal loading based on the ellipsoidal uncertainty set of the array steering vector. While many works have addressed computation of the optimal diagonal loading, these approaches assume either channel statistics or a deterministic number of interferers [10, 11].

Recently, there has been growing interest in the application of machine learning (ML) [12] techniques to wireless communications and networking [13]-[15], as problems in ML and communication share many similarities. For example, regression in ML is closely tied to continuous-valued parameter estimation in communication, whereas classification in ML bears similarity with detection of finite-dimensional signal constellations. In particular, for cognitive wireless networks, using support vector machines, the authors in [16] address the channel and modulation selection problem, whereas the radio-frequency channel characterization problem is studied in [17].
Using unsupervised learning, [18] studies the robust signal classification problem.

1.2. Problem Statement

In this paper, we address the problem of interference mitigation when the interference is dynamically varying and when the receiver has no knowledge of the CSI and the noise variance. We focus on a block fading channel model with single-antenna transmitters and multiple antennas at the receiver, and employ pilot-assisted transmission techniques to enable channel estimation at the receiver. Increasing the number of pilots improves the channel estimation accuracy, but the resulting overhead reduces the data throughput. We therefore propose to optimize the pilot content within the data frame to maximize the average data throughput. We employ well-known cross-validation (CV) techniques from the machine learning literature to simultaneously improve the estimation accuracy as well as the average throughput. For practical channel coherence
lengths, and for binary modulations, our simulation results suggest that significant throughput improvement can be achieved with minimal pilot overhead even when the total number of users is larger than the number of receive antennas.

The paper is organized as follows. In Section 2, we introduce the system model that captures dynamic interference conditions. The problem formulation is described in Section 3. A cross-validation approach to parameter estimation and signal detection is detailed in Section 4, and simulation results are presented in Section 5. We conclude this work in Section 6.

2. SYSTEM MODEL

Notation: Lower-case bold-faced variables denote column vectors (e.g., $\mathbf{x}$), whereas upper-case bold-faced variables denote matrices (e.g., $\mathbf{A}$). The identity matrix of size $N \times N$ is denoted by $\mathbf{I}_N$. The transpose (or Hermitian) of a vector or a matrix is denoted by $(\cdot)^\top$ (or $(\cdot)^\dagger$). A complex (or real) Gaussian random vector (cgRV or rgRV) $\mathbf{x}$ with mean $\mathbf{m}$ and covariance matrix $\mathbf{C}$ is denoted by $\mathbf{x} \sim \mathcal{CN}(\mathbf{m}, \mathbf{C})$ (or $\mathbf{x} \sim \mathcal{N}(\mathbf{m}, \mathbf{C})$). The expectation operator is denoted by $\mathbb{E}[\cdot]$. The size of (or the number of elements in) a set $\mathcal{S}$ is denoted by $|\mathcal{S}|$. For a scalar, vector, or matrix $\mathbf{X}$, $\Re\{\mathbf{X}\}$ denotes the corresponding real part.

We consider a communication link with a desired transmitter and its receiver, which is affected by a number of interfering transmitters. The maximum number of interferers is denoted by $K_{\max}$. In this work, we focus on the case when each transmitting node is equipped with a single transmit antenna. The desired receiver is assumed to have $N_R$ receive antennas. The air interface is such that a channel use is defined as communication on a specific frequency tone during a symbol period. This model, for example, corresponds to an OFDMA (orthogonal frequency-division multiple access) air interface. We consider a block fading channel model wherein the channel remains constant within a block of $N$ channel uses, and varies slowly across the blocks. We denote the block index by $m$, and the index of the channel use within a block by $n$. Assuming perfect symbol synchronization at the receiver, the $N_R \times 1$-dimensional signal vector at the receiver can be written as

$$\mathbf{y}(m,n) = \mathbf{h}_0(m)\,\alpha_0(m)\,x_0(m,n) + \sum_{k=1}^{K_{\max}} \mathbf{h}_k(m)\,\alpha_k(m)\,x_k(m,n) + \mathbf{v}(m,n) = \mathbf{H}(m)\,\boldsymbol{\alpha}(m)\,\mathbf{x}(m,n) + \mathbf{v}(m,n), \tag{1}$$

where $x_0(m,n)$ is the desired user's signal, $\mathbf{h}_0(m)$ is the $N_R \times 1$-dimensional channel from the desired user, $x_k(m,n)$ is the signal from the $k$-th interferer, $\mathbf{h}_k(m)$ is the $N_R \times 1$-dimensional channel from the $k$-th interferer, and $\mathbf{v}(m,n)$ is the additive noise with mean $\mathbf{0}$ and spatial covariance matrix $\mathbf{R}$. We also have $\mathbf{H}(m) = [\mathbf{h}_0(m), \ldots, \mathbf{h}_{K_{\max}}(m)]$ the global channel matrix, $\boldsymbol{\alpha}(m) = \mathrm{diag}\left([\alpha_0(m), \ldots, \alpha_{K_{\max}}(m)]^\top\right)$ the diagonal matrix of the user activation factors, and $\mathbf{x}(m,n) = [x_0(m,n), \ldots, x_{K_{\max}}(m,n)]^\top$ the vector-valued symbols of all the users. For simplicity, we assume $\mathbf{v}(m,n)$ to be independent and identically distributed (i.i.d.) complex-Gaussian across channel uses.

The coefficients $\alpha_k(m)$ represent the activity factors for user $k$, $k = 0, \ldots, K_{\max}$. A simple model for $\alpha_k(m)$ is an i.i.d. Bernoulli distribution. That is, $\alpha_k(m) = 1$ with probability $p_k$, and is 0 with probability $1 - p_k$. In this work, we set $p_0 = 1$ and $p_k = p$, $k = 1, \ldots, K_{\max}$. That is, the desired user is always present in the received signal model of (1). Note that
when $\alpha_k(m) = 1$, $\forall k = 0, \ldots, K_{\max}$, detection of $x_0(m,n)$ using a linear receiver requires $N_R \geq K_{\max} + 1$. It is important to realize that determining the set of active interferers from the received signal model in (1) is closely related to the model order determination problem [19]. With knowledge of the sample correlation matrix of the received signal and the underlying noise variance, this problem is well-studied in [20]-[24].

We note that a block of $N$ symbols, in practice, is generally partitioned into $N_P$ pilot symbols and $N_D$ data symbols. The pilot symbols enable the receiver to estimate the channel parameters, whereas the information is carried within the data symbols. At the receiver, a more realistic assumption is that the pilot symbols from the desired transmitter are known whereas they are unknown from the interfering transmitters. Without loss of generality, the first $N_P$ positions of the block are assumed to contain pilots. As a result, for $n = 1, \ldots, N_P$, we set $x_0(m,n) = 1$ and $x_k(m,n) = \pm 1$, with equal probability, for $k = 1, \ldots, K_{\max}$. The remaining $N_D$ of the $N$ symbols contain the modulation data that must be detected at the desired receiver. Note that each transmitter can employ a modulation format that is different from the other transmitters. For simplicity, we assume a common signal constellation that has $M$ modulation symbols.

The channels $\mathbf{h}_k(m)$, $k = 0, \ldots, K_{\max}$, can have a variety of distributions that strongly depend on the propagation environment. For simplicity, we assume that $\mathbf{h}_k(m) \sim \mathcal{CN}(\mathbf{0}, G_k \mathbf{I}_{N_R})$. This model assumes a rich scattering environment in which the channel gains are spatially uncorrelated, which is a valid assumption for widely spaced antenna elements. The variable $G_k$ captures the distance-dependent average channel power from user $k$. We also assume spatial independence of fading channels across the users.

3. OPTIMUM ONLINE LEARNING

Our goal is to devise algorithms for channel parameter estimation and signal detection in dynamic interference conditions. From an implementation standpoint, we constrain the receiver to use linear detection algorithms (such as zero-forcing or L-MMSE receiver approaches). Note that we have a basic assumption that $K_{\max} \leq N_R - 1$ for the feasibility of linear receivers when $p = 1$. However, the average number of interferers is $K_{\max}\,p$, which can be significantly less than $N_R$, depending upon the value of $p$, and a linear receiver with a fixed number of receive antennas can potentially withstand a group of interferers that is larger than $N_R - 1$.

We note that the number of symbols within a coherence block, $N$, is a function of the channel selectivity in time and frequency. Roughly, $N$ is proportional to the product of the channel coherence lengths in time and frequency. Also, $N$ should be higher than $N_R$ to make estimation of the channel covariance matrix at the receiver feasible using a portion of the pilot symbols. We also assume flexibility in our choice of $N_P$ and $N_D$ such that $N = N_P + N_D$. Note that choosing a higher $N_P$ provides good channel estimation accuracy, but at the cost of information rate. In this work, the noise covariance matrix is set to $\mathbf{R} = \sigma_n^2 \mathbf{I}_{N_R}$, and, in addition to the channel gains, the receiver does not have knowledge of $\sigma_n^2$.

With linear processing constraints at the receiver, let us denote by $\mathbf{w}_m$ the weight vector employed within block $m$ to detect the symbols $x_0(m,n)$, $n = N_P + 1, \ldots, N$. The detected symbol $\hat{x}_0(m,n)$ is simply

$$\hat{x}_0(m,n) = \mathrm{slicer}\left\{\mathbf{w}_m^\dagger \mathbf{y}(m,n), \mathcal{S}\right\}, \quad n = N_P + 1, \ldots, N, \tag{2}$$

where $\mathcal{S}$ is the signal constellation employed by the desired user, and
$\mathrm{slicer}\{z, \mathcal{S}\} = \arg\min_{x \in \mathcal{S}} |z - x|^2$ is the inverse mapping of the complex-valued signal $z$ to produce the nearest modulation symbol within $\mathcal{S}$. Note that since the channel gains and the noise variance are unknown, pilot symbols are used to estimate these parameters, which in turn are used to form the weight vector $\mathbf{w}_m$. The fraction of symbols that are correctly detected is termed the normalized throughput, and is given by

$$\mathcal{T}_m = \frac{1}{N} \sum_{n=N_P+1}^{N} 1_{\{\hat{x}_0(m,n) = x_0(m,n)\}}, \tag{3}$$

where $1_A$ is the indicator function that evaluates to 1 when the event $A$ is true, and is 0 when $A$ is false. Our goal is, for a block length $N$, to optimally allocate the pilots and data to maximize the average normalized throughput, $\mathbb{E}[\mathcal{T}_m]$. More formally, the optimization problem is:

$$\begin{aligned} [N_{P,\mathrm{opt}}, N_{D,\mathrm{opt}}] &= \underset{N_P, N_D}{\mathrm{argmax}}\ \mathbb{E}[\mathcal{T}_m] \quad \text{subject to } N_P + N_D = N \\ &= \underset{N_P, N_D}{\mathrm{argmax}}\ \frac{N_D}{N}\left(1 - P_e\right) \quad \text{subject to } N_P + N_D = N, \end{aligned} \tag{4}$$

where $P_e$ is the average symbol error probability. Since the constraints are integer valued, and $P_e$ is analytically intractable, it is rather hard to solve (4) analytically. Further, with pilot-based channel estimation, $P_e$ itself is a function of $N_P$. To proceed further, we choose a set of pilot/data partitions, $\mathbf{n}^{(i)} = [N_P^{(i)}, N_D^{(i)}]^\top$, such that $N_P^{(i)} + N_D^{(i)} = N$. For each partition $i$, we employ cross-validation principles from the ML literature for robust weight vector computation, and record the throughput achieved, $\mathbb{E}[\mathcal{T}_m]^{(i)}$. The optimal partition, $i^\ast$, is simply $i^\ast = \mathrm{argmax}_i\, \mathbb{E}[\mathcal{T}_m]^{(i)}$. The main advantage of this approach is that the search complexity is fully controlled by the number of partitions, and we only need to search around the small-to-moderate pilot sizes. Since many parameters in the model (1) are unknown, we expect that cross-validation approaches provide best-in-class estimation as well as detection performances for both in-sample and out-of-sample data.

4. CROSS VALIDATION APPROACH

4.1. Ideal Performance

Before we embark on cross-validation approaches to the throughput optimization problem in (4), we first look at the best possible performance under ideal channel knowledge. This ideal performance also serves as an upper bound on what is achievable by any learning algorithm. With ideal channel knowledge, we drop the index of the coherence block $m$. Within a coherence block, we denote by $\mathcal{I} = \{k_1, \ldots, k_K\}$ the set of active interferers. The instantaneous interference channel matrix can then be denoted by $\mathbf{H}_{\mathcal{I}}$, which is given by

$$\mathbf{H}_{\mathcal{I}} = [\mathbf{h}_{k_1}, \ldots, \mathbf{h}_{k_K}]. \tag{5}$$

Having knowledge of $\mathbf{h}_0$, $\mathbf{H}_{\mathcal{I}}$ and the noise variance $\sigma_n^2$, the linear MMSE weight vector at the receiver is

$$\mathbf{w}_{\mathrm{eff}} = \frac{\mathbf{R}_{\mathrm{ideal}}^{-1} \mathbf{h}_0}{\mathbf{h}_0^\dagger \mathbf{R}_{\mathrm{ideal}}^{-1} \mathbf{h}_0}, \tag{6}$$

where $\mathbf{R}_{\mathrm{ideal}}$ is the ideal channel covariance matrix, which is given by

$$\mathbf{R}_{\mathrm{ideal}} = \mathbf{h}_0 \mathbf{h}_0^\dagger + \sum_{k \in \mathcal{I}} \mathbf{h}_k \mathbf{h}_k^\dagger + \sigma_n^2 \mathbf{I}_{N_R}, \tag{7}$$

and the denominator in (6) ensures that $\mathbf{w}^\dagger \mathbf{h}_0 = 1$ so that the desired user's signal, upon the application of $\mathbf{w}$, has no channel-specific scaling. The equalized symbols for the desired user are given by

$$\hat{x}_0(n) = \mathrm{slicer}\left\{\mathbf{w}_{\mathrm{eff}}^\dagger \mathbf{y}(m,n), \mathcal{S}\right\}, \quad n = N_P + 1, \ldots, N. \tag{8}$$

The probability of error in detecting $x_0(n)$ when all the users employ a binary constellation is given by

$$P_e = \mathbb{E}\left[\mathcal{Q}\left(\sqrt{2}\, \frac{\Re\left\{\mathbf{w}_{\mathrm{eff}}^\dagger \mathbf{h}_0 + \sum_{k \in \mathcal{I}} \mathbf{w}_{\mathrm{eff}}^\dagger \mathbf{h}_k x_k\right\}}{\sigma_n \|\mathbf{w}_{\mathrm{eff}}\|}\right)\right], \tag{9}$$

where $\mathcal{Q}(x)$ is the complementary cumulative distribution function of a standard Gaussian random variable, and the expectation is over $\mathbf{h}_0$, and, for $k \in \mathcal{I}$, $\{\mathbf{h}_k, x_k\}$. In (9), $x_k = \pm 1$, with equal probability, are the modulation symbols of the $k$-th active interferer. We also note that when there is no interference (i.e., $K = 0$), the optimal detection rule is maximal ratio combining (MRC) with the weights $\mathbf{w} = \mathbf{h}_0 / \|\mathbf{h}_0\|^2$, and the error probability takes a form different from (9) as

$$P_{e,\mathrm{MRC},K=0} = \mathbb{E}\left[\mathcal{Q}\left(\sqrt{2}\, \frac{\|\mathbf{h}_0\|}{\sigma_n}\right)\right], \tag{10}$$

and the expectation in (10) is over the channel $\mathbf{h}_0$. Since (9) and (10) are not functions of $N_P$, it follows that the optimal throughput is achieved by setting $N_P = 0$ and $N_D = N$. That is, as one would expect, with genie-aided channel information, all the symbols within a block are used for data transmission.

4.2. Channel Estimation and Beamforming via Cross-Validation

We now describe a procedure that performs channel estimation, signal detection, and optimization of training and data phases to maximize the normalized throughput. We first divide the pilot portion of the frame into training and validation phases. We define by $\delta$ the ratio between the number of symbols for training and the number of pilot symbols. With this, $N_{P,t} = \delta N_P$ is the number of pilot symbols available for training and $N_{P,v} = (1 - \delta) N_P$ is the number of pilot symbols available for validation. The set of pilot indices $\mathcal{I}_P$ is partitioned into $\mathcal{I}_t$ and $\mathcal{I}_v$ such that $\mathcal{I}_t$ contains the pilot indices for training, whereas $\mathcal{I}_v$ contains the indices for testing. Using $\mathcal{I}_t$, a sample-mean based channel estimate is

$$\hat{\mathbf{h}}_0(m) = \frac{1}{|\mathcal{I}_t|} \sum_{n \in \mathcal{I}_t} \mathbf{y}(m,n) = \mathbf{h}_0(m) + \frac{1}{|\mathcal{I}_t|} \sum_{n \in \mathcal{I}_t} \sum_{k \in \mathcal{I}} \mathbf{h}_k(m)\, x_k(m,n) + \mathbf{v}_t(m), \tag{11}$$

where the second term in (11) is the inter-user (or multiple-access) interference, and $\mathbf{v}_t(m) \sim \mathcal{CN}\left(\mathbf{0}, \frac{\sigma_n^2}{|\mathcal{I}_t|} \mathbf{I}_{N_R}\right)$ is the channel estimation error (in the absence of any interference). An estimate of the overall covariance matrix using $\mathcal{I}_t$ is

$$\hat{\mathbf{R}}_{\mathrm{total}}(m) = \frac{1}{|\mathcal{I}_t|} \sum_{n \in \mathcal{I}_t} \mathbf{y}(m,n)\, \mathbf{y}^\dagger(m,n). \tag{12}$$
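As an illustration of the estimates in (11) and (12), the following is a minimal sketch in plain Python, not the implementation used for the reported results. The function names (`simulate_pilot_block`, `split_pilots`, `channel_estimate`, `sample_covariance`) are hypothetical, and the sketch assumes that the first $\delta N_P$ pilots form the training set, that the caller has already drawn the set of active interferers for the block, and that vectors are stored as lists of Python complex numbers.

```python
import random


def simulate_pilot_block(h0, interferers, sigma_n, num_pilots, rng):
    """Pilot-phase received vectors following (1); the desired pilot is +1."""
    received = []
    scale = sigma_n / 2 ** 0.5  # per-dimension std of the complex noise
    for _ in range(num_pilots):
        y = list(h0)
        for h_k in interferers:  # each active interferer sends +/-1 at random
            x_k = rng.choice((-1.0, 1.0))
            y = [y_i + h_i * x_k for y_i, h_i in zip(y, h_k)]
        y = [y_i + complex(rng.gauss(0.0, scale), rng.gauss(0.0, scale))
             for y_i in y]
        received.append(y)
    return received


def split_pilots(received, delta):
    """Assumed split: first delta fraction for training, rest for validation."""
    n_train = int(round(delta * len(received)))
    return received[:n_train], received[n_train:]


def channel_estimate(training):
    """Sample mean of the training vectors, as in (11)."""
    count = len(training)
    return [sum(y[i] for y in training) / count
            for i in range(len(training[0]))]


def sample_covariance(training):
    """(1/|I_t|) * sum of y y^dagger over the training vectors, as in (12)."""
    count = len(training)
    size = len(training[0])
    return [[sum(y[i] * y[j].conjugate() for y in training) / count
             for j in range(size)] for i in range(size)]


if __name__ == "__main__":
    rng = random.Random(0)
    n_rx = 4  # N_R in the text
    h0 = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_rx)]
    h1 = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_rx)]
    pilots = simulate_pilot_block(h0, [h1], sigma_n=1.0, num_pilots=50, rng=rng)
    train, valid = split_pilots(pilots, delta=0.8)
    print(len(train), len(valid))
    print(channel_estimate(train))
    print(sample_covariance(train)[0][0].real)
```

With no interferers and zero noise, `channel_estimate` returns $\mathbf{h}_0$ and `sample_covariance` returns $\mathbf{h}_0 \mathbf{h}_0^\dagger$, which is a convenient sanity check before interference and noise are added.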
An estimate of the noise variance is given by

$$\hat{\sigma}_n^2 = \frac{1}{|\mathcal{I}_t|\, N_R} \sum_{n \in \mathcal{I}_t} \left\|\mathbf{y}(m,n) - \hat{\mathbf{h}}_0(m)\right\|^2. \tag{13}$$

Note that the estimate (13) is biased, and this bias can be corrected relatively easily only when there is no interference. We propose the following weight vector to detect the desired user's modulation symbols:

$$\mathbf{w}_\epsilon(m) = \frac{\hat{\mathbf{R}}_{\mathrm{est},\epsilon}^{-1}(m)\, \hat{\mathbf{h}}_0(m)}{\hat{\mathbf{h}}_0^\dagger(m)\, \hat{\mathbf{R}}_{\mathrm{est},\epsilon}^{-1}(m)\, \hat{\mathbf{h}}_0(m)}, \tag{14}$$

where

$$\hat{\mathbf{R}}_{\mathrm{est},\epsilon}(m) = \hat{\mathbf{R}}_{\mathrm{total}}(m) + \epsilon\, \frac{\mathrm{trace}\left(\hat{\mathbf{R}}_{\mathrm{total}}(m)\right)}{N_R}\, \mathbf{I}_{N_R} \tag{15}$$

is an estimated covariance matrix of the received signal augmented with a diagonal load that is parameterized by $\epsilon$. We note that (14) provides a robust beamformer in the presence of unknown noise variance and dynamic interference, and, unlike [8]-[11], we determine the optimal $\epsilon$ solely based on the received data and known pilot symbols without regard to the statistics of interference and noise. The detected symbols using (14) are simply

$$\hat{x}_0(n) = \mathrm{slicer}\left\{\mathbf{w}_\epsilon^\dagger(m)\, \mathbf{y}(m,n), \mathcal{S}\right\}, \quad n = N_P + 1, \ldots, N. \tag{16}$$

Using the fact that $x_0(m,n) = 1$ for $n \in \mathcal{I}_v$, an optimal $\epsilon$ can be obtained by minimizing the sample MSE between the estimated and true pilot symbols, or by minimizing the sample error rate between the detected and true pilot symbols. That is,

$$\epsilon_{\star,\mathrm{MSE}} = \underset{\epsilon \in \mathcal{K}}{\mathrm{argmin}}\ \frac{1}{|\mathcal{I}_v|} \sum_{n \in \mathcal{I}_v} \left|1 - \mathbf{w}_\epsilon^\dagger(m)\, \mathbf{y}(m,n)\right|^2 \tag{17}$$

is the optimal $\epsilon$ that minimizes the sample MSE in the testing set, and

$$\epsilon_{\star,\mathrm{BER}} = \underset{\epsilon \in \mathcal{K}}{\mathrm{argmin}}\ \frac{1}{|\mathcal{I}_v|} \sum_{n \in \mathcal{I}_v} 1_{\left\{\mathrm{sign}\left(\Re\left\{\mathbf{w}_\epsilon^\dagger(m)\, \mathbf{y}(m,n)\right\}\right) \neq 1\right\}} \tag{18}$$

is the optimal $\epsilon$ that minimizes the sample BER in the testing set. Note that in (17) and (18), $\mathcal{K}$ is a set of $\epsilon$ values that the receiver must search over, and the overall detection complexity grows linearly with $|\mathcal{K}|$. Once the optimal $\epsilon$ is found, the receiver employs all the pilot symbols to estimate the channel, the overall covariance matrix, and the noise variance. The resulting weight vector $\mathbf{w}_{\epsilon_\star}(m)$ is used to detect all the data symbols within the frame. We refer to the optimal beamformer based on (17) as the MSE-CV-BF, and to the one based on (18) as the BER-CV-BF. The conventional beamformer without CV is termed C-BF; it is obtained by using all the pilots to estimate the desired channel, the interference-plus-noise covariance matrix, and the additive noise variance, and its diagonal load is simply the noise variance.

[Figure 1: two panels, (a) SNR = 0 dB and (b) SNR = 20 dB, each titled with K(max) = 0, NR = 4, N = 1000, δ = 0.8. Horizontal axis: Number of Pilots (10 to 100). Vertical axis: Normalized Throughput. Curves: With CV: Based on MSE; With CV: Based on mean pilot error; Without CV.]

Fig. 1: Normalized throughput under the first approach to interference modeling. Here, $K_{\max} = 0$, $\delta = 0.8$, $N_R = 4$ antennas, and a frame length of $N = 1000$ symbols.

5. SIMULATION RESULTS

5.1. Parameters and Methodology

In all the simulations, we set $N_R = 4$ receive antennas, employ binary constellations for all the users (i.e., $\mathcal{S} = \{-1, +1\}$), and set $\delta = 0.8$ (i.e., 80% of pilots for training and the remaining 20% for
validation), which is a general recommendation from the ML literature [12]. The channel coherence length $N$ is set to 1000 symbols, and the activity factor of interferers, $p$, is set to 0.5. The diagonal load search window $\mathcal{K}$, in dB, is chosen from $[-20, 20]$ in increments of 2. For each data/pilot partition, we generate 100000 independent realizations of (1). For each realization, we compute the optimal weight vector from (14) with the sample MSE minimizing $\epsilon_{\star,\mathrm{MSE}}$ from (17), or the sample BER minimizing $\epsilon_{\star,\mathrm{BER}}$ from (18). Using the optimal BF, the data symbols are detected as per (16), and the normalized throughput per realization is computed as per (3). Upon averaging (3) over the realizations, we obtain $\mathbb{E}[\mathcal{T}_m]$. In all the simulations, the throughput is further normalized by the ideal throughput with perfect CSI at the receiver.

The interference is modeled using two different approaches. In the first approach, all the active interferers transmit at the same average power level, which is denoted by $\bar{\gamma}_I = G_k / \sigma_n^2$, $k \in \mathcal{I}$, relative to the thermal noise power. If $\bar{\gamma}_0 = G_0 / \sigma_n^2$ denotes the average received SNR from the desired user, then $\bar{\gamma}_0 / \bar{\gamma}_I$ denotes the average carrier-to-interference ratio (CIR). In the second approach, each interferer is assumed to transmit at a power level that is uniformly distributed within $[-\Delta, \Delta]$ dB relative to a nominal value of $\bar{\gamma}_I$. This model allows for distance-dependent power variations and/or any residual errors due to open-loop power control. Under the second approach, we set the nominal CIR to be 0 dB and $\Delta = 10$ dB.

[Figure 2: two panels, (a) SNR = 0 dB and (b) SNR = 20 dB, each titled with K(max) = 5, p = 0.5, CIR = -10 dB, NR = 4, N = 1000, δ = 0.8. Horizontal axis: Number of Pilots (10 to 100). Vertical axis: Normalized Throughput. Curves: With CV: Based on MSE; With CV: Based on mean pilot error; Without CV.]

Fig. 2: Normalized throughput under the first approach to interference modeling. Here, $K_{\max} = 5$, $p = 0.5$, $\delta = 0.8$, CIR = -10 dB, $N_R = 4$ antennas, and a frame length of $N = 1000$ symbols.

[Figure 3: two panels, (a) SNR = 0 dB and (b) SNR = 20 dB, each titled with K(max) = 10, p = 0.5, CIR = 0 dB, NR = 4, N = 1000, δ = 0.8. Horizontal axis: Number of Pilots (10 to 100). Vertical axis: Normalized Throughput. Curves: With CV: Based on MSE; With CV: Based on mean pilot error; Without CV.]

Fig. 3: Normalized throughput under the second approach to interference modeling. Here, $K_{\max} = 10$, $p = 0.5$, $\delta = 0.8$, CIR = 0 dB, $\Delta = 10$ dB, $N_R = 4$ antennas, and a frame length of $N = 1000$ symbols.

5.2. Results and Observations

Under the first approach to interference modeling, the throughput is plotted as a function of $N_P$ for $\bar{\gamma}_0 \in \{0, 20\}$ dB. Fig. 1 depicts the throughput performance when $K_{\max} = 0$, whereas Fig. 2 assumes $K_{\max} = 5$ and $p = 0.5$. We observe from Fig. 1 that, in the absence of any interference, there is very little to be gained from the CV approach as the optimum load is 0. In fact, at very low SNRs (i.e., $\bar{\gamma}_0 = 0$ dB) and at very low $N_P$, there is a small degradation in performance with both MSE-CV-BF and BER-CV-BF relative to the C-BF. As the SNR increases to 20 dB, all the approaches yield identical performances. However, with interference, the performances are remarkably different, as shown in Fig. 2. From Fig. 2, we observe that, at both lower and higher operating SNRs, the proposed MSE-CV-BF and BER-CV-BF approaches significantly outperform C-BF. For example, with 1% pilot overhead, the throughput of MSE-CV-BF is around 83%, which is 15% higher than that of C-BF. We also notice that at lower $N_P$, MSE-CV-BF has a small advantage over BER-CV-BF, whereas at higher SNR and with larger $N_P$ these two approaches have comparable performances.

Under the second approach to interference modeling, Fig. 3 considers an over-loaded scenario with $K_{\max} = 10$ and $p = 0.5$. Note that the average number of interferers in this case is 5, which is higher than $N_R - 1 = 3$. When $\bar{\gamma}_0 = 0$ dB, we see that the normalized throughput with C-BF peaks around 87% with $N_P = 50$, whereas with just 25 pilots the normalized throughput improves to 92% with the MSE-CV-BF. As the SNR increases to 20 dB, we see a slight dip (to around 86% at $N_P = 50$) in the normalized throughput of C-BF,
whereas it increases to around 93% with the MSE-CV-BF. We also observe that, in the region of higher pilot overhead, the BER-CV-BF has a slightly inferior performance compared with the MSE-CV-BF at lower SNRs, and the two approaches have comparable performances as the SNR increases. However, for lower pilot overhead, MSE-CV-BF offers superior performance compared with BER-CV-BF.

6. CONCLUSION

Traditionally, interference mitigation algorithms in the literature have focused on either identifying/estimating a deterministic number of interference channels or employing a variety of receivers with either ideal or estimated channels. In this work, we have addressed the problem of robust interference mitigation with linear receivers for data throughput optimization when the receiver has no knowledge of the channel statistics, and when the interference itself is dynamically varying across the channel coherence length. Using the cross-validation principles from machine learning, we have obtained the optimum data and pilot allocation to maximize the average throughput. Our results have shown that even when the average number of active interferers, $K_{\max}\,p$, is larger than the number of degrees of freedom, $N_R - 1$, a normalized throughput of at least 85% can be achieved with the optimum pilot overhead.

7. REFERENCES

[1] Cisco White Paper, "Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2014-2019." Available at http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/white_paper_c11-520862.html.
[2] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanley, A. Lozano, A. C. K. Soong, J. C. Zhang, "What will 5G be?," IEEE Journal on Selected Areas in Commun., vol. 32, no. 6, pp. 1065-1082, June 2014.
[3] P. Stavroulakis, Ed., Interference Analysis of Communication Systems, IEEE Press Selected Reprint Series, 1980.
[4] J. H. Winters, "Optimum combining in digital mobile radio with cochannel interference," IEEE Journal on Selected Areas in Commun., vol. 2, no. 4, pp. 528-539, July 1984.
[5] M. L. Honig, Ed., Advances in Multiuser Detection, John Wiley & Sons, 2009.
[6] E. Biglieri and M. Lops, "Multiuser detection in a dynamic environment. Part I: User identification and data detection," IEEE Trans. Info. Theory, vol. 53, no. 9, pp. 3158-3170, Sep. 2007.
[7] E. Biglieri and M. Lops, "Multiuser detection in a dynamic environment. Part II: Joint user identification and parameter estimation," IEEE Trans. Info. Theory, vol. 55, no. 5, pp. 2365-2374, May 2009.
[8] B. D. Carlson, "Covariance matrix estimation errors and diagonal loading in adaptive arrays," IEEE Trans. Aerospace and Electronics Systems, vol. 24, no. 4, pp. 397-401, July 1988.
[9] J. Li, P. Stoica, and Z. Wang, "On robust Capon beamforming and diagonal loading," IEEE Trans. Signal Processing, vol. 51, no. 7, pp. 1702-1715, July 2003.
[10] N. Ma and J. Goh, "Efficient method to determine diagonal loading value," in Proc. IEEE Int. Conf. Acoustics, Speech Signal Processing, vol. V, 2003, pp. 341-344.
[11] X. Mestre and M. A. Lagunas, "Finite sample size effect on minimum variance beamformers: Optimum diagonal loading factor for large arrays," IEEE Trans. Sig. Processing, vol. 54, no. 1, pp. 69-82, Jan. 2006.
[12] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Ed., Springer & Sons, 2013.
[13] C. Clancy, J. Hecker, E. Stuntebeck, and T. O'Shea, "Applications of machine learning to cognitive radio networks," IEEE Wireless Commun., vol. 14, no. 4, pp. 47-52, Aug. 2007.
[14] A. He, K. K. Bae, T. Newman, J. Gaeddert, K. Kim, R. Menon, L. Morales-Tirado, J. Neel, Y. Zhao, J. Reed, and W. Tranter, "A survey of artificial intelligence for cognitive radios," IEEE Trans. Vehicular Techno., vol. 59, no. 4, pp. 1578-1592, May 2010.
[15] M. Bkassiny, Y. Li and S. K. Jayaweera, "A survey on machine-learning techniques in cognitive radios," IEEE Comm. Surveys & Tutorials, vol. 15, no. 3, pp. 1136-1159, Third Quarter 2013.
[16] G. Xu and Y. Lu, "Channel and modulation selection based on support vector machines for cognitive radio," in Proc. International Conference on Wireless Communications, Networking and Mobile Computing (WiCOM), Sept. 2006, pp. 1-4.
[17] T. Atwood, "RF channel characterization for cognitive radio using support vector machines," Ph.D. dissertation, University of New Mexico, Nov. 2009.
[18] T. Clancy, A. Khawar, and T. Newman, "Robust signal classification using unsupervised learning," IEEE Trans. on Wireless Commun., vol. 10, no. 4, pp. 1289-1299, Apr. 2011.
[19] P. D. Grunwald, The Minimum Description Length Principle, The MIT Press, 2007.
[20] H. Akaike, "A new look at the statistical model identification," IEEE Trans. Automat. Contr., vol. 19, pp. 716-723, 1974.
[21] G. Schwarz, "Estimating the dimension of a model," Ann. Stat., vol. 6, pp. 461-464, 1978.
[22] J. Rissanen, "Modeling by shortest data description," Automatica, vol. 14, pp. 465-471, 1978.
[23] M. Wax and T. Kailath, "Detection of signals by information theoretic criteria," IEEE Trans. on Acoustic, Speech, and Signal Processing (ASSP), vol. 33, pp. 387-392, Apr. 1985.
[24] R. R. Nadakuditi and A. Edelman, "Sample eigenvalue based detection of high-dimensional signals in white noise using relatively few samples," IEEE Trans. Sig. Processing, vol. 56, no. 7, pp. 2625-2638, July 2008.