A Decoupling Approach for Low-Complexity Vector ... - IEEE Xplore

1 downloads 0 Views 299KB Size Report
A Decoupling Approach for Low-Complexity Vector Perturbation in. Multiuser Downlink Systems. Seok-Hwan Park, Student Member, IEEE, Hyeon-Seung Han,Β ...
IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 10, NO. 6, JUNE 2011

1697

A Decoupling Approach for Low-Complexity Vector Perturbation in Multiuser Downlink Systems Seok-Hwan Park, Student Member, IEEE, Hyeon-Seung Han, Student Member, IEEE, Sunho Lee, and Inkyu Lee, Senior Member, IEEE

Abstractβ€”In this letter, we propose an efficient algorithm which reduces the complexity of conventional vector perturbation schemes by searching the real and imaginary components of a perturbation vector individually. To minimize a performance loss induced from the decoupled joint search, we apply diagonal precoding at the transmitter whose parameters are iteratively optimized to maximize the chordal distance between subspaces spanned by the real and imaginary components. We also propose a simple non-iterative method with a slight performance loss which can achieve a significant complexity reduction compared to the conventional vector perturbation schemes. Index Termsβ€”Multiuser downlink, vector perturbation.

I. I NTRODUCTION

I

N a multiple-input multiple-output (MIMO) broadcast channel (BC) [1], co-channel interference (CCI) is inevitable when communicating to several users. A simple channel inversion (CI) technique was proposed to eliminate the CCI and allow independent signals to be directed to the intended users [2]. However, the performance of the CI is quite poor due to a power boosting effect for ill-conditioned channels. To enhance the performance of the CI, vector perturbation was introduced in [3], which adopts a modulo operator at both transmitter and receiver. With the modulo operation, the transmitter gains a degree of freedom to choose an element in the multiple of constellation to be transmitted as a desired value for the modulo operator input. In the vector perturbation technique where the data is perturbed such that the transmit power is minimized, finding a perturbation vector is a lattice closest-point problem which can efficiently be solved via sphere encoding (SE) [4] or LLL-aided lattice reduction (LR) [5]. Also, the vector perturbation scheme with limited lattice search was proposed in [6]. However, complexity of those algorithms is still problematic for a large number of users. In this letter, we introduce a new algorithm which further reduces the complexity of the conventional vector perturbation

Manuscript received July 6, 2010; revised October 5, 2010 and December 6, 2010; accepted February 16, 2011. The associate editor coordinating the review of this letter and approving it for publication was G. Vitetta. S.-H. Park was with the School of Electrical Eng., Korea University, Seoul. He is now with the Agency for Defense Development, Daejeon, Korea (email: [email protected]). H.-S. Han is with Korea District Heating Corp., Kyeongi-do, Korea. S. Lee is with the Dept. of Applied Math., Sejong University, Seoul, Korea (e-mail: [email protected]). Inkyu Lee is with the School of Electrical Eng., Korea University, Seoul, Korea (e-mail: [email protected]). This work was supported in part by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST) (No. 20100017909), and in part by the Seoul R&BD Program (WR080951). Digital Object Identifier 10.1109/TWC.2011.040411.101024

schemes such as the SE and LR algorithms by searching for the real and imaginary components of the perturbation vectors individually [7]. This provides us significant complexity savings since the complexity of the search problem is determined by its dimension. To minimize a performance loss induced by the decoupled joint search, diagonal precoding is applied at the transmitter. We first describe a method which iteratively optimizes parameters of the diagonal precoder. Since our main objective is to reduce the system cost, we also present a simple method of choosing parameters which can provide the performance almost identical to the optimized ones. It is confirmed that the proposed scheme can achieve a significant complexity reduction compared to the conventional vector perturbation algorithms. II. C ONVENTIONAL V ECTOR P ERTURBATION S YSTEMS Throughout this letter, normal letters represent scalar quantities, boldface letters indicate vectors, and boldface uppercase letters designate matrices. With a bar accounting for complex variables, for any complex notation 𝑐¯, we denote the real and imaginary part of 𝑐¯ by β„œ[Β― 𝑐] and β„‘[Β― 𝑐], respectively. We use (β‹…)𝑇 , (β‹…)𝐻 , ∣∣ β‹… ∣∣ and ∣∣ β‹… ∣∣𝐹 for transpose, complex conjugate transpose, 2-norm and Frobenius norm, respectively. We define span(X) as the column space of a matrix X. We consider a multiuser downlink system where the base station with 𝑁𝑇 antennas transmits independent data streams to 𝐾 single-antenna users. Let us define the 𝑁𝑇 -dimensional complex transmitted signal vector x, and the 𝐾-dimensional 𝑇 𝑦1 β‹… β‹… β‹… 𝑦¯𝐾 ] where π‘¦Β―π‘˜ is complex received signal vector y = [Β― the received signal at user π‘˜. Then, the corresponding complex vector equation can be written as y = Hx + w

(1)

where w ∈ ℂ𝐾×1 is the white Gaussian noise vector at the 2 users with zero mean and the covariance matrix πœŽπ‘€ I𝐾 , and the channel matrix H consists of 𝐾 Γ— 𝑁𝑇 independent and identically distributed (i.i.d.) complex Gaussian coefficients with zero mean and unit variance. The desired signal vector for 𝐾 users is denoted by u = [𝑒1 , 𝑒2 , β‹… β‹… β‹… , 𝑒𝐾 ]𝑇 whose elements are chosen from an 𝑀𝑐 -ary QAM with unit energy. In the CI, the transmit signal vector is formed as 1 x = √ Pu (2) 𝛾 where ( P 𝐻is)βˆ’1the right 𝐻 H HH and 𝛾 =

c 2011 IEEE 1536-1276/11$25.00 ⃝

pseudoinverse of H as P =  2 Pu denotes the normalization

1698

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 10, NO. 6, JUNE 2011

factor to satisfy the transmit power constraint. After passing through the channel, the received signals are interference-free, i.e., 1 π‘¦π‘˜ = √ π‘’π‘˜ + 𝑀 π‘˜ 𝛾 where π‘€π‘˜ represents the additive noise at user π‘˜. We notice √ that the effective channel gain is given by 1/ 𝛾 and is assumed to be known at all users. For ill-conditioned channels, the system performance is degraded due to boosting of 𝛾. The vector perturbation technique proposed in [3] attempts to reduce 𝛾 by adopting 1 x = √ P(u + 𝜏 l) 𝛾

(3)

where l is a complex integer vector 2 𝜏 denotes  of length 𝐾, a positive real number, and 𝛾 = P(u + 𝜏 l) represents the new normalization factor for the vector perturbation. The integer vector l which minimizes 𝛾 can be found as   P(u + 𝜏 lβ€² )2 . l = arg β€² min (4) l βˆˆβ„€πΎ +j℀𝐾 To solve (4), we can apply some efficient methods such as the SE [4] and LR [5] algorithms. Especially, the LR-based vector perturbation schemes have advantage in slow fading channels since the lattice basis reduction has to be performed only once for each channel realization. However, the complexity of these algorithms is problematic for a large number of users. III. D ECOUPLED V ECTOR P ERTURBATION S CHEME In this section, we propose a new algorithm which further reduces the complexity of the conventional vector perturbation techniques. For systems with 𝐾 users, the lattice closestpoint problem (4) becomes a 2𝐾-dimensional problem in realvalued representation which requires prohibitive complexity for large 𝐾. We address this problem by computing the real and imaginary part of integer vectors separately so that the closest-point problem (4) changes to two 𝐾-dimensional problems. The real-valued representation of P is given as [ ] ] [ β„œ[P] βˆ’β„‘[P] Λ™e P= = Pe P β„‘[P] β„œ[P] Λ™ e = [βˆ’β„‘[P]𝑇 β„œ[P]𝑇 ]𝑇 . where Pe = [β„œ[P]𝑇 β„‘[P]𝑇 ]𝑇 and P Then, the 2𝐾-dimensional problem in (4) is equivalent to ) ( β„œ[l], β„‘[l] 2 β€² Λ™ e (u𝐼 + 𝜏 l′𝐼 ) = arg β€² min (u + 𝜏 l ) + P (5) P e 𝑅 𝑅 l𝑅 ,l′𝐼 βˆˆβ„€2𝐾 β€²

β€²

where u𝑅 = β„œ[u], u𝐼 = β„‘[u], l′𝑅 = β„œ[l ], and l′𝐼 = β„‘[l ]. We will call the vector perturbation schemes using the SE and LR algorithms to solve (5) as the joint SE and joint LR, respectively. To reduce the system cost, we propose to find β„œ[l] and β„‘[l] individually as β„œ[l] = β„‘[l] =

arg β€²min βˆ₯Pe (u𝑅 + 𝜏 l′𝑅 )βˆ₯2 , l𝑅 βˆˆβ„€πΎ Λ™ e (u𝐼 + 𝜏 l′𝐼 )βˆ₯2 . arg β€²min βˆ₯P 𝐾 l𝐼 βˆˆβ„€

(6)

The proposed vector perturbation schemes which use the SE and LR algorithms to solve (6) will be referred to as the decoupled SE and decoupled LR, respectively. It is natural that solving (6) instead of (5) results in a significant performance loss since the joint search over 2𝐾 integers is transformed into the two 𝐾-dimensional searches. Thus, we attempt to minimize this performance loss by applying a diagonal precoding at the transmitter. Notice that (6) is equivalent to (5) if and only if the column Λ™ e . Thus, we try to make space of Pe is orthogonal to that of P two subspaces orthogonal by applying a diagonal precoding Fπœƒ at the transmitter as 1 x = √ P Fπœƒ (u + 𝜏 𝑙) 𝛾 where Fπœƒ = diag(1, π‘’π‘—πœƒ2 , β‹… β‹… β‹… , π‘’π‘—πœƒπΎ ). Here, we use a diagonal matrix since the effect of non-diagonal precoding cannot be compensated at the receiver side due to the lack of receive cooperation. It is noted that the base station should report the information about πœƒπ‘– to user 𝑖 through feedforward links (𝑖 = 2, β‹… β‹… β‹… , 𝐾). Λ™ 𝑒 are replaced with With the diagonal precoding, P𝑒 and P 𝑇 Λ™ Λ™ 𝑒,πœƒ = P𝑒,πœƒ and P𝑒,πœƒ where P𝑒,πœƒ = [β„œ[Pπœƒ ] β„‘[Pπœƒ ]𝑇 ]𝑇 , P 𝑇 𝑇 𝑇 [βˆ’β„‘[Pπœƒ ] β„œ[Pπœƒ ] ] . Here Pπœƒ is defined as Pπœƒ = P Fπœƒ . Let us define p𝑖,πœƒ and pΛ™ 𝑖,πœƒ as the 𝑖-th column of P𝑒,πœƒ and Λ™ 𝑒,πœƒ , respectively. Since we have p𝑇 pΛ™ = 0 and p𝑇 pΛ™ = P 𝑖,πœƒ 𝑖,πœƒ 𝑖,πœƒ 𝑗,πœƒ Λ™ 𝑒,πœƒ ) if and βˆ’p𝑇𝑗,πœƒ pΛ™ 𝑖,πœƒ , βˆ€π‘–, 𝑗, we achieve span(P𝑒,πœƒ ) βŠ₯ span(P only if p𝑇𝑖,πœƒ pΛ™ 𝑗,πœƒ = 0 for 𝑖 = 1, β‹… β‹… β‹… , 𝐾 βˆ’ 1 and 𝑗 = 𝑖 + ( ) 1, β‹… β‹… β‹… , 𝐾. Thus, 𝐾 2 requirements should be satisfied by using (𝐾 βˆ’ 1) variables, which is possible only when 𝐾 = 2. As we are interested in the general cases of 𝐾 > 2, we try to make the column space of Pe,πœƒ roughly orthogonal to the Λ™ e,πœƒ by adjusting πœƒ2 , β‹… β‹… β‹… , πœƒπΎ . To this end, column space of P the chordal distance [8] is adopted as the metric for assessing the orthogonality of two subspaces defined as Λ™ 2𝐹 Λ™ e,πœƒ ) β‰œ 𝐾 βˆ’ βˆ₯Q𝑇 Qβˆ₯ (7) dist(Pe,πœƒ , P Λ™ are orthonormal bases of the column spaces where Q and Q Λ™ e,πœƒ , respectively. If two subspaces are fully of Pe,πœƒ and P orthogonal, then the chordal distance will be 𝐾, whereas the distance becomes zero if they are the same. Thus, the optimization of πœƒ2 , β‹… β‹… β‹… , πœƒπΎ can be formulated as Λ™ e,πœƒ ) (8) max dist(Pe,πœƒ , P πœƒ2 ,β‹…β‹…β‹… ,πœƒπΎ

which is generally non-convex. To obtain the suboptimal parameters, we propose two methods: an iterative algorithm which guarantees a local optimal solution, and a non-iterative solution with a small performance loss. A. Iterative Optimization of πœƒ2 , β‹… β‹… β‹… , πœƒπΎ To solve the problem (8) efficiently, we propose an iterative algorithm which updates 𝐾 βˆ’ 1 phase angles in an alternating fashion. Thus, we now discuss the optimization of πœƒπ‘– with other angles πœƒπ‘— fixed (𝑗 βˆ•= 𝑖). Since only p𝑖,πœƒ and pΛ™ 𝑖,πœƒ depend Λ™ by performing the Gram-Schmidt on πœƒπ‘– , we compute Q and Q process in the following order. p1,πœƒ β†’ β‹… β‹… β‹… β†’ pπ‘–βˆ’1,πœƒ β†’ p𝑖+1,πœƒ β†’ β‹… β‹… β‹… β†’ p𝐾,πœƒ β†’ p𝑖,πœƒ , pΛ™ 1,πœƒ β†’ β‹… β‹… β‹… β†’ pΛ™ π‘–βˆ’1,πœƒ β†’ pΛ™ 𝑖+1,πœƒ β†’ β‹… β‹… β‹… β†’ pΛ™ 𝐾,πœƒ β†’ pΛ™ 𝑖,πœƒ .

PARK et al.: A DECOUPLING APPROACH FOR LOW-COMPLEXITY VECTOR PERTURBATION IN MULTIUSER DOWNLINK SYSTEMS

Λ™ are given by Following the above ordering, Q and Q [ ] [ ] Λ™ = Q ˜ q ,Q ΛœΛ™ qΛ™ Q= Q (9) ˜ and Q ΛœΛ™ are the orthonorwhere the columns ( of Q ) mal (bases of span [p1,πœƒ β‹… β‹… β‹… pπ‘–βˆ’1,πœƒ ) p𝑖+1,πœƒ β‹… β‹… β‹… p𝐾,πœƒ ] and span [pΛ™ 1,πœƒ β‹… β‹… β‹… pΛ™ π‘–βˆ’1,πœƒ pΛ™ 𝑖+1,πœƒ β‹… β‹… β‹… pΛ™ 𝐾,πœƒ ] , respectively, and q and qΛ™ are equal to ˜Q Λœπ‘‡p p𝑖,πœƒ βˆ’ Q 𝑖,πœƒ  and qΛ™ = q=  𝑇   ˜ ˜ p𝑖,πœƒ βˆ’ QQ p𝑖,πœƒ 

𝑇 ΛœΛ™ Q ΛœΛ™ p Λ™ pΛ™ 𝑖,πœƒ βˆ’ Q 𝑖,πœƒ  .   Λ™Λœ ΛœΛ™ 𝑇 Λ™  pΛ™ 𝑖,πœƒ βˆ’ QQ p𝑖,πœƒ

(10)

Notice that only q and qΛ™ are dependent on πœƒπ‘– . Using (9), the chordal distance (7) is computed as

= =

[ ] Λœπ‘‡ [ ] 2 Q Λ™ ˜ qΛ™ 𝐾 βˆ’ Q q𝑇 𝐹 𝑇 2 𝑇 2 Λ™ 𝑇 2  𝑇 2 ˜ ΛœΛ™ ˜ ˜ q βˆ’ q qΛ™  Q qΛ™ βˆ’ Q βˆ’ Q 𝐾 βˆ’ Q 𝐹 𝑇 2 Λ™ ˜ 𝐾 βˆ’2 (11) Q q

 𝑇 2  ˜ ΛœΛ™  Λ™ =0 where the last equality comes from Q Q = ∣q𝑇 q∣ 𝐹  𝑇     ˜   ΛœΛ™ 𝑇  and Q qΛ™  = Q q. In (11), the last term is the only one which is dependent on πœƒπ‘– , and is written as 2 𝑇 2 𝑇 ˜Q Λœπ‘‡p Λ™ Λ™ p𝑖,πœƒ βˆ’ Q 𝑖,πœƒ ˜ q = Q ˜ Q ˜Q ˜ 𝑇 p𝑖,πœƒ βˆ₯ βˆ₯p𝑖,πœƒ βˆ’ Q where p𝑖,πœƒ = p𝑖 cos πœƒπ‘– + pΛ™ 𝑖 sin πœƒπ‘– . Then, πœƒπ‘–,opt is given as πœƒπ‘–,opt

2 𝑇 ˜Q ˜ 𝑇 p𝑖,πœƒ Λ™ p𝑖,πœƒ βˆ’ Q ˜ Q = arg min 𝑇 πœƒπ‘– ˜Q ˜ p𝑖,πœƒ βˆ₯ βˆ₯p𝑖,πœƒ βˆ’ Q = arg max πœƒπ‘–

βˆ₯Ax𝑖 βˆ₯

2

(12)

𝑇 ˜Q ˜ 𝑇 )[p𝑖 p˙𝑖 ], B = Q ΛœΛ™ (I βˆ’ Q ˜Q ˜ 𝑇 )[p𝑖 p˙𝑖 ], where A = (I βˆ’ Q and x𝑖 = [cos πœƒπ‘– sin πœƒπ‘– ]𝑇 . It can be shown from the RayleighRitz quotient result [9] that the problem (12) is convex with respect to πœƒπ‘– for the given πœƒπ‘— (𝑗 βˆ•= 𝑖), and the solution is obtained as ( ) 𝑇 x𝑖,opt = [π‘₯1 π‘₯2 ] = maximum eigenvector (B𝑇 B)βˆ’1 A𝑇 A .

Finally, the optimal πœƒπ‘–,opt with other angles πœƒπ‘— fixed (𝑗 βˆ•= 𝑖) is computed as πœƒπ‘–,opt = tanβˆ’1

π‘₯1 . π‘₯2

TABLE I AVERAGE AND MAXIMUM NUMBER OF THE VISITED NODES FOR THE SE- BASED VECTOR PERTURBATION SYSTEMS Average

Maximum

𝑁𝑇 =𝐾=4

Joint SE Decoupled SE

64 28

5553 535

𝑁𝑇 =𝐾=6

Joint SE Decoupled SE

353 63

108861 1096

summarized as follows: Step 1. Initialize πœƒ2 , β‹… β‹… β‹… πœƒπΎ . Step 2. Update πœƒ2 , β‹… β‹… β‹… , πœƒπΎ with the following loop: For 𝑖 = 2 : 𝐾 ΛœΛ™ as ˜ and Q Construct Q ˜ ← orthonormal basis of Q ( ) span [p1,πœƒ β‹… β‹… β‹… pπ‘–βˆ’1,πœƒ p𝑖+1,πœƒ β‹… β‹… β‹… p𝐾,πœƒ ] , ΛœΛ™ ← orthonormal basis of Q ( ) span [pΛ™ 1,πœƒ β‹… β‹… β‹… pΛ™ π‘–βˆ’1,πœƒ pΛ™ 𝑖+1,πœƒ β‹… β‹… β‹… pΛ™ 𝐾,πœƒ ] . Update πœƒπ‘– according to (13). End Λ™ e,πœƒ ) using (7). Step 3. Compute the chordal distance dist(Pe,πœƒ , P Step 4. Go back to Step 2 until the convergence of the chordal distance. It should be emphasized that due to the convexity of (12) over πœƒπ‘– for the fixed πœƒπ‘— (𝑗 βˆ•= 𝑖), we have a non-decreasing chordal distance with respect to the number of iterations. Since Λ™ e,πœƒ ) is upper bounded by 𝐾, the chordal distance dist(Pe,πœƒ , P the sequence of the chordal distance generated by the above algorithm converges to a local optimal point [10]. We notice that if we apply the above algorithm to determine πœƒπ‘– ’s, then our main objective which is complexity reduction cannot be achieved. Thus, a simple non-iterative method is proposed in the following subsection. B. Non-Iterative Choice of πœƒ2 , β‹… β‹… β‹… , πœƒπΎ

2

βˆ₯Bx𝑖 βˆ₯

1699

(13)

The whole algorithm for iterative optimization of πœƒπ‘– ’s is

In this subsection, we propose a simple non-iterative choice of πœƒ2 , β‹… β‹… β‹… , πœƒπΎ by relaxing the orthogonal condition as ( ) (14) p1,πœƒ βŠ₯ span [pΛ™ 1,πœƒ β‹… β‹… β‹… pΛ™ 𝐾,πœƒ ] , ( ) pΛ™ 1,πœƒ βŠ₯ span [p1,πœƒ β‹… β‹… β‹… p𝐾,πœƒ ] . (15) We (select p1,πœƒ as) a representative basis vector for span([p1,πœƒ β‹… β‹… β‹… p𝐾,πœƒ ]) and make p1,πœƒ orthogonal to span [pΛ™ 1,πœƒ β‹… β‹… β‹… pΛ™ 𝐾,πœƒ ] . Similarly, pΛ™ 1,πœƒ( which is selected ) as a representative (basis for span )[pΛ™ 1,πœƒ β‹… β‹… β‹… pΛ™ 𝐾,πœƒ ] is orthogonalized to span [p1,πœƒ β‹… β‹… β‹… p𝐾,πœƒ ] . According to the property p𝑇𝑖,πœƒ pΛ™ 𝑗,πœƒ = βˆ’p𝑇𝑗,πœƒ pΛ™ 𝑖,πœƒ , both (14) and (15) are satisfied when βˆ’p𝑇1 p𝑖 sin πœƒπ‘– + p𝑇1 pΛ™ 𝑖 cos πœƒπ‘– = 0, for 𝑖 = 2, β‹… β‹… β‹… , 𝐾. The solution πœƒπ‘– of the above equation is given by ( 𝑇 ) p1 pΛ™ 𝑖 βˆ’1 πœƒπ‘– = tan , for 𝑖 = 2, β‹… β‹… β‹… , 𝐾. p𝑇1 p𝑖

(16)

In Figure 1, the cumulative distribution functions (CDF) √ of the effective channel gain 1/ 𝛾 with non-iterative and

1700

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 10, NO. 6, JUNE 2011

W

X WU` WU_ WU^

XW uΒ–GΒ—Β™ΒŒΒŠΒ–Β‹ΒΒ•ΒŽGOjpP kΒŒΒŠΒ–ΒœΒ—Β“ΒŒΒ‹GzlGOuΒ–GΒ‹ΒΒˆΒŽUGΒ—Β™ΒŒΒŠΒ–Β‹ΒΒ•ΒŽP kΒŒΒŠΒ–ΒœΒ—Β“ΒŒΒ‹GzlGOuΒ–Β•TΒΒ›ΒŒΒ™ΒˆΒ›ΒΒΒŒGGGG ΞΈP kΒŒΒŠΒ–ΒœΒ—Β“ΒŒΒ‹GzlGOpΒ›ΒŒΒ™ΒˆΒ›ΒΒΒŒGGG ΞΈP q–•›Gzl

TX

XW

ily

jkm

WU] WU\

TY

XW

WU[ WUZ

TZ

XW WUY WUX W W

uΒ–GΒ—ΒŒΒ™Β›ΒœΒ™Β‰ΒˆΒ›ΒΒ–Β•GOjpP kΒŒΒŠΒ–ΒœΒ—Β“ΒŒΒ‹GzlGOuΒ–GΒ‹ΒΒˆΒŽUGΒ—Β™ΒŒΒŠΒ–Β‹ΒΒ•ΒŽP kΒŒΒŠΒ–ΒœΒ—Β“ΒŒΒ‹GzlGOuΒ–Β•TΒΒ›ΒŒΒ™ΒˆΒ›ΒΒΒŒGGGG ΞΈP kΒŒΒŠΒ–ΒœΒ—Β“ΒŒΒ‹GzlGOpΒ›ΒŒΒ™ΒˆΒ›ΒΒΒŒGGG ΞΈP q–•›Gzl

T[

WUX

WUY

WUZ

WU[

WU\

XW

WU]

W

\

XW

X\

YW

Y\

zuyG‚‹iΒ„

1/ Ξ³

√ Fig. 1. CDF of 1/ 𝛾 for the SE-based vector perturbation system with 𝑁𝑇 = 𝐾 = 4 and 4-QAM.

Fig. 2. Performance comparison with sphere encoding for 𝑁𝑇 = 𝐾 = 4 and 4-QAM.

TABLE II AVERAGE NUMBER OF THE FLOATING POINT MULTIPLICATIONS FOR THE LR- BASED VECTOR PERTURBATION SYSTEMS 𝑁𝑇 = 𝐾 = 6

Joint LR

1166.9

3568.5

Decoupled LR

804.8

2540.5

TX

XW

ily

𝑁𝑇 = 𝐾 = 4

W

XW

iterative setting of πœƒπ‘– ’s are presented for SE-based vector perturbation systems with 𝑁𝑇 = 𝐾 = 4. It is observed that the proposed non-iterative method achieves the performance almost identical to the iterative scheme with much reduced complexity. Also, we can see that the diagonal precoding is essential for the proposed decoupled scheme since the decoupled SE without it shows significantly degraded performance. We briefly investigate the complexity of the conventional vector perturbation schemes based on joint search and the proposed decoupled approach. Table I shows the average number of the nodes visited by the SE algorithm to find the perturbation vector Β―l for various antenna configurations with 4-QAM modulation. The results in Table I are obtained through simulations over 10000 channel realizations. For the implementation of the SE-based vector perturbation, we have used a sphere decoding algorithm proposed in [11]. The proposed decoupled approach is simulated with the non-iterative setting of πœƒ2 , β‹… β‹… β‹… , πœƒπΎ . It is observed that for 4-by-4 system, the proposed decoupled approach achieves a complexity reduction of about 56% and 90% in terms of the average and maximum number of the visited nodes, respectively. Also, the complexity of the LR-based vector perturbation system is summarized in Table II in terms of the average number of the floating point multiplications. In order to focus on the dominant part in the system cost, the number of multiplications is counted during the LLL algorithm, the phase computation in (16) and the construction of the rotated inverse matrix Pπœƒ = P Fπœƒ . With the proposed decoupled approach, we can achieve a complexity saving of about 30% which is smaller compared to the SE-based system. However, it should be emphasized that a performance loss caused by decoupling

TY

XW

TZ

XW

uΒ–GΒ—ΒŒΒ™Β›ΒœΒ™Β‰ΒˆΒ›ΒΒ–Β•GOjpP kΒŒΒŠΒ–ΒœΒ—Β“ΒŒΒ‹GsyGOuΒ–GΒ‹ΒΒˆΒŽUGΒ—Β™ΒŒΒŠΒ–Β‹ΒΒ•ΒŽP kΒŒΒŠΒ–ΒœΒ—Β“ΒŒΒ‹GsyGOuΒ–Β•TΒΒ›ΒŒΒ™ΒˆΒ›ΒΒΒŒGGGG ΞΈP kΒŒΒŠΒ–ΒœΒ—Β“ΒŒΒ‹GsyGOpΒ›ΒŒΒ™ΒˆΒ›ΒΒΒŒGGG ΞΈP q–•›Gsy

T[

XW

W

\

XW

X\

YW

Y\

zuyG‚‹iΒ„

Fig. 3. Performance comparison with lattice-reduction for 𝑁𝑇 = 𝐾 = 4 and 4-QAM.

joint search is minor in the LR-based vector perturbation system. IV. S IMULATION R ESULTS In this section, we evaluate the bit-error rate (BER) performance of several vector perturbation systems. Throughout the 2 simulations, the signal-to-noise ratio (SNR) is defined as 1/πœŽπ‘€ and we set 𝜏 = 2(Ξ”/2 + βˆ£π‘βˆ£max ) in (4) where Ξ” and βˆ£π‘βˆ£max are defined in [3]. Figure 2 illustrates the results for the SEbased vector perturbation systems. As expected, the proposed decoupled SE significantly outperforms the CI scheme while achieving a significant complexity savings compared to the conventional joint SE scheme. Also, we can confirm the importance of the diagonal precoding since the decoupled SE without it suffers from a diversity loss. Our simple choice of πœƒπ‘– ’s provides the performance almost identical to that of iteratively optimized parameters. The BER performance of the vector perturbation systems employing LR is evaluated in Figure 3. The performance gap between the joint LR and the proposed decoupled LR reduces

PARK et al.: A DECOUPLING APPROACH FOR LOW-COMPLEXITY VECTOR PERTURBATION IN MULTIUSER DOWNLINK SYSTEMS

to the conventional perturbation scheme, we have adopted the diagonal precoding to divide the column space into two orthogonal subspaces efficiently. It is shown that the proposed decoupled scheme achieves a significant complexity reduction at the expense of a slight performance loss. We also verify that our decoupling approach can be efficiently combined with any conventional algorithms to further reduce the complexity.

W

XW

TX

XW

ily

1701

TY

XW

R EFERENCES TZ

XW

uΒ–GΒ—ΒŒΒ™Β›ΒœΒ™Β‰ΒˆΒ›ΒΒ–Β•GOjpP kΒŒΒŠΒ–ΒœΒ—Β“ΒŒΒ‹GsyGOuΒ–GΒ‹ΒΒˆΒŽUGΒ—Β™ΒŒΒŠΒ–Β‹ΒΒ•ΒŽP kΒŒΒŠΒ–ΒœΒ—Β“ΒŒΒ‹GsyGOuΒ–Β•TΒΒ›ΒŒΒ™ΒˆΒ›ΒΒΒŒGGGG ΞΈP kΒŒΒŠΒ–ΒœΒ—Β“ΒŒΒ‹GsyGOpΒ›ΒŒΒ™ΒˆΒ›ΒΒΒŒGGG ΞΈP q–•›Gsy

T[

XW

W

\

XW

X\

YW

Y\

zuyG‚‹iΒ„

Fig. 4. Performance comparison with lattice-reduction for 𝑁𝑇 = 𝐾 = 6 and 4-QAM.

significantly in comparison to the SE-based systems. Also, as in the SE, the simple non-iterative method provides the performance almost identical to the iterative method. One might think that a performance loss of the proposed scheme would be significant for a large number of users. We show that this is not the case in Figure 4 where the vector perturbation with the LR is evaluated for 𝑁𝑇 = 𝐾 = 6. The performance of the proposed decoupled scheme is shown to be within a few tenth of a dB away from that of the conventional joint LR system. V. C ONCLUSION In this letter, we have proposed a low-complexity vector perturbation scheme based on decoupled search of perturbation vectors. To minimize a performance loss compared

[1] S. Vishwanath, N. Jindal, and A. Goldsmith, β€œDuality, achievable rates, and sum-rate capacity of Gaussian MIMO broadcast channels," IEEE Trans. Inf. Theory, vol. 49, pp. 2658-2668, Oct. 2003. [2] C. B. Peel, B. M. Hochwald, and A. L. Swindlehurst, β€œA vectorperturbation techniques for near-capacity multiantenna multiuser communication–part I: channel inversion and regularization," IEEE Trans. Commun., vol. 53, pp. 195-202, Jan. 2005. [3] B. M. Hochwald, C. B. Peel, and A. L. Swindlehurst, β€œA vectorperturbation techniques for near-capacity multiantenna multiuser communication–part II: perturbation," IEEE Trans. Commun., vol. 53, pp. 537-544, Mar. 2005. [4] E. Viterbo and J. Boutros, β€œA universal lattice code decoder for fading channels," IEEE Trans. Inf. Theory, vol. 45, pp. 1639-1642, July 1999. [5] C. Windpassinger, R. F. H. Fisher, and J. B. Huber, β€œLattice-reductionaided broadcast precoding," IEEE Trans. Commun., vol. 52, pp. 20572060, Dec. 2004. [6] H.-S. Han, S.-H. Park, S. Lee, and I. Lee, β€œModulo loss reduction for vector perturbation systems," IEEE Trans. Commun., vol. 58, pp. 33923397, Dec. 2010. [7] H. Lee, S.-H. Park, and I. Lee, β€œOrthogonalized spatial multiplexing for closed-loop MIMO systems," IEEE Trans. Commun., vol. 55, pp. 10441052, May 2007. [8] A. Edelman, T. A. Arias, and S. T. Smith, β€œThe geometry of algorithms with orthonormality constraints," SIAM J. Matrix Anal. Appl., vol. 20, no. 2, pp. 303-353, 1998. [9] G. H. Golub and C. F. V. Loan, Matrix Computations, 3rd edition. Johns Hopkins University Press, 1996. [10] M. Kobayashi and G. Caire, β€œJoint beamforming and scheduling for a multi-antenna downlink with imperfect transmitter channel knowledge," IEEE J. Sel. Areas Commun., vol. 25, pp. 1468-1477, Sep. 2007. [11] A. M. Chan and I. Lee, β€œA new reduced-complexity sphere decoder for multiple antenna systems," in Proc. IEEE ICC ’02, pp. 460-464, Apr. 2002.