A Fast Online SVM Algorithm for Variable-Step CDMA Power Control

Yu Zhao, Hongsheng Xi, and Zilei Wang

Network Communication System and Control Laboratory (218), Department of Automation, University of Science and Technology of China, 230027 Hefei, Anhui, China
[email protected]

Abstract. This paper presents a fast online support vector machine (FOSVM) algorithm for variable-step CDMA power control. The FOSVM algorithm distinguishes newly added samples and constructs the current training sample set using the K.K.T. conditions, which reduces the size of the training set and therefore increases the training speed. We classify the received signals into two classes with the FOSVM algorithm; then, according to the output label of the FOSVM and the distance from the data point to the SIR decision boundary, a variable-step power control command is determined. Simulation results illustrate that the algorithm has a fast training speed and fewer support vectors, and that its convergence performance is better than that of the fixed-step power control algorithm.

1 Introduction

Power control is one of the most important techniques in CDMA cellular systems. Since all the signals in a CDMA system share the same bandwidth, power control is critical for maintaining an acceptable signal-to-interference ratio (SIR) for all users and hence maximizing the system capacity. Another critical problem in CDMA is the "near-far effect": due to propagation characteristics, the signals from mobiles closer to the base station can overpower the signals from mobiles located farther away. With power control, each mobile adjusts its own transmit power to ensure a desired QoS or SIR at the base station. Over the past decades, many power control techniques drawn from centralized control [1], distributed control [2], stochastic control [3], etc. have been proposed and applied to cellular radio systems. Most of these approaches rely on accurate estimates of the signal-to-interference ratio (SIR), bit error rate (BER), or frame error rate (FER). The support vector machine (SVM) is one of the most effective machine learning methods and is based on the principles of structural risk minimization and statistical learning theory [4]. Standard SVM algorithms are solved using quadratic programming methods; these algorithms are often time consuming and difficult to implement in real-time applications, especially when the size of the training data is large. Some modified SVM algorithms [5,6] have been proposed to address real-time application. Rohwer [7] applied


a least squares support vector machine (LS-SVM) algorithm presented in [6] to fixed-step CDMA power control, but the Lagrangian multipliers of the LS-SVM tend to be all nonzero, whereas for the standard SVM only the support vectors have nonzero multipliers. On the other hand, the convergence performance of fixed-step power control is not satisfactory when the initial received SIR is far from the desired SIR. This paper presents a fast online support vector machine (FOSVM) algorithm for variable-step CDMA power control. The algorithm classifies the sets of eigenvalues, computed from the sample covariance matrices of the received signal, into two SIR sets using the FOSVM; then, according to the output label of the FOSVM and the distance from the data point to the SIR decision boundary, a variable-step power control command is determined. For data points far from the decision boundary a large step is used; otherwise a small step is chosen. In the process of training the SIR decision boundary, we select training samples using the K.K.T. conditions, thereby reducing the size of the training data and effectively increasing the training speed. Simulation results illustrate that the FOSVM algorithm has a faster training speed and fewer support vectors without compromising the generalization capability of the SVM; furthermore, its convergence performance is better than that of the fixed-step power control algorithm.

2 Support Vector Machines

The SVM can be introduced by first considering a binary classification problem. Assume we have a finite set of labeled points in Euclidean n-space. The goal of the SVM is to find the hyperplane (w^T x) + b = 0 that maximizes the minimum distance between any point and the hyperplane [4]. Let (x_i, y_i), i = 1, ..., l, be a set of l labeled points, where x_i ∈ R^N and y_i ∈ {+1, −1}. Among all hyperplanes separating the data, there exists an optimal one yielding the maximum margin of separation between the classes. Since the margin equals 2/||w||, maximizing the margin is equivalent to minimizing the magnitude of the weights. Allowing for data that may not be perfectly linearly separable, the problem can be described as the 1-norm soft-margin SVM [8]

\min_{w,b,\xi} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} \xi_i
\text{s.t.} \quad y_i(\langle w, x_i \rangle + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \; i = 1,\dots,l, \quad C > 0        (1)

where ξ_i are slack variables in the quadratic programming problem; data points are penalized by the regularization parameter C if they are misclassified. For data sets that cannot be separated linearly, the inner products ⟨x_i, x_j⟩ are replaced with a kernel function K(x_i, x_j) satisfying Mercer's Theorem [4]. The quadratic programming problem can be transformed into a dual problem by introducing Lagrangian multipliers α_i.


\min_{\alpha} \; L_D(\alpha) = \frac{1}{2}\sum_{i,j=1}^{l} \alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{l} \alpha_i
\text{s.t.} \quad \sum_{i=1}^{l} \alpha_i y_i = 0, \quad 0 \le \alpha_i \le C, \; i = 1,\dots,l        (2)

The optimal solution must satisfy the following Karush-Kuhn-Tucker (K.K.T.) conditions:

\alpha_i \left[ y_i(\langle w, x_i \rangle + b) - 1 + \xi_i \right] = 0, \quad i = 1,\dots,l        (3)

The weight w obtained from (2) is then expressed as

w = \sum_{i=1}^{l} \alpha_i y_i x_i        (4)

The decision function can then be written as

f(x) = \mathrm{sgn}\left( \sum_{i \in SV} \alpha_i y_i K(x, x_i) + b \right)        (5)

From equation (4) we see that only the inputs associated with nonzero Lagrangian multipliers α_i contribute to the weight vector w. These inputs are called support vectors (SVs) and lie on the margin of the decision region. The support vectors are the critical vectors in determining the optimal margin classifier.
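As a concrete illustration of (1)-(5), the short sketch below trains a soft-margin SVM with a linear kernel on a hypothetical two-class data set and reads off the support vectors and decision values. It uses scikit-learn's SVC as the quadratic programming solver rather than the authors' own implementation; the linear kernel and the value C = 2 follow the simulation settings reported in Section 6, while the toy data are invented for illustration.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical two-class data; not the CDMA data used in the paper.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 0.6, (50, 2)), rng.normal(+1.0, 0.6, (50, 2))])
y = np.hstack([-np.ones(50), np.ones(50)])

# Soft-margin SVM of (1): linear kernel, C = 2 as in Section 6.
clf = SVC(kernel="linear", C=2.0).fit(X, y)

# Only the support vectors carry nonzero alpha_i and define (4)-(5).
print("number of SVs:", clf.support_vectors_.shape[0])
print("decision values f(x):", clf.decision_function([[0.0, 0.0], [2.0, 2.0]]))
print("predicted labels:", clf.predict([[0.0, 0.0], [2.0, 2.0]]))
```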

3 FOSVM

In the conventional SVM, training data are supplied and processed in batch by solving the quadratic programming problem. It is therefore time consuming to classify a large data set, and the batch approach cannot satisfy the demands of online applications such as CDMA power control, which requires periodic retraining as the training data are updated. With this in mind, an online training of support vector classifier (OSVC) algorithm was proposed in [5] to overcome this shortcoming of the conventional SVM. Although simulation results show that the training time of the OSVC algorithm is much less than that of the standard SVM and the SMO algorithm [9], it is still inefficient for some kinds of training data. Borrowing the idea of the OSVC algorithm, we present a modified fast online SVM (FOSVM) algorithm to improve the performance of the training phase. The FOSVM algorithm is summarized as follows. At the initial stage of online training, obtain the initial optimal hyperplane f_0(α_0, b_0) from the initial training data set S_0; the size of S_0 can be chosen by the user according to the actual application. SV_0 = {(Sx_i^0, Sy_i^0)} is the corresponding support vector set. When a new set of training data S_k is available, we


judge whether the training data (x_i, y_i) ∈ S_k can be classified correctly by the current optimal hyperplane f_{k-1}(α_{k-1}, b_{k-1}), where the corresponding decision function is

f_{k-1}(x) = \mathrm{sgn}\left( \sum_{i=1}^{|SV_{k-1}|} \alpha_i^{k-1} Sy_i^{k-1} K(x, Sx_i^{k-1}) + b_{k-1} \right)

If no training data are misclassified, set f_k = f_{k-1}, SV_k = SV_{k-1} and continue to the next step; otherwise the hyperplane must be retrained. In the process of training the new hyperplane, we first determine the training data set W_k. If the real-time requirement is strict and the size of S_k is large, fewer training data should be chosen to reduce the training time. Without loss of generality, we set W_k = S_k. We then retrain on the current training data set T_k repeatedly until all the training data (x_i, y_i) ∈ W_k satisfy the K.K.T. conditions; T_k consists of SV_k and the samples that violate the K.K.T. conditions with respect to the current hyperplane f_k. Compared with the OSVC algorithm, the advantage of the FOSVM algorithm is that it reduces the training time by reducing the number of retraining operations. The OSVC algorithm obtains new samples one by one in a sequence; whenever a new sample cannot be classified correctly by the current optimal hyperplane, it must train a new hyperplane. This training is so frequent that it increases the overall training time. In the FOSVM algorithm, by contrast, we obtain a whole new set of training data and construct the current training data set using the K.K.T. conditions at every step. With this procedure, the training time is decreased considerably without compromising the generalization capability of the SVM. The pseudo code of the FOSVM algorithm is given as Algorithm 1.

Algorithm 1. Fast online SVM algorithm

  Obtain the initial training data set S_0; set W_0 = S_0.
  Minimize (2) with W_0 to obtain an optimal hyperplane f_0(α_0, b_0) and SV_0.
  for k = 1, 2, 3, ... do
      Obtain the k-th training data set S_k.
      E_k = {(x_i, y_i) ∈ S_k | (x_i, y_i) is misclassified by f_{k-1}}
      if |E_k| > 0 then
          Determine the training data set W_k.
          V_k = {(x_i, y_i) ∈ W_k | y_i f_k(x_i) violates the K.K.T. conditions}
          while |V_k| > 0 do
              T_k = SV_k ∪ V_k        (the current training data set)
              Minimize (2) with T_k to obtain a new optimal hyperplane f_k and SV_k.
              V_k = {(x_i, y_i) ∈ W_k \ T_k | y_i f_k(x_i) violates the K.K.T. conditions}
          end while
      else
          f_k = f_{k-1}, SV_k = SV_{k-1}
      end if
  end for
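The following Python sketch mirrors Algorithm 1, again using scikit-learn's SVC as the inner solver for the "Minimize (2)" step. The helper names (violates_kkt, fosvm, batches) are hypothetical, and the K.K.T. test for an incoming sample (implicitly α_i = 0) is approximated by checking whether y_i f(x_i) ≥ 1. This is a minimal reading of the algorithm under these assumptions, not the authors' code.

```python
import numpy as np
from sklearn.svm import SVC

def violates_kkt(clf, X, y, tol=1e-3):
    # A sample outside the model (alpha_i = 0) satisfies the K.K.T. conditions of (2)
    # only if y_i f(x_i) >= 1; anything below 1 - tol is treated as a violator.
    return y * clf.decision_function(X) < 1.0 - tol

def fosvm(batches, C=2.0):
    # `batches` is an iterable of (X, y) arrays, each containing both classes;
    # the first batch plays the role of S_0.
    it = iter(batches)
    X0, y0 = next(it)
    clf = SVC(kernel="linear", C=C).fit(X0, y0)          # minimize (2) with W_0 = S_0
    sv_X, sv_y = X0[clf.support_], y0[clf.support_]      # SV_0
    for Xk, yk in it:                                    # k-th training set S_k
        if (yk == clf.predict(Xk)).all():
            continue                                     # E_k empty: f_k = f_{k-1}, SV_k = SV_{k-1}
        Wx, Wy = Xk, yk                                  # W_k = S_k, as in Section 3
        in_T = np.zeros(len(Wy), dtype=bool)
        V = violates_kkt(clf, Wx, Wy)
        while V.any():
            in_T |= V
            Tx = np.vstack([sv_X, Wx[V]])                # T_k = SV_k U V_k
            Ty = np.hstack([sv_y, Wy[V]])
            clf = SVC(kernel="linear", C=C).fit(Tx, Ty)  # retrain on T_k only
            sv_X, sv_y = Tx[clf.support_], Ty[clf.support_]
            V = violates_kkt(clf, Wx, Wy) & ~in_T        # violators in W_k \ T_k
    return clf
```

Fed with batches of eigenvalue feature vectors as in Section 5, the same loop would yield the SIR decision boundary used for power control.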

4 Convergence of FOSVM

In order to explain the convergence of the FOSVM, we rewrite the dual objective function (2) in matrix form as

L_D = \frac{1}{2}\alpha^T K \alpha - \langle c, \alpha \rangle        (6)

where c is an l×1 vector, α = (α_1, ..., α_l)^T and K = {K_{ij}}, K_{ij} = y_i y_j K(x_i, x_j). Next we prove the convergence of the FOSVM algorithm by comparing it with the decomposition algorithm (DA) proposed in [10]. The DA partitions the training set into two sets B and N; the set B is called the working set and N the correcting set. Suppose α, y, c and K from (6) are arranged accordingly as

\alpha = \begin{pmatrix} \alpha_B \\ \alpha_N \end{pmatrix}, \quad y = \begin{pmatrix} y_B \\ y_N \end{pmatrix}, \quad c = \begin{pmatrix} c_B \\ c_N \end{pmatrix}, \quad K = \begin{pmatrix} K_{BB} & K_{BN} \\ K_{NB} & K_{NN} \end{pmatrix}

Then the dual objective function (6) is rewritten in terms of the working and correcting sets as

\min \; \frac{1}{2}\left[ \alpha_B^T K_{BB} \alpha_B + \alpha_B^T K_{BN} \alpha_N + \alpha_N^T K_{NB} \alpha_B + \alpha_N^T K_{NN} \alpha_N \right] - \langle c_B, \alpha_B \rangle - \langle c_N, \alpha_N \rangle
\text{s.t.} \quad \langle y_B, \alpha_B \rangle + \langle y_N, \alpha_N \rangle = 0, \quad 0 \le \alpha_B, \alpha_N \le C

The main idea of the DA is that, instead of solving the large quadratic programming problem at once, small quadratic programming sub-problems are solved by exchanging elements between the sets B and N. Each sub-problem brings the solution closer to the optimal solution. The process of exchanging elements includes two steps, Build-down and Build-up, which are shown in Fig. 1. A detailed proof of the convergence of the DA is given in [11].

Fig. 1. Decomposition algorithm for SVC.    Fig. 2. Elements exchange for FOSVM.

Compared with the DA, the FOSVM keeps the support vector set SV_k in the working set B and moves the other elements of B to the correcting set N. Another difference between the FOSVM and the DA is that the FOSVM takes in a new set of elements at each step and puts the elements violating the K.K.T. conditions into the working set. Fig. 2 shows the state diagram of the FOSVM. Because the elements that are not SVs lie farther away from the hyperplane, they have no effect on the optimal solution, which means that the parameter α_N = 0. Therefore,


the optimal solution of the FOSVM is unchanged after removing only the elements that are not SVs. In order to show the convergence of the FOSVM, we have the following corollary.

Corollary 1. Moving elements {m} which are not SVs from B to N leaves the cost function unchanged, and the solution remains feasible in the sub-problem.

Proof. Let B' = B − {m}, N' = N ∪ {m}. Since {m} ∈ B − SV implies α_m = 0, and noticing that α_N = 0, we have

L_D(B', N') = \frac{1}{2}\left[ \alpha_{B'}^T K_{B'B'} \alpha_{B'} + 2\alpha_{B'}^T K_{B'N'} \alpha_{N'} + \alpha_{N'}^T K_{N'N'} \alpha_{N'} \right] - \langle c_{B'}, \alpha_{B'} \rangle - \langle c_{N'}, \alpha_{N'} \rangle
            = \frac{1}{2}\left[ \alpha_{B'}^T K_{B'B'} \alpha_{B'} + 2\alpha_{B'}^T K_{B'm} \alpha_m + \alpha_m^T K_{mm} \alpha_m \right] - \langle c_{B'}, \alpha_{B'} \rangle - \langle c_m, \alpha_m \rangle
            = \frac{1}{2}\alpha_B^T K_{BB} \alpha_B - \langle c_B, \alpha_B \rangle
            = L_D(B, N)        (7)

\langle y_{B'}, \alpha_{B'} \rangle + \langle y_{N'}, \alpha_{N'} \rangle = \langle y_{B'}, \alpha_{B'} \rangle + \langle y_m, \alpha_m \rangle = \langle y_B, \alpha_B \rangle = 0        (8)

From (7) and (8) we see that the objective function and the constraint conditions are both unchanged; therefore the sub-problem has the same solution under the proposed FOSVM algorithm, which modifies the build-down and build-up process of the DA.
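The corollary can also be checked numerically. The small sketch below builds a toy instance of the dual objective (6) with a linear kernel and with c taken as the all-ones vector (consistent with (2)), and confirms that dropping an element with α_m = 0 from the working set leaves L_D unchanged; the data and index choices are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))
y = rng.choice([-1.0, 1.0], size=6)
K = (y[:, None] * y[None, :]) * (X @ X.T)          # K_ij = y_i y_j K(x_i, x_j), linear kernel
c = np.ones(6)                                     # from (2): -sum_i alpha_i = -<c, alpha>

def dual_objective(alpha):
    return 0.5 * alpha @ K @ alpha - c @ alpha     # L_D of (6)

alpha = np.array([0.7, 0.0, 0.3, 0.0, 0.5, 0.0])   # non-SV elements have alpha_i = 0
B  = [0, 1, 2, 4]                                  # working set; element m = 1 has alpha_1 = 0
Bp = [0, 2, 4]                                     # B' = B - {m}

def restricted_objective(idx):
    a = np.zeros_like(alpha)
    a[idx] = alpha[idx]                            # alpha_N = 0 outside the working set
    return dual_objective(a)

print(restricted_objective(B), restricted_objective(Bp))   # identical values
```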

5 FOSVM for Variable-Step CDMA Power Control

5.1 CDMA Signal Model

The received vector at the output of the ith antenna array, detected at the adaptive array processor, can be written as [12]

x_i(t) = \sum_{j=1}^{M} \sum_{l=1}^{L} \sqrt{P_j G_{ji}} \, a_j(\theta_l) g_{ji}^{l} s_j(t - \tau_j) + n_i(t)        (9)

where s_j(t − τ_j) is the message signal transmitted from the jth user, τ_j is the corresponding time delay, n_i(t) is the thermal noise vector at the input of the antenna array at the ith receiver, and P_j is the power of the jth transmitter. a_j(θ_l) is the response of the jth user in the direction θ_l. The attenuation due to shadowing in the lth path is denoted by g_{ji}^l, and the link gain between transmitter j and receiver i is denoted by G_{ji}. The terms relative to the multiple paths are combined as

Z_{ji} = \sum_{l=1}^{L} a_j(\theta_l) g_{ji}^{l}        (10)


The vector Z_{ji} is defined as the spatial signature, or array response, of the ith antenna array to the jth source. In a spread spectrum system, the message signal is given by

s_i(t) = \sum_{n} b_i(n) c_i(t - nT)        (11)

where b_i(n) is the ith user's information bit stream and c_i(t) is the spreading sequence. The received signal, sampled at the output of the matched filter, is expressed as

y_i(n) = \int_{(n-1)T+\tau_i}^{nT+\tau_i} c_i(t - nT - \tau_i) \left( \sum_{j} \sqrt{p_j(t) G_{ji}} \sum_{m} b_j(m) c_j(t - mT - \tau_j) Z_{ji} + n_i(t) \right) dt        (12)
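To make the model concrete, here is a deliberately simplified, single-user, single-path sketch of the spreading in (11) and of a discrete matched-filter sample in the spirit of (12). Power, gain and delay are idealised (P = G = 1, τ = 0) and the spreading factor is invented; this is not the authors' simulator.

```python
import numpy as np

rng = np.random.default_rng(0)
SF = 16                                          # chips per bit (assumed spreading factor)
bits = rng.choice([-1.0, 1.0], size=32)          # b_i(n), information bits
code = rng.choice([-1.0, 1.0], size=SF)          # c_i(t), one period of the spreading code

s = np.concatenate([b * code for b in bits])     # s_i(t) = sum_n b_i(n) c_i(t - nT), eq. (11)
received = s + 0.3 * rng.normal(size=s.size)     # received chips with additive noise

# Correlate with the code over each bit interval: a discrete analogue of (12).
y = np.array([code @ received[n*SF:(n+1)*SF] for n in range(bits.size)]) / SF
print("recovered bits:", np.sign(y)[:8], "sent:", bits[:8])
```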

5.2 Variable Step CDMA Power Control

In a CDMA system, the signals sampled from the same mobile user at different times are independent; moreover, the power control algorithm has a high demand on convergence speed, so we can discard the previous signal samples and use only the current set of signal samples to train the current hyperplane. This means we can set W_k = S_k in the FOSVM algorithm. Following the fixed-step power control algorithm described in [7], we classify the set of eigenvalues obtained from the sample covariance matrices of the received signal into two classes: one represents an SIR greater than the desired SIR, the other an SIR below it. The variable-step power control command is determined according to the output label of the FOSVM and the distance from the data point to the SIR decision boundary; the step size changes adaptively with this distance. The application of the FOSVM to variable-step CDMA power control can be described as follows (a sketch of these steps is given after the list):

1. Obtain one set of signals sampled from equation (12); suppose the set is S_k and the number of signal samples in S_k is N.
2. Generate the sample covariance matrices from the signals in S_k; each covariance matrix is generated from M samples, M < N.
3. Calculate the eigenvalues of the sample covariance matrices, obtaining ⌊N/M⌋ eigenvalue vectors.
4. Use the FOSVM algorithm to classify the eigenvalue vectors into two SIR sets.
5. Generate the power control command according to the output label of the FOSVM and the distance from the data point to the SIR decision boundary.
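A sketch of steps 2-5 is given below. The eigenvalue feature construction follows the description above; the classifier clf is assumed to be an FOSVM/SVC model trained as in Section 3, with labels +1 for SIR below the desired value (power up) and -1 for SIR above it (power down), following the convention in Section 6. The step sizes (0.5 dB and 2 dB) and the distance threshold are purely illustrative, since the paper does not give numerical values for the distance-to-step mapping.

```python
import numpy as np

def eigenvalue_features(samples, M):
    # Steps 2-3: split the N array snapshots into blocks of M, form the sample covariance
    # matrix of each block and keep its sorted eigenvalues as one feature vector.
    feats = []
    for start in range(0, len(samples) - M + 1, M):
        block = samples[start:start + M]              # M snapshots (rows) of the antenna array
        cov = np.cov(block, rowvar=False)             # sample covariance matrix
        feats.append(np.sort(np.linalg.eigvalsh(cov))[::-1])
    return np.array(feats)                            # floor(N / M) eigenvalue vectors

def power_command(clf, feature_vec, small_step=0.5, large_step=2.0, margin=1.0):
    # Steps 4-5: the label gives the direction of the command; the distance |f(x)|
    # to the SIR decision boundary selects between a small and a large step (in dB).
    d = clf.decision_function(feature_vec[None, :])[0]
    direction = 1.0 if d >= 0 else -1.0               # assumes +1 = power up, -1 = power down
    step = large_step if abs(d) > margin else small_step
    return direction * step
```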

6 Simulation and Results

In this section, we compare the performance of various SVMs in terms of classification errors, the number of support vectors and training time. The convergence


performance of variable-step power control based on the FOSVM algorithm is also compared with that of fixed-step power control. The signals generated from equation (12) include random noise components and received SIRs between 0 dB and 15 dB. The attenuation of each signal on the different paths is randomly generated with a maximum attenuation of 50% of the dominant signal. The simulations include randomly generated link gains for the different paths, and the number of paths is 4. The direction of arrival (DoA) θ_l is generated randomly in the range 0 to 180 degrees, and the number of antenna array elements is 8. Simulation results show that the best performance is achieved with the linear kernel, with the regularization parameter C set to 2; simulations with the radial basis function (RBF) kernel produce error rates 30% to 50% higher than those based on the linear kernel. To evaluate the performance of the FOSVM, we compared the FOSVM algorithm with the OSVC and the conventional SVM on the same data. We obtained 10 sets of training data in sequence based on the signal model; each set includes 100 training samples and 400 test samples. The training time and the number of SVs are shown in Table 1 and Table 2. Table 1 shows that the training speed of

Table 1. The training time of FOSVM, OSVC and conventional SVM (unit: msec)

Training set        1    2    3    4    5    6    7    8    9   10
FOSVM             221   90    0   40   10   10   10    0   10   10
OSVC              221  321    0  234  118  345  187    0  114  110
Conventional SVM  221  445  324  415  287  345  223  218  356  372

Table 2. The number of SVs in FOSVM, OSVC and conventional SVM

Training set        1    2    3    4    5    6    7    8    9   10
FOSVM              34   35   36   32   32   33   31   29   33   32
OSVC               34   36   36   36   35   33   34   29   33   34
Conventional SVM   34   39   42   43   39   38   37   38   42   45

FOSVM is faster than that of the other two algorithms. Because the initial training set is identical for all three algorithms, the training time on the first training set is the same for all of them. When the new training data can be classified by the current hyperplane, the FOSVM and OSVC algorithms do not need to train a new hyperplane, so their training time is zero for training sets 3 and 8. From Table 2, we find that the number of SVs in the FOSVM is smaller than in the other two algorithms. Table 3 presents the percentage of mean classification error of the three algorithms for 5 different desired SIRs. Each simulation includes 100 training samples and 10 sets of test samples. It can be seen that the classification error percentages of these SVM algorithms are almost the same, which means their generalization capabilities are comparable.


Table 3. Percentage of mean classification error of three different SVM algorithms

SIR threshold   FOSVM   OSVC    Conventional SVM
11 dB           6.811   6.809   6.813
9 dB            7.606   7.608   7.618
7 dB            8.085   8.091   8.093
5 dB            7.513   7.513   7.515
3 dB            7.130   7.132   7.138

Fig. 3(a) shows one received signal with 15 dB SIR converging to the desired SIR (7 dB) within 100 iterations. There is a slight oscillation near 7 dB because, at points misclassified by the decision boundary, the output of the classifier is opposite to the true value and the power control system therefore sends the wrong command. Fig. 3(b) shows the power control command responding to the output of the FOSVM algorithm: a "−1" represents a received SIR greater than the desired SIR and therefore corresponds to a power-down command, and a "1" represents a received SIR lower than the desired SIR and therefore corresponds to a power-up command.

Fig. 3. FOSVM algorithm for variable-step CDMA power control: (a) convergence performance of the FOSVM variable-step power control algorithm; (b) power control command responding to the output of the FOSVM algorithm.

Fig. 4. FOSVM algorithm for fixed-step CDMA power control: (a) convergence performance of the FOSVM power control algorithm with step = 0.5; (b) convergence performance with step = 0.1.


Fig. 4 shows the convergence performance of fixed-step power control with the FOSVM algorithm. When the fixed step size is large, the transmitted signal oscillates sharply around 7 dB; when the step size is small, it takes much longer for the transmitted signal to converge to 7 dB.

7 Conclusion

In this paper, a FOSVM algorithm was proposed as a modification of the OSVC algorithm. The FOSVM algorithm speeds up the training phase by reducing the size of the training sample set using the K.K.T. conditions. Simulation results show that the FOSVM outperforms the OSVC and the conventional SVM in terms of the number of SVs and the training time while keeping comparable classification error. Based on the distance from the data point to the SIR decision boundary, we also presented a variable-step CDMA power control method built on the FOSVM algorithm; it offers better convergence performance than the fixed-step algorithm.

References

1. Zander, J.: Performance of Optimum Transmitter Power Control in Cellular Radio Systems. IEEE Transactions on Vehicular Technology, vol. 41, no. 1, 57-62, February 1992.
2. Zander, J.: Distributed Cochannel Interference Control in Cellular Radio Systems. IEEE Transactions on Vehicular Technology, vol. 41, no. 3, 305-311, August 1992.
3. Ulukus, S., Yates, R.D.: Stochastic Power Control for Cellular Radio Systems. IEEE Transactions on Communications, vol. 46, no. 6, 784-798, June 1998.
4. Vapnik, V.: Statistical Learning Theory. Wiley-Interscience, 1998.
5. Lau, K.W., Wu, Q.H.: Online training of support vector classifier. Pattern Recognition, 36 (2003), 1913-1920.
6. Suykens, J., Lukas, L., Vandewalle, J.: Least squares support vector machine classifiers. Neural Processing Letters, vol. 9, no. 3, 293-300, June 1999.
7. Rohwer, J.A., Abdallah, C.T., Christodoulou, C.G.: Least squares support vector machines for fixed-step and fixed-set CDMA power control. In: Proceedings of the 42nd IEEE Conference on Decision and Control, vol. 5, 5097-5102, December 2003.
8. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines. Cambridge University Press, 2000.
9. Platt, J.: Fast training of support vector machines using sequential minimal optimization. In: Scholkopf, B., Burges, C., Smola, A.J. (eds.): Advances in Kernel Methods - Support Vector Learning. MIT Press, Cambridge, 1999, 185-208.
10. Joachims, T.: Making large-scale support vector machine learning practical. In: Scholkopf, B., Burges, C., Smola, A.J. (eds.): Advances in Kernel Methods - Support Vector Learning. MIT Press, Cambridge, 1998.
11. Osuna, E., Freund, R., Girosi, F.: An Improved Training Algorithm for Support Vector Machines. In: IEEE Workshop on Neural Networks for Signal Processing, Amelia Island, 1997, 276-285.
12. Rashid-Farrokhi, F., Tassiulas, L., Liu, K.J.R.: Joint optimal power control and beamforming in wireless networks using antenna arrays. IEEE Transactions on Communications, vol. 46, no. 10, 1313-1324, October 1998.
