Speed Up SVM Algorithm for Massive Classification Tasks

Thanh-Nghi Do (1), Van-Hoa Nguyen (2), and François Poulet (3)

(1) College of Information Technology, Can Tho University, 1 Ly Tu Trong street, Can Tho city, Vietnam, [email protected]
(2) IRISA Symbiose, Campus de Beaulieu, 35042 Rennes Cedex, France, [email protected]
(3) IRISA Texmex, Campus de Beaulieu, 35042 Rennes Cedex, France, [email protected]

Abstract. We present a new parallel and incremental Support Vector Machine (SVM) algorithm for the classification of very large datasets on graphics processing units (GPUs). SVM and kernel-related methods have been shown to build accurate models, but the learning task usually requires solving a quadratic program, so that learning on large datasets demands large memory capacity and a long time. We extend the recent Least Squares SVM (LS-SVM) proposed by Suykens and Vandewalle to build an incremental and parallel algorithm. The new algorithm uses graphics processors to gain high performance at low cost. Numerical test results on datasets from the UCI and Delve repositories showed that our parallel incremental algorithm using GPUs is about 70 times faster than a CPU implementation and often significantly faster (over 1000 times) than state-of-the-art algorithms like LibSVM, SVM-perf and CB-SVM.

1 Introduction

Since Support Vector Machine (SVM) learning algorithms were first proposed by Vapnik [23], they have been shown to build accurate models with practical relevance for classification, regression and novelty detection. Successful applications of SVMs have been reported in fields as varied as facial recognition, text categorization and bioinformatics [12]. In particular, SVMs using the idea of kernel substitution have been shown to build high quality models, and they have become increasingly popular classification tools. In spite of these prominent properties, current SVM algorithms cannot easily deal with very large datasets. A standard SVM algorithm requires solving a quadratic or linear program, so its computational cost is at least O(m^2), where m is the number of training data points, and its memory requirement frequently makes it intractable. There is a need to scale up these learning algorithms to deal with massive datasets. Efficient heuristic methods to improve SVM learning time divide the original quadratic program into a series of small problems [2], [18]. Incremental learning methods [3], [6], [7], [11], [19], [21] improve memory performance for massive datasets by updating the solution over a growing training set without needing to load the entire dataset into memory at once. Parallel and distributed algorithms [7], [19] improve learning performance for large datasets by dividing the problem into components that are executed on large numbers of networked personal computers (PCs). Active learning algorithms [22] choose interesting data point subsets (active sets) to construct models, instead of using the whole dataset.

In this paper, we describe how to build an incremental and parallel LS-SVM algorithm for classifying very large datasets on GPUs, for example an Nvidia GeForce 8800 GTX graphics card. Most of our work is based on the LS-SVM classifiers proposed by Suykens and Vandewalle [20]. They replace the inequality constraints of the standard SVM optimization problem with equality constraints and a least squares error term, so the training task only requires solving a system of linear equations instead of a quadratic program. This makes the training time very short. We have extended LS-SVM in two ways:
1. We developed an incremental algorithm for classifying massive datasets (at least billions of data points).
2. Using a GPU (a massively parallel computing architecture), we developed a parallel version of the incremental LS-SVM algorithm to gain high performance at low cost.

Performance in terms of learning time and accuracy is evaluated on datasets from the UCI Machine Learning repository [1] and Delve [5], including the Forest cover type, KDD Cup 1999, Adult and Ringnorm datasets. The results showed that our algorithm using a GPU is about 65 times faster than a CPU implementation. An example of the effectiveness of our new algorithm is its performance on the KDD Cup 1999 dataset: it performed a binary classification of 5 million data points in a 41-dimensional input space within 0.7 second on the Nvidia GeForce 8800 GTX graphics card (compared with 55.67 seconds on a CPU, Intel Core 2, 2.6 GHz, 2 GB RAM). We also compared the performance of our algorithm with the highly efficient standard SVM algorithm LibSVM [4] and with two recent algorithms, SVM-perf [13] and CB-SVM [25].

The remainder of this paper is organized as follows. Section 2 introduces LS-SVM classifiers. Section 3 describes how to build an incremental learning algorithm from the LS-SVM algorithm for classifying large datasets on CPUs. Section 4 presents a parallel version of the incremental LS-SVM using GPUs. We present numerical test results in section 5 before the conclusion and future work.

Some notations are used in this paper. All vectors are column vectors unless transposed to a row vector by a T superscript. The inner (dot) product of two vectors x, y is denoted by x.y. The 2-norm of a vector x is denoted by ||x||. The matrix A[m x n] contains m data points in the n-dimensional real space R^n.

2 Least Squares Support Vector Machine

Consider the linear binary classification task depicted in figure 1, with m data points xi (i = 1..m) in the n-dimensional input space R^n. The data are represented by the [m x n] matrix A, with corresponding labels yi = ±1 represented by the [m x m] diagonal matrix D of ±1 (where D[i,i] = +1 if xi is in class +1 and D[i,i] = -1 if xi is in class -1). For this problem, an SVM algorithm tries to find the best separating plane, i.e. the one farthest from both class +1 and class -1. Therefore, SVMs simultaneously maximize the distance (margin) between the two parallel supporting planes of the classes and minimize the errors.

Fig. 1. Linear separation of the data points into two classes. The figure shows the optimal plane w.x - b = 0, the two supporting planes w.x - b = -1 and w.x - b = +1, misclassified points zi and zj, and the margin 2/||w||.

For the linear binary classification task, classical SVMs pursue these goals with the quadratic program (1):

  min Ψ(w, b, z) = (1/2)||w||^2 + c e^T z
  s.t.: D(Aw - eb) + z ≥ e                                               (1)

where the slack variable z ≥ 0 and the constant c > 0 is used to tune the trade-off between errors and margin size. The plane (w, b) is obtained from the solution of the quadratic program (1). The classification function for a new data point x based on this plane is then: predict(x) = sign(w.x - b). SVMs can use other classification functions, for example a polynomial function of degree d, an RBF (Radial Basis Function) or a sigmoid function. To change from a linear to a non-linear classifier, one only needs to substitute a kernel evaluation for the original dot product in (1). Unfortunately, the computational cost of the SVM solution of (1) is at least O(m^2), where m is the number of training data points, making classical SVMs intractable for large datasets.

The LS-SVM proposed by Suykens and Vandewalle uses equality constraints instead of the inequality constraints of (1), together with a least squares 2-norm error in the objective function Ψ:
- the errors are minimized by (c/2)||z||^2,
- with the equality constraints D(Aw - eb) + z = e.

Substituting z from the constraints, in terms of w and b, into the objective function Ψ of the quadratic program (1), we get the unconstrained problem (2):

  min Ψ(w, b) = (1/2)||w||^2 + (c/2)||e - D(Aw - eb)||^2                 (2)

At the optimal solution of (2), the gradients with respect to w and b are 0. This yields the following equations in the (n+1) variables (w1, w2, ..., wn, b):

  Ψ'(w) = cA^T(Aw - eb - De) + w = 0                                     (3)

  Ψ'(b) = ce^T(-Aw + eb + De) = 0                                        (4)

(3) and (4) can be rewritten as the linear system (5):

  [w1 w2 ... wn b]^T = ((1/c)Io + E^T E)^-1 E^T De                       (5)

where E = [A  -e] and Io denotes the (n+1)x(n+1) diagonal matrix whose (n+1)-th diagonal entry is zero and whose other diagonal entries are 1. The LS-SVM formulation (5) thus only requires the solution of a linear system in the (n+1) variables (w1, w2, ..., wn, b) instead of the quadratic program (1). If the dimension of the input space is small enough, the LS-SVM algorithm can classify even millions of data points within minutes on a single PC. Numerical test results have shown that this algorithm gives test accuracy comparable to standard SVMs like LibSVM, while being much faster. An example of its effectiveness is given in [7]: the linear classification of one million data points in a 20-dimensional input space into two classes took 13 seconds on a PC (3 GHz Pentium IV, 512 MB RAM).
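As an illustration of formulation (5), the following is a minimal batch LS-SVM sketch. It is our own illustrative code, not the authors' implementation (which relies on Lapack++): it assumes dense double-precision arrays and uses a small hand-written Gaussian elimination solver so that the example stays self-contained.

    // Minimal batch LS-SVM sketch for equation (5).
    #include <cmath>
    #include <vector>

    // Solve M x = rhs for a dense (d x d) row-major system by Gaussian
    // elimination with partial pivoting. M and rhs are local copies.
    std::vector<double> solveDense(std::vector<double> M,
                                   std::vector<double> rhs, int d) {
        for (int k = 0; k < d; ++k) {
            int p = k;                                   // pivot row
            for (int i = k + 1; i < d; ++i)
                if (std::fabs(M[i * d + k]) > std::fabs(M[p * d + k])) p = i;
            for (int j = 0; j < d; ++j) std::swap(M[k * d + j], M[p * d + j]);
            std::swap(rhs[k], rhs[p]);
            for (int i = k + 1; i < d; ++i) {            // eliminate column k
                double f = M[i * d + k] / M[k * d + k];
                for (int j = k; j < d; ++j) M[i * d + j] -= f * M[k * d + j];
                rhs[i] -= f * rhs[k];
            }
        }
        std::vector<double> x(d);
        for (int i = d - 1; i >= 0; --i) {               // back substitution
            double s = rhs[i];
            for (int j = i + 1; j < d; ++j) s -= M[i * d + j] * x[j];
            x[i] = s / M[i * d + i];
        }
        return x;
    }

    // Compute (w1..wn, b) from A (m x n, row-major) and labels y = +/-1.
    std::vector<double> lsSvmTrain(const std::vector<double>& A,
                                   const std::vector<double>& y,
                                   int m, int n, double c) {
        int d = n + 1;
        std::vector<double> ETE(d * d, 0.0), ETDe(d, 0.0), Ei(d);
        for (int i = 0; i < m; ++i) {
            for (int j = 0; j < n; ++j) Ei[j] = A[i * n + j];
            Ei[n] = -1.0;                                // row of E = [A  -e]
            for (int j = 0; j < d; ++j) {
                for (int k = 0; k < d; ++k) ETE[j * d + k] += Ei[j] * Ei[k];
                ETDe[j] += Ei[j] * y[i];                 // E^T De, since De = y
            }
        }
        for (int j = 0; j < n; ++j) ETE[j * d + j] += 1.0 / c;   // add (1/c)Io
        return solveDense(ETE, ETDe, d);                 // [w1 ... wn b]
    }

A new point x is then classified with predict(x) = sign(w.x - b) using the returned vector.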

3 Incremental Algorithm of LS-SVM

Although the LS-SVM algorithm is fast and efficient for classifying large datasets, it needs to load the whole dataset into memory. With a very large dataset, e.g. one billion data points in a 20-dimensional input space, LS-SVM would require 80 GB of RAM, and any machine learning algorithm has difficulty dealing with such a challenge. Our investigation aims at scaling up the LS-SVM algorithm to mine very large datasets on standard PCs (Intel CPUs). Incremental learning algorithms are a convenient way to handle very large datasets because they avoid loading the whole dataset into main memory: only subsets of the data are considered at any one time and the solution is updated as the training set grows. Suppose we have a very large dataset decomposed into small blocks of rows Ai, Di. The incremental LS-SVM algorithm can then simply compute the solution of the linear system (5) block by block. To begin with, let us consider a large dataset split into two blocks of rows A1, D1 and A2, D2, as in (6):

  A = [A1; A2],  D = [D1 0; 0 D2],  e = [e1; e2]

  E = [A  -e] = [A1  -e1; A2  -e2] = [E1; E2]                            (6)

We illustrate how to incrementally compute the solution of the linear equation system (5) as follows:

  E^T De = [E1; E2]^T [D1 0; 0 D2] [e1; e2] = E1^T D1 e1 + E2^T D2 e2    (7)

  E^T E = [E1; E2]^T [E1; E2] = E1^T E1 + E2^T E2                        (8)

  [w1 w2 ... wn b]^T = ((1/c)Io + Σ_{i=1..2} Ei^T Ei)^-1 Σ_{i=1..2} Ei^T Di ei    (9)

From formulas (7), (8) and (9), we can deduce formula (10) for the incremental LS-SVM algorithm applied to a very large dataset split into k small blocks of rows A1, D1, ..., Ak, Dk:

  [w1 w2 ... wn b]^T = ((1/c)Io + Σ_{i=1..k} Ei^T Ei)^-1 Σ_{i=1..k} Ei^T Di ei    (10)

Consequently, the incremental LS-SVM algorithm presented in table 1 can handle massive datasets on a PC. The accuracy of the incremental algorithm is exactly the same as that of the original one. If the dimension of the input space is small enough, the incremental LS-SVM algorithm is able to classify even billions of data points on a simple PC: it only needs to store a small (n+1)x(n+1) matrix and two (n+1)x1 vectors in memory between two successive steps. The numerical tests have shown that the incremental LS-SVM algorithm can classify one billion data points in a 20-dimensional input space into two classes in 21 minutes and 10 seconds (excluding the time needed to read data from disk) on a PC (Pentium IV 3 GHz, 512 MB RAM).

Table 1. Incremental LS-SVM algorithm

Input:
  - training dataset represented by k blocks: A1, D1, ..., Ak, Dk
  - constant c to tune errors and margin size
Training:
  init: E^T E = 0, d = E^T De = 0
  for i = 1 to k do
    load Ai and Di
    compute E^T E = E^T E + Ei^T Ei and d = d + di (with di = Ei^T Di ei)
  end for
  solve the linear system (10)
  get the optimal plane (w, b) = (w1, w2, ..., wn, b)
Classification:
  classify a new data point x according to f(x) = sign(w.x - b)
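As an illustration, a minimal CPU sketch of this incremental loop is given below. It is our own code, not the authors' implementation: it assumes each block arrives as a dense row-major array with labels +/-1, keeps only the accumulated E^T E and E^T De between steps, and reuses the solveDense helper sketched in section 2; reading blocks from disk is omitted.

    // Incremental LS-SVM (Table 1 / equation (10)): only the (n+1)x(n+1)
    // matrix E^T E and the (n+1)-vector E^T De are kept between blocks.
    #include <vector>

    // Dense solver from the earlier sketch (Gaussian elimination).
    std::vector<double> solveDense(std::vector<double> M,
                                   std::vector<double> rhs, int d);

    struct LsSvmState {
        int n;                          // input dimension
        double c;                       // error/margin trade-off constant
        std::vector<double> ETE;        // accumulated E^T E, (n+1) x (n+1)
        std::vector<double> ETDe;       // accumulated E^T De, (n+1)
        LsSvmState(int n_, double c_)
            : n(n_), c(c_),
              ETE((n_ + 1) * (n_ + 1), 0.0), ETDe(n_ + 1, 0.0) {}

        // Add one block Ai (mi x n, row-major) with labels yi = +/-1.
        void addBlock(const std::vector<double>& Ai,
                      const std::vector<double>& yi, int mi) {
            int d = n + 1;
            std::vector<double> row(d);
            for (int i = 0; i < mi; ++i) {
                for (int j = 0; j < n; ++j) row[j] = Ai[i * n + j];
                row[n] = -1.0;                           // row of Ei = [Ai  -e]
                for (int j = 0; j < d; ++j) {
                    for (int k = 0; k < d; ++k)
                        ETE[j * d + k] += row[j] * row[k];
                    ETDe[j] += row[j] * yi[i];
                }
            }
        }

        // Solve ((1/c)Io + E^T E) [w b]^T = E^T De, i.e. equation (10).
        std::vector<double> solve() const {
            int d = n + 1;
            std::vector<double> M = ETE;
            for (int j = 0; j < n; ++j) M[j * d + j] += 1.0 / c;
            return solveDense(M, ETDe, d);               // [w1 ... wn b]
        }
    };

Training then amounts to calling addBlock once per block A1, D1, ..., Ak, Dk and calling solve() at the end (or at any intermediate point, since a model can be computed from whatever has been accumulated so far).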


4 Parallel Incremental LS-SVM Algorithm Using GPUs

The incremental LS-SVM algorithm described above is able to deal with very large datasets on a PC, but it only runs on a single processor. We have extended it to build a parallel version using a GPU (graphics processing unit). During the last decade, GPUs, described in [24], have developed into highly specialized processors for the acceleration of raster graphics. The GPU has several advantages over CPU architectures for highly parallel, compute-intensive workloads, including higher memory bandwidth, significantly higher floating-point throughput, and thousands of hardware thread contexts with hundreds of parallel compute pipelines executing programs in a single instruction multiple data (SIMD) mode. The GPU can be an alternative to CPU clusters in high performance computing environments. Recent GPUs have added programmability and have been used for general-purpose (non-graphics) computation, including physics simulation, signal processing, computational geometry, database management, computational biology and data mining.

NVIDIA has introduced a new GPU, the GeForce 8800 GTX, together with a C-language programming API called CUDA [16] (Compute Unified Device Architecture). The NVIDIA GeForce 8800 GTX is made of 16 multiprocessors. Each multiprocessor has 8 streaming processors, for a total of 128. Each group of 8 streaming processors shares one L1 data cache. A streaming processor contains a scalar ALU (Arithmetic Logic Unit) and can perform floating-point operations. Instructions are executed in SIMD mode. The NVIDIA GeForce 8800 GTX has 768 MB of graphics memory, with a peak observed performance of 330 GFLOPS and 86 GB/s peak memory bandwidth. This specialized architecture can sufficiently meet the needs of many massively data-parallel computations.

In CUDA, the GPU is a device that can execute multiple concurrent threads. The CUDA software package includes a hardware driver, an API, its runtime and higher-level mathematical libraries of common usage, among them an implementation of the Basic Linear Algebra Subprograms (CUBLAS [17]). The CUBLAS library gives access to the computational resources of NVIDIA GPUs. The basic model by which applications use CUBLAS is to create matrix and vector objects in GPU memory space, fill them with data, call a sequence of CUBLAS functions and, finally, upload the results from GPU memory space back to the host. The data transfer rate between GPU and CPU memory is about 2 GB/s.

We therefore developed a parallel version of the incremental LS-SVM algorithm based on GPUs to gain high performance at low cost. The parallel incremental implementation in table 2 uses the CUBLAS library to perform the matrix computations on the massively parallel computing architecture of the GPU. It can be used on any CUDA/CUBLAS compatible GPU (today around 200 different models, all from NVIDIA). Note that in CUDA/CUBLAS the GPU executes multiple concurrent threads, so the parallel computations are done in an implicit way. First, we split the large dataset A, D into small blocks of rows Ai, Di. For each incremental step, a data block Ai, Di is loaded into CPU memory, a data transfer task copies Ai, Di from CPU to GPU memory, and the GPU then computes the sums of Ei^T Ei and di = Ei^T Di ei in a parallel way. Finally, the accumulated E^T E and E^T De are copied from GPU memory back to CPU memory to solve the linear system (10). The accuracy of the new algorithm is exactly the same as that of the original one.

Table 2. Parallel incremental LS-SVM algorithm using a GPU

Input:
  - training dataset represented by k blocks: A1, D1, ..., Ak, Dk
  - constant c to tune errors and margin size
Training:
  init: E^T E = 0, d = E^T De = 0 in GPU memory
  for i = 1 to k do
    load Ai and Di into CPU memory
    copy Ai and Di to GPU memory
    use CUBLAS to perform the matrix computations on the GPU:
      E^T E = E^T E + Ei^T Ei and d = d + di (with di = Ei^T Di ei)
  end for
  copy E^T E and E^T De back to CPU memory
  solve the linear system (10) on the CPU
  get the optimal plane (w, b) = (w1, w2, ..., wn, b)
Classification:
  classify a new data point x according to f(x) = sign(w.x - b)
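To make the GPU step above concrete, here is a rough sketch of one block accumulation written against the legacy CUBLAS API shipped with CUDA 1.x. It is an illustrative assumption of how such a step could look, not the authors' actual code: the function and variable names are ours, the caller is assumed to have called cublasInit() and to have allocated and zeroed the device accumulators, and error checking is omitted.

    // One incremental step on the GPU with legacy CUBLAS (CUDA 1.x API).
    // hEi is the (mi x d) block Ei = [Ai  -e] in column-major order, as
    // CUBLAS expects; hDiei holds the labels yi = Di*ei. The accumulators
    // dEtE (d x d) and dEtDe (d) stay resident in GPU memory across blocks.
    #include <cublas.h>

    void accumulateBlockOnGpu(const float *hEi, const float *hDiei,
                              int mi, int d,      /* d = n + 1 */
                              float *dEtE, float *dEtDe)
    {
        float *dEi, *dDiei;
        cublasAlloc(mi * d, sizeof(float), (void **)&dEi);
        cublasAlloc(mi, sizeof(float), (void **)&dDiei);

        /* copy the block and its labels from CPU to GPU memory */
        cublasSetMatrix(mi, d, sizeof(float), hEi, mi, dEi, mi);
        cublasSetVector(mi, sizeof(float), hDiei, 1, dDiei, 1);

        /* E^T E += Ei^T * Ei : (d x d) = (d x mi) * (mi x d) */
        cublasSgemm('T', 'N', d, d, mi, 1.0f, dEi, mi, dEi, mi, 1.0f, dEtE, d);

        /* E^T De += Ei^T * (Di ei) : (d x 1) = (d x mi) * (mi x 1) */
        cublasSgemv('T', mi, d, 1.0f, dEi, mi, dDiei, 1, 1.0f, dEtDe, 1);

        cublasFree(dEi);
        cublasFree(dDiei);
    }

After the last block, dEtE and dEtDe are copied back to the host (for example with cublasGetMatrix and cublasGetVector) and the small (n+1)x(n+1) system (10) is solved on the CPU, exactly as in the sequential version.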

5 Results

We prepared an experiment setup using a PC with an Intel Core 2, 2.6 GHz, 2 GB RAM and an Nvidia GeForce 8800 GTX graphics card with NVIDIA driver version 6.14.11.6201 and CUDA 1.1, running Linux Fedora Core 6. We implemented two versions (GPU and CPU code) of the parallel and incremental LS-SVM algorithm in C/C++ using NVIDIA's CUDA and CUBLAS APIs and the high performance linear algebra package Lapack++ [9]. The GPU implementation results are compared with the CPU results under Linux Fedora Core 6. We only evaluated the computational time, without the time needed to read data from disk; for the GPU implementation, the time for GPU/CPU data transfer is taken into account (so the pure computation time is about 10 to 50% lower than the times listed).

We focus on numerical tests with large datasets from the UCI repository, including the Forest cover type, KDD Cup 1999 and Adult datasets (cf. table 3). We created two other massive datasets by using the RingNorm generator program with 20 dimensions, 2 classes and 1 and 10 million rows. Each class is drawn from a multivariate normal distribution: class 1 has mean zero and covariance 4 times the identity, while class 2 (considered as -1) has unit covariance and mean 2/sqrt(20) in every dimension.

Table 3. Dataset description

Dataset             # dimensions   Training set   Test set
Adult               110            32,561         16,281
Forest Cover Types  20             495,141        45,141
KDD Cup 1999        41             4,868,429      311,029
Ringnorm-1M         20             1,000,000      100,000
Ringnorm-10M        20             10,000,000     1,000,000

First, we split the datasets into small blocks of rows. For the CPU implementation, we varied the block size for each incremental step from one thousand data points up to the whole dataset in main memory to compare the time needed to perform the classification task; the best results are obtained for small sizes that use only the main memory (and not the secondary memory). We therefore finally split the datasets into small blocks of fifty thousand data points to reach good performance on our experiment setup. The classification results obtained by the GPU and CPU implementations of the incremental LS-SVM algorithm are presented in table 4. The GPU version is about 70 times faster than the CPU implementation.

Table 4. Classification results

Dataset             GPU time (s)   CPU time (s)   Ratio (CPU/GPU)   Accuracy (%)
Adult               0.03           2.00           66.67             85.08
Forest Cover Types  0.10           10.00          100               76.41
KDD Cup 1999        0.70           55.67          79.53             91.96
Ringnorm-1M         0.07           3.33           47.57             75.07
Ringnorm-10M        0.61           31.67          51.92             76.68
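For reference, the two Ringnorm datasets described above could be regenerated with a sketch like the following. This is our own re-implementation of the description given in the text, not the original RingNorm program; the 50/50 class balance, the fixed seed and the plain-text output format are assumptions.

    // RingNorm-style generator: class +1 ~ N(0, 4I), class -1 ~ N(mu, I)
    // with mu = 2/sqrt(n) in every dimension (n = 20 here).
    #include <cmath>
    #include <cstdio>
    #include <random>

    int main() {
        const int n = 20;                        // input dimension
        const long rows = 1000000;               // e.g. Ringnorm-1M
        std::mt19937_64 gen(42);                 // fixed seed (assumption)
        std::normal_distribution<double> std_normal(0.0, 1.0);
        std::bernoulli_distribution pick_positive(0.5);
        const double mu = 2.0 / std::sqrt((double)n);

        for (long i = 0; i < rows; ++i) {
            int label = pick_positive(gen) ? +1 : -1;
            std::printf("%d", label);
            for (int j = 0; j < n; ++j) {
                double x = (label == +1)
                    ? 2.0 * std_normal(gen)      // mean 0, variance 4
                    : mu + std_normal(gen);      // mean mu, variance 1
                std::printf(" %f", x);
            }
            std::printf("\n");
        }
        return 0;
    }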

Fig. 2. GPU vs. CPU execution times on the Adult, Forest Cover Types, KDD Cup 1999, Ringnorm-1M and Ringnorm-10M datasets


For the Forest cover type dataset, the standard LibSVM algorithm ran for 21 days without producing any result. Recently published results show that the SVM-perf algorithm performed this classification in 171 seconds on a 3.6 GHz Intel Xeon processor with 2 GB RAM. This means that our GPU implementation of incremental LS-SVM is probably about 1500 times faster than SVM-perf. The KDD Cup 1999 dataset consists of network data indicating either normal connections (negative class) or attacks (positive class). LibSVM ran out of memory on this dataset. CB-SVM classified it with over 90% accuracy in 4750 seconds on a Pentium 800 MHz with 1 GB RAM, while our algorithm achieved over 91% accuracy in only 0.7 second; it appears to be about 6000 times faster than CB-SVM. The numerical test results show the effectiveness of the new algorithm for dealing with very large datasets on GPUs.

Furthermore, our new algorithm is not only incremental but also decremental, and so it is particularly suitable for mining data streams. We add blocks of new data and remove blocks of old data, and the model is then updated from the previous step (a minimal sketch of this decremental update is given after the list below). It is thus very easy to take any temporal window on the data stream and compute the corresponding model. Domingos and Hulten [8] have listed the criteria a data mining system needs to meet in order to mine high-volume, open-ended data streams:

- it must require small constant time per record: our algorithm has a linear complexity in the number of data points;
- it must use only a fixed amount of main memory, irrespective of the total number of records it has seen: the amount of memory used is determined by the block size and is fixed;
- it must be able to build a model using at most one scan of the data: we read and treat each data point only once;
- it must make a usable model available at any point in time: we can compute the model at any time from the data points already read;
- ideally, it should produce a model that is equivalent (or nearly identical) to the one that would be obtained by the corresponding ordinary database mining algorithm, operating without the above constraints: the results obtained are exactly the same as those of the original sequential algorithm;
- when the data-generating phenomenon is changing over time (i.e., when concept drift is present), the model at any time should be up-to-date, but also include all information from the past that has not become outdated: our algorithm is incremental and decremental and can follow concept drift very easily and efficiently.

Our algorithm therefore meets all the criteria needed to deal efficiently with very large datasets.
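The decremental step mentioned above only has to subtract a block's contribution from the accumulated sums before solving (10) again. A minimal sketch, reusing the hypothetical LsSvmState structure from the sketch in section 3, could be:

    // Decremental update for a sliding temporal window on a data stream:
    // removing an old block subtracts its contribution from E^T E and E^T De.
    void removeBlock(LsSvmState& s, const std::vector<double>& Ai,
                     const std::vector<double>& yi, int mi) {
        int d = s.n + 1;
        std::vector<double> row(d);
        for (int i = 0; i < mi; ++i) {
            for (int j = 0; j < s.n; ++j) row[j] = Ai[i * s.n + j];
            row[s.n] = -1.0;                         // row of Ei = [Ai  -e]
            for (int j = 0; j < d; ++j) {
                for (int k = 0; k < d; ++k)
                    s.ETE[j * d + k] -= row[j] * row[k];
                s.ETDe[j] -= row[j] * yi[i];
            }
        }
    }

Sliding the window then means calling addBlock on the newest block, removeBlock on the oldest one, and solve() to obtain the model for the current window.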

6 Conclusion and Future Work

We have presented a new parallel incremental LS-SVM algorithm able to deal with very large datasets in classification tasks on GPUs. We have extended the recent LS-SVM algorithm proposed by Suykens and Vandewalle in two ways. First, we developed an incremental algorithm for classifying massive datasets. This algorithm avoids loading the whole dataset into main memory: only subsets of the data are considered at any one time and the solution is updated as the training set grows. Second, we developed a parallel version of the incremental LS-SVM algorithm based on GPUs to get high performance at low cost. We evaluated the performance in terms of learning time on very large datasets from the UCI repository and Delve. The results showed that our algorithm using a GPU is about 70 times faster than a CPU implementation. We also compared the performance of our algorithm with the efficient standard SVM algorithm LibSVM and with two recent algorithms, SVM-perf and CB-SVM. Our GPU implementation of incremental LS-SVM is probably more than 1000 times faster than LibSVM, SVM-perf and CB-SVM.

We also applied these ideas, in the same way, to other efficient SVM algorithms proposed by Mangasarian and his colleagues, namely Newton SVM (NSVM [14]) and Lagrangian SVM (LSVM [15]), because they have the same properties as LS-SVM. The new implementations based on these algorithms are interesting and useful for classification on very large datasets.

We have only used one GPU (versus one CPU) in our experiments and have already shown that the resulting implementation is more than 1000 times faster than standard SVM algorithms like LibSVM, SVM-perf and CB-SVM. The GPU parallelism is exploited implicitly by the library, so it can probably still be improved. A first and obvious improvement is to use a set of GPUs. We have already shown [19] that the computation time is divided by the number of computers (or CPUs/GPUs) used, if we do not take into account the time needed for the data transfer; if we use for example 10 GPUs, the computation time is divided by 10,000 compared to usual algorithms. This means that a classification task previously needing one year could be performed in only one hour (using 10 GPUs). This is a significant improvement of machine learning algorithms for massive data mining tasks and may pave the way for new challenging applications of data mining. Another forthcoming improvement will be to extend our methods to build other SVM algorithms that can deal with non-linear classification tasks.

References

1. Asuncion, A., Newman, D.J.: UCI Repository of Machine Learning Databases, http://archive.ics.uci.edu/ml/
2. Boser, B., Guyon, I., Vapnik, V.: A Training Algorithm for Optimal Margin Classifiers. In: Proc. of the 5th ACM Annual Workshop on Computational Learning Theory, Pittsburgh, Pennsylvania, pp. 144–152 (1992)
3. Cauwenberghs, G., Poggio, T.: Incremental and Decremental Support Vector Machine Learning. In: Advances in Neural Information Processing Systems, vol. 13, pp. 409–415. MIT Press, Cambridge (2001)
4. Chang, C.-C., Lin, C.-J.: LIBSVM – A Library for Support Vector Machines (2003), http://www.csie.ntu.edu.tw/~cjlin/libsvm/
5. Delve: Data for evaluating learning in valid experiments (1996), http://www.cs.toronto.edu/~delve
6. Do, T.-N., Fekete, J.-D.: Large Scale Classification with Support Vector Machine Algorithms. In: Proc. of ICMLA 2007, 6th International Conference on Machine Learning and Applications, pp. 7–12. IEEE Press, Ohio (2007)
7. Do, T.-N., Poulet, F.: Classifying one Billion Data with a New Distributed SVM Algorithm. In: Proc. of RIVF 2006, 4th IEEE International Conference on Computer Science, Research, Innovation and Vision for the Future, Ho Chi Minh, Vietnam, pp. 59–66 (2006)
8. Domingos, P., Hulten, G.: A General Framework for Mining Massive Data Streams. Journal of Computational and Graphical Statistics 12(4), 945–949 (2003)
9. Dongarra, J., Pozo, R., Walker, D.: LAPACK++: A Design Overview of Object-Oriented Extensions for High Performance Linear Algebra. In: Proc. of Supercomputing 1993, pp. 162–171. IEEE Press, Los Alamitos (1993)
10. Fayyad, U., Piatetsky-Shapiro, G., Uthurusamy, R.: Summary from the KDD-03 Panel – Data Mining: The Next 10 Years. SIGKDD Explorations 5(2), 191–196 (2004)
11. Fung, G., Mangasarian, O.: Incremental Support Vector Machine Classification. In: Proc. of the 2nd SIAM Int. Conf. on Data Mining, SDM 2002, Arlington, Virginia, USA (2002)
12. Guyon, I.: Web Page on SVM Applications (1999), http://www.clopinet.com/isabelle/Projects/SVM/app-list.html
13. Joachims, T.: Training Linear SVMs in Linear Time. In: Proc. of the ACM SIGKDD Int. Conf. on KDD, pp. 217–226 (2006)
14. Mangasarian, O.: A Finite Newton Method for Classification Problems. Data Mining Institute Technical Report 01-11, Computer Sciences Department, University of Wisconsin (2001)
15. Mangasarian, O., Musicant, D.: Lagrangian Support Vector Machines. Journal of Machine Learning Research 1, 161–177 (2001)
16. NVIDIA CUDA: CUDA Programming Guide 1.1 (2007)
17. NVIDIA CUDA: CUDA CUBLAS Library 1.1 (2007)
18. Platt, J.: Fast Training of Support Vector Machines Using Sequential Minimal Optimization. In: Schoelkopf, B., Burges, C., Smola, A. (eds.) Advances in Kernel Methods – Support Vector Learning, pp. 185–208 (1999)
19. Poulet, F., Do, T.-N.: Mining Very Large Datasets with Support Vector Machine Algorithms. In: Camp, O., Filipe, J., Hammoudi, S. (eds.) Enterprise Information Systems V, pp. 177–184. Kluwer Academic Publishers, Dordrecht (2004)
20. Suykens, J., Vandewalle, J.: Least Squares Support Vector Machine Classifiers. Neural Processing Letters 9(3), 293–300 (1999)
21. Syed, N., Liu, H., Sung, K.: Incremental Learning with Support Vector Machines. In: Proc. of the 6th ACM SIGKDD Int. Conf. on KDD 1999, San Diego, USA (1999)
22. Tong, S., Koller, D.: Support Vector Machine Active Learning with Applications to Text Classification. In: Proc. of ICML 2000, the 17th Int. Conf. on Machine Learning, Stanford, USA, pp. 999–1006 (2000)
23. Vapnik, V.: The Nature of Statistical Learning Theory. Springer, New York (1995)
24. Wasson: Nvidia's GeForce 8800 Graphics Processor. Technical report, PC Hardware Explored (2006)
25. Yu, H., Yang, J., Han, J.: Classifying Large Data Sets Using SVMs with Hierarchical Clusters. In: Proc. of the ACM SIGKDD Int. Conf. on KDD, pp. 306–315 (2003)
