Yong-jie WANG, Yi-bo WANG, Dun-wei DU, Yan-ping BAI*

Application of CS-SVM Algorithm based on Principal Component Analysis in Music Classification

Abstract: Data of music types have been increasing dramatically; however, traditional classification methods are slow and less accurate in practical large-scale music classification. In order to improve classification accuracy, a CS-SVM music classification model based on principal component analysis is proposed. Firstly, the music characteristics are extracted by the Mel frequency cepstrum coefficient method; secondly, principal component analysis is used to reduce the dimension of the characteristic signals; finally, a CS-SVM classification model is built to classify the music. The average correct rate of the CS-SVM algorithm based on principal component analysis reaches 95.30%. Compared with traditional models, the CS-SVM model has higher accuracy. The results show that this method can be effectively applied to music classification.

Keywords: Principal Component Analysis; Cuckoo Search; Support Vector Machine; Classification
1 Introduction
Music type classification is an important part of multimedia applications. With the rapid development of data-storage and internet technology, music data have grown explosively. Traditional manual retrieval can no longer handle the retrieval and classification of massive music collections, so automatic classification by computer has been proposed. By extracting features from the audio signals, we can obtain the music category. Music classification is essentially a pattern recognition problem, which mainly consists of characteristic extraction and classification. At present, researchers have proposed several methods, such as rule-based audio classification [1], pattern matching, BP neural networks based on characteristic extraction, and so on. However, these methods suffer from a large amount of computation and low classification accuracy. In order to avoid the shortcomings of the traditional methods, a CS-SVM music classification method based on principal
*Corresponding author: Yan-ping BAI, College of Science, North University of China, Taiyuan, China, e-mail: [email protected]
Yong-jie WANG, Yi-bo WANG, College of Science, North University of China, Taiyuan, China
Dun-wei DU, Electromechanical Engineering Institute, Beijing, China
Unauthenticated Download Date | 12/31/17 6:01 PM
component analysis is proposed. In this method, Mel frequency cepstrum coefficients are used to extract the characteristic signal of the music; the characteristic vectors are then reduced by principal component analysis; finally, a CS-SVM classification model is built. Simulation results show that this method has higher classification accuracy.
2 Principles of music classification
Music classification is essentially a pattern recognition process, so the classification follows the general pattern recognition workflow. In this paper, we design the music classification process using pattern recognition theory. The basic process is as follows:
1. Collect the training and testing data for music classification.
2. Extract the characteristic signals of the music data and select a classification model.
3. Train the model and determine the model parameters.
4. Test the classification performance of the model.
3 PCA-CS-SVM music classification models
3.1 Music characteristic extraction
Mel frequency cepstral coefficients (MFCCs) are features widely used in automatic speech and speaker recognition. They were introduced by Davis and Mermelstein in the 1980s, and have been state-of-the-art ever since [2]. MFCCs are commonly derived as follows:
1. Take the Fourier transform of (a windowed excerpt of) a signal.
2. Map the powers of the spectrum obtained above onto the Mel scale, using triangular overlapping windows.
3. Take the logs of the powers at each of the Mel frequencies.
e(j) = \log \sum_{k=0}^{N-1} \omega_{jk} |S_k|, \quad j = 1, 2, \cdots, p \qquad (1)

e(j) is the log energy output of the j-th filter; ω_{jk} is the weight of the j-th triangular filter at spectral point k; |S_k| is the amplitude of the spectrum mapped onto the Mel scale; p is the number of filters, generally 24.
4. Take the discrete cosine transform of the list of Mel log powers, as if it were a signal.
5. The MFCCs are the amplitudes of the resulting spectrum.
x_i = \sqrt{\frac{2}{p}} \sum_{j=1}^{p} e(j) \cos\left( \frac{i\pi}{p}(j - 0.5) \right), \quad i = 1, 2, \cdots, L \qquad (2)

L is the dimension of the MFCC coefficients; in general L ≤ p, and L is 24 in this paper.
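The five steps and Eqs. (1)-(2) can be sketched directly in NumPy. This is a minimal illustration for a single frame, not the authors' implementation; the Hamming window, the 2595·log10(1 + f/700) form of the Mel mapping, and the floor on the filter-bank energy are common conventions assumed here (libraries such as librosa package the same computation as `librosa.feature.mfcc`).

```python
import numpy as np

def hz_to_mel(f):
    # Common Mel-scale mapping assumed for the filter spacing
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, sr, n_filters=24, n_coeffs=24):
    """MFCCs of one frame, following steps 1-5 and Eqs. (1)-(2)."""
    n = len(frame)
    # Step 1: magnitude spectrum of the windowed excerpt
    spec = np.abs(np.fft.rfft(frame * np.hamming(n)))
    freqs = np.fft.rfftfreq(n, d=1.0 / sr)
    # Step 2: triangular filters spaced evenly on the Mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    hz_pts = mel_to_hz(mel_pts)
    e = np.empty(n_filters)
    for j in range(n_filters):
        lo, mid, hi = hz_pts[j], hz_pts[j + 1], hz_pts[j + 2]
        up = np.clip((freqs - lo) / (mid - lo), 0, None)
        down = np.clip((hi - freqs) / (hi - mid), 0, None)
        w = np.minimum(up, down).clip(0, 1)          # omega_jk in Eq. (1)
        # Step 3: log filter-bank energy, Eq. (1) (floored to avoid log 0)
        e[j] = np.log(np.maximum(w @ spec, 1e-10))
    # Steps 4-5: DCT of the log energies, Eq. (2)
    i = np.arange(1, n_coeffs + 1)[:, None]
    j = np.arange(1, n_filters + 1)[None, :]
    basis = np.cos(i * np.pi / n_filters * (j - 0.5))
    return np.sqrt(2.0 / n_filters) * (basis @ e)
```

Applied frame by frame over a recording, this yields the 24-dimensional characteristic signals used below.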
3.2 Principal component analysis
Principal component analysis (PCA) is an optimization algorithm introduced by Pearson in 1901. It is a multivariate statistical method that transforms several original variables into a few principal components. The principal component variables still reflect most of the information of the original variables after dimensionality reduction. Generally, each principal component is a linear combination of the original variables, and the principal components are linearly independent of each other [3]. Define the data matrix as X, where each row of X represents a data value and each column represents an index variable. If there are p variables X_1, X_2, ..., X_p, the linear combinations of the variables can be formulated as:
Y_1 = a_{11} X_1 + a_{12} X_2 + \cdots + a_{1p} X_p
Y_2 = a_{21} X_1 + a_{22} X_2 + \cdots + a_{2p} X_p
\cdots
Y_p = a_{p1} X_1 + a_{p2} X_2 + \cdots + a_{pp} X_p \qquad (3)
The formula can be abbreviated as:
Y_i = a_{i1} X_1 + a_{i2} X_2 + \cdots + a_{ip} X_p, \quad i = 1, 2, \cdots, p \qquad (4)
The coefficients of the PCA model must satisfy the following conditions:
1. Y_i and Y_j (i ≠ j) are uncorrelated.
2. \mathrm{Var}(Y_k) \geq \mathrm{Var}(Y_{k+1}), \quad k = 1, 2, \cdots, p-1
3. a_{k1}^2 + a_{k2}^2 + \cdots + a_{kp}^2 = 1, \quad k = 1, 2, \cdots, p \qquad (5)
The total variance of the principal components is equal to the total variance of the original variables. The proportion of the total variance carried by Y_i is called the contribution rate of that principal component, and the sum of the contribution rates of the first m principal components is called their cumulative contribution rate. When the cumulative contribution rate of the first m principal components is not lower than a limit value (usually 85%), the first m principal components are extracted as input variables, realizing the dimension reduction of the variables.
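The selection rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code; `pca_reduce` is a hypothetical helper that centers the data, diagonalizes the covariance matrix, and keeps the leading components whose cumulative contribution rate first reaches the 85% threshold.

```python
import numpy as np

def pca_reduce(X, threshold=0.85):
    """Project X onto the first m principal components, where m is the
    smallest count whose cumulative contribution rate >= threshold."""
    Xc = X - X.mean(axis=0)                      # center each index variable
    cov = np.cov(Xc, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)             # eigenvalues in ascending order
    order = np.argsort(vals)[::-1]               # sort descending (condition 2)
    vals, vecs = vals[order], vecs[:, order]
    ratio = np.cumsum(vals) / vals.sum()         # cumulative contribution rate
    m = int(np.searchsorted(ratio, threshold)) + 1
    return Xc @ vecs[:, :m], m
```

In the experiment of Section 4, the same rule selects the first 17 of the 24 MFCC dimensions.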
3.3 Support vector machine
The support vector machine is a learning method based on statistical learning theory, first proposed by Vapnik. The basic idea of the SVM model is as follows: the data are mapped to a high-dimensional space F by a nonlinear mapping function φ(x), a decision surface such as (6) is found in the high-dimensional space by linear regression, and then the classification can be achieved [4]. In order to find the best decision surface, condition (7) must be introduced:
f(x) = \omega \cdot \varphi(x) + b = 0 \qquad (6)

y_i \left( \omega \cdot \varphi(x_i) + b \right) \geq 1 \qquad (7)
ω is the weight vector and b is the threshold value. Because a linear decision surface cannot always classify the data correctly, a slack variable and a penalty factor must be added to ensure the correctness of the classification [5]:

\min \frac{1}{2} \|\omega\|^2 + C \sum_{i=1}^{n} \varepsilon_i \qquad (8)
Constraint conditions:

y_i \left( \omega \cdot \varphi(x_i) + b \right) \geq 1 - \varepsilon_i, \quad \varepsilon_i \geq 0, \quad i = 1, 2, \cdots, n \qquad (9)
In the formula, C is the penalty factor (C > 0) and ε_i is the slack variable. In order to avoid the curse of dimensionality, the kernel function k(x_i, x_j) is used to replace the vector inner product (φ(x_i) · φ(x_j)) in the nonlinear partition. Using the Lagrange multiplier algorithm, the problem can be transformed into:
\min \frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j \left( \varphi(x_i) \cdot \varphi(x_j) \right) - \sum_{i=1}^{n} \alpha_i \qquad (10)

and (10) meets these conditions:

\sum_{i=1}^{n} \alpha_i y_i = 0, \quad 0 \leq \alpha_i \leq C \qquad (11)
In the formula, the points with α_i > 0 are called support vectors. A mathematical SVM model for the classification of the characteristic signal is constructed:
f(x) = \mathrm{sign}\left( \sum_{i=1}^{n} \alpha_i y_i k(x_i, x) + b \right) \qquad (12)
In this paper, the radial basis function is selected as the kernel function of SVM, and its definition is as follows:
k(x_i, x_j) = \exp\left( -\frac{\|x_i - x_j\|^2}{2\delta^2} \right) \qquad (13)
In the formula, δ is the width parameter of the radial basis kernel function. The selection of the penalty factor C and the width parameter δ has a great influence on the accuracy of SVM classification [6]. If C is too small, the punishment is too small and the error is large; if C is too large, the generalization ability is poor [7]. δ governs the mapping from the low-dimensional space to the high-dimensional space, and directly affects the complexity of the decision surface.
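As a small illustration of Eq. (13), and of how δ enters a trained classifier through Eq. (12), the sketch below implements both in NumPy. This is a minimal sketch, not the authors' code; the support vectors, multipliers α_i, and threshold b passed to `svm_decision` are assumed to come from solving the dual problem (10)-(11), e.g. with an off-the-shelf SVM solver (note that scikit-learn's `gamma` parameter corresponds to 1/(2δ²)).

```python
import numpy as np

def rbf_kernel(X, Y, delta):
    """Radial basis kernel of Eq. (13); the width delta controls how fast
    similarity decays with distance."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * delta ** 2))

def svm_decision(alpha, y_sv, X_sv, b, x, delta):
    """Decision function of Eq. (12) for one sample x, given support
    vectors X_sv, their labels y_sv, and multipliers alpha."""
    k = rbf_kernel(X_sv, x[None, :], delta)[:, 0]
    return np.sign((alpha * y_sv) @ k + b)
```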
3.4 Cuckoo search algorithm
Cuckoo Search (CS) is a swarm intelligence optimization algorithm proposed by Yang in 2009. It is based on the brood-parasitic reproduction strategy of cuckoo populations and the Lévy flight behavior of some birds and fruit flies. The algorithm converges quickly, has few parameters, and is easy to implement [8]. It has been successfully applied to engineering optimization, design optimization and other practical problems. In order to simulate the nest-searching behavior, the CS algorithm sets up the following three basic principles [9].
1. Each cuckoo lays one egg at a time, and dumps it in a randomly chosen nest.
2. The best nests with high-quality eggs (solutions) are carried over to the next generation.
3. The number of available host nests is fixed, and a host can discover an alien egg with a certain probability. In that case, the host bird either throws the egg away or abandons the nest and builds a completely new nest in a new location.
Based on these three rules, when generating the new solution x_i^{(t+1)} for cuckoo i, a Lévy flight is performed:

x_i^{(t+1)} = x_i^{(t)} + \alpha \oplus L(\lambda), \quad i = 1, 2, \cdots, n \qquad (14)

In the formula, α represents the step size control. The product ⊕ means entry-wise multiplication. L(λ) represents a Lévy random search path. When a position is updated, a random number r is generated (0 < r < 1). If r > Pa, x_i^{(t+1)} is changed randomly; otherwise it is left unchanged. Finally, the group of nest positions with the best test values is retained. The basic CS algorithm uses the Lévy flight to generate the step size, which is random, lacks adaptability, and cannot ensure fast convergence. In order to balance global optimization ability and accuracy, the step size is adjusted adaptively in this paper:
step_i = step_{min} + (step_{max} - step_{min}) \, d_i \qquad (15)

In this formula, step_max is the maximum step and step_min is the minimum step.

d_i = \frac{|n_i - n_{best}|}{d_{max}} \qquad (16)
In the formula, n_best is the best nest position at time t, n_i is the position of nest i, and d_max is the maximum distance between the best nest position and the other nests.
3.5 PCA-CS-SVM model music classification process
1. Collect music classification data for training and testing; extract the characteristic signals of the music data with MFCCs, and label them.
2. Reduce the dimension of the characteristic signals by PCA.
3. According to experience, determine the ranges of C and δ; determine step_max, step_min, and the maximum epoch number N_max.
4. Initialize the discovery probability Pa = 0.75 and randomly generate N nest positions. Compute the cross-validation error on the training set, and find the current optimal nest position and the corresponding minimum error.
5. Keep the best nest position, i.e. the one whose error was minimal in the last generation.
6. Update the nest positions by Lévy flight, with the step size calculated by formula (15). This yields a new set of nest positions, whose errors are then calculated.
7. According to the classification error, compare each new nest position with the position from the last generation. If the new position is better, it replaces the old one; otherwise the old nest is retained.
8. Compare Pa with a random number r. Nests discovered with low probability are kept; nests discovered with high probability are changed randomly, and the corresponding classification errors are recalculated. By replacing nest positions with larger errors by those with smaller errors, a new and better set of nests is obtained.
9. Find the best nest x_b^t from the last step, and judge whether the minimum error meets the accuracy requirement. If it does, stop the iteration and output the global minimum error and the corresponding best nest; otherwise, return to step four.
10. Train the SVM with C and δ determined according to the best nest position x_b^t. The resulting music classification model is then used to predict the classification of the test set.
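The procedure above can be condensed into a short optimization loop. The sketch below is a simplified, hypothetical rendering of steps 4-9 in NumPy: `err` stands for the cross-validation error of an SVM trained with a given (C, δ) pair and is supplied by the caller; the Lévy step uses Mantegna's algorithm, a common realization of L(λ) that the paper does not specify; and scaling the flight by the distance to the best nest is likewise one conventional choice, not the authors' stated formula.

```python
import numpy as np
from math import gamma, sin, pi

rng = np.random.default_rng(0)

def levy(size, lam=1.5):
    """Mantegna's algorithm: heavy-tailed step directions for L(lambda)."""
    sigma = (gamma(1 + lam) * sin(pi * lam / 2) /
             (gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.normal(0.0, sigma, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / lam)

def cuckoo_search(err, lo, hi, n_nests=15, n_iter=50, pa=0.75,
                  step_min=0.01, step_max=0.5):
    """CS over a parameter vector such as (C, delta), following steps 4-9."""
    lo, hi = np.asarray(lo, float), np.asarray(hi, float)
    nests = rng.uniform(lo, hi, (n_nests, len(lo)))       # step 4
    fit = np.array([err(p) for p in nests])
    for _ in range(n_iter):
        best = nests[fit.argmin()].copy()                 # step 5: current best
        dist = np.linalg.norm(nests - best, axis=1)
        d = dist / max(dist.max(), 1e-12)
        step = (step_min + (step_max - step_min) * d)[:, None]   # Eq. (15)
        cand = np.clip(nests + step * levy(nests.shape) * (nests - best), lo, hi)
        cfit = np.array([err(p) for p in cand])
        better = cfit < fit                               # step 7: keep improvements
        nests[better], fit[better] = cand[better], cfit[better]
        discover = rng.random(n_nests) > pa               # step 8: abandon some nests
        discover[fit.argmin()] = False                    # never abandon the best nest
        nests[discover] = rng.uniform(lo, hi, (int(discover.sum()), len(lo)))
        fit[discover] = [err(p) for p in nests[discover]]
    b = fit.argmin()                                      # step 9
    return nests[b], fit[b]
```

Calling `cuckoo_search(err, [C_min, delta_min], [C_max, delta_max])` returns the (C, δ) pair with the lowest cross-validation error, which step 10 then uses to train the final SVM.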
4 Simulation experiment
In order to verify the classification performance of the PCA-CS-SVM model, it is used to classify four different music signals: folk song, zither, pop and rock. For each type of music, MFCCs are used to extract 500 groups of 24-dimensional
characteristic signals. At the same time, the characteristic signals are labeled 1, 2, 3, 4. The extracted signals are stored in a database file; each group of data has 25 dimensions, where the first dimension is the category identifier and the remaining 24 dimensions are the characteristic signal. The four sets of data are combined into one set, whose dimension is reduced by PCA. From the result, the cumulative contribution rate of the first 17 principal components is 86.36%, so the first 17 components are selected as the experimental input. 1500 groups of data were randomly selected as training data, and the remaining 500 groups were used to test the classification performance of the model. The desired output of each group is set according to the characteristic signal class; if the identifier is 1, the desired output is set to the vector [1, 0, 0, 0]. The test data are classified by the trained PCA-CS-SVM model. The classification error of the model is shown in Figure 1, the prediction of the model in Figure 2, and the classification accuracy in Table 1. From the results we know that the music classification algorithm based on the PCA-CS-SVM model has higher accuracy and efficiency than the other models in music classification. Furthermore, in order to verify the superiority of the model, we conducted three comparison experiments: a BP neural network, an additional-momentum BP algorithm, and the PCA-SVM algorithm. As the table shows, the PCA-CS-SVM model has higher accuracy than the other models.
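The data layout described above can be sketched as follows. This is an illustrative stand-in, not the authors' code: the feature values are synthetic (the real ones come from the MFCC extraction), the PCA reduction to 17 components applied in the paper before training is omitted, and only the 25-column layout, the one-hot desired outputs, and the random 1500/500 split follow the text.

```python
import numpy as np

rng = np.random.default_rng(42)

# Stand-in for the stored database file: 2000 rows of
# [category identifier, 24-dimensional characteristic signal].
data = np.column_stack([rng.integers(1, 5, 2000),        # labels 1..4
                        rng.normal(size=(2000, 24))])    # synthetic features

labels = data[:, 0].astype(int)
features = data[:, 1:]

# Desired output per the text: identifier 1 -> [1, 0, 0, 0], etc.
one_hot = np.eye(4)[labels - 1]

# Random 1500 / 500 train-test split
idx = rng.permutation(len(data))
train, test = idx[:1500], idx[1500:]
X_train, y_train = features[train], one_hot[train]
X_test, y_test = features[test], one_hot[test]
```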
[Figure 1. PCA-CS-SVM model classification error. x-axis: music signal (0-500); y-axis: classification error (-2 to 2).]
[Figure 2. PCA-CS-SVM forecast: forecast classification vs. actual classification (classes 1-4) over the 500 test signals.]

Table 1. Model classification accuracy (%)

Music signal    PCA-CS-SVM    PCA-SVM    BP       AMBP
Folk song       91.43         89.00      78.29    84.26
Zither          99.90         99.12      94.00    94.35
Pop             93.50         81.29      68.12    88.32
Rock            96.36         98.86      90.46    88.50
5 Conclusions
Automatic music classification provides a new way to handle large-scale music classification. Based on the characteristics extracted from the music signals by MFCCs, the PCA-CS-SVM model is used to classify music, and the average classification accuracy reaches 95.30%. In particular, the zither classification accuracy reaches 99.90%. Experiments show that this method is reasonable and feasible.
Acknowledgment: National Natural Science Foundation of China (61275120)
References
[1] S. G. Mallat, "A Theory for Multiresolution Signal Decomposition: the Wavelet Representation," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 11, Jul. 1989, pp. 674-693. doi: 10.1109/34.192463.
[2] S. Molau, "Computing Mel-frequency cepstral coefficients on the power spectrum," IEEE International Conference on Acoustics, vol. 8, May 2001, pp. 73-76. doi: 10.1109/ICASSP.2001.940770.
[3] R. Bro, A. K. Smilde, "Principal component analysis," Analytical Methods, Mar. 2014, pp. 433-459. doi: 10.1039/C3AY41907J.
[4] A. Astorino, E. Gorgone, M. Gaudioso, et al., "Data preprocessing in semi-supervised SVM classification," Optimization, vol. 60, Jan. 2011, pp. 143-151. doi: 10.1080/02331931003692557.
[5] M. H. Horng, "Performance evaluation of multiple classification of the ultrasonic supraspinatus images by using ML, RBFNN and SVM classifiers," Expert Systems with Applications, vol. 37, Nov. 2010, pp. 4146-4155. doi: 10.1016/j.eswa.2009.11.008.
[6] X. S. Yang, S. Deb, "Engineering Optimisation by Cuckoo Search," Mathematical Modelling and Numerical Optimisation, vol. 1, May 2010, pp. 330-343. arXiv: 1005.2908v3.
[7] X.-S. Yang, S. Deb, "Multiobjective cuckoo search for design optimization," Computers & Operations Research, vol. 40, Jun. 2013, pp. 1616-1624. doi: 10.1016/j.cor.2011.09.026.
[8] X. Zhang, X. Chen, Z. He, "An ACO-based algorithm for parameter optimization of support vector machines," Expert Systems with Applications, vol. 37, Sep. 2010, pp. 6618-6628. doi: 10.1016/j.eswa.2010.03.067.
[9] C. H. Wu, Y. Ken, T. Huang, "Patent classification system using a new hybrid genetic algorithm support vector machine," Applied Soft Computing, vol. 10, Sep. 2010, pp. 1164-1177. doi: 10.1016/j.asoc.2009.11.033.