Proceedings of the IEEE International Conference on Information and Automation, Yinchuan, China, August 2013
Urban Vehicle Classification Based on Linear SVM with Efficient Vector Sparse Coding

Tao Ma and Yuexian Zou*
School of Electronic and Computer Engineering, Peking University, Shenzhen, Guangdong Province, China
*Corresponding author: [email protected]

Qing Ding
Shenzhen Traffic Science and Technology Institute, Shenzhen, Guangdong Province, China
[email protected]
Abstract - This paper presents a new method for the urban vehicle classification problem that incorporates an efficient vector sparse coding technique with the linear support vector machine (SVM) classifier. SIFT descriptors give good local characteristics of a vehicle image; in general, however, SIFT feature vectors are not linearly separable. With sparse coding, the SIFT feature vectors can first be projected to a higher dimensional feature domain, where the resultant sparse code vectors may be more distinguishable than in the original feature domain, so that a linear SVM classifier can be adopted. Conventional vector sparse coding is computationally expensive, which reduces its practical value for real vehicle classification applications. In this paper, an efficient L2-norm constraint based vector sparse coding algorithm for vehicle classification is formulated and derived. Performance evaluations using real vehicle images extracted from surveillance video are carried out over six vehicle classes (bus, truck, SUV, van, car, and motorcycle). Experimental results validate the effectiveness of the proposed method, and it is encouraging that a good classification performance is achieved.
Fig. 1 The general framework of supervised vehicle classification (blocks: Input Video/Images → Feature Extraction and Selection → Classifier Design (Training) → Decision).
Index Terms – urban vehicle classification, vector sparse coding, L2-norm constraint, linear SVM, classification accuracy.

I. INTRODUCTION

Vision-based traffic monitoring has increased the support for traffic management due to its low installation cost and the wide range of information it provides. Automatic urban vehicle classification is one of the key functions of traffic management systems and has been used in Electronic Toll Collection (ETC), traffic control, traffic flow analysis, parking optimization, surveillance, etc. Two major challenges motivate the research on automatic urban vehicle classification in a surveillance framework. One is the robustness and reliability of the classification under varying and complex environments; the other is the high computational complexity of classifiers, which restricts scalability in practical applications. Vision-based vehicle classification is a typical pattern recognition problem. The general framework of supervised vehicle classification is shown in Fig. 1, which is followed by most previous works. Many solutions have been proposed for this problem. Chen et al. [1] took the width and the aspect ratio of the vehicle foreground blob to form the feature vector; a nonlinear support vector machine (SVM) classifier with a polynomial kernel function was adopted to classify three vehicle classes (car, van, and HGV). In [2], the random forest and nonlinear SVM classifiers were compared on four classes (car, van, bus, and motorcycle). In that study, measures of size and shape derived from each binary vehicle silhouette formed the feature vectors, and the experimental results showed that the nonlinear SVM classifier with a Gaussian kernel outperforms the random forest classifier. In [3], the vehicle silhouette and intensity-based pyramid HOG features were extracted to train a nonlinear SVM classifier with a Gaussian kernel. Ma and Grimson [7] used edge-based features and modified scale invariant feature transform (SIFT) descriptors to represent vehicle images, and a Bayesian decision rule was adopted to distinguish between sedan, minivan, and taxi. It is noted from these studies that the nonlinear SVM classifier has been widely used and highly successful in many vehicle classification tasks. Compared with the linear SVM classifier, the nonlinear SVM classifier usually achieves better classification performance, mainly because it maps the data into a higher dimensional space through nonlinear kernel functions. However, the nonlinear SVM classifier incurs a computational complexity of O(n^3) in training and O(n) in testing, where n is the number of support vectors, which grows linearly with the training size [14]. This high computational complexity implies poor scalability for real applications, where the training size is typically large. On the contrary, the linear SVM classifier has a computational complexity of O(n) in training and constant complexity in testing, which is significantly lower than that of the nonlinear SVM. Hence, if the linear SVM classifier can achieve a good classification performance, it is preferred over the nonlinear SVM to provide the tradeoff
978-1-4799-1334-3/13/$31.00 ©2013 IEEE
between computational complexity and classification accuracy when the scalability of training and the speed of testing are the main concerns. Features derived from the silhouette and edges are popular in urban vehicle classification; however, the reliability of these features places high demands on vehicle foreground extraction and image resolution. Generally, the size and quality of vehicle patches are limited in surveillance videos, and varying lighting conditions further complicate the problem. To tackle these challenges, considering the geometric stability of vehicles and the lack of clutter after foreground extraction, densely sampled SIFT descriptors [13] are used in this paper to capture the discriminative information and give a robust primary representation of a vehicle. Aiming to develop an effective urban vehicle classification method, we use a new scheme based on sparse coding to map all the extracted SIFT feature vectors into a higher dimensional space, which yields a more discriminative feature representation of vehicles; the linear SVM classifier is then adopted in the sparse feature domain. The systematic block diagram of the proposed method for vehicle classification is shown in Fig. 2. In Fig. 2, the vector sparse coding can be designed with different optimization cost functions. For example, the vector sparse coding algorithm proposed by Yang et al. in [4] is based on an L1-norm sparsity constraint to obtain the sparse representation of a vector; it is implemented in this paper and termed the L1-norm-VSC algorithm. In general, sparse coding consists of two main stages: dictionary learning and vector sparse coding. Careful evaluation shows that the dictionary learning can be done in an off-line manner. Hence, for a real application, the computational cost of the vector sparse coding is the main concern.
For the L1-norm-VSC algorithm, solving the L1-norm optimization problem is computationally expensive, which reduces the practical value of sparse coding for real vehicle classification applications. To reduce the computational cost, we develop an efficient vector sparse coding algorithm based on an L2-norm constraint, termed the L2-norm-VSC algorithm. In the rest of this paper, Section II details the vehicle feature extraction and the process of coding SIFT vectors by the L1-norm-VSC algorithm and the proposed L2-norm-VSC algorithm, respectively. Section III describes the vehicle classification using the linear SVM. Section IV presents the experimental results and Section V gives the conclusion.
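As an overview of the processing chain in Fig. 2, the following minimal NumPy sketch traces the shapes through the pipeline; the random dictionary and random "SIFT" vectors are hypothetical stand-ins for the learned dictionary and the extracted descriptors:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: random SIFT features and a random dictionary
M, L, N = 128, 1024, 50      # SIFT dimension, dictionary size, number of SIFT-FVs
lam = 0.15                   # regularization weight (lambda in the coding objective)

D = rng.standard_normal((M, L))   # dictionary (learned off-line in the paper)
Y = rng.standard_normal((M, N))   # SIFT feature matrix Ys, one SIFT-FV per column

# Vector sparse coding in the proposed L2-norm-VSC form: X = P Y
P = np.linalg.solve(lam * np.eye(L) + D.T @ D, D.T)
X = P @ Y                         # sparse code matrix Xs, shape (L, N)

# Spatial max pooling (one region for simplicity; the paper uses a 4x4 partition)
z = X.max(axis=1)                 # pooled sparse code vector fed to the linear SVM
print(X.shape, z.shape)           # (1024, 50) (1024,)
```

The pooled vector z is what ultimately reaches the linear SVM, which is why the coding step dominates the online cost.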
Fig. 3 Vehicle ROI extraction: (a) a video frame; (b) the vehicle ROI (Region of Interest) extracted from (a).
II. VECTOR SPARSE CODING ON SIFT DESCRIPTOR

First, the vehicle ROI (Region of Interest) of each video frame is extracted using background subtraction to reduce the clutter in a surveillance framework. For example, Fig. 3(a) shows a video frame and Fig. 3(b) its vehicle ROI (a bus). Each vehicle ROI is converted into a grayscale image as the input of the SIFT extraction. To facilitate the description, let Is denote the vehicle ROI image extracted from a video frame, Ys the SIFT feature matrix of Is, which consists of all the SIFT feature vectors extracted from Is, and Xs the sparse code matrix obtained by performing vector sparse coding on each SIFT feature vector in Ys.

A. SIFT Feature Vector Extraction

SIFT [5] has been empirically shown to outperform many other local descriptors [6]. A SIFT feature vector (SIFT-FV) is created by first computing the gradient orientation and magnitude at each image sample point in a region around an anchor point. The region is divided into r × r subregions. A gradient orientation histogram for each subregion is then formed by accumulating the samples within the subregion, weighted by the corresponding gradient magnitudes. All the orientation histograms obtained from the subregions are concatenated to give a SIFT-FV [7]. Generally, the best-performing SIFT-FV [5] is extracted from a 16×16 pixel patch divided into 4×4 subregions; the resulting 4×4 array of orientation histograms, with 8 orientation bins each, creates a single SIFT-FV with 4×4×8 = 128 elements. Considering the reduction of clutter after vehicle ROI extraction, and motivated by the good performance of densely sampled SIFT descriptors in object recognition [13, 14], we utilize a dense regular grid instead of the commonly adopted interest points to extract SIFT-FVs, which can capture more discriminative information about vehicles.
For a vehicle ROI image Is, each SIFT-FV is extracted from a 16×16 pixel patch, and the SIFT-FVs are densely sampled on a grid with a step size of 8 pixels. All the SIFT-FVs extracted from Is constitute the SIFT feature matrix Ys of Is.
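A simplified, hypothetical sketch of this dense extraction in pure NumPy is given below; it omits SIFT's Gaussian weighting, trilinear interpolation, and orientation normalization, keeping only the 4×4 grid of 8-bin histograms (128 elements per patch) and the stride-8 sampling grid:

```python
import numpy as np

def dense_sift_like(img, patch=16, stride=8, grid=4, bins=8):
    """Simplified dense descriptor in the spirit of SIFT: per 16x16 patch,
    a 4x4 grid of 8-bin gradient-orientation histograms -> 128-d vector."""
    img = img.astype(float)
    gy, gx = np.gradient(img)                             # image gradients
    mag = np.hypot(gx, gy)                                # gradient magnitude
    ori = np.mod(np.arctan2(gy, gx), 2 * np.pi)           # orientation in [0, 2*pi)
    bin_idx = np.minimum((ori / (2 * np.pi) * bins).astype(int), bins - 1)

    sub = patch // grid                                   # 4-pixel subregions
    feats = []
    H, W = img.shape
    for r in range(0, H - patch + 1, stride):             # dense stride-8 grid
        for c in range(0, W - patch + 1, stride):
            hist = np.zeros((grid, grid, bins))
            for i in range(patch):
                for j in range(patch):
                    hist[i // sub, j // sub, bin_idx[r + i, c + j]] += mag[r + i, c + j]
            v = hist.ravel()                              # 4*4*8 = 128 elements
            n = np.linalg.norm(v)
            feats.append(v / n if n > 0 else v)
    return np.array(feats).T                              # Ys: one SIFT-FV per column

# toy stand-in for a grayscale vehicle ROI
Ys = dense_sift_like(np.random.default_rng(0).random((40, 64)))
print(Ys.shape)  # (128, 28): 4 x 7 grid locations on a 40 x 64 image
```

A production implementation would instead use a tuned dense-SIFT library routine; this sketch only illustrates how the 128-dimensional vectors and the feature matrix Ys arise.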
B. Vector Sparse Coding under L1-norm Constraint
Let Ψ = [y1, …, yN] be a set of SIFT-FVs, where yi (i = 1, 2, …, N) is an M × 1 (M = 128) SIFT-FV and N is the total number of SIFT-FVs in Ψ. The sparse coding algorithm
Fig. 2 The system framework of the proposed method for urban vehicle classification (processing chain: Input Images → Feature Extraction → Vector Sparse Coding → Linear SVM (Training / Decision) → Vehicle Types).
proposed by Yang et al. in [4] is based on the following optimization:
min_{X,D} ∑_{i=1}^{N} { ||yi − Dxi||_2^2 + λ||xi||_1 }
s.t. ||dj||_2^2 ≤ 1, ∀j = 1, 2, …, L    (1)

where D = [d1, …, dL] is the M × L dictionary matrix and dj (j = 1, 2, …, L) denotes an M × 1 base vector in D. L is generally greater than M to obtain an over-complete dictionary. X = [x1, …, xN] denotes the set of sparse code vectors associated with the SIFT-FV set Ψ, and xi (i = 1, 2, …, N) is the L × 1 sparse code vector obtained by coding yi over D. It is noted that the L1-norm (||·||_1) of xi is the sparsity regularization term and λ is a free parameter that enforces the sparsity of the solution. Lee et al. [8] proved that (1) is not convex in X and D simultaneously, but it is convex in X when D is fixed and convex in D when X is fixed. The conventional approach to (1) is to solve it iteratively by alternately optimizing over D or X while fixing the other. When D is fixed, the optimization problem in (1) can be solved by optimizing over each xi individually as follows:

min_{xi} { ||yi − Dxi||_2^2 + λ||xi||_1 }    (2)

This is essentially a linear regression problem with L1-norm regularization on xi. The optimization problem in (2) can be solved efficiently by the feature-sign search algorithm proposed in [8]. When X is fixed, the optimization problem in (1) reduces to the following:

min_D ||Ψ − DX||_F^2
s.t. ||dj||_2^2 ≤ 1, ∀j = 1, 2, …, L    (3)

Eq. (3) is a least squares problem with an L2-norm constraint on each base vector dj in D. The Lagrange dual algorithm proposed in [8] can be employed to solve (3) efficiently. As mentioned before, sparse coding has a dictionary learning phase and a vector sparse coding phase. The algorithm to obtain the dictionary matrix D is summarized in Table I, where the input SIFT-FV set Ψ needs to be given.

TABLE I
THE DICTIONARY LEARNING ALGORITHM

Algorithm 1.
Input: SIFT-FV set Ψ = [y1, …, yN].
Initialization: Randomly generate L base vectors in D, each of which is normalized to a unit vector.
Repeat
  1: Fixing D, Eq. (2) is solved for each yi to form a temporary X.
  2: Fixing X, Eq. (3) is solved to get a temporary D.
Until the maximum number of iterations is exceeded.
Output: The final D and X. The final D is the dictionary we want.

In our implementation, we collected 200 vehicle images (side views) covering the various vehicle classes to extract SIFT descriptors from the patches. As a result, about 50,000 SIFT-FVs were obtained and used to form the input SIFT-FV set Ψ for dictionary learning. In the vector sparse coding phase, Eq. (1) is solved with respect to X only, once the dictionary D is available. The sparse code vector xi of each SIFT-FV yi is obtained by solving the L1-norm optimization problem in (2); this procedure is termed the L1-norm-VSC algorithm.

C. Efficient Vector Sparse Coding under L2-norm Constraint

From the L1-norm-VSC algorithm, it is noted that solving the L1-norm constrained optimization problem in (2) is computationally demanding when doing the online vector sparse coding, which reduces the practical value of the sparse coding strategy for real vehicle classification tasks. In this paper, an efficient coding algorithm is proposed to reduce the computational cost of the vector sparse coding while still maintaining a good feature representation of vehicles. In principle, the L0-norm is the best measure of the sparsity of a vector, as it counts the number of non-zero elements. However, vector sparse coding under an L0-norm sparsity constraint is an NP-hard problem. The L1-norm based vector sparse coding, as shown in (2), is widely adopted because the L1-norm is the closest convex function to the L0-norm [9]. Moreover, L1-norm based coding is preferred to L0-norm based coding since many fast algorithms have been developed for it [8]. However, for real applications, L1-norm constraint based vector sparse coding is still too time-consuming because it involves an iterative optimization process. Considering the over-completeness and redundancy in the coding results for urban vehicle classification, and motivated by the good performance of L2-norm regularization in face recognition [10], we further relax the sparsity of the sparse code vectors and reformulate the coding algorithm using an L2-norm constraint instead of the L1-norm. As mentioned before, the proposed algorithm is termed the L2-norm-VSC algorithm. For the proposed L2-norm-VSC algorithm, the dictionary learning is the same as that shown in Table I under the L1-norm constraint, which learns a set of over-complete dictionary bases in an off-line manner. However, the online vector sparse coding is reformulated as the following optimization problem:
x̂i = arg min_{xi} { ||yi − Dxi||_2^2 + λ||xi||_2^2 }    (4)
Eq. (4) is a regularized least squares problem that can be considered a weak vector sparse coding function, where an L2-norm constraint (||·||_2) is imposed on xi to introduce a certain sparsity to the solution. The sparsity introduced by (4) is weaker than that introduced by the L1-norm constraint in (2). However, for the specific urban vehicle classification task, as
the intra-class variation of urban vehicles is limited and the number of vehicle classes to be distinguished is usually small, the sparsity requirement is correspondingly weaker, which has been validated by our experiments. The efficiency of (4) stems from the fact that it admits an analytical solution, derived as follows:

x̂i = P·yi    (5)
(a) (b) Fig. 4 Spatial pooling: (a) the sparse code vectors associated with the SIFT feature vectors for a vehicle ROI image, each blue circle represents one sparse code vector obtained at its location; (b) the pooled sparse code vectors of each subregion. All the pooled vectors in (b) are directly concatenated to form the final image representation.
where

P = (λ·I + D^T D)^(-1) D^T    (6)

and I denotes the identity matrix. It is noted that P is a projection matrix associated only with the over-complete dictionary D. Since D can be obtained in the dictionary learning phase in an off-line manner, P can also be pre-calculated. Therefore, coding yi over D with the proposed vector sparse coding algorithm is only the simple mapping shown in (5). As a result, the computational cost of the vector sparse coding is drastically reduced by solving (4) instead of (2). By coding each SIFT-FV in Ys over D via (5) and (6), the sparse code matrix Xs of Is is determined.
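A minimal NumPy sketch of the proposed L2-norm-VSC coding (Eqs. (4)-(6)) follows; a random dictionary with unit-norm columns stands in for a learned one. The final assertion checks that the closed-form code satisfies the normal equations of (4):

```python
import numpy as np

rng = np.random.default_rng(1)
M, L, lam = 128, 1024, 0.15

# Stand-in dictionary with unit-norm base vectors (||d_j||_2 <= 1 constraint)
D = rng.standard_normal((M, L))
D /= np.linalg.norm(D, axis=0)

# Off-line: precompute the projection matrix P = (lam*I + D^T D)^{-1} D^T  (Eq. 6)
P = np.linalg.solve(lam * np.eye(L) + D.T @ D, D.T)

# On-line: coding any SIFT-FV y is a single matrix-vector product  (Eq. 5)
y = rng.standard_normal(M)
x_hat = P @ y

# x_hat satisfies the normal equations of Eq. (4): (D^T D + lam*I) x = D^T y
assert np.allclose((D.T @ D + lam * np.eye(L)) @ x_hat, D.T @ y)
```

This is exactly why the method is fast: the iterative L1 solver is replaced by one precomputed matrix-vector multiplication per descriptor.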
III. SUPERVISED VEHICLE CLASSIFICATION

In this study, the supervised linear SVM classifier is adopted to implement the vehicle classification.

A. Pooled Sparse Code Vector

This subsection presents the formation of the input vectors of the linear SVM classifier. Firstly, each vehicle ROI image is partitioned into several subregions. Then, for each subregion, the sparse code vectors within it are pooled together with a pooling function to get the corresponding pooled sparse code vector (PSCV). For ease of explanation, let zs(p,q) denote the resulting PSCV of the (p,q)-th subregion of image Is. Careful evaluation shows that the max pooling function outperforms alternative pooling functions such as the mean of absolute values and the square root of mean squared statistics. Hence, the spatial pooling in each subregion is based on the following equation:

z_j^s(p, q) = max{ x_j1, x_j2, …, x_jT },  j = 1, 2, …, L    (7)

where z_j^s(p, q) is the j-th element of the PSCV zs(p,q), x_jt denotes the j-th element of the t-th sparse code vector in the (p,q)-th subregion of image Is, and T is the number of sparse code vectors in the (p,q)-th subregion. It is noted that L, the dimension of each sparse code vector, equals the number of base vectors in dictionary D. Fig. 4 illustrates the spatial pooling in our implementation, where a vehicle ROI image is partitioned into 4×4 subregions and spatial pooling is operated in each subregion to get the corresponding PSCV, as shown in Fig. 4(b). When the max pooling across all the subregions of an image is finished, the PSCVs from all subregions are directly concatenated and normalized to form the final image representation. For image Is, its final representation is denoted by zs, which serves as an input vector of the linear SVM.

B. Vehicle Classification Using Linear SVM

A multiclass linear SVM classifier is employed to classify the vehicles. For a binary linear SVM classifier, the discrimination surface is the optimal hyperplane defined by:

w^T z + b = 0    (8)

where w is the weight vector of the linear discriminant function and b is a constant. We take the widely used one-vs-all (OVA, or one-vs-rest) strategy to train n binary linear SVM classifiers for building a multiclass linear SVM classifier, where n is the number of vehicle classes to be classified. The classification process using SVM has a training phase and a testing phase. For vehicle classification with n classes, a training set of instance-label pairs (zs, cs), s = 1, 2, …, l, is first built, where zs is the final representation of image Is, cs ∈ {1, 2, …, n} is the given class label of Is, and l is the number of training images. Given the labeled samples in the training set, the pair of w and b for each binary linear SVM classifier is obtained in the training phase by using the theory presented in [11]. In SVM testing (decision), a query image is directly classified by the trained multiclass linear SVM classifier.

IV. EXPERIMENTAL RESULTS

A. Experimental Setup

Extensive research has shown that there is no appropriate public dataset for evaluating classification performance on a variety of urban vehicle classes. Following most previous works on vehicle classification, we collected video data of passing vehicles on a real urban road to form the dataset and evaluate the performance. The videos were taken by a Sony camera installed at the roadside of a busy urban road and captured at 25 frames per second with an image size of 640 × 480 pixels. The camera captured the side view of vehicles at a fixed viewpoint. Six classes of vehicles are considered: bus, car, motorcycle, SUV, truck, and van. We extracted one frame for each passing vehicle; by manual counting, 987 vehicle images were collected, comprising 117 buses, 296 cars, 120 motorcycles, 168 SUVs, 127 trucks, and 159 vans. Each vehicle was labelled manually with the corresponding vehicle class for performance assessment. To compare with our vehicle classification method, which classifies vehicles in the sparse feature domain of SIFT descriptors, we also implemented the popular image classification method proposed by Lazebnik et al. in [13] on our urban vehicle dataset. Lazebnik's method classifies images in the original feature domain of SIFT
descriptors using the nonlinear SVM classifier. To achieve the best performance of their method, the spatial pyramid suggested in [13] is used when doing the spatial pooling. In the implementation of our proposed method, the two vector sparse coding algorithms described in Section II, the L1-norm-VSC proposed in [4] and the L2-norm-VSC proposed in this paper, are used respectively for comparison. Several experimental parameters are set as follows. The vehicle ROI images extracted from video frames are resized to be no larger than 160 × 120 pixels with preserved aspect ratio using bicubic interpolation [12]. For the proposed method, the dictionary size L is set to 1024 for both the L1-norm-VSC and L2-norm-VSC algorithms as suggested in [4], and λ used in Eq. (2) and Eq. (4) is empirically set to 0.15. For Lazebnik's method, a three-level pyramid is constructed, where 2^l × 2^l subregions, l = 0, 1, 2, are used when doing the spatial pooling. A PC is used as the computing platform: all experiments are carried out in MATLAB R2011b on a 3.20 GHz dual-core CPU with 8 GB RAM, running 64-bit Windows 7.
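The one-vs-all training scheme of Section III-B can be sketched as follows. This is not the solver of [11] used in the paper; it is a toy subgradient-descent linear SVM on synthetic, well-separated stand-in data, shown only to illustrate the OVA wiring:

```python
import numpy as np

def train_linear_svm(Z, y, lam=1e-3, lr=0.1, epochs=200, seed=0):
    """Binary linear SVM via subgradient descent on the hinge loss
    (labels in {-1, +1}); returns the hyperplane (w, b) of Eq. (8)."""
    rng = np.random.default_rng(seed)
    n, d = Z.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        for i in rng.permutation(n):
            if y[i] * (Z[i] @ w + b) < 1:          # margin violated
                w += lr * (y[i] * Z[i] - lam * w)
                b += lr * y[i]
            else:                                   # only regularization
                w -= lr * lam * w
    return w, b

def train_ova(Z, labels, n_classes):
    """One-vs-all wrapper: one binary linear SVM per vehicle class."""
    return [train_linear_svm(Z, np.where(labels == c, 1.0, -1.0))
            for c in range(n_classes)]

def predict_ova(models, Z):
    scores = np.stack([Z @ w + b for w, b in models], axis=1)
    return scores.argmax(axis=1)                    # most confident class wins

# Toy "pooled sparse code vectors": 3 well-separated synthetic classes
rng = np.random.default_rng(0)
centers = rng.standard_normal((3, 16)) * 5
Z = np.vstack([centers[c] + rng.standard_normal((30, 16)) for c in range(3)])
labels = np.repeat(np.arange(3), 30)

models = train_ova(Z, labels, 3)
acc = (predict_ova(models, Z) == labels).mean()
print(acc)
```

In practice one would use a dedicated linear SVM package for the training phase; the sketch only shows how n binary hyperplanes combine into the multiclass decision.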
B. Classification Results

Following the common benchmarking procedure of multiclass classification, we repeated the classification process 10 times with different random selections of the training and testing images and averaged the results to obtain reliable classification performance. In our experiments, the training set was formed using 15, 20, 25, and 30 images per vehicle class, respectively, and we tested on the rest. For each run, per-class accuracy values were recorded and their average was computed; we report the final average classification accuracy as the mean over the individual runs. Detailed comparison results are shown in Table II. Increasing the training size improves the average classification accuracy for all methods. In all cases, our proposed method with L1-norm-VSC outperforms Lazebnik's method by more than 4 percentage points, and our proposed method with L2-norm-VSC outperforms Lazebnik's method by more than 3 percentage points. Our method using the proposed L2-norm-VSC achieves average classification accuracy comparable to that using L1-norm-VSC. Table III, Table IV, and Table V show the resulting confusion matrices between the six vehicle classes for the methods with 30 training images per class. The results are also the mean of the values from the 10 runs. From Table III, the per-class accuracies of Lazebnik's method for (bus, car, motorcycle, SUV, truck, van) are (0.9195, 0.8722, 1.0, 0.8406, 0.9588, 0.9147), respectively. From Table IV, the accuracies of the proposed method with L1-norm-VSC are (1.0, 0.9474, 1.0, 0.9058, 1.0, 0.9457), respectively. From Table V, the accuracies of the proposed method with L2-norm-VSC are (1.0, 0.9361, 1.0, 0.8768, 1.0, 0.9613), respectively. From the three confusion matrices, both the proposed method with L1-norm-VSC and that with L2-norm-VSC outperform Lazebnik's method in almost every vehicle class. Three classes (bus, motorcycle, and truck) achieve 100% accuracy with our proposed methods. The proposed method with L2-norm-VSC has higher accuracy for van but lower accuracy for SUV compared with that with L1-norm-VSC. It is noted that for all methods the largest number of misclassifications occurs between SUV and car, whose size and shape hold some similarity. Table VI shows the running time of the vector sparse coding process using the different algorithms on our experimental platform. The results are the average coding time for one vehicle image in the experimental dataset. The proposed L2-norm-VSC algorithm is about 39 times faster than the L1-norm-VSC algorithm; hence, the coding time is significantly reduced. From the experimental results shown above, we can conclude that our proposed method using the sparse coding strategy for urban vehicle classification achieves a good classification performance. In addition, our method using the proposed L2-norm-VSC algorithm provides a comparable
TABLE III
CONFUSION MATRIX FOR LAZEBNIK'S METHOD

Predicted Class →   Bus  Car  Motorcycle  SUV  Truck  Van
Bus                  80    0           0    0      0    7
Car                   0  232           0   28      0    6
Motorcycle            0    0          90    0      0    0
SUV                   0   16           0  116      6    0
Truck                 0    0           0    0     93    4
Van                   5    3           0    3      0  118
TABLE IV
CONFUSION MATRIX FOR THE PROPOSED METHOD WITH L1-NORM-VSC

Predicted Class →   Bus  Car  Motorcycle  SUV  Truck  Van
Bus                  87    0           0    0      0    0
Car                   0  252           0   14      0    0
Motorcycle            0    0          90    0      0    0
SUV                   0   10           0  125      3    0
Truck                 0    0           0    0     97    0
Van                   3    0           0    4      0  122

TABLE V
CONFUSION MATRIX FOR THE PROPOSED METHOD WITH L2-NORM-VSC

Predicted Class →   Bus  Car  Motorcycle  SUV  Truck  Van
Bus                  87    0           0    0      0    0
Car                   0  249           0   17      0    0
Motorcycle            0    0          90    0      0    0
SUV                   0   13           0  121      4    0
Truck                 0    0           0    0     97    0
Van                   1    2           0    2      0  124
TABLE II
AVERAGE CLASSIFICATION ACCURACY COMPARISON

Training images (per class)           15       20       25       30
Lazebnik's Method                  86.42%   88.17%   90.96%   91.86%
Proposed Method with L1-norm-VSC   90.63%   92.21%   95.35%   96.64%
Proposed Method with L2-norm-VSC   89.56%   92.19%   95.86%   96.21%
TABLE VI
AVERAGE RUNNING TIME OF THE VECTOR SPARSE CODING PROCESS

Vector Sparse Coding Algorithm    Time
L1-norm-VSC [4]                   0.1877 s
L2-norm-VSC (proposed)            0.0048 s
classification performance while being much more efficient in the vector sparse coding phase than the commonly adopted L1-norm-VSC algorithm.
V. CONCLUSION
This paper has presented an effective urban vehicle classification method using the sparse coding technique to obtain a discriminative feature representation of vehicles. To reduce the computational cost of vector sparse coding, an efficient coding algorithm based on an L2-norm constraint has been developed to fit real application requirements. Experimental results demonstrate the good classification ability of the linear SVM classifier in the sparse feature domain of SIFT. Moreover, we also validate the effectiveness of the proposed vector sparse coding algorithm for urban vehicle classification. It is expected that the proposed method can be applied to practical large-scale applications. Future work will focus on detailed parameter selection in the sparse coding process.
ACKNOWLEDGMENT
This work is supported by Shenzhen Science and Technology Fundamental Research Program (No. JCYJ20130329175141512).
REFERENCES
[1] Z. Chen, et al., "Road vehicle classification using support vector machines," in Proc. IEEE Int. Conf. on Intelligent Computing and Intelligent Systems, 2009.
[2] Z. Chen, T. Ellis, and S. A. Velastin, "Vehicle type categorization: A comparison of classification schemes," in Proc. 14th Int. IEEE Conf. on Intelligent Transportation Systems (ITSC), 2011.
[3] Z. Chen, T. Ellis, and S. A. Velastin, "Vehicle detection, tracking and classification in urban traffic," in Proc. 15th Int. IEEE Conf. on Intelligent Transportation Systems (ITSC), 2012.
[4] J. Yang, et al., "Linear spatial pyramid matching using sparse coding for image classification," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2009, pp. 1794-1801.
[5] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[6] K. Mikolajczyk and C. Schmid, "A performance evaluation of local descriptors," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 10, pp. 1615-1630, 2005.
[7] X. Ma and W. E. L. Grimson, "Edge-based rich representation for vehicle classification," in Proc. 10th IEEE Int. Conf. on Computer Vision (ICCV), 2005.
[8] H. Lee, et al., "Efficient sparse coding algorithms," in Advances in Neural Information Processing Systems, vol. 19, p. 801, 2007.
[9] D. L. Donoho, "Compressed sensing," IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289-1306, 2006.
[10] L. Zhang, M. Yang, and X. Feng, "Sparse representation or collaborative representation: Which helps face recognition?" in Proc. IEEE Int. Conf. on Computer Vision (ICCV), 2011.
[11] V. N. Vapnik, "An overview of statistical learning theory," IEEE Transactions on Neural Networks, vol. 10, no. 5, pp. 988-999, 1999.
[12] M. Kafai and B. Bhanu, "Dynamic Bayesian networks for vehicle classification in video," IEEE Transactions on Industrial Informatics, vol. 8, no. 1, pp. 100-109, 2012.
[13] S. Lazebnik, C. Schmid, and J. Ponce, "Beyond bags of features: Spatial pyramid matching for recognizing natural scene categories," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2006.
[14] J. Wang, et al., "Locality-constrained linear coding for image classification," in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2010.