2012 IEEE International Conference on Multimedia and Expo Workshops
VEHICLE TYPE CLASSIFICATION USING PCA WITH SELF-CLUSTERING

Yu Peng, Jesse S. Jin, Suhuai Luo
School of DICT, University of Newcastle, Australia
{Yu.Peng, Jesse.Jin, Suhuai.Luo}@uon.edu.au

Min Xu
Faculty of Eng. & IT, University of Technology, Sydney
[email protected]

Yue Cui
National Lab. of PR, Chinese Academy of Sciences, China
[email protected]

ABSTRACT

Different conditions, such as occlusions, changes of lighting, shadows and rotations, make vehicle type classification a challenging task, especially for real-time applications. Most existing methods rely on assumptions about certain conditions, such as lighting conditions and special camera settings. However, these assumptions usually do not hold for real-world applications. In this paper, we propose a robust vehicle type classification method based on adaptive multi-class Principal Component Analysis (PCA). We treat car images captured at daytime and night-time separately. The vehicle front is extracted by examining the vehicle front width and the location of the license plate. Then, after generating eigenvectors to represent the extracted vehicle fronts, we propose a PCA method with self-clustering to classify vehicle type. Comparison experiments with state-of-the-art methods and real-time evaluations demonstrate the promising performance of the proposed method. Moreover, as we could not find any public database containing sufficient suitable images, we have built and published online our own database of 4924 high-resolution vehicle front-view images for further research on this topic.

Index Terms— Vehicle type classification, license plate localization, PCA

1. INTRODUCTION

Automatic vehicle type classification is still a challenging task due to occlusions, various lighting conditions, shadows and rotations. These conditions limit the accuracy of vehicle type classification and make it difficult for real-time applications. The general steps of vehicle type classification include foreground segmentation, vehicle detection, feature extraction and classification, which together are often too time consuming for real-time applications. In this paper, a real-time vehicle type classification system is proposed to address these problems.

Vision-based vehicle type classification usually falls into two categories: methods using the vehicle's side view and methods using the vehicle's front (or rear) view. For the side view, edge-based and model-based methods have been widely used. Edge-based approaches include the parameterized edge model [1], edge point groups [2] and weighted edge matching [3]. Wu et al. [1] proposed a parameterized edge model to describe the topological structure of a vehicle, and then fed the models into a multi-layer perceptron network based classifier. Although the classification rate is satisfactory on high-quality images, the authors stated that their method was limited on low-quality images. Ma et al. [2] associated edge points with SIFT-based descriptors to guarantee the repeatability and sufficient discriminability of their proposed feature. Although their method achieved impressive results, it is time consuming and falls short when the view changes. Model-based approaches that use additional prior shape information have also been investigated [4][5][6]. Jolly et al. [4] proposed five classes of deformable 2D models to segment a vehicle from the background. Although this algorithm strictly depends on the camera view angle and image quality, it broke a new path for vehicle type classification under real-world conditions. In order to generate 3D models [5][6], vehicle parameters such as length, width and height were recovered from 2D projections of vehicles under a calibrated camera model. However, the complexity of the stereo algorithm makes these methods time consuming. Moreover, in order to recover the 3D parameters of a vehicle, both the side view and the frontal (rear) view are needed.

Compared to research with vehicle side views, very little work has been done with vehicle front/rear views. Most works dealing with front/rear views applied machine learning on extracted features. Based on SIFT descriptors, vehicle make and model recognition from frontal views was investigated in [7, 8]. Conos et al. [7] fed SIFT descriptors extracted from the frontal-view image into a kNN classifier. Psyllos et al. [8] used a neural network classifier to recognize the logo, manufacturer and model of a vehicle. Relying on logo detection, Psyllos' method might fail if the logo cannot be detected correctly. Edge-based feature sets were utilized in [9]. From the vehicle frontal view, Petrovic et al. [9] extracted a set of features including Sobel edges, edge orientation, direct normalized gradients, locally normalized gradients, square mapped gradients, Harris corners, and spectrum phase. Based on these features, a simple nearest neighbor classifier was applied. Although impressive recognition accuracy was
achieved, calculating so many low-level features is time consuming. Appearance structure has also attracted researchers' attention for vehicle type classification. Kuwabara et al. [10] proposed an appearance structure with a Gaussian Mixture Model to recognize a vehicle. The appearance model representing a vehicle image viewed from the rear includes a window, tail lights, and so on. Since the appearance model was established based on color recognition, this method is very sensitive to lighting conditions.

Besides the aforementioned limitations, two crucial facts are usually ignored by most existing methods of vehicle type classification. 1) For most methods dealing with frontal (rear) views, ROIs (Regions of Interest) are defined manually according to the location of license plates or logos [11]. However, an ROI defined in this way is not ideal: sometimes the manually defined ROI includes background, and sometimes it covers only part of the vehicle frontal (rear) view. 2) Images of the vehicle frontal view appear significantly different under daylight and under nightlight, while most existing methods treat daylight and nightlight as the same case.

In this paper, we address the above problems and develop an automatic ROI extraction algorithm by combining license plate detection, vehicle segmentation and shadow removal. Furthermore, we treat vehicle type classification under daylight and nightlight separately. We propose a real-time vehicle type classification system with a high accuracy rate to address the aforementioned problems. In this system, a camera, fixed on a pole, looks down and faces oncoming vehicles on the highway.

The rest of this paper is organized as follows. Section 2 gives an overview of our proposed algorithm and highlights the contributions of our work. In Section 3, a novel method for localizing the license plate is presented. Section 4 details the procedure of segmenting the vehicle and extracting the vehicle frontal image. Eigenvector generation and the classification algorithm are described in Section 5. Section 6 demonstrates our experimental results. Finally, Section 7 concludes the paper.
2. OVERVIEW AND CONTRIBUTIONS OF THE PROPOSED SYSTEM

Our proposed system consists of two parts: training and classification. As shown in Fig. 1(a), in the training stage, we first construct an eigen-space from all manually cropped vehicle front images with five class labels (Truck, Bus, Minivan, Passenger car and Sedan). Every training image can be represented by a combination of weighted eigenvectors after projecting it onto the constructed eigen-space. Thereafter, every training image is represented by a vector of all eigenvectors' weights. In order to enhance classification accuracy, the vectors in each class are automatically clustered into sub-classes, which keep the same label as their parent class. Finally, we obtain the most representative vector for each sub-class by averaging all vectors in that sub-class. Two image datasets, of daylight and nightlight, are used for training separately.

The classification procedure is demonstrated in Fig. 1(b). We first distinguish images captured in daylight from those captured in nightlight, according to the intensity distribution within the image. Second, in order to quickly and accurately localize the license plate, we propose a license plate localization method based on line segment features inside the license plate region. Third, background subtraction and shadow removal are implemented to obtain the vehicle region in the image. Furthermore, the width of the vehicle front is learned. According to the obtained license plate location and vehicle front width, the vehicle front can be correctly extracted. By projecting the vehicle front onto the previously constructed eigen-space of daylight or nightlight, a vector is obtained to represent the vehicle front. The vehicle is classified into a specific type when this vector is closest to the representative vector of the corresponding sub-class.

Fig. 1. The procedures of our proposed system

There are four key contributions of this paper. 1) In order to achieve real-time vehicle type classification under the various conditions possible on a highway, including lighting changes, rotations, shadows and occlusion, a novel structure of multi-class PCA is proposed. Multi-class PCA has not previously been used in the context of vehicle classification from the frontal (rear) view. 2) We propose a comprehensive system for vehicle type classification. Different from the traditional approach of manually defining the vehicle front, the vehicle front is extracted by calculating the vehicle front width and detecting the location of the license plate. 3) To the best of our knowledge, this paper is the first to train image datasets of daylight and nightlight separately. Although some image pre-processing techniques can reduce the effect of changing lighting conditions, they
cannot evade the significant difference between daylight and nightlight. 4) We build up a database including 4924 high-resolution images of the vehicle frontal view. Everyone can access this database for research purposes.

3. TIME-TELLING AND LICENSE PLATE LOCALIZATION

3.1. Identify image capture time

We first need to decide whether an image is captured under daylight or nightlight. In the nightlight case, the vehicle front light is the main light source. In images captured under nightlight, pixels in the top corners of the image have very low intensity, since the front lights are normally near the bottom. Therefore, rather than applying an intensity assessment to the entire input image, we assess four selected blocks of 20*20 pixels at the four corners of the input image. According to the following equation, we can tell whether an image is captured under nightlight or daylight:

\text{capture time} = \begin{cases} \text{nightlight}, & (I_{c1} + I_{c2} + I_{c3} + I_{c4})/4 < 20 \\ \text{daylight}, & \text{otherwise} \end{cases} \quad (1)

where I_{ci} is the average intensity value of the 20*20-pixel patch at each corner.
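As a concrete illustration, a minimal Python/NumPy sketch of this corner-intensity test is given below. The 20*20 block size and the intensity threshold of 20 follow Eq. (1); the function name and the assumption of a grayscale input frame are ours.

```python
import numpy as np

def is_nightlight(gray_image, block=20, threshold=20):
    """Decide the capture time from the four image corners, following Eq. (1).

    gray_image: 2-D uint8 array (grayscale frame).
    Returns True for nightlight, False for daylight.
    """
    h, w = gray_image.shape
    corners = [
        gray_image[:block, :block],          # top-left patch
        gray_image[:block, w - block:],      # top-right patch
        gray_image[h - block:, :block],      # bottom-left patch
        gray_image[h - block:, w - block:],  # bottom-right patch
    ]
    mean_intensity = np.mean([c.mean() for c in corners])
    return mean_intensity < threshold
```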
3.2. License plate localization

Fast and accurate license plate localization is a necessary step for the subsequent stages of our system. We achieve this through a coarse-to-fine method. We first normalize the ROI to reduce the effect of varying lighting conditions and noise: we band-pass filter the image and weight the pixels using a Gaussian function centered on the image, and the image is then normalized to have zero mean and unit standard deviation. The normalized ROI image is shown on the left of Fig. 2. As shown in Fig. 2, when we plot every horizontal line of the image as an intensity histogram (the red line), the histograms of the license plate region fluctuate wildly. According to this feature, the initial ROI of the license plate is detected quickly.

Fig. 2. Localize ROI of license plate

We further locate the license plate within the ROI based on line segment features. We propose the line segment feature because, after applying edge detection to an image, we obtain dense vertical strokes inside the license plate region. We first run the Sobel operator across the ROI to obtain the gradient of the image in the horizontal direction. The ROI with detected vertical edges is shown in Fig. 3(b). Binarization is then deployed on the ROI to retain the significant edges, as shown in Fig. 3(c). Finally, line segments are obtained by applying the Hough Transform, as shown in Fig. 3(d). As shown in Fig. 3(e), we construct three properties of these line segments that discriminate the license plate from the background: density, directionality and regularity. A license plate region tends to have a high density of line segments. Moreover, rather than being distributed disorderly, the line segments inside a license plate tend to be approximately vertical, so the license plate region has a high directionality value. Finally, regularity is introduced to measure the regular repetition of line segments inside the license plate region; the license plate region tends to have high regularity. The localized license plate is shown as the red rectangle in Fig. 3(d).

Fig. 3. License plate localization based on line segments
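The following sketch illustrates how such line-segment evidence could be computed with OpenCV (Sobel gradients, binarization and a probabilistic Hough transform) and scored by density and directionality. The Hough parameters, the 30-degree verticality band and the window-scoring scheme are illustrative assumptions; the paper's exact regularity measure is not reproduced here, so it is omitted from the score.

```python
import cv2
import numpy as np

def plate_line_segments(roi_gray):
    """Near-vertical line segments in the ROI: Sobel -> binarize -> Hough."""
    grad_x = cv2.Sobel(roi_gray, cv2.CV_64F, 1, 0, ksize=3)   # horizontal gradient (vertical strokes)
    edges = np.uint8(np.clip(np.absolute(grad_x), 0, 255))
    _, binary = cv2.threshold(edges, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    lines = cv2.HoughLinesP(binary, 1, np.pi / 180, threshold=20,
                            minLineLength=10, maxLineGap=3)
    return [] if lines is None else lines[:, 0, :]             # rows of (x1, y1, x2, y2)

def plate_score(segments, window):
    """Score a candidate window by segment density and directionality (verticality)."""
    x0, y0, x1, y1 = window
    inside = [s for s in segments
              if x0 <= (s[0] + s[2]) / 2 <= x1 and y0 <= (s[1] + s[3]) / 2 <= y1]
    if not inside:
        return 0.0
    density = len(inside) / float((x1 - x0) * (y1 - y0))
    angles = [abs(np.arctan2(s[3] - s[1], s[2] - s[0])) for s in inside]
    directionality = np.mean([1.0 if abs(a - np.pi / 2) < np.pi / 6 else 0.0
                              for a in angles])
    return density * directionality
```

The candidate window with the highest score would be taken as the license plate region.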
4. VEHICLE FRONT EXTRACTION AND SHADOW REMOVAL

4.1. Daylight case

For classification, each vehicle front should be detected and extracted. We accomplish this in two steps: 1) the vehicle front width is calculated from the valid ROI (ROI extraction was introduced in the previous section); 2) the vehicle front is extracted based on the vehicle width and the location of the detected license plate. We deal with vehicle extraction differently for the daylight and nightlight cases. The situation is much more complicated in the daylight case, where we need not only to find the vehicle front but also to remove the shadow. To make our background subtraction algorithm robust, we capture several road images containing no vehicles under different lighting and weather conditions, and then average them as the background. The subtraction operation is:

D_k(x, y) = \begin{cases} 0, & I_k(x, y) - B_k(x, y) \le T_d \\ 1, & \text{otherwise} \end{cases} \quad (2)

where D_k is the difference image between the input image and the background image, I_k is the detected ROI, B_k is the corresponding ROI region of the averaged background image, and T_d is a threshold chosen as the average of D_k.
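A minimal sketch of Eq. (2) is shown below. Following the text, the threshold is taken as the average difference within the ROI; the use of the absolute difference and the 5*5 opening kernel (the noise-removal step mentioned next) are our assumptions.

```python
import cv2
import numpy as np

def foreground_mask(roi, background_roi):
    """Binary difference image D_k of Eq. (2), with T_d set to the mean difference."""
    diff = cv2.absdiff(roi, background_roi)
    t_d = diff.mean()                                       # threshold from the average difference
    mask = (diff > t_d).astype(np.uint8) * 255              # 1 where the pixel differs from background
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # simple morphological noise removal
```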
After subtraction, a simple morphological operation is applied to remove noise. However, due to shadow, we cannot obtain the exact location of the vehicle from background subtraction alone.
In order to remove the shadow, we implement three further steps. Firstly, an illumination assessment of the detected ROI is performed to determine whether there are shadows in the image. Secondly, by determining the direction of illumination and sampling shadow points, the attributes of the shadow, including its average intensity and brightness contrast, are calculated. Finally, the vehicle part is extracted exactly by three criteria: 1) preserving bright pixels; 2) preserving pixels with attributes different from those of the shadow; 3) preserving pixels near object edges.
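One possible reading of these three criteria is sketched below. The shadow statistics are assumed to come from the preceding sampling step; the brightness threshold, the 2-sigma rule and the Canny/dilation parameters are illustrative assumptions rather than values from the paper.

```python
import cv2
import numpy as np

def remove_shadow(roi_gray, foreground, shadow_mean, shadow_std,
                  bright_thresh=170, k=2.0):
    """Keep foreground pixels that satisfy any of the three vehicle-preserving criteria."""
    bright = roi_gray > bright_thresh                                   # 1) bright pixels
    not_shadow = np.abs(roi_gray.astype(np.float32) - shadow_mean) > k * shadow_std  # 2) unlike shadow
    edges = cv2.Canny(roi_gray, 50, 150)
    near_edge = cv2.dilate(edges, np.ones((5, 5), np.uint8)) > 0        # 3) near object edges
    keep = (foreground > 0) & (bright | not_shadow | near_edge)
    return keep.astype(np.uint8) * 255
```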
4.2. Nightlight case

When the input image is captured in nightlight, we easily obtain the vehicle front width using the background subtraction technique explained for the daylight case.

4.3. Vehicle front extraction

With the vehicle front width and the location of the license plate, we further extract the vehicle front. As shown in Fig. 4, the height of the vehicle front is four times the height of the detected license plate. Moreover, the rectangle of the vehicle front shares its bottom line with the license plate.

Fig. 4. Vehicle front extraction
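This geometric rule can be written compactly as follows; centering the front horizontally on the license plate is our own assumption for illustration, since the text only fixes the width (the learned front width) and the shared bottom line.

```python
def vehicle_front_rect(plate_rect, front_width, image_shape):
    """Vehicle front rectangle from the plate location and the learned front width.

    plate_rect: (x, y, w, h) of the detected license plate; image_shape: (height, width).
    """
    px, py, pw, ph = plate_rect
    bottom = py + ph                        # the front shares its bottom line with the plate
    top = max(0, bottom - 4 * ph)           # front height = 4 x plate height
    cx = px + pw // 2
    left = max(0, cx - front_width // 2)
    right = min(image_shape[1], left + front_width)
    return left, top, right, bottom         # crop as image[top:bottom, left:right]
```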
5. MULTI-CLASS PCA CLASSIFIER

5.1. Training

Eigenvectors are derived from the covariance matrix of the probability distribution over the high-dimensional vector space of images. In our paper, eigenvectors extracted from vehicle front images are used for vehicle type classification. The eigenvectors describe invariant characteristics of a vehicle front image; the mechanism behind them is Principal Component Analysis (PCA). In image classification, it is impractical to compute the distance between two images pixel by pixel, because doing so is too time consuming and the total noise contributed by every pixel would be very high (the noise could be anything that affects pixel intensity values). Therefore, we usually compare two images in a subspace with much lower dimensionality than the total number of pixels. PCA is a popular method to find such a subspace, namely the eigen-space. The eigenvector that maximally separates the vectors obtained by projecting the images onto this subspace is called the first principal component of the image dataset. Theoretically, the total number of eigenvectors we can find is the number of images minus one; in practice, we only keep the ones with good separation capacity. Before generating eigenvectors, the vehicle front images should be pre-processed.

It is extremely important to apply image pre-processing techniques to standardize the images, since most vision-based classification algorithms are very sensitive to factors such as camera angle, lighting condition and image size. In our experiment, all vehicle images were captured by a static camera. In order to reduce the adverse effect of various lighting conditions, we first convert all vehicle front images to grayscale and then apply band-pass filtering. Finally, we resize all processed images to a fixed size, 450*150 pixels in our experiment. From the above steps, we obtain a set of regularized training vehicle front images. For training, we manually label all images into the five categories above: truck, minivan, bus, passenger car, and sedan. Every image can be represented as an m-by-n matrix, with m and n being the image height and width, respectively. In order to compute eigenvectors conveniently from all image matrices, we store all images in one matrix: we first reshape every image matrix into a 1*mn vector and, with k being the number of training images, store all these images in a matrix of k columns, I = [I_1 I_2 ... I_k], where the length of each I_i is m*n.

Then we can compute the average image of all training images and the difference image between each training image and the average:

a = \frac{1}{k}\sum_{i=1}^{k} I_i, \qquad \sigma_i = I_i - a \quad (3)

where a is the average image represented by a 1*mn vector, \sigma_i is a difference image, and \sigma is the matrix storing all difference images. The covariance matrix of I is:

C = \frac{1}{k}\sum_{i=1}^{k} \sigma_i \sigma_i^{T} = \sigma\sigma^{T} \quad (4)

The principal components are the eigenvectors of C, and the eigenvectors with the largest associated eigenvalues contribute most to classifying the training images. However, computing the eigenvectors directly is infeasible because \sigma\sigma^{T} is a huge matrix: the vehicle front images used in our experiment are of size 450*150, which makes \sigma\sigma^{T} a 67500*67500 matrix. This computational burden is avoided by the method in [9]. Suppose u_i is an eigenvector of \sigma^{T}\sigma and \lambda_i is the associated eigenvalue; then

\sigma^{T}\sigma u_i = \lambda_i u_i \;\Rightarrow\; \sigma\sigma^{T}(\sigma u_i) = \lambda_i (\sigma u_i) \quad (5)

from which we can deduce that \sigma u_i is an eigenvector of \sigma\sigma^{T}. This greatly reduces the computational complexity, since the size of \sigma^{T}\sigma is only k*k.

Fig. 5. Eigenvector generation from training images
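A compact NumPy sketch of this training step is given below: it forms the difference matrix, applies the small-matrix trick of Eq. (5), and keeps the top 32 eigenvectors used to project each vehicle front onto the eigen-space. The function names and the unit-norm normalization of the eigenvectors are our own choices.

```python
import numpy as np

def train_pca(images, num_components=32):
    """Eigenvector generation following Eqs. (3)-(5).

    images: (k, m*n) array with one flattened 450*150 vehicle front per row.
    Returns the average image a and the top eigenvectors (one per row).
    """
    k = images.shape[0]
    a = images.mean(axis=0)                           # average image, Eq. (3)
    sigma = (images - a).T                            # (m*n, k) matrix of difference images
    small = sigma.T @ sigma / k                       # k x k matrix instead of (m*n) x (m*n)
    eigvals, u = np.linalg.eigh(small)                # eigenvectors u_i of sigma^T sigma
    order = np.argsort(eigvals)[::-1][:num_components]
    v = (sigma @ u[:, order]).T                       # sigma u_i are eigenvectors of sigma sigma^T, Eq. (5)
    v /= np.linalg.norm(v, axis=1, keepdims=True)     # normalize each kept eigenvector
    return a, v

def project(image, a, v):
    """Represent a vehicle front by its weights on the kept eigenvectors."""
    return v @ (image - a)
```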
As explained above, (k−1) eigenvectors are generated from the k training images. Fig. 5(c) and Fig. 5(d) show the first and the last eigenvector in our experiment, respectively. These eigenvectors capture the main differences among all the training images. Moreover, every training image can be represented by a combination of the average image and weighted eigenvectors. As shown in Fig. 5(a), the training image is composed as

a + 35.6\% \, v_0 - 13.5\% \, v_1 + 23.3\% \, v_2 + \cdots + 0\% \, v_{797} + 0\% \, v_{798}

where a is the average image of all training images, as shown in Fig. 5(b). The first eigenvector describes the dominant features of a vehicle and can discriminate different vehicle types maximally, whereas the last eigenvector contains mainly image noise and makes nearly no contribution to vehicle type classification. Rather than using all generated eigenvectors, we only keep the eigenvectors with the largest eigenvalues; in our experiment, we choose the top 32 eigenvectors for classifying vehicles. Therefore, every training image is represented by 32 weights, such as {35.6, −13.5, 23.3, ..., −1.2}.

5.2. Self-clustering and classification

For classifying a vehicle type, the traditional PCA method assigns the test image to the same category as the closest training image. We instead develop an adaptive multi-class PCA method in our system. Vehicle fronts are classified into 5 types: Truck, Bus, Minivan, Passenger car and Sedan. Within each class, there are still appearance differences between vehicle fronts. Therefore, it is necessary to cluster the vehicle fronts into sub-classes within each class. We apply K-means clustering to each class:

\arg\min_{S} \sum_{i=1}^{k} \sum_{x_j \in S_i} \left\| x_j - \mu_i \right\|^2 \quad (6)

where X = {x_1, x_2, x_3, ..., x_n} is the set of vehicle front images of one vehicle type, each represented as a vector of 32 weights, and S = {S_1, S_2, S_3, ..., S_k} are the resulting clusters. K-means clustering aims to partition the n observations into k sets (k ≤ n) so as to minimize the within-cluster sum of squares, where \mu_i is the mean of the vectors in S_i. Thereafter, we calculate the distances between the vector representing the test image and the representative vectors {\mu_1, \mu_2, \mu_3, ..., \mu_n}. The test image falls into the vehicle type with the shortest distance.
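The self-clustering and the nearest-representative decision can be sketched as below. The number of sub-classes per class is not fixed by the paper (the clustering adapts to each class), so the value used here, as well as the use of scikit-learn's KMeans, is an illustrative assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

def build_representatives(weights_per_class, n_subclasses=3):
    """Cluster each class's 32-D weight vectors into sub-classes (Eq. (6)) and keep
    the cluster means as representative vectors; sub-classes inherit the class label."""
    centers, labels = [], []
    for label, vectors in weights_per_class.items():     # {class_name: (n_i, 32) array}
        k = min(n_subclasses, len(vectors))
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(vectors)
        centers.append(km.cluster_centers_)
        labels.extend([label] * k)
    return np.vstack(centers), labels

def classify(test_weights, centers, labels):
    """Assign the test image to the type of the closest representative vector."""
    distances = np.linalg.norm(centers - test_weights, axis=1)
    return labels[int(np.argmin(distances))]
```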
6. EXPERIMENTS

As accurate license plate localization and vehicle front extraction are the premise of our system, we first evaluate these two steps. Thereafter, we demonstrate that our method outperforms existing popular approaches through comparison experiments. Finally, in order to demonstrate the real-time performance of our system, we installed our system on a highway for evaluation. Please refer to the supplementary video for the demonstration.

6.1. Build up our own database

As we could not find a public database containing sufficient desired images of vehicle frontal or rear views, we collected images and built up our own database. We captured images of passing vehicles on a highway using a Sony HDR-SR12 camera from 20th June 2011 to 22nd June 2011. The images were taken both in daylight, under sunny and partly cloudy conditions, and in nightlight. The total numbers of daylight and nightlight images are 3618 and 1306, respectively; as many more vehicles pass during the day than during the night, there is a large difference between the numbers of daylight and nightlight images. All images are in JPG format with 1600*1264 resolution. Everyone can access this database for research purposes via this link: http://dl.dropbox.com/u/52984000/Database1.rar . We used 800 daylight images and 800 nightlight images for training; for both the daylight and nightlight training cases, each vehicle type has 160 images. Another 500 daylight images and 500 nightlight images are used for testing; for both the daylight and nightlight testing cases, each vehicle type has 100 images. There is one vehicle in every image. All captured vehicles fall into five classes: truck, minivan, bus, passenger car, and sedan (including sport-utility vehicles (SUVs)).

6.2. License plate localization evaluation

Fig. 6. Evaluation for license plate localization

In order to evaluate our license plate localization algorithm, we randomly select 60 images from each type for the daylight case and 60 images from each type for the nightlight case. The localization is considered successful if the overlapped area between
the localized license plate and the ground truth is not less than 80% of the ground-truth area. As shown in Fig. 6, we correctly localize 277 license plates out of the 300 daylight images (92.3%) and 275 license plates out of the 300 nightlight images (91.7%).
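The success criterion used here (and in the next evaluation) can be expressed as a small helper; the (left, top, right, bottom) box format is our assumption.

```python
def localization_success(pred, gt, min_overlap=0.8):
    """Success if the overlap covers at least 80% of the ground-truth area."""
    ix = max(0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    iy = max(0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    gt_area = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return ix * iy >= min_overlap * gt_area
```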
6.3. Vehicle front extraction evaluation

We used 40 images from each type for the daylight case and 40 images from each type for the nightlight case in the vehicle front extraction evaluation. We extract the vehicle front based on the vehicle front width and the location of the license plate. In order to prevent license plate localization errors from affecting the evaluation of vehicle front extraction, only images in which the license plate is correctly localized are used (these 200 images). We again define a correct extraction as one where the overlapped area between the extracted vehicle front and the ground truth is not less than 80% of the ground-truth area. As shown in Fig. 7, we obtain correct extraction rates of 80.0% in the daylight case and 80.5% in the nightlight case.

Fig. 7. Evaluation for vehicle front extraction

6.4. Vehicle type classification evaluation

Table 1. Evaluation for vehicle type classification
Captured time (correct numbers) | SIFT-based [8] | Edge-based [9] | Our proposed
Daylight                        | 235 (78.3%)    | 253 (84.3%)    | 270 (90.0%)
Nightlight                      | 220 (73.3%)    | 246 (82.7%)    | 263 (87.6%)

We compare our system with the SIFT-based method [8] and the Edge-based method [9] by classifying 300 daylight images, consisting of 60 images of each type, and 300 nightlight images, also consisting of 60 images of each type. Moreover, the vehicle fronts of all these images can be extracted correctly. As shown in Table 1, our system outperforms the other two methods in both the daylight case (90.0%) and the nightlight case (87.6%).

6.5. Real-time performance evaluation

In order to test our algorithm in real time, we connect our system to a camera installed on a highway. The camera captures an image when a vehicle passes. Every image has 1600*1264 pixels. The captured images are then sent to our computer via an IP connection. Our system runs on a computer with an Intel(R) Core(TM) 2 Duo E8400 3.00GHz CPU and 2GB RAM. The average processing time is 70ms, and for every coming vehicle we only need to process one image, so our system can work in real time. We tested our system for a whole day. The total numbers of passing vehicles were 550 (daylight) and 120 (nightlight). Our system correctly classified 474 (86.1%) for the daylight case and 93 (81%) for the nightlight case. The real-time classification rate is lower than that reported in Section 6.4. After taking a closer look at the images, we realized that some images taken in real time might not be very clear due to fast vehicle motion or camera focus issues. Therefore, in some real-time images, the vehicle front might not be extracted correctly.

7. CONCLUSION

We proposed a practical and robust vehicle type classification system. Evaluation experiments demonstrate the promising performance of our method. Moreover, our system deals with the daylight and nightlight cases separately. Finally, we built up a large public database for further research on this topic. In future work, we will develop a vehicle manufacturer classification system.

8. REFERENCES

[1] W. Wu, et al., "A method of vehicle classification using models and neural networks," in Proc. IEEE Vehicular Technology Conference (VTC), 2001.
[2] X. Ma and W. E. L. Grimson, "Edge-based rich representation for vehicle classification," in Proc. IEEE International Conference on Computer Vision (ICCV), 2005.
[3] Y. Shan, et al., "Unsupervised learning of discriminative edge measures for vehicle matching between nonoverlapping cameras," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, pp. 700-711, 2008.
[4] M. P. Dubuisson Jolly, et al., "Vehicle segmentation and classification using deformable templates," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, pp. 293-308, 1996.
[5] A. H. S. Lai, et al., "Vehicle type classification from visual-based dimension estimation," in Proc. IEEE Intelligent Transportation Systems, pp. 201-206, 2001.
[6] S. Gupte, et al., "Detection and classification of vehicles," IEEE Transactions on Intelligent Transportation Systems, vol. 3, pp. 37-47, 2002.
[7] M. Conos, "Recognition of vehicle make from a frontal view," Master's thesis, Czech Tech. Univ., Prague, Czech Republic, 2006.
[8] A. Psyllos, et al., "Vehicle model recognition from frontal view image measurements," Computer Standards & Interfaces, vol. 33, pp. 142-151, 2011.
[9] V. S. Petrovic and T. F. Cootes, "Analysis of features for rigid structure vehicle type recognition," in Proc. British Machine Vision Conference, pp. 587-596, 2006.
[10] K. Kuwabara, et al., "Vehicle appearance model for recognition system considering the change of imaging condition," Journal of Advanced Computational Intelligence and Intelligent Informatics, vol. 13, pp. 463-469, 2009.
[11] M. Kafai and B. Bhanu, "Dynamic Bayesian networks for vehicle classification in video," IEEE Transactions on Industrial Informatics, vol. PP, pp. 1-1, 2011.