Traffic Sign Detection and Recognition Using Color Standardization and Zernike Moments

Ma Xing, Mu Chunyang, Wang Yan
Institute of Information and Communication Technology, Beifang University of Nationalities, Yinchuan, P. R. China
[email protected], [email protected]

Abstract—Traffic sign detection and recognition plays an important role in driver assistance systems, especially in complex environments. First, the RGB image is converted to a standardized image containing only 8 colors to reduce the computational burden. Only interesting color components are extracted as candidate regions for further area filtering. Then the HOG descriptor is used as the detection characteristic of the candidate regions. Finally, Zernike moments are combined with a support vector machine for traffic sign recognition. Color standardization is computationally efficient compared with classical color space transforms, and the rotation and scale invariance of Zernike moments makes the method robust to strong light, shadow, rainy days, and occlusion.

Keywords—image recognition; color standardization; Zernike moments; SVM; traffic sign
I. INTRODUCTION

Traffic sign detection and recognition can provide visible and helpful traffic information to the driver and reduce traffic accidents, especially in complex environments. According to design specifications, traffic signs have specific colors and shapes, which helps distinguish them from their surroundings. The German Traffic Sign Detection Benchmark (GTSDB) test dataset provides 300 images of natural traffic scenes for detection and recognition tests [1]. These signs mainly contain five colors (red, black, yellow, blue and white) and four shapes (round, triangle, inverted triangle and octagon), as shown in Fig. 1.

Most existing methods use a color space transform for image segmentation and then check the length-to-width ratio for sign detection [3-4]. RGB images are usually converted to a certain color space to extract component values. However, this color space transform wastes a large amount of computing time, which can hurt real-time performance.

II. ALGORITHM

The problem is divided into three steps: a. color extraction, b. shape detection, and c. recognition; the flow is illustrated in Fig. 2.
Fig. 2. Flow chart of the algorithm
Fig. 1. Traffic Signs [2]
978-1-4673-9714-8/16
Part a, color component extraction, contains color standardization, interesting color component extraction, and candidate region setting by area filtering. Part b, shape detection, describes candidate regions with HOG descriptors, which are then classified by SVM [5]. Part c, recognition, uses Zernike moments. The algorithm is described in detail below.
2016 28th Chinese Control and Decision Conference (CCDC)
A. Color extraction by color standardization
1) Color standardization
Nature is far more colorful than a few bits can express. To improve detection efficiency, the real color image is standardized into 8 colors, reducing the computational burden. In mathematical terms, this is a nonlinear mapping from a complex space to a simple one [6]. Given a pixel (Ri, Gi, Bi) and thresholds Tr, Tg and Tb for the three color components, component R is standardized as

Ri' = 255, if Ri ≥ Tr
      0,   if Ri < Tr                     (1)
Components G and B follow the same principle. Only red, green, blue, white, yellow, magenta, cyan and black remain after color standardization. The relationship between traffic signs and color components is shown in Tab. 1.

TABLE I. RELATIONSHIP BETWEEN SIGNS AND COLOR COMPONENTS

Color     R    G    B    Prohibition  Indication  Warning
Red       255  0    0    255          0           0
Green     0    255  0    0            0           0
Blue      0    0    255  255          255         0
Yellow    255  255  0    0            0           255
Cyan      0    255  255  0            0           0
Magenta   255  0    255  0            0           0
White     255  255  255  255          255         0
Black     0    0    0    255          0           255
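Eq. (1), applied to all three channels, together with the interesting-component step of Tab. 1, can be sketched as follows. The threshold value 128 is illustrative only; the paper obtains Tr, Tg and Tb experimentally.

```python
import numpy as np

def standardize_colors(rgb, tr=128, tg=128, tb=128):
    """Map an RGB image onto the 8 standardized colors of eq. (1).

    Each channel is thresholded independently (tr, tg, tb are example
    thresholds), so every pixel becomes one of red, green, blue,
    white, yellow, magenta, cyan or black.
    """
    out = np.zeros_like(rgb)
    out[..., 0] = np.where(rgb[..., 0] >= tr, 255, 0)
    out[..., 1] = np.where(rgb[..., 1] >= tg, 255, 0)
    out[..., 2] = np.where(rgb[..., 2] >= tb, 255, 0)
    return out

def extract_component(std, color=(255, 0, 0)):
    """Keep only pixels of one interesting color (e.g. red); set others to 0."""
    mask = np.all(std == np.array(color), axis=-1)
    res = np.zeros_like(std)
    res[mask] = color
    return res

# Tiny example: a reddish, a gray and a yellowish pixel.
img = np.array([[[200, 30, 10], [90, 90, 90], [250, 240, 20]]], dtype=np.uint8)
std = standardize_colors(img)
red_only = extract_component(std)
```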
2) Interesting color component extraction
In the real world, a traffic sign may be too small to recognize with the naked eye, as shown in Fig. 3(a), and there is plenty of interference from buildings, shadows of trees, traffic lamp housings, etc. The sign is marked with a red box for highlighting.
(a) Original image
(b) Speed limit sign
(c) Standardized image
(d) Interesting color extraction
Fig. 3. Results of interesting color component extraction
Take a speed limit sign as an example. It mainly consists of a red boundary and a white background with black characters, as in Fig. 3(b). The original image is converted by color standardization, and the result is shown in Fig. 3(c), which contains only colors such as black, red, green and blue. The red border is narrow but significant. The interesting color components are then retained and all others are set to 0.
3) Candidate region setting by area filtering
Color standardization alone may still produce false detections among the candidate regions of interest. Area filtering helps with further selection; street lamps and other red interference can be excluded. A 3×3 template and 8-connected domains are used. An interference region i is eliminated when

lw(i) > Thh or lw(i) < Thl or p(i) < Thp    (2)

where p(i) is the proportion of the region in the total image, lw(i) is the length-to-width ratio of its bounding rectangle, and Thh, Thl and Thp are thresholds obtained experimentally. Fig. 4(a) is the original image of a traffic scene, and Fig. 4(b) shows the detection result, which highlights the detected traffic signs with red rectangular boxes.
(a) Original image
(b) Detection result by area filtering
Fig. 4. Detection result of candidate region setting by area filtering.
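The area-filtering rule of eq. (2) over 8-connected regions might be sketched as follows. The threshold values are illustrative; scipy's `ndimage.label` stands in for the 3×3-template connected-domain labeling.

```python
import numpy as np
from scipy import ndimage

def filter_candidates(mask, th_h=3.0, th_l=0.33, th_p=0.0005):
    """Area-filter a binary interest-color mask per eq. (2).

    A connected region i is rejected when lw(i) > th_h, lw(i) < th_l,
    or p(i) < th_p, where lw is the bounding-box length/width ratio
    and p is the region's share of the total image area.
    """
    # 8-connectivity via a full 3x3 structuring element.
    labels, n = ndimage.label(mask, structure=np.ones((3, 3)))
    keep = np.zeros_like(mask, dtype=bool)
    total = mask.size
    for i, sl in enumerate(ndimage.find_objects(labels), start=1):
        region = labels[sl] == i
        h = sl[0].stop - sl[0].start
        w = sl[1].stop - sl[1].start
        lw = h / w
        p = region.sum() / total
        if th_l <= lw <= th_h and p >= th_p:
            keep[sl] |= region
    return keep

# Example: a compact square survives; a thin line is rejected.
mask = np.zeros((100, 100), dtype=bool)
mask[10:20, 10:20] = True      # square candidate, lw = 1
mask[50:51, 5:45] = True       # thin line, lw = 1/40 < th_l
out = filter_candidates(mask)
```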
B. Shape detection by HOG
1) HOG descriptors
HOG was proposed by Dalal in 2005 for extracting image texture and shape information. HOG features maintain optical and geometric invariance because the feature information is extracted from every local cell and then normalized over a larger block, making the descriptor robust to illumination and shadow [7]. Features are extracted as follows:
a) Gradient in the horizontal and vertical directions
Let (x, y) denote a pixel and H(x, y) its intensity. Its gradients in the horizontal and vertical directions are

Gx(x, y) = H(x + 1, y) − H(x − 1, y)    (3)
Gy(x, y) = H(x, y + 1) − H(x, y − 1)    (4)
b) Calculate the gradient histogram of every cell
After the gradients are obtained, each pixel votes for an orientation bin in its cell. The magnitude and direction of every pixel gradient are calculated by (5) and (6), respectively:

G(x, y) = √(Gx(x, y)² + Gy(x, y)²)    (5)
α(x, y) = tan⁻¹(Gy(x, y) / Gx(x, y))    (6)
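As a sketch, eqs. (3)-(6) can be computed with simple array shifts. Border pixels are left at zero gradient here, and `arctan2` is used so the angle is defined in all quadrants; both are our implementation choices.

```python
import numpy as np

def gradient_magnitude_orientation(H):
    """Central-difference gradients (3)-(4) and their polar form (5)-(6).

    H is a 2-D grayscale intensity array indexed as H[y, x]; the
    outermost rows/columns keep zero gradient for simplicity.
    """
    Gx = np.zeros(H.shape, dtype=float)
    Gy = np.zeros(H.shape, dtype=float)
    Gx[:, 1:-1] = H[:, 2:].astype(float) - H[:, :-2]   # H(x+1,y) - H(x-1,y)
    Gy[1:-1, :] = H[2:, :].astype(float) - H[:-2, :]   # H(x,y+1) - H(x,y-1)
    G = np.hypot(Gx, Gy)         # eq. (5), magnitude
    alpha = np.arctan2(Gy, Gx)   # eq. (6), orientation over all quadrants
    return G, alpha

# Example: a horizontal intensity ramp has constant horizontal gradient.
H = np.tile(np.arange(5.0), (5, 1))
G, alpha = gradient_magnitude_orientation(H)
```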
Traffic signs are classified with the Libsvm package by Prof. Lin [9]; the processing steps are as follows:
• Format conversion. The training samples are converted to the format required by the Libsvm package. There are 170 sets of training data, with labels in train_labels.txt; 30 sets are selected for testing, with labels in test_labels.txt.
• Scaling. To keep the range of each feature from being too large or too small, the data is scaled before modeling. This also eases the numerical difficulties when the kernel function computes inner products during training.
• Model generation. Appropriate parameters are selected, and the model is generated by calling svmtrain in Libsvm.
• Classification test. The test data is classified by the svmpredict function; the predicted class codes and the classification accuracy are then computed.
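The four steps above can be sketched with scikit-learn's LIBSVM-based `SVC` as a stand-in for the Libsvm command-line tools. The feature data, class labels and parameters here are illustrative, not the paper's.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Illustrative stand-in data: 170 training vectors, 10 sign classes,
# 30 test vectors (sizes match the paper's setup; values are random).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(170, 81))
y_train = np.repeat(np.arange(10), 17)
X_test, y_test = X_train[:30], y_train[:30]

# Scaling step: squash each feature into a fixed range before training.
scaler = MinMaxScaler(feature_range=(-1, 1)).fit(X_train)

# Model generation (svmtrain) and classification test (svmpredict).
model = SVC(kernel='rbf', C=1.0).fit(scaler.transform(X_train), y_train)
pred = model.predict(scaler.transform(X_test))
accuracy = (pred == y_test).mean()
```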
c) Cells and blocks
Rectangular block normalization is a mature method with well-validated experimental results. Cells of 6×6 pixels with 9 histogram components are therefore used, and every block consists of 3×3 cells. Spatially connected cells and blocks are shown in Fig. 5.
Fig. 5. Cells and block of rectangular normalized
d) Normalize the HOG feature vector
The L2 norm is used in this step:

f = v / √(‖v‖₂² + e²)    (7)

where ‖v‖₂ denotes the L2 norm of v and e is a very small constant.

Ten traffic signs from the GTSDB test dataset are selected, as shown in Fig. 6, and they are coded as in Tab. 2.
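Steps b) and d) can be sketched together: magnitude-weighted orientation voting into 9 bins per 6×6 cell, then the normalization of eq. (7). The unsigned-orientation binning and the value of e are our assumptions.

```python
import numpy as np

def cell_histograms(G, alpha, cell=6, bins=9):
    """Magnitude-weighted orientation histograms per cell (step b).

    G and alpha are the gradient magnitude and orientation maps;
    orientations are folded to [0, pi) ("unsigned") before binning.
    """
    h, w = G.shape
    hist = np.zeros((h // cell, w // cell, bins))
    bin_idx = ((alpha % np.pi) / np.pi * bins).astype(int) % bins
    for i in range(h // cell):
        for j in range(w // cell):
            g = G[i*cell:(i+1)*cell, j*cell:(j+1)*cell].ravel()
            b = bin_idx[i*cell:(i+1)*cell, j*cell:(j+1)*cell].ravel()
            np.add.at(hist[i, j], b, g)   # vote magnitudes into bins
    return hist

def l2_normalize(v, e=1e-3):
    """Eq. (7): f = v / sqrt(||v||_2^2 + e^2), applied per 3x3-cell block."""
    return v / np.sqrt(np.sum(v * v) + e * e)

# Example: uniform gradients all pointing along +x.
G = np.ones((18, 18))
alpha = np.zeros((18, 18))
Hc = cell_histograms(G, alpha)
f = l2_normalize(np.ones(81) * 10.0)
```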
2) Classification by SVM
SVM achieves linear classification by mapping feature vectors into a higher-dimensional space [8]. There are five shapes among these traffic signs (round, diamond, triangle, inverted triangle and octagon), so five SVM classifiers are needed.

C. Recognition by Zernike moment features
Objects can be recognized by Zernike moment features, proposed by Teague in 1980 [9], because of their invariance to rotation, translation and scaling. The Zernike moments of f(x, y) are defined as

Z_mn = ((n + 1) / π) ∫_{−π}^{π} ∫_0^1 R_mn(r) e^{jmθ} f(r, θ) r dr dθ    (8)
where R_mn is the radial polynomial, m ≥ 0, n ≥ 0, and

r = √(x² + y²)    (9)
θ = tan⁻¹(y / x), −1 < x, y < 1    (10)
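A direct numerical sketch of eq. (8): the pixel grid is mapped onto the unit disk of eqs. (9)-(10) and the integral is replaced by a sum. We assume the common Zernike convention with the conjugate kernel e^(−jmθ) and indices n ≥ |m| with n − |m| even; the function names are ours.

```python
import numpy as np
from math import factorial

def radial_poly(n, m, r):
    """Radial polynomial R_nm(r) of the Zernike basis (n >= |m|, n-|m| even)."""
    m = abs(m)
    R = np.zeros_like(r)
    for s in range((n - m) // 2 + 1):
        c = ((-1) ** s * factorial(n - s)
             / (factorial(s) * factorial((n + m) // 2 - s)
                * factorial((n - m) // 2 - s)))
        R += c * r ** (n - 2 * s)
    return R

def zernike_moment(img, n, m):
    """Discrete approximation of eq. (8) over the unit disk.

    The image is mapped onto -1 < x, y < 1 as in eqs. (9)-(10);
    pixels outside the disk are ignored.
    """
    h, w = img.shape
    y, x = np.mgrid[-1:1:h * 1j, -1:1:w * 1j]
    dA = (x[0, 1] - x[0, 0]) * (y[1, 0] - y[0, 0])   # pixel area element
    r = np.sqrt(x ** 2 + y ** 2)
    theta = np.arctan2(y, x)
    inside = r <= 1.0
    kernel = radial_poly(n, m, r) * np.exp(-1j * m * theta)
    return (n + 1) / np.pi * np.sum(img[inside] * kernel[inside]) * dA

# Example: for a constant image, Z_00 approximates 1 and Z_11 vanishes.
img = np.ones((200, 200))
z00 = zernike_moment(img, 0, 0)
z11 = zernike_moment(img, 1, 1)
```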
Fig. 6. Ten traffic signs

TABLE II. TEN TRAFFIC SIGNS AND THEIR CODES

Code  Meaning
1     Speed limit 30
2     Speed limit 100
3     Priority access
4     Without priority access
5     Prohibition of passing
6     Prohibition of entry
7     Danger warning
8     Left turn
9     Right turn
10    Driving around the island
Each of these traffic signs is captured under several conditions, such as strong light, darkness, blur, jitter, rotation, etc., and 7 Zernike moments of each group are extracted for testing.
If the image is rotated by an angle θ, the Zernike moment changes from Z_mn to Z'_mn, with

Z'_mn = Z_mn e^{jmθ}    (11)

It follows that the amplitude of the Zernike moment is unchanged: Zernike moments are invariant to rotation and scale.
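The invariance argument of eq. (11) can be checked directly: rotation only multiplies Z_mn by a unit-magnitude phase factor, so the amplitude survives. The moment value below is hypothetical.

```python
import numpy as np

# Eq. (11): rotating the image by theta multiplies Z_mn by e^{jm*theta}.
Z = 0.2306 - 0.1j                 # a hypothetical complex Zernike moment
m, theta = 3, np.deg2rad(40)
Z_rot = Z * np.exp(1j * m * theta)

# |e^{jm*theta}| = 1, hence |Z'_mn| = |Z_mn|.
assert np.isclose(abs(Z_rot), abs(Z))
```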
III. RESULTS
A. Detection results
1) Detection results: HSV versus color standardization
Images are normalized to 300×500 pixels in the test. Fig. 7 shows the original traffic scene. Detection results based on the HSV color space transform and on color standardization are shown in Fig. 8 and Fig. 9.

Fig. 7. Original image

(a) Color segmentation result
(b) Sign detection result
Fig. 8. Detection results by HSV color transform

(a) Color segmentation result
(b) Sign detection result
Fig. 9. Detection results by color standardization

Fig. 8 shows the color segmentation and sign detection results of the HSV color transform; besides the detected sign, there is some noise in the left corner of Fig. 8(b). Compared with Fig. 8, the results of the color standardization proposed in this paper, shown in Fig. 9(b), are much cleaner. The detection times of the HSV color space transform and of the color standardization adopted in this paper are shown in Fig. 10, where the horizontal axis is the frame index and the vertical axis the computing time per frame in ms. The HSV-based method takes about 45 ms more per frame than color standardization on average. Computing efficiency is very important for real-time driver assistance.

Fig. 10. Computing time of HSV color space transform versus color standardization

2) Detection results by color standardization and HOG
Tests have been performed in several situations with different illumination conditions, as shown in Fig. 11.

(a) Strong light
(b) Sunny
(c) Shadow
(d) Normal
(e) Rainy
(f) Dim
Fig. 11. Detection results in several situations

Test results are highlighted by red rectangular boxes in Fig. 11, while the bottom row shows the corresponding magnified views. These signs are difficult to find by naked-eye observation because they are too far away to distinguish. In other words, the method is very helpful for driver assistance when a traffic sign is far away and hard for a human to distinguish.

B. Recognition results
1) Zernike features
The speed limit 30 km/h sign is taken as an example. Five conditions are shown in Fig. 12, and Z03, Z09, Z18, Z24, Z30, Z36 and Z43 are selected, as listed in Tab. 3.
61163002), the Natural Science Foundation of Ningxia Province (NO. NZ14107) and the External-Planned Task (NO. SKLRS2013-MS-05) of the State Key Laboratory of Robotics and System (HIT).

REFERENCES
Fig. 12. The speed limit signs in five conditions
TABLE III. ZERNIKE MOMENTS OF THE SPEED LIMIT 30 KM/H SIGN
Condition  Z03     Z09      Z18       Z24      Z30     Z36     Z43
1          0.2306  0.00472  0.000927  0.06792  0.5660  5.3113  0.6684
2          0.2335  0.00361  0.000974  0.06796  0.5663  5.3176  0.6688
3          0.2350  0.00339  0.000845  0.06947  0.5625  5.3177  0.6837
4          0.2369  0.00396  0.000605  0.07037  0.5664  5.3123  0.6826
5          0.2377  0.00395  0.000850  0.07017  0.5647  5.3188  0.6805
By analysis, the error between the 7 Zernike moments and the standard image features lies in the range (10⁻⁶, 10⁻⁴), which verifies that Zernike moments are invariant to scaling and rotation.
2) Recognition results: Zernike versus Hu
Recognition is tested on 200 images, comparing Zernike and Hu features. The experimental results are shown in Tab. 4.

TABLE IV. RESULTS BETWEEN ZERNIKE AND HU

Feature   Training sets  Testing sets  Discrimination ratio  Recognition time (ms)
Zernike   170            30            93.21%                81.35
Hu        170            30            90.56%                93.21
The results show that Zernike moments achieve a higher recognition rate and better real-time performance. The minimum recognition time is 10.163 ms and the average recognition time is below 0.1 s, which meets real-time requirements.

IV. CONCLUSION
A method for traffic sign detection and recognition based on color standardization and Zernike moments has been proposed. Color standardization lowers the computational load; HOG descriptors are then extracted as the detection characteristic of the candidate regions; finally, traffic signs are recognized by an SVM with Zernike moments. Experimental results show that, thanks to the Zernike moments, the algorithm withstands strong light, shadow, rain, dim light, etc., and that color standardization makes it computationally efficient, which is very important for future real-time driver assistance systems. However, the workload of the early recognition stage is relatively large, which remains a problem to resolve.
ACKNOWLEDGEMENTS
This work was financially supported by the National Natural Science Foundation of China (NO. 61162005 and NO.
[1] German Traffic Sign Detection Benchmark (GTSDB), benchmark.ini.rub.de
[2] J. Stallkamp, M. Schlipsing, J. Salmen and C. Igel, "Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition," Neural Networks, 2012, pp. 323-332.
[3] A. Broggi, P. Cerri, P. Medici, P. Porta and G. Ghisio, "Real time road signs recognition," in IEEE Intelligent Vehicles Symposium, 2007, pp. 981-986.
[4] X.W. Gao, L. Podladchikova, D. Shaposhnikov, K. Hong and N. Shevtsova, "Recognition of traffic signs based on their colour and shape features extracted using human vision models," Journal of Visual Communication and Image Representation, 2006, pp. 675-685.
[5] Y. Wang, C.Y. Mu and X. Ma, "Traffic sign detection by color standard and HOG feature," Software Guide, P. R. China, 2015, pp. 142-144.
[6] G. Mo and Y. Aoki, "Recognition of traffic signs in color images," in IEEE TENCON Region 10 Conference, 2004, pp. 100-103.
[7] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in IEEE Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 2005, pp. 886-893.
[8] C. Cortes and V. Vapnik, "Support-vector networks," Machine Learning, 1995, pp. 273-297.
[9] C.C. Chang and C.J. Lin, "LIBSVM: a library for support vector machines," 2001.