Matrics, a Car License Plate Recognition System*
Andrés Marzal, Juan Miguel Vilar, David Llorens, Vicente Palazón, and Javier Martín
Departament de Llenguatges i Sistemes Informàtics
Universitat Jaume I
Castelló (Spain)
{amarzal,jvilar,dllorens,palazon}@dlsi.uji.es
[email protected]
Abstract. Matrics is a system for recognition of car license plates. It works on standard PC equipment with low-priced capture devices and achieves real-time performance (10 frames per second) with state-of-the-art accuracy: the character error rate is below 1% and the plate error rate is below 3%. The recognition process is divided into two phases: plate localization and plate decoding. The system finds the plate by analyzing the connected components of the image after binarization. The decoding algorithm is a Two Level process which uses fast template-based classification techniques in its first stage and optimal segmentation in the second stage. On the whole, the system represents a significant improvement over a previous version which was based on HMM.
1 Introduction
Car License Plate Recognition (CLPR) is an application of Pattern Recognition with high demand in several fields [1, 5]: control of highways and borders, traffic monitoring, law enforcement, recovery of stolen cars, etc. When the working conditions can be controlled (light settings, camera position with respect to the vehicle, background, etc.), CLPR can be satisfactorily solved with well-known Pattern Recognition techniques [2]. However, there are many situations in which not all of these conditions can be controlled. To meet industry-standard requirements [6], a CLPR system must (1) operate in a wide range of illumination conditions (indoors and outdoors); (2) be invariant to size, scale, and font boldness; (3) be robust to broken strokes, printing defects, and other kinds of noise; (4) be insensitive to camera-car relative positions within a reasonable distance and angle interval; (5) provide real-time response; and (6) work with different capturing devices (including image repositories). There is a demand for flexible CLPR engines that satisfy these requirements and can be easily integrated into final application programs. The Matrics system is a CLPR engine that meets these requirements and runs on the .NET 2.0 platform, which allows easy portability (it can run on the Microsoft Windows system and on Linux under the Mono platform). The engine has been designed to ease its integration in application software.

* Work partially supported by the Ministerio de Educación y Ciencia (TIN2006-12767), the Generalitat Valenciana (GV06/302) and Bancaixa (P1 1B2006-31).
2 The Matrics system architecture
Typically, CLPR systems proceed in two stages: (1) license plate localization and (2) license character recognition. There are several approaches for plate localization: connected component detection, morphology, texture, etc. Matrics uses a connected-component-based approach that yields a series of “Regions Of Interest” (ROIs): it finds lined-up connected components on several binarizations of the image. Recognition is performed on each ROI until some stop criterion is met (or all ROIs have been considered). A decoding procedure, consisting of a template-based Two Level algorithm, is executed on each ROI: every segment of the ROI is pre-classified in a first stage and then an optimal composition of classified segments compatible with a language model is found by iteratively solving a recursive equation. Decoding more than one ROI provides a great degree of robustness against false plate detections. The stop criterion for this “ROIs on demand” generation is related to the confidence in the quality of the decoding result. Since the number of ROI decoding attempts can be large, the decoding stage must be extremely efficient. This tight integration of coarse localization and fast recognition greatly improves the overall results.
2.1 ROIs detection subsystem
The aim of the ROIs detection phase is to find a set of quadrangles that are promising candidates for containing a plate and nothing but a plate. A ROI cannot be assumed to be rectangular because of the perspective distortion introduced by the angle of the camera with respect to the car. ROIs are found by analyzing the connected components of binarized images. Connected components of similar height, with an aspect ratio within some range, and (approximately) distributed along a line are considered probable license characters; their minimum enclosing quadrangle is therefore a plate candidate, i.e., a ROI.

As a single binarization procedure cannot yield the right connected components under all lighting, camera distance/angle, and blur conditions, Matrics uses up to four different binarizations. All of them are applied to a smoothed image (obtained by applying a Gaussian filter to the original gray-scale image). This filtered image undergoes a mean-minus-C local thresholding that produces a binary image. The thresholding filter has two parameters: n, the window length; and c, the value to be subtracted from the mean gray level in a pixel neighbourhood before deciding whether the current pixel is black or white. Three different binarizations result from applying the mean-minus-C filter with parameters (n = 21, c = 2), (n = 21, c = 6), and (n = 9, c = 6).

Under hard-light conditions such as direct exposure to sunlight, shadows cast on the plate make it hard for any local thresholding technique to properly binarize the image: the frontier between the shadowed and lit regions is a high-contrast line that usually connects most characters in the plate. A fourth binarization is performed to solve this problem. The (binarized) image resulting from the application of a Canny filter is “subtracted” from the (n = 9, c = 6) binarization. This subtraction of thin edges effectively disconnects the characters (see Fig. 1). Sometimes the subtraction produces connected components smaller than those associated with a character. A post-processing phase heuristically joins connected components of similar width that are placed very close to each other along the vertical axis (thus joining the two connected components into which the shadow border splits a single character).

Fig. 1. (a) Plate directly exposed to sunlight. (b) A single connected component in the binarized image groups several characters. (c) Edges in the original image. (d) Removal of edges from the binary image: the characters of the plate are separated from one another.

The set of connected components in a binarized image is filtered according to absolute size and aspect ratio in order to discard noise and items that are too large. For every pair of surviving connected components, the components lying along the line that joins them are selected as candidates to be part of a ROI. Whenever the number of selected elements is between 3 and 9, the minimum-area quadrangle enclosing all these components becomes a ROI candidate. ROIs whose baseline has an absolute slope of more than 45 degrees are discarded (these parameters can be modified to tune the system for different requirements). Only ROIs that are not included in other ROIs and whose four vertices are sufficiently different from those defining other ROIs are effectively generated.

Former versions of the system heuristically scored each ROI and yielded the ROIs of a picture in decreasing-score order [4]. The score took into account the number of included connected components, the baseline angle, the percentage of overlap between components, etc. According to our experiments with the new system, this ROI scoring does not have a significant impact on the recognition accuracy of the whole CLPR system.

Finally, the quadrilateral ROIs are mapped into rectangles by means of a bilinear transform.
The rectangle dimensions are chosen as close as possible to the quadrangle side lengths. Since the resulting image is expected to be a plate, i.e., a locally high-contrast region, it is enhanced with an adaptive contrast-stretching filter that takes into account the average gray level on each row and column of pixels. To compensate for the perspective distortion introduced by the bilinear transform, the enhanced gray image is slant-corrected before being yielded by the ROIs detection subsystem.
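
The mean-minus-C thresholding used for the first three binarizations of this subsystem can be sketched as follows. This is only an illustrative sketch, not the Matrics implementation: the function names `mean_minus_c` and `binarize_all`, the use of a square n × n window, the edge-replicating border handling, and the convention that a pixel is black when its gray level falls below the local mean minus c are all assumptions. The Gaussian smoothing step and the Canny-based fourth binarization are omitted.

```python
import numpy as np

# Parameter settings (n, c) quoted in the text for the three binarizations.
SETTINGS = [(21, 2), (21, 6), (9, 6)]

def mean_minus_c(gray, n=21, c=2):
    """Return a boolean image that is True where a pixel is classified as black.

    Assumption: a pixel is black when its gray level is below the mean of the
    n x n window centred on it, minus c. Borders are handled by replicating
    edge pixels, which is a choice made for this sketch only.
    """
    if n % 2 != 1:
        raise ValueError("window length n must be odd")
    r = n // 2
    img = np.asarray(gray, dtype=np.float64)
    h, w = img.shape
    padded = np.pad(img, r, mode="edge")
    # Integral image with a leading row and column of zeros, so that any
    # window sum is obtained from four look-ups.
    integral = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    integral[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    window_sum = (integral[n:n + h, n:n + w] - integral[:h, n:n + w]
                  - integral[n:n + h, :w] + integral[:h, :w])
    local_mean = window_sum / (n * n)
    return img < local_mean - c

def binarize_all(gray):
    """Apply the three (n, c) settings to an (already smoothed) gray image."""
    return [mean_minus_c(gray, n, c) for n, c in SETTINGS]
```

The integral image keeps the cost per pixel independent of the window length, which matters when several binarizations must be computed for every frame.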
2.2 Plates decoding subsystem
A gray-scale image can be seen as a sequence of frames, each one consisting of a column of pixels. The decoding problem can be formulated as the computation of an optimal segmentation of this sequence of frames: each segment is a sequence of consecutive frames labeled as either a character or a white space, such that the concatenated string of labels belongs to a given language (the valid license plate codes). This problem can be solved by a Two-Level decoding algorithm consisting of a segment classification stage and a simultaneous optimal segmentation/decoding stage. For the sake of clarity, we describe the second stage first.

Second stage: optimal segmentation and decoding. Let $f = \langle f_1, f_2, \ldots, f_n \rangle$ be a sequence of frames (in our case, a sequence of pixel columns) and let $A = (\Sigma, Q, q_0, \delta, F)$ be a Finite State Automaton (FSA), where $\Sigma$ is an alphabet, $Q$ is the set of states, $q_0 \in Q$ is the initial state, $\delta \subseteq Q \times \Sigma \times Q$ is the set of transitions, and $F \subseteq Q$ is the set of final states. Let $d(i, j, a)$ be a dissimilarity measure between the subsequence (segment) $\langle f_i, f_{i+1}, \ldots, f_j \rangle$ and the character $a \in \Sigma$ (we will present its computation in the next subsection). A segmentation of $f$ into $m$ segments is a sequence of $m + 1$ integers, $\langle s_0, s_1, \ldots, s_m \rangle$, such that $s_0 = 0$, $s_m = n$, and $s_i < s_{i+1}$ for all $0 \le i < m$. The $i$-th segment is the subsequence $\langle f_{s_{i-1}+1}, f_{s_{i-1}+2}, \ldots, f_{s_i} \rangle$. Given both a sequence of states, $\mathbf{q} = \langle q_0, q_1, \ldots, q_m \rangle$, and a segmentation, $\langle s_0, s_1, \ldots, s_m \rangle$, we define their (normalized) distortion as

$$D_{\mathbf{q}}(\langle s_0, s_1, \ldots, s_m \rangle) = \sum_{i=1}^{m} \frac{\min_{a \in \Sigma : (q_{i-1}, a, q_i) \in \delta} d(s_{i-1}+1, s_i, a)}{m}. \tag{1}$$
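
As an illustration of Eq. (1), the following sketch evaluates the normalized distortion of one fixed state sequence and one fixed segmentation. Everything in it is hypothetical: the function name `distortion`, the toy automaton, and the toy dissimilarity (which only compares the segment width with a nominal character width) are assumptions made for the example, not part of Matrics. The sketch also assumes that the i-th segment covers frames s_{i-1}+1 through s_i, with frames numbered from 1.

```python
import math

def distortion(states, cuts, delta, d):
    """Normalized distortion of Eq. (1).

    states: [q_0, ..., q_m]; cuts: [s_0, ..., s_m] with s_0 = 0 and s_m = n;
    delta: set of (source, symbol, target) transitions;
    d(i, j, a): dissimilarity between frames i..j (1-based, inclusive) and a.
    """
    m = len(cuts) - 1
    assert len(states) == m + 1
    total = 0.0
    for i in range(1, m + 1):
        # Symbols labelling a transition from q_{i-1} to q_i.
        labels = [a for (p, a, q) in delta
                  if p == states[i - 1] and q == states[i]]
        if not labels:
            return math.inf  # the state sequence is not a path of the automaton
        total += min(d(cuts[i - 1] + 1, cuts[i], a) for a in labels)
    return total / m

# Toy data (hypothetical): two-symbol plates, "A" or "B" followed by "A".
toy_delta = {(0, "A", 1), (0, "B", 1), (1, "A", 2)}
toy_width = {"A": 2, "B": 3}

def toy_d(i, j, a):
    # Toy dissimilarity: difference between segment width and nominal width.
    return abs((j - i + 1) - toy_width[a])

print(distortion([0, 1, 2], [0, 3, 5], toy_delta, toy_d))  # 0.0
print(distortion([0, 1, 2], [0, 2, 5], toy_delta, toy_d))  # 0.5
```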
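If all sequences of states ending at the same final state, $q \in F$, have the same length, $l(q)$, we can minimize (1) over all state sequences and segmentations by computing $\min_{q \in F} \Delta(n, q)/l(q)$, where

$$\Delta(j, q) = \begin{cases} 0, & \text{if } j = 0 \text{ and } q = q_0; \\ +\infty, & \text{if } j = 0 \text{ and } q \neq q_0; \\ \displaystyle \min_{0 \le i < j} \; \min_{(q', a, q) \in \delta} \Delta(i, q') + d(i + 1, j, a), & \text{if } j > 0. \end{cases} \tag{2}$$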
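
The recurrence for Δ can be solved with a straightforward dynamic programming table. The sketch below is a minimal, unoptimized illustration under stated assumptions: the function name `decode`, the dictionary `length` holding l(q) for each final state, and the toy automaton and dissimilarity are all hypothetical, and the inner minimization is assumed to range over the transitions (q', a, q) in δ that enter q. Backpointers, which would be needed to recover the decoded string and the segmentation, are left out.

```python
import math

def decode(n, delta, q0, finals, length, d):
    """Return (best normalized distortion, final state reaching it).

    n: number of frames; delta: set of (source, symbol, target) transitions;
    q0: initial state; finals: final states; length[q]: number of segments of
    every path ending at final state q; d(i, j, a): dissimilarity between
    frames i..j (1-based, inclusive) and symbol a.
    """
    states = {q0} | {p for (p, _, _) in delta} | {q for (_, _, q) in delta}
    # table[j][q] plays the role of Delta(j, q).
    table = [{q: math.inf for q in states} for _ in range(n + 1)]
    table[0][q0] = 0.0
    for j in range(1, n + 1):
        for (p, a, q) in delta:
            for i in range(j):
                if table[i][p] < math.inf:
                    candidate = table[i][p] + d(i + 1, j, a)
                    if candidate < table[j][q]:
                        table[j][q] = candidate
    return min((table[n][q] / length[q], q) for q in finals)

# Toy data (hypothetical): two-symbol plates, "A" or "B" followed by "A".
toy_delta = {(0, "A", 1), (0, "B", 1), (1, "A", 2)}
toy_width = {"A": 2, "B": 3}

def toy_d(i, j, a):
    # Toy dissimilarity: difference between segment width and nominal width.
    return abs((j - i + 1) - toy_width[a])

print(decode(5, toy_delta, 0, {2}, {2: 2}, toy_d))  # (0.0, 2)
print(decode(6, toy_delta, 0, {2}, {2: 2}, toy_d))  # (0.5, 2)
```

As written, the table is filled in time proportional to n² times the number of transitions, plus the cost of the calls to d, which is why a fast first-stage classifier is important when many ROIs have to be decoded.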