Using machine learning for defect detection
M. Bariani, R. Cucchiara, P. Mello, M. Piccardi
Istituto di Ingegneria, University of Ferrara
Via Saragat 1 - 44100 Ferrara, Italy
Tel 00-39-532-293800 Fax 00-39-532-768602
E-mail: fmbariani,rcucchiara,pmello, [email protected]
Abstract
In this paper we present an approach to defect detection in images based on machine learning algorithms. A qualitative model of defect has been devised based on human experience. A set of vision primitives measuring defect features has been defined. Objects in images that are candidates for classification as defects are submitted to automatic classification, which is achieved with learning-by-examples algorithms. Results attained on a set of images from an industrial environment are presented and discussed.
1 Introduction
A critical problem in machine vision is the perception of shapes without a well-defined geometrical structure, or more generally of shapes whose model is not directly defined by means of a set of measurable features. In such a framework, the choice of a complete and adequate set of vision primitives may considerably affect the performance of a detection or classification system in terms of expressive power, reliability and efficiency. The main difficulties arising in these classification problems concern both the selection of a reliable, discriminating, independent set of features (with minimal cardinality) which characterize the considered shapes [Vernon, 1991] and the inference of an adequate set of rules able to distinguish classes with an acceptable tolerance.
A typical classification case arises in automated visual inspection, where the target is defect detection: it requires classifying objects into the two main classes of defects and non-defects. Sometimes a precise description of the object without defect is given, and classification can be carried out by means of template matching [Newman and Jain, 1995]. However, in many other cases, inspection aims at exploring shapes, extracted from images, which can be classified as defects if they match an approximate and often poorly defined model derived from human experience.
A possible approach to defect detection consists of selecting an adequate set of features, defining the vision primitives to measure these features and, based on the knowledge of human experts, eliciting by hand a set of decision rules [Cucchiara, 1996]. However, this approach is time-consuming and not flexible. Firstly, the correct translation of human experience into a set of computable decision rules is often non-trivial, especially when classification is made on the basis of qualitative assumptions (e.g. a defect has a "bright" and "thin" shape). Moreover, whenever features can be measured by a quantitative value (e.g. the luminosity gradient of shape edges), the definition of acceptable thresholds is very critical, in particular when classification is carried out in an unconstrained and unstructured acquisition environment. Finally, it is very difficult to evaluate the correctness of the rules on large data sets and to update them automatically when new examples become available. According to these considerations, machine learning techniques [Michalski et al., 1984] can be explored with the aim of inferring classification rules and evaluating the information content associated with the considered primitives.
2 Vision primitives for describing the model of defect
In this work we use as a case study a typical application in automatic visual inspection, i.e. the detection of unstructured defects such as surface cracks or scratches. In such a case these "objects" manifest a 2D shape in images, whose model is difficult to describe without uncertainty about both the features to be extracted and their acceptance intervals. Examples are thin surface cracks on metal products that occurred in the fabrication process: on visual inspection their shape appears roughly elongated, straight (or with a low curvature) in the main direction, with a high and almost constant luminosity with respect to the background [Cucchiara and Filicori, 1995]. A qualitative model of defects has been devised including features such as: a) rectilinearity r, b) high elongation e (understood as the ratio between length l and width w), c) high gradient g, and d) thinness t. Actual defects are strongly affected by noise with respect to the ideal model; therefore we adopted a possibly redundant set of primitives, all consistent with the model features, in order to achieve robust detection.
A canonical approach would be to segment images through edge detection or thresholding, and then to analyze the obtained segments. However, segmentation can introduce a substantial loss of information in noisy or blurred images, due to its critical parametrization. The approach followed here is to apply an image analysis operator that is able to take into account all contributions deriving from partial model matching. The adopted operator provides a single figure of merit that reflects the overall correspondence of the object with the model: i.e., a mainly rectilinear, thin, and sharp object will be associated with a defect.
The operator belongs to the class of Hough transforms, which give evidence, in the form of a peak, to parametric curves in images [Illingworth and Kittler, 1988]. The operator, called Correlated Hough transform (CH), was first proposed in [Cucchiara and Filicori, 1995]. Firstly, a gradient-weighted Hough transform is applied, mapping the angles of votes onto a [0, 2π[ range in order to discriminate not only the gradient direction but also its orientation. Therefore, elongated objects will manifest two primary peaks in the Hough space, associated with the rising and falling gradient edges. This information is used in a post-processing step to identify thin objects, whose rising and falling fronts are close: a value tmax is assumed as an upper bound for thickness, and the correlation between peaks is evaluated.
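The gradient-weighted voting step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the central-difference gradient, the number of angle bins `n_theta`, the integer quantization of ρ, and the function names `gradient_weighted_hough` and `opposite_edge_peak` are all assumptions made here for illustration. Each pixel votes with its gradient magnitude at the angle of its gradient taken over [0, 2π[, so the rising and falling edges of a bright thin object fall into opposite angle bins. The second function then looks for the opposite-orientation peak within a thickness bound. How the two peaks are combined into the CH value is not specified in this section, so that step is left out.

```python
import math
from collections import defaultdict


def gradient_weighted_hough(image, n_theta=64, grad_min=1e-9):
    """Accumulate gradient-weighted votes in a (rho, theta_bin) space.

    `image` is a list of rows of grey levels. The angle of each vote is the
    gradient angle taken over [0, 2*pi), so opposite gradient orientations
    end up in different bins. `n_theta` (assumed even) is a hypothetical
    discretization choice.
    """
    height, width = len(image), len(image[0])
    acc = defaultdict(float)
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central-difference gradient (an assumption of this sketch).
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            g = math.hypot(gx, gy)
            if g <= grad_min:
                continue  # flat regions cast no vote
            theta = math.atan2(gy, gx) % (2.0 * math.pi)
            k = int(round(theta / (2.0 * math.pi) * n_theta)) % n_theta
            theta_k = 2.0 * math.pi * k / n_theta
            rho = int(round(x * math.cos(theta_k) + y * math.sin(theta_k)))
            acc[(rho, k)] += g  # vote weighted by gradient magnitude
    return acc


def opposite_edge_peak(acc, rho, k, t_max, n_theta=64):
    """Find the strongest opposite-orientation vote within t_max of (rho, k).

    For a bright object, the falling edge lies d pixels further along the
    normal and its gradient points the other way, so it is stored at
    (-(rho + d), k + n_theta / 2). Returns (value, d), or (0.0, None).
    """
    k_opp = (k + n_theta // 2) % n_theta
    best = (0.0, None)
    for d in range(1, t_max + 1):
        value = acc.get((-(rho + d), k_opp), 0.0)
        if value > best[0]:
            best = (value, d)
    return best
```

On a synthetic image containing a bright vertical stripe, the two edges produce peaks of equal weight in angle bins that are half a turn apart, which is the situation the thickness bound tmax is meant to exploit.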
Problems arise in images where no actual defect is present, but which contain objects that have a moderate match with a few of the model features; for instance: a) long, rectilinear edges showing a very low gradient, belonging to the contour of the inspected component, to be considered as noise; b) bubbles, which have non-negligible gradient values but not an evidently elongated shape (i.e. their length/width ratio is low, because they are either short or thick, or both). These images can give rise to false-positive detections, because those matches contribute together to the CH space, where non-negligible values accumulate: thresholding the CH space alone can signal the presence of a nonexistent crack. Therefore, even though CH proves to be the most sensitive operator for defect detection, an analysis of the "spectrum" of the various features can achieve more robust classification. As a consequence, the values of all the model-related features are to be examined in order to decide about the defectiveness of a component. The actual set of primitives considered includes:
CH: the value CH(ρ, θ). This parameter carries a global dependence on the defect features, being a function f = f(r, l, g, t).
H1: the value of the gradient-based Hough transform at the same (ρ, θ) coordinates, corresponding approximately to the first of the two correlated peaks; this value mainly depends on r, l, and g, but does not depend on t.
CH/H1: the ratio of these two values, which expresses the magnification due to the correlation of the two edge steps; it is a function f = f(t).
N: the number of votes in the Hough space for the (ρ, θ) coordinates; this value strongly depends on r and l alone.
Gave: the average gradient of the votes; it summarizes sharpness, depending on g.
H2max: the maximum in the H neighbourhood where correlation with H(ρ, θ) is explored.
D: the Euclidean distance of (ρ, θ) from H2max, which is another measure of thickness.
IGave: the average gradient of the whole image; this is not a model-related feature, but is introduced to take into account overall luminance conditions.
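The eight primitives listed above form the tuple attached to each candidate object. A minimal Python sketch of such a record follows. The class name `DefectCandidate`, the field names, and the helper `peak_distance` are hypothetical and chosen here only for illustration. The derived ratio CH/H1 is computed from the two stored values, and D is taken as a plain Euclidean distance between two positions in the Hough space, as the text states, without any rescaling of the ρ and θ axes (an assumption of this sketch).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DefectCandidate:
    """One candidate object described by the primitives listed above."""
    ch: float       # CH: Correlated Hough value at (rho, theta)
    h1: float       # H1: gradient-based Hough value at the same coordinates
    n_votes: int    # N: number of votes at (rho, theta)
    g_ave: float    # Gave: average gradient of the votes
    h2_max: float   # H2max: maximum in the explored H neighbourhood
    d: float        # D: distance of (rho, theta) from the H2max position
    ig_ave: float   # IGave: average gradient of the whole image

    @property
    def ch_over_h1(self) -> float:
        """CH/H1, the magnification due to correlation (0.0 if H1 is zero)."""
        return self.ch / self.h1 if self.h1 else 0.0

    def as_tuple(self):
        """Return the eight attribute values in the order of the list above."""
        return (self.ch, self.h1, self.ch_over_h1, self.n_votes,
                self.g_ave, self.h2_max, self.d, self.ig_ave)


def peak_distance(p, q):
    """Euclidean distance between two positions (rho, theta) in Hough space."""
    return math.hypot(p[0] - q[0], p[1] - q[1])
```

Keeping CH/H1 as a derived property rather than a stored field avoids the ratio drifting out of step with CH and H1. It is still exported as a separate attribute, since the primitives list names it as one.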
3 Object classification using machine learning techniques
The problem of classification has been approached with the use of automatic classification algorithms, based on learning-by-examples techniques. This approach is advantageous over eliciting rules by hand, both in terms of capability and flexibility. The system considered for the application is C4.5 [Quinlan, 1993]. C4.5 generates a classification system represented in the form of a decision tree or a production rule set. This representation is based on attribute tests and exploits some fuzzy-logic capability.
In this application, the ultimate purpose of classification is to assess the presence of defects in components. To this aim, two classes of interest have been defined, namely Defect and NonDefect, and the objects to be classified are extracted from images in order to be submitted to classification. An object is represented by a tuple of values evaluated by the image processing operators. The tuple includes values both from the image space and from the Hough space. In particular, in this application every point belonging to the Correlated Hough space could have a tuple associated with it and be considered as a candidate for classification. However, in this case, the number of tuples to be evaluated and classified would be very high. Instead, for many of those points classification proves to be achievable by thresholding the Correlated Hough transform value alone, which is the comprehensive indicator of the overall correspondence of the object with the defect model. As a consequence, the need to extract and analyze the whole tuple is limited to those points for which the CH value is ambiguous.
The training set was composed by mixing preclassified objects belonging to both the Defect and NonDefect classes. In order to train the classifier in the ambiguous region, the elements of the NonDefect class were chosen as those objects exhibiting high CH values, which are hard to classify correctly. In particular, objects were selected in images as having a CH value higher than a percentage (75%) of the maximum CH value: this results in selecting many defective objects from images where real defects are present, and the most ambiguous points from images carrying no real defect.

System          Training set error (%)           Test set error (%)
                Overall   Defect   NonDefect     Overall   Defect   NonDefect
C4.5 tree       0.0       -        -             2.8       4.5      2.3
C4.5 rules      0.9       0.0      1.2           2.8       4.5      2.3
C4.5 tree †     6.0       -        -             12.9      44.8     3.8
C4.5 rules †    7.9       37.3     0.0           12.6      44.8     3.4
† learning without average gradient.
Table 1: Results on the training and test sets.
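The candidate selection rule used to build the training set (keep the points whose CH value is higher than 75% of the maximum CH value in the image) can be sketched in a few lines of Python. The function name `select_candidates`, the dictionary representation of the CH space, and the ordering of the result are assumptions of this sketch, not details given in the paper.

```python
def select_candidates(ch_space, fraction=0.75):
    """Return the (position, value) pairs whose CH value exceeds a fraction
    of the maximum CH value, strongest first.

    `ch_space` maps a Hough-space position, e.g. (rho, theta_bin), to its CH
    value. Both this representation and the strict comparison are
    assumptions made for illustration.
    """
    if not ch_space:
        return []
    threshold = fraction * max(ch_space.values())
    selected = [(pos, value) for pos, value in ch_space.items()
                if value > threshold]
    # Sort by decreasing CH value so the least ambiguous points come first.
    return sorted(selected, key=lambda item: -item[1])
```

Because the threshold is relative to each image's own maximum, an image with no real defect still yields its highest-scoring points, which is how the most ambiguous NonDefect examples enter the training set.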
Table 1 shows the results attained with a training set composed of 317 objects (67 Defect (21.1%) - 250 NonDefect (78.9%)). The test set was generated by cross-validation, extracting one object at a time from the training set and submitting it to the classification system obtained from the remaining examples. Table 1 reports results for the two forms of classifying system provided by C4.5, trees and rule sets. In addition, the rule form makes it possible to separate the contributions to the overall error rate deriving from the misclassification of defects and of non-defects. This is useful in this application, where the erroneous classification of a defective object is intolerable because it validates faulty components. As can be observed, the two forms achieve similar error rates; however, the rule-set form aims at a high generalization of the examples, and thus can sometimes sacrifice precision on the training set. In both cases, classification achieves very low error rates, satisfying the application requirements.

C4.5 [release 8] decision tree generator
---------------------------------------
Read 317 cases (8 attributes) from defect.data

Decision Tree:

CH > 78.6283 : Defect (29.0)
CH 0.0470579 : NonDefect (98.0)
| | IGAve 17.7267 : Defect (18.0)
| | | CH 0.0456994 : NonDefect (13.0)
| | | | IGAve 12.1436 : Defect (15.0)
| | | | | | CH/H1