
Fast and Robust 2D-Shape Extraction Using Discrete-Point Sampling and Centerline Grouping in Complex Images

Zongxiao Zhu, Student Member, IEEE, Guoyou Wang, Jianguo Liu, Member, IEEE, and Zhong Chen

Abstract— This paper further develops the concept, model, and parameters of the discrete-point sampling operator that we previously proposed, making the belt-shaped regions in a discrete-point sampling map more salient and better suited to centerline extraction. The cross-sectional features of these belt-shaped regions are then analyzed, and seven types of feature points are defined to facilitate descriptions of such features. Based on these feature points, a three-level detection system is proposed, covering feature points, line segments, and centerlines, to extract centerlines from the belt-shaped regions. Eight basic types of centerlines and five types of relationships among centerlines are defined by computational geometry algorithms, and Gestalt laws are used to cluster the centerlines into groupings. If some prior information about a desired shape is available, retrieval grouping may be carried out on the discrete-point sampling map to find the centerlines that best match the prior information. Discrete-point sampling effectively overcomes interference from noise, textures, and uneven illumination, and greatly reduces the difficulty of centerline extraction. Centerline clustering grouping and retrieval grouping offer strong robustness to nonlinear deformations such as articulation and occlusion. The method can extract large-scale complex shapes composed of lines and planes from complex images. Wheel location results under noise and other shape extraction experiments show that our method withstands nonlinear deformations well.

Index Terms— Discrete-point sampling, centerline extraction, centerline grouping, shape extraction.

I. INTRODUCTION

SHAPE extraction [1], [2] is vital for many computer vision applications. Shape is the characteristic surface configuration of an object: its outline or contours.

Manuscript received October 27, 2012; revised May 5, 2013; accepted July 18, 2013. Date of publication August 8, 2013; date of current version September 27, 2013. This work was supported by the Wuhan Huamu Science and Technology Company. The work of Z. Zhu was supported by the Natural Science Foundation of China under Grant 60975021. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Theo Gevers. (Corresponding author: G. Wang.) Z. Zhu is with the State Key Laboratory for Multi-Spectral Information Processing Technology, Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan 430074, China, and also with the Information Processing Laboratory for Minority Language, College of Computer Science, South-Central University for Nationalities, Wuhan 430074, China (e-mail: [email protected]). G. Wang, J. Liu, and Z. Chen are with the State Key Laboratory for Multi-Spectral Information Processing Technology, Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan 430074, China (e-mail: [email protected]; [email protected]; [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIP.2013.2277824

Fig. 1. The whole image formed by the fifty-three photos of the TFDS.

Simple 2D-shapes can be described by basic geometric objects, such as a set of two or more points, a line, a curve, a plane, or a plane figure such as a square or circle [3]. During the past two decades, a massive amount of research has focused on extracting such simple 2D-shapes separately, yielding many significant results. For example, boundary-based extraction [4], [5] for plane objects is key in many practical applications of computer vision, such as object recognition, robot vision, and medical image analysis. Curvilinear structure detectors [6]–[8], for the detection and delineation of lines, are widely used in handwriting recognition, human-computer interaction, and road detection in aerial images. However, the extraction of more complex 2D-shapes, composed of lines and planes in a complex image, is still an unsolved problem. Human eyes are quite good at extracting such complex shapes, an ability used extensively in ordinary life. For example, an inspection system widely deployed on Chinese railways is officially named the "Trouble of Freight Car Detection System" (TFDS) [9], [10]. This system uses five cameras to take photos of 53 parts (1,400 × 1,024, 24-bit color, JPG format) on the two sides and the bottom of a freight car, as shown in Fig. 1. The photos are taken while the freight train is running normally, so the size, distance, angle, and shooting time of each photo are relatively consistent. Between 2,000 and 3,000 photos are taken of one freight train, depending on the number of cars it hauls. All of these photos are inspected, and fault judgments are made within ten minutes by four freight train inspectors in an inspection office. If any fault is found, the inspectors report it immediately. If the fault is severe enough to endanger the train's operation, the freight train is stopped, and an external inspector is sent to further inspect, confirm, and repair the fault. This process involves numerous extractions of complex shapes.


Fig. 2. Some common types of nonlinear deformations in the right wheel part. (a) Blurring. (b) Blazing. (c) Insufficient illumination. (d) Foreign body shelter. (e) Artificial shelter.

First, inspectors locate potential fault regions according to some salient shapes in the current image. Second, inspectors judge whether there are faults in those potential fault regions according to whether certain expected shapes are present, absent, or changed. The starting point of this paper is to build a fast and robust 2D-shape extraction method that can locate potential fault regions and help make fault judgments in place of the four freight train inspectors. Our method will reduce human labor requirements, improve the quality and efficiency of inspections, and better ensure the security and speed of railway transport. This is a difficult problem, as shapes in the same category, which look similar to the human eye, are often very different when extracted under geometric transformations (translation, rotation, scaling, etc.) and nonlinear deformations (noise, articulation, and occlusion). Compared to geometric transformations, nonlinear deformations are much more challenging [11]. In TFDS images, there are small geometric transformations and large nonlinear deformations. Geometric transformations originate from differences in the installed distances and angles of different inspection stations, changes in the speed of trains, and shaking of the camera, which lead to some translation and small rotation and scaling. Nonlinear deformations come from illumination and weather changes, noise, all types of spots, strong reflected light, and the articulations and occlusions suffered by freight cars that have been running in the open for a long time. In this case, if common regional threshold segmentation or edge extraction algorithms are adopted, irregular and random changes will occur within the resulting regions or edges, which makes it very difficult to establish a unified mathematical model for the desired shapes. Fig. 2 shows some common types of nonlinear deformations in images of the right wheel parts. Difficulties also stem from online detection: the reliability and speed of operation must be taken into account, which makes complex pattern recognition algorithms difficult to apply. If two PCs are used to take over the four inspectors' work, each PC should be able to process 150 photos per minute, and the processing time for each photo should be less than 400 ms. In order to overcome the influences of small geometric transformations and large nonlinear deformations, and to quickly and accurately extract complex shapes composed of lines and planes in TFDS images, this paper proposes a fast and robust 2D-shape extraction method using discrete-point sampling and centerline grouping. The schematic diagram is shown in Fig. 3. Our method first converts a 24-bit color image into an 8-bit gray image, then applies discrete-point sampling to the gray image to represent both plane objects' boundaries and line objects uniformly as belt-shaped regions clustered by discrete points in a discrete-point sampling map (DPSM).

Fig. 3. Schematic diagram of the proposed method.

Through cross-sectional analysis of the different types of belt-shaped regions, a three-level detection system for feature points, line segments, and centerlines was designed, which can extract centerlines from these belt-shaped regions quickly, smoothly, and accurately. Computational geometry algorithms are used to classify these centerlines into eight basic types. The next two steps constitute centerline clustering grouping, which first calculates the five kinds of relationships among centerlines, namely intersection, joining, encountering, collinearity, and parallelism, and then groups the interrelated centerlines into a shape using Gestalt laws. If some prior information about a desired shape can be introduced, one or more extracted centerlines that reasonably match that prior information are taken as key centerlines for locating possible areas of the shape, and the other centerlines belonging to that shape are retrieved in those areas of the DPSM to find the desired shape. These two steps constitute centerline retrieval grouping in this paper. In summary, discrete-point sampling can effectively overcome interference from noise, textures, and uneven illumination, greatly reduce the difficulty of centerline extraction, and tolerate small geometric transformations. Flexible use of centerline clustering grouping and retrieval grouping offers strong robustness to nonlinear deformations such as articulation and occlusion. Based on the proposed method, this paper sets up a fast procedure for wheel location and desired-shape extraction from TFDS images, with satisfactory results. Experiments adding random noise and salt-and-pepper noise show that our method has a strong capability to address nonlinear deformations. In our research, we found that our method can also be used for road detection in aerial images and for the extraction of some man-made objects from nature scenes. The remainder of this paper is organized as follows. Related studies are presented in Section II. In Section III we introduce our methods of discrete-point sampling, centerline extraction, and grouping. Experimental results are described in Section IV. Finally, discussion and conclusions are given in Section V.


II. RELATED STUDIES

Traditionally, 2D-shape extraction research focuses on plane objects and line objects separately. Shape extraction methods for plane objects are classified into two general groups, boundary-based and region-based methods, according to whether feature extraction is done on the boundary only or from the whole region. Regions are global features, so region-based methods are more robust to local shape distortions and occlusions, but for complex shapes they may have difficulty extracting shapes reliably and representing them precisely. Since regions have closed boundaries, they do not adapt easily to the representation of open image curves, which are abundant in TFDS and aerial images.

Early boundary-based methods were predominantly based on selecting edges of interest that meet some fitness criterion, such as curvature or arc length, and representing a shape by grouping those edges. However, traditional edge extraction algorithms are designed and optimized individually, without considering subsequent grouping processes, which makes edge grouping difficult or even inapplicable. Taking the Canny operator [12] as an example, Canny's Criterion I asks that an edge detector have a good signal-to-noise ratio (SNR), so that all edges can be detected even when the image quality is poor. Such an edge detector generates too many edge responses for grouping. Often there are many edge responses even when the input image is just random noise, which means that edges originating from contours or lines and edges originating from noise or texture tend to be in good continuation with each other, obstructing subsequent edge grouping. This problem is referred to as problem A in this paper. Canny's Criterion II asks that all detected edges be as close to the real edges as possible, with a single edge-point response, so that edge grouping has to be carried out on single-connected edge pixels, with many gaps to be bridged, which decreases its efficiency dramatically. This problem is referred to as problem B in this paper.

In the case of problem A, caused by Criterion I, many researchers have tried to distinguish long chains of collinear edges from randomly scattered, oriented stimuli [13] by using improved grouping methods, such as multiresolution detection, which can be classified into the two categories of edge focusing [14] and adaptive scale parameters [15]. The former gives an excellent trade-off between good localization and good noise rejection, while the latter is more applicable for detecting contours and lines in images that contain different degrees of blurring. Yang [16] takes into account the appearance of image patches around edges, including their brightness, color, texture cues, etc., which makes grouping more accurate. However, most of these existing algorithms have high computational costs and cannot be used in online applications.

In the case of problem B, caused by Criterion II, other studies propose contextual edge detectors [17], which differ from general edge detectors. Such detectors take into account additional information around an edge, such as local image statistics, image topology, perceptual differences in local cues, edge continuity, and density.

Contextual edge detectors are not aimed at detecting all luminance changes in an image, but at selectively enhancing only those that are of interest in the context of a specific computer vision task. For example, edge drawing [18] and EDPF [19] obtain clean, contiguous, connected chains of edge pixels by computing a set of pixels within an image, referred to as anchors, and then linking them directly with a smart routing algorithm. Although this clearly goes against Canny's Criterion II, it avoids producing edge maps of individual edge pixels with no real relationship or connectivity, which makes subsequent edge tracing simple and efficient. EDContours [20] is a high-speed contour detector that produces good results by combining EDPF edges at different scales to compute a final soft contour map.

Regardless of whether boundary-based or region-based methods are used, generating an intermediate-level representation [21], [22], like a threshold segmentation map or an edge map, is the very important first step of these shape extraction methods. For complex images with rich edges, especially those disturbed by noise, texture, or uneven illumination, an ideal intermediate-level representation should be like the impression formed in the human brain, which accentuates the integrity and salience of objects in images and minimizes the importance of the accuracy of details. The main information in images can thus be enhanced while detailed information is ignored automatically, which is advantageous for shape extraction in such images. Canny's criteria place an overemphasis on the detail and accuracy of edges and cannot satisfy the above requirements, while threshold segmentation maps usually confound objects with backgrounds in complex scenes. This paper compares such intermediate-level representations with our DPSM in Section IV.

For line objects, the above contour-based methods may also be used if the lines are treated as objects with parallel edges [23]. Another two groups of methods for line objects are centerline-based and region-based methods. Centerline-based methods are designed to obtain the delineation of curvilinear structures. Steger [24] detects line center points as points where the directional gradient is zero and the second directional derivative has a high absolute value, negative for ridges and positive for valleys; line edges are defined as the points of maximum directional gradient on a sweep perpendicular to the line. Eberly [25] proposed a method to extract ridges as points whose intensities are maxima or minima in the direction of the maximum eigenvalue of the Hessian matrix. Nevertheless, due to the use of second-order derivatives, these approaches are sensitive to noise and have high computational costs. Region-based methods are designed to extract all the pixels of curvilinear structures, which correspond to salient regions of the input gray image. C. Lemaitre [26] proposed a robust curvilinear structure detector (CRD) utilizing three steps, cross-section detection, curvilinear region construction, and merging and bridging operations, which is strongly resistant to interference from noise and texture in complex scenes. However, its processing time, using a standard PC and a C++ implementation, is around 20 s for 1,024 × 1,024 images, which is too slow for online applications.


Li [27] extracts curvilinear structures of interest through density estimation, then extracts their centerlines through seed generation and line following. His method is real-time and insensitive to noise. However, these two methods cannot be used to extract shapes that are combinations of lines and planes. In this paper, when we extract centerlines from the belt-shaped regions in DPSMs, we independently use some ideas similar to those of these two methods, such as cross-section detection, seed generation, and line following.

For grouping, if there is no prior information about the desired shape, centerline grouping may be done according to the Gestalt laws of proximity, good continuation, closure, and symmetry, as with edge grouping [28], [29]. However, if prior information is available, e.g., shapes are known to be straight lines, circles, right-angle polygonal lines, long curves, or closed contours, or their positional relationships are known, then retrieval can be started with the prior information from the image's intermediate-level representation, such as an edge map or a DPSM. Retrieval from an edge map needs explicit point-to-point edge correspondences, while retrieval in a DPSM can be done with belt-shaped regions, which reduces the difficulty and improves the efficiency.


Fig. 4. Diagram of the discrete-point sampling model.

Fig. 5. Difference between DPSMs with or without the absolute value sign. (a) Initial gray image, 1,400 × 1,024. (b) DPSM by (2), r = 2, R = 8, ξ = 0.05, 55 236 discrete points, 62.80 ms. (c) DPSM by (4), r = 2, R = 15, ξ = 0.05, 41 052 discrete points, 88.00 ms.

III. THEORY

A. Discrete-Point Sampling

Discrete-point sampling [30] is a type of fixed grid sampling using adaptive local parameters. Fixed grid sampling [31], [32] is a widely used shape-model building method that creates a subset of points by picking equally spaced points, which can tolerate some deformations and reduce computational costs. The goal of using adaptive local parameters is to represent contours under different blurring, illumination, and noise. We designed our adaptive local parameters with a concept from psychophysics, the Just Noticeable Difference (JND) [33], also known as the difference threshold. The JND is the minimum amount by which stimulus intensity must be changed in order to produce a noticeable variation in sensory experience. According to Weber's law [34], the JND between two stimuli is proportional to the magnitude of the stimuli. Measured in physical units, it can be expressed as

$$\frac{\Delta I}{I} = \xi \tag{1}$$

where $I$ is the original intensity of stimulation, $\Delta I$ is the additional amount required for the difference to be perceived (the JND), and $\xi$ is a constant called the Weber constant. Motivated by (1), our discrete-point sampling model is shown in Fig. 4. $P_k$ is the current fixed grid sampling point; $C_r$ and $C_R$ represent the inner and outer circle areas of concentric circles centered at $P_k$, with radii $r$ and $R$, respectively. $C_{R-r}$ is the concentric ring area, $P_i$ is the $i$-th pixel in $C_r$ with gray value $g_i$, and $P_j$ is the $j$-th pixel in $C_{R-r}$ with gray value $g_j$. The radius $r$ is also the radius of the fixed grid sampling. So the left part of (1) can be expressed with this model as

$$U(P_k) = \frac{\left| \frac{1}{C_{R-r}} \sum_{P_j \in C_{R-r}} g_j - \frac{1}{C_r} \sum_{P_i \in C_r} g_i \right|}{\frac{1}{C_r} \sum_{P_i \in C_r} g_i}. \tag{2}$$

$U(P_k)$ in (2) is used as a measurement to decide whether $P_k$ is noticeable or not, and the threshold of this decision is just the Weber constant $\xi$. Every noticeable fixed grid sampling point is set to 1 in a map called a discrete-point sampling map (DPSM), and every unnoticeable fixed grid sampling point is set to 0. So each fixed grid sampling point $T(P_k)$ in a DPSM is calculated as

$$T(P_k) = \begin{cases} 1, & U(P_k) > \xi \\ 0, & U(P_k) \le \xi. \end{cases} \tag{3}$$

All other points, which are not fixed grid sampling points, are set to 0. Noticeable fixed grid sampling points are named discrete points in this paper; they make up the DPSM. Equation (2) works well with ideal images that are clear and noise-free. However, when images are full of noise and have different levels of blurring, like TFDS images or aerial images, (2) is too sensitive to be used. So we remove the absolute value sign to reduce the number of discrete points and to avoid belt-shaped regions conglutinating with their neighborhoods, which makes the belt-shaped regions in a DPSM more salient and easier to extract centerlines from. Fig. 5 shows the difference between DPSMs with and without the absolute value sign. We also add a small $\varepsilon$ to the denominator to avoid division by zero in uniformly dark regions. The modified formula is shown in (4). In our program, (2) and (4) can be switched manually for different applications. More comparisons of results for these two formulas are discussed in Section IV:

$$U(P_k) = \frac{\frac{1}{C_{R-r}} \sum_{P_j \in C_{R-r}} g_j - \frac{1}{C_r} \sum_{P_i \in C_r} g_i}{\frac{1}{C_r} \sum_{P_i \in C_r} g_i + \varepsilon}. \tag{4}$$


Fig. 7. Relationship among the sampling radius r, the external circle radius R, the number of discrete points, and the sampling time. (a) R = 15, ξ = 0.05. (b) r = 2, ξ = 0.05.

Fig. 6. Discrete-point sampling results for different parameters. (a) Initial gray image, 697 × 584. (b)–(d) R = 15, ξ = 0.05, r changes. (e)–(h) r = 2, ξ = 0.05, R changes. (i)–(l) r = 2, R = 15, ξ changes.

As (4) is more widely used, the following discussion of parameter selection is based on (4). There are three parameters in our discrete-point sampling model: the sampling radius r, the external circle radius R, and the Weber constant ξ. Fig. 6 shows how these three parameters affect the discrete-point sampling results, and we discuss them in determining order.

1) Sampling Radius r: The sampling radius r directly determines the density of discrete points in a DPSM. It can be seen from Fig. 6(b)–(d) that the larger its value, the sparser the density of the discrete points. The sampling radius r also affects a DPSM's ability to describe the detailed information contained in the initial gray image: the smaller its value, the stronger this ability, at greater computational cost. The difficulty of subsequent centerline extraction is also proportional to this value, and a larger sampling radius r may require a more complex centerline extraction strategy. Fig. 7(a) gives the relationships among the sampling radius r, the number of discrete points, and the sampling time for Fig. 6(a). It can be seen that when the sampling radius r increases from 1 to 2, the number of discrete points is reduced by 80.3%, and the sampling time is shortened by 76.9%. Taking all these factors into account, including the salience of the belt-shaped regions in a DPSM, the difficulty of subsequent centerline extraction, and the sampling time cost, we use r = 2 for all applications in this paper.

2) External Circle Radius R: The external circle radius R determines the degree to which belt-shaped regions are strengthened. The larger its value, the farther from the real edge the discrete points are generated, and the more salient the belt-shaped regions gathered by these points, as Fig. 6(e)–(h) shows, which is more beneficial for centerline extraction. At the same time, the external circle radius R also affects the ability to suppress interference from noise and texture: (2) or (4) can be seen as a mean filter, and the external circle radius R is its window size. The larger its value, the better its ability to inhibit interference from noise and textures, and the more time is needed to calculate the discrete points. Fig. 7(b) gives the relationships among the external circle radius R, the number of discrete points, and the sampling time for Fig. 6(a).

It can be seen that the sampling time grows linearly with the external circle radius R, while the number of discrete points does not, which means that the filtering effect is enhanced as R increases. Through experimentation, we use R = 15 for all applications using (4) in this paper.

3) Weber Constant ξ: The Weber constant ξ affects two abilities: the DPSM's ability to describe details of the initial image, and its ability to suppress interference from noise and texture. These two abilities are in contradiction. The larger the value of ξ, the weaker the ability to describe details of the initial image, and the better the suppression ability, as shown in Fig. 6(i)–(l). By experimentation, we chose ξ = 0.05 for all applications in this paper, which corresponds to the human eye under very high brightness.

B. Centerline Extraction

Before centerlines can be extracted from the belt-shaped regions in a DPSM, the neighborhood space and discrete distance of discrete points in a DPSM must first be defined. Traditional neighborhood spaces are 4- or 8-neighborhoods. Zhang [35] extended this to a 16-neighborhood by defining the M-connectivity and the M-path boundary. In a DPSM, because the belt-shaped regions that correspond to edges have some actual width, the neighborhood space can be extended to a 64-neighborhood, and the central point P extends in 64 directions, whose extension lengths are calculated as continuation features. This approach is called the 64 peripheral direction contributions (64 PDC), which can be expressed as

$$d_i = l_i \times s_i, \quad i = 1, 2, \ldots, 64 \tag{5}$$

where $l_i$ is the unit length of every step that extends in direction $i$, $s_i$ is the number of steps extended in direction $i$, and 64 indicates a division of the 360° around the point P into 64 equal parts, so each two neighboring directions differ by 360/64 = 5.625°. Fig. 8 shows the unit length in every direction. For convenience of calculation and representation, the 32 peripheral bi-direction contribution (32 PBDC) of the central point P is also defined in this paper. If the contributions in opposite directions are added together, a feature set with half the number of features of the 64 PDC is obtained, which is the 32 PBDC of the central point P:

$$bd_i = l_i (s_{i+32} + s_i), \quad i = 1, 2, \ldots, 32. \tag{6}$$
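The following sketch shows one way to compute the 64 PDC of (5) and fold it into the 32 PBDC of (6). It assumes the DPSM has been rendered as a binary map in which belt-shaped regions are connected at the working resolution, and it uses a unit step length in every direction, which simplifies the per-direction unit lengths of Fig. 8.

```python
import numpy as np

def pdc_64(dpsm, y, x, max_steps=50):
    """64 peripheral direction contributions of Eq. (5) for point (y, x):
    step along each of 64 directions (5.625 degrees apart) while the map
    stays set, recording d_i = l_i * s_i with an assumed l_i = 1."""
    h, w = dpsm.shape
    d = np.zeros(64)
    for i in range(64):
        theta = 2.0 * np.pi * i / 64.0
        dy, dx = np.sin(theta), np.cos(theta)
        s = 0
        while s < max_steps:
            py = int(round(y + (s + 1) * dy))
            px = int(round(x + (s + 1) * dx))
            if not (0 <= py < h and 0 <= px < w) or dpsm[py, px] == 0:
                break
            s += 1            # extend while still inside the belt-shaped region
        d[i] = 1.0 * s        # d_i = l_i * s_i with l_i = 1 assumed
    return d

def pbdc_32(d):
    """32 peripheral bi-direction contributions of Eq. (6):
    add the contributions of opposite directions, bd_i = l_i(s_{i+32} + s_i)."""
    return d[:32] + d[32:]
```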


Fig. 8. The unit length of every step in 64 directions.

Based on the 64 PDC and 32 PBDC, the centerline extraction method for belt-shaped regions in a DPSM may be divided into three steps: cross-sectional analysis, centerline tracing, and centerline classification.

1) Cross-Sectional Analysis: The cross-sectional features of any discrete point in a DPSM may be defined and calculated from its 64 PDC and 32 PBDC, as shown in Fig. 9 and Table I.
a) Region point: P is a region point if all of its 64 PDCs are greater than zero.
b) Part region point: P is a part region point if all of its 32 PBDCs are greater than zero.
c) Wide belt point: P is a wide belt point if its EL is less than half the sum of its 64 PDCs.
d) If the NL of P is less than a normal width threshold Tn, and its EL is longer than an extended length threshold Td, then
i) if the SL of P is less than 4r, its NW is less than 1/4 of its EL, and its FNW and SNW are both greater than zero with a difference of less than 2r, then P is an end point;
ii) if both P's FNW and SNW are greater than 1/4 of its EL or half of the normal width threshold Tn, and there is at least one PDC equal to zero in each of the four regions' directions, then P is a cross point;
iii) otherwise, P is a connected point.
e) P is an unknown point if none of the rules in (a)–(d) is satisfied.
f) Region points and part region points are feature points of noncurvilinear structures, while end points, cross points, and connected points are feature points of curvilinear structures.

2) Centerline Tracing: If the current point is a region point or a part region point, a region shield is applied to prohibit all the points in this region from being calculated or traced, which greatly decreases the number of discrete points whose types must be calculated. If the present discrete point P is an end point or a connected point, line segment detection starts immediately. Specifically, taking P as the starting point P_Start, the maximum PDC, d_max, is calculated from among its LD and the LD's 2 × l neighboring directions, which means that the PDC must be calculated for 2 × l + 1 directions.

Fig. 9. A discrete point's cross-sectional features.

TABLE I. Definitions of the cross-sectional features of a discrete point.

The present discrete point then moves forward one step to a new discrete point P_End along the direction corresponding to d_max. During this process, both the global feature of extending the direction and the local feature of making a directional adjustment are taken into account. The larger l is, the greater the number of directions in which the centerline can extend, and the greater the computational cost. In this paper, l is set to 4 according to our experimental experience, which means that line segment detection is conducted among the LDs within ±22.5°. The connecting line between P_End and P_Start is the line segment L. All detected line segments and their mother-centerline numbers are recorded on a history map by drawing gray line segments, where the gray value is the line segment's number; this will be used for the relationship calculations. Taking P_End as the new start point P_Start, our centerline tracing method detects line segments continuously until it satisfies one of the ending conditions, which are listed below.


Fig. 11. Five kinds of relationships among centerlines. (a) Intersection, (b) joining, (c) encountering, (d) co-linearity, and (e) parallelism.

Fig. 10. Centerline classification results indicated by different colors. (a) Initial gray image, 1,400 × 1,024. (b) Centerline map. RED: circle arc; PINK: partial circle arc; BLUE: closed curve; AZURE: partially closed curve; YELLOW: straight line; PURPLE: two-polygonal line; GREEN: three-polygonal line; BLACK: unknown curve. There are 59 centerlines extracted and classified, taking 100.53 ms.

a) The maximum PDC among the 2 × l + 1 directions, d_max, is equal to zero.
b) The last P_End point is a region point or a part region point, which means that the traced centerline has left the feature points of the curvilinear structures and entered the feature points of the noncurvilinear structures.
c) Centerline tracing reaches the image boundary.
d) Centerline tracing reaches a line segment recorded in the history map that belongs to its own or another curve, while the intersection condition is not satisfied.

After tracing along the longer direction of P_Start, centerline tracing continues along the shorter direction of P_Start. Then all of the line segments from both directions are reordered, and a whole centerline C is formed. (A simplified code sketch of this tracing loop is given at the end of this subsection.)

3) Centerline Classification: Using computational geometry algorithms, all of the detected centerlines can be divided into eight types: a straight line, a two-polygonal line, a three-polygonal line, a closed curve, a partially closed curve, a circle arc, a partial circle arc, and an unknown curve. This paper uses different colors to represent the different centerline types, as shown in Fig. 10.
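Returning to the tracing loop of Section III-B.2, the sketch below traces one direction using the pdc_64 helper given earlier: it searches the 2 × l + 1 candidate directions around the last direction for the maximum PDC and steps forward. Only ending conditions (a) and (c) are implemented; the step length, the history-map bookkeeping, and conditions (b) and (d) are simplifications or omissions.

```python
import numpy as np

def trace_one_direction(dpsm, start, ld, l=4, max_segments=500):
    """Trace line segments from start point P_Start with last direction
    index ld (0..63), following the max-PDC direction among ld and its
    2*l neighbors (within +/-22.5 degrees for l = 4)."""
    segments = []
    p = start
    for _ in range(max_segments):
        d = pdc_64(dpsm, p[0], p[1])
        candidates = [(ld + k) % 64 for k in range(-l, l + 1)]
        best = max(candidates, key=lambda i: d[i])
        if d[best] == 0:                      # ending condition (a)
            break
        theta = 2.0 * np.pi * best / 64.0
        # One step along the max-PDC direction; the step length used by
        # the paper's implementation is not specified, so one pixel is assumed.
        q = (int(round(p[0] + np.sin(theta))), int(round(p[1] + np.cos(theta))))
        if not (0 <= q[0] < dpsm.shape[0] and 0 <= q[1] < dpsm.shape[1]):
            break                             # ending condition (c): image boundary
        segments.append((p, q))               # the full method also records this
        p, ld = q, best                       # on the history map
    return segments
```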

C. Centerline Grouping

1) Clustering Grouping: In this paper, clustering grouping means clustering extracted centerlines together according to the Gestalt laws of proximity, good continuation, closure, and symmetry. It can be divided into the two steps of relationship detection and centerline clustering.

a) Relationship detection: Five types of relationships are detected in this paper among centerlines close to each other: intersection, joining, encountering, collinearity, and parallelism, as shown in Fig. 11. As mentioned above, all detected line segments are recorded on a history map. Thus if the present line segment L_q of a new centerline C_n meets another line segment L_p in the history map, centerline C_n also meets L_p's mother centerline C_m. Whether C_n crosses C_m or ends at their cross point is determined by the acute angle A_c between L_p and L_q. In this paper the threshold is set to 2 × 360/64 = 11.25°, corresponding to the 64 PDC. This means that if there are two directions of difference between L_p and L_q, or in other words the angle between them is greater than 11.25°, C_n can cross C_m, and this is a centerline intersection. If there are fewer than two directions of difference between L_p and L_q, or in other words the angle between them is less than 11.25°, C_n ends at their cross point, and this is a centerline joining (see the code sketch at the end of this subsection). If the first or last line segment of a centerline meets another centerline when extended within a set distance d_e, this is called an encountering of curves. If there are two centerlines whose closer endpoints can be connected as a line segment, and this line segment has an angle of less than 5.625° to the two line segments to which the two closer endpoints belong (in other words, the difference in directions is no more than one), then these two centerlines are collinear. If the variance of all shortest distances between each point on one centerline and the other centerline is less than a threshold d_p, then the two centerlines are parallel.

b) Centerline clustering: Using the centerline relationships of intersection, joining, and encountering, centerlines close to each other can be clustered together to delineate a more complex shape, which plays an important role in extracting complex 2D-shapes composed of lines and planes from complex images. Such results will be compared with EDContours and C. Lemaitre's CRD in Section IV, using some images from the W-CS dataset, which is available at http://johel.m.free.fr/curvi/dataset_curvi.rar. Using collinearity, centerlines which are close to each other and collinear can be connected together into a longer centerline. Using parallelism, two parallel centerlines which are close to each other can be merged into one parallel structure. These two relationships are useful when line shapes such as roads and rivers are extracted from aerial images. We will also discuss our results from such an application in Section IV.

2) Retrieval Grouping: If whole or partial prior information about a desired shape is available, centerline retrieval grouping can be utilized. In this paper, this approach is divided into two steps: key centerline locating and centerline retrieval. Key centerline locating selects one or more extracted centerlines which possibly belong to the desired shape as key centerlines, and locates some possible areas for the desired shape. Centerline retrieval then retrieves the centerlines that best match the prior information in the possible areas of the DPSM.
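As a concrete illustration of the angle test in relationship detection (Section III-C.1a), the sketch below quantizes segment orientations into the 5.625° bins of the 64 PDC and applies the two-bin (11.25°) threshold separating an intersection from a joining. The binning convention is our assumption.

```python
import numpy as np

def orientation_index(p, q):
    """Quantize a segment's undirected orientation into 32 bins of
    5.625 degrees, matching the angular resolution of the 64 PDC."""
    ang = np.arctan2(q[0] - p[0], q[1] - p[1]) % np.pi   # fold to [0, pi)
    return int(round(ang / (np.pi / 32))) % 32

def crossing_or_joining(seg_q, seg_p):
    """If L_q and L_p differ by two or more direction bins (> 11.25 deg),
    the new centerline C_n crosses the mother centerline C_m
    (intersection); otherwise C_n ends at the cross point (joining)."""
    di = abs(orientation_index(*seg_q) - orientation_index(*seg_p))
    di = min(di, 32 - di)        # wrap-around distance on the orientation circle
    return "intersection" if di >= 2 else "joining"

# Example: a 45-degree segment against a horizontal one -> "intersection".
print(crossing_or_joining(((0, 0), (5, 5)), ((0, 0), (5, 0))))
```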


Fig. 12. Three conditions for the key centerlines of a right wheel. (a) Initial image. (b) Condition i. (c) Condition ii. (d) Condition iii. (e) Wheel location result.

To illustrate the centerline retrieval grouping method, we take wheel location in the TFDS as an example. Images of wheel parts are the most common images in the TFDS, as Fig. 1 shows. Wheel location is essential for finding the potential fault regions in images of wheel parts. The difficulties come from several aspects. a) One quarter of the wheel is covered by other components. b) The uncovered part of the wheel is not a regular circle arc but an oblate arc, with a large radius above and a small radius below, because the visual angle of the side camera is not perpendicular to the plane of the wheels. c) Different TFDS stations produce images of wheel parts with different sizes, visual angles, and shooting times. d) Abundant nonlinear deformations exist, as Fig. 2 shows.

Our wheel locating strategy first finds centerlines which may belong to a wheel as the key centerlines. A centerline that matches one of the following three conditions may belong to a right wheel; Fig. 12 shows these three conditions. i) A circle arc or partial circle arc with a radius between 200 and 400 pixels. ii) A non-straight line which has an intersection, joining, or encountering relationship with a straight line (corresponding to the rail) in the lower right side of the image. iii) The centerline of a belt-shaped region in the lower right side of the DPSM which may correspond to the boundary of a right wheel. Centerlines are retrieved with the above three conditions consecutively. Almost all images of right wheel parts have one or more such centerlines, even with the severe nonlinear deformations shown in Fig. 2.

After the key centerlines corresponding to the boundaries of a wheel are obtained, all P_End and P_Start points of the line segments in the key centerlines can form many circles with radii between 200 and 400 pixels. For the lower right arc of each circle, the number of discrete points that it passes through in the DPSM is counted. The circle whose arc passes through the greatest number of discrete points is the located wheel. Whether a centerline passes through a discrete point is defined as

$$S(p_k) = \begin{cases} 1, & I(p_k) = 1 \text{ or } C_8(p_k) \ge 2 \\ 0, & C_8(p_k) < 2 \end{cases} \tag{7}$$

where $I(p_k) = 1$ indicates that there is a set pixel corresponding to $p_k$ in the DPSM, and $C_8(p_k) \ge 2$ indicates that there are at least two set pixels in the eight-connected neighborhood of the position corresponding to $p_k$. Both situations are considered to have the curve passing through $p_k$ in the DPSM. This definition avoids explicit point-to-point correspondences, so it is robust to noise and small geometric transformations.
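The circle-scoring step can be sketched as follows: sample points along the lower-right arc of a candidate circle and count, via (7), the discrete points that the arc passes through. The angular range taken as the lower-right arc and the sampling density are our assumptions.

```python
import numpy as np

def arc_score(dpsm, center, radius, theta0=0.0, theta1=np.pi / 2):
    """Count discrete points the lower-right arc passes through, Eq. (7):
    a sample counts if its DPSM pixel is set (I(p_k) = 1) or at least two
    of its 8-neighbors are set (C_8(p_k) >= 2).  In image coordinates
    (y down), theta in [0, pi/2] sweeps the lower-right quadrant."""
    h, w = dpsm.shape
    score = 0
    for theta in np.linspace(theta0, theta1, max(int(radius), 2)):
        y = int(round(center[0] + radius * np.sin(theta)))
        x = int(round(center[1] + radius * np.cos(theta)))
        if not (1 <= y < h - 1 and 1 <= x < w - 1):
            continue
        if dpsm[y, x] == 1:                              # I(p_k) = 1
            score += 1
        elif dpsm[y - 1:y + 2, x - 1:x + 2].sum() >= 2:  # C_8(p_k) >= 2
            score += 1
    return score

# The located wheel is the candidate circle whose arc maximizes the score:
# best = max(circles, key=lambda c: arc_score(dpsm, c[0], c[1]))
```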

Fig. 13. Results of different intermediate-level representations under uneven illumination. (a) Initial image, 450 × 303. (b) Edge map by OpenCV's cvCanny, low threshold = 0, high threshold = 50, 1.86 ms. (c) Edge map by EDPF, 22 edge segments, 2.85 ms. (d) Otsu segmentation map, 0.44 ms. (e) Concavity segmentation map, 4.53 ms. (f) DPSM, 580 discrete points, 8.06 ms.

Fig. 14. Results of different intermediate-level representations with interference from noise and texture. (a) Initial image from the W-CS dataset, 1024 × 685. (b) Edge map by OpenCV’s cvCanny, low threshold = 30, high threshold = 80, 17.42 ms. (c) Edge map by EDPF, 279 edge segments, 41.94 ms. (d) Otsu segmentation map, 1.42 ms. (e) Concavity segmentation map, 25.10 ms. (f) DPSM, 29 228 discrete points, 43.51 ms.

IV. EXPERIMENTS AND RESULTS

In this section, we first compare our DPSMs with Canny edge maps, Otsu segmentation maps, and other intermediate-level representations, to show the advantages of using a DPSM as an intermediate-level representation. Then we compare our clustering grouping results with C. Lemaitre's CRD and EDContours, and describe the results of retrieval grouping versus edge grouping applied to wheel location. As a comprehensive application of these two kinds of centerline grouping, some desired-shape extraction results from different part images of a TFDS are given. At the end of this section, the role of the absolute value sign in (2) is discussed. Our test environment is an Intel Core i5-2540, 2.60 GHz CPU, 4 GB RAM, and VC++ 2010.

A. Different Intermediate-Level Representations

Figs. 13 and 14 compare five intermediate-level representations of images having uneven illumination or interference from noise and texture: edge maps by OpenCV's cvCanny [36], edge maps by EDPF, Otsu segmentation maps [37], concavity segmentation maps [38], and our DPSMs. It can be seen from Figs. 13(f) and 14(f) that in a DPSM, both plane objects' boundaries and line objects are uniformly represented as belt-shaped regions clustered by discrete points. Large-scale structures in the image are enhanced as belt-shaped regions, while detailed noise and textures are inhibited.


Fig. 15. Some results of shape extraction from nature scene images. (a) Initial images. (b) CRD. (c) EDContours. (d) Proposed.

The rough shape composed of lines and planes may be obtained by extracting and grouping centerlines from these belt-shaped regions. A DPSM is thus closer to the impression formed in a human brain and is better for extracting complex shapes composed of lines and planes. At the same time, the Canny operator requires careful adjustment of its high and low thresholds; while EDPF does not, its parallel edges must be handled. Both Canny and EDPF are easily disturbed by noise and texture, and the difficulty of searching for desired edges in these two kinds of maps is much greater than that of extracting centerlines from DPSMs. Otsu segmentation maps confound objects and backgrounds under such conditions, as Figs. 13(d) and 14(d) show. Concavity segmentation maps can separate them, but under uneven illumination they cannot extract objects as accurately as a DPSM, and with interference from noise and texture, the line parts and plane parts must be processed separately to obtain the final shapes, as Figs. 13(e) and 14(e) show.

B. Centerline Clustering Grouping

Fig. 15 shows several results of shape extraction from images of nature scenes using CRD, EDContours, and the centerline clustering grouping proposed in this paper. All these images are 1,024 × 685 color images from the W-CS dataset. CRD works well at extracting line objects [see Fig. 15(b) and (j)] but cannot handle plane objects' boundaries [see Fig. 15(f)], and its processing time of more than 10 s is too long. EDContours can obtain fine contours of objects in 400 ms [see Fig. 15(c), (g), and (k)]; however, in order to obtain shapes which match a human's impression more accurately, responses from noise and textures should be excluded from the contour results, and parallel contours corresponding to line objects should be merged into one line. Centerline clustering grouping clusters the centerlines above a particular length by their relationships of intersection, joining, encountering, collinearity, and parallelism, and obtains the shapes which most accurately match a human's impression in 100 ms. Noise and textures have little effect on the final extracted shapes.

Fig. 16 shows road detection results for a single frame of an aerial video. There is one wider road and three narrower roads. Centerline extraction yields 14 centerlines.

Fig. 16. Road detection from an aerial video. (a) Initial image 720 × 576. (b) Edge map by OpenCV’s cvCanny, low threshold = 0, high threshold = 50, 9.92 ms. (c) Edge map by EDPF, 326 edge segments, 16.86 ms. (d) DPSM, 26.06 ms. (e) Centerline extraction, 14 centerlines, 36.52 ms. (f) Centerline clustering grouping, 37.78 ms.

Fig. 17. Schematic diagram of wheel locating by edge grouping. (a) Initial image. (b) Arc edges belonging to different quadrants. (c) Recalculating center and radii. (d) Wheel location result.

Through centerline clustering grouping, C0, C6, and C9 are combined into one centerline using collinearity, and C1, C2, C3, and C4 are combined into a parallel structure. Fig. 16(b) and (c) also show edge maps by Canny and EDPF, where it can be seen that the wider road is easily extracted through parallel edge analysis, but the edges of the three narrower roads are submerged in the disorderly and unsystematic edge maps. We tested our method on this video, which has a total of 325 frames of 720 × 576 color images, at an average of 35 ms per frame. Prior to centerline grouping, an average of 23 centerlines were extracted per frame. Unless roads are close to an image boundary, almost all roads can be detected.

C. Wheel Location

This section compares wheel location results from centerline retrieval grouping and edge grouping. The algorithm for wheel location by edge grouping is shown in Fig. 17. Edges are extracted first, and arc edges belonging to different quadrants, with the axis as their center, are determined. For example, the red, blue, and green arcs in Fig. 17(b) belong to the 1st, 3rd, and 4th quadrants, respectively. As the wheel is an oblate arc with the larger radius above and the smaller radius below, the wheel region calculated from the arcs of a single quadrant will be too large, as shown by the blue circle in Fig. 17(c), or too small, as shown by the red circle. In order to obtain a better wheel location result, as shown in Fig. 17(d), the center and radii are recalculated according to the normal intersection points from the middles of the three arcs. Edges are extracted by cvCanny with the two threshold values 50 and 100. In order to obtain the desired edges under different illumination and noise conditions with the Canny operator, histogram equalization and Gaussian filtering were applied to the images prior to edge extraction (sketched in code below). The results were checked by humans using the cut sub-images. If the wheel axis was well situated in the middle area of a cut sub-image with a moderate radius, as shown in Fig. 18(a), the location is judged a success. If there is no wheel axis in the sub-image, or the radius is too large or too small, the location is judged an error. If no key centerline is found with centerline retrieval grouping, or no arc edge is found with edge grouping, the location is judged a fail. Examples of successes and errors are shown in Fig. 18.

Comparative experiments between our method and edge grouping were carried out for wheel location without and with noise. The experimental data came from five different TFDS stations, with 300 images each, yielding a total of 1,500 images. When these images were selected, all kinds of geometric transformations and nonlinear deformations were deliberately included.
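For reference, the preprocessing chain of this edge-grouping baseline maps onto standard OpenCV calls as in the sketch below; the Gaussian kernel size and the file name are illustrative assumptions, while the Canny thresholds 50 and 100 are from the text.

```python
import cv2

# Histogram equalization, then Gaussian filtering, then Canny edge
# extraction with thresholds 50/100, as in the edge-grouping baseline.
gray = cv2.imread("wheel_part.jpg", cv2.IMREAD_GRAYSCALE)  # hypothetical file
gray = cv2.equalizeHist(gray)
gray = cv2.GaussianBlur(gray, (5, 5), 0)   # kernel size assumed
edges = cv2.Canny(gray, 50, 100)
```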


Fig. 18. Success and error examples of wheel locating. (a) Success. (b) Error: no wheel axis. (c) Error: too small a radius. (d) Error: too large a radius.

TABLE II. Wheel location results with noise.

Fig. 19. Wheel location results with two types of noise. (a) Random noise, SNR = 16 dB. (b) Result of edge grouping, fail. (c) Centerline extraction. (d) Wheel location result, success. (e) Salt and pepper noise, SNR = 13 dB. (f) Result of edge grouping, fail. (g) Centerline extraction. (h) Wheel location result, success.

When no noise is added to the images, both methods locate wheels with excellent results, as the first two rows of Table II show. The average time used by the method based on edge grouping is 108 ms, while centerline retrieval grouping uses 170 ms. However, when noise is added to the images of wheel parts, the locating ability of centerline retrieval grouping is much stronger than that of edge grouping, as can be seen in the rest of Table II. This is because the method based on edge grouping is very sensitive to noise and cannot be used on noisy images; see Fig. 19(b) and (f) for example. Our method can still successfully locate the wheel when added random noise reduces the SNR to as low as 16 dB, or added salt-and-pepper noise reduces the SNR to as low as 13 dB, as shown in Fig. 19.

D. Results of Desired Shape Extraction from TFDS Images

As explained in the introduction, many complex shape extractions are carried out during fault detection from TFDS images, first for locating potential fault regions, and second for making fault judgments. Freight train inspectors can extract these shapes, locate the potential fault regions, and make judgments immediately, which depends significantly on their extensive experience. This paper endeavors to build a fast and robust shape extraction method whose performance compares favorably with that of a human. Fig. 20(a) shows images of different parts with desired shapes. Fig. 20(b) shows their edge maps from the Canny operator. As the Canny operator can only generate thousands of individual edge pixels with no real relationship or connectivity, complex edge grouping or matching methods are needed to extract desired shapes from these edge maps, and this is often impossible. Fig. 20(c) shows edge segments generated by EDPF, which is capable of producing hundreds of contiguous, one-pixel-wide edge segments, greatly decreasing the difficulty of edge grouping. However, EDPF does not take noise interference into account, so it produces too many (725 to 1,013) edge segments, and the direction of edge continuation may not be in accordance with the desired shape. Fig. 20(d) shows our centerline extraction results after discrete-point sampling, with 29 to 81 extracted centerlines, which is significantly fewer than the number of edge pixels in (b) and edge segments in (c), while retaining the basic shapes of the components in the images.



Fig. 20. Results of desired shape extraction from TFDS images. (a) Initial gray images, 1,400 × 1,024. (b) Edge maps by OpenCV's cvCanny, low threshold = 0, high threshold = 50, thousands of edge pixels, average time 33.48 ms. (c) Edge maps by EDPF, 725–1,023 edge segments, average time 41.18 ms. (d) Centerline extraction, 29–81 centerlines, average time 135.78 ms. (e) Desired shapes by centerline grouping.

Fig. 21. Ideal images that are clear and noise-free should be processed with (2), which has the absolute value sign. (a) Ideal images. (b) DPSM by (4), r = 2, R = 15, ξ = 0.05. (c) Centerline extraction from (b). (d) DPSM by (2), r = 2, R = 4, ξ = 0.05, 10.62 ms. (e) Centerline extraction from (d).

Through the integrated use of the clustering grouping and retrieval grouping discussed previously, the desired shapes can be extracted quickly, as Fig. 20(e) shows, which is very useful for the subsequent locating of potential fault regions and making of fault judgments.

E. Roles of the Absolute Value Sign

All of the above applications were carried out with (4), which lacks the absolute value sign. However, for ideal images that are clear and noise-free, (4) will ignore very narrow line objects [see Fig. 21(b)] and will form two parallel belt-shaped regions along a wider line object's boundaries [see Fig. 21(b), (i), and (n)], which causes difficulties in extracting line object shapes. If (2), with the absolute value sign, is applied to such images, not only is the formation of two parallel belt-shaped regions avoided, as shown in Fig. 21(d), (k), and (p), but a smaller external circle radius R can also be used, which significantly increases the speed of discrete-point sampling. Unfortunately, most images in real applications are not as ideal as those shown in Fig. 21(a), (f), and (m). Effects from noise, blurring, and uneven illumination require that such images be processed with (4), which depresses the discrete-point sampling operator's sensitivity to noise, reduces the number of discrete points, and causes the belt-shaped regions corresponding to different plane objects' boundaries or line objects to separate from each other.
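In terms of the sampling sketch from Section III-A, switching between the two formulas amounts to toggling the use_abs flag of our illustrative function (the flag is our naming; the R values follow Fig. 21):

```python
# Ideal, noise-free images: Eq. (2), with the absolute value sign and a
# smaller external circle radius, as in Fig. 21(d).
dpsm_ideal = discrete_point_sampling(gray, r=2, R=4, xi=0.05, use_abs=True)

# Noisy real-world images (TFDS, aerial): the noise-robust Eq. (4), Fig. 21(b).
dpsm_real = discrete_point_sampling(gray, r=2, R=15, xi=0.05, use_abs=False)
```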

V. CONCLUSION

2D-shape extraction is widely used in many applications of computer vision. Traditional research in this field has focused on plane objects' boundaries and line objects separately, and so cannot be directly used to extract complex 2D-shapes composed of lines and planes. In complex scenes with uneven illumination and interference from noise and textures, traditional intermediate-level representations, such as edge maps and threshold segmentation maps, cannot automatically enhance the main information in the images while ignoring detailed information, which limits their application in such scenes. Edge grouping or shape retrieval requires determining explicit point-to-point edge correspondences in numerous and complicated edge maps, which greatly increases the complexity of the grouping and retrieval algorithms and renders them inapplicable under such conditions.

This paper first uses a discrete-point sampling operator based on Weber's law, which reflects the manner in which human eyes perceive brightness differences, to convert a gray image into a DPSM as an intermediate-level representation. Compared with the typically used threshold segmentation maps and edge maps, DPSMs correspond more closely to the intuitive impression of the initial image in a human brain: they tolerate small geometric transformations (translation, rotation, scaling, etc.) and automatically accentuate the main information while rejecting many details arising from uneven illumination and interference from noise and textures. Plane objects' boundaries and line objects in a DPSM are not filaments or parallel lines depicted by single connected edge pixels, but belt-shaped regions depicted as clusters of discrete points. Such belt-shaped regions can represent the main information of shapes composed of lines and planes while retaining some ability to depict details, which permits easy extraction of centerlines. Through cross-sectional analysis of the different belt-shaped regions, this paper has produced a three-level detection system using feature points, line segments, and centerlines, which can extract centerlines quickly, smoothly, and accurately from these belt-shaped regions. Eight basic types of centerlines are defined by computational geometry algorithms. The relationships between extracted centerlines, namely intersection, joining, encountering, collinearity, and parallelism, are then calculated, and Gestalt laws are used for clustering grouping. If some prior information about a desired shape is available, guided retrieval for certain shapes such as lines, circles, right-angle polygonal lines, and long curves can be carried out in a DPSM.


As centerline retrieval in a DPSM matches centerlines against belt-shaped regions, the complexity of the retrieval algorithm is greatly reduced. The wheel location results under noise and other shape extraction experiments show that our method is highly robust to nonlinear deformations. Other results indicate that our method can also be used for road detection from aerial images and for the extraction of some man-made objects from natural scenes.

The DPSM generated by discrete-point sampling is a rough-outline, intermediate-level representation of the initial image. The belt-shaped regions in a DPSM reduce the complexity of the initial image and make structural features at larger scales more stable, showing a strong tolerance for nonlinear deformation. The drawbacks of a DPSM are that the tolerated rotation and scaling changes cannot exceed the belt-shaped regions, and that it cannot represent elaborate images. It is thus better at describing large-scale plane objects' boundaries or line objects in a noisy environment. Under such conditions, subsequent centerline extraction and retrieval in the belt-shaped regions can be more efficient and more effective than traditional edge grouping.

Centerline extraction as proposed in this paper can be seen as a special case of principal curve [39], [40] extraction under the condition that a discrete point has some continuity in the 64 directions. Principal curves are nonlinear generalizations of the first linear principal component, which can be thought of as an optimal linear 1D summarization of the data. Their emphasis is on finding a self-consistent, smooth, one-dimensional curve that passes through the middle of a multidimensional data set. The theoretical foundation is to seek lower-dimensional, non-Euclidean manifolds embedded in a multidimensional data space. We have begun to test different principal curve extraction algorithms on our DPSMs, in order to allow our centerline extraction to break through the continuity limits of discrete points while retaining its processing speed, which would offer greater flexibility and adaptability for more complex images.

Centerline grouping as proposed in this paper takes full advantage of the traditional Gestalt laws and of the fact that shape matching is effective in a DPSM with belt-shaped regions, and it can extract complex shapes composed of line and plane structures in complex scenes. This method is widely applicable. In the future, we will make further use of our method to describe and extract more complex shapes in TFDS images, and also to explore broader applications such as aerial image analysis and stroke or writing trajectory extraction.

REFERENCES

[1] R. Jakubowski, "Extraction of shape features for syntactic recognition of mechanical parts," IEEE Trans. Syst. Man Cybern., vol. 15, no. 5, pp. 642–651, Sep./Oct. 1985.
[2] M. Liu, B. C. Vemuri, S.-I. Amari, and F. Nielsen, "Shape retrieval using hierarchical total Bregman soft clustering," IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 12, pp. 2407–2419, Dec. 2012.
[3] (2013, Apr. 10). Shape [Online]. Available: http://en.wikipedia.org/wiki/Shape
[4] G. Papari and N. Petkov, "Edge and line oriented contour detection: State of the art," Image Vis. Comput., vol. 29, nos. 2–3, pp. 79–103, Feb. 2011.

REFERENCES

[1] R. Jakubowski, “Extraction of shape features for syntactic recognition of mechanical parts,” IEEE Trans. Syst., Man, Cybern., vol. 15, no. 5, pp. 642–651, Sep./Oct. 1985.
[2] M. Liu, B. C. Vemuri, S.-I. Amari, and F. Nielsen, “Shape retrieval using hierarchical total Bregman soft clustering,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 34, no. 12, pp. 2407–2419, Dec. 2012.
[3] (2013, Apr. 10). Shape [Online]. Available: http://en.wikipedia.org/wiki/Shape
[4] G. Papari and N. Petkov, “Edge and line oriented contour detection: State of the art,” Image Vis. Comput., vol. 29, nos. 2–3, pp. 79–103, Feb. 2011.

[5] A. K. Mishra, P. W. Fieguth, and D. A. Clausi, “Decoupled active contour (DAC) for boundary detection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 33, no. 2, pp. 310–324, Feb. 2011.
[6] L. Liu, D. Zhang, and J. You, “Detecting wide lines using isotropic nonlinear filtering,” IEEE Trans. Image Process., vol. 16, no. 6, pp. 1584–1595, Jun. 2007.
[7] Z. W. Kim, “Robust lane detection and tracking in challenging scenarios,” IEEE Trans. Intell. Transp. Syst., vol. 9, no. 1, pp. 16–26, Mar. 2008.
[8] N. Aggarwal and W. C. Karl, “Line detection in images through regularized Hough transform,” IEEE Trans. Image Process., vol. 15, no. 3, pp. 582–591, Mar. 2006.
[9] Z. H. Liu, D. Y. Xiao, and Y. M. Chen, “Displacement fault detection of bearing weight saddle in TFDS based on Hough transform and symmetry validation,” in Proc. 9th Int. Conf. FSKD, May 2012, pp. 1404–1408.
[10] X. D. Yang, L. J. Ye, and J. B. Yuan, “Research of computer vision fault recognition algorithm of center plate bolts of train,” in Proc. 1st Int. Conf. Instrum., Meas., Comput., Commun. Control, Oct. 2011, pp. 978–981.
[11] J. W. Wang, X. Bai, X. You, W. Liu, and L. J. Latecki, “Shape matching and classification using height functions,” Pattern Recognit. Lett., vol. 33, no. 2, pp. 134–143, Jan. 2012.
[12] J. Canny, “A computational approach to edge detection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 8, no. 6, pp. 679–698, Nov. 1986.
[13] M. Ursino and G. E. La Cara, “A model of contextual interactions and contour detection in primary visual cortex,” Neural Netw., vol. 17, nos. 5–6, pp. 719–735, Jun./Jul. 2004.
[14] P. L. Shui and W. C. Zhang, “Noise-robust edge detector combining isotropic and anisotropic Gaussian kernels,” Pattern Recognit., vol. 45, no. 2, pp. 806–820, Feb. 2012.
[15] T. Lindeberg, “Edge detection and ridge detection with automatic scale selection,” Int. J. Comput. Vis., vol. 30, no. 2, pp. 117–156, 1998.
[16] S. Yang and C. Xu, “Contour grouping: Focusing on image patches around edges,” in Interactive Technologies and Sociotechnical Systems (Lecture Notes in Computer Science). New York, NY, USA: Springer-Verlag, Oct. 2006, pp. 138–146.
[17] T. B. Nguyen and D. Ziou, “Contextual and non-contextual performance evaluation of edge detectors,” Pattern Recognit. Lett., vol. 21, no. 9, pp. 805–816, Aug. 2000.
[18] C. Topal, C. Akinlar, and Y. Genç, “Edge drawing: A heuristic approach to robust real-time edge detection,” in Proc. 20th ICPR, Aug. 2010, pp. 2424–2427.
[19] C. Akinlar and C. Topal, “EDPF: A real-time parameter-free edge segment detector with a false detection control,” Int. J. Pattern Recognit. Artif. Intell., vol. 26, no. 1, pp. 862–872, Feb. 2012.
[20] C. Akinlar and C. Topal, “EDContours: High-speed parameter-free contour detector using EDPF,” in Proc. IEEE ISM, Dec. 2012, pp. 153–156.
[21] J. Freixenet, X. Muñoz, D. Raba, J. Martí, and X. Cufí, “Yet another survey on image segmentation: Region and boundary information integration,” in Lecture Notes in Computer Science. New York, NY, USA: Springer-Verlag, May 2002, pp. 21–25.
[22] J. S. Stahl and S. Wang, “Edge grouping combining boundary and region information,” IEEE Trans. Image Process., vol. 16, no. 10, pp. 2590–2606, Oct. 2007.
[23] Y. Zheng, H. Li, and D. Doermann, “A parallel-line detection algorithm based on HMM decoding,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 5, pp. 777–792, May 2005.
[24] C. Steger, “An unbiased detector of curvilinear structures,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 20, no. 2, pp. 113–125, Feb. 1998.
[25] D. Eberly, R. Gardner, B. Morse, S. Pizer, and C. Scharlach, “Ridges for image analysis,” J. Math. Imaging Vis., vol. 4, no. 4, pp. 353–373, Dec. 1994.
[26] C. Lemaitre, M. Perdoch, A. Rahmoune, J. Matas, and J. Miteran, “Detection and matching of curvilinear structures,” Pattern Recognit., vol. 44, no. 7, pp. 1514–1527, Jul. 2011.
[27] S. X. Li, H. X. Chang, and C. F. Zhu, “Fast curvilinear structure extraction and delineation using density estimation,” Comput. Vis. Image Understand., vol. 113, no. 6, pp. 763–775, Jun. 2009.
[28] G. Papari and N. Petkov, “Adaptive pseudo dilation for Gestalt edge grouping and contour detection,” IEEE Trans. Image Process., vol. 17, no. 10, pp. 1950–1962, Oct. 2008.
[29] V. Ferrari, L. Fevrier, F. Jurie, and C. Schmid, “Groups of adjacent contour segments for object detection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 1, pp. 36–51, Jan. 2008.



[30] Z. X. Zhu, G. Y. Wang, and J. G. Liu, “Object detection based on multiscale discrete points sampling and grouping,” in Proc. 6th Int. Symp. Multispectral Image Process. Pattern Recognit., Oct. 2009.
[31] P. Carbonetto, N. de Freitas, and K. Barnard, “A statistical model for general contextual object recognition,” in Proc. Comput. Vis. ECCV, May 2004, pp. 350–362.
[32] L. Fei-Fei and P. Perona, “A Bayesian hierarchical model for learning natural scene categories,” in Proc. IEEE Comput. Soc. Conf. CVPR, vol. 2, Jun. 2005, pp. 524–531.
[33] (2013, Apr. 5). Just-Noticeable Difference [Online]. Available: http://en.wikipedia.org/wiki/JND
[34] J. H. Shen, “On the foundations of vision modeling I. Weber’s law and Weberized TV restoration,” Phys. D, Nonlinear Phenomena, vol. 175, nos. 3–4, pp. 241–251, Feb. 2003.
[35] Y. J. Zhang, “Two new concepts for N16 space,” IEEE Signal Process. Lett., vol. 6, no. 9, pp. 221–223, Sep. 1999.
[36] OpenCV Dev. Team. (2013, Mar. 1). Feature Detection. OpenCV 2.4.4.0 Documentation [Online]. Available: http://docs.opencv.org/modules/imgproc/doc/feature_detection.html
[37] N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Trans. Syst., Man, Cybern., vol. 9, no. 1, pp. 62–66, Jan. 1979.
[38] A. Rosenfeld and P. de la Torre, “Histogram concavity analysis as an aid in threshold selection,” IEEE Trans. Syst., Man, Cybern., vol. 13, no. 3, pp. 231–235, Mar. 1983.
[39] E. Ataer-Cansizoglu, E. Bas, J. Kalpathy-Cramer, G. C. Sharp, and D. Erdogmus, “Contour-based shape representation using principal curves,” Pattern Recognit., vol. 46, no. 4, pp. 1140–1150, Apr. 2013.
[40] H. N. Wang and T. C. M. Lee, “Extraction of curvilinear features from noisy point patterns using principal curves,” Pattern Recognit. Lett., vol. 29, no. 16, pp. 2078–2084, Dec. 2008.

Zongxiao Zhu (S’08) received the B.S. and M.S. degrees in electrical and electronic engineering from Xi’an Jiaotong University, Xi’an, China, in 2000 and 2003, respectively. He is currently pursuing the Ph.D. degree in control science and engineering with the Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan, China. He was an Engineer with Suzhou Shihlin Electric & Engineering Co., Suzhou, China, from 2003 to 2004, where he was in charge of designing small power converters (on the market in 2004). Since 2004, he has been a faculty member of the College of Computer Science, South-Central University for Nationalities (SCUN), Wuhan. In 2007, he founded the Information Processing Laboratory for Minority Language, SCUN, and began to manage a multidisciplinary research team aimed at using information technology to salvage, protect, and disseminate endangered minority cultures. His current research interests include image processing, object detection, and the protection of endangered minority cultures with information technology. Mr. Zhu is a member of the ACM and the Chinese Computer Federation.

Guoyou Wang received the B.S. degree in electronic engineering and the M.S. degree in pattern recognition and intelligent systems from the Huazhong University of Science and Technology, Wuhan, China, in 1988 and 1992, respectively. He is currently a Professor with the Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology. His current research interests include image processing, image compression, pattern recognition, artificial intelligence, and machine learning.

Jianguo Liu (M’95) received the B.S. degree in science from the Wuhan University of Technology, Wuhan, China, in 1982, the M.S. degree in engineering from the Huazhong University of Science and Technology, Wuhan, in 1984, and the Ph.D. degree from the University of Hong Kong, Hong Kong, in 1996. He is currently a Professor with the Institute for Pattern Recognition and Artificial Intelligence, Huazhong University of Science and Technology, Wuhan. His current research interests include pattern recognition, image processing, fast discrete transforms, and neural networks.

Zhong Chen received the M.S. degree in physical electronics from the Huazhong University of Science and Technology, Wuhan, China, in 2003, and the Ph.D. degree from the Institute of Remote Sensing Applications, Chinese Academy of Sciences, Beijing, China, in 2006. From 2006 to 2008, he was a Post-Doctoral Fellow with the Huazhong University of Science and Technology, Wuhan, China. He is currently an Associate Professor with the Huazhong University of Science and Technology. His current research interests include image analysis, remote sensing image processing, and target recognition.
