Journal of Intelligent Manufacturing (1997) 8, 167–175
A new approach for automated parts recognition using time series analysis and neural networks

YOUNG-HAE LEE,1* SEUNG KI MOON1 and HONG CHUL LEE2
1Department of Industrial Engineering, Hanyang University, Seoul 133-791, Korea
2Department of Industrial Engineering, Korea University, Seoul 136-701, Korea
Received September 1995 and accepted May 1996
This paper presents a new approach for automated parts recognition. It is based on the use of the signature and autocorrelation functions for feature extraction and a neural network for the analysis of recognition. The signature represents the shapes of boundaries detected in digitized binary images of the parts. The autocorrelation coefficients computed from the signature are invariant to transformations such as scaling, translation and rotation of the parts. These unique extracted features are fed to the neural network. A multilayer perceptron with two hidden layers, along with a backpropagation learning algorithm, is used as a pattern classifier. In addition, the position information of the part for a robot with a vision system is described to permit grasping and pick-up. Experimental results indicate that the proposed approach is appropriate for the accurate and fast recognition and inspection of parts in automated manufacturing systems.

Keywords: Pattern recognition, neural networks, time series, feature extraction
1. Introduction

Computer vision systems are currently being introduced in industrial settings to provide machines with the ability to see and understand what they see. Industrial analysts predict that the increasing application of this technology to manufacturing will profoundly influence the factory of the future. Computer vision systems will be used to perform industrial inspection and quality control tasks, and to provide vision for industrial robots (Gonzales and Safabaksh, 1982). The application of computer vision systems to quality control processes (Fu, 1982) would enable flawed parts to be detected prior to assembly, rather than testing the final product and searching for the cause of a malfunction. Computer vision inspection systems have the potential to be faster and more reliable than human inspectors. Increases in the resolution of computer vision systems will allow them to detect automatically things that humans cannot see. A robot with vision capabilities could adjust to its changing environment, and could be trained to perform a
*Author to whom all correspondence should be addressed.
variety of tasks. Vision would allow robots to correct their positions after movement, and to avoid obstacles during movement. Robots with vision recognize parts, determine part position and orientation for acquisition, and select, count, sort and reorient parts. After a part has been recognized, bolt-holes and other part features could be detected. In the future it is likely that the assembly of parts into single units will be a common task for robots. Robot vision systems would then be used primarily for recognizing and handling parts on an assembly line, and these systems would require techniques that could recognize complicated objects with a high degree of accuracy (Haralick and Shapiro, 1992).

The recognition of objects from digitized images is one of the most important tasks of computer vision systems. Vision systems are increasingly being considered and implemented as an alternative to manual inspection and manual gauging in manufacturing systems. Computer vision systems can effectively inspect industrial parts. These parts must be identified before they are measured and checked against process or specification databases. If a part is translated, rotated or scaled differently, it may be recognized incorrectly. A related difficulty is that the recognition process is greatly facilitated when the part is in a known position and orientation, yet parts in the factory are typically neither positioned nor oriented correctly. For this reason, an effective and fast new procedure, useful for pattern and position recognition of parts that vary in orientation and location, is needed.

Many published papers have dealt with problems related to the utilization of a computer vision system for recognition and inspection tasks. When the part image is obtained, a common approach uses piecewise linear approximation to decompose a boundary into polylines (Ventura and Dung, 1993). Other approaches use compound curves to represent the object boundary (Mokhtarian and Mackworth, 1986). These approximation methods are mainly for the purposes of object recognition or data reduction rather than for the measurement of geometric features. A general methodology for digitized straight edges over all orientations, based on minimizing the maximum estimation error, has been suggested by Koplowitz and Bruckstein (1989). Chang et al. (1991) utilized least-squares regression models to fit edge lines for measurement purposes. They also proposed a more precise break-point identification method for automated part measurement. Hsieh and Chang (1994) proposed a system in which parts are identified by a Bayesian classifier using normalized Fourier descriptors; the pointwise distances between the scanned object and its standard data are then used to infer the quality of the scanned part. Dubois and Glanz (1986) developed a method to classify objects based on the use of autoregressive model parameters, which represent the shapes of boundaries detected in digitized binary images of the objects.

In this paper, a new procedure for automated parts recognition is presented. First, for the accurate and rapid recognition of parts, features that do not depend on the size and location of the part image are extracted, based on the signature and the autocorrelation coefficients. Then a neural network is applied to the part recognition. The information used in the recognition process can also be used to obtain the position information of the part for robot applications. Section 2 describes the preprocessing and a new approach to feature extraction from part images. Section 3 provides a brief description of neural networks and their application in this study. Section 4 describes the methods used to obtain the position information from the image of a part. Section 5 presents the experimental results. Finally, conclusions are given in Section 6.
2. Feature extraction

In modern manufacturing systems using computer vision, the images of industrial parts delivered by the material handling systems are sometimes translated, rotated or scaled differently. Under these conditions, the correct recognition of the pattern of a two-dimensional part image is an important task in computer vision. Therefore, to obtain invariant variables, it is necessary to extract pattern information that uniquely characterizes the part. This process is called feature extraction. Figure 1 shows the feature extraction process. The following subsections describe each step in detail.

Fig. 1. Block diagram of the feature extraction process.

2.1. Image segmentation

Image segmentation is the process that breaks up a sensed scene into its constituent parts or objects. In segmentation, the goal is to extract information from the digitized image. One of the techniques used in segmentation is thresholding, a binary conversion technique in which each pixel of a grey-scale image is converted into a binary value, either black or white. Thresholding distinguishes pixels that have higher grey values from pixels that have lower grey values: pixels whose grey values are high enough are given the binary value 1, and the remaining pixels are given the binary value 0. Hence, if F(x, y) is the grey level at coordinates (x, y) in the image, and T is the thresholding value, then the image represented by F(x, y) is binarized by applying the rule

$$B(x, y) = \begin{cases} 1 & \text{if } F(x, y) > T \\ 0 & \text{if } F(x, y) \le T \end{cases}$$

where B(x, y) denotes the binary version of F(x, y). The value of T is determined by plotting the frequency histogram of the image.
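To make the rule concrete, here is a minimal Python/NumPy sketch (illustrative only: the paper's implementation was written in C++ and is not reproduced here). The function names are ours, and the peak-based choice of T is a simple stand-in for the manual histogram inspection described above.

```python
import numpy as np

def binarize(F: np.ndarray, T: int) -> np.ndarray:
    """Apply the thresholding rule: B(x, y) = 1 if F(x, y) > T, else 0."""
    return (F > T).astype(np.uint8)

def histogram_threshold(F: np.ndarray) -> int:
    """Pick T from the frequency histogram of grey levels.
    Illustrative heuristic: midpoint between the two most populated bins,
    standing in for choosing the valley between object and background modes."""
    hist, _ = np.histogram(F, bins=256, range=(0, 256))
    peaks = np.sort(np.argsort(hist)[-2:])  # the two most frequent grey levels
    return int(peaks.mean())

# Example with a synthetic stand-in for a captured 512 x 512 grey-scale image:
F = np.random.randint(0, 256, (512, 512))
B = binarize(F, histogram_threshold(F))
```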
Fig. 2. Directions for 4-directional and 8-directional chain code.
2.2. Chain codes

In an image processing system, the extraction of features from an object for the purpose of recognition is a very important element. A proper description of the object's boundary is also necessary for the analysis of its pattern. As most representations are based on the approximation of the boundary by linear segments, chain codes are used to represent a boundary by a connected sequence of straight-line segments of specified length and direction. Typically, this representation is based on the 4- or 8-connectivity of the segments, where the direction of each segment is coded using a numbering scheme, as shown in Fig. 2. Figure 3 is an illustration of a 4-connectivity chain code: the boundary of the object is represented by a sequence of numerical codes. Because a chain code is very sensitive to noise, a small change of size or rotation may cause many problems. However, once the boundary of the input object has been obtained by a chain code, its properties, such as size and shape, can be computed. Using the obtained coordinates of the object boundary, the centroid G(x̄, ȳ) of the object can be computed as

$$G(\bar{x}, \bar{y}) = \left( \frac{1}{n} \sum_{i=0}^{n-1} x_i, \; \frac{1}{n} \sum_{i=0}^{n-1} y_i \right)$$

where n is the total number of points belonging to the region enclosed by the boundary, and (x_i, y_i) are the coordinates of the ith point of the region. The coordinates of the boundary and centroid are used to obtain the radius vectors, which in turn serve as input for the computation of the autocorrelation coefficients.

Fig. 3. Boundary representation by 4-directional chain code.

2.3. Signature

A signature is a one-dimensional functional representation of a boundary (Gonzales and Wintz, 1987). One of the simplest representations is to plot the distance from the centroid to the boundary as a function of angle, as illustrated in Figs 4 and 5. In order to construct a part shape, parameters of the part pattern are estimated from a data set of boundary samples. The boundary is approximated by an ordered sequence of n angularly equi-spaced radius vectors projected between the boundary centroid and the boundary, as shown in Fig. 4. The boundary approximation can be improved by increasing the number of radius vector projections. The length of the radius vector, r, is a function of the angle of projection t (t = 0, 1, 2, ..., n), and the function r(t) forms a one-dimensional boundary approximation. As Fig. 5 shows, the ordered set of numbers, r, can also be expressed as a 'time series' r(t), with the parameter t describing the position of a radius vector in equiangular increments from the starting point (Kashyap and Chellappa, 1981).

Fig. 4. Boundary approximation.

Fig. 5. Plot of r(t) versus t for the shape in Fig. 4.
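As an illustration of the angular signature of Fig. 4, the sketch below computes the centroid and n equi-spaced radius vectors, assuming the boundary pixel coordinates have already been extracted (e.g. by following the chain code of Section 2.2). The nearest-angle sampling is a simplification for illustration, not the authors' procedure (their modification follows below).

```python
import numpy as np

def centroid(points: np.ndarray) -> np.ndarray:
    """Centroid G(x_bar, y_bar): the mean of the n points (x_i, y_i)."""
    return points.mean(axis=0)

def angular_signature(boundary: np.ndarray, n_vectors: int = 360) -> np.ndarray:
    """r(t): lengths of n angularly equi-spaced radius vectors from the
    centroid to the boundary, approximated by nearest-angle lookup."""
    g = centroid(boundary)
    d = boundary - g
    angles = np.arctan2(d[:, 1], d[:, 0])  # angle of each boundary point
    radii = np.hypot(d[:, 0], d[:, 1])     # distance of each point to centroid
    targets = np.linspace(-np.pi, np.pi, n_vectors, endpoint=False)
    # nearest boundary angle for each projection angle, with wraparound at +/- pi
    diff = (angles[None, :] - targets[:, None] + np.pi) % (2 * np.pi) - np.pi
    return radii[np.abs(diff).argmin(axis=1)]
```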
In this study, a new procedure for computing the radius vector is proposed. The radius vectors are the distances from the points of the boundary to the centroid point. The positions of the boundary points are obtained by moving the point along the boundary of the part image. By making the length Δt moved along the boundary consistent with the scale of the chain code, the signature can be computed directly from a concave boundary pattern as well as a convex one. Figure 6 illustrates the extraction of radius vectors based on the proposed concept. Because the generated signatures are obviously dependent on size and starting point, one more transformation is required. For this reason, the signature information is transformed into autocorrelation coefficients.

Fig. 6. The extraction of radius vectors.

2.4. Autocorrelation coefficient

Autocorrelation coefficients (Makridakis and Wheelwright, 1978) provide important information about the pattern in time series data. The formula for the autocorrelation coefficient of time lag k is

$$q_k = \frac{\sum_{t=1}^{n-k} (r_t - \bar{r})(r_{t+k} - \bar{r})}{\sum_{t=1}^{n} (r_t - \bar{r})^2} \qquad (1)$$

where q_k is the autocorrelation coefficient, k = 1, 2, ..., n is the length of the time lag, n is the number of observations, r_t is the value of the variable at time t, and r̄ is the mean of all the data. Using Equation 1, the invariant feature vector is obtained from the radius vectors. Figure 7 is an illustration of an autocorrelation graph. The characteristic of the autocorrelation coefficients as features is that the same part has a unique and identical feature vector regardless of its size, position and orientation. In addition, this compact feature vector is appropriate as the input pattern of a neural network, because the number of input nodes (i.e. the length of the time lag) is not changed by the patterns of the parts. The reasons why the autocorrelation coefficient values are invariant to translation, rotation and scaling are as follows. First, the data of the signature are invariant to translation. Second, for rotation, the radius vectors form a periodic time series. Finally, for scaling, because the autocorrelation coefficients are computed by time lag, the coefficient values under uniform scale transformations should be the same; moreover, based on the radius vector, one can adjust the scale of the chain code in proportion to the scaling of the object.

Fig. 7. Autocorrelation coefficient value with time lag k.
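Equation 1 translates directly into code. The sketch below assumes NumPy and a signature array r; with max_lag = 100 it produces the 100-element feature vector that is later fed to the network.

```python
import numpy as np

def autocorrelation(r: np.ndarray, max_lag: int = 100) -> np.ndarray:
    """q_k for k = 1..max_lag (Equation 1):
    q_k = sum_{t=1}^{n-k} (r_t - r_mean)(r_{t+k} - r_mean) / sum_t (r_t - r_mean)^2
    """
    n = len(r)
    dev = r - r.mean()
    denom = np.sum(dev ** 2)
    return np.array([np.sum(dev[:n - k] * dev[k:]) / denom
                     for k in range(1, max_lag + 1)])
```

Because each q_k depends only on deviations from the mean at a fixed lag, the resulting vector is unchanged (up to sampling effects) when the same boundary is translated, rotated or uniformly scaled, which is the invariance property exploited above.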
3. Pattern recognition

It is apparent that a neural network derives its computing power through, first, its massively parallel distributed structure and, second, its ability to learn from experience. Furthermore, the generalization capability of neural networks means that they are able to produce reasonable outputs for inputs not encountered during training (learning). These information-processing capabilities make it possible for neural networks to solve complex (large-scale) problems that are currently intractable (Haykin, 1994). Therefore neural networks can be used as pattern classifiers. Neural network classifiers are non-parametric, and make weaker assumptions concerning the shapes of the underlying distributions than traditional statistical classifiers. They may thus prove more robust when the distributions are generated by non-linear processes. In particular, the patterns of autocorrelation coefficients vary widely, and producing deterministic decision-making rules for pattern recognition may not be possible. Hence a backpropagation neural network is utilized for non-linear pattern classification in a supervised manner.

3.1. Basic operations of a neuron

A neuron is an information-processing unit that is fundamental to the operation of a neural network. Figure 8 shows the model of a neuron. The three basic operations of the neuron shown in Fig. 8 are as follows: (1) a set of synapses, or connecting links, each of which is characterized by a weight or strength of its own; specifically, a signal x_j at the input of synapse j connected to neuron k is multiplied by the synaptic weight w_kj; (2) an adder for summing the input signals, weighted by the respective synapses of the neuron; and (3) an activation function for limiting the amplitude of the output of the neuron.
Fig. 8. Non-linear operations of a neuron.
Fig. 9. Sigmoid function.
Here, the activation function, denoted by φ(·), defines the output of a neuron in terms of the activity level at its input. There are three common types of activation function: threshold, piecewise-linear and sigmoid. The sigmoid function shown in Fig. 9 is by far the most common form of activation function used in the construction of artificial neural networks.
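A minimal sketch of the neuron of Fig. 8 with a sigmoid activation may help fix the three operations in mind; the names are ours, and the threshold enters as it does in Equation 3 of the next subsection.

```python
import numpy as np

def sigmoid(s: float) -> float:
    """Sigmoid activation: smooth, monotonic, output limited to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-s))

def neuron(x: np.ndarray, w: np.ndarray, theta: float) -> float:
    """Synapses (weights w), adder (dot product), activation (sigmoid)."""
    return sigmoid(float(np.dot(w, x)) - theta)
```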
Fig. 10. Architectural graph of a BP network with two hidden layers.
3.2. Backpropagation algorithm

A backpropagation (BP) network is a feedforward network with one or more layers of nodes between the input and output nodes. These additional layers contain hidden units, or nodes, that are not directly connected to either the input or the output nodes. A four-layer BP network with two layers of hidden units is shown in Fig. 10. This type of multilayer perceptron overcomes many of the limitations of single-layer perceptrons. The backpropagation algorithm uses a gradient search technique to minimize the mean square error between the actual output and the desired output based on the feedforward computation. The network is trained by initially selecting small random weights and internal thresholds and then presenting all the training data repeatedly. Weights are adjusted after every trial, using side information specifying the correct class, until the weights converge and the cost function is reduced to an acceptable value. An essential component of the algorithm is the iterative method that propagates the error terms required to adapt the weights back from the nodes in the output layer to the nodes in the lower layers. The steps of the BP algorithm are as follows.

Step 1. Initialization. Start with a reasonable network configuration, and set all the synaptic weights (w_ij) and threshold (θ) levels of the network to small, uniformly distributed random numbers.

Step 2. Presentation of training examples. Present the network with an epoch of training examples. For each example in the set, ordered in some fashion, perform the following sequence of forward and backward computations.
Step 3. Forward computation. Let a training example in the epoch be denoted by [x_n, d_n], with the input vector x_n applied to the input layer and the desired response vector d_n presented to the output layer of computation nodes. Compute the activation potentials and function signals of the network by proceeding forward through the network, layer by layer. The net internal activity level S_j and the function signal y_j of neuron j are

$$S_j = \sum_i w_{ji} x_i \qquad (2)$$

$$y_j = \frac{1}{1 + \exp\left(-(S_j - \theta_j)\right)} \qquad (3)$$

where x_i is the function signal of neuron i in the previous layer, w_ji is the synaptic weight connecting neuron i to neuron j, and θ_j is the threshold applied to neuron j. If neuron j is in the output layer, set y_j = o_j. Hence compute the error signal

$$e_j = d_j - o_j \qquad (4)$$
where d_j is the jth element of the desired response vector d_n.

Step 4. Backward computation. Compute the δs (i.e. the local gradients) of the network by proceeding backward, layer by layer:

$$\delta_j = e_j o_j (1 - o_j) \quad \text{for neuron } j \text{ in the output layer} \qquad (5)$$

$$\delta_j = y_j (1 - y_j) \sum_k \delta_k w_{kj} \quad \text{for neuron } j \text{ in a hidden layer} \qquad (6)$$

Hence adjust the synaptic weights of the network in layer l according to the generalized delta rule

$$w_{ji}(t+1) = w_{ji}(t) + \alpha \left[ w_{ji}(t) - w_{ji}(t-1) \right] + \eta \delta_j y_j \qquad (7)$$

where η is the learning-rate parameter and α is the momentum constant, which smooths out the weight changes and makes the network converge faster.
Step 5. Iteration. Iterate the computation by presenting new epochs of training examples to the network until the free parameters of the network stabilize and the average squared error E_av, computed over the entire training set, is at a minimum or an acceptably small value. The order of presentation of the training examples should be randomized from epoch to epoch. The momentum and learning-rate parameters are typically adjusted (and usually decreased) as the number of training iterations increases. The instantaneous sum of squared errors of the network at iteration n is written as

$$E(n) = \frac{1}{2} \sum_{j \in C} e_j^2(n) \qquad (8)$$

where the set C includes all the neurons in the output layer of the network. Let N denote the total number of patterns. The average squared error is obtained by summing E(n) over all n and then normalizing with respect to the set size N:

$$E_{av} = \frac{1}{N} \sum_{n=1}^{N} E(n) \qquad (9)$$

The instantaneous sum of error squares E(n), and therefore the average squared error E_av, is a function of all the free parameters (i.e. synaptic weights and thresholds) of the network. The objective of the learning process is to adjust the free parameters of the network so as to minimize E_av.

Step 6. Output representation and decision rule. In theory, for an m-class classification problem in which the union of the m distinct classes forms the entire input space, a total of m outputs is required to represent all possible classification decisions. Let d_j be the desired (target) output pattern for the prototype input vector x_j. Then

$$d_{kj} = \begin{cases} 1 & \text{when the prototype } x_j \text{ belongs to class } C_k \\ 0 & \text{when the prototype } x_j \text{ does not belong to class } C_k \end{cases}$$

Based on this notation, the output is represented by an m-dimensional binary vector such as (0, ..., 1, ..., 0), in which the decision choice C_k has the value 1.
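The six steps above condense into a short training sketch. This is an illustration under stated assumptions, not the authors' C++ program: a single hidden layer is used for brevity (the paper's network has two), and η = 0.2, α = 0.05 and the 0.005 tolerance follow the experimental settings reported in Section 5 and Table 1.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def train_bp(X, D, n_hidden=8, eta=0.2, alpha=0.05, tol=0.005, max_epochs=10000):
    """Backpropagation with the generalized delta rule (Equation 7).
    X: (N, n_in) input patterns; D: (N, m) binary target vectors (Step 6)."""
    n_in, m = X.shape[1], D.shape[1]
    # Step 1: small random weights; the last column of each matrix holds the
    # threshold, driven by a constant -1 appended to each input (Equation 3).
    W1 = rng.uniform(-0.5, 0.5, (n_hidden, n_in + 1))
    W2 = rng.uniform(-0.5, 0.5, (m, n_hidden + 1))
    dW1_prev, dW2_prev = np.zeros_like(W1), np.zeros_like(W2)
    for epoch in range(max_epochs):
        E_av = 0.0
        for i in rng.permutation(len(X)):        # Step 2: randomized epoch order
            x = np.append(X[i], -1.0)
            y = sigmoid(W1 @ x)                  # Step 3: forward (Equations 2-3)
            yb = np.append(y, -1.0)
            o = sigmoid(W2 @ yb)
            e = D[i] - o                         # Equation 4
            E_av += 0.5 * np.sum(e ** 2)         # Equation 8
            delta_o = e * o * (1.0 - o)          # Step 4: Equation 5
            delta_h = y * (1.0 - y) * (W2[:, :-1].T @ delta_o)    # Equation 6
            dW2 = eta * np.outer(delta_o, yb) + alpha * dW2_prev  # Equation 7
            dW1 = eta * np.outer(delta_h, x) + alpha * dW1_prev
            W2 += dW2
            W1 += dW1
            dW1_prev, dW2_prev = dW1, dW2
        E_av /= len(X)                           # Equation 9
        if E_av < tol:                           # Step 5: convergence criterion
            return W1, W2, epoch + 1
    return W1, W2, max_epochs
```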
4. Position information extraction

The position information is obtained from the origin coordinate and the centroid coordinate of the part image, as shown in Fig. 11. The position information is the distance from the origin to the centroid coordinate of the part image. In the figure, G(x, y) denotes the centroid coordinate of the part and O(0, 0) denotes the origin coordinate. The centroid coordinate is obtained from the chain code. The distance d from the origin coordinate to the centroid coordinate can be computed as

$$d = \sqrt{x^2 + y^2} \qquad (10)$$

Fig. 11. Position information.

In the case of robot applications in material handling, information on the rotated angle of a part should also be fed to the robot, in order to determine the proper configuration of the robot gripper. Information on the rotated angle is obtained by using the radius vectors of the signature. Figure 12 illustrates the method of rotated-angle extraction. The rotated angle θ can be computed using the equation

$$\theta = \tan^{-1} \left( \frac{y_g - y}{x_g - x} \right) \qquad (11)$$

where P(x, y) denotes the boundary point of the smallest radius vector and G(x_g, y_g) denotes the centroid coordinate.

Fig. 12. Information on the rotated angle.
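Equations 10 and 11 amount to only a few lines. The sketch below assumes NumPy, the centroid from Section 2.2 and an array of boundary points; arctan2 is used in place of a bare tan⁻¹ so that θ lands in the correct quadrant.

```python
import numpy as np

def position_and_angle(g: np.ndarray, boundary: np.ndarray):
    """d: distance from the origin O(0, 0) to the centroid G (Equation 10).
    theta: rotated angle from the smallest radius vector P (Equation 11)."""
    gx, gy = g
    d = np.hypot(gx, gy)                      # d = sqrt(x^2 + y^2)
    radii = np.hypot(boundary[:, 0] - gx, boundary[:, 1] - gy)
    px, py = boundary[np.argmin(radii)]       # P(x, y): smallest radius vector
    theta = np.arctan2(gy - py, gx - px)      # tan(theta) = (y_g - y)/(x_g - x)
    return d, theta
```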
5. Experimental results

The effectiveness of this approach is demonstrated with a vision system consisting of an RS-170 monochrome camera (Cosmicar/Pentax), a frame grabber card and ROBOT VISIONpro 2.2 (Eshed Robotec Ltd) software installed in an IBM PC. The overall conceptual approach for the recognition process is shown in Fig. 13. The image of the part is obtained using the vision system. Next, the image data are preprocessed by smoothing and segmentation. Then the invariant variables (features) are extracted through the signature and autocorrelation coefficient computations. Finally, these data are provided to the neural network for pattern recognition. The computer program was written and compiled in C++.
Fig. 13. Block diagram of part pattern recognition.
Image sample patterns for the experiment are shown in Fig. 14. The scale of the image is 512 × 512 and the grey scale of the image is 0–255. Illustrations of the signature, using the chain code to extract the radius vectors, and of the corresponding autocorrelation coefficient values are presented in Figs 15(a–d) and 16(a–d) respectively. In the figures, the scale of the chain code used is one pixel, and the time lag for the autocorrelation coefficient is 100. The autocorrelation coefficient values do not vary with translation, rotation or scaling of a part. These values are fed to the neural network for the training procedure. Once the network had been trained under several experimental conditions, the same parts, translated, rotated and scaled differently, were tested. The trained neural network can identify 15 different parts with various translations, rotations and scalings with 100% accuracy in real time. In order to determine the optimum number of hidden layers and the number of nodes in each hidden layer, several experiments varying the number of hidden layers and nodes were conducted. In the network construction, the number of input nodes is the same as the time lag (100), and the number of output nodes is the number of parts to be recognized (15). The track of the system error expressed in Equation 9 versus the number of iterations is defined as the learning curve, which represents the learning performance of the network. Network convergence is determined by a learning tolerance (total error) of 0.005: that is, if E_av is less than 0.005, then the network is trained. The learning curves and iteration counts for several numbers of nodes in the first and second hidden layers are shown in Fig. 17 and Table 1. As can be seen, the learning performance improves as the number of nodes in the hidden layers increases.
Fig. 14. The binary part images.
6. Conclusions

Obtaining characteristic or invariant information from the images of objects is a very important task in pattern recognition using computer vision systems. In this paper, the signature and autocorrelation functions are used to extract unique features from the image. The signature provides information on the pattern of the boundary, and the autocorrelation coefficients computed from the signature are used as training data for the neural network and for the recognition of parts. The autocorrelation coefficients of each part are invariant to transformations such as scaling, translation and rotation of the part. In addition, as the autocorrelation coefficients are used as the input data to the neural network, the number of input nodes (the time lag) is easily determined. Consequently, a much smaller number of input nodes and hidden layers is required. Therefore perfect, real-time recognition of the various parts is possible, and the training time for network convergence is relatively short. This is a very important aspect of neural network application in pattern recognition. The proposed approach is useful for obtaining proper information from the various patterns of parts, together with part position information, in real-time applications. Finally, it is expected that the proposed methodology may be applied widely in many areas, such as automatic inspection systems, robot positioning control, and monitoring systems.
Fig. 15. Illustration of the signature for parts: (a) part 8; (b) part 10; (c) part 11; (d) part 14.
Table 1. The learning iterations for the number of nodes in the hidden layers

Nodes in first hidden layer    Nodes in second hidden layer    Learning iterations
6                              5                               3689
6                              6                               4845
6                              7                               2150
6                              8                               3688
7                              5                               4017
7                              6                               2799
7                              7                               2260
7                              8                               1922
8                              5                               1923
8                              6                               2300
8                              7                               2733
8                              8                               2073

Note: the number of recognized parts is 15; η = 0.2, α = 0.05 and E_av = 0.005.
Fig. 16. Illustration of the autocorrelation coefficient: (a) part 8; (b) part 10; (c) part 11; (d) part 14.
Fig. 17. The learning curves for the number of nodes in the first hidden layer.
References

Chang, C. A., Chen, L. and Ker, J. (1991) Efficient measurement procedures for compound part profile by computer vision. Computers and Industrial Engineering, 21, 375–377.

Dubois, S. R. and Glanz, F. H. (1986) An autoregressive model approach to two-dimensional shape classification. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8, 55–66.
Fu, K. (1982) Pattern recognition for automatic visual inspection. IEEE Computer, 15, 34–40.

Gonzales, R. C. and Safabaksh, R. (1982) Computer vision techniques for industrial applications and robot control. IEEE Computer, 15, 17–32.

Gonzales, R. C. and Wintz, P. (1987) Digital Image Processing, Addison-Wesley, Reading, MA.

Haralick, R. M. and Shapiro, L. G. (1992) Computer and Robot Vision, Addison-Wesley, Reading, MA.

Haykin, S. (1994) Neural Networks: A Comprehensive Foundation, Macmillan, Hampshire, UK.

Hsieh, K. and Chang, C. A. (1994) Automated part recognition and profile inspection for flexible manufacturing systems, in Proceedings of the Mid-America Conference on Intelligent Systems, Kansas State University, Overland Park, KS, pp. 73–79.

Kashyap, R. L. and Chellappa, R. (1981) Stochastic models for plane closed boundary analysis: representation and reconstruction. IEEE Transactions on Information Theory, 27, 627–637.

Koplowitz, J. and Bruckstein, A. M. (1989) Design of perimeter estimators for digitized planar shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 11, 611–622.

Makridakis, S. and Wheelwright, S. C. (1978) Forecasting: Methods and Applications, John Wiley & Sons, New York.

Mokhtarian, F. and Mackworth, A. (1986) Scale-based description and recognition of planar curves and two-dimensional shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8, 34–44.

Ventura, J. A. and Dung, Z. (1993) Algorithms for computerized inspection of rectangular and square shapes. European Journal of Operational Research, 68, 256–277.