A Bayesian approach to the Hough transform for line detection

IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART A: SYSTEMS AND HUMANS, VOL. 35, NO. 6, NOVEMBER 2005


Correspondence

A Bayesian Approach to the Hough Transform for Line Detection

Andrea Bonci, Tommaso Leo, and Sauro Longhi

Abstract—This paper explains how to associate a rigorous probability value with the main straight-line features extracted from a digital image. A Bayesian approach to the Hough Transform (HT) is considered. Under general conditions, it is shown that a probability measure is associated with each line extracted from the HT. The proposed method increments the HT accumulator in a probabilistic way: first calculating the uncertainty of each edge point in the image and then using a Bayesian probabilistic scheme for fusing the probabilities of the edge points and calculating the line feature probability.

Index Terms—Bayesian method, Hough Transform (HT), image processing.

I. INTRODUCTION

The evaluation of the uncertainty of straight lines extracted from digital images is a relevant aspect of many recognition problems, such as object recognition, robot manipulation, and environment geometry definition for the navigation of mobile robots (see, e.g., [1]–[3]). In particular, in the field of mobile robotics, the Hough Transform (HT) has proved to be a useful technique for solving problems such as self-localization [4]–[7] or simultaneous localization and map building (SLAM) [8]. In these problems, a probabilistic geometric map of the environment is considered, and the probabilistic HT can be used for extracting the line features of the environment and their probability measures [9].

The HT is a well-known method for detecting straight lines in gray-level images [10]–[12]. In the HT, the line equation is expressed by

$$\rho = x\cos\theta + y\sin\theta \tag{1}$$

where ρ is the distance between the line and the origin of the coordinate system in the image plane, θ is the orientation of the line, and x and y are the coordinates of any edge point belonging to the line in the same coordinate system. The HT requires an accumulator array H(ρ, θ), called the Hough space, to represent the possible values of (ρ, θ), which is generally approximated by a discrete array. For each edge point (x, y) in the image, detected by using some (orthogonal) differential operator [13], such as the Sobel operator, the parameters (ρ, θ) are estimated and quantized, and the corresponding cell of the accumulator array is increased by a proper quantity. After the edge points are processed, the accumulator array is searched for peaks. The peaks identify the parameters of the highest probability lines. In the standard HT, the accumulator is increased by the same quantity for each edge point, supposing that all edge points contribute equally to the line features. This method has some drawbacks because edge points are not equally

Manuscript received October 17, 2003; revised July 28, 2004. This work was supported by the Ministero dell'Istruzione, dell'Università e della Ricerca. This paper is recommended by Associate Editor D. Zhang. The authors are with the Dipartimento di Ingegneria Informatica, Gestionale e dell'Automazione, Università Politecnica delle Marche, Ancona 60131, Italy (e-mail: [email protected]; [email protected]; [email protected]). Digital Object Identifier 10.1109/TSMCA.2005.853481

reliable; this is due to the uncertainty introduced by the image noise and by the edge orientation estimation method. Many researchers have proposed different approaches for improving the HT; these are usually variants of the method proposed by O'Gorman and Clowes [12], where the accumulator is increased by the magnitude of the gradient orientation. This gradient is calculated for each edge point and is taken as an estimate of the line feature orientation θ. The advantage of this approach is its computational speed. However, the method fails if a significant amount of noise is present: the estimation of the edge direction, performed by the edge operator, is highly susceptible to noise. In recent papers, Ji and Haralick [14], [15] have introduced a scheme for updating the accumulator depending on the uncertainty of each edge point. In this scheme, the uncertainties (the variances σρ², σθ² of the estimated line parameters ρ(x, y) and θ(x, y)) are analytically computed for each edge point (x, y) based on the image noise, the edge orientation estimation procedure, and the parametric line representation. The uncertainties of the line parameters are used for evaluating the joint density p(Θ(x, y)|Θ), that is, the likelihood of all possible quantized values Θ = [ρ, θ]^T given the observed line parameters Θ(x, y) = [ρ(x, y), θ(x, y)]^T. Moreover, in the quoted papers, Ji and Haralick have proposed to assume Θ(x, y) normally distributed as Θ(x, y) ∼ N(Θ, Σ_Θ(x,y)), where Σ_Θ(x,y) is the covariance matrix of Θ(x, y), and to increment the accumulator by log(p(Θ(x, y)|Θ)) at each edge point. In this paper, a new method is proposed for incrementing the Hough accumulator and for computing the probability of each line feature.
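For reference, the standard voting scheme recalled above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation; the accumulator sizes and the unit vote weight are arbitrary choices:

```python
import numpy as np

def standard_hough(edge_points, img_diag, n_rho=200, n_theta=180):
    """Standard HT: every edge point casts an equal-weight vote.

    edge_points: iterable of (x, y) edge coordinates.
    img_diag: bound on |rho|, so rho is quantized over [-img_diag, img_diag].
    """
    H = np.zeros((n_rho, n_theta))
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    cols = np.arange(n_theta)
    for x, y in edge_points:
        # For each quantized theta, compute rho = x cos(theta) + y sin(theta), eq. (1)
        rhos = x * np.cos(thetas) + y * np.sin(thetas)
        # Quantize rho into a cell index and cast one unit vote per theta column
        idx = np.round((rhos + img_diag) / (2.0 * img_diag) * (n_rho - 1)).astype(int)
        valid = (idx >= 0) & (idx < n_rho)
        H[idx[valid], cols[valid]] += 1.0
    return H, thetas
```

Peaks of the returned accumulator then identify candidate lines; the probabilistic variants discussed below replace the unit vote with an uncertainty-dependent quantity.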
The error propagation technique proposed by Ji and Haralick [14] for the uncertainty computation is improved by the characterization of the covariance matrix Σ_Θ(x,y) and by an analysis of how to infer the joint probability P(Θ(x, y)|Θ) from the normal joint distribution p(Θ(x, y)|Θ). Furthermore, a Bayesian approach for updating the joint probability in the Hough accumulator is also proposed.

This paper is organized as follows. Sections II and III recall a method for evaluating the uncertainties of the line parameters and for the estimation of the line probability. Section IV introduces a Bayesian approach for updating the Hough accumulator. An algorithm for implementing this method and concluding remarks close this paper.

II. UNCERTAINTY ESTIMATION OF LINE PARAMETERS

The uncertainty of the line parameters is estimated following the error propagation technique proposed by Ji and Haralick [14] for improving the HT. By a gradient-based edge operator, such as the Sobel edge detector [13], [16], all the possible image edge points are selected, and for each edge point (x, y), the intensity gradient orientation is used as the line orientation estimate θ̂ = θ(x, y) (as suggested by O'Gorman and Clowes [12]). The image gray-level values I(x, y), distributed in the neighborhood of an image edge point (x, y), are approximated by a plane with additive random noise

$$I(x, y) = \alpha x + \beta y + \gamma + \eta(x, y) \tag{2}$$

where η(x, y) is additive random noise with zero mean and variance σ². The estimated gradient direction is given by


$$\tan\hat\theta = \frac{\hat\alpha}{\hat\beta} \tag{3}$$


where the estimation of the plane parameters α̂ and β̂ is computed by means of a least squares fitting as follows [13]. Consider a rectangular region R × C around the edge point (x, y). Let

$$\varepsilon^2 = \sum_{r\in R}\sum_{c\in C}\left(\hat\alpha r + \hat\beta c + \hat\gamma - I(r,c)\right)^2 \tag{4}$$

be the mean square error between I(x, y) and its neighborhood estimation. The values of α̂ and β̂, which minimize the mean square error, are given by [13]



$$\hat\alpha = \frac{\sum_{r\in R}\sum_{c\in C} r\,I(r,c)}{\sum_{r\in R}\sum_{c\in C} r^2} \tag{5}$$

$$\hat\beta = \frac{\sum_{r\in R}\sum_{c\in C} c\,I(r,c)}{\sum_{r\in R}\sum_{c\in C} c^2} \tag{6}$$

The variances and the covariance have the forms

$$\sigma_\alpha^2 = \frac{\sigma^2}{\sum_{r\in R}\sum_{c\in C} r^2} \tag{7}$$

$$\sigma_\beta^2 = \frac{\sigma^2}{\sum_{r\in R}\sum_{c\in C} c^2} \tag{8}$$

$$\sigma_{\alpha\beta} = \frac{\sigma^2\sum_{r\in R}\sum_{c\in C} rc}{\sum_{r\in R}\sum_{c\in C} r^2\,\sum_{r\in R}\sum_{c\in C} c^2} \tag{9}$$

where for a rectangular neighborhood σα² = σβ² and for a symmetric neighborhood σαβ = 0. Given an estimation of the image noise variance σ² [13], for each edge point (x, y), the plane parameters α, β, σα², σβ², and σαβ are estimated as shown in [13] and [14]. Denoting Δθ = θ̂ − θ, Δα = α̂ − α, and Δβ = β̂ − β, the first-order Taylor expansions of the left and right sides of (3), around θ and (α, β), respectively, produce

$$\Delta\tan\theta = \frac{\Delta\theta}{\cos^2\theta}, \qquad \Delta\tan\theta = \frac{1}{\beta}\,\Delta\alpha - \frac{\alpha}{\beta^2}\,\Delta\beta$$

which implies that

$$\Delta\theta = \cos^2\theta\left(\frac{1}{\beta}\,\Delta\alpha - \frac{\alpha}{\beta^2}\,\Delta\beta\right) \tag{10}$$

$$\sigma_\theta^2 = E(\Delta\theta^2) = \cos^4\theta\left(\frac{\sigma_\alpha^2}{\beta^2} + \frac{\alpha^2\sigma_\beta^2}{\beta^4} - \frac{2\alpha}{\beta^3}\,\sigma_{\alpha\beta}\right) \tag{11}$$

where E(·) denotes the expected value of the argument. For each edge point (x, y), ρ̂ = ρ(x, y) is estimated by substituting θ̂ into (1). Denoting Δρ = ρ̂ − ρ, the first-order Taylor expansion of (1) around θ gives

$$\Delta\rho = \Delta\theta\,(y\cos\theta - x\sin\theta) = \Delta\theta\,k \tag{12}$$

$$\sigma_\rho^2 = E(\Delta\rho^2) = (y\cos\theta - x\sin\theta)^2\,\sigma_\theta^2 = k^2\sigma_\theta^2 \tag{13}$$

and the covariance of ρ̂ and θ̂ has the form

$$\sigma_{\rho\theta} = E(\Delta\rho\,\Delta\theta) = (y\cos\theta - x\sin\theta)\,\sigma_\theta^2 = k\,\sigma_\theta^2 \tag{14}$$

where k = y cos θ − x sin θ can be geometrically interpreted as the distance from the image edge point (x, y) to the point closest to the origin on the line determined by (ρ, θ) [14].

III. BELONGING PROBABILITY OF AN EDGE POINT TO A LINE

Given an edge point (x, y), denote by Θ(x, y) = [ρ(x, y), θ(x, y)]^T the line parameters associated with the edge point (x, y), which can be estimated by relations (1) and (3) as recalled in the previous section. Let Θ = [ρ, θ]^T be the vector of all possible quantized values of ρ and θ in the accumulator discrete array H(ρ, θ) (the Hough space). By assuming that Θ(x, y) = [ρ(x, y), θ(x, y)]^T ∼ N(Θ, Σ_Θ(x,y)), the normal joint distribution of Θ(x, y) can be expressed as

$$p(\Theta(x,y)|\Theta) = \frac{1}{2\pi\sigma_\rho\sigma_\theta\sqrt{1-\varrho^2}}\, e^{-\frac{1}{2(1-\varrho^2)}\left[\frac{\Delta\rho^2}{\sigma_\rho^2} - \frac{2\varrho\,\Delta\rho\,\Delta\theta}{\sigma_\rho\sigma_\theta} + \frac{\Delta\theta^2}{\sigma_\theta^2}\right]} \tag{15}$$

where ϱ is the correlation coefficient, for which σρθ = ϱσρσθ. According to (13) and (14), the correlation coefficient is equal to 1, and the covariance matrix

$$\Sigma_{\Theta(x,y)} = \begin{bmatrix}\sigma_\rho^2 & \varrho\,\sigma_\rho\sigma_\theta\\ \varrho\,\sigma_\rho\sigma_\theta & \sigma_\theta^2\end{bmatrix} = \begin{bmatrix}k^2\sigma_\theta^2 & k\,\sigma_\theta^2\\ k\,\sigma_\theta^2 & \sigma_\theta^2\end{bmatrix} \tag{16}$$

is singular, so that Θ(x, y) has a singular, or degenerate, normal joint distribution. As recalled in the Appendix, this means that the probability density of Θ(x, y) is always concentrated in a subspace whose dimension is smaller than that of the space generated by Θ(x, y). In this case, θ(x, y) completely determines ρ(x, y) and vice versa, and it is sufficient to specify the density of just one of these random variables [17]. By using θ(x, y) as the selected random variable, ρ(x, y) is no longer treated as a random variable and, according to (1), for each edge point (x, y), ρ(x, y) and all its parameters are completely defined. By Properties 2 and 3 of the bivariate joint distribution recalled in the Appendix, with X = Θ(x, y) − Θ, x1 = ρ(x, y) − ρ, and x2 = θ(x, y) − θ, the line probability density p(Θ(x, y)|Θ) is not expressed by (15) but by relation (38) in the Appendix, which, in the considered context, becomes

$$p(\Theta(x,y)|\Theta) = p(0, \theta(x,y)) = \frac{1}{\sqrt{2\pi}\,\sigma_\theta}\, e^{-\frac{1}{2}\frac{[\theta(x,y)-\theta]^2}{\sigma_\theta^2}} \tag{17}$$

This means that the bivariate joint distribution (15) of the two correlated random variables θ(x, y) and ρ(x, y) can be computed in a very simple way as the normal distribution (17) of only one of the two considered random variables; in particular, in this paper, the selected variable is θ(x, y). Therefore, the probability that the edge point (x, y) belongs to the line with parameter vector Θ, given the observation θ(x, y), is obtained by integrating (17):

$$P(\Theta(x,y)|\Theta) = \int_{\theta(x,y)-\frac{\delta\theta}{2}}^{\theta(x,y)+\frac{\delta\theta}{2}} p(0,\theta)\,d\theta = \frac{1}{\sqrt{2\pi}\,\sigma_\theta}\int_{\theta(x,y)-\frac{\delta\theta}{2}}^{\theta(x,y)+\frac{\delta\theta}{2}} e^{-\frac{1}{2}\frac{[\theta(x,y)-\theta]^2}{\sigma_\theta^2}}\,d\theta \tag{18}$$

where δθ denotes the quantization step of θ in the Hough accumulator. This result shows that the probability that the edge point (x, y) belongs to the line with parameter vector Θ depends only on the line orientation estimate θ(x, y). Therefore, for small σθ, the probability that (x, y) belongs to the line with parameters Θ is high. The computational effort needed for computing P(Θ(x, y)|Θ) is reduced, because P(Θ(x, y)|Θ) is obtained by the simple integration of a function of a single variable.
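Equation (18) is a Gaussian integral and can be evaluated in closed form with the error function. The sketch below is an illustration, not the authors' code; it integrates the density of (17) over the quantization window of an accumulator cell centered at `theta_cell` (an interpretation of where the window sits), and it reduces to the centered integral of (18) when `theta_cell` coincides with θ(x, y):

```python
import math

def belonging_probability(theta_hat, theta_cell, sigma_theta, dtheta):
    """P(Theta(x,y)|Theta): integrate the Gaussian of eq. (17), centered at
    the orientation estimate theta_hat, over the quantization window
    [theta_cell - dtheta/2, theta_cell + dtheta/2] of width dtheta."""
    z = lambda t: (t - theta_hat) / (math.sqrt(2.0) * sigma_theta)
    return 0.5 * (math.erf(z(theta_cell + dtheta / 2.0))
                  - math.erf(z(theta_cell - dtheta / 2.0)))
```

As the text notes, the probability grows as σθ shrinks: a precise orientation estimate concentrates almost all of the Gaussian mass inside one quantization step.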

IV. BAYESIAN APPROACH IN COMPUTING THE PROBABILITY OF A LINE

In this section, for a line represented by a pair of parameters Θ = [ρ, θ]^T in the Hough accumulator, a Bayesian approach is proposed for deducing the probability measure of the line under the assumption that the probability of each edge point (x, y) belonging to the line is known.

The HT is accomplished by first creating the accumulator discrete array H(ρ, θ), also called the Hough space, to represent the quantized parameters Θ = [ρ, θ]^T. For each edge point (x, y) in the image, detected by using some gradient edge operator [13], the line parameters ρ(x, y) and θ(x, y) are estimated, and the probability that this edge point belongs to the line whose parameters are Θ = [ρ, θ]^T is estimated by (18). The contribution of each edge point to all the possible image lines is obtained by iterating the computation of the belonging probability (18) of the edge point (x, y) over the parameters Θ = [ρ, θ]^T in the discrete array. Because each edge point (x, y) is affected by some uncertainty, mainly due to the image noise and to the edge orientation estimation technique, the edge points do not all contribute equally to the same line features, and the measure of the different contributions is given by the probability expressed by relation (18). The line probability value is given by the "union" of the probability contributions of the edge points belonging to the line. A Bayesian estimation is considered for combining the probability contribution of each edge point according to the rules of probability theory [19], [20].

The considered rule is stated making use of the following notation. For each pair (ρ, θ) in the accumulator, Θ denotes the hypothesis that there exists a line whose parameters are (ρ, θ). Θ1 = Θ(x1, y1), Θ2 = Θ(x2, y2), …, Θn = Θ(xn, yn) are conditionally independent pieces of evidence concerning Θ. Each Θi, i = 1, …, n, relative to the same Θ, is the event "the ith edge point (xi, yi) belongs to the line with parameters Θ." Then the conditional probability of the line Θ, given the pieces of evidence Θi, i = 1, …, n, is specified as [19], [20]

$$P(\Theta|\Theta_1,\Theta_2,\ldots,\Theta_n) = \frac{\dfrac{P(\Theta)}{P(\neg\Theta)}\displaystyle\prod_{i=1}^{n}\frac{P(\Theta_i|\Theta)}{P(\Theta_i|\neg\Theta)}}{1 + \dfrac{P(\Theta)}{P(\neg\Theta)}\displaystyle\prod_{i=1}^{n}\frac{P(\Theta_i|\Theta)}{P(\Theta_i|\neg\Theta)}} \tag{19}$$

In (19), according to Bayes' theorem, the terms P(Θ) and P(¬Θ) = 1 − P(Θ) are the a priori probabilities of Θ, and P(Θi|Θ) is the probability that the ith edge point belongs to the line Θ, computed using relation (18); P(Θi|¬Θ) can be deduced from Bayes' theorem:

$$P(\Theta_i|\neg\Theta) = \frac{P(\neg\Theta|\Theta_i)\,P(\Theta_i)}{P(\neg\Theta)} \tag{20}$$

with P(¬Θ|Θi) = 1 − P(Θ|Θi). According to Bayes' theorem, P(Θ|Θi) is expressed as

$$P(\Theta|\Theta_i) = \frac{P(\Theta_i|\Theta)\,P(\Theta)}{P(\Theta_i)} \tag{21}$$

Therefore, by (21) and (20), P(Θi|¬Θ) has the following form:

$$P(\Theta_i|\neg\Theta) = \frac{P(\Theta_i) - P(\Theta_i|\Theta)\,P(\Theta)}{1 - P(\Theta)} \tag{22}$$

Relations (18), (19), and (22) allow the computation of the line probability for each Θ = [ρ, θ]^T in the Hough accumulator.

A recursive computation of (19) is considered here. For a cell Θ in the Hough accumulator, denote by s(Θ, ℓ) the quantity

$$s(\Theta,\ell) := \prod_{i=1}^{\ell}\frac{P(\Theta_i|\Theta)}{P(\Theta_i|\neg\Theta)} \tag{23}$$

where ℓ is the number of contributions accumulated for the cell Θ. During the edge point processing, each further contribution P(Θj|Θ) to the cell Θ is used for computing P(Θj|¬Θ) and for updating the term s(Θ, j − 1) to the value s(Θ, j) by

$$s(\Theta, j) = s(\Theta, j-1)\,\frac{P(\Theta_j|\Theta)}{P(\Theta_j|\neg\Theta)} \tag{24}$$

for j = 1, …, n, where n is the number of edge points belonging to the line with parameters Θ. At the end of this recursive computation, for each cell Θ, the value s(Θ, n) is mapped to

$$s(\Theta) = \frac{\dfrac{P(\Theta)}{P(\neg\Theta)}\,s(\Theta,n)}{1 + \dfrac{P(\Theta)}{P(\neg\Theta)}\,s(\Theta,n)} \tag{25}$$
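The recursion (23)–(25) can be sketched as follows. This is an illustration under stated assumptions: the marginal evidence probability P(Θi) must be supplied from somewhere, and here it is simply a constant parameter `p_i`:

```python
def evidence_odds(p_i_given_line, p_i, p_line):
    """Ratio P(Theta_i|Theta) / P(Theta_i|not Theta), with the
    denominator obtained from eq. (22)."""
    p_i_given_not = (p_i - p_i_given_line * p_line) / (1.0 - p_line)
    return p_i_given_line / p_i_given_not

def line_probability(contributions, p_i, p_line):
    """Accumulate s(Theta, j) by eq. (24), starting from s(Theta, 0) = 1,
    and map the final product to the line probability s(Theta), eq. (25).

    contributions: sequence of P(Theta_i|Theta) values from eq. (18).
    p_line: prior probability P(Theta) of the line hypothesis."""
    s = 1.0
    for p in contributions:
        s *= evidence_odds(p, p_i, p_line)   # eq. (24)
    prior_odds = p_line / (1.0 - p_line)
    return prior_odds * s / (1.0 + prior_odds * s)   # eq. (25)
```

With no evidence the posterior equals the prior; each edge point whose belonging probability exceeds the evidence baseline multiplies the odds up, so the cell value grows monotonically toward 1.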

Then the array is searched for peaks. The peaks identify the parameters of the most likely lines; the values of these peaks represent the probabilities of these lines.

Remark 1: It is worth noting that, for each edge point (x, y), the update of the probability values stored in the cells of the accumulator is not accomplished for all the cells, but only for those having line parameters satisfying relation (1).

Algorithm 1: A formal definition of the proposed algorithm for computing the line probabilities is stated as follows.

Step 1) Acquire the image and perform the prefiltering procedures for reducing uncertainty and noise.
Step 2) Extract the edge pixels by a gradient edge operator (Sobel).
Step 3) Select the pixels belonging to the straight lines.
Step 4) Consider a pixel (x, y) belonging to a straight line.
Step 5) For each quantized value θq of θ in the Hough accumulator, compute the quantized value ρq of ρ satisfying (1).
Step 6) Estimate the line parameter θ(x, y) = θ̂ by relation (3) and the probability values P(Θ(x, y)|Θ) and P(Θ(x, y)|¬Θ) expressed by (18) and (22) with Θ = [ρq, θq]^T.
Step 7) Update each cell (ρq, θq) making use of relation (24).
Step 8) If there is another pixel belonging to a line, go to Step 4); otherwise, go to Step 9).
Step 9) Update each cell of the Hough accumulator making use of relation (25).

V. EXPERIMENTAL TESTS

In this section, the proposed approach for computing line probability is compared with the standard HT on two designed images and on a real image. The designed images are shown in Fig. 1: a) image n.1 is composed of a uniform background with a set of straight lines having different orientations; and b) image n.2 is composed of a set of diamonds on a gray-scale background. The aim of the performed tests is to evaluate the performance of the proposed approach compared with that of other HT algorithms. Two different HTs are compared with the new HT proposed here, denoted by HT3. In the first standard HT, proposed by O'Gorman and Clowes [12] and here denoted by HT1, the accumulator is increased by the magnitude of the gradient orientation. In the second HT, proposed by Ji and Haralick [14] and here denoted by HT2, the accumulator is increased by p(Θ(x, y)|Θ) at each edge point (x, y).

Fig. 1. Designed images. (a) Test image n.1. (b) Test image n.2.

In the performed experimental tests, when the images are not corrupted by noise, the performances of the considered HT algorithms are comparable. When the images are corrupted by noise, different performances are observed, and the considered methods are evaluated under different amounts of noise. The noisy images are obtained by adding Gaussian noise with zero mean and variance σ² to the pixel intensity of the original image. Noise with different values of σ² is added to the original image to identify the amount of noise needed to reveal a substantial difference among the HTs. The criteria used for the performance comparison are based on the evaluation of: 1) the number of peaks in the Hough accumulators and 2) the number of lines extracted from the HTs. In the following, the detected lines are shown as bold lines in the same image.

The first step is to compare the algorithms on the noisy image with σ² = 10 shown in Fig. 2. Fig. 3(a) and (b) shows the Hough accumulator and the lines extracted by using HT1; Fig. 4(a) and (b) shows the results obtained by HT2; and Fig. 5(a) and (b) shows the results obtained by using HT3. From these figures, it is possible to see that the noise level slightly affects the performance of HT1, which is unable to extract the upper left line (only a thin original line is shown in this part of Fig. 3), but it does not affect HT2 and HT3.

Fig. 2. Test image n.1 with noise variance σ² = 10.
Fig. 3. HT1 results on image n.1 with noise variance σ² = 10.
Fig. 4. HT2 results on image n.1 with noise variance σ² = 10.
Fig. 5. HT3 results on image n.1 with noise variance σ² = 10.

The second step is to compare the algorithms on the image of Fig. 1(a) affected by the higher noise level σ² = 20, shown in Fig. 6. Fig. 7(a) and (b) shows the Hough accumulator and the extracted lines by using HT1; Fig. 8(a) and (b) shows the results obtained by HT2; and Fig. 9(a) and (b) shows the results obtained by using HT3. These figures, compared with the case of a lower noise level, show that as the noise level increases, the performance of HT1 decreases quickly, as shown by the large number of peaks extracted and reported in Fig. 7(a). The performances of HT2 and HT3 are still comparable, although HT2 is more affected by the noise than HT3.

Fig. 6. Test image n.1 with noise variance σ² = 20.

The third step is to compare the algorithms on the image of Fig. 1(b) affected by the noise level σ² = 15, shown in Fig. 10. This image is more complex than the first one because edge points are obtained as the boundary of two regions with different gray levels. Fig. 11(a) and (b) shows the Hough accumulator and the extracted lines by using HT1; Fig. 12(a) and (b) shows the results obtained by HT2; and Fig. 13(a) and (b) is related to HT3. These figures show that in a more complex environment, HT3 works well even in the presence of inaccuracy.

The test of the proposed technique on a real image is obtained by framing the floor with a camera mounted on a mobile robot. The results are shown in Fig. 14 for the HT1 algorithm and in Figs. 15 and 16 for HT2 and HT3, respectively.


Fig. 7. HT1 results on image n.1 with noise variance σ² = 20.

The set of results discussed here is a sample of the performed tests. These tests have shown that HT3 has a good capability to extract line probabilities in real environments. The last consideration is related to the capability of the HT3 algorithm to extract a real line probability. HT3 does not perform a sum of discrete values; it computes the effective line probability starting from the probability of each line edge point. Therefore, it is also able to deduce the probability value of a line. The computed probability value differs for different noise levels, being smaller for greater noise levels; for example, in Fig. 9, the mean value of the probabilities stored in the peaks having values greater than 0.5 is 0.87, while in Fig. 5(a), the mean value of the probabilities stored in the peaks is 0.98.
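Steps 4)–9) of Algorithm 1 can be sketched as follows. This is an illustrative sketch, not the authors' implementation: `sigma_theta` is taken constant instead of being estimated per pixel via eqs. (7)–(11), the orientation estimates are passed in directly rather than computed from eq. (3), and the evidence probability P(Θi) is a constant parameter `p_evidence`:

```python
import math
import numpy as np

def ht3_accumulate(edge_pixels, theta_hats, n_theta, n_rho, rho_max,
                   sigma_theta, p_line=0.5, p_evidence=0.5):
    """edge_pixels: list of (x, y); theta_hats: matching orientation
    estimates theta(x, y). The accumulator stores the running product
    s(Theta, j) of eq. (24); eq. (25) finally maps it to probabilities."""
    s = np.ones((n_rho, n_theta))
    thetas = np.linspace(0.0, math.pi, n_theta, endpoint=False)
    dtheta = math.pi / n_theta
    for (x, y), theta_hat in zip(edge_pixels, theta_hats):
        for j, th in enumerate(thetas):
            rho = x * math.cos(th) + y * math.sin(th)            # eq. (1)
            i = int(round((rho + rho_max) / (2.0 * rho_max) * (n_rho - 1)))
            if not 0 <= i < n_rho:
                continue
            # eq. (18): integrate the Gaussian of eq. (17) over one theta cell
            z = lambda t: (t - theta_hat) / (math.sqrt(2.0) * sigma_theta)
            p = 0.5 * (math.erf(z(th + dtheta / 2.0))
                       - math.erf(z(th - dtheta / 2.0)))
            p = min(max(p, 1e-12), 1.0 - 1e-12)
            p_not = (p_evidence - p * p_line) / (1.0 - p_line)   # eq. (22)
            s[i, j] *= p / max(p_not, 1e-12)                     # eq. (24)
    prior_odds = p_line / (1.0 - p_line)
    return prior_odds * s / (1.0 + prior_odds * s)               # eq. (25)
```

Cells receiving no consistent evidence stay at the prior `p_line`, while the cell repeatedly matching the orientation estimates accumulates odds and approaches 1, which is the behavior the experiments above report for the HT3 peaks.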

Fig. 8. HT2 results on image n.1 with noise variance σ² = 20.

Fig. 9. HT3 results on image n.1 with noise variance σ² = 20.
Fig. 10. Test image n.2 with noise variance σ² = 15.
Fig. 11. HT1 results on image n.2 with noise variance σ² = 15.
Fig. 12. HT2 results on image n.2 with noise variance σ² = 15.
Fig. 13. HT3 results on image n.2 with noise variance σ² = 15.
Fig. 14. Example of the HT1 algorithm applied to a floor image acquired by a mobile robot during environment exploration.
Fig. 15. Example of the HT2 algorithm applied to a floor image acquired by a mobile robot during environment exploration.

VI. CONCLUSION

In this paper, a Bayesian updating scheme to compute the existence probability of the detected straight lines is presented. The proposed scheme suggests: 1) how to simplify the normal joint distribution describing the straight-line parameter uncertainty; 2) how to integrate the distribution for deducing the edge point probabilities; and 3) how to fuse the edge point probabilities relative to the parameters of each line for obtaining the line probability. Therefore, the main contribution is a rigorous scheme for computing the HT accumulator using the effective probability values of each extracted line. The scheme is founded on theoretical and statistical derivations. The probability of a feature point is directly related to the input perturbation and to the estimation of the orientation angle of the line. The derived algorithm represents a simple and consistent method for calculating a probability measure of a line feature extracted from a digital image. Significant applications of this algorithm can be developed in many recognition problems, such as object geometry and environmental geometry definitions for the automatic grasping of robot manipulators [21], [22] and for the navigation of mobile robots [8], [23].

APPENDIX
NORMAL JOINT DISTRIBUTION

In this section, some properties of the normal joint distribution are recalled and used for deducing relevant implications on the computation of the probability P(Θ(x, y)|Θ) that an edge point (x, y) belongs to a line with parameters Θ.

A vector of two random variables X = [x1 x2]^T with zero mean has a bivariate normal distribution if each component of X is a linear combination of two independent normal random variables z1 and z2 [18]. Using matrix notation, X = ΣZ, where Σ is a 2 × 2 matrix and Z = [z1 z2]^T. If Σ is a nonsingular matrix, the bivariate normal density function for X is obtained by transforming the two-dimensional density p_Z(Z) for Z into a two-dimensional density p_X(X) for X through a nonsingular transformation applied to the joint density of Z, which is calculated as the product of the marginal densities of the z_i:

$$p_X(X) = p_Z(Z)|\Sigma^{-1}| = p_Z(\Sigma^{-1}X)|\Sigma^{-1}| = \frac{1}{2\pi}\,|\Sigma|^{-1}\, e^{-\frac{1}{2}\left[X^T(\Sigma^2)^{-1}X\right]} \tag{26}$$

where Σ² = ΣΣ^T is the covariance matrix. By writing Σ² as

$$\Sigma^2 = \begin{bmatrix}\sigma_1^2 & \varrho\,\sigma_1\sigma_2\\ \varrho\,\sigma_1\sigma_2 & \sigma_2^2\end{bmatrix} \tag{27}$$

where σ1² = var(x1), σ2² = var(x2), and ϱ = cov(x1, x2)/(σ1σ2), the two random variables x1 and x2 are uncorrelated (and then independent) for ϱ = 0 and correlated for ϱ ≠ 0, and their covariance matrix is singular for ϱ = ±1. Taking into account that

$$|\Sigma^2| = \sigma_1^2\sigma_2^2(1-\varrho^2) \tag{28}$$

the bivariate normal density for X can be written as

$$p_X(X) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1-\varrho^2}}\, e^{-\frac{1}{2(1-\varrho^2)}\left[\frac{x_1^2}{\sigma_1^2} - \frac{2\varrho\,x_1x_2}{\sigma_1\sigma_2} + \frac{x_2^2}{\sigma_2^2}\right]} \tag{29}$$

Fig. 17 shows the normal joint density function computed for two correlated random variables with σ1 = 1.3, σ2 = 0.7, and correlation ϱ = 0.85. In the following, some useful properties of the normal joint density function are recalled [18].

Property 1: Let X be a vector of two random variables having a normal joint distribution and let Σ be a 2 × 2 matrix such that X is a linear combination of two independent normal random variables z1 and z2, Z = [z1 z2]^T, and X = ΣZ. Let B be a 2 × 2 matrix such that B = ΣD, where D is an orthonormal matrix (DD^T = D^T D = I). Then the random vector Y = BZ has the same distribution as X and

$$\sigma_Y^2 = BB^T = \Sigma DD^T\Sigma^T = \Sigma I\Sigma^T = \sigma_X^2 \tag{30}$$

By considering the rotational transformation matrix, which is an orthonormal matrix,

$$R = \begin{bmatrix}\cos\gamma & -\sin\gamma\\ \sin\gamma & \cos\gamma\end{bmatrix} \tag{31}$$

the random vectors X = ΣZ and Y = ΣRZ have the same distribution for each rotation angle γ.

Property 2: Equation (29), which is the normal joint density of two correlated random variables x1 and x2, can be transformed into the normal density of two independent random variables y1 and y2 having the form

$$p_Y(Y) = \frac{1}{2\pi\sigma_1\sigma_2}\, e^{-\frac{1}{2}\left[\frac{y_1^2}{\sigma_1^2} + \frac{y_2^2}{\sigma_2^2}\right]} \tag{32}$$

By defining Y = [y1 y2]^T = ϕ(X) with X = [x1 x2]^T and

$$y_1 = \frac{\sigma_1\left(\dfrac{x_1}{\sigma_1} - \varrho\,\dfrac{x_2}{\sigma_2}\right)}{\sqrt{1-\varrho^2}} \tag{33}$$

$$y_2 = x_2 \tag{34}$$

it results that σ_{y1}² = σ1² and σ_{y2}² = σ2², and the joint density of the random variables y1 and y2 has the form

$$p_Y(Y) = |J|\,p_X(X) \tag{35}$$

where |J| is the determinant of the Jacobian of the transformation X = ϕ^{-1}(Y), described by

$$|J| = \begin{vmatrix}\dfrac{\partial x_1}{\partial y_1} & \dfrac{\partial x_1}{\partial y_2}\\[4pt] \dfrac{\partial x_2}{\partial y_1} & \dfrac{\partial x_2}{\partial y_2}\end{vmatrix} = \begin{vmatrix}\sqrt{1-\varrho^2} & \varrho\,\dfrac{\sigma_1}{\sigma_2}\\ 0 & 1\end{vmatrix} = \sqrt{1-\varrho^2} \tag{36}$$

By simple algebraic manipulations, relation (35) implies (32), which is the joint density of two independent random variables; it can also be expressed as

$$p_Y(Y) = p_{y_1}(y_1)\,p_{y_2}(y_2) \tag{37}$$

Property 3: Let x1 and x2 be two completely correlated random variables such that x1 = m x2, where m is a scalar quantity, so that their correlation coefficient is ϱ = 1. The normal joint density of x1 and x2 can be computed as the normal density of only one of these two variables; in particular, it can be expressed as

$$p_X(X) = p(0, x_2) = \frac{1}{\sqrt{2\pi}\,\sigma_2}\, e^{-\frac{1}{2}\frac{x_2^2}{\sigma_2^2}} \tag{38}$$

According to Property 2, by using the coordinate transformations (33) and (34), it is possible to transform (29) into (32), representing the normal density of two independent random variables y1 and y2. If the random variables x1 and x2 satisfy the relation x1 = m x2, where m is a scalar quantity, then σ1² = m²σ2², and for ϱ = 1, (32) becomes (38). In fact, for x1 = m x2, σ1 = mσ2, and ϱ → 1, (33) becomes

$$y_1 = m\,\sqrt{\frac{1-\varrho}{1+\varrho}}\; y_2 \;\longrightarrow\; 0 \quad (\varrho \to 1) \tag{39}$$

therefore, as y1 → 0, (32) becomes (38). Relation (38) is the normal distribution of the random variable x2. The normal joint density of the considered case is shown in Fig. 18.

Remark: According to the properties stated above, it is possible to conclude that the normal joint density function of two completely correlated normal variables x1 and x2 can always be computed as the density function of only one of the two random variables. In fact, by a proper rotational transformation matrix, it is possible to rotate the reference frame (x1, x2) by an appropriate angle γ so that the new reference frame (x1′, x2′) has an axis coincident with the straight line of the degenerate normal joint distribution, as shown in Fig. 19. According to Property 1, the normal distribution of (x1′, x2′) has the same density function as (x1, x2) in the new reference frame, and according to Property 3, it depends on a single random variable.
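As a quick numerical illustration of Property 3 (the values m = 2 and σ2 = 0.7 are arbitrary choices), sampling a completely correlated pair confirms that its covariance matrix has the singular structure of (16) and that all probability mass lies on the line x1 = m x2:

```python
import numpy as np

# Property 3: for x1 = m * x2 the joint density degenerates onto a line.
rng = np.random.default_rng(0)
m, sigma2 = 2.0, 0.7

x2 = rng.normal(0.0, sigma2, size=200_000)
x1 = m * x2                           # completely correlated pair (varrho = 1)
C = np.cov(np.vstack([x1, x2]))       # sample covariance matrix

# Structure predicted by eq. (16) with k replaced by m:
expected = sigma2**2 * np.array([[m * m, m], [m, 1.0]])
```

The sample covariance matches the predicted rank-one matrix, its determinant vanishes, and the empirical correlation coefficient equals 1, so a single univariate density suffices, exactly as (38) states.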

Fig. 16. Example of the HT3 algorithm applied to a floor image acquired by a mobile robot during environment exploration.
Fig. 17. Bivariate normal density for two correlated variables x1 and x2 with σ1 = 1.3, σ2 = 0.7, and ϱ = 0.85.
Fig. 18. Bivariate normal density of two completely correlated random variables x1 and x2 for ϱ → +1.
Fig. 19. Example of a rotational coordinate transformation, which transforms the normal joint distribution of two correlated normal variables into the normal distribution of only one random variable.

REFERENCES

[1] G. N. Desouza and A. C. Kak, “Vision for mobile robot navigation: A survey,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 2, pp. 237– 267, Feb. 2002. [2] A. Nakamura, J. Ota, and T. Arai, “Human-supervised multiple mobile robot system,” IEEE Trans. Robot. Autom., vol. 18, no. 5, pp. 728–743, Oct. 2002. [3] R. Joshi and A. C. Sanderson, “Minimal representation multisensor fusion using differential evolution,” IEEE Trans. Syst., Man, Cybern. A, Syst. Humans, vol. 29, no. 1, pp. 63–76, Jan. 1999. [4] L. Iocchi, D. Mastrantuono, and D. Nardi, “A probabilistic approach to Hough localization,” in Proc. IEEE Int. Conf. Robotics and Automation, Seoul, Korea, 2001, pp. 4250–4255. [5] G. Grisetti, L. Iocchi, and D. Nardi, “Global Hough localization for mobile robots in polygonal environments,” in Proc. IEEE Int. Conf. Robotics and Automation, Washington, DC, 2002, pp. 353–358. [6] T. Schmitt et al., “Cooperative probabilistic state estimation for vision based autonomous mobile robots,” IEEE Trans. Robot. Autom., vol. 18, no. 5, pp. 670–684, Oct. 2002.


[7] X. J. Li et al., "CAD-vision-range-based self-localization for mobile robot using one landmark," J. Intell. Robot. Syst., vol. 35, no. 1, pp. 61–82, 2002.
[8] J. M. Perez, C. Urdiales, A. Bandera, and F. Sandoval, "A Hough-based solution to the simultaneous localization and map building problem," in Proc. 1st Eur. Conf. Mobile Robots (ECMR), Radziejowice, Poland, 2003, pp. 53–58.
[9] A. Bonci, G. Ippoliti, L. Jetto, T. Leo, and S. Longhi, "Methods and algorithms for sensor data fusion aimed at improving the autonomy of mobile robot," in Advances in Control of Articulated Mobile Robots, B. Siciliano et al., Eds. Heidelberg, Germany: Springer-Verlag, Springer Tracts in Advanced Robotics (STAR), 2004, pp. 191–222.
[10] P. V. C. Hough, "Methods and means for recognising complex patterns," U.S. Patent 3 069 654, Dec. 18, 1962.
[11] R. O. Duda and P. E. Hart, "Use of the Hough Transform to detect lines and curves in pictures," Commun. ACM, vol. 15, no. 1, pp. 11–15, Jan. 1972.
[12] F. O'Gorman and M. B. Clowes, "Finding picture edges through collinearity of feature points," IEEE Trans. Comput., vol. C-25, no. 4, pp. 449–454, Apr. 1976.
[13] R. M. Haralick and L. G. Shapiro, Computer and Robot Vision, vol. 1. Reading, MA: Addison-Wesley, 1992.
[14] Q. Ji and R. M. Haralick, "An improved Hough Transform technique based on error propagation," in Proc. IEEE Int. Conf. Systems, Man, and Cybernetics, San Diego, CA, 1998, vol. 5, pp. 4653–4658.
[15] Q. Ji and R. M. Haralick, "Error propagation for the Hough Transform," Pattern Recogn. Lett., vol. 22, no. 6–7, pp. 813–823, 2001.
[16] I. E. Sobel, "Camera models and machine perception," Ph.D. dissertation, Elect. Eng. Dept., Stanford Univ., Stanford, CA, 1970.
[17] A. H. Jazwinski, Stochastic Processes and Filtering Theory, Mathematics in Science and Engineering, vol. 64. New York: Academic, 1970.
[18] H. J. Larson and B. O. Shubert, Probabilistic Models in Engineering Sciences, vol. 1, Random Variables and Stochastic Processes. New York: Wiley, 1979.
[19] E. A. Bender, Mathematical Methods in Artificial Intelligence. Los Alamitos, CA: IEEE Comput. Soc. Press, 1996.
[20] R. C. Luo and M. G. Kay, "Data fusion and sensor integration," in Data Fusion in Robotics and Machine Intelligence, M. A. Abidi and R. C. Gonzales, Eds. Orlando, FL: Academic, 1992, pp. 54–56.
[21] A. Bonci, S. Longhi, A. Monteriù, and M. Vaccarini, "Motion control of a smart mobile manipulator," in Proc. Int. Conf. Intelligent Manipulation and Grasping, Genoa, Italy, Jul. 1–2, 2004, pp. 110–116.
[22] A. Bonci, S. Longhi, A. Monteriù, and M. Vaccarini, "Navigation system of a smart wheelchair," J. Zhejiang Univ. Sci., vol. 6A, no. 2, pp. 110–117, Feb. 2005.
[23] A. Bonci, G. Di Francesco, and S. Longhi, "A Bayesian approach to the Hough Transform for video and ultrasonic data fusion in mobile robot navigation," in Proc. IEEE Int. Conf. Systems, Man, and Cybernetics, Hammamet, Tunisia, 2002, vol. 3, pp. 354–359.


Linear Versus Nonlinear Neural Modeling for 2-D Pattern Recognition

Claudio A. Perez, Guillermo D. Gonzalez, Leonel E. Medina, and Francisco J. Galdames

Abstract—This paper compares the classification performance of linear-system-based and neural-network-based models on handwritten-digit classification and face recognition. Nonlinear inputs to a linear classifier are generated from the linear inputs using different forms of products. Using a genetic algorithm, linear and nonlinear inputs to the linear classifier are selected to improve classification performance. Results show that an appropriate set of linear and nonlinear inputs to the linear classifier was selected, significantly improving its classification performance on both problems. It is also shown that the linear classifier reached a classification performance similar to or better than that obtained by nonlinear neural-network classifiers with linear inputs.

Index Terms—Face recognition, genetic selection of inputs, handwritten-digit classification, linear classifier, neural-network classifier, nonlinear inputs.

I. INTRODUCTION

Modeling based on linear systems and neural networks has been widely used in a large number of applications [1], [17]. Although the linear-system and neural-network approaches to modeling have grown independently, they make use of similar techniques. The present study compares linear-system-based models and neural-network-based models on two pattern-recognition problems: handwritten-digit classification and face recognition.

Neural networks have been used to model pattern-recognition problems in many industrial applications, such as wood-board-defect classification for the forestry industry and the lithological-composition sensor for the mining industry [2], [6], [12]. Handwritten-digit recognition is an important task in automated document analysis. Applications have been developed to read postal addresses, bank checks, tax forms, and census forms, and include reading aids for the visually impaired, among others [3]–[5]. Face recognition has become popular as a possible person-identification procedure based on biometrics. Several papers have used neural networks for face recognition in the past few years [8], [14]. Both problems, handwritten-digit classification and face recognition, are suitable for exploring new approaches to the design of two-dimensional (2-D) pattern-recognition classifiers, because they are inherently complex tasks but are restricted to only a few classes, enabling relatively simple implementation [20].

In handwritten-digit classification, published classification performances fall in a broad range, from 68% [15] to 99% [5], [18], [26]. Results largely depend on the size of the database, the type of partition, and the rejection ratios employed [5]. In the face-recognition problem, performances have been reported reaching up to 95% on limited-size (less than 50
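The core idea outlined above—augmenting a linear classifier with nonlinear inputs formed as products of the linear inputs—can be sketched as follows. This is an illustrative example only: the function names, the pairwise-product feature set, and the least-squares fit are assumptions for illustration, not the authors' implementation (which selects inputs with a genetic algorithm).

```python
import numpy as np

def product_features(X):
    """Append all pairwise products x_i * x_j (i <= j) to the linear inputs."""
    n, d = X.shape
    prods = [X[:, i] * X[:, j] for i in range(d) for j in range(i, d)]
    return np.hstack([X, np.stack(prods, axis=1)])

def fit_linear_classifier(X, y):
    """One-vs-all linear classifier: least squares on one-hot targets."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
    Y = np.eye(int(y.max()) + 1)[y]                # one-hot class labels
    W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)
    return W

def predict(W, X):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return (Xb @ W).argmax(axis=1)

# Toy usage: XOR is not linearly separable on the raw inputs,
# but becomes separable once the product x1*x2 is added.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0, 1, 1, 0])
W = fit_linear_classifier(product_features(X), y)
print(predict(W, product_features(X)))  # matches y: [0 1 1 0]
```

The XOR example shows why the product terms matter: the classifier itself remains linear in its inputs, but the input space now contains nonlinear functions of the original features.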

Manuscript received March 4, 2002; revised November 21, 2002 and August 27, 2003. This research was funded by CONICYT–Chile through project FONDECYT 1000977 and by the Department of Electrical Engineering, Universidad de Chile, Santiago, Chile. This paper was recommended by Associate Editor D. Zhang.
The authors are with the Department of Electrical Engineering, Universidad de Chile, Santiago, Chile (e-mail: [email protected]; [email protected]; [email protected]; [email protected]).
Digital Object Identifier 10.1109/TSMCA.2005.851268
1083-4427/$20.00 © 2005 IEEE
