Detection of Zebra Crossing and Measurement of ...

1 downloads 0 Views 597KB Size Report
Detection of Zebra Crossing and Measurement of. Crossing Length – a Travel Aid for the Blind. Mohammad Shorif Uddin. 1,2 and Tadayoshi Shioyama. 2.
SICE Annual Conference 2005 in Okayama, August 8-10, 2005 Okayama University, Japan

Detection of Zebra Crossing and Measurement of Crossing Length – a Travel Aid for the Blind Mohammad Shorif Uddin1,2 and Tadayoshi Shioyama2 1 Department of Computer Science & Engineering, Jahangirnagar University, Savar, Dhaka 1342, Bangladesh 2 Department of Mechanical & System Engineering, Kyoto Institute of Technology, Sakyo-ku, Kyoto 606-8585, Japan [email protected] Abstract: A safe road-crossing system is very important to improve the mobility of blind people. In this paper, an image-based real-time novel approach is proposed for detecting a zebra crossing and also measuring its length. Detection is achieved through bipolarity-based segmentation along with projective invariant-based recognition and then measurement of length is performed from the number of stripes in the segmented area. Experimental results demonstrate the effectiveness of the proposed technique. Keywords: Image analysis, zebra crossing, bipolarity feature, projective invariants, travel aid for the blind.

1. Introduction According to the World Health Organization statistics1) , approximately 40 million people are blind all over the world. A safe road-crossing system is an extreme necessity to improve the mobility of these vision-disabled people. This paper intends to develop a safe road crossing system for this noble purpose. Usually, blind people use a white cane as a travel aid. The range of detection of special patterns or obstacles using a cane is very narrow. To improve the versatility of the white cane, various devices have been developed such as the SONICGUIDE5) , the Mowat sensor2) , the Laser cane4) and the Navbelt9) . However, these devices are not able to assist the blind at a pedestrian crossing where information about the location of a crossing, an idea about its length and the state of traffic lights is important. Some traffic lights have beeper, which prompt the blind person to cross the road, when it is safe to do so. However, such equipment is not available at every crossing; perhaps, it would take too long for such equipment to be installed and maintained at every crossing. Blind people obviously can not see, but can hear. The arrival of fast and cheap digital portable laptop computers with multimedia computing to convert audio-video streams in real time opens new avenues to develop an intelligent navigation system for blind people. Hence, a practical system can be easily devised using a small camera fitted with a sunglass, a PDA and a small speaker. In this paper, we concentrate on the detection of the location of a zebra crossing as well as measurement of its length using a single camera. A crossing is characterized by zebra pattern i.e. by constant width periodic white bands on a usual black road surface, which can be treated as a bipolar pattern. In Japan, the width of each white or black band of crossing is 45 cm. The algorithm first tries to detect the existence of crossing in the frontal image. If there exists a crossing, then it goes to determine the crossing length. From the viewpoint of image-based blind navigation system, Stephen Se10) first proposed a pedestrian crossing detection by grouping lines and checking for concurrency using the vanishing point constraint. However, the main shortcoming of this technique is that it is working slow and far from

- 2945 -

real time. Meijer’s7) “vOICe” consisting of a head-mounted camera, stereo headphones and a laptop, is the only commercially available vision-based travel aid that uses 1-to-1 image-to-sound mapping. Though it can recognize the walls, doors etc., but pedestrian crossing detection technique is still absent in it. Recently, we proposed a real-time crossing detection system11) based on Fisher criterion. The method is simple, but as the feature points are extracted on a single vertical line in an image, hence its reliability is not upto the mark. Compared to that technique the present method has multiple checking criteria and hence its reliability is high. Previously, Shioyama et al.12) reported a crossing length measurement technique, which is based on edge detection, Hough transformation and Susan feature detection techniques. However, it is complicated, computationally inefficient and needs many thresholding parameters to adjust. Recently, we presented a crossing length measurement technique6) assuming that the image contains crossing using the information of crossing bands, which is simple, stable and computationally faster than the Shioyama’s method. However, we did not use the segmented area that is being incorporated in the present method. Beside this, we have done the unification of the two steps – detection of existence of crossing and measurement of crossing length in the present paper. In order to evaluate the performance of the proposed method, experiment is performed using real road scenes with and without pedestrian crossing.

2. Principle of the Proposed Technique Some images of a real road scenes containing pedestrian crossings are shown in Fig. 3(a)-(d). The crossing region can be treated as a bipolar pattern. Detection of crossing technique includes three major steps. First, extraction of the crossing region from the image. We segment the image on the basis of the strength of bipolarity and pick the crossing candidates on the basis of area. These candidates are then checked for appropriateness of their positions in the image as well as crossing direction. We have

PR0001/05/0000-2945 ¥400 © 2005 SICE

used bipolarity as the main feature in detecting the crossing candidates, in subsection 2.1, we describe the bipolarity feature. As a blind person is interested in detecting the frontal crossing not the sided one, therefore, estimation of crossing direction is important. We present the crossing direction estimation principle in subsection 2.2. Second, extraction of feature points in the candidate area based on Fisher criterion. In subsection 2.3, we describe feature points extraction principle. Third, final decision of crossing or not crossing based on projective invariants. In subsection 2.4, we describe the details of projective invariants. If a crossing is detected then we go for the measurement of its length. We describle the crossing length estimation model in subsection 2.5.

2.1 Bipolarity We denote the intensity distribution of an image block as p0 (x). If the block contains only black and white pixels, then p0 (x) can be written as p0 (x) = α p1 (x) + (1 − α )p2 (x), where 0 ≤ α ≤ 1, p1 (x) is the intensity distribution of black pixels and p2 (x) is the intensity distribution of white pixels. Let define some variables as:

µi =

Z ∞ −∞

xpi (x)dx, σi2 =

Z ∞ −∞

(x − µi )2 pi (x)dx, (i = 0, 1, 2),

(1) where µi and σi 2 represents mean and variance, respectively. Using the above relations, we can write σ0 2 as

σ02 = ασ12 + (1 − α )σ22 + α (1 − α )(µ1 − µ2 )2 .

(2)

Equation (2) shows that the total variance consists of the weighed sum of variances and the difference of means. If σ02 ≈ α (1 − α )(µ1 − µ2 )2 , then p0 (x) can be said almost bipolar. So, we define the bipolarity γ as ª 1 © γ ≡ 2 α (1 − α )(µ1 − µ2 )2 . σ0

(3)

| µ 1 − µ2 | |µ1 − µ2 | and σ2 ≤ . n n

1 α (1 − α )d 2 o = 1− n . 1 α (1 − α )n2 1 + 2 d n2 + α (1 − α )

u = + x cos θ + y sin θ , v = − x sin θ + y cos θ .

(5)

If there is a pedestrian crossing in the image, then θ will closely correspond to the direction of crossing bands. The differential of i(x, y) along v direction includes alternate peaks and valleys on the edges of crossing bands (i.e. from black to white and vice versa). Therefore, ∂∂ vi has local extremes about direction v on the edges of crossing bands. For simplicity, we used i instead of i(x, y). Since the edges of crossing bands are straight lines, the integration along u direction emphasizes the local extremes. Consequently, one can find crossing bands by analyzing the projection as Z ∞ ∂i −∞

∂v

du,

(6)

which is a one-dimensional function about v. Then, the integral of square of (6) becomes a good measure of closeness to true crossing direction. Accordingly, we use θ that maximizes ¯ Z ∞ ¯¯Z ∞ ∂ i ¯¯2 ¯ du (7) ¯ dv. ¯ −∞ −∞ ∂ v Using Parseval’s formula, one can derive the following equation. ¯ Z Z ∞ ¯¯Z ∞ ∂ i ¯¯2 1 ∞ 2 ¯ du dv = ζ |I (−ζ sin θ , ζ cos θ )|2 d ζ , ¯ ¯ 2π −∞ −∞ −∞ ∂ v (8) where I(ξ , η ) is the Fourier transform of i(x, y) and is given by Z Z ∞



−∞ −∞

i(x, y)e− j(ξ x+η y) dxdy,

(9)

j is a complex operator and ζ = −ξ sin θ + η cos θ . Although the left hand side of (8) includes two integrations depending on θ , the right hand side includes only one integration depending on θ . Therefore, we find θ that maximizes the right hand side of (8).

2.3 Extraction of Feature Points by the Fisher Criterion

Then, (3) implies

γ≥

Let i(x, y) denote the intensity of an image, where x and y are the horizontal and vertical spatial coordinates, respectively. If the image rotates by an angle of θ , then we can write expressions for the rotated coordinates u and v as follows.

I(ξ , η ) =

Equation (3) implies that 0 ≤ γ ≤ 1. If γ = 1, there are α , p1 and p2 such that σ1 = σ2 = 0. This means p1 (x) = δ (x − µ1 ) and p2 (x) = δ (x − µ2 ). So, γ = 1 corresponds to perfect bipolarity and γ = 0 represents the absence of bipolarity. Let, d = |µ1 − µ2 | , σ1 ≤

2.2 Extraction of Crossing Direction

(4)

Therefore, using (4), one can estimate a lower bound of γ taking α as a parameter. For example, γ must be greater than 0.8 when there are p1 and p2 such that α = 0.5 and n ≥ 4.

- 2946 -

Fisher criterion8) extracts an edge point on a white band in a crossing. Consider a vertical window of size (2 × win + 1, 1) pixels whose center coincides with a feature point. In this paper the value of win is taken 6 pixels. Let Class 1 consists of pixel intensities in the upper half of the window and Class 2 consists of intensities in the lower half of the window. The Fisher criterion fc is defined as the ratio of σb (between class variance) to σw (within class variance): fc = σb /σw ,

(10)

where σb = p1 p2 (m1 − m2 )2 , σw = p1 v1 + p2 v2 , mi is the mean intensity, vi is the variance of intensity and pi is the probability of class i, here, i = 1, 2. If we scan the window in the vertical direction in a crossing region, then from the above definition, fc will be local maximum at the edge points of white bands.

end line

2nd line 1st line d y0

2.4 Projective Invariants

yn

According to projective transformation3) , for four collinear points, the cross ratio of their Euclidean distances I≡

l12 l34 , l13 l24

h camera center ground plane

optical axis

(11) Figure 1: Crossing length estimation model.

is invariant, where li j is the Euclidean distance between two points i and j. Earlier we mentioned that a pedestrian crossing is characterized by equal width periodic white and black stripes. Let denote the width of each crossing band is w. Consider feature points, which are edge points of white bands, then using (11) we can write the projective invariant for four consecutive feature points of a pedestrian crossing as I≡

image plane

f

l12 l34 w·w = = 0.25. l13 l24 2w · 2w

3. Experimental Method A flow diagram of the proposed method is presented in Fig. 2. Color to gray scale conversion Segmentation using bipolarity

(12) Extract highly bipolar and wide candidate regions

If there are n feature points, we can find (n − 3) projective invariants using (11). For each projective invariant I(k), k = 1, 2, · · · , n − 3, we check whether I(k) is within 10% of 0.25. That means, ¯ ¯ ¯ I(k) ¯ ¯ ¯ < 0.10. − 1.00 (13) ¯ 0.25 ¯ We have used this 10% tolerance on the basis of experimental data.

Fill small areas inside the candidate and refine

Y

Is the location of candidate appropriate?

N

Y Y Crossing Direction > 15 deg

Y

N

2.5 Crossing Length Estimation Model We use the projection model shown in Fig. 1 in estimating the crossing length. Let yn and y0 are the positions of the first and the end lines of crossing in the image plane, respectively, f is the focal length of the camera, h is the camera position from the road surface, d is the crossing length (i.e. the horizontal distance from the observer to the end line of crossing) and w is the width of a crossing band. If there are n number of bands in the crossing area, then we may easily find the following relations from Fig. 1. y0 =

fh d

and

yn =

fh . d − nw

feature points Y number >= 7

N

Is there any more candidate?

Y

Invariant number >= 1

N N

Y There is a crossing

There is a NO crossing

Determination of length using crossing region

(14)

Figure 2: Flow diagram of the proposed method.

(15)

3.1 Method for Detection of Existence of Crossing

Then, we get yn − y0 = f h ×

nw . d(d − nw)

We can derive the expression for crossing length from (15) as s ( ) 4nw f h 1 2 d= nw + (nw) + . (16) 2 yn − y0 In (16) w, f and h are known.

- 2947 -

At first, the color image is converted to a gray scale image. Then the image is partitioned into equal-size rectangular blocks of (16 × 16) pixels. We use the following two steps in finding the crossing region candidates.

1. This step identifies homogenous bipolar regions. Segmentation is carried out by merging neighboring blocks of similar pattern on the basis of intensity distributions.

invariant condition (13) then decide there is a crossing, otherwise go to examine the next candidate region (if any).

2. Only keep regions that are sufficiently bipolar. Calculate the bipolarity of each segmented region. First, determine the largest region that have bipolarity higher than 0.80. Then extract the candidate regions those have bipolarity greater than 0.80 and areas more than 50% of the largest region’s area. The above values are chosen based on experimental data.

On the basis of above steps 4 and 5, if there is a crossing in the vicinity (too left or too right or too far or too steep), then this information will also be conveyed to the user to find the appropriate location of a crossing.

Then, we follow the following steps in checking each candidate region.

We use the following steps in estimating the crossing length.

3. This section refines the segmentation. First, check the largest area candidate. If there exists a small region of different label within this candidate region, then fill it with same label if its area is less than 5% of the candidate area. Then refine the bottom boundary region of the crossing area. The refinement is done by assigning same label to the pixels on a different label horizontal line if its leftmost and rightmost pixels lie in the candidate region. 4. Make sure the region is centered in the image. Check the location (either it is in appropriate position or too left or too right or too far w. r. t. the observer) of the candidate region. If the total candidate region lies in a position, which is situated within 20% of image width from the left or right boundaries, then the position of the candidate region is treated too left or too right, respectively. If y-coordinate of the bottommost point of the candidate region is situated more than 40% of image height from the bottom boundary, then the position of the candidate region is treated too far from the observer. If the candidate region is not in an appropriate location then decide there is no crossing for this candidate and go to examine the next candidate region (if any). 5. This section corrects for viewing angle. Calculate the crossing direction at the location of the candidate region using (8) with the help of the two-dimensional fast Fourier transform. If the crossing direction is greater than a threshold (15◦ is used here on the basis of experimental data) then decide there is no crossing for this candidate, as the crossing direction is too steep and go to examine the next candidate region (if any). 6. This section extract feature points using Fisher criterion and then take the decision on the basis of projective invariants. Determine the center position of the candidate region. Then extract the feature points on the vertical line passing through the center point of the candidate region. If there are at least 7 feature points, then go for checking the projective invariant criterion. This condition ensures that there are more than 3 white bands exist in the crossing region. Some road marking (not crossing) consists of three painted stripes. An image of this type is shown in Fig. 3(f). Hence, the above condition will safeguard against false positive detection result. If there is at least one invariant satisfying the

- 2948 -

3.2 Method for Estimating Crossing Length

1. Integrate the crossing region of the image along the crossing direction (i.e. u-axis direction) for each v-axis position. Then to find the location of edges of white to black or black to white crossing bands, we use the differentiation along v direction of the integral data. There will be local extremes (maximum or minimum) at the position of edges. 2. Extract important extremes (true band edges) by comparing absolute value of the differentials. First, find a local extreme that has the maximum absolute value of differential. Then search a new local extreme whose value lies in the interval from 0.4 to 6.0 times of the previously selected extreme. Each newly selected extreme becomes reference for the next search of local extreme selection. In this way, the search and selection are continued for all local extremes. Omit the local extremes that are not selected. The parameters used in this step are also selected on the basis of experimental data. 3. Remove the adjacent local extremes of same sign. Starting from the farthest local extreme (on the basis of y position in the image), if there exists two adjacent extremes of same sign, remove the extreme that has lower value. 4. Count the number of local extremes and determine y positions of the first- and the last-local extremes. Then calculate the crossing distance using (16).

4. Results and Discussions To evaluate the performance of the proposed method we used 113 real images with different backgrounds taken by a commercial digital camera. Among them, 79 images include crossing and the rest 34 are of no crossing. Fig. 3 shows some samples of experimental images. The size of the images is (width × height) = (640 × 480) pixels. Fig. 4 presents bipolarity of each segmented region and extracted feature points in the candidate region for the crossing image of Fig. 3(a). The complete result of the detection of the existence of crossing is summarized in Table 1. From this table, we see that the proposed algorithm is quite successful in detecting the existence of crossings from real road images. The method has not made any dangerous (false positive) error such that it decides the existence of a crossing for a scene

(a)

(b)

(c)

(d)

crossing (i.e. false negative). These have happened only in 5 cases among 113 experimental images. Among them, 3 cases contain less than four white bands - an image of these type is shown in Fig. 3(b). A case shown in Fig. 3(c) where the white paintings on the crossing are damaged, gives false negative result. The rest case shown in Fig. 3(d) also gives false negative result. The segmentation step could not extract any highly bipolar wide area in this image, as the white paintings of crossing are done on unusual pattern of road surface.

(a)

(b)

Figure 5: Extracted feature points for Fig. 3(f): (a) by the previous method mentioned in reference 11) - gives false positive result, (b) by the present method - gives correct result. (e)

(f)

Figure 3: Some experimental images: crossing is detected in (a), false negative is obtained in (b)-(d), no crossing is detected in (e) and (f).

(a)

(b)

Figure 4: (a) Bipolarity of each segmented region, (b) extracted feature points for the crossing image of Fig. 3(a). Table 1: Crossing Detection Result Summary. Decision crossing No crossing Total number of images

Image with crossing 74 (Ok) 5 (False negative) 79

Image without crossing 0 (False positive) 34 (Ok) 34

In the earlier method mentioned in reference 11) , the feature points are extracted on the middle of an image without any segmentation and then use the projective invariant criterion to take final decision. For the image of Fig. 3(d), total 9 feature points are extracted and 2 invariants satisfy the projective criterion. Therefore, it gives life threatening false positive result. However, the present method gives the correct decision i.e. there is no crossing, as the extracted feature points are less than 7. The extracted feature points for both methods are shown in Fig. 5. Crossing length measurement is done for 74 images. Fig. 6 shows the extraction process of crossing bands, which are used in estimating crossing length for the image of Fig. 3(a). For this image, the true crossing length is 17.0 m and the estimated length is 17.2 m. The technique correctly estimated the crossing length. The average estimation error is 9.9% and the r.m.s. error is 2.0 m for the total 74 images. The maximum relative error is 30%. Exactly, extremely high accuracy is required for detection of the existence of crossing, as a false positive error is life threatening for a pedestrian. However, an idea of crossing length is sufficient for a pedestrian. In the present method, the error in the estimated length is not very high.

5. Conclusions

of without crossing. For a scene containing crossing where the white paintings on the crossing are damaged or the scene contains too few crossing bands - very short crossing (less than 4 white bands exist) then it decides nonexistence of

- 2949 -

In this paper, a simple and real-time method for the detection of existence of crossing as well as measurement of crossing length has been proposed. Experimental results with real road scenes confirmed its effectiveness. In detection of crossing, the method has not made any dangerous (false positive) error such that it decides the existence of a crossing for a scene of without crossing. For a crossing scene where the white paintings on the crossing are damaged or the scene contains too few crossing bands, then it decides

nonexistence of crossing (i.e. false negative). These have happened only in 5 cases among 113 experimental images. The method is successful in measuring the crossing length with a reasonable accuracy if the image resolution is good and the white paintings on the crossing are fine as well as there is no vehicle obstruction. We hope this technique will help in improving the mobility of blind people. We have not tested our algorithm with visually impaired users yet. In future, we would like to improve our system on the basis of feedbacks from visually impaired users. Beside this, for a complete system we need to include an algorithm for detection of traffic light and also illumination invariant strategy to make it robust to illumination changes.

Research (No. 16500110 and No. 03232).

References [1] B. Thylefors, A. D. Negrel, R. Pararajasegaram, K. Y. Dadzie, “Global data on blindness,” Bulletin of World Health Organization 73, pp.115-121, 1995. [2] D. L. Morrissette et al., “A follow-up study of the Mowat sensor’s applications, frequency of use, and maintenance reliability,” J. Vis. Impairment and Blindness 75, pp. 244-247, (1981). [3] I. Weiss, “Geometric invariants and object recognition,” Int. J. Comp. Vision, 10, no. 3, pp. 207-231, 1993.

300 250 200

[4] J. M. Benjamin, “The new C-5 laser cane for the blind,” in Carnahan Conf. on Electronic Prosthetics, pp. 7782, (1973).

projection imp_local_extremes

150 100 50

[5] L. Kay, “A sensor and to enhance spatial perception of the blind, engineering design and evaluation,” Radio Electron. Eng. 44, pp. 605-629, (1974).

0 -50

[6] M. S. Uddin and T. Shioyama, “Measurement of the length of pedestrian crossings - a navigational aid for blind people,” Proc. IEEE Intl. Conf. on Intelligent Transportation Systems (ITSC2004), Washington, USA, Oct. 2004 pp. 690-695.

-100 0

50

100

150

200 250 300 v-position [pixel]

350

400

450

(a)

[7] P. B. L. Meijer, “Vision technology for the totally blind,” [Online] http://www.seeingwithsound.com/ [8] R. A. Fisher, “The use of multiple measurements in taxonomical problems,” Annals of Eugenics 2, pp. 179188, (1936). [9] S. Shoval, J. Borenstein, Y. Koren “Auditory guidance with the navbelt-a computerized travel aid for the blind,” IEEE Trans. Syst. Man Cybern. C28, pp. 459467, (1998).

(b) Figure 6: Experimental result of the image of Fig. 3(a) for estimation of crossing length: (a) extraction process of band edges using crossing region only (“projection” indicates mean value of integration and “imp local extremes” indicates local maximum or minimum of differentials i.e. positions of band edges), (b) straight line marks are drawn at the positions of extracted band edges. The true crossing length is 17.0 m and the estimated length is 17.2 m.

[10] Stephen Se, “Zebra-crossing detection for the partially sighted,” Proc. IEEE Computer Society Conf. Computer Vision and Pattern Recognition (CVPR), South Carolina, USA, Jun. 2000, vol. 2, pp. 211-217. [11] T. Shioyama and M. S. Uddin, “Detection of pedestrian crossings with projective invariants from image data,” Meas. Sci. Technol. 15, no. 12, pp. 2400-2405, (2004). [12] T. Shioyama, H. Wu, N. Nakamura, S. Kitawaki, “Measurement of the length of pedestrian crossings and detection of traffic lights from image data,” Meas. Sci. Technol. 13, pp. 1450-1457, (2002).

Acknowledgments The authors gratefully acknowledge the suggestions and comments of Professor Yasuo Yoshida and Mr. Tadashi Matsuo of Kyoto Institute of Technology, Japan. They are also grateful for the support of Japan Society for the Promotion of Science under Grants-in-Aid for Scientific

- 2950 -

Suggest Documents