MEASUREMENT OF THE LENGTH OF PEDESTRIAN CROSSINGS THROUGH IMAGE PROCESSING

Mohammad Shorif Uddin and Tadayoshi Shioyama
Department of Mechanical and System Engineering, Kyoto Institute of Technology, Matsugasaki, Sakyo-ku, Kyoto 606-8585, Japan
E-mail: [email protected]
Abstract
A new computer vision based method for measuring the length of pedestrian crossings, developed as a travel aid for blind people, is described. At a crossing, the usual black road surface is painted with periodic white bands of constant width; in Japan, this width is 45 cm. The crossing region and its length are determined using this property. Experimental results on real road scenes with pedestrian crossings confirm the effectiveness of the proposed method.
Keywords:
Image analysis; computer vision; pedestrian crossing; measurement of length; travel aid for the blind.
1. INTRODUCTION
This paper discusses an application of computer vision to improving the mobility of the millions of blind people all over the world. Usually, a blind person uses a white cane as a travel aid, but the range over which a cane can detect special patterns or obstacles is very narrow. To improve the usefulness of the white cane, various devices have been developed, such as the SONICGUIDE,1 the Mowat sensor,2 the Laser cane3 and the Navbelt.4 However, these devices cannot assist the blind at a pedestrian crossing, where information about the existence of the crossing, its length and the state of the traffic lights is important. There are traffic lights with special equipment that notifies the blind of a safe direction at a crossing by sounding a beeping noise during the green signal. However, such equipment does not inform the blind of the length of the crossing and is not available at every crossing; it would likely take too long for such equipment to be installed and maintained at every crossing. Blind people cannot see but can hear, and navigation is the number one barrier for them. The arrival of fast and cheap portable laptop computers with multimedia capabilities that convert audio-video streams in real time opens new pathways to the development of an intelligent navigation system for blind people.

K. Wojciechowski et al. (eds.), Computer Vision and Graphics, 51–56. © 2006 Springer. Printed in the Netherlands.
In this paper we propose a simple image based method to measure the crossing length, assuming that the location of the pedestrian crossing is known. Previously, Shioyama et al.5 developed an image analysis method based on edge detection, the Hough transformation and the SUSAN feature detector. However, it is complicated, computationally inefficient and needs many parameters to be adjusted. In contrast to the previous method, the present method is fast and needs very few parameters. To evaluate the performance of the proposed method, experiments are performed using real road scenes with pedestrian crossings.
2. METHOD FOR MEASUREMENT OF CROSSING LENGTH
In a crossing, the usual black road surface is painted with periodic white bands of constant width. Image processing is applied to extract the number of white and black bands in the crossing region.

[Figure 1. Crossing length estimation model: diagram showing the camera center at height h above the ground plane, the optical axis, the image plane with positions y_0 and y_n of the end line and the first line, the focal length f and the crossing length d.]

2.1 Crossing length estimation model
The crossing length estimation model is described in Fig. 1. Let y_n and y_0 be the positions in the image plane of the first and the end lines of the crossing, respectively, f the focal length of the camera, h the height of the camera above the road surface, d the crossing length (i.e. the horizontal distance from the observer to the end line of the crossing) and w the width of a crossing band. If there are n bands in the crossing image, then we easily find y_0 = fh/d and y_n = fh/(d - nw). Hence

    y_n - y_0 = f h nw / [ d (d - nw) ].    (1)
Solving Eq. (1) for the crossing length gives

    d = (1/2) [ nw + sqrt( (nw)^2 + 4 nw f h / (y_n - y_0) ) ].    (2)
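As a minimal sketch, Eq. (2) can be evaluated directly once the number of bands and the two image positions are known. The default focal length f (in pixels) and camera height h (in metres) below are hypothetical values chosen for illustration, not taken from the paper:

```python
import math

def crossing_length(n, y_n, y_0, w=0.45, f=800.0, h=1.25):
    """Eq. (2): d = 0.5 * (nw + sqrt((nw)^2 + 4*nw*f*h / (y_n - y_0))).

    n    : number of crossing bands found in the image
    y_n  : image-plane position of the first (nearest) line of the crossing
    y_0  : image-plane position of the end (farthest) line of the crossing
    w    : band width in metres (45 cm in Japan)
    f, h : focal length (pixels) and camera height (metres); hypothetical
    """
    nw = n * w
    return 0.5 * (nw + math.sqrt(nw ** 2 + 4.0 * nw * f * h / (y_n - y_0)))
```

As a consistency check, for a 17 m crossing with n = 10 bands and fh = 1000, Eq. (1) gives y_0 = fh/d = 1000/17 and y_n = fh/(d - nw) = 1000/12.5; feeding those back in recovers d = 17 m.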
2.2 Feature extraction principle
Let i(x, y) denote the intensity of an image, where x and y are the horizontal and vertical spatial coordinates, respectively. If the image is rotated by an angle θ and (u, v) are the rotated coordinates corresponding to (x, y), then u and v are calculated as follows: u = x cos θ + y sin θ and v = -x sin θ + y cos θ. If there exists a pedestrian crossing in the image, then θ will closely correspond to the direction of the crossing bands. The differential of i(x, y) along the v direction has alternate peaks and valleys at the edges of the crossing bands (i.e. from black to white and vice versa). Therefore, ∂i/∂v has local extremes at the edges of the crossing bands. Since the edges of the crossing bands are straight lines, integration along the u direction emphasizes these local extremes. Consequently, one can find the crossing bands by analyzing the projection ∫_{-∞}^{∞} (∂i/∂v) du, which is a one-dimensional function of v. The integral of the square of this projection is a good measure of closeness to the true crossing direction. Accordingly, we use the θ that maximizes ∫_{-∞}^{∞} [ ∫_{-∞}^{∞} (∂i/∂v) du ]^2 dv. Using Parseval's formula, one can derive the following equation:

    ∫_{-∞}^{∞} [ ∫_{-∞}^{∞} (∂i/∂v) du ]^2 dv = (1/2π) ∫_{-∞}^{∞} ζ^2 |I(-ζ sin θ, ζ cos θ)|^2 dζ,    (3)

where I(ξ, η) = ∫_{-∞}^{∞} ∫_{-∞}^{∞} i(x, y) e^{-j(ξx + ηy)} dx dy, j is the imaginary unit, and ζ = -ξ sin θ + η cos θ. We find the θ that maximizes the right-hand side of Eq. (3).
2.3 Crossing direction estimation
The images used in this paper are of size (width × height) = (640 × 480) pixels, and the origin of the image coordinates is at the upper left corner of the image. First, the color image is converted to a gray scale image, and the power spectrum is calculated using the 2D FFT of the gray scale image. For the FFT, to make the numbers of samples integer powers of 2, we use the maximum possible region of the image, taken from its lower part: the camera is set at the height of an observer's eye, so the crossing always lies in the lower region of the image. For an image of size (640 × 480) pixels, we take (512 × 256) pixels from the lower region for the Fourier transformation, which is the maximum possible number of samples. The maximum value of the power spectrum measure then corresponds to the crossing direction.
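A minimal numpy sketch of this direction search follows. It approximates the right-hand side of Eq. (3) by summing ζ²|I|² over a narrow frequency band around the line (ξ, η) = ζ(-sin θ, cos θ); the band-mask width and the angular grid are our choices, not specified in the paper:

```python
import numpy as np

def estimate_crossing_direction(gray, n_angles=180):
    """Return the angle theta (radians) maximizing the spectral measure
    of Eq. (3), evaluated on a discrete grid of candidate directions."""
    h, w = gray.shape
    # largest power-of-two region, taken from the lower part of the image
    H = 2 ** int(np.floor(np.log2(h)))
    W = 2 ** int(np.floor(np.log2(w)))
    region = gray[h - H:, :W].astype(float)

    I = np.fft.fftshift(np.fft.fft2(region))
    power = np.abs(I) ** 2

    eta = np.fft.fftshift(np.fft.fftfreq(H))[:, None]  # vertical frequency
    xi = np.fft.fftshift(np.fft.fftfreq(W))[None, :]   # horizontal frequency

    best_theta, best_score = 0.0, -1.0
    for theta in np.linspace(0.0, np.pi, n_angles, endpoint=False):
        zeta = -xi * np.sin(theta) + eta * np.cos(theta)
        off = xi * np.cos(theta) + eta * np.sin(theta)  # distance off the line
        mask = np.abs(off) < 1.0 / max(H, W)            # narrow band around it
        score = float(np.sum(zeta ** 2 * power * mask))
        if score > best_score:
            best_theta, best_score = theta, score
    return best_theta
```

On a synthetic image whose bands run horizontally (intensity varying only with the row index), the returned angle is close to 0, i.e. the u axis aligns with the bands.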
2.4 Crossing pattern extraction
We integrate the gray scale image along the u-axis direction for each v-axis position. The shape of this projection is periodic if crossing patterns exist in the image. Differentiating the integral data along the v direction then gives the locations of the edges of the white and black bands. Next, we follow these steps.

1 Starting from the bottom of the image, integrate all absolute values of the differentiation data and smooth the result using a moving-average window whose size is 1/20 of the image height. Owing to the crossing patterns, the gradient of this smoothed integration result changes sharply at the end line of the crossing region. To emphasize this change, apply a Laplacian filter of size N_l = 1/10 of the image height and take the position of the global maximum of the filtered output; this is the end line of the crossing. The Laplacian filter can be described as L(t) = a(t) - (1/N_l) Σ_{n=-N_l/2, n≠0}^{N_l/2} a(t + n), where a(t) is the integration data after smoothing.

2 From the bottom of the image up to the end line of the crossing, extract the important extremes by comparing the absolute values of the differentials.

3 Starting from the farthest local extreme (on the basis of y position in the image), if two adjacent extremes of the same sign exist, remove the extreme with the lower value. The straight lines along the crossing direction at the remaining extremes' locations indicate the boundaries of the crossing bands.

We could estimate the crossing distance using Eq. (2) from the number of crossing bands extracted above. However, this estimate may be erroneous, as the regions neighboring the crossing influence the extraction of the crossing bands. So, for an accurate estimate, we use the following steps to extract the crossing region from the whole image.

4 A crossing region is characterized by alternate black and white bands, so if a band is white, its mean intensity must be greater than the mean intensities of the immediately preceding and following bands. Using the bands selected in Step 3, determine the white bands from the mean image intensity of each band region.

5 Integrate the gray values within each white band along the v direction. To find the left and right border points of each white band, use a step filter of size N_s = 128 pixels. The step filter can be described as S(t) = (1/N_s) [ Σ_{n=-N_s/2}^{-1} b(t + n) - Σ_{n=1}^{N_s/2} b(t + n) ], where b(t) is the mean integration data along the v direction.

6 To determine the left and right border lines of the crossing region from the left and right border candidate points, respectively, use the minimum-error strategy with omission.
7 Starting from the bottom of the image, integrate the image along the u direction for each v position within the left and right boundaries. Determine the positions of the white and black bands from the differentials of these integration data using Steps 1 to 3 above, and then calculate the crossing length using Eq. (2).
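The step filter used in Step 5 to locate the left and right borders of a white band can be sketched as follows; leaving the border samples, where the window does not fit, at zero is our assumption:

```python
import numpy as np

def step_filter(b, Ns=128):
    """S(t) = (1/Ns) * (sum_{n=-Ns/2}^{-1} b(t+n) - sum_{n=1}^{Ns/2} b(t+n)).

    b is the mean integration data along the v direction. S swings negative
    at a dark-to-bright (left) border of a white band and positive at a
    bright-to-dark (right) border.
    """
    half = Ns // 2
    S = np.zeros(len(b))
    for t in range(half, len(b) - half):
        left = b[t - half:t].sum()            # samples before position t
        right = b[t + 1:t + half + 1].sum()   # samples after position t
        S[t] = (left - right) / Ns
    return S
```

On a profile that is bright on [150, 250) and dark elsewhere, the minimum of S falls at the left border and the maximum at the right border, so the two extremes give the border candidate points.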
Figure 2. Two experimental images of pedestrian crossings. The crossing region in each image is marked by border lines. (a) True length 17.00 m, estimated length 16.26 m, error 4.3%. (b) True length 18.56 m, estimated length 13.78 m, error 25.8%.
[Figure 3(a) plot omitted: curves "projection", "differential", "int_diff_smooth", "int_diff_laplace" and markers "imp_local_extremes" plotted against v-position [pixel] (0 to 450), vertical scale -50000 to 200000.]

Figure 3. (a) Integration "projection", differentiation "differential", integration of the differential data after smoothing "int_diff_smooth", Laplacian filtered output "int_diff_laplace" and important local extreme positions "imp_local_extremes" for the crossing image shown in Fig. 2(a), using the crossing region only. (b) Straight line marks drawn at the positions of the important local extremes for the same crossing image.
3. EXPERIMENTAL RESULTS
To evaluate the performance of the proposed method for the measurement of crossing length, we used 77 real images of crossings taken by a commercial digital camera under various illumination and weather conditions (except rain). Two samples of the experimental images are shown in Fig. 2. Under each image we show the true crossing length, the estimated crossing length and the percentage error. The extracted border lines are also shown in these images. Fig. 3(a) presents the results of integration, differentiation, integration of the differential data after smoothing, Laplacian filtered output and important local extreme positions for the image of Fig. 2(a), using the crossing region only. In Fig. 3(b), straight line marks are drawn at the positions of the important local extremes for the same crossing image. From the experimental results, we find that the proposed method is successful in determining the length of pedestrian crossings. The average relative estimation error is 8.5% and the r.m.s. error is 1.87 m. The maximum relative error is 25.8%; it occurs for the image shown in Fig. 2(b), where the white paint markings and the image resolution are both imperfect. Though the relative and maximum errors in the present investigation are somewhat large, we expect they can be reduced by (i) maintaining clear white paint markings on the crossing, (ii) taking images at a higher resolution and (iii) adding a vehicle detection algorithm as a preprocessing step, which would ensure that no vehicle obstructs the crossing in the image. The approximate computation time of the proposed algorithm for the measurement of a crossing length is 1.30 s on an Intel Pentium M 1600 MHz processor.
4. CONCLUSIONS
In this paper, a simple and fast computer vision based technique for measuring the length of pedestrian crossings has been described. On 77 real road scenes with pedestrian crossings, the average relative estimation error and the r.m.s. error are found to be 8.5% and 1.87 m, respectively. The main sources of measurement error are low image resolution and distorted white paint markings. We expect the accuracy to increase greatly once these causes are overcome. As computer hardware costs continue to fall, we hope the system will be affordable for blind people.
ACKNOWLEDGMENTS

The authors are grateful for the support of the Japan Society for the Promotion of Science under Grants-in-Aid for Scientific Research (No. 16500110 and No. 03232).
REFERENCES
1. L. Kay, Radio Electron. Eng. 44, 605-629, 1974.
2. D. L. Morrissette et al., J. Vis. Impairment and Blindness 75, 244-247, 1981.
3. J. M. Benjamin, in Carnahan Conf. on Electronic Prosthetics, 77-82, 1973.
4. S. Shoval et al., IEEE Trans. Syst. Man Cybern. C28, 459-467, 1998.
5. T. Shioyama et al., Meas. Sci. Technol. 13, 1450-1457, 2002.