Measurement of Pedestrian Crossing Length Using Vector Geometry – an Image Based Technique Mohammad Shorif Uddin and Tadayoshi Shioyama Department of Mechanical and System Engineering Kyoto Institute of Technology Matsugasaki, Sakyo-ku, Kyoto 606-8585, Japan E-mail:
[email protected]

Abstract—A simple computer vision based method for measuring the length of pedestrian crossings is described. The main objective of this research is to develop a travel aid for blind people. In a crossing, the usual black road surface is painted with constant-width periodic white bands. The crossing length is estimated using vector geometry from the left and right border lines and the first, second and end edge lines of the crossing area. An image processing technique is applied to the crossing image to find these lines. Experiments are performed using 32 real road scenes with pedestrian crossings. The r.m.s. error of estimation is found to be 2.30 m.
I. INTRODUCTION

This paper discusses an application of computer vision to improve the mobility of the millions of blind people all over the world. Usually a blind person uses a white cane as a travel aid, but the range over which special patterns or obstacles can be detected with a cane is very narrow. To improve the usefulness of the white cane, various devices have been developed, such as the SONICGUIDE [1], the Mowat sensor [2], the Laser cane [3] and the Navbelt [4]. However, these devices are not able to assist the blind at a pedestrian crossing, where information about the existence of the crossing, its length and the state of the traffic lights is important. There are traffic lights with special equipment which notifies the blind of a safe direction at a crossing by sounding a beeping noise during the green signal. However, such equipment does not inform the blind about the length of the crossing and is not available at every crossing; indeed, it would take too long for such equipment to be installed and maintained at every crossing.

In this paper we concentrate on the measurement of crossing length, assuming that the location of the crossing is known. Our aim is to develop a device with which the blind can autonomously obtain the important information needed to negotiate a crossing safely. To achieve this objective using image data, we propose a method for analyzing an image of a crossing to measure its length. Previously, Shioyama et al. [5] developed an image analysis method based on edge detection, the Hough transform and the SUSAN feature detector. However, that method is complicated, computationally inefficient and needs many thresholding parameters to be adjusted. In contrast, the present method is fast and needs few parameters. In a pedestrian crossing, the usual black road surface is painted with constant-width periodic white bands. An image processing technique is applied to the crossing image to extract the crossing region with the required features.
The crossing length is calculated using vector geometry from the edge lines of the first, the second and the end bands, and the left and right border lines of the crossing region. In order to evaluate the performance of the proposed method, experiments are performed using 32 real road scenes with pedestrian crossings.

II. METHOD FOR MEASUREMENT OF CROSSING LENGTH
A. Model of a Crossing

In a crossing, the usual black road surface is painted with constant-width periodic white bands. Fig. 1 shows a crossing image in which the required edge lines are marked. In this figure, $L_1$ is the edge line of the first band, i.e. the nearest horizontal edge line among the observed bands in the image, $L_2$ is the second edge line, $L_e$ is the end edge line, i.e. the farthest horizontal edge line in the crossing, and $L_3$ and $L_4$ are the left and right border lines of the crossing region, respectively.

B. Principle of Crossing Length Measurement Using Vector Geometry

Fig. 2 shows the perspective projection geometry for an observed crossing image. Let $(X, Y, Z)$ be the camera coordinates, $\mathbf{n}$ the unit surface normal of the road surface, and $d$ the camera height, i.e. the distance from the origin of the camera coordinate system to the road surface. The $Z$-axis coincides with the optical axis of the camera. The road surface $\bar{P}_0$ is then described as

$aX + bY + cZ + d = 0, \quad \mathbf{n} = (a, b, c)^T, \quad a^2 + b^2 + c^2 = 1,$  (1)

where $T$ denotes the transpose operator. Let $l_e$ be the end edge line of the crossing in three-dimensional (3D) space, whose perspective projection in the image is $L_e$. Similarly, let $l_i$, $i = 1, 2, 3, 4$, be the lines in 3D space whose perspective projections are $L_i$, $i = 1, 2, 3, 4$, as illustrated in Fig. 2. Let $\bar{P}$ be the plane which passes through the line $l_e$ and the origin $O$ of the camera coordinates, and let $\mathbf{m}_e$ be the unit surface normal of the plane $\bar{P}$. The vector $\mathbf{m}_e$ is obtained from the observed line $L_e$ because the plane $\bar{P}$ passes through the line $L_e$ in the image plane, which is described as $Z = f$, where $f$ is the focal length. We denote by $\mathbf{m}_i$, $i = 1, 2, 3, 4$, the unit surface normals of the planes obtained from the lines $L_i$, $i = 1, 2, 3, 4$, in the same way as $\mathbf{m}_e$. If the lines $L_i$, $i = 1, 2, 3, 4, e$, are represented by

$A_i x + B_i y + C_i = 0,$  (2)

with image coordinates $(x, y)$, where the $x$ and $y$ axes are parallel to the $X$ and $Y$ axes, respectively, then the vectors $\mathbf{m}_i$, $i = 1, 2, 3, 4, e$, are given by

$\mathbf{m}_i = (A_i, B_i, C_i/f)^T \big/ \sqrt{A_i^2 + B_i^2 + (C_i/f)^2}.$  (3)

Let $\mathbf{u}$ be the unit vector with the 3D direction of $l_1$, $l_2$ and $l_e$, which are parallel to each other, and let $\mathbf{v}$ be the unit vector with the 3D direction of $l_3$ and $l_4$. By definition, the vector $\mathbf{u}$ is
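For illustration, (2)-(3) amount to normalizing the coefficient vector of an observed image line after dividing its constant term by the focal length. A minimal NumPy sketch (the function name is our own choice, not part of the paper):

```python
import numpy as np

def plane_unit_normal(A, B, C, f):
    """Unit surface normal (Eq. (3)) of the plane through the camera
    origin O and the image line A*x + B*y + C = 0 lying on the image
    plane Z = f."""
    m = np.array([A, B, C / f], dtype=float)
    return m / np.linalg.norm(m)
```

The same routine serves for all five observed lines $L_1, L_2, L_3, L_4$ and $L_e$.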
orthogonal to the vectors $\mathbf{m}_1$, $\mathbf{m}_2$, $\mathbf{m}_e$ and $\mathbf{n}$, and the vector $\mathbf{v}$ is orthogonal to the vectors $\mathbf{m}_3$, $\mathbf{m}_4$ and $\mathbf{n}$. Since $\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3, \mathbf{m}_4$ and $\mathbf{m}_e$ are obtained from the observed lines $L_1, L_2, L_3, L_4$ and $L_e$, respectively, the unknown vectors $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{n}$ can be obtained from these relations. We define $\bar{P}_e$ as the plane which passes through the origin $O$ and has the surface normal $\mathbf{u}$. Then the nearest point $R_e$ from the origin $O$ to the end edge line $l_e$ is given by the intersection of the three planes $\bar{P}_0$, $\bar{P}$ and $\bar{P}_e$, as illustrated in Fig. 3. The horizontal component $h_e$ of the crossing length $\|\vec{OR}_e\|$ from the origin $O$ to the point $R_e$ is given by

$h_e = \sqrt{\|\vec{OR}_e\|^2 - d^2}.$  (4)

On the basis of the above explanation, we can obtain the vectors $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{n}$ as follows:

$\mathbf{u} = \frac{\mathbf{m}_1 \times \mathbf{m}_2}{\|\mathbf{m}_1 \times \mathbf{m}_2\|}, \quad \mathbf{v} = \frac{\mathbf{m}_3 \times \mathbf{m}_4}{\|\mathbf{m}_3 \times \mathbf{m}_4\|}, \quad \mathbf{n} = \frac{\mathbf{u} \times \mathbf{v}}{\|\mathbf{u} \times \mathbf{v}\|},$  (5)

where the symbols $\times$ and $\|\cdot\|$ indicate the vector product and the norm, respectively. The point of intersection $\mathbf{R}_e = (X, Y, Z)^T$ of the three planes $\bar{P}$, $\bar{P}_e$, $\bar{P}_0$ can be obtained by solving the equation

$A\mathbf{R}_e = \mathbf{a},$  (6)

where the matrix $A$ is defined as $A \equiv (\mathbf{m}_e, \mathbf{u}, \mathbf{n})^T$ and the vector $\mathbf{a}$ is defined as $\mathbf{a} \equiv (0, 0, -d)^T$.

Fig. 1. Important edge lines of a crossing.

Fig. 2. Perspective projection of a crossing.

Fig. 3. Crossing length from the intersection of the three planes $\bar{P}_0$, $\bar{P}$ and $\bar{P}_e$.

C. Image Processing Technique

1) Principle: Let $f(x, y)$ denote the intensity of an image, where $x$ and $y$ are the horizontal and vertical spatial coordinates, respectively. If the image is rotated by an angle $\theta$ and $(u, v)$ are the rotated coordinates corresponding to $(x, y)$, then $u$ and $v$ are given by

$u = x\cos\theta + y\sin\theta,$
$v = -x\sin\theta + y\cos\theta.$  (7)

If there exists a pedestrian crossing in the image, then $\theta$ will closely correspond to the direction of the crossing bands. The derivative of $f(x, y)$ along the $v$ direction has alternate peaks and valleys on the edges of the crossing bands (i.e. from black to white and vice versa); that is, $\partial f/\partial v$ has local extrema in the $v$ direction on the edges of the crossing bands. For simplicity, we write $f$ instead of $f(x, y)$. Since the edges of the crossing bands are straight lines, integration along the $u$ direction emphasizes these local extrema. Consequently, one can find the crossing bands by analyzing the projection

$\int_{-\infty}^{\infty} \frac{\partial f}{\partial v}\, du,$  (8)

which is a one-dimensional function of $v$. As mentioned earlier, (8) has alternate prominent peaks and valleys when crossing bands exist in the image. The integral of the square of (8) therefore becomes a good measure of closeness to the true crossing direction. Accordingly, we use the $\theta$ that maximizes

$\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty} \frac{\partial f}{\partial v}\, du\right)^2 dv.$  (9)

Using Parseval's formula, one can derive the following equation:

$\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty} \frac{\partial f}{\partial v}\, du\right)^2 dv = \frac{1}{2\pi}\int_{-\infty}^{\infty} \zeta^2 \left|F(-\zeta\sin\theta,\ \zeta\cos\theta)\right|^2 d\zeta,$  (10)

where

$F(\xi, \eta) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\, e^{-i(\xi x + \eta y)}\, dx\, dy.$  (11)

Hence, we can calculate $\theta$ from the right-hand side of (10).
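Equations (4)-(6) reduce to a few lines of linear algebra once the five unit normals are known. A sketch under our own naming, assuming the $\mathbf{m}$-vectors are available as NumPy arrays:

```python
import numpy as np

def unit(w):
    """Normalize a vector to unit length."""
    return w / np.linalg.norm(w)

def crossing_length(m1, m2, m3, m4, me, d):
    """Horizontal distance h_e to the end edge line (Eqs. (4)-(6))."""
    u = unit(np.cross(m1, m2))   # Eq. (5): direction of the band edge lines
    v = unit(np.cross(m3, m4))   # direction of the left/right border lines
    n = unit(np.cross(u, v))     # road-surface normal
    A = np.vstack([me, u, n])    # Eq. (6): A = (m_e, u, n)^T
    a = np.array([0.0, 0.0, -d])
    Re = np.linalg.solve(A, a)   # intersection of planes P, P_e, P_0
    return np.sqrt(Re @ Re - d ** 2)   # Eq. (4)
```

Note that the sign ambiguities of the cross products are harmless here: flipping any row of $A$ at most replaces $\mathbf{R}_e$ by $-\mathbf{R}_e$, and only $\|\mathbf{R}_e\|$ enters (4).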
2) Crossing Direction Estimation: The images used in this paper are taken by a commercial camera, and the image size is (width × height) = (640 × 480) pixels. First, the color image is converted to a gray-scale image, and the power spectrum is calculated using a 2D FFT of the gray-scale image. For the FFT, the sample numbers must be integer powers of 2, so we use the maximum possible region of the image. We choose the lower region of the image because the camera is set at the height of an observer's eye, so the crossing is always in the lower region of the image. For the image size of (640 × 480) pixels, we take (512 × 256) pixels from the lower region for the Fourier transform; these are the maximum possible sample numbers. The origin of the image coordinates is chosen at the upper-left corner of the image. The maximum value of the power spectrum corresponds to the crossing direction.
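The direction estimate can be sketched as follows. This is a simplified illustration under our own assumptions (no spectrum smoothing; the peak is picked after suppressing the DC component), not the authors' exact procedure; the returned angle is the orientation of the dominant spatial frequency, i.e. the normal to the band stripes:

```python
import numpy as np

def dominant_orientation(gray):
    """Locate the peak of the 2D power spectrum of a gray-scale region
    (e.g. the lower 512x256 pixels) and return the angle of that peak
    frequency in radians. Periodic crossing bands produce a strong peak
    whose frequency vector is normal to the stripes."""
    F = np.fft.fftshift(np.fft.fft2(gray))
    P = np.abs(F) ** 2
    cy, cx = P.shape[0] // 2, P.shape[1] // 2
    P[cy, cx] = 0.0                      # suppress the DC component
    iy, ix = np.unravel_index(np.argmax(P), P.shape)
    return np.arctan2(iy - cy, ix - cx)  # angle of the peak frequency
```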
Fig. 4. Some experimental images, panels (a)-(f), along with the extracted region of the pedestrian crossing. The crossing region in each image is marked by border lines.
3) Crossing Pattern Extraction: We integrate the gray-scale image along the $u$-axis direction for each $v$-axis position using (8). The shape of this projection plot is periodic if crossing patterns exist in the image, and the derivative of the integrated data along the $v$ direction then gives the locations of the edges of the white and black bands. Next, we proceed with the following steps.

Step 1: Determine the end line of the crossing using the density of the differentiated data. Starting from the bottom of the image, integrate all absolute values of the differentiated data and then smooth the result using a moving-average window whose size is 1/20 of the image height. Due to the crossing patterns, the gradient of this smoothed integral changes sharply at the end line of the crossing region. To emphasize this change, apply a Laplacian filter whose size is 1/10 of the image height, and take the position of the global maximum of the filtered output as the end line of the crossing. The Laplacian filter is described as

$L(t) = a(t) - \frac{1}{N_l} \sum_{\substack{n = -N_l/2 \\ n \neq 0}}^{N_l/2} a(t + n),$  (12)

where $N_l$ is the size of the Laplacian filter and $a(t)$ are the integrated data after smoothing.

Step 2: Determine all local extrema from the bottom of the image to the end line of the crossing using the differentiated data.
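The Laplacian filter of (12) used in Step 1 can be sketched as below; the handling of out-of-range neighbours at the array borders is our own assumption, since the paper does not specify it:

```python
import numpy as np

def laplacian_filter(a, Nl):
    """Eq. (12): L(t) = a(t) - (1/Nl) * sum_{n=-Nl/2..Nl/2, n!=0} a(t+n),
    where a(t) are the smoothed integration data. Out-of-range neighbours
    near the borders are simply skipped (an assumption of this sketch)."""
    a = np.asarray(a, dtype=float)
    half = Nl // 2
    out = np.empty_like(a)
    for t in range(len(a)):
        nbrs = [a[t + n] for n in range(-half, half + 1)
                if n != 0 and 0 <= t + n < len(a)]
        out[t] = a[t] - sum(nbrs) / Nl
    return out
```

A strong positive spike in $L(t)$ then marks the sharp gradient change at the end line of the crossing.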
Step 3: Extract the important extrema by comparing the absolute values of the derivatives. First, find the local extremum with the maximum absolute derivative in the region between $y = 0.8 \times$ height and $y =$ height; this is the first selected extremum. Searching from the bottom of the image, select a new local extremum whenever its value lies in the interval from 0.4 to 6.0 times that of the previously selected extremum. Each newly selected extremum becomes the reference for the next selection. The search and selection continue in this way over all local extrema; extrema that are not selected are omitted.

Step 4: Starting from the farthest local extremum (on the basis of $y$ position in the image), if two adjacent extrema have the same sign, remove the one with the lower value. The straight lines along the crossing direction at the remaining extrema's locations indicate the boundaries of the crossing bands.

Step 5: A crossing region is characterized by alternating black and white bands, so if a band is white, its mean intensity must be greater than the mean intensities of the immediately preceding and following bands. Using the bands selected in Step 4, determine the white bands from the mean image intensity of each band region.

Step 6: Integrate the gray values within each white band along the $v$ direction. To find the left and right border points of each white band, first apply a step filter of size $w = 128$ pixels. In some cases there are small black blocks inside a white band of the crossing due to various road markings, so pick the step points as candidate border points of a white band whenever the local maximum of the filtered data is greater than half of the global maximum. Connect two white blocks when the length of the black block between them is less than 1.2 times the length of the smaller white block. Then determine the longest white block in each white band. The leftmost and rightmost points of each longest white block are the candidate points for the left and right border lines of the crossing region, respectively. The step filter is described as

$S(t) = \frac{1}{N_s/2} \left| \sum_{n=t+1}^{t+N_s/2} b(n) - \sum_{n=t-N_s/2}^{t-1} b(n) \right|,$  (13)
where $N_s$ is the size of the step filter and $b(t)$ are the integrated data.

Step 7: To determine the left border line of the crossing region from the left border candidate points, use the following minimum-error strategy. Since two points are sufficient to define a straight line, if there are $n$ candidate points then at most $(n - 2)$ points can be omitted. First, draw a straight line through every pair of points and calculate the distances from the remaining points to that line; the error is the square root of the sum of the squared distances. Determine the line that gives the minimum error among all candidate lines. If the minimum error is less than a threshold (here, 2 pixels is used), the minimum-error line is the border line. If the error is greater than the threshold, determine a new minimum-error line while omitting $1, 2, 3, \ldots, (n - 2)$ points, checking the minimum error against the threshold each time. The right border line is determined from the right border candidate points in the same way.

Step 8: In Step 4 the crossing bands are estimated using the intensity of the whole image; this estimation may be erroneous because the regions neighbouring the crossing, with their road markings, influence the extraction of the crossing bands. It is therefore better to determine the exact crossing bands from the crossing region only. Starting from the bottom of the image, integrate the image along the $u$ direction for each $v$ position within the left and right boundaries. Determine the positions of the white and black bands from the derivatives of these integrated data using Steps 1 to 4 above. Finally, calculate the crossing length using (4).
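Two pieces of the pipeline lend themselves to compact sketches: one plausible reading of the step filter in (13) (Step 6), namely the scaled absolute difference of the sums of $b$ over the $N_s/2$ samples on either side of $t$, and the pairwise minimum-error line fit of Step 7. Names and border handling are our own choices; the progressive omission of up to $(n-2)$ points when the error exceeds the 2-pixel threshold is left out:

```python
import numpy as np
from itertools import combinations

def step_filter(b, Ns, t):
    """One reading of Eq. (13): absolute difference between the sums of b
    over the Ns/2 samples just after t and just before t, scaled by 2/Ns.
    Responds strongly where b jumps between two intensity levels."""
    b = np.asarray(b, dtype=float)
    half = Ns // 2
    right = b[t + 1 : t + half + 1].sum()
    left = b[t - half : t].sum()
    return abs(right - left) / half

def fit_border_line(points):
    """Pairwise minimum-error search of Step 7: every pair of candidate
    points defines a line; a line's error is the square root of the sum of
    squared distances of the candidate points to it. Returns the best
    (p, q, error) triple."""
    pts = np.asarray(points, dtype=float)
    best = None
    for i, j in combinations(range(len(pts)), 2):
        p, q = pts[i], pts[j]
        d = q - p
        normal = np.array([-d[1], d[0]]) / np.linalg.norm(d)  # unit normal
        err = np.sqrt(np.sum(((pts - p) @ normal) ** 2))       # RMS-style error
        if best is None or err < best[2]:
            best = (p, q, err)
    return best
```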
III. EXPERIMENTAL RESULTS

To evaluate the performance of the proposed method, we have used 32 real images of crossings taken with a digital camera (Sony DCR-VX1000) under various illumination conditions. Some samples of the experimental images are shown in Fig. 4, together with the extracted border lines. Fig. 5 shows the mean integration data of each band for detecting its end points. Fig. 6 shows the results of integration, differentiation, integration of the differentiated data after smoothing, the Laplacian-filtered output and the important local extremum positions for the image of Fig. 1, using the crossing region only. Table I presents the measurement error for the images shown in this paper.

Fig. 5. Mean integration (perpendicular to the crossing) data "mean_int" of each band for the crossing image of Fig. 1. Intensity contours are also shown.

Fig. 6. Integration "projection", differentiation "differential", integration of the differentiated data after smoothing "int_diff_smooth", Laplacian-filtered output "int_diff_laplace" and important local extremum positions "imp_local_extremes" for the crossing image shown in Fig. 1, using the crossing region only.

TABLE I
CROSSING LENGTH MEASUREMENT ERROR OF THE IMAGES SHOWN IN THIS PAPER.

Fig. No.     True length [m]   Estimated length [m]   Error [%]
Fig. 1            17.00              15.57                8.4
Fig. 4(a)          9.85               9.41                4.4
Fig. 4(b)         21.00              18.83               10.4
Fig. 4(c)         15.80              15.80                0.0
Fig. 4(d)          5.42               8.18               51.0
Fig. 4(e)         16.30              19.07               17.0
Fig. 4(f)         18.15              18.79                3.5

From the experimental results, we find that our proposed method is successful in determining the length of pedestrian crossings. The left and right border lines play the major role in the accurate estimation of the crossing length, and their accurate determination depends on the normal vector of the road surface, the white paintings in the crossing and vehicle obstruction. The average relative estimation error is 16.5% and the r.m.s. error is 2.30 m; the minimum error is 0.0% and the maximum error is 51.0%. The parameters used in this paper are selected on the basis of experiment. The approximate computation time of the proposed algorithm for the measurement of a crossing length is 1.30 s for an image with a nominal size of (640 × 480) pixels, using a Toshiba laptop with a 1600 MHz Intel Pentium M processor.

IV. CONCLUSIONS

In this paper, a new, simple and fast computer vision based pedestrian crossing length measurement technique has been described. The method is successful in measuring the length of pedestrian crossings, and is accurate whenever the border lines can be determined accurately.

ACKNOWLEDGEMENTS
The authors gratefully acknowledge the suggestions and comments of Professor Yasuo Yoshida and Mr. Tadashi Matsuo of Kyoto Institute of Technology, Japan. The authors are also grateful for the support of the Japan Society for the Promotion of Science under Grants-in-Aid for Scientific Research (No. 16500110 and No. 03232).

REFERENCES
[1] L. Kay, "A sonar aid to enhance spatial perception of the blind: engineering design and evaluation," Radio Electron. Eng. 44, pp. 605-629, (1974).
[2] D. L. Morrissette et al., "A follow-up study of the Mowat sensor's applications, frequency of use, and maintenance reliability," J. Vis. Impairment and Blindness 75, pp. 244-247, (1981).
[3] J. M. Benjamin, "The new C-5 laser cane for the blind," in Carnahan Conf. on Electronic Prosthetics, pp. 77-82, (1973).
[4] S. Shoval, J. Borenstein, and Y. Koren, "Auditory guidance with the Navbelt—a computerized travel aid for the blind," IEEE Trans. Syst. Man Cybern. C28, pp. 459-467, (1998).
[5] T. Shioyama, H. Wu, N. Nakamura, and S. Kitawaki, "Measurement of the length of pedestrian crossings and detection of traffic lights from image data," Meas. Sci. Technol. 13, pp. 1450-1457, (2002).