EPIPOLAR IMAGE GENERATION AND CORRESPONDING POINT MATCHING FROM COAXIAL VEHICLE-BASED IMAGES

Zhizhong Kang
Faculty of Aerospace Engineering
Delft University of Technology
Kluyverweg 1, 2629 HS Delft, The Netherlands
[email protected]
ABSTRACT

The matching of corresponding points is the foundation of image-based 3D reconstruction. For coaxial vehicle-based images, however, which are taken along the optical axis, the epipolar lines of a coaxial stereo pair are arranged radially instead of nearly horizontally as in a stereo pair with a horizontal baseline. This paper therefore discusses the generation of epipolar images and the matching of corresponding points on them. Although straight lines in the raw image become arcs in the epipolar image, the point features remain clearly visible, so 1D corresponding point matching is possible. Moreover, matching in the epipolar images improves the success rate. Coaxial vehicle-based images are used to test the algorithms presented in this paper.
INTRODUCTION

Image sequences taken along the optical axis are now indispensable in the localization of Mars rovers, mobile mapping, site modeling, surveillance applications and especially moving robot navigation. How to process this kind of image sequence has therefore become a hot issue in the photogrammetry and computer vision communities. In general, there are two ways to form stereo vision along the optical axis: one is to take images with a single camera moving along the optical axis (popular in the localization of Mars rovers (Li et al., 2002) and in mobile mapping); the other is to align two omnidirectional cameras coaxially (Gluckman et al., 1998; Lin et al., 2003), generally in moving robot navigation. The research presented in this paper concerns stereo vision formed in the former way, i.e. by a single camera moving along the optical axis.

Figure 1. Components of the baseline.

Generally, in aerial photogrammetry, the main component, max(BX, BY, BZ), of a horizontal baseline is BX (Wang, 1990), and the geometry of the stereo pair with a horizontal baseline naturally evolved under this hypothesis. For stereo pairs along the optical axis, however, the baseline components satisfy, strictly speaking, BZ >> BX and BZ >> BY (Fig. 1). In this case, the geometry of the stereo pair is distinctly different from that of a horizontal baseline. As we know, photogrammetric computer vision is concerned with the reconstruction of objects from stereo pairs obtained by multiple-view photography. In principle, corresponding point matching in the image is 2D; however, the 2D matching can be converted to 1D by introducing corresponding epipolar lines (Zhang et al., 1997). In aerial photogrammetry, the epipolar lines are arranged nearly horizontally because the stereo pair has a horizontal baseline.
In coaxial stereo pairs, however, the epipolar lines of images taken along the optical axis are arranged radially because the epipoles lie in the image near the principal point. As a result, 1D correlation cannot be implemented in the raw image. Therefore, based on the epipolar geometry of the coaxial stereo pair, this paper discusses the generation of the epipolar image and corresponding point matching on it.
ASPRS 2008 Annual Conference Portland, Oregon April 28 - May 2, 2008
EPIPOLAR IMAGE GENERATION

In aerial photogrammetry the epipolar lines are arranged nearly horizontally because the stereo pair has a horizontal baseline. In a coaxial stereo pair, however, the epipole is the intersection of the baseline S1S2 and the image plane, so the epipoles lie in the image near the principal point, as illustrated in Fig. 2. Epipolar lines are lines passing through an epipole and an image point, e.g. I1a1 and I2a2; because the epipoles lie near the principal point, the epipolar lines in images taken along the optical axis are arranged radially. Relative orientation is implemented on the coaxial stereo pair, and the computed orientation elements are used to determine the corresponding epipolar lines in the stereo image pair.
Figure 2. Epipolar geometry of a coaxial stereo pair.

Estimation of the Epipolar Geometry
As mentioned in the previous section, the geometry of a coaxial stereo pair is distinctly different from that of a horizontal baseline, so its relative orientation also differs from that of conventional photogrammetry. Therefore, the relative orientation formula for coaxial stereo pairs (Zhang and Kang, 2004) is employed to compute the relative orientation elements. There are five elements of relative orientation, and these elements must be independent of each other. In coaxial stereo pairs, the angles κ1 and κ2 are highly correlated, instead of ω1 and ω2 as in the traditional two-projector method of relative orientation; therefore the orientation elements should be ϕ1, ω1, ϕ2, ω2, κ2. Once relative orientation is finished, the corresponding epipolar orientation vectors of an arbitrary point in the left image can be computed from the relative spatial orientation of the coaxial stereo pair. The equations of the corresponding epipolar lines in the stereo pair can be expressed as:

Ax + By − Cf = 0
A′x′ + B′y′ − C′f = 0    (1)
where (A B C) and (A′ B′ C′) are the orientation vectors of the corresponding epipolar lines in the left and right images respectively, (x, y) and (x′, y′) are image coordinates in the left and right images respectively, and f is the focal length. The epipolar orientation vectors in the left and right images are computed by the formula below.
lᵀ = Rᵀ B R [x0  y0  −f]ᵀ
l′ᵀ = R′ᵀ B R [x0  y0  −f]ᵀ    (2)

where l = (A B C) and l′ = (A′ B′ C′); R and R′, the rotation matrices of the left and right image respectively, are computed from the relative orientation elements ϕ1, ω1, ϕ2, ω2 and κ2; and B is the skew-symmetric matrix of the baseline components BX, BY and BZ:

        ⎡  0   −BZ   BY ⎤
    B = ⎢  BZ    0  −BX ⎥
        ⎣ −BY   BX    0 ⎦
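Equation (2) is straightforward to evaluate numerically. The sketch below (NumPy; the function names are my own, not from the paper) builds the skew-symmetric baseline matrix and returns both epipolar line vectors for a left-image point. With identity rotations and a purely forward baseline (BZ only), the resulting line passes through both the point and the principal point, reproducing the radial arrangement noted above.

```python
import numpy as np

def skew(b):
    """Skew-symmetric matrix B of the baseline components (BX, BY, BZ)."""
    bx, by, bz = b
    return np.array([[0.0, -bz,  by],
                     [ bz, 0.0, -bx],
                     [-by,  bx, 0.0]])

def epipolar_lines(x0, y0, f, R, R_prime, b):
    """Epipolar line coefficients (A, B, C) and (A', B', C') of Eq. (2)
    for the image point (x0, y0) in the left image."""
    p = np.array([x0, y0, -f])
    l = R.T @ skew(b) @ R @ p              # left-image epipolar line
    l_prime = R_prime.T @ skew(b) @ R @ p  # right-image epipolar line
    return l, l_prime
```

For R = R′ = I and b = (0, 0, BZ), the left line satisfies Ax0 + By0 − Cf = 0 with C = 0, i.e. a radial line through the principal point, exactly as the coaxial geometry predicts.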
Determination of the Area to Generate the Epipolar Image

In aerial photogrammetry, the area of the raw image used to generate the epipolar image is close to a rectangle because the epipolar lines are nearly horizontal. Since the epipolar lines in a coaxial stereo pair are arranged radially, however, the area becomes a sector whose center is the epipole. Obviously, the closer to the epipole, the larger the distortion in the epipolar image. As a result, a sector ring is chosen instead of a simple sector. It is determined by the coordinates of a starting point and an ending point selected from the raw image, and its inner radius is chosen so as to reduce the distortion of the epipolar image. Since the images are taken while the vehicle is moving forward, the scene covered by the later image is smaller than that covered by the earlier one. The area to generate the epipolar image is accordingly determined from the later image, so that the corresponding areas are visible in both images of the stereo pair; the area in the earlier image can then be determined using the computed orientation elements.
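Under this scheme, the sector ring reduces to four numbers measured about the epipole. A minimal sketch (the function name and return layout are my own assumptions, not the paper's):

```python
import math

def sector_ring(epipole, p_start, p_end):
    """Bounds of the sector ring: inner/outer radius and start/end angle
    of the two user-selected points, measured about the epipole."""
    ex, ey = epipole
    def polar(p):
        dx, dy = p[0] - ex, p[1] - ey
        return math.hypot(dx, dy), math.atan2(dy, dx)
    r1, a1 = polar(p_start)
    r2, a2 = polar(p_end)
    return min(r1, r2), max(r1, r2), min(a1, a2), max(a1, a2)
```

In practice the inner radius would be raised further when it falls too close to the epipole, per the distortion argument above.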
Generating the Epipolar Image

According to the determined area, the epipolar lines from the starting point to the ending point in the raw image are rearranged from bottom to top at a certain angular interval (e.g. 10′) into the epipolar image (Fig. 3). Although the straight lines of the raw image become arcs in the epipolar image, the point features remain clearly visible. The key point is therefore that corresponding point matching is possible even though the epipolar image differs obviously from the raw one.
Figure 3. Epipolar image.
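The rearrangement just described amounts to a polar resampling of the sector ring: each epipolar line becomes one row of the output image. A simplified sketch, assuming a single-channel image and nearest-neighbour sampling (all names are mine):

```python
import numpy as np

def generate_epipolar_image(img, epipole, r_min, r_max, a_min, a_max,
                            angle_step, radial_step=1.0):
    """Resample the sector ring around the epipole into a rectangular
    'epipolar image': each output row is one radial epipolar line."""
    ex, ey = epipole
    angles = np.arange(a_min, a_max, angle_step)
    radii = np.arange(r_min, r_max, radial_step)
    h, w = img.shape
    out = np.zeros((len(angles), len(radii)), dtype=img.dtype)
    for row, a in enumerate(angles):  # bottom-to-top: one epipolar line per row
        xs = np.clip(np.round(ex + radii * np.cos(a)).astype(int), 0, w - 1)
        ys = np.clip(np.round(ey + radii * np.sin(a)).astype(int), 0, h - 1)
        out[row] = img[ys, xs]
    return out
```

Because every row is sampled along one epipolar line, a straight scene line that crosses several epipolar lines maps to an arc, which is exactly the distortion visible in Fig. 3.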
CORRESPONDING POINT MATCHING

The image matching algorithm employed to find corresponding points between adjacent images is explained below. The first step is to build an image pyramid. An image pyramid (e.g. Kraus et al., 1997) is commonly used to represent a digital image at different resolution levels; Fig. 4 shows an example. The idea of an image pyramid as a stack of digital images depicting the same scene at decreasing resolution is closely related to the concept of scale space (Yuille and Poggio, 1986), in which scale is introduced as an additional (continuous) dimension of a digital image.
Figure 4. Image pyramid.
Image pyramids combine the advantages of both high and low resolution. The lower levels of an image pyramid provide detailed information but a large amount of data, whereas the higher levels contain less information but give an overview and require less data. A coarse-to-fine matching strategy on the image pyramid is used to increase the convergence radius and thus avoid searching for correspondences directly in the raw images. The pyramid is built by recursively halving the image resolution: as shown in Fig. 4, at each stage of the recursion every 2 × 2 block of pixels is replaced by a single pixel containing the average of those four pixels.

After the image pyramid is created, feature points are extracted. Feature points are generally extracted with corner detectors such as the Moravec operator (Moravec, 1977) or the Harris/Plessey operator (Harris and Stephens, 1988). The following text briefly outlines the Moravec operator, since it is the one implemented in our approach. Moravec defined 'points of interest' as distinct regions in images and concluded that these interest points could be used to find matching regions in consecutive image frames. He proposed measuring the intensity variation by shifting a small square window (typically 3 × 3, 5 × 5, or 7 × 7 pixels) by one pixel in each of the eight principal directions (horizontal, vertical, and the four diagonals). The intensity variation for a given shift is calculated as the average sum of squared intensity differences of corresponding pixels over the eight principal directions (Moravec, 1977). A threshold on the intensity difference determines the feature points, i.e. those whose intensity difference is larger than the threshold; the threshold is selected so as to keep real feature points and exclude non-feature ones. Since feature points have small neighborhoods, the raw images are divided into a grid.
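The recursive 2 × 2 averaging described above can be sketched as follows (grayscale NumPy arrays; cropping odd dimensions to even size is my own assumption, which the paper does not address):

```python
import numpy as np

def pyramid_level(img):
    """One pyramid step: replace every 2x2 block by its average value."""
    h, w = img.shape
    img = img[:h - h % 2, :w - w % 2].astype(float)  # crop to even size
    return (img[0::2, 0::2] + img[0::2, 1::2] +
            img[1::2, 0::2] + img[1::2, 1::2]) / 4.0

def build_pyramid(img, levels):
    """Fine-to-coarse stack: index 0 is the full-resolution image."""
    pyr = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        pyr.append(pyramid_level(pyr[-1]))
    return pyr
```

Matching then starts on the coarsest level and propagates the approximate positions down the stack, which is what gives the enlarged convergence radius.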
The size of a grid cell is selected with respect to the image resolution and is larger than the size of the searching window. The intensity difference for a given shift is computed by Eq. (3) (for a 3 × 3 window):

M(i, j) = (1/8) Σ(k = i−1 … i+1) Σ(l = j−1 … j+1) (I(k, l) − I(i, j))²    (3)
where I(i, j) denotes the intensity of the pixel (i, j). All potential feature points are picked out, and the point with the maximum difference is classified as a real feature point; thus only one feature point is extracted per grid cell, ensuring a reasonable distribution of feature points. After the feature points are extracted in the left image, an image matching process is employed to find the corresponding points in the right one. Image correlation is used to find the point with the maximum correlation coefficient as the corresponding point. Image correlation (e.g. Kraus et al., 1997) is a technique in which the conjugate point in the slave (right) image corresponding to a point in the master (left) image is found by searching for the maximum correlation coefficient. The size of the correlation window should be selected according to the image resolution and the feature size; windows from 9 × 9 to 21 × 21 pixels work well for digitized aerial photographs or close-range imagery. A smoothness constraint is employed to detect and remove falsely accepted corresponding points. For every corresponding point pair, the weighted average parallax of the neighboring pairs is computed. If the difference between this pair's parallax and the weighted average is larger than a certain threshold, the point pair is considered an
outlier. The weight is determined by the distance between this pair and the neighboring one: the larger the distance, the smaller the weight.
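Two steps of the pipeline just described lend themselves to short sketches: the interest measure of Eq. (3) and the inverse-distance-weighted smoothness check. Both functions below are illustrative implementations of my own (the scalar-parallax representation of a point pair is an assumption), not the paper's code:

```python
import math

def moravec_measure(I, i, j):
    """Eq. (3): mean squared difference between pixel (i, j) and its
    3x3 neighbourhood (the centre term is zero, so this averages the
    eight neighbours)."""
    s = 0.0
    for k in range(i - 1, i + 2):
        for l in range(j - 1, j + 2):
            s += (float(I[k][l]) - float(I[i][j])) ** 2
    return s / 8.0

def weighted_average_parallax(point, neighbours):
    """Inverse-distance weighted average parallax of neighbouring pairs;
    `neighbours` is a list of ((x, y), parallax) tuples."""
    wsum, psum = 0.0, 0.0
    for (nx, ny), p in neighbours:
        w = 1.0 / (math.hypot(nx - point[0], ny - point[1]) + 1e-9)
        wsum += w  # larger distance -> smaller weight
        psum += w * p
    return psum / wsum

def is_outlier(point, parallax, neighbours, threshold):
    """Reject a pair whose parallax deviates from the weighted average
    of its neighbours by more than the threshold."""
    return abs(parallax - weighted_average_parallax(point, neighbours)) > threshold
```

A pair far out of step with its neighbours is flagged, while one close to the locally smooth parallax field is kept.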
EXPERIMENTS

The approach presented in this paper was tested with coaxial vehicle-based images provided by Toyota MapMaster Incorporated; the image size is 1920 × 1080 pixels. As Fig. 5 shows, there are many more potential corresponding points on the buildings than on the road, and it makes no sense to match corresponding points in the sky. Therefore, only the areas containing buildings, instead of the whole raw image, were selected to generate epipolar images. Areas I and II are illustrated in Fig. 5, and the epipolar images generated from them are shown in Fig. 6. Although the straight lines of the raw image become arcs in the epipolar image, the point features remain clearly visible. Moreover, the maximum vertical parallax in the epipolar images is only 1.317 pixels.
Figure 5. The areas to generate epipolar images.
The matching algorithm presented above was employed to match corresponding points in both the raw images and the epipolar images; the results are shown in Figs. 5 and 6 respectively. As Table 1 shows, only 10 and 25 corresponding point pairs were matched from the raw images in areas I and II respectively, whereas 244 and 365 pairs were matched in the epipolar images. Compared to matching in the raw images, matching in the epipolar images not only simplifies the process but also increases the number of matched corresponding point pairs more than tenfold.
Figure 6. Matching results.
Table 1. Numbers of corresponding point pairs matched

Area    In raw images    In epipolar images
I       10               244
II      25               365
CONCLUSIONS

Epipolar image generation and corresponding point matching for coaxial stereo pairs were discussed in this paper. For a coaxial stereo pair, the epipole is the intersection of the baseline S1S2 and the image plane; the epipoles therefore lie in the image near the principal point, so the epipolar lines in images taken along the optical axis are arranged radially. As a result, the epipolar image generated by rearranging the epipolar lines is distinctly different from the raw one. Although the straight lines of the raw image become arcs in the epipolar image, the point features remain clearly visible and allow 1D corresponding point matching. Moreover, compared to matching in the raw image, corresponding point matching in the epipolar image not only simplifies the process but also improves the success rate of matching.
REFERENCES

Gluckman, J.M., Thoresz, K., and Nayar, S.K., 1998. Real time panoramic stereo, Image Understanding Workshop.
Harris, C. and Stephens, M., 1988. A combined corner and edge detector, Proc. Alvey Vision Conf., Univ. Manchester, pp. 147-151.
Kraus, K., Jansa, J., and Kager, H., 1997. Photogrammetry, Vol. 2, Advanced Methods and Applications, Dümmler, Bonn, ISBN 3-427-78684-6.
Li, R., Ma, F., Xu, F., Matthies, L.H., Olson, C.F., and Arvidson, R.E., 2002. Localization of Mars rovers using descent and surface-based image data, J. Geophys. Res., 107(E11), 8004, doi:10.1029/2000JE001443.
Lin, S. and Bajcsy, R., 2003. High resolution catadioptric omni-directional stereo sensor for robot vision, IEEE 2003 International Conference on Robotics and Automation, 12-17 May 2003, Taipei, China.
Moravec, H.P., 1977. Towards automatic visual obstacle avoidance, Proc. 5th International Joint Conference on Artificial Intelligence, p. 584.
Wang, Z., 1990. Principles of Photogrammetry, Publishing House of Surveying and Mapping, Beijing, pp. 27-29.
Yuille, A.L. and Poggio, T.A., 1986. Scaling theorems for zero crossings, IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(1): 15-25.
Zhang, Z. and Kang, Z., 2004. Several issues on photogrammetry based on forward moving images along the optical axis, The 4th International Symposium on Mobile Mapping Technology (MMT'2004), Kunming, China, 29-31 March 2004.
Zhang, Z. and Zhang, J., 1997. Digital Photogrammetry, The Press of Wuhan University, pp. 114-115.