Depth extraction from unidirectional integral image using a modified multi-baseline technique

ChunHong Wu (a), Amar Aggoun (a), Malcolm McCormick (a), S.Y. Kung (b)

(a) Faculty of Computing Sciences and Engineering, De Montfort University, U.K.
(b) Electrical Engineering, Princeton University, USA

ABSTRACT

Integral imaging is a technique capable of displaying images with continuous parallax in full natural colour. This paper presents a modified multi-baseline method for extracting depth information from unidirectional integral images. The method involves first extracting sub-images from the integral image. A sub-image is constructed by extracting one pixel from each micro-lens rather than a macro-block of pixels corresponding to a micro-lens unit. A new mathematical expression giving the relationship between object depth and the corresponding sub-image pair displacement is derived by geometrically analyzing the three-dimensional image recording process. A correlation-based matching technique is used to find the disparity between two sub-images. To improve the disparity analysis, a modified multi-baseline technique is adopted, in which the baseline is defined as the distance between two corresponding pixels in different sub-images. The effectiveness of this modified multi-baseline technique in removing the mismatching caused by similar patterns in object scenes has been proven by analysis and experimental results. The developed depth extraction method is validated and applied to both photographic and computer-generated unidirectional integral images. The depth estimation solution gives a precise description of object thickness, with an error of less than 1.0% for the photographic image in the example.

Keywords: Integral Image, Depth Extraction, Multi-baseline Technique

1. INTRODUCTION

The development of three-dimensional (3-D) imaging systems is a constant pursuit of the scientific community and the entertainment industry. Many applications exist for fully three-dimensional video communication systems; one much-discussed application is 3-D television. Integral imaging is a 3-D display technique capable of encoding a true volume spatial optical model of the object scene in the form of a planar intensity distribution, by using a unique optical capture apparatus. The recorded planar intensity distribution can be stored or transmitted as a conventional two-dimensional pixel array. It is akin to holography in that 3-D information is recorded on a two-dimensional medium, but it does not require coherent light sources. This allows continuous parallax, a wide viewing zone, and very good live capture and display practicality. All integral imaging can be traced to the work of Gabriel Lippmann, 1908 [1], where a micro-lens sheet was used to record the optical model of an object scene. A full natural colour scene with continuous parallax can be replayed when another micro-lens sheet with appropriate parameters is used to overlay the original image, see figure 1. A modification to the system was proposed by Ives [2], where a two-stage photographic process is used to overcome the problem imposed by the pseudoscopic (spatially depth-inverted) nature of the image. A two-tier network combining macro-lens arrays and micro-lens arrays, designed by Davies and McCormick [3][4], further overcomes the image degradation caused by the two-stage recording. The two-tier network works as an optical "transmission inversion screen", which allows direct spatially correct 3-D image capture. With progress in micro-lens manufacturing, integral imaging is becoming a practical and prospective 3-D display technique and is hence attracting much interest.

Figure 1. The principle of integral imaging

This paper is concerned with the extraction of depth information and the reconstruction of a 3-D scene from the recorded integral image data. One particular use of depth extraction is to enable content-based interactive manipulation, which allows flexible operations on visual objects to be carried out. Hence, real and computer-generated 3-D object spaces can be combined in a virtual studio. It can also be used for content-based image coding. Recently, a method for extracting depth based on the point spread function (PSF) of the optical recording process has been reported [5][6]. The object space is conceived as a discrete set of points endowed with intensity. A correspondence matrix associated with the PSF transforms the object space into the pixel-defined integral image. Depth estimation from the 3-D integral image data is formulated as an inverse problem. Although this technique works well on synthetic numerical experimental data produced using the assumed PSF matrix, further research is needed towards its possible application to real integral images [5][6]. This paper presents an alternative method for extracting depth information from an integral image. The method involves first extracting sub-images from the integral image. A sub-image is generated by taking one pixel from each micro-lens unit rather than the macro-block corresponding to the micro-lens unit. Each sub-image contains the pixels at the same position under different micro-lenses; hence each sub-image records the object scene from and only from one particular direction. A mathematical expression giving the relationship between the depth of the object and the corresponding sub-image displacement is then derived by geometrically analyzing the integral image recording process. Correlation-based disparity analysis methods are used, and a multi-baseline technique is adopted in order to remove the mismatches arising from ambiguity in object space.
Application and validation of this method are presented for both photographic and computer-generated images. Although the current work applies to unidirectional integral images, extension of the technique to omnidirectional integral images (parallax in all directions) is straightforward.
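The correlation-based matching mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the use of normalized cross-correlation as the score, and the window size and search range, are assumptions made here for illustration.

```python
import numpy as np

def ncc_disparity(left, right, x, y, win=5, max_d=20):
    """Estimate the horizontal disparity at (y, x) between two views by
    normalized cross-correlation of a small window (illustrative sketch:
    window size and search range are arbitrary choices)."""
    h = win // 2
    patch = left[y - h:y + h + 1, x - h:x + h + 1].astype(float)
    patch = (patch - patch.mean()) / (patch.std() + 1e-9)
    best_d, best_score = 0, -np.inf
    for d in range(max_d + 1):
        cand = right[y - h:y + h + 1, x - d - h:x - d + h + 1].astype(float)
        if cand.shape != patch.shape:
            break  # candidate window ran off the image border
        cand = (cand - cand.mean()) / (cand.std() + 1e-9)
        score = (patch * cand).mean()  # NCC score in [-1, 1]
        if score > best_score:
            best_d, best_score = d, score
    return best_d
```

A single baseline scored this way is exactly what the multi-baseline technique of section 3 improves upon, by accumulating match evidence over several sub-image pairs.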

2. EXTRACTING SUB-IMAGES FROM A UNIDIRECTIONAL INTEGRAL IMAGE

The key point of integral imaging is the use of a micro-lens sheet in recording. A micro-lens sheet is made up of many micro-lenses having the same parameters and lying in the same focal plane; each micro-lens works as an individual small low-resolution camera. A recording film is placed behind the micro-lens sheet, coincident with the focal plane. As all parallel incident rays pass through the same point in the focal plane after refraction in an ideal lens, parallel incident rays are recorded at the same position under each micro-lens. The recording position differs only in which micro-lens surface the ray reaches, as shown in figure 2. As an example, all rays along direction θ1 will be recorded at the position numbered n1, while all rays along θ2 will be recorded at the position numbered n2. In other words, all pixels at the same position under different micro-lenses record the object scene from the same direction.
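The direction selectivity described above can be sketched numerically under an ideal thin-lens model. The function names and the pixel pitch used below are illustrative assumptions, not values from the paper:

```python
import math

def recording_offset(theta_deg, focal_mm):
    """Offset (mm) from a micro-lens centre at which a parallel ray
    bundle of direction theta is recorded: offset = F * tan(theta),
    assuming an ideal thin lens with film at the focal plane."""
    return focal_mm * math.tan(math.radians(theta_deg))

def pixel_index(theta_deg, focal_mm, pixel_pitch_mm, pixels_per_lens):
    """Pixel position under a micro-lens for rays of direction theta.
    The result is the same under every lens, which is what makes
    sub-image extraction by regular sampling possible."""
    offset = recording_offset(theta_deg, focal_mm)
    centre = pixels_per_lens // 2  # on-axis rays land at the centre pixel
    return centre + round(offset / pixel_pitch_mm)
```

Because the index depends only on the ray direction, sampling that index under every lens gathers exactly the rays of one direction.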

Therefore a new image representing a particular viewpoint can be composed by simply sampling the pixels at the same position under different micro-lenses. This new image (termed here a sub-image) records the object scene from and only from one particular view direction. By changing the position of the pixels selected, other sub-images can be constructed. Figure 3 illustrates how a sub-image is extracted from the unidirectional integral image. For simplicity, only four pixels are assumed under each micro-lens. Pixels in the same position under different micro-lenses, which record rays from one particular direction and are represented by the same color, are employed to form one sub-image.
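Since every sub-image collects the pixel at one fixed position under each micro-lens, extraction reduces to strided sampling. A minimal sketch, assuming the unidirectional integral image is stored as a 2-D array whose width is a multiple of the number of pixels per micro-lens:

```python
import numpy as np

def extract_sub_images(integral_img, pixels_per_lens):
    """Split a unidirectional integral image (H x W array) into one
    sub-image per pixel position: sub-image k collects the k-th pixel
    under every micro-lens, i.e. every pixels_per_lens-th column."""
    h, w = integral_img.shape
    assert w % pixels_per_lens == 0, "width must be a multiple of the lens pitch in pixels"
    return [integral_img[:, k::pixels_per_lens] for k in range(pixels_per_lens)]
```

With four pixels per lens, as in figure 3, this yields four sub-images, each one quarter of the original width.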

Figure 2. The direction selectivity of integral recording procedure

Figure 3. Illustration of sub-image extraction

Figure 4a is an example of a unidirectional integral image. The original object scene contains a flat background with Chinese characters and a small box attached to the front of it. The pitch size (ψ) and focal length (F) of the micro-lenses and the radius of curvature (r) of the micro-lens surface are 600µm, 1.237mm and 0.88mm, respectively. The optical system used to take the image is explained in Davies and McCormick [4]. The 3-D scene can be replayed by scaling the image back to the original size (10.0cm × 9.0625cm) and overlaying it with a micro-lens sheet having the same parameters. Figures 4b and 4c are two sub-images extracted from the unidirectional integral image in figure 4a. Each sub-image presents a two-dimensional (2-D) recording of the object space from a particular direction. These sub-images are used in the following section for the disparity analysis.

(a)

(b)

(c)

Figure 4. An example of a unidirectional integral image and two corresponding sub-images

3. MATHEMATICAL RELATIONSHIP BETWEEN OBJECT DEPTH AND SUB-IMAGE DISPLACEMENT

Further sub-images are extracted from the original integral image, and a correlation-based matching technique is used to find the disparity between sub-images. The extracted sub-image pairs are different from those used in stereoscopic calibration since they are generated from a different source. Geometric analysis of the optical recording procedure is necessary to find the relationship between the object depth and the disparity information of the sub-images. Figure 5 depicts the Cartesian coordinate system used in the analysis. Only one-dimensional disparity is considered here since the unidirectional integral image is being discussed. The z-axis denotes the depth and the x-axis represents the lateral position. The z-axis starts from the plane coinciding with the micro-lens surfaces, while the x-axis starts from the center of the first micro-lens.

Consider an object point P (x0, D) at distance D from the micro-lens surface plane. Suppose the first sub-image is obtained by choosing pixels at distance ds1 from the micro-lens centers, which records rays from the θ1 direction. The ray from P along the θ1 direction intersects the plane of the micro-lens sheet surface (x-axis) at the N1-th micro-lens. The intersection is at P1 (x1, 0). Following lens refraction, the ray is recorded on the film at point Q1 (x1', −t), at distance ds1 from the micro-lens center, see figure 5b. The following geometric relationship can easily be obtained from figure 5:

(N1 − 0.5)·ψ < x0 + (D + dr)·ds1 / F < (N1 + 0.5)·ψ    (1)

A similar relationship can be obtained for the second sub-image:

(N2 − 0.5)·ψ < x0 + (D + dr)·ds2 / F < (N2 + 0.5)·ψ    (2)

In this paper, the baseline is defined as Δ = ds1 − ds2, which represents the sampling distance between two sub-images. The name 'baseline' is inherited from stereoscopic vision. Since only one pixel is sampled from each micro-lens in each sub-image, the disparity between two sub-images, d, corresponds to the difference in micro-lens numbers between the positions Q1 and Q2, d = N1 − N2. Substituting d and Δ into equations (1) and (2) and manipulating yields the following:

D = (d ± 1)·ψ·F / Δ − dr    (3)

Here d ± 1 means that the expected value is d but it may vary within the range d − 1 to d + 1. In most cases dr is much smaller than D and can be neglected.
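Equation (3) and the ±1 disparity quantisation can be sketched as follows. The numeric values in the test are illustrative, not taken from the paper's experiment:

```python
def depth_from_disparity(d, baseline, pitch, focal, d_r=0.0):
    """Depth from micro-lens disparity via equation (3):
    D = (d +/- 1) * pitch * F / baseline - d_r.
    Returns the nominal depth and the (lower, upper) bounds implied by
    the +/-1 disparity quantisation. All lengths in consistent units."""
    nominal = d * pitch * focal / baseline - d_r
    lower = (d - 1) * pitch * focal / baseline - d_r
    upper = (d + 1) * pitch * focal / baseline - d_r
    return nominal, (lower, upper)
```

Note that the width of the uncertainty interval is 2·ψ·F/Δ, so a larger baseline Δ tightens the depth estimate, one motivation for combining several baselines in the multi-baseline technique.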
