An Object-Based System for Stereoscopic Videoconferencing with Viewpoint Adaptation

Jens-Rainer Ohm and Ebroul Izquierdo
Heinrich-Hertz-Institut für Nachrichtentechnik GmbH, Einsteinufer 37, D-10587 Berlin, Germany

ABSTRACT

This paper describes algorithms that were developed for a stereoscopic videoconferencing system with viewpoint adaptation. The system identifies foreground and background regions, and applies disparity estimation to the foreground object, namely the person sitting in front of a stereoscopic camera system with a rather large baseline. A hierarchical block matching algorithm is employed for this purpose, which takes into account the position of high-variance feature points and the object/background border positions. Using the disparity estimator's output, it is possible to generate arbitrary intermediate views from the left- and right-view images. We have developed an object-based interpolation algorithm which produces high-quality results. It takes into account the fact that a person's face has a more or less convex surface. Interpolation weights are derived both from the position of the intermediate view and from the position of a specific point within the face. The algorithms have been designed for a real-time videoconferencing system with telepresence illusion. An important constraint during development was therefore hardware feasibility, while sufficient quality of the intermediate-view images still had to be retained.

1. INTRODUCTION

The goal of a telepresence videoconferencing system is to give the user an illusion of true contact with the remote partner, as if sitting together in a virtual space. An important factor in improving this illusion, and in setting such a system clearly apart from simple PC-based systems, is stereoscopic image presentation, especially when autostereoscopic displays are used [1]. An even more realistic impression is gained when viewpoint adaptation is performed, which means that the viewing angle on the display is altered automatically when the viewer moves his head [2].

[Fig. 1 shows the stereo camera configuration: screen, baseline, left view, right view.]

Fig. 1. Configuration of stereo cameras in the videoconferencing application.

There are some critical limitations in the use of such a system. First, the stereoscopic cameras cannot be positioned in front of the display, which means that the baseline between the cameras is at least 50 cm with a small display, and 80 cm with a larger display (fig. 1). This baseline is much too large to render stereoscopic images directly. The extreme differences between left- and right-view images do not correspond to the small distance between human eyes, and would disturb the viewer. Hence, it is necessary

to interpolate intermediate-view stereo image pairs with a smaller baseline. At the same time, a head-tracking system can be used to adapt the actual viewpoint and give the impression of natural motion parallax.

The interpolation of natural-looking intermediate views requires depth information, which can be obtained by disparity estimation between left- and right-view images. The disparity vectors can immediately be used to project pixels onto the intermediate image plane. A critical case, however, is the presence of occlusion areas, where some parts of the scene may only be found in the left- or in the right-view image. In these cases, instead of interpolation, a unidirectional projection has to be performed.

Disparity estimation is the most demanding task of the whole system. We have found that disparity ranges of 120 pel are necessary with a 50 cm baseline, and even up to 230 pel with an 80 cm baseline, in order to match corresponding points between the left- and right-view images. Vertical disparity shifts, on the other hand, are much smaller. In the special case that a coplanar camera geometry (parallel, adjusted at the same height) is used, the vertical disparity is zero all over the images.

During the last years, many different schemes for disparity estimation have been proposed. Though feature-based [3] and dynamic-programming [4] approaches seem to perform very well, at least in the case of purely horizontal disparities, we found them to be too complex for a hardware system with the requirement of large disparity ranges. Instead, we investigated an area-based hierarchical block matching scheme, which easily copes with arbitrary disparity ranges, and which is very robust even in the case of low correspondence between left- and right-view images, e.g. in the case of large occluded areas.

The paper is organized as follows. In section 2, the algorithm for disparity estimation is described, which can be divided into four modules:
1. Preprocessing and segmentation. The goal of this stage is to find the points with highest relevance for matching, and to perform a subdivision into foreground and background areas.
2. Block matching with large block size for global bidirectional disparity estimation with consistency check.
3. Block matching with small block size for local bidirectional disparity estimation with consistency check.
4. Interpolation of dense L→R and R→L disparity fields, application of vertical median filters, and ordering check.
In order to obtain good intermediate image quality even in the presence of wrong disparity estimates, we have developed an object-based interpolation scheme, which makes use of a simple convex-geometry model of the human head, and uses the segmentation information to determine the occlusion areas. This method is described in section 3. Section 4 gives an overview of the hardware realization. In section 5, computer simulations, examples and results are described. Finally, conclusions are drawn.

2. DISPARITY ESTIMATION ALGORITHM

A flowchart describing the interrelation of the disparity estimator module blocks is given in fig. 2. In the first module, preprocessing and segmentation are performed on the input signal of both channels, generating information for the subsequent estimation steps and for the intermediate image interpolator. Both L→R and R→L sparse disparity fields are estimated in the global and local estimation stages. In order to guarantee temporal continuity of the estimated disparities and to avoid annoying temporal artefacts, the disparities estimated for the previous field pair are fed back to the estimator stages. For this purpose, the dense field generated at the final stage is used.

[Fig. 2 shows the flowchart: the L and R image fields pass through preprocessing and segmentation (producing feature point coordinates and the segmentation mask), then global disparity estimation (global disparities), local disparity estimation (local disparities), and dense field interpolation with vertical median filtering, whose L and R disparity fields are passed to the interpolator.]

Fig. 2. Flowchart of the disparity estimator algorithm.

2.1. Preprocessing and segmentation

The preprocessing-and-segmentation stage uses a simple criterion based on pel differences. To classify image points by how well they can be distinguished from their neighbours, we use a simple, difference-based interest operator (Moravec operator). This is


applied to both the left- and right-view image fields, which are subsampled prior to this operation by a factor of two in the horizontal direction. The directional difference along four directions (horizontal, vertical and the two diagonals) is measured at each pel position over a small square window of size 5x5. In each of the four directions, we have five pels and four differences between adjacent pel pairs. In a first step, the sums of absolute differences along all directions are calculated. The output of the operator is then defined as the maximum of these four sums. The goal of this operation is two-fold:
- The Moravec operator's output is used to detect the point of highest interest within each matching block for the subsequent global block matching stage.
- A threshold analysis performs segmentation of large, uniform background areas, in which valid disparity vectors cannot be estimated by a block matching strategy.
At the present time, the segmentation system has only been optimized for head-and-shoulder sequences with uniform background. In the case of sequences with structured background, the object-oriented feature of the interpolator will fail, while the overall system still keeps working with fair quality. In addition to the threshold criterion, we use a simple labeling procedure to eliminate small areas with "wrong" classification, and end up with a unique foreground/background mask containing only two segments.

2.2. Global disparity estimation

In order to reduce noise sensitivity and simultaneously reach higher efficiency, both the left and right image fields are subsampled by a factor of two in the horizontal direction. Due to the interlaced input format, the image information is already subsampled in the vertical direction. Only these subsampled image fields are used during the global estimation step. Hence, this module requires an additional storage capacity of half the field size for both the L and R fields.
After subsampling, the left field is divided into blocks of size 16x16. The total number of blocks is 22x18=396 if the input is ITU-R 601 video.
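The interest operator of section 2.1 is straightforward to prototype. The following NumPy sketch (the function name is ours; this is not the hardware implementation) computes the operator output at one subsampled pel position:

```python
import numpy as np

def moravec_response(img, x, y):
    """Directional-difference interest operator over a 5x5 window centered
    at (x, y): for each of four directions (horizontal, vertical, two
    diagonals), sum the absolute differences between the four adjacent
    pel pairs; the output is the maximum of the four sums."""
    directions = [(0, 1), (1, 0), (1, 1), (1, -1)]   # (dy, dx) steps
    best = 0.0
    for dy, dx in directions:
        # five pels along the direction, centered at (x, y)
        pels = [float(img[y + k * dy, x + k * dx]) for k in range(-2, 3)]
        sad = sum(abs(a - b) for a, b in zip(pels, pels[1:]))
        best = max(best, sad)
    return best
```

A uniform window yields 0, so thresholding this response also gives the background criterion described above.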

[Fig. 3 shows the relation of the block position (16x16), the point with the highest Moravec output, and the matching window of size 13x9.]

Fig. 3. Relation of block position, point of highest Moravec output and matching window.

The strategy is to use the point of highest interest within each block, which has been determined by the preprocessing module, and to match a reference window around this point (fig. 3). This means that the sampling position with the highest Moravec output inside the block is chosen as the representative point for the entire block. Furthermore, matching is performed only for those blocks which are part of the foreground region according to the segmentation result, which means that blocks within uniform background areas are not considered at all during the whole matching process.

Let us denote the left and right subsampled field pair at time t as R_l^(t) and R_r^(t). Moreover, let I_l^(t) and I_r^(t) be the image field intensity maps, and z=(x,y) the sampling position of the particular point in the left field that has been chosen to be matched. A full-search block matching is performed in order to find the corresponding point z~ = z + d_z in the right field R_r^(t). Herein, d_z denotes the disparity vector from left to right. A reference window of size 13x9 is placed around z. This reference window is then compared with all corresponding search windows (of size 13x9 as well) along a given horizontal search interval. The maximum search range is 64 pel for sequences with 50 cm baseline, and 128 pel for sequences with 80 cm baseline. With this interval range, a maximum disparity value of 128 (256) pel can be estimated with respect to the non-subsampled image field.

In order to select the best match from the allowed displacements, a matching criterion based on temporal smoothing and mean absolute difference (MAD) is used. The MAD is given as


MAD(d_z) = (1/117) · Σ_{i=1..9} Σ_{j=1..13} | I_l(x+j, y+i) − I_r(x+j+d_z, y+i) | .     (1)

Using the MAD, the cost function is defined as

F(d_z, d_z^(t−1)) = MAD(d_z) + α · | d_z − d_z^(t−1) | ,     (2)

with d_z as the current displacement vector, d_z^(t−1) the temporal prediction vector, and the weight coefficient α, which should be set to a value of approximately 0.2. To realize a simple hardware structure, the quotient 1/117 may be omitted from (1) and α set to 16 in (2). Since motion compensation cannot be used (for complexity reasons), the temporal prediction vector d_z^(t−1) must be taken from the same position z in the previously-estimated disparity field at time t−1. The previous-field dense disparity maps are available from the dense-field interpolation module; it is necessary to establish an additional memory for internal use in the estimator module.

For each position z~ within the search interval on R_r^(t), the cost function value F(z~ − z, d_z^(t−1)) is calculated. The particular z~ which minimizes the cost function is the corresponding point of z, and the difference z~ − z is the disparity vector. Once z~ has been estimated, the same procedure is repeated from right to left, using z~ as the reference sampling position on the right image, which means that the reference window of size 13x9 is now centered at this position. The search window of the same size is placed on the left image and shifted within a search interval of 64 (128) pel again. Now, the temporal prediction is taken from the R→L dense disparity memory. The correspondence search is then carried out without consideration of the previously-found L→R disparity d_z.

Let us denote the estimated L→R disparity with reference sampling position z as d_z, and the estimated R→L disparity with reference sampling position z~ = z + d_z on the right image as d_z~. Then, a bidirectional consistency check is performed in order to reject outliers. If the vector difference condition

| d_z − d_z~ | ≤ 1 pel     (3)

is violated, the two vectors d_z and d_z~ are eliminated from both disparity fields. This verification ensures the reliability of the disparity estimation, such that the remaining disparity estimates can be considered correct disparity values.

2.3. Local disparity estimation

Local disparity estimation is again a block matching procedure, but it is applied to the full-resolution (non-subsampled) image fields. The block positions are now 4 pel apart in the horizontal and vertical directions. The reference windows have a size of 9x5 pel, but the position z is always at the block center, such that adjacent windows overlap by a regular amount. Instead of using full search (as in global estimation), only candidate vectors are tested, with an additional search range of ±2 pel horizontally around each candidate. We use 10 candidates:
• 6 from the output of the global estimation, unless they are part of the background segment;
• 3 from the surrounding local estimation at positions already calculated;
• 1 from the temporally-preceding displacement field at the same spatial position.
The positions of the candidate vectors are shown in fig. 4. If, for the sake of hardware feasibility, a systolic-array matching processor is used, the matching operations must be pipelined, such that we only test adjacent search positions whenever possible. We found the following solution to be practicable:
1. Find the maximum (MAX) and minimum (MIN) displacements among the candidate vectors.
2. If MAX − MIN ≤ 16, perform matching within the range MIN−2 ... MIN+17 (20 positions).
3. Otherwise, perform matching within the ranges MIN−2 ... MIN+17 and MAX−17 ... MAX+2 (2x 20 positions).
4. Within the search range(s), find the optimum position under the additional constraint that it is within ±2 pel horizontally of one of the candidates.
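To make the matching machinery concrete, here is a small Python sketch (helper names are ours; the real system uses a systolic-array processor, not software) of the cost function (2) with the 1/117 division omitted and α=16, a full search as used in the global stage, and the candidate-range reduction of steps 1-4 above:

```python
import numpy as np

ALPHA = 16  # integer smoothness weight used when the 1/117 division is dropped

def match_cost(left, right, z, d, d_prev, win=(9, 13)):
    """Cost F of eq. (2): window SAD plus temporal-smoothness penalty.
    Sketch only; z is assumed far enough from the top/bottom borders."""
    h, w = win[0] // 2, win[1] // 2
    x, y = z
    if x + d - w < 0 or x + d + w >= right.shape[1]:
        return 10**9  # search window would leave the image: invalid
    ref = left[y - h:y + h + 1, x - w:x + w + 1].astype(np.int32)
    cand = right[y - h:y + h + 1, x + d - w:x + d + w + 1].astype(np.int32)
    return int(np.abs(ref - cand).sum()) + ALPHA * abs(d - d_prev)

def full_search(left, right, z, d_prev, d_max=64):
    """Global stage: full search over the horizontal interval [-d_max, d_max]."""
    return min(range(-d_max, d_max + 1),
               key=lambda d: match_cost(left, right, z, d, d_prev))

def search_positions(candidates):
    """Local stage, steps 1-4: restrict testing to one or two runs of
    20 adjacent positions derived from the candidate vectors."""
    lo, hi = min(candidates), max(candidates)
    if hi - lo <= 16:
        return list(range(lo - 2, lo + 18))                    # MIN-2 .. MIN+17
    return list(range(lo - 2, lo + 18)) + list(range(hi - 17, hi + 3))
```

Running full_search in both directions and comparing the two resulting vectors then implements the bidirectional consistency check (3).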


[Fig. 4 shows the block to be estimated together with the blocks and known vectors of the local estimation and the blocks and vectors of the global estimation.]

Fig. 4. Positions of the 9 spatial candidate vectors (the candidate from the temporally-preceding displacement field is not shown).

In addition, those candidates which point into the background of the segmentation mask generated during preprocessing must be regarded as invalid. It may happen that no candidate exists. This will be the case whenever all candidates are set to the DEFAULT value, e.g. in uniform background areas. In this case, no matching is performed.

The rest of the procedure is very similar to global estimation, with the exception of the search range(s) and block sizes. Again, the search criterion is a combination of MAD and temporal smoothness with approximately the same α-parameter (≈0.2) in (2). If we again omit the division in (1), where the quotient would now be 1/45, it is appropriate to set α=8. Local displacement estimation is also performed bidirectionally, in order to apply a cross-consistency check on the estimation result. As the first step, L→R disparity estimation is performed. This means that the matching window in the L image field is fixed, and the best-matching search window in the R image field has its center position anywhere on the same line. The positions z~ of the matching blocks in the R image field are not necessarily equidistant, but they can in any case only be found on every fourth row. There may be areas in the R image field which are not part of any matching block. For R→L disparity estimation, the same candidate vectors are used as for L→R estimation. However, in contrast to the global estimation, the temporal predictions for L→R and R→L estimation are taken from the corresponding rowwise-dense disparity fields, which are generated during the first step of dense field interpolation and define the dense field only on every fourth row. The estimation result is again judged inconsistent if the results of L→R and R→L estimation differ by more than one pel according to condition (3).
Inconsistent positions are marked with a DEFAULT value. Regarding the output of the estimation, the number of disparity values calculated per image field is not fixed. This is caused by the variable size of the background, where no values are estimated, and by values that did not pass the cross-consistency check. The maximum possible number of estimated disparity values is 1/16 of the image field size.

2.4. Generation of dense disparity fields

After the estimation of disparity values at sparse positions, as performed by the local estimation procedure, the rowwise-dense disparity fields D_l^(t) and D_r^(t) must be generated by linear interpolation. As the first step, a preprocessing is applied to the disparity values of one row, which takes into account the foreground position of the segmentation mask. Starting from the position xstart of the first foreground pel within the row (the left edge of the foreground object), the maximum (in the case of L→R) or the minimum (in the case of R→L) among the first four non-DEFAULT disparity values is searched. Its position is stored as xleft. All non-DEFAULT values to the left of this maximum (or minimum, for R→L) are then set to its value. The same procedure is applied at the right edge of the foreground object, where the position of the last foreground pel is xend. Again, the first four non-DEFAULT disparities to the left of xend are searched for the maximum (minimum), and all values positioned to the right of its position xright are set to its value. In the case where the foreground mask covers the whole image (e.g. with non-uniform background), this procedure is not applied; then xstart=0, xend=703, and xleft/xright are simply set to the positions of the first and last non-DEFAULT disparity values within the row.
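A sketch of this edge preprocessing for one row. The function name and the use of None as the DEFAULT marker are choices made here for illustration; the real system operates in place on 704-pel rows:

```python
DEFAULT = None  # marker for positions without an estimated disparity

def clamp_edges(row, xstart, xend, left_to_right=True):
    """Among the first four non-DEFAULT values from each foreground edge,
    pick the maximum (L->R) or minimum (R->L) and propagate its value
    outward toward the edge, as described above."""
    pick = max if left_to_right else min
    # left edge of the foreground object
    idx = [x for x in range(xstart, xend + 1) if row[x] is not DEFAULT][:4]
    if idx:
        xleft = pick(idx, key=lambda x: row[x])
        for x in idx:
            if x < xleft:
                row[x] = row[xleft]
    # right edge (mirror of the procedure above)
    idx = [x for x in range(xend, xstart - 1, -1) if row[x] is not DEFAULT][:4]
    if idx:
        xright = pick(idx, key=lambda x: row[x])
        for x in idx:
            if x > xright:
                row[x] = row[xright]
    return row
```

This clamps stray small disparities at the object borders so that the subsequent interpolation does not smear background depth into the foreground silhouette.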


Now, as the second step, the horizontal interpolation is started along the row. The procedure, with the result written back to the same memory, can be described as follows for a particular row y (here for the L→R field):
1. Set D_l^(t)(xstart) = D_l^(t)(xleft), D_l^(t)(xend) = D_l^(t)(xright) and x1=0. If xstart=0, set Val1 = D_l^(t)(xstart); otherwise, set Val1=0.
2. Starting at position x1+1, count the elements up to the next non-DEFAULT value Val2, which will be found at position x2. If the end of the line is reached, set x2=703 and Val2=0 (if xend≠703) or Val2 = D_l^(t)(xend) otherwise. If no non-DEFAULT value is found, leave the memory for this row as it is. A step size ∆ is calculated, following the formula

∆ = (Val2 − Val1) / (x2 − x1) .     (5)

Interpolation of the positions between the non-DEFAULT values is then performed recursively for x1 < x < x2, each interpolated value being obtained from its left neighbour by adding the step size ∆. A vertical median filter and an ordering-constraint check are applied to the interpolated rows; values violating the ordering constraint (8) are eliminated. The newly-found valid x is then stored as x2, the appropriate disparity value at this position as Val2, and point 2 of the interpolation step described above is executed once more to fill the resulting hole.

So far, the dense disparity fields have been calculated on every fourth row of the image. All remaining rows are filled with DEFAULT values (which indicate that no disparity has been calculated at these positions during estimation). Now, steps 1 and 2 of the rowwise-dense interpolation procedure described above can be applied in a transposed sense for columnwise interpolation. The median filter and ordering-constraint check are omitted here, because the remaining rows automatically fulfill these requirements if they are linearly interpolated.
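The rowwise interpolation loop might look as follows in Python. This is a simplified sketch: the Val1/Val2 boundary handling of steps 1 and 2 is reduced to holding the edge values constant, the median filter and ordering-constraint check are omitted, and DEFAULT is modelled as None:

```python
DEFAULT = None  # marks positions where no disparity was estimated

def interpolate_row(row, xstart, xend):
    """Fill one disparity row by linear interpolation between consecutive
    non-DEFAULT values, using the step size of eq. (5)."""
    known = [x for x in range(xstart, xend + 1) if row[x] is not DEFAULT]
    if not known:
        return row                        # nothing to interpolate in this row
    for x in range(xstart, known[0]):     # hold the edge values constant
        row[x] = row[known[0]]
    for x in range(known[-1] + 1, xend + 1):
        row[x] = row[known[-1]]
    for x1, x2 in zip(known, known[1:]):
        step = (row[x2] - row[x1]) / (x2 - x1)        # eq. (5)
        for x in range(x1 + 1, x2):
            row[x] = row[x1] + step * (x - x1)        # add step recursively
    return row
```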

3. OBJECT-BASED INTERPOLATION

The dense disparity fields produced by the estimator are used to project the left- and right-view images onto an arbitrary intermediate image plane. Here, it has to be decided which areas of the intermediate image are to be interpolated, i.e. taken from both images, and which areas are subject to occlusions and hence must be projected only from the corresponding area of one of the left- and right-view images. For both purposes, enhanced results were obtained when the information from the segmentation mask was used; however, a simple fallback mode allows interpolation without the segmentation information, which is then no longer an object-based approach.

3.1. Projection of the disparity fields

Basically, both disparity fields contain approximately the same information, and they are highly redundant due to the application of the bidirectional consistency check. Major differences are only present in the areas of occlusions, which are indicated by many L→R (R→L) vectors pointing to only one or a very small number of pels in the right-view (left-view) image; this is called a


left (right) occlusion. Basically, we found that the R→L disparities produced by the estimator are more reliable at left occlusions, while the L→R disparities produce better image quality at right occlusions. Since we deal with videoconferencing sequences, we can employ a very simple model for head-and-shoulder scenes, which is based on the more or less convex surface of the human head and body [5]. It follows that left occlusions can occur only to the left of the center of the foreground shape, while right occlusions will be present only to the right of this point. Hence, we divide each row of disparity values into two parts, which are separated by the mid position of the active area covered by the foreground object (fig. 5). For the left part, the R→L disparity field is used, while for the right part, the L→R field is better suited. The split position should be determined from only one of the masks (L or R), because the mid positions do not necessarily coincide. In addition, it is useful to perform an ordering-constraint check around the split position again, and to use an interpolation fill where the mixed disparities would violate it.
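For one row, the split rule of fig. 5 can be sketched as follows (a hypothetical helper of our own; mask_row is 1 inside the foreground):

```python
def select_disparity_row(d_lr, d_rl, mask_row):
    """Combine the two disparity fields along one row: take R->L values
    left of the foreground object's mid position and L->R values right
    of it, as motivated by the convex head-and-shoulder model."""
    fg = [x for x, m in enumerate(mask_row) if m]
    if not fg:
        return list(d_lr)                 # fallback: use the L->R field only
    mid = (fg[0] + fg[-1]) // 2           # mid position of the active area
    return [d_rl[x] if x < mid else d_lr[x] for x in range(len(mask_row))]
```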

[Fig. 5 shows the L and R images with their foreground masks and the regions in which the R→L and L→R disparities are used.]

Fig. 5. Usage of L→R and R→L disparities exploiting the position of the foreground masks.

If no segmentation mask is present, the fallback mode is to use only one disparity field. It is more or less arbitrary which one to choose for this purpose; however, as the estimation process is started with the L→R field, we recommend using this one.

[Fig. 6 shows the L and R images and masks, the mask at the intermediate position, and the five areas "I", "I/II", "II", "II/III" and "III".]

Fig. 6. Position of the mask in the intermediate image and definition of five areas.

3.2. Object-based intermediate position interpolator

The goal is to find the areas where the information in the interpolated image should preferably be taken from the left or right image only, and where it should be taken from both images. If the segmentation mask is available, this is a strong indicator that the specific sequence is of the "head-and-shoulder" type, which means that the foreground object is convex. In this case, the left-hand side of the person's head can be found with more accuracy in the left-view image, and vice versa, the right-hand side should mostly be taken from the right-view image. We want to interpolate intermediate-view images at arbitrary points along the baseline between the left- and right-view images. We define the parameter s=0 as the position of the left image, and s=1 as the position of the right image; any 0 < s < 1 then defines an intermediate viewpoint. The weight wL applied to the left-view image is derived from s and from the position within the foreground object according to (9). The weight wR is set to 1 − wL in order to realize the condition wL + wR = 1.
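The weight (9) depends on both s and the position within the face; as that formula is not reproduced here, the following sketch uses only the position-derived part, wL = 1 − s, together with the projection of a left-image pel onto the intermediate plane (helper names are ours):

```python
def blend_pixel(i_left, i_right, s):
    """Blend corresponding left/right intensities for viewpoint s in [0, 1].
    Simplified: the paper's weight (9) additionally depends on the position
    within the foreground object; here only w_L = 1 - s is used."""
    w_l = 1.0 - s
    w_r = 1.0 - w_l            # enforces the condition w_L + w_R = 1
    return w_l * i_left + w_r * i_right

def project_x(x, d, s):
    """Project a left-image pel at column x with L->R disparity d onto
    the intermediate image plane at viewpoint s."""
    return x + s * d
```

At s=0 the intermediate view degenerates to the left image, at s=1 to the right image, matching the parameter definition above.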

4. HARDWARE STRUCTURE

We have investigated the hardware feasibility of the algorithm, and found that a realization is possible using only custom chips, signal processors (SPs) and field-programmable gate arrays (FPGAs). The most demanding tasks with respect to processing power are the matching stages. It is possible to realize global matching by using one FPGA of 20K gates, which is designed as a systolic-array processor and performs the calculation of the MADs, and one SP of C40 type, which adds the temporal prediction term, searches for the minimum and performs the bidirectional consistency checks. The local matching stage, which is performed on full-resolution image fields, requires a quadruple configuration of FPGA/SP pairs, which basically perform the same operation as in global matching. All other tasks - preprocessing/segmentation, interpolation of vector fields and interpolation of intermediate images - can be done by FPGAs.


5. RESULTS OF COMPUTER SIMULATIONS

The performance of the presented methods has been tested with a set of natural stereoscopic sequences. These sequences were recorded within the framework of the European projects RACE-DISTIMA and ACTS-PANORAMA and were used in all computer simulation experiments. The image resolution is 720x576 pixels. The robustness of the presented methods is confirmed by processing the four stereoscopic sequences ANNE, MAN, CLAUDE and GWEN, which represent typical videoconferencing situations. These sequences were recorded using stereo cameras with baselines varying between 15 and 80 cm. The distance between the person and the camera varies between 1.2 and 2.5 m. Extremely large occluded image regions are present within the foreground object, and the largest disparity vector can reach 230 pixels. Excellent overall image quality has been obtained in these cases when the intermediate images are synthesized applying the object-based method introduced in section 3. Most of the videoconferencing sequences considered were chosen to fulfill the uniform-background assumption, which simplifies the correspondence estimation and fits the proposed object-based image synthesis method. For the sequence GWEN, the (textured) background was recorded separately. This information has been taken into account in the disparity estimation as well as in the image synthesis. Although the algorithms have been successfully applied to all stereo sequences mentioned above, only selected results are reported, which represent the different situations encountered in videoconferencing.

Fig. 8. Foreground object, sequence ANNE.

Fig. 9. Foreground object, sequence CLAUDE.

Fig. 10. Corresponding points after global matching (sequence GWEN).

The first group of experiments demonstrates the performance of the preprocessing module for fixed parameter ranges and different scenes. In the procedure for homogeneous-region recognition, two parameters can be adapted: the size of the window


for the variance measure, and the variance threshold for deciding whether a sampling position is declared a point of interest. Figures 8 and 9 show the extracted homogeneous regions from the first left-view frame of the sequences ANNE and CLAUDE. To test the performance of the hierarchical block matching procedure, experiments with several measurement window sizes have been carried out. As expected, the larger windows provide more accurate results, although good results are obtained even with windows of moderate size. All experiments reported here have been performed using the parameters given in section 2. The computed correspondences after the global step for the tenth frame pair of the sequence GWEN are shown in figure 10. Herein, corresponding points are shown as white or black spots.

In figures 11 and 12, images representing the horizontal component of the dense disparity field pointing from left to right are displayed. Low gray levels represent large negative horizontal vector components, whereas high gray levels represent large positive ones. According to this convention, a vector with horizontal component 0 is represented by the gray value 128. The vertical component of the disparity vectors is neglected in this representation. Figure 11 shows the disparity field estimated for the tenth frame pair of the sequence ANNE, and figure 12 displays the result obtained for the same frame of the sequence CLAUDE. In both cases, a constant disparity value has been assigned to the background.

Fig. 11. Disparity field of sequence ANNE

Fig. 12. Disparity field of sequence CLAUDE

Finally, some results illustrating the performance of the object-based image synthesis method are given. Figure 13 shows the left-view image, the synthesized central viewpoint and the right-view image for the tenth frame pair of the sequence MAN; fig. 14 shows those of the sequences ANNE, CLAUDE and GWEN. The computed central viewpoint is displayed between the two original stereo images.

Fig. 13. Left-view image, synthesized central viewpoint and right-view image, sequence MAN.


Fig. 14. Left-view images, synthesized central viewpoints and right-view images, sequences ANNE, CLAUDE and GWEN.

6. SUMMARY AND CONCLUSIONS

A method for disparity estimation and image synthesis applied to 3D videoconferencing with viewpoint adaptation has been introduced. The novelty of the disparity estimator is twofold: on the one hand, it has been optimized in order to achieve a very low hardware complexity; on the other hand, it shows robustness and accuracy with regard to the addressed application. The goal of estimating reliable displacement maps at extremely low computational cost is reached by an improved hierarchical block matching method. The idea at the heart of the presented approach is to combine previously estimated vectors to predict and correct each newly-calculated disparity vector, applying a suitable cost function and taking into account the assumptions about the scene. Additionally, an investigation of the hardware complexity of the method was performed. The image synthesis approach assumes a convex object located in the center of the scene. This assumption is fulfilled by typical videoconferencing situations, in which the scene usually consists of a person's head and shoulders in front of a uniform background or a previously-recorded textured background. The system reported in this paper is capable of processing videoconferencing sequences with extremely large occluded areas, keeping implementation costs low and supplying intermediate views with very good image quality. The performance of the presented methods was tested by computer experiments using natural stereoscopic sequences representing typical videoconferencing situations. The disparity estimator and


image synthesis method introduced in this paper are capable of offering a realistic 3D impression with continuous motion parallax in videoconferencing situations.

7. ACKNOWLEDGEMENTS

This work has been supported by the German Federal Ministry of Education, Research, Science and Technology under grant 01BK 304. We are indebted to Dr. M. Ernst†, who initiated the early stages of this research. Some of the sequences presented in this paper were provided by CCETT, France.

8. REFERENCES

1. N. Tetsutani, K. Omura and F. Kishino: "Wide-screen autostereoscopic displays employing head-position tracking," Opt. Eng., vol. 33, no. 11, pp. 3690-3697, Nov. 1994.
2. K. Hopf, D. Runde and M. Böcker: "Advanced videocommunications with stereoscopy and individual perspective," in Towards a Pan-European Telecommunication Service Infrastructure - IS&N '94, Kugler et al. (eds.), Berlin, Heidelberg, New York: Springer, 1994.
3. W. Hoff and N. Ahuja: "Surfaces from stereo: Integrating feature matching, disparity estimation and contour detection," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-11, no. 2, 1989.
4. Y. Ohta and T. Kanade: "Stereo by intra- and inter-scanline search using dynamic programming," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-7, no. 2, pp. 139-154, Mar. 1985.
5. E. Izquierdo and M. Ernst: "Motion/disparity analysis for 3D video conference applications," Proc. Int. Workshop on Stereoscopic and Three Dimensional Imaging, pp. 180-186, Santorini, Greece, Sept. 1995.

