
3D Res. 2, 04(2011)3 10.1007/3DRes.04(2011)3

3DR EXPRESS


Repercussion of Geometric and Dynamic Constraints on the 3D Rendering Quality in Structurally Adaptive Multi-View Shooting Systems Mohamed Ali-Bey • Saïd Moughamir • Noureddine Manamanni

Received: 29 June 2011 / Revised: 14 August 2011 / Accepted: 28 August 2011 © 3D Research Center, Kwangwoon University and Springer 2011

Abstract In this paper a simulator of a multi-view shooting system with parallel optical axes and a structurally variable configuration is proposed. The considered system is dedicated to the production of 3D content for auto-stereoscopic visualization. The global shooting/viewing geometrical process, which is the kernel of this shooting system, is detailed, and the different viewing, transformation and capture parameters are defined. An appropriate perspective projection model is then derived to work out a simulator. The latter is first used to validate the global geometrical process in the case of a static configuration. Next, the simulator is used to show the limitations of a static configuration of this type of shooting system when considering dynamic scenes, and a dynamic scheme is then set up to allow a correct capture of such scenes. After that, the effect of the different geometrical capture parameters on the 3D rendering quality, and whether or not their adaptation is necessary, is studied. Finally, some dynamic effects and their repercussions on the 3D rendering quality of dynamic scenes are analyzed using error images and some image quantization tools. Simulation and experimental results are presented throughout the paper to illustrate the different points studied. Some conclusions and perspectives end the paper.

Keywords

multi-view camera, auto-stereoscopic visualization, dynamic scenes, 3D rendering quality assessment, visual servoing

Mohamed Ali-Bey1 ( ) • Saïd Moughamir1 ( ) • Noureddine Manamanni1 ( )
1 CReSTIC, Université de Reims Champagne-Ardenne, Moulin de la Housse BP. 1039, 51687 Reims, France
Email: {mohamed.ali-bey, noureddine.manamanni, said.moughamir}@univ-reims.fr
URL: http://crestic.univ-reims.fr/
Tel: +33 (0)3 26 91 86 17, Fax: +33 (0)3 26 91 31 06

1. Introduction
Nowadays, three-dimensional television (3DTV) is undergoing a real revolution thanks to technological progress in visualization, computer graphics and capture technologies. Depending on the adopted technology, 3D visualization systems can be either stereoscopic or auto-stereoscopic. In stereoscopy, viewing glasses are required and different technologies are used to separate the left-eye and right-eye views: anaglyph or color multiplexing1,2, occultation and polarization multiplexing3, or time-sequential presentation using active shutter glasses4. Auto-stereoscopic displays do not need any special viewing glasses since they are direction-multiplexed devices equipped with parallax barriers or lenticular systems4,5,6. To supply these display devices with 3D content, the most interesting and widely used methods are based on the synthesis of multiple viewpoint images from 2D-plus-depth data, for stereoscopic display7,8 and auto-stereoscopic display9,10. The transformation between the viewing and capturing spaces with control of the perceived depth in the stereoscopic case is described in11. A generalized multi-view transformation model between the viewing and capturing spaces with controlled distortion is proposed in12. A time-varying concept of this architecture for dynamic scene capture is reported in13. Moreover, an assessment of 3DTV technologies is presented in14, and a survey on 3-D time-varying scene capture and scene representation technologies can be found in15. In this work, the authors' interest is focused on multi-view capture for auto-stereoscopic visualization, with the aim of designing a multi-view camera. In fact, this paper is a continuation of existing research in the literature dealing with the production of 3D images and videos. Its novelty is to design a new shooting system for real 3D static and dynamic scenes. The methods used are intuitive and serve to approach the different problems raised by this goal. Hence, exploiting the geometrical analysis given in12, an appropriate perspective projection model is derived to work


out a simulator of a parallel and decentring multi-view shooting system. After performing some validations of the different parts of the global geometrical model, the simulator is then used to study some dynamic 3D capture problems constituting the main contribution of the paper. The first problem concerns the capture of dynamic scenes: we highlight the need to vary the parameters of the shooting system according to the distance between the scene and the 3D camera. The second one aims at studying the role and the influence of each structural parameter of the shooting system on the 3D rendering; in this way the importance of adapting the different structural parameters can be evaluated. Finally, the third problem deals with the repercussions on the 3D rendering quality of the actuators' dynamics and of the mechanical constraints of the camera affecting this adaptation; the goal is to show which dynamics of adaptation are sufficient to track the dynamics of the captured scene. Several crucial issues emerge from this study, such as the positioning accuracy of the image sensors required to ensure a quality 3D rendering and the control problem of a structurally variable shooting system. To the authors' knowledge, there is a lack of work in the literature on 3D shooting systems for dynamic scenes and auto-stereoscopic display dealing with the problems mentioned above. This paper is organized as follows: in Section 2, the viewing/shooting geometrical process for auto-stereoscopic visualization in the case of a parallel and decentred shooting configuration is recalled. An appropriate perspective projection model is then derived and used to work out a simulator. The latter is used to validate the global geometrical system by testing experimentally the produced 3D images and videos (visualization on an auto-stereoscopic display device). Next, the capture problem of dynamic scenes is considered and the adaptation of the shooting parameters is demonstrated using the proposed simulator. After that, the problem of the impact of the dynamic effects on the 3D rendering quality is posed. Then, in order to evaluate the impact of the geometrical and dynamic constraints on the 3D rendering quality, error images and quantization tools are defined in Section 3 so as to compare the impacts of the different geometrical and dynamic constraints on the 3D rendering. Section 4 considers the case of ideal dynamics to study the impact of the geometrical constraints on the 3D rendering quality; in fact, the influence of the constancy of the different shooting parameters, considered separately or together, is studied. The influence of the dynamic phenomena of the camera on the 3D rendering quality is studied in Section 5, where transient states and the tracking of dynamic scenes are considered for first and second order dynamic behaviors. Finally, some conclusions and future works are given at the end of the paper.

2. Global shooting/viewing geometrical process
The basic simulation scheme performed in this work is depicted in Fig. 1. It is composed of three main blocks. The first one consists in a simulated 3D scene representing a 3D object, which possibly describes a given trajectory in the


3D space. In the second block, the object is projected onto n viewpoint images using an appropriate perspective projection model. The latter incorporates the structural parameters of the 3D camera, represented by the "3D camera projection model" block, exploiting the shooting/viewing geometrical process reported in12 and recalled succinctly below. Then, the n produced images are interlaced in the image interlacing block to provide a single 3D image visualized via an auto-stereoscopic display device.

Fig. 1 Basic simulation scheme
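To fix ideas on the image interlacing block of Fig. 1, the following sketch interleaves n viewpoint images column by column. This is only an illustrative assumption: real auto-stereoscopic screens interleave at the sub-pixel level according to their lenticular or parallax-barrier layout, and the function and variable names below are ours, not part of the described system.

```python
import numpy as np

def interlace_views(views):
    """Naive column interleaving of n viewpoint images into a single frame.

    Illustrates the principle of the 'image interlacing' block only; the actual
    pixel-to-view assignment depends on the target auto-stereoscopic display.
    All images are assumed to share the same shape (H, W, 3).
    """
    views = [np.asarray(v) for v in views]
    n = len(views)
    out = np.empty_like(views[0])
    for col in range(out.shape[1]):
        out[:, col] = views[col % n][:, col]  # column taken from view (col mod n)
    return out
```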

In this section, one is particularly interested in the 3D camera block of Fig. 1, which contains the capturing parameters stemming from the capturing/viewing geometrical process. In fact, this geometrical process involves three groups of parameters: the first one is the group of rendering parameters imposed by the auto-stereoscopic display geometry; the second one is the group of shooting parameters defining the geometric structure of a real or virtual 3D camera for capturing the scene; and the third one is the group of transformation parameters controlling the transformations that affect the 3D rendering. Hence, knowing the parameters of these three groups and the relations between them, one can define a capturing configuration satisfying both the parameters imposed by the visualization device and the parameters of the desired distortion effects. Hereafter, the different parts of this geometric process and the associated parameters developed in our laboratory12 are recalled succinctly. In the first subsection, the multi-view rendering geometry of an auto-stereoscopic display device is described and the viewing parameters are defined. Then, the shooting geometry of the parallel and decentred configuration is described, defining the capture parameters. After that, the relations between the capturing and viewing parameters are given to define the distortion controlling parameters. An appropriate perspective projection model is finally derived and a simulation scheme is worked out to validate the different parts of the global geometrical process.

2.1 Auto-stereoscopic viewing geometry
The considered display device is an auto-stereoscopic screen as depicted in Fig. 2, where H and W represent respectively the height and the width of the device. To perceive the 3D rendering, the observers should be at preferential positions imposed by the screen and determined by a viewing distance d, a lateral distance oi and a vertical distance δo corresponding to a vertical elevation of the observer's eyes. Let Oi and Ori be, respectively, the positions of the left and the right eye, and b the human binocular gap. The perceived point, noted m, results from the points mi and mri viewed respectively by the left and the right eye. A viewing frame r = (Cr, x, y, z) is associated to the device at its centre Cr for expressing the viewing geometry.


Fig. 2 Viewing geometry

2.2 Shooting geometry
The geometry of a parallel multi-view shooting configuration with decentred image sensors is presented in Fig. 3. To help the reader analyze this shooting geometry, the perspective, front and top views are represented; the explanation given hereafter can be transposed to each of these views. Let R = (Cp, X, Y, Z) be the frame associated to the scene plane CB, whose dimensions are Wb × Hb (Fig. 3-a). The shooting system is composed of n sensor/lens pairs with focal length f. It is characterized by parallel optical axes Zci spaced uniformly by an inter-optical distance B. The position of the corresponding optical centers Ci is defined in the frame R by a lateral position pi, a vertical position −P and a convergence distance −D along the Z direction. The optical centers are lined up parallel to the image sensors, which are coplanar. Each image sensor, represented by its principal point Ii and of dimensions w × h, is decentred with respect to its corresponding optical center Ci by a lateral distance ai and a vertical distance e. Note that all the sighting axes (IiCi) converge to a common point Cp situated at the center of the scene plane, which is distant from the line of optical centers by the convergence distance D. For more clarity, a practical scheme of a five-viewpoint shooting system is depicted in Fig. 4; it clearly shows the structural parameters of the capture. Note that five points of view is the minimum required by existing commercial auto-stereoscopic display devices; a five-viewpoint system is illustrated to keep the scheme from being too cumbersome.

Fig. 3 Shooting geometry: a- perspective view, b- front view, c- top view

2.3 Transformation Parameters
The transition from the shooting space to the viewing one is expressed by the transformation between the homogeneous coordinates M(X, Y, Z, 1)R of the captured point and those m(x, y, z, 1) of the perceived point (for more details see12):

$$
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
= \frac{k}{\alpha}
\begin{bmatrix}
\mu & 0 & \gamma & 0 \\
0 & \rho\mu & \delta & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & \frac{k(\varepsilon-1)}{d} & \varepsilon
\end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
\qquad (1)
$$

where the transformation parameters quantifying independent distortion effects are defined as follows: k = d/D is the global enlargement factor; ε = bWb/(BW) controls the nonlinearity of the depth distortion according to Z; α = ε + k(ε − 1)Z/d is the global reduction rate; µ = b/(kB) controls the relative enlargement width/depth rate; ρ = WbH/(HbW) controls the relative enlargement height/width rate; γ = (pib − oiB)/(dB) controls the horizontal clipping rate; and δ = (δ°B − Pbρ)/(dB) controls the vertical clipping rate. The viewing and shooting parameters d, b, oi, δo, W, H, Wb, Hb, D, B, P and pi have already been defined in subsections 2.1 and 2.2.
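As an illustration of relation (1), the sketch below maps a captured point (X, Y, Z) to its perceived position (x, y, z) using the distortion parameters defined above. The function and variable names are ours; it is a minimal numeric transcription of the transformation, not the authors' implementation.

```python
def viewing_coordinates(X, Y, Z, k, d, eps, mu, rho, gamma, delta):
    """Perceived point (x, y, z) of a captured point (X, Y, Z), relation (1)."""
    alpha = eps + k * (eps - 1.0) * Z / d          # global reduction rate
    x = (k / alpha) * (mu * X + gamma * Z)
    y = (k / alpha) * (rho * mu * Y + delta * Z)
    z = (k / alpha) * Z
    return x, y, z

# With eps = mu = rho = 1 and gamma = delta = 0 the mapping reduces to a global
# scaling by k = d/D, i.e. a 3D rendering without deformation.
```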

2.4 Specification of a Multi-View Shooting Layout

Fig. 4 Five viewpoints shooting system: a- top view, b- lateral view

Knowing the viewing, capturing and transformation parameters presented previously, one can now specify a capturing layout satisfying the transformations and taking into account both the parameters imposed by the display device (Fig. 2) and the parameters of the desired distortion effects k, ε, µ, ρ, γ and δ. The geometrical parameters of the specified capture layout are then derived and expressed as in Table 1. The last relation is obtained from the well-known Descartes relation 1/f = 1/D + 1/F, where F is the lens focal length. Note that to obtain a perfect 3D rendering without deformation, it is sufficient to choose the distortion parameters as follows: ε = 1, µ = 1, ρ = 1, γ = 0 and δ = 0.
Based on this analysis, some industrial applications such as the 3D-CAM1 and 3D-CAM2 prototypes were developed in collaboration with our industrial partner, the 3DTV-Solutions company (Fig. 5-a). These prototypes are able to capture images of eight points of view simultaneously that can be displayed, after interlacing, on an auto-stereoscopic screen

in real time. However, it is important to note that these prototypes are designed only for static and quasi-static scenes, each presenting one constant and known convergence distance, and they are unable to capture dynamic scenes correctly. Within this partnership we have also designed an application called photo-rail to allow the capture of different, but static, scenes (Fig. 5-b). It consists in a single camera sliding on a rail and positioned sequentially at positions calculated according to the convergence distance of a given scene from the camera, in order to capture sequentially the different points of view of the static scene. The captured eight points of view are then interlaced to produce the 3D image off line16. To date, no device has been developed to capture real dynamic scenes for auto-stereoscopic visualization. The work reported in the present paper aims at contributing to the design of a structurally adaptive shooting system to capture dynamic scenes for auto-stereoscopic rendering.

Fig. 5 Static prototypes: a) 3D-CAM1 and 3D-CAM2, b) Photo-rail


Table 1 The shooting parameters

Width and height of the scene plane: Wb = Wε/(kµ), Hb = Hε/(kρµ)
Convergence distance: D = d/k
Width and height of the image sensors: w = Wb·f/D = Wfε/(µd), h = Hb·f/D = Hfε/(µρd)
Vertical position of the lenses line: P = (δ° − δd)/(kρµ)
Lateral position of the ith optical centre: pi = (oi + γd)/(kµ)
Lateral decentring of the ith image sensor: ai = pi·f/D = f(oi + γd)/(µd)
Vertical decentring of the image sensors: e = P·f/D = f(δ° − δd)/(µρd)
Focal length of each sensor-lens pair: f = DF/(D + F)
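A minimal sketch of the relations in Table 1 is given below for one viewpoint, identified by its eye position oi. The viewing parameters, the lens focal F and the distortion parameters are inputs; the function name and the returned dictionary are our own illustration, not the simulator's code.

```python
def shooting_parameters(d, b, o_i, delta_o, W, H, F,
                        k, eps=1.0, mu=1.0, rho=1.0, gamma=0.0, delta=0.0):
    """Structural parameters of the capture layout (Table 1) for one viewpoint."""
    D   = d / k                                   # convergence distance
    Wb  = W * eps / (k * mu)                      # scene plane width
    Hb  = H * eps / (k * rho * mu)                # scene plane height
    f   = D * F / (D + F)                         # lens-sensor distance (Descartes relation)
    w   = Wb * f / D                              # image sensor width
    h   = Hb * f / D                              # image sensor height
    P   = (delta_o - delta * d) / (k * rho * mu)  # vertical position of the lenses line
    p_i = (o_i + gamma * d) / (k * mu)            # lateral position of the i-th optical centre
    a_i = p_i * f / D                             # lateral decentring of the i-th sensor
    e   = P * f / D                               # vertical decentring of the sensors
    return dict(D=D, Wb=Wb, Hb=Hb, w=w, h=h, f=f, P=P, p_i=p_i, a_i=a_i, e=e)
```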


Fig. 6 Simulation scheme of the production process


2.5 The simulation model and the global geometry validation
Now, in order to validate the global geometrical model, a simulator reproducing the global shooting/viewing geometrical process is worked out under the Matlab/Simulink environment. Its simulation scheme is illustrated in Fig. 6. It exploits the derived perspective projection model (Appendix), assuming that the convergence distance of the camera is equal to the real distance of the scene. In fact, the block "3D scene" of Fig. 6 generates a virtual 3D object. Its points' coordinates are projected on the different viewpoints of the camera in the block "3D camera projection model", using the derived perspective projection model incorporating the structural parameters computed in the block "structural parameters calculation" according to the convergence distance D pulled from the block "3D scene". The captured images are interlaced in the block "images interlacing" and the resultant 3D images are visualized on an auto-stereoscopic screen. We have considered the case of eight points of view. The obtained eight images of the different viewpoints and the interlaced image are presented in Fig. 7. Note that the disparity between successive images is very small, in accordance with the relations given in Table 1.

Fig. 7 The eight points of view images and the produced 3D image

The 3D images and 3D videos delivered by this simulator are visualized on an auto-stereoscopic screen using the XnView freeware for images and 3D-Play, developed by 3DTV-Solutions, for videos, showing an optimal 3D rendering as presented in Fig. 8 (left picture for images, right picture for videos). These results validate experimentally the developed simulator and the global geometrical process: viewing, perspective projection and shooting geometries.


Fig. 8 Generated 3D image and 3D videos visualized on an autostereoscopic screen

Now that the geometric model is validated, some problems must be addressed. The first one concerns the capture of dynamic scenes; the second one deals with the necessity or not of the structural adaptation of the shooting system to different distances of the scene; and the third one consists in studying the repercussions on the 3D rendering quality of the actuators' dynamics and of the mechanical constraints of the camera.

2.6 Capture problem of dynamic scenes and adaptation of the structural parameters
The simulation scheme depicted in Fig. 6 reproduces the stationary perspective projection model of the multi-view camera. The structural parameters of this model are calculated according to the convergence distance D using the relations of Table 1. In the case of static scenes, this distance remains constant; therefore, it is not necessary to modify the shooting parameters to obtain a correct 3D rendering. However, when the distance between the scene and the camera varies, discretely or continuously in time, the pre-configured cameras – like those presented in Fig. 5-a – and the stationary model, in which no shooting parameter is updated, provide an incorrect 3D rendering except at the scene positions for which they are configured. For any other position the 3D images are incorrect, leading to a scattered, aggressive and imperceptible 3D rendering (Fig. 9). In fact, the stationary model and the pre-configured cameras developed for static scenes assume that the convergence distance D and the structural capture parameters remain constant. Therefore, for dynamic scenes, the captured images are confused because the shooting parameters are erroneous with respect to the variable convergence distance of the scene. The disparity between the images then becomes very large, leading to scattered 3D images (Fig. 9), whereas it should be very small between two successive points of view, as previously shown in Fig. 7. To overcome the dynamic scene capture problem, an adaptation of the structural geometrical parameters of the camera becomes necessary to ensure a correct 3D rendering for any position of the scene at any time. These parameters should vary according to the variation of the convergence distance D(t); hence, the projection model of the camera becomes non-stationary. The simulation scheme is then modified as depicted in Fig. 10, where the variable distance of the scene with respect to the camera is now measured continuously and considered as the convergence distance of the shooting system.

Fig. 9 The eight points of view images and the produced 3D image

This measured distance is then used to update the structural shooting parameters of the 3D camera B, f, pi, ai, e and P following the procedure given below (a code sketch of this update is given after Fig. 10):
1- Measure D(t).
2- Compute k = d/D(t).
3- Compute B(t) = pi+1 − pi = b/(kµ) = b·D(t)/(µd).
4- Compute pi(t) = i·B(t), where i ∈ {−n/2, …, 0, …, n/2} ∓ 1/2 if n is even, and i ∈ {−(n−1)/2, …, 0, …, (n−1)/2} if n is odd.
5- Compute successively f = D(t)·F/(D(t) + F), P = (δ° − δd)·D(t)/(dρµ), ai = pi·f/D(t) and e = P·f/D(t).

Fig. 10 Simulation scheme of structural parameters adaptation
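The update procedure above can be summarized by the following sketch, which recomputes the structural parameters of all n viewpoints from the measured convergence distance D(t). The helper names and the NumPy formulation are ours; in particular, the generation of the indices i is a compact equivalent of the two cases given in step 4.

```python
import numpy as np

def viewpoint_indices(n):
    """Symmetric indices i of step 4: half-integers when n is even, integers when n is odd."""
    return np.arange(n) - (n - 1) / 2.0           # e.g. n = 8 -> -3.5 ... +3.5

def adapt_structural_parameters(D_t, n, d, b, F, mu=1.0, rho=1.0, delta=0.0, delta_o=0.0):
    """Steps 1-5: structural shooting parameters updated from the measured D(t)."""
    k = d / D_t                                   # step 2
    B = b * D_t / (mu * d)                        # step 3: inter-optical distance
    i = viewpoint_indices(n)
    p = i * B                                     # step 4: lateral positions of the optical centres
    f = D_t * F / (D_t + F)                       # step 5: focal length (autofocus)
    P = (delta_o - delta * d) * D_t / (d * rho * mu)   # vertical position of the lenses line
    a = p * f / D_t                               # lateral decentrings
    e = P * f / D_t                               # vertical decentring
    return B, p, f, P, a, e
```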


To illustrate the necessity of the structural parameters adaptation, an example is considered consisting in a generated 3D object describing an elliptical trajectory in the horizontal plane, thus presenting a sinusoidal convergence distance, while rotating on itself. The structural capturing parameters B, ai, f, e and P are calculated using the relations of Table 1 according to the convergence distance D(t), measured continuously. The evolution of the convergence distance and of the structural parameters is presented in Fig. 11-a. The derived multi-view perspective projection model incorporating the updated structural shooting parameters is used to project the points of the 3D object on the image planes. Several simulations have been performed; one of the most significant results is given in Fig. 11-b, showing a perfect and optimal 3D rendering at any time and at any position of the object in the scene.


Fig. 11-a Structural parameters evolution (convergence distance, lateral decentring, inter-optical distance, focal length)

These results show the necessity of the adaptation of the structural parameters of the multi-view shooting system according to the convergence distance at any time to ensure

a correct 3D rendering for any position at any time.

Fig. 11-b Produced 3D image; Object in the closest position D(t) = 3.5 m, Object in the furthest position D(t) = 6.5 m

2.7 The Dynamic Effects in Structurally Variable Multi-View Shooting System
In virtual reality, an ideal simulation scheme such as the one presented in Fig. 10 is sufficient to produce a correct 3D rendering of dynamic scenes, so the measurement problems, the mechanical constraints and the dynamic effects can be neglected without harming the production process or the quality of the rendered images and videos. However, for real dynamic scenes, as considered in this work, the measurement problems and the dynamic effects become crucial, and an ideal simulation scheme is inadequate. This work is carried out with a view to designing a multi-view capturing system with adaptive structural parameters, able to track the dynamics of the scene in order to produce correct images and a quality 3D rendering. For that, it is important to take into account and study the effects of the structural parameters' dynamics on the 3D rendering quality. We therefore propose to add a block of structural parameters' dynamics to the adaptation scheme of the structural parameters (Fig. 12). This block groups the mechanical constraints of the camera, its actuators' dynamics, as well as the behavior of their control sub-systems. Its inputs are the reference structural parameters updated according to the measured convergence distance D(t), and its outputs are the structural parameters subjected to the dynamic constraints of the camera, which are used in the 3D camera model. Fig. 13 presents the implementation of the structural adaptation of the multi-view shooting system to capture dynamic scenes. It involves the measurement of the convergence distance D(t) and of the different structural parameters. These measurements are used to compute the desired structural parameters in order to control their positions, allowing the tracking of the scene dynamics and a correct 3D rendering. In the next sections, this simulation scheme will help us show the degradation of the quality due to the constancy of the different structural parameters and to the dynamic constraints, for 1st and 2nd order behaviors. The impact during the transient states is shown for static scenes, while the impact of the tracking delay is observed for dynamic scenes. To quantify these impacts, an intuitive quality assessment method is considered: a reference image and a current image are compared by subtraction, and the number of colored pixels of the resulting error image defines an error quantifying the degradation of the 3D rendering quality.



Fig. 12 Simulation scheme of structural adaptation including dynamic constraints


Fig. 13 Implementation scheme of structural parameters adaptation

3. Error images and quantization tools
To evaluate the impact of the constancy of the structural parameters and of their dynamic effects on the rendering quality, the resulting 3D images are compared to a reference 3D image obtained under ideal conditions, where the shooting parameters are calculated theoretically. From this comparison an error image is obtained and then quantified by counting the number of its colored pixels, which defines an absolute error. This error image also allows a visual appreciation of the impact of these geometrical and dynamic constraints. We also define a relative error by dividing the absolute error by the total number of colored pixels of the reference image. A quality index is then defined to express the rendering quality directly.


3.1 Error images
An error image imerr is an image produced by the subtraction of two images: imerr = imref − im, where imerr is the error image, imref is the reference image and im is the current image. In our case the error image allows the comparison of an image subjected to the geometrical and dynamic constraints, reflected in the positioning of the image sensors, with a reference image obtained by a perfect positioning of them (Fig. 14). This image type allows a qualitative assessment of the impact that a positioning error of a given parameter has on the 3D rendering quality. However, the mere observation of these error images cannot provide a quantitative evaluation of the impact of the geometrical and dynamic constraints affecting the structural parameters, visual sensitivity being variable from one observer to another. The quantization of these error images will serve mainly to compare the different effects in terms of numerical quantities, which constitutes a valuable tool in our study.

Fig. 14 a- reference, b- current and c- error image

3.2 Error images quantization
To get a quantitative evaluation of the different effects, two types of error are adopted. They are based on counting the number of colored pixels in the reference and error images. To avoid redundant counting of pixels, RGB images are converted to grayscale images, giving one matrix for each image. For clarity, one considers error images with a white background, obtained by inverting the grayscale error images (Fig. 15). The absolute error Nabs is obtained by counting the colored pixels in the error image, computed with respect to an image obtained with non-erroneous parameters.



Fig. 15 Error image a-in grayscale and b-in inverted grayscale

The relative error Nrelat is the ratio between the number of colored pixels in the current error image, Nabs, and the number of colored pixels in the reference image, Nref, at the same instant. It is defined as follows:
Nrelat = Nabs / Nref   (2)
This error can also be expressed as a percentage: Nrelat% = Nrelat × 100. Note that the complement to 1 of the relative error, representing the image quality, is also adopted:
Q = 1 − Nabs / Nref   (3)
Thus the error is smaller when Q is closer to 1. Let us now discuss the importance of the adaptation of each capture parameter separately; we will then study the influence of the transient state dynamics on the rendering quality.
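The following sketch illustrates how the absolute error, the relative error (2) and the quality index (3) can be computed from a reference image and a current image. The grayscale conversion by channel averaging and the assumption of a black background are our own simplifications of the counting method described above.

```python
import numpy as np

def quantify_error(im_ref, im):
    """Nabs, Nrelat and Q of relations (2)-(3) from two RGB images of equal shape."""
    gray_ref = np.asarray(im_ref, dtype=float).mean(axis=2)  # RGB -> grayscale
    gray_cur = np.asarray(im, dtype=float).mean(axis=2)
    gray_err = np.abs(gray_ref - gray_cur)                   # error image
    n_ref = int(np.count_nonzero(gray_ref))                  # colored pixels of the reference image
    n_abs = int(np.count_nonzero(gray_err))                  # colored pixels of the error image
    n_relat = n_abs / n_ref                                   # relation (2)
    quality = 1.0 - n_relat                                   # relation (3)
    return n_abs, n_relat, quality
```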

4. Evaluation of the influence of shooting parameters constancy on the rendering quality
In this section we study separately the influence of the constancy of the capturing parameters on the 3D rendering quality. We are particularly interested in the focal length f and the lateral decentring ai. As for the vertical decentring e and the capture elevation P, they play roles analogous, respectively, to the lateral decentring ai and the inter-optical distance B. In this study e and P are considered null, by assuming that the observation elevation δo is zero. The inter-optical distance B is, on the contrary, updated. Four cases are considered: first, ai = 0, in order to show that the adaptation of the inter-optical distance B alone is insufficient and that the lateral decentring is very important; second, the case where both f and ai are constant (not updated) but non-zero, to show the influence of their constancy on the 3D rendering quality; this case is then compared to two other cases where only f or only ai is not updated. Naturally, each of these cases is compared to the ideal one where all the parameters are updated together.

4.1 Case where ai is zero
In this case all parameters are updated except the lateral decentrings ai, which are assumed to be constant and zero. This implies that the sighting axes are parallel and that the distance of the scene from the camera is infinite, whatever the variations of the inter-optical distance and of the focal length:
ai = (pi(t)/D(t))·f   (4)
However, in practice the distances to the point of convergence are finite, which implies a greater disparity of the images relative to the required value. The results obtained for a convergence distance D = 3.5 m, for example, are presented in Fig. 16 and Table 2.

Fig. 16 a- reference, b- current and c- error images in inverted grayscale

Table 2 Images quantification for ai = 0
The number of pixels of the reference image Nref: 2291779 p
The number of pixels of the error image Nabs: 380393 p
The relative error Nrelat: 0.1660
The relative error in percentage Nrelat%: 16.5982 %
Quality index Q: 0.8340

The obtained results show a significant disparity between the different viewpoint images, which produces a significant error and results in a scattered and aggressive 3D rendering.

4.2 Case where f and ai are constant
These parameters (f and ai) are fixed according to a constant value of the convergence distance. Consider for example D0, the convergence distance corresponding to the centre of the elliptical trajectory. The corresponding focal length is computed as follows:
f = 1 / (1/F − 1/D0)   (5)
f being constant, the lateral decentring ai(t) also remains constant, since the ratio B(t)/D(t) between the inter-optical distance and the convergence distance remains unchanged, implying that:
ai = (pi(t)/D(t))·f   (6)
Knowing that pi(t) = i·B(t), it yields:
ai = i·(B(t)/D(t))·f   (7)
where
i ∈ {−n/2, …, 0, …, n/2} ∓ 1/2 if n is even, and i ∈ {−(n−1)/2, …, 0, …, (n−1)/2} if n is odd   (8)
and
B(t)/D(t) = b/(dµ) = cste   (9)


It yields:
ai = i·(b/(dµ))·f   (10)
This means that the lateral decentring varies only according to the variation of f. The update of the lateral decentring (and of the vertical decentring when P ≠ 0) accompanies the update of the focal length (autofocus). Since f is constant, the lateral decentring ai is constant but non-zero. The results obtained by simulating this case are depicted in Fig. 17.





Fig. 17 Evolution of focal length f and lateral decentring ai

The focal length should have a range of variation of 0.039 mm for a variation range of 3000 mm of the convergence distance D(t); the maximum discard is 0.0256 mm = 25.6 µm. This constancy of the focal length implies the constancy of the lateral decentrings, which should have a range of variation of 0.0036 mm and a maximum discard of 0.0024 mm = 2.4 µm for a1, for example.

Fig. 18 a- Obtained 3D image and b- error image in inverted grayscale when the discards are maximal

In Fig. 18 the case of constant focal length and lateral decentrings at the time of maximum discard is illustrated. Fig. 19 represents the amount of error during the entire movement of the cube along its elliptical orbit. Quantification of these images gives the following values: Nref = 2291779 p, Nabs = 2944 p, Nrelat = 0.0013, Nrelat% = 0.13 %, Q = 0.9987.

Fig. 19 Absolute and relative error evolution

As expected, the error image is denser when the discard on the parameters f and ai is larger. However, only a visualization of the 3D images and videos allows an assessment of their quality. Indeed, visualization of the 3D sequence on an auto-stereoscopic screen shows a slight quality degradation when the discard is maximal, corresponding to the moment when the 3D object moves closer to the camera.

4.3 Case where only the focal length f is constant
In this case, the lateral decentring of each sensor is updated according to the theoretically updated value of the focal length, while the focal length actually remains constant. It is fixed by considering a constant value of the convergence distance D0 corresponding, for example, to the center of the elliptical orbit:
f = 1 / (1/F − 1/D0)   (11)
The value of the focal length in our example is f = 17.27 mm. All the other parameters are updated according to the geometric relations of Table 1. The obtained results for these parameters, at the moment of


a maximum error on f, are presented in Fig. 20. The quantification of the 3D images and error images provides the results presented in Fig. 21.

Fig. 20 a- Obtained and b- error image in inverted grayscale

Quantification of these images gives the following values: Nref = 2291779 p, Nabs = 4945 p, Nrelat = 0.0022, Nrelat% = 0.22 %, Q = 0.9978.

As in the previous case, the error image is denser when the discard on f is greater. We also observe that the error is larger compared to the previous case. The visualization of the resultant images on an auto-stereoscopic display shows a slight degradation of the rendering quality when the object moves away from the camera. This degradation becomes more visible when the object approaches the camera, whereas the discard on the focal length is then less important. This difference with the previous case is due to the unnecessary update of the lateral decentring, which normally depends on, and must accompany, the update of the focal length f (the autofocus). Furthermore, the adopted projection model assumes that the optical group is ideal and has no distortion, so the multi-view projection provides a single image point for a given point M of the captured scene, giving consistently sharp images on the different points of view. In the case of real lenses, it is often necessary to perform an autofocus according to the distance of the scene from the camera to obtain sharp images on the different points of view.

4.4 Case where only ai is constant

We examine in this subsection the effect of the constancy of the lateral decentring when autofocus takes place. The focal length is then updated while the lateral decentring is kept constant, despite the relation between these two parameters:
ai = i·(b/(dµ))·f   (12)
The error due to this constancy of the lateral decentring is presented in Fig. 22.


Fig. 22 Current 3D image and error image in inverted grayscale


Quantification of these images gives the following values: Nref = 2291779 p, Nabs = 4260 p, Nrelat = 0.0019, Nrelat% = 0.19 %, Q = 0.9981. For a sinusoidal variation of the convergence distance we obtain the results shown in Fig. 23.


Fig. 21 Absolute and relative error evolution




Fig. 23 Absolute and relative error evolution

These results show that the effect of the lateral decentring constancy is less than the effect of the focal length



constancy when considered separately; the effect is even less significant when they are considered together, whether updated or constant. That said, it is often necessary to perform an autofocus to get sharp images on the different points of view. The adaptation of the lateral decentring then comes as a complement to the autofocus, ensuring a precise disparity and matching of the different viewpoints in order to obtain a perfect 3D rendering. Remark: even if an autofocus is not necessary, it is always useful to motorize the lateral decentrings because of their dependence on the viewing distance of the display device and on the relative enlargement factor (width/depth) µ, through the relation ai = i·(b/(dµ))·f. The necessity of the adaptation of the structural parameters has now been shown. Hence, let us focus on the problems inherent to this adaptation. Indeed, the parameters will be updated by motorizing the different components of the camera (lenses and sensors). These movements are affected by dynamic phenomena whose effects on the rendering quality are studied in the following section.



5. Evaluation of the impact of the dynamic constraints on the rendering quality
In this section, the influence of the dynamic phenomena of the camera, in both the transient and the tracking states, on the 3D rendering quality is studied. In the simulation scheme presented in Fig. 13, a block grouping the mechanical constraints of the camera, the dynamic constraints of its actuators and the behavior of its control sub-systems is inserted. Two types of behavior are considered in this study: first and second order dynamic behaviors, since they include the different phenomena that may occur during the transient or steady states of the movements.


5.1 Transient states study
During the transient states the structural parameters of the camera are in transition between their initial position and the reference one, so a transient error appears throughout this period, involving a degradation of the 3D rendering. We study in this subsection the transient states of first and second order behaviors and their impact on the quality of the 3D rendering.


5.1.1 First order dynamic behavior
Let us consider the case where the shooting system involves a first order dynamic behavior, as described by the following differential equation:
τ·dy(t)/dt + y(t) = K·x(t)   (13)
where τ is the time constant of the system and K is its static gain. For a static scene situated at a constant distance, we have calculated the error image and the quantities associated with the characteristic time instants of a first order dynamic system (t = τ, t = tr = 3τ), and the visual assessment of the 3D rendering quality displayed via an auto-stereoscopic screen is noted. To have a more global view and to compare the influence of different time constants during the transient state, charts of the evolution of the structural parameters and of the quantization of the error images are presented in Fig. 24 and Fig. 25 for τ = 0.1, 0.3, 0.5, 0.7. The transient state is longer for high values of τ, implying a 3D rendering which forms very slowly. It should be noted that the time response of the mechanical system to 95% of the reference values of the various structural parameters does not correspond to the regime ensuring a satisfactory 3D rendering.
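A simple discrete simulation of relation (13) is sketched below; it is only an illustration (explicit Euler discretization, our own variable names) of how a structural parameter lags behind its reference during the transient state and when tracking a moving scene.

```python
import numpy as np

def first_order_response(reference, dt, tau, K=1.0, y0=0.0):
    """Response of relation (13), tau*dy/dt + y = K*x, to a sampled reference signal."""
    out = np.empty(len(reference), dtype=float)
    y = y0
    for idx, x in enumerate(reference):
        y += dt / tau * (K * x - y)        # one Euler step of the first-order lag
        out[idx] = y
    return out

# For a step reference, the output reaches about 95% of its final value near t = 3*tau;
# for a sinusoidal reference (dynamic scene) it is tracked with a delay and an attenuation.
t = np.arange(0.0, 8.0, 0.01)
step_response = first_order_response(np.ones_like(t), dt=0.01, tau=0.5)
```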






Fig. 24 Evolution of the camera’s structural parameters for different τ

The regime ensuring a satisfactory 3D rendering is reached only when the positioning errors become very small, corresponding to a time more than seven times larger than the time constant τ. To illustrate the effects of the transient state on the 3D rendering quality, poses are taken at the characteristic time instants of a 1st order transient state (t = τ, t = 3τ) for τ = 0.2 s (Fig. 26 and Table 3).






Fig. 25 Evolution of the absolute and relative error for different values of τ

Fig. 26 Current and error image in inverted grayscale during the transient state (t = τ and t = 3τ)

Quantifying the error image gives the following values:

Table 3 Images quantification for t = τ and t = 3τ
         Nref        Nabs      Nrelat   Nrelat%    Q
t = τ    2288529 p   60943 p   0.0266   2.663 %    0.9734
t = 3τ   2288529 p   58157 p   0.0254   2.5412 %   0.9746

At time t = τ, corresponding to 50% of the parameters reference values, there is a significant error on the interlaced image and the 3D rendering is still in the transient phase. At time t = 3τ, corresponding to the response time tr where the system is supposed to reach the steady state at 95% of the parameters reference values, the error on the 3D image is still important and the 3D rendering is of poor quality and very badly perceived on an auto-stereoscopic screen. The perception becomes satisfactory only when the positioning error reaches very low values in the steady state. The most annoying effect is that at every change of scene plane a transient phase is engaged, leading to an important error on the images and penalizing significantly the 3D rendering quality in the middle of the sequence and not only at its beginning. However, after every transient state the 3D rendering forms and improves gradually as the structural parameters approach their reference values, and becomes optimal as soon as the steady state is reached and the quantified error becomes sufficiently small.

5.1.2 Second order dynamic behavior
A second order behavior is described by the following differential equation:
(1/ωn²)·d²y(t)/dt² + (2ξ/ωn)·dy(t)/dt + y(t) = K·x(t)   (14)
where ξ is the relative damping factor, ωn is the undamped resonance frequency and K is the static gain. The system response may be aperiodic or pseudo-periodic according to the damping factor ξ. Fig. 27 represents the step response and the associated errors for different values of ξ, and Fig. 28 represents the corresponding quantities of the error images, namely the absolute and relative errors. For small values of ξ the pseudo-periodic movement presents oscillations around the desired values of the structural parameters, leading to an important error on the 3D images (Fig. 28) and translating into a scattered 3D rendering when the parameters move away from their reference values, and a correct 3D rendering in the neighborhood of the desired values. This oscillatory effect is reduced for greater values of ξ (Fig. 29). Thus, the perception of the 3D rendering during the pseudo-periodic state is more unpleasant than in the case of an aperiodic transient state, especially for ξ = 1, which gives a smaller response time. As soon as the steady state is reached, the 3D rendering gradually becomes correct and satisfactory. We observe that the rendering quality is less degraded when the damping factor is greater: the oscillations are smaller and less frequent than when the damping is small. Note that the transient state intervenes every time a significant and fast variation of the convergence distance D(t) occurs, whether at the beginning or during the operation of the camera, and this for all motorization types. The interest is then to reduce as much as possible the duration of the transient state by reducing the time constant. Recall that reaching 95% of the reference values of the camera's structural parameters is not sufficient to provide an acceptable 3D rendering; a higher accuracy is required, which delays the forming of a quality 3D rendering.
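As for the first order case, relation (14) can be simulated with a simple discretization to visualize the aperiodic or pseudo-periodic transients discussed above. The sketch below uses our own Euler-type integration and variable names and is not the authors' implementation.

```python
import numpy as np

def second_order_response(reference, dt, omega_n, xi, K=1.0):
    """Response of relation (14): (1/omega_n^2) y'' + (2*xi/omega_n) y' + y = K*x."""
    out = np.empty(len(reference), dtype=float)
    y, v = 0.0, 0.0                        # parameter value and its velocity
    for idx, x in enumerate(reference):
        acc = omega_n**2 * (K * x - y) - 2.0 * xi * omega_n * v
        v += dt * acc
        y += dt * v
        out[idx] = y
    return out

# xi = 0.2 overshoots and oscillates around the reference (pseudo-periodic state),
# whereas xi = 1 converges without oscillation (aperiodic state).
t = np.arange(0.0, 8.0, 0.001)
oscillating = second_order_response(np.ones_like(t), dt=0.001, omega_n=5.0, xi=0.2)
aperiodic   = second_order_response(np.ones_like(t), dt=0.001, omega_n=5.0, xi=1.0)
```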


Inter-optical distance

(mm)

200

Β

ξ=1

ξ=0.7

9

ξ=0.4

ξ=0.2

Absolute error

x 10

ξ=1 ξ=0.7 ξ=0.4 ξ=0.2

8

100

7

0

0

1

2

3 4 5 Lateral decentering

6

7

8

6 (pixels)

(mm)

0 -2 -4

0

1

2

(mm)

40

ξ=1

ξ=0.7

3

4 Focal length

ξ=1

ξ=0.7

ξ=0.4 5

6

7

ξ=0.4

3

8

2

ξ=0.2

1

0

1

2

3

4 t(s)

5

6

7

0

8

(mm)

ξ=1

50

ξ=0.7

1

2

3

4 t (s)

5

6

7

ξ=0.4

ξ=1 ξ=0.7 ξ=0.4 ξ=0.2

ξ=0.2 35 30

0

1

2

3 4 5 Error on Lateral decentering

6

7

8 25

1 (%)

0 -1 -2

ξ=1 0

1

2

ξ=0.7

ξ=0.4

3 4 5 Error on Focal length

6

ξ=0.2 7

ξ=1

ξ=0.7

20 15

8 10

20 10

8

40

0 -50

(mm)

0

Relative error

Error on Inter-optical distance 100

(mm)

4

ξ=0.2

20 0

5

ξ=0.4

ξ=0.2 5

0 -10

0

1

2

3

4 t(s)

5

6

7

8

Fig. 27 Evolution of the camera’s structural parameters for different values of ξ


Fig. 28 Absolute and relative error in error images for different values of ξ


Fig. 29 Obtained 3D images during the transient state (1st and 2nd overshoot, ξ = 0.4 and ξ = 0.7)

5.2 Tracking delay study
Let us now consider the problem of dynamic scene tracking. One focuses on the responsiveness of the shooting system to the dynamics of the moving scene. So,

we resume the example of the 3D object performing an elliptical trajectory and presenting a sinusoidal convergence distance D(t).


5.2.1 First order dynamic behavior


In the case of a first order dynamic behavior with a time constant τ = 0.7 s, one notices an important tracking error between the current values of the structural parameters of the camera and their reference values. This reflects the slowness of the system and its low responsiveness (Fig. 30-a). The tracking error on the structural parameters implies an error in the 3D images (Fig. 31-a). The 3D rendering is very badly perceived on an auto-stereoscopic screen except at some punctual positions. Fig. 32-a represents the 3D rendering quality at a position where the error is very important.


Fig. 31 Absolute and relative error evolution for time constant (a) 0.7 s and (b) 0.1 s


Fig. 32 Obtained and error image for time constant (a) 0.7 s and (b) 0.1 s at t = 10 s


Fig. 30 Evolution of the camera's structural parameters for time constant (a) 0.7 s and (b) 0.1 s

The tracking error decreases when the time constant is smaller (e.g. τ = 0.1 s) (Fig. 30-b). The evolution of the resulting error on the 3D image is presented in Fig. 31-b. The quality of the 3D rendering displayed on an auto-stereoscopic screen shows a slight alteration when the 3D object goes through the lateral poles of the elliptical trajectory, where the variation of the sinusoidal convergence distance D(t) is fast (Fig. 32-b). The quality is satisfactory when the object goes through the poles of the Z axis, i.e. when the 3D object is in the closest or furthest position from the camera and the variation of the convergence distance is slow.

Table 4 Images quantification for τ = 0.1 s and τ = 0.7 s
           Nref        Nabs      Nrelat   Nrelat%    Q
τ = 0.1 s  2295092 p   11471 p   0.005    0.4998 %   0.995
τ = 0.7 s  2295092 p   48561 p   0.0212   2.1159 %   0.9788

5.2.2 Second order dynamic behavior
In the case of a second order behavior with a damping factor ξ = 0.2, the tracking of a dynamic scene presents a very oscillatory transient state (Fig. 33-a), as in the case of static scenes, producing a very aggressive 3D rendering. During the steady state a residual error leads to a slight degradation of the 3D rendering at the places where the movement of the object in the scene is faster (Fig. 34-a and Fig. 35-a). This reflects the lack of reactivity of the tracking system.


Fig. 35 Obtained and error image for (a) ξ=0.2 and (b) ξ=1 at t = 16 s


As the value of the damping factor ξ is increased (e.g. ξ = 1), the transient state is softened, as shown in subsection 5.1.2, but the responsiveness of the system is reduced and the tracking error becomes important (Fig. 33-b), implying a significant error in the 3D images (Fig. 34-b) and leading to a very degraded 3D rendering quality, especially at the places where the movement along the Z axis is fast (Fig. 35-b).




Fig. 33 Evolution of the camera's structural parameters for (a) ξ=0.2 and (b) ξ=1


Fig. 34 Absolute and relative error evolution for (a) ξ=0.2 and (b) ξ=1

6. Conclusion
In this paper, a global shooting/viewing geometrical process for auto-stereoscopic visualization with a parallel and decentred multi-view shooting configuration has been presented, together with some industrial applications based on it. An appropriate multi-view perspective projection model has then been derived and a simulator worked out under the Matlab/Simulink environment. After validating experimentally the global shooting/viewing geometrical model by visualizing the resulting 3D images and videos on an auto-stereoscopic screen, the capture problem of dynamic scenes has been approached and the necessity of adapting the structural shooting parameters demonstrated. The effect of the geometrical and dynamic constraints of the shooting system on the 3D rendering quality has then been discussed. At first, the impact of the constancy of the different structural parameters was studied: to ensure a correct and satisfactory 3D rendering, all the structural parameters should be motorized for their adaptation. However, if the autofocus is not needed, the focal length can be kept constant and not motorized; in this case the lateral decentrings are also kept constant, but they should still be motorized in order to adapt to other viewing distances. Then, the effect of the system dynamics was approached. In the case of a first order behavior, the transient and tracking states depend on the value of the time constant: if it is large, the response time is high and so is the tracking error, inducing a major error on the 3D rendering; if the time constant is small, the transient state is short and the tracking error is small, only slightly degrading the 3D rendering quality. Conversely, in the case of a second order behavior, an aperiodic transient state is obtained for high values of the damping factor; these high values, however, induce a significant error in the tracking state, involving a significant degradation of the 3D rendering. Values of the damping factor lower than unity provoke a pseudo-periodic transient state translating into a very aggressive 3D rendering, while the tracking state then presents an error only slightly degrading the 3D rendering quality. Note also that any undesirable dynamic effect (vibrations, dead zone, …) which would provoke a


significant error on the structural parameters of the shooting system involves a degradation of the 3D rendering quality. For this reason, the various aspects of the design must be considered with great attention, ranging from the mechanical design to the elaboration of the control strategy, through a meticulous choice of the actuators. This study calls for the development of sophisticated control strategies to ensure a high responsiveness of the tracking system and a high accuracy, with all the difficulties that this raises. Indeed, visual servoing techniques can in particular be explored for the positioning of the multi-view shooting system elements and the tracking of dynamic scenes.

Acknowledgment The authors would like to thank the Region Champagne Ardenne and the Agence Nationale de la Recherche for their support within the projects CPER CREATIS and ANR CAM RELIEF.

References
1. W. Sanders, D. F. McAllister (2003) Producing anaglyphs from synthetic images, In Stereoscopic Displays and Virtual Reality Systems X, 5006 of Proceedings of SPIE:348–358
2. E. Dubois (2001) A projection method to generate anaglyph stereo images, In Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP '01), 3:1661–1664, IEEE Computer Society Press
3. R. Blach, M. Bues, J. Hochstrate, J. Springer, B. Fröhlich (2005) Experiences with multi-viewer stereo displays based on lc-shutters and polarization, In Proceedings of IEEE VR Workshop: Emerging Display Technologies
4. L. M. J. Meesters, W. A. IJsselsteijn, P. J. H. Seuntiëns (2004) A Survey of Perceptual Evaluations and Requirements of Three-Dimensional TV, IEEE Trans. on Circuits and Systems for Video Technology, 14(3):381–391
5. K. Perlin, S. Paxia, J. S. Kollin (2000) An autostereoscopic display, In Proceedings of the 27th ACM Annual Conference on Computer Graphics (SIGGRAPH '00), 33:319–326
6. N. A. Dodgson (2002) Analysis of the viewing zone of multiview autostereoscopic displays, In Stereoscopic Displays and Virtual Reality Systems IX, 4660 of Proceedings of SPIE:254–265
7. S. Fu, H. Bao, Q. Peng (1996) An accelerated rendering algorithm for stereoscopic display, Computers & Graphics: Techniques for Virtual Environments, 20(2):223–229
8. U. Güdükbay, T. Yilmaz (2002) Stereoscopic view-dependent visualization of terrain height fields, IEEE Trans. on Visualization and Computer Graphics, 8(4):330–345
9. P. Kauff, N. Atzpadin, C. Fehn, M. Müller, O. Schreer, A. Smolic, et al. (2007) Depth map creation and image-based rendering for advanced 3DTV services providing interoperability and scalability, Int. Journal on Signal Processing: Image Communication, 22(2):217–234
10. K. Müller, A. Smolic, K. Dix, P. Merkle, P. Kauff, T. Wiegand (2008) View synthesis for advanced 3D video systems, EURASIP Journal on Image and Video Processing, Hindawi Publishing Corporation
11. G. Jones, D. Lee, N. Holliman, D. Ezra (2001) Controlling perceived depth in stereoscopic images, In Proc. SPIE Stereoscopic Displays and Virtual Reality Systems VIII, 4297:42–53
12. J. Prévoteau, S. Chalençon-Piotin, D. Debons, L. Lucas, Y. Remion (2010) Multi-view shooting geometry for multiscopic rendering with controlled distortion, Int. Journal of Digital Multimedia Broadcasting (IJDMB), Advances in 3DTV: Theory and Practice, Hindawi
13. M. Ali-Bey, S. Moughamir, N. Manamanni (2010) Towards Structurally Adaptive Multi-View Shooting System, 18th Mediterranean Conference on Control and Automation (MED), Marrakesh, Morocco
14. L. Onural, T. Sikora, J. Ostermann, A. Smolic, M. R. Civanlar, J. Watson (2006) An Assessment of 3DTV Technologies, Conf. Proc. NAB Broadcast Engineering, 456–467
15. E. Stoykova, A. A. Alatan, P. Benzie, N. Grammalidis, S. Malassiotis, J. Ostermann, et al. (2007) 3-D Time-Varying Scene Capture Technologies—A Survey, IEEE Trans. on Circuits and Systems for Video Technology, 17(11):1568–1586
16. Y. Rémion, L. Lucas, D. Debons, S. Aït Ouazzou, L. E. Afilal, S. Moughamir, N. Manamanni (2009) Dispositif de captation séquentielle d'une pluralité de prises de vues pour une restitution en relief à déformation contrôlée sur un dispositif multiscopique, French patent application No. 09/1077, filed March 9, 2009, Applicants: 3DTV Solutions & URCA

Appendix

The simulator used in this paper to perform the different tests is based on an appropriate perspective projection model, presented here, which integrates the capture parameters defined in Table 1. It is assumed that the focal length $f$, the vertical decentring $e$ and the vertical elevation $P$ are common to all the viewpoints. Three essential frames are defined (Fig. 2): a camera fixed frame $F_o = (O, X_o, Y_o, Z_o)$; $n$ frames associated with the optical centers, $F_{c_i} = (C_i, X_{c_i}, Y_{c_i}, Z_{c_i})$, whose origins are $C_i = [p_i, P, 0]_o$; and $n$ frames linked to the image sensors, $F_{\mathrm{Im}_i} = (I_i, X_{\mathrm{Im}_i}, Y_{\mathrm{Im}_i}, Z_{\mathrm{Im}_i})$, whose origins are $I_i = [a_i, -e, -f]_{C_i}$. The homogeneous coordinates of a visible point $M$ in the 3D capturing space are defined by ${}^{o}P_M = ({}^{o}x_M, {}^{o}y_M, {}^{o}z_M, 1)$ in the fixed frame $F_o$. The image point $m$ is represented by the vector of homogeneous coordinates ${}^{\mathrm{Im}_i}P_m = ({}^{\mathrm{Im}_i}x_m, {}^{\mathrm{Im}_i}y_m, {}^{\mathrm{Im}_i}z_m, 1)$, expressed in the image frame $\mathrm{Im}_i$. The transformations used in this projection model are obtained through the following steps.

First step: express the coordinates of the point $M$ in the frame linked to the $i$-th point of view ($i$-th optical center) using the extrinsic parameters of the camera. The homogeneous transformation allowing the passage from the fixed frame $F_o$ to the frame $F_{c_i}$ is defined by:

$$
{}^{c_i}P_M = {}^{c_i}T_o \, {}^{o}P_M = \begin{pmatrix} -{}^{o}x_M - p_i \\ -{}^{o}y_M + P \\ {}^{o}z_M \\ 1 \end{pmatrix} \qquad (15)
$$

with:

$$
{}^{c_i}T_o = \begin{pmatrix} {}^{c_i}R_o & {}^{c_i}P_o \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} -1 & 0 & 0 & -p_i \\ 0 & -1 & 0 & P \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad (16)
$$

where ${}^{c_i}R_o$ is the rotation matrix of the frame $F_o$ with respect to the frame $F_{c_i}$, and ${}^{c_i}P_o$ is the position vector of the origin $O$ of the frame $F_o$ with respect to the frame $F_{c_i}$.

Second step: express the coordinates of the image point in the frame $F_{c_i}$ using a perspective projection. In the absence of distortions, the point $M$ is projected onto the image plane at a point $m$ whose homogeneous coordinate vector is ${}^{c_i}P_m = ({}^{c_i}x_m, {}^{c_i}y_m, {}^{c_i}z_m, 1)$. Using Thales' theorem, one can write from Fig. 3 and Fig. 4 that:

$$
\frac{{}^{c_i}z_m}{{}^{c_i}z_M} = \frac{{}^{c_i}x_m}{{}^{c_i}x_M} = \frac{{}^{c_i}y_m}{{}^{c_i}y_M} \qquad (17)
$$

Knowing that ${}^{c_i}z_m = -f$ and using (17), it yields:

$$
\begin{cases}
{}^{c_i}x_m = -f \dfrac{{}^{c_i}x_M}{{}^{c_i}z_M} \\[2mm]
{}^{c_i}y_m = -f \dfrac{{}^{c_i}y_M}{{}^{c_i}z_M} \\[2mm]
{}^{c_i}z_m = -f
\end{cases} \qquad (18)
$$

which can be written in matrix form as:

$$
{}^{c_i}P_m = \frac{-f}{{}^{c_i}z_M} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -\dfrac{{}^{c_i}z_M}{f} \end{pmatrix} {}^{c_i}P_M \qquad (19)
$$

This transformation defines the perspective projection of the point $M$ onto its image point $m$. Let us denote this transformation by:

$$
T_{\mathrm{perspective}} = \frac{-f}{{}^{c_i}z_M} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -\dfrac{{}^{c_i}z_M}{f} \end{pmatrix} \qquad (20)
$$

Third step: passage from the frame $F_{c_i}$ to the frame $F_{\mathrm{Im}_i}$, i.e. expressing the coordinates of the image point $m$ in the image frame $F_{\mathrm{Im}_i}$ using the following homogeneous transformation:

$$
{}^{\mathrm{Im}_i}T_{c_i} = \begin{pmatrix} {}^{\mathrm{Im}_i}R_{c_i} & {}^{\mathrm{Im}_i}P_{c_i} \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} -1 & 0 & 0 & a_i \\ 0 & -1 & 0 & -e \\ 0 & 0 & 1 & f \\ 0 & 0 & 0 & 1 \end{pmatrix} \qquad (21)
$$

Using (21), the image point vector is obtained as follows:

$$
{}^{\mathrm{Im}_i}P_m = {}^{\mathrm{Im}_i}T_{c_i} \, {}^{c_i}P_m \qquad (22)
$$

giving the coordinates of the image point in the image frame:

$$
\left[ {}^{\mathrm{Im}_i}x_m ,\; {}^{\mathrm{Im}_i}y_m ,\; {}^{\mathrm{Im}_i}z_m \right]^{T} = \left[ f\,\frac{{}^{c_i}x_M}{{}^{c_i}z_M} + a_i ,\; f\,\frac{{}^{c_i}y_M}{{}^{c_i}z_M} - e ,\; 0 \right]^{T} \qquad (23)
$$

From (15), (19) and (22), the overall projection model is given by:

$$
{}^{\mathrm{Im}_i}P_m = {}^{\mathrm{Im}_i}T_{C_i} \, T_{\mathrm{perspective}} \, {}^{C_i}T_o \, {}^{o}P_M \qquad (24)
$$

which can be written in matrix form as:

$$
{}^{\mathrm{Im}_i}P_m = \frac{f}{{}^{o}z_M} \begin{pmatrix} -1 & 0 & 0 & -p_i + a_i \dfrac{{}^{o}z_M}{f} \\ 0 & -1 & 0 & P - e \dfrac{{}^{o}z_M}{f} \\ 0 & 0 & -1 & {}^{o}z_M \\ 0 & 0 & 0 & \dfrac{{}^{o}z_M}{f} \end{pmatrix} {}^{o}P_M \qquad (25)
$$

The coordinates of the image point are then expressed as:

$$
\begin{cases}
{}^{\mathrm{Im}_i}x_m = -\dfrac{f}{{}^{o}z_M}\left( {}^{o}x_M + p_i \right) + a_i \\[2mm]
{}^{\mathrm{Im}_i}y_m = \dfrac{f}{{}^{o}z_M}\left( -{}^{o}y_M + P \right) - e
\end{cases} \qquad (26)
$$

Finally, the image point coordinates in pixels are given by:

$$
\begin{cases}
u = u_o + \dfrac{{}^{\mathrm{Im}_i}x_m}{l_u} \\[2mm]
v = v_o - \dfrac{{}^{\mathrm{Im}_i}y_m}{l_v}
\end{cases} \qquad (27)
$$

where $(u_o, v_o)$ are the coordinates of the principal point $I_i$ and $l_u$, $l_v$ denote the pixel dimensions along the directions $u$ and $v$, respectively.
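For completeness, a minimal numerical sketch of this projection model is given below (in Python/NumPy rather than the Matlab/Simulink implementation used for the simulator; the parameter values in the usage example are hypothetical and serve only as an illustration). It builds the homogeneous transformations of equations (16), (20) and (21), composes them as in (24), and converts the result to pixel coordinates as in (27).

```python
import numpy as np

def project_point(M_o, p_i, a_i, P, e, f, u0, v0, lu, lv):
    """Project a 3D point M (coordinates in the fixed frame Fo) onto the
    i-th image, following equations (15)-(27). All geometric quantities
    are assumed to be expressed in the same metric unit."""
    x, y, z = M_o
    PM_o = np.array([x, y, z, 1.0])

    # Eq. (16): passage from the fixed frame Fo to the optical-centre frame Fci
    T_o_to_ci = np.array([[-1.0,  0.0, 0.0, -p_i],
                          [ 0.0, -1.0, 0.0,   P ],
                          [ 0.0,  0.0, 1.0, 0.0 ],
                          [ 0.0,  0.0, 0.0, 1.0 ]])
    PM_ci = T_o_to_ci @ PM_o

    # Eq. (20): perspective projection onto the image plane z = -f
    z_ci = PM_ci[2]
    T_persp = (-f / z_ci) * np.array([[1.0, 0.0, 0.0, 0.0],
                                      [0.0, 1.0, 0.0, 0.0],
                                      [0.0, 0.0, 1.0, 0.0],
                                      [0.0, 0.0, 0.0, -z_ci / f]])
    Pm_ci = T_persp @ PM_ci

    # Eq. (21): passage from Fci to the image frame FImi
    T_ci_to_im = np.array([[-1.0,  0.0, 0.0, a_i],
                           [ 0.0, -1.0, 0.0, -e ],
                           [ 0.0,  0.0, 1.0,  f ],
                           [ 0.0,  0.0, 0.0, 1.0]])
    Pm_im = T_ci_to_im @ Pm_ci        # eq. (22)

    # Eq. (27): conversion to pixel coordinates
    u = u0 + Pm_im[0] / lu
    v = v0 - Pm_im[1] / lv
    return u, v

# Hypothetical usage example (all lengths in mm, pixel pitch in mm/pixel):
u, v = project_point(M_o=(100.0, 50.0, 4000.0),
                     p_i=32.5, a_i=0.4, P=0.0, e=0.2, f=25.0,
                     u0=960.0, v0=540.0, lu=0.006, lv=0.006)
print("pixel coordinates: u=%.1f, v=%.1f" % (u, v))
```

Repeating the call over the $n$ optical-centre positions $p_i$ and decentrings $a_i$ yields the $n$ viewpoint images of the same scene point, which is essentially what the simulator does for every point of the synthetic scene.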