
COMPUTER GENERATED CONTENT FOR 3D TV DISPLAYS

G. Milnthorpe, M. McCormick, A. Aggoun, N. Davies, M. Forman
De Montfort University, UK

ABSTRACT

The development of 3D TV systems and displays for public use requires that several important criteria be satisfied. The criteria include the perceived resolution, full colour presentation, compatibility with current 2D systems in terms of frame rate and transmission data, human-factors acceptability and seamless autostereoscopic viewing. There are several candidate 3D technologies, for example stereoscopic multiview, holographic and integral imaging, that endeavour to satisfy these technological conditions. Although cost is an important parameter, it is difficult at the early stages of the development cycle to make more than general judgements on the advantages of a particular arrangement. The perceived advantages of integral imaging are that only a single aperture camera is needed to capture the 3D data. In addition, the display is a full 3D optical model, the model is correctly scaled throughout the image space, and in viewing, accommodation and convergence occur naturally, thereby preventing possible eyestrain. Consequently it appears to be ideal for prolonged human use.

INTRODUCTION

It is generally accepted that three-dimensional image display systems offer a sense of improved realism and naturalness over conventional two-dimensional systems, Motoki et al (1), by stimulating both psychological and physiological depth cues in the human visual system. Many display systems have been proposed, most of which use stereoscopic techniques to stimulate the binocular convergence response. For most comfortable viewing, however, additional cues such as motion parallax due to minute eye movements should be exploited. Integral imaging (II) is a technique, first proposed by Lippmann (2) as integral photography in 1908 and subsequently developed by Ives (3) and others, Burckhardt and Doherty (4), Demontebello (5), Dudnikov et al (6), which overcomes the limitations of purely stereoscopic systems by including these additional cues. An integral image is recorded on a planar detector surface, by an encoding lenticular or microlens array, as a two-dimensional distribution of intensities termed a lenslet-encoded spatial distribution (LeSD). Replay is via a similar decoding array, and when viewed the image exhibits full natural colour and continuous parallax within a centred viewing zone whose angular extent depends on the physical parameters of the encoding and decoding arrays used (typically around 30 degrees). Unlike stereoscopic display systems, where the scale of an image distorts with changes in viewing distance, the replayed image retains correct scale independent of viewing distance. Holographic images are perhaps the most complete depictions of object space; unfortunately, although a substantial amount of work has been carried out on holographic systems, the realization of practical real-time displays is still some time away. Consequently the development of realizable 'holographic-like' systems such as II is opportune. Both holographic and II systems reconstruct true 3D optical models of the object space, the former via a data set that consists of amplitude and phase data, the latter by recording amplitude and directional (angle) information.

At De Montfort University the Imaging Technologies Group has developed an II arrangement capable of real-time capture and replay, Davies et al (7), Stevens et al (8). A microlens or lenticular sheet may be used as the encoding and decoding element in the system. In operation, however, it differs from that of 'multiview' stereoscopic systems, where the lenticular sheet functions as a spatial demultiplexer to select appropriate discrete left-eye and right-eye planar views of a scene dependent on viewer position. Such stereoscopic systems require multiple cameras for real-time capture in order to produce a defined number of binocular views at the display, Valyus (9), Okoshi (10), Moor et al (11). In contrast, the II system uses a single 'camera' unit to capture a full volumetric optical model (VOM) of the object scene in a single time frame. The lens sheet simply samples the incoming information when encoding, and replays the information to the correct location in space to reconstruct an optical model of the original scene when decoding. This paper reports on the development of II, the computer generation of integral images, the conversion of scene data to integral form and its use in the PROMETHEUS project for editing and display of images. Animated 3D display sequences will be viewable on the BBC stand.

INTEGRAL IMAGING

To produce images that replay orthoscopically using Lippmann's approach it is necessary to use a two-stage process. The II camera, shown in Figure 1, is an innovative optical approach that not only overcomes the problem of the two-stage process but also allows the scene to be replayed both in front of and behind the decoding array (close imaging). The optical transmission inversion screen inverts the axial spatial sense of an object, projecting a pseudoscopic image for subsequent encoding. The double integral screen acts as a direction-selective field lens, transmitting the rays at equal and opposite angles. The pitch of the microlenses governs the lateral resolution, and the large apertures of the macrolens arrays retain the depth resolution. Aberrations induced by the input macrolens array are, to a large extent, cancelled by the output macrolens array. A lens array placed within the reformed pseudoscopic scene samples the image space, and the intensity distribution can be captured electronically or on film.

Figure 1 - II camera

This process has been incorporated into a software model capable of generating rendered orthoscopic integral images. Fortunately, in computer graphics it is not necessary to model the macrolens arrays or the double integral screen, as these merely invert the object space. This model, however, attempts to produce images in the same way as the integral camera, whereby each lenslet of the lens array is considered as fully apertured and therefore includes spherical aberration and field curvature. A sequence of images can be produced and MPEG-2 encoded to produce off-line animations that may be viewed on an LCD fitted with an appropriate decoding lens array. Sokolov (12) showed that integral images could be produced using a pinhole plate as a recording device, and Igarashi (13) used this model for generating integral images by computer before projecting them from a CRT onto the back surface of a microlens screen. Within the Prometheus project, Price and Thomas (14), a pinhole approach that will generate images in real time is presently being developed and is the second software model developed by this research. The reduced-content integral images produced, using a limited number of projection points, are sufficient for TV purposes.

COMPUTER GENERATION OF INTEGRAL IMAGES

Fully Apertured Microlens Model

Within the model, the output of the second macrolens array of the integral camera (Figure 1) is considered to be an aperture upon which exist projection points (or viewpoints), each of which projects the entire scene with respect to its location. This is a true representation of the action of the integral camera. Hence the scene is volumetrically reproduced by a number of discrete projections initiated from different viewpoints, and each projection point location is equivalent to the centre of a macrolens. In this sense the macrolenses of the second macrolens array are the starting point of the model, and the lenses are treated as pinholes producing a perfect pseudoscopic copy of the original scene (Figure 2).

Figure 2 - Projection points on aperture
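As a minimal sketch of this viewpoint layout (Python/NumPy, with illustrative names and parameters that are not taken from the authors' software), the projection points can be generated as the macrolens centres on the aperture plane, each acting as a pinhole centre of projection:

```python
import numpy as np

def projection_points(nx, ny, pitch):
    """Centres of the second macrolens array, treated as pinhole
    viewpoints on the aperture plane z = 0 (illustrative layout)."""
    xs = (np.arange(nx) - (nx - 1) / 2.0) * pitch
    ys = (np.arange(ny) - (ny - 1) / 2.0) * pitch
    return [np.array([x, y, 0.0]) for y in ys for x in xs]

def ray_through_vertex(viewpoint, vertex):
    """Unit direction of the skew ray from one projection point through
    a scene vertex; each viewpoint projects the whole scene this way."""
    d = vertex - viewpoint
    return d / np.linalg.norm(d)
```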

Integral images are by their very nature continuous images, as each part of the pseudoscopic scene is imaged in adjacent lenslets. The depth of scene that can be replayed without losing its integrity is mainly dependent upon the display resolution in terms of pixel size and the characteristics of the decoding screen. Two types of culling can be performed on the scene database. Object faces not seen from any of the projection locations can be culled, but whilst one location may 'see' a particular face another might not. In these circumstances the relevant faces relating to that position are ignored but not discarded.
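A minimal sketch of this per-viewpoint visibility test (hypothetical Python, assuming outward face normals are stored with the scene database; not the authors' implementation):

```python
import numpy as np

def faces_seen(viewpoint, faces):
    """Back-face test for one projection point. 'faces' is a list of
    (representative_vertex, outward_normal) pairs. A face failing the
    test is merely skipped for this viewpoint, not discarded, since
    another projection point may still see it."""
    seen = []
    for vertex, normal in faces:
        if np.dot(normal, viewpoint - vertex) > 0.0:
            seen.append((vertex, normal))
    return seen

# A face may be culled from the database outright only if it is
# invisible from every projection point.
```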

The extrapolated skew rays, from projection points through the polygon vertices, intersect the curved lenslet surfaces of the encoding lens array. Here they are refracted to their positions at the capture plane. This transfer and refraction process uses the standard properties of direction cosines, the refractive index and the surface curvature of the lenslets, Born and Wolf (15). The sampling effect of the encoding lens array, from each projection point, produces vertical stripes at the image plane when using semi-cylindrical lens arrays (Figure 3) and discontinuous vertical stripes when using spherical lens arrays. To be able to use interpolative shading techniques for this fully apertured lens model, the location of the end pixels of the stripes is required to enable interpolative filling. This can be achieved by generating the polygon perimeters in 3D object space using a 3D line-drawing algorithm and transmitting them through the array to the image plane. The processing is repeated for each projection point location and for each polygon in turn, whereby the LeSD is built up in the virtual recording medium.
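The transfer and refraction step can be sketched as follows (Python/NumPy; a simplified single-surface model with illustrative names, assuming spherical lenslet caps, rather than the authors' implementation). The refraction uses the standard vector form of Snell's law from direction cosines:

```python
import numpy as np

def refract(d, n, eta):
    """Vector form of Snell's law using direction cosines (after Born
    and Wolf). d: unit ray direction, n: unit normal facing the ray,
    eta: ratio of refractive indices n1/n2."""
    cos_i = -np.dot(n, d)
    sin2_t = eta * eta * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        return None  # total internal reflection: ray is lost
    return eta * d + (eta * cos_i - np.sqrt(1.0 - sin2_t)) * n

def through_lenslet(p, d, centre, radius, n_lens, z_capture):
    """Intersect a skew ray with a spherical lenslet surface, refract
    it, and transfer it to the capture plane z = z_capture."""
    oc = p - centre
    b = np.dot(d, oc)
    disc = b * b - (np.dot(oc, oc) - radius * radius)
    if disc < 0.0:
        return None  # ray misses this lenslet
    hit = p + (-b - np.sqrt(disc)) * d        # nearer intersection
    normal = (hit - centre) / radius          # faces the incoming ray
    d2 = refract(d, normal, 1.0 / n_lens)     # air into lens material
    if d2 is None:
        return None
    t = (z_capture - hit[2]) / d2[2]          # propagate to image plane
    return hit + t * d2
```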

Figure 3 - Integral polygon filling

Pinhole Lens Model

To derive an II software model capable of real-time processing it is necessary to model the lenslets, within the encoding array, as pinholes. As mentioned previously, the scene produced at the image plane from each projection point is displayed as stripes. The pinhole model does not suffer from spherical aberrations, and as such the positions of the stripes can be ascertained beforehand. This allows the stripes to be extracted from 2D images produced by the projection points and interlaced to form a composite LeSD, as sketched below. The advantages of this technique are that the perimeters of the polygons are not required in 3D object space, the refraction process and lens model are not required, and full use can be made of commercial graphics cards. To guarantee that continuous viewing is achieved a minimum number of projection points is needed, and at present-day processing speeds a PC cluster is required. The outputs from the cluster must be fed into a hardware compositor that sends the composite LeSD to an LCD fitted with a decoding microlens array.
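A minimal sketch of that interlacing step (Python/NumPy, assuming one pixel-wide stripe per view under each lenslet; the actual stripe ordering and widths depend on the display pixel pattern and the decoding array):

```python
import numpy as np

def interlace_lesd(views):
    """Interlace vertical stripes from N projection-point renderings
    into a composite LeSD for a semi-cylindrical decoding array.
    views: array of shape (N, H, W, 3). Under each lenslet the
    composite holds one column from every view (pitch = N pixels)."""
    n, h, w, c = views.shape
    lesd = np.empty((h, w * n, c), dtype=views.dtype)
    for v in range(n):
        # column L of view v becomes column L*n + v of the composite
        lesd[:, v::n, :] = views[v]
    return lesd
```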

Editing and Display of Integral Images

Editing of integral images is more complex than for 2D images, as correct spatial location is necessary to allow rendering to be carried out taking account of changing obscuration within the viewing zone. Similarly, the need for seamless integration of computer generated content with real content requires navigation software that accurately inserts objects at a definite location and in correct scale. Current display technology is approaching that required for photorealistic images to be presented (IBM T221 LCD). This is particularly important for 3D images due to the increased amount of information to be exhibited. Reconstruction of integral data to produce a true deep spatial optical model, with sufficient resolution to be convincing, relies both upon the display screen resolution and the ability of the decoding screen to control spherical aberrations. To date, screen technology integrating a decoding capability has not been fully addressed. NHK produced an integral display where the decoding lenses were hand placed and affixed to a standard LCD panel, Okano et al (16). The team at De Montfort University have relied upon acquiring decoding elements designed for other purposes or on those produced by experimental technologies. This affects both the quality of the viewed image and the depth of scene displayed in clear focus, and mismatch between the pixel pattern and the decoding element produces moiré patterns. The lack of screens with better structural integrity and optical performance is now seriously hampering the ability to fully evaluate the latent capabilities of II. The II system potentially offers media production teams new options in presenting materials, for example the spatial positioning of the information presented relative to the observer. Figure 4 diagrammatically illustrates this capability, and the demonstration on the BBC Research Village stand provides a simple static example of the depth of field and the capability of simple optics to hold, with clarity, an 'in front of screen' image.

Figure 4 - Spatial location of sculpture and view plane

The Prometheus project seeks to demonstrate the viability of an 'end to end' media content capability, and the particular role for De Montfort is to provide integral video from MPEG-4 data streams. The strategy for this is the previously mentioned compositor pinhole technique, which has first been demonstrated in off-line software. A VRML2 loader was written for the II software, and Surrey University provided the data files of avatars and texture maps.

The required optical and model parameters (lens pitch, refractive index, focal length, number of projection points, etc.) were then read in from a separate file. Figure 5 shows a LeSD of the image 'Ballerina' produced in this way.
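As an illustration of such a parameter file (the format shown is hypothetical; the paper does not specify the actual layout), a simple key/value reader might look like:

```python
# Hypothetical parameter file, e.g. params.txt:
#   lens_pitch_mm       0.6
#   refractive_index    1.56
#   focal_length_mm     2.4
#   projection_points   28

def read_parameters(path):
    """Read optical and model parameters from a key/value text file."""
    params = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith('#'):
                continue  # skip blank and comment lines
            key, value = line.split()
            params[key] = float(value)
    return params
```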

Figure 5 - LeSD generated from VRML2 format

CONCLUSIONS

II is able to provide data that reconstructs a 3D image exhibiting continuous parallax within the viewing zone. The advantages of II are the single aperture camera and the full colour, true scale image. Potential disadvantages are the limited resolution capability of displays and the quality of decoding optics. Software has been developed to enable integral images to be produced both from existing content and as a creative exercise. Computer generation of reduced-content integral images has been shown to create acceptable quality images on high-resolution displays. Mixed content is a primary feature of current program making, and consequently real-time computer generated video in autostereoscopic format is important. To date little work has been done on the computer generation of integral images and the rendering of the various formats of image data to integral images. The use of II introduces new artistic potential in program production that needs to be researched from both the artistic and human-factors viewpoints. Real-time generation has not yet been achieved; however, this is a hardware problem that is entirely solvable using current technology.

References

1. T. Motoki, H. Isono and I. Yuyama, 'Present Status of Three-Dimensional Television Research', Proceedings of the IEEE, 83, pp. 1009-1021, 1995.
2. G. Lippmann, 'Epreuves reversibles', Comptes rendus hebdomadaires des Seances de l'Academie des Sciences, 146, pp. 446-451, March 1908.
3. H. E. Ives, 'Optical Properties of a Lippmann Lenticulated Sheet', J. Opt. Soc. Amer., 21, pp. 171-176, 1931.
4. C. B. Burckhardt and E. T. Doherty, 'Beaded Plate Recording of Integral Photographs', Appl. Opt., 8 (11), pp. 2329-2331, 1969.
5. R. L. Demontebello, 'Wide Angle Integral Photography - The Integram Technique', Proc. SPIE, 120, pp. 73-91, 1970.
6. Yu. A. Dudnikov, B. K. Rozhkov and E. N. Antipova, 'Obtaining a Portrait of a Person by the Integral Photography Method', Sov. J. Opt. Tech., 47 (9), pp. 562-563, 1980.
7. N. Davies, M. McCormick and M. Brewin, 'Design and Analysis of an Image Transfer System using Microlens Arrays', Opt. Eng., 33 (11), pp. 3624-3633, November 1994.
8. R. F. Stevens, N. Davies and G. Milnthorpe, 'Lens Arrays and Optical System for Orthoscopic Three-Dimensional Imaging', The Imaging Science Journal, 49, pp. 151-164, 2001.
9. N. A. Valyus, 'Stereoscopy', Focal Press, London, UK, 1966.
10. T. Okoshi, 'Three-Dimensional Imaging Techniques', Academic Press, London, UK, 1976.
11. J. R. Moor, A. R. L. Travis, S. R. Lang and O. M. Castle, 'The Implementation of a Multi-View Autostereoscopic Display', IEE Colloq. 'Stereoscopic Television', 1992/173, pp. 4/5-4/16, 1992.
12. A. Sokolov, 'Autostereoscopy and Integral Photography by Professor Lippmann's Method', Moscow State University Press, 1911.
13. Y. Igarashi, H. Murata and M. Ueda, '3D Display System Using a Computer Generated Integral Photograph', Japan J. Appl. Phys., 17 (9), 1978.
14. M. Price and G. A. Thomas, '3D virtual production and delivery using MPEG-4', Proc. of the Int. Broadcasting Convention (IBC 2000), Amsterdam, 8-12 September 2000.
15. M. Born and E. Wolf, 'Principles of Optics: Electromagnetic Theory of Propagation, Interference and Diffraction of Light', 4th ed., Pergamon Press, Bath, 1970.
16. F. Okano, H. Hoshino, J. Arai and I. Yuyama, 'Real time pickup method for a three-dimensional image based on integral photography', Applied Optics, 36 (7), pp. 1598-1603, 1997.

Acknowledgements

The work has been made possible by a research grant awarded by the Engineering and Physical Sciences Research Council and the British Broadcasting Corporation (BBC). The authors wish to express their gratitude and thanks for the support given throughout the project by the Prometheus partners: BT Exact, Snell and Wilcox, AvatarMe, Surrey University, University College London, and Queen Mary University London.
