High speed image space parallel processing for computer-generated integral imaging system

Ki-Chul Kwon,1 Chan Park,2 Munkh-Uchral Erdenebat,1 Ji-Seong Jeong,2 Jeong-Hun Choi,1 Nam Kim,1 Jae-Hyeung Park,1 Young-Tae Lim,1 and Kwan-Hee Yoo2,*

1 College of Electrical and Computer Engineering, Chungbuk National University, Gaesin-dong, Heungduk-gu, Cheongju, Chungbuk 361-763, South Korea
2 Department of Information and Industrial Engineering, Chungbuk National University, Gaesin-dong, Heungdek-gu, Cheongju, Chungbuk 361-763, South Korea
* [email protected]

Abstract: In an integral imaging display, the computer-generated integral imaging method has been widely used to create the elemental images from given three-dimensional object data. Long processing time, however, has been problematic, especially when the three-dimensional object data set or the number of elemental lenses is large. In this paper, we propose an image space parallel processing method, implemented using the Open Computing Language (OpenCL), for rapid generation of elemental image sets from large three-dimensional volume data. Using the proposed technique, it is possible to realize a real-time interactive integral imaging display system for 3D volume data constructed from computed tomography (CT) or magnetic resonance imaging (MRI) data.

©2012 Optical Society of America

OCIS codes: (100.0100) Image processing; (100.6890) Three-dimensional image processing.

References and links

1. G. Lippmann, “La photographie integrale,” C. R. Acad. Sci. 146, 446–451 (1908).
2. J.-H. Park, G. Baasantseren, N. Kim, G. Park, J. M. Kang, and B. Lee, “View image generation in perspective and orthographic projection geometry based on integral imaging,” Opt. Express 16(12), 8800–8813 (2008).
3. M. Levoy and P. Hanrahan, “Light field rendering,” SIGGRAPH '96, Proceedings of the 23rd annual conference on Computer graphics and interactive techniques 31–36 (1996).
4. Y. Igarashi, H. Murata, and M. Ueda, “3D display system using a computer generated integral photography,” Jpn. J. Appl. Phys. 17(9), 1683–1684 (1978).
5. M. Halle, “Multiple viewpoint rendering,” SIGGRAPH '98, Proceedings of the 25th annual conference on Computer graphics and interactive techniques 243–254 (1998).
6. S.-W. Min, J. Kim, and B. Lee, “New characteristic equation of three-dimensional integral imaging system and its applications,” Jpn. J. Appl. Phys. 44(2), L71–L74 (2005).
7. S.-W. Min, K. S. Park, B. Lee, Y. Cho, and M. Hahn, “Enhanced image mapping algorithm for computer-generated integral imaging system,” Jpn. J. Appl. Phys. 45(28), L744–L747 (2006).
8. B.-N.-R. Lee, Y. Cho, K. S. Park, S.-W. Min, J.-S. Lim, M. C. Whang, and K. R. Park, “Design and implementation of a fast integral image rendering method,” International Conference on Electronic Commerce 2006, 135–140 (2006).
9. K. S. Park, S.-W. Min, and Y. Cho, “Viewpoint vector rendering for efficient elemental image generation,” IEICE – Transactions on Information and Systems, E 90-D, 233–241 (2007).
10. R. Fernando, GPU Gems: Programming Techniques, Tips and Tricks for Real-Time Graphics (Addison-Wesley, 2004).
11. Y.-H. Jang, C. Park, J.-S. Jung, J.-H. Park, N. Kim, J.-S. Ha, and K.-H. Yoo, “Integral imaging pickup method of bio-medical data using GPU and Octree,” J. Korea Contents Assoc. 10(6), 1–9 (2010).
12. NVIDIA, “OpenCL programming guide for the CUDA architecture,” Ver. 2.3 (2009).
13. NVIDIA, “CUDA C programming guide,” Ver. 3.1.1 (2010).
14. E. Angel, Interactive Computer Graphics: A Top-Down Approach with OpenGL, 2nd ed. (Addison-Wesley, 2000).

1. Introduction

Integral imaging technology is distinguished from other three-dimensional (3D) display methods in that it can display full-parallax, full-color, auto-stereoscopic 3D images and can be implemented on existing two-dimensional (2D) monitor devices. The integral

#157720 - $15.00 USD

(C) 2012 OSA

Received 7 Nov 2011; revised 4 Dec 2011; accepted 5 Dec 2011; published 3 Jan 2012

16 January 2012 / Vol. 20, No. 2 / OPTICS EXPRESS 732

imaging system can be divided into a pickup part, which captures the elemental images of 3D objects, and a display part, which integrates the elemental images into 3D images using a lens array. An elemental image is a 2D perspective view of the object focused by an elemental lens in the lens array. The elemental images differ from one another because each elemental lens has a different position relative to the 3D objects. The 3D object information stored as the set of elemental images is optically reconstructed as 3D images in the display part [1-2].

The computer-generated integral imaging (CGII) technique is a computational substitute for the optical pickup part. CGII creates the set of elemental images using computer graphics techniques with the parameters of a virtual lens array, without a real optical system. Several methods have been proposed for CGII [3–9], including point retracing rendering (PRR) [4], multiple viewpoint rendering (MVR) [5], parallel group rendering (PGR) [6], and viewpoint vector rendering (VVR) [7–9]. PRR is a simple method in which the set of elemental images is drawn point by point to retrace the display image. It is easy to implement but unsuitable for real-time processing because of its heavy computational requirement. MVR treats the process of rendering a set of perspective images as a unit and generates each elemental image with a computer graphics library such as OpenGL [10]. However, its speed depends on the number of elemental lenses and the data size of the 3D object. Although a method that structures the 3D object data with an octree before applying MVR has been proposed to enhance processing efficiency, real-time rendering of large 3D volume data has not been demonstrated [11]. PGR uses the viewing characteristics of the focused mode, in which each elemental lens appears as a single pixel to the observer.
In PGR, a set of elemental images is obtained from imaginary scenes observed in particular directions, which are called directional scenes. The number of directional scenes equals the number of display pixels in the elemental lens area, and the directions correspond to the vectors from each display pixel to the center of the corresponding elemental lens. Therefore, elemental image generation is faster and less affected by the numbers of elemental lenses and 3D object polygons. However, PGR can be used only in the focused mode, not in other display modes. VVR is similar to PGR in that it generates elemental images from directional scenes. The larger the elemental images, the more directional scenes have to be generated, so the computation of the elemental images becomes slow. This method can also suffer from distortion of the displayed 3D images.

In this paper, we propose a new method that we call the image space parallel processing technique. The proposed method enhances the speed by using a graphics processing unit (GPU) based parallel processing scheme implemented in OpenCL [12-13]. The existing method [11] also uses GPU processing to improve the rendering speed. However, the use of OpenCL and the assignment of one thread to each elemental image pixel make the rendering time of the proposed method much shorter than that of the existing method. Using the proposed method, we achieved 24.39 frames per second (fps) in creating the elemental images for large volume data of 512 × 512 × 512 when the lens array consists of 200 × 200 lenses and each elemental image has 3 × 3 pixels.

2. Image space parallel computing of CGII

2.1 Principle of conventional CGII computation

CGII generates the elemental images for given 3D volume data and virtual lens array parameters. Figure 1 shows the concept of elemental image generation. For each elemental lens in the array, the corresponding perspective image of the 3D object is calculated.
In the calculation, blurring due to defocus and diffraction due to the finite aperture size are intentionally ignored, as they would degrade the quality of the optically reconstructed 3D images in the later display stage. Hence the perspective image calculated in CGII is a simple perspective projection of the 3D object onto the pickup plane through a pinhole located at the elemental lens position. The calculated perspective image is cropped to the elemental image area of the corresponding elemental lens in order to avoid overlap between the images of neighboring elemental lenses. In the usual configuration, the elemental image area has the same lateral location and size as the corresponding elemental lens. This process is repeated for all elemental lenses to obtain the set of elemental images. Figure 1(b) shows an example of the elemental images generated by CGII.

Fig. 1. (a) Geometry of elemental image generation and (b) an example of elemental images.
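The pinhole projection and cropping described in Section 2.1 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes the lens array lies in the plane z = 0, the pickup plane lies at z = -gap, and object points lie at z > 0. The function names `project_point` and `inside_elemental_area` and the parameters `gap` and `pitch` are hypothetical.

```python
def project_point(point, lens_center, gap):
    """Project a 3D point through a pinhole at lens_center onto the pickup plane.

    Assumed geometry (not taken from the paper): lens array in z = 0,
    pickup plane at z = -gap, object point at z > 0.
    """
    x, y, z = point
    cx, cy = lens_center
    if z <= 0:
        raise ValueError("object point must lie in front of the lens array (z > 0)")
    # Similar triangles through the pinhole: the image is inverted
    # about the lens center and scaled by gap / z.
    u = cx - (x - cx) * gap / z
    v = cy - (y - cy) * gap / z
    return u, v


def inside_elemental_area(uv, lens_center, pitch):
    """Crop test: keep only projections inside this lens's own elemental image area."""
    u, v = uv
    cx, cy = lens_center
    half = pitch / 2.0
    return abs(u - cx) <= half and abs(v - cy) <= half
```

A projection that falls outside the square of side `pitch` centered on the lens is discarded, which corresponds to the cropping step that prevents overlap between neighboring elemental images.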

2.2 Proposed image space parallel processing technique

The proposed method is distinguished from previous techniques such as MVR, VVR, and PGR in that it uses a parallel processing architecture. While the previous methods calculate the pixel values in the elemental image set sequentially, the proposed method computes all the pixel values simultaneously, reducing the processing time. The image space algorithm incorporated in the proposed method further accelerates the processing. The architecture of the proposed method is shown in Fig. 2. The proposed method is composed of three parts: input of the virtual lens array parameters and the 3D volume data; calculation of the view matrix and the transformation matrix for the virtual lens array and the 3D volume data; and elemental image generation through the rendering process.

Fig. 2. Architecture of the proposed image space parallel processing method.

In the input stage, the pixel information of the display panel and the parameters of the virtual lens array, including the focal length of an elemental lens, the central depth plane, and the number of elemental lenses, are entered. The 3D volume object data is also loaded along with translation, rotation, and scale parameters supplied by the user through the mouse and keyboard. These 3D volume object parameters can be controlled interactively so that the corresponding elemental images are generated in real time. Parameters for the system configuration, such as the gap between the lens array and the pickup plane, are also entered in this stage.

In the calculation stage, the virtual lens array properties and the direction of the view point are computed from the input parameters. OpenGL graphics library functions render objects from the camera in 3D virtual space, where objects and camera can be translated, rotated, and scaled using 4 × 4 homogeneous matrices [14]. Also in this stage, the view matrix and the transformation matrix are computed from the virtual lens array properties and the view point direction, respectively. The view matrix is a 4 × 4 matrix containing the orientation and location of the elemental lens. The transformation matrix is also a 4 × 4 matrix, containing the translation, rotation, and scale of the 3D volume data. The elements of the view and transformation matrices and their physical meaning are illustrated in Fig. 3. Note that in the general case where the elemental lenses are aligned in a plane, the view matrices of the elemental lenses differ only in their fourth column, which indicates the lateral position of each elemental lens. Therefore, instead of creating a view matrix for every elemental lens separately, the proposed method prepares only one view matrix for an elemental lens, together with separate lateral position data for all elemental lenses. This reduces the amount of data transferred to the OpenCL kernel in the later rendering process. Note that the transformation matrix for the 3D volume data is common to all elemental lenses.

Fig. 3. View matrix of an elemental lens and transformation matrix of the 3D volume data, where CTn is the information of the (i, j)-th elemental lens center and nCD is the set of direction vectors of the lens array.
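The idea of sending one shared view matrix plus per-lens lateral positions, instead of one matrix per lens, can be illustrated with a short sketch. The row-major 4 × 4 layout, the placement of the lateral position in entries [0][3] and [1][3], and the names `lens_offsets`, `view_matrix_for_lens`, and `floats_transferred` are assumptions made for illustration; the paper does not specify them.

```python
def lens_offsets(n_lenses, pitch):
    """Lateral (x, y) centers of an n_lenses x n_lenses planar array centered on the origin."""
    half = (n_lenses - 1) / 2.0
    return [((i - half) * pitch, (j - half) * pitch)
            for j in range(n_lenses) for i in range(n_lenses)]


def view_matrix_for_lens(base_view, offset):
    """Rebuild one lens's 4x4 view matrix from the shared matrix and its lateral offset."""
    m = [row[:] for row in base_view]  # copy so the shared matrix stays untouched
    m[0][3] = offset[0]                # fourth column, x entry (assumed layout)
    m[1][3] = offset[1]                # fourth column, y entry (assumed layout)
    return m


def floats_transferred(n_lenses):
    """Payload sizes in floats: one matrix per lens versus one shared matrix plus offsets."""
    per_lens = 16 * n_lenses * n_lenses
    shared = 16 + 2 * n_lenses * n_lenses
    return per_lens, shared
```

Under these assumptions, a 200 × 200 lens array needs 640000 floats when a full matrix is sent per lens, but only 80016 floats when a single matrix and the lateral offsets are sent.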

In the rendering stage, the view matrix, the transformation matrix, and the lateral position data of the elemental lenses are passed to the OpenCL kernel. With those parameters, the proposed method calculates the pixel values of the elemental images in parallel to minimize the processing time. Figure 4 depicts the parallel processing scheme used in the proposed method. In the OpenCL programming model, the kernel runs on a grid made of a number of blocks (an NDRange of work-groups in OpenCL terminology). Each block in turn has a number of threads (work-items) that can run simultaneously. Multiple blocks can also run simultaneously if the GPU processing capacity allows. Therefore, in the best case where the GPU processing power is sufficient, all the threads across multiple blocks can run in parallel. In the proposed method, the calculation of each pixel value in the elemental images is assigned to a different thread. Thus, with N × N elemental lenses and M × M pixels per elemental image, NM × NM threads are generated and run simultaneously. This is the most distinguishing feature of the proposed method compared with the previous ones, in which the calculation is performed sequentially for each elemental image pixel.

Fig. 4. OpenCL parallel processing structure map for (i, j)-th elemental image corresponding to the (i, j)-th elemental lens.
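The one-thread-per-pixel assignment can be made concrete with a small index calculation. The sketch below assumes a two-dimensional global index (gx, gy) over the NM × NM elemental image set, as an OpenCL kernel would obtain from `get_global_id`; the function names and the row/column convention are hypothetical and are not taken from the paper.

```python
def decode_thread(gx, gy, pixels_per_lens):
    """Map a global 2D thread index to (lens index, pixel index inside that lens)."""
    lens = (gx // pixels_per_lens, gy // pixels_per_lens)
    pixel = (gx % pixels_per_lens, gy % pixels_per_lens)
    return lens, pixel


def total_threads(n_lenses, pixels_per_lens):
    """Number of threads for N x N lenses with M x M pixels each: (N * M) squared."""
    side = n_lenses * pixels_per_lens
    return side * side
```

For example, with M = 3, `decode_thread(7, 2, 3)` returns `((2, 0), (1, 2))`, and the 200 × 200 lens configuration with 3 × 3 pixels per lens gives `total_threads(200, 3) == 360000`, one thread for each pixel of the 600 × 600 elemental image set.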

The pixel value calculation in each thread is performed using an image based algorithm. Figure 5 shows the concept of the image based algorithm. For a given pixel location, the corresponding ray is first calculated. The intersection points between this ray and the 3D volume data are then computed using the view and transformation matrices along with the position data of the corresponding elemental lens. Finally, the maximum or average of the color and intensity values at the intersection points in the 3D volume data is assigned to the elemental image pixel handled by the thread. Hence, in the proposed image based algorithm, the mapping is performed from the elemental image pixel to the 3D volume object. In an object based algorithm, where the mapping is performed from each 3D object point to the elemental image pixel, the mapping procedure has to be performed multiple times for a single elemental image pixel, because rays from multiple object points can fall onto the same elemental image pixel area. The image based algorithm in the proposed method eliminates this possibility and assigns the proper pixel value in a single mapping process, reducing the processing time. Also note that the speed of the proposed image based rendering method is not affected by the position of the 3D volume data relative to the lens array. Hence the elemental images for a volume object in the real field and in the virtual field are rendered at the same speed.

Fig. 5. Concept of image based rendering for generating an elemental image.
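The per-pixel work of the image based algorithm, sampling the volume along one ray and keeping the maximum or the average, can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' kernel: the ray is already expressed in voxel coordinates (the view and transformation matrices are omitted), sampling is nearest-neighbor with a fixed step, and the name `cast_ray` and the `volume[z][y][x]` layout are hypothetical.

```python
def cast_ray(volume, origin, direction, step=1.0, n_steps=64, mode="max"):
    """Sample a voxel grid along one ray and reduce the samples to one pixel value.

    volume    : nested lists indexed as volume[z][y][x] (assumed layout)
    origin    : ray start in voxel coordinates (x, y, z)
    direction : unit-length direction (x, y, z)
    mode      : "max" for the maximum sample, anything else for the average
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    samples = []
    for k in range(n_steps):
        t = k * step
        # Nearest-neighbor lookup of the voxel under the current ray position.
        x = int(round(origin[0] + t * direction[0]))
        y = int(round(origin[1] + t * direction[1]))
        z = int(round(origin[2] + t * direction[2]))
        if 0 <= x < width and 0 <= y < height and 0 <= z < depth:
            samples.append(volume[z][y][x])
    if not samples:
        return 0.0  # the ray misses the volume entirely
    return max(samples) if mode == "max" else sum(samples) / len(samples)
```

Because each call depends only on its own ray, one such evaluation per elemental image pixel can run independently, which is what makes the mapping from pixel to volume suitable for one thread per pixel.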

The final elemental image set is sent to the frame buffer and can be visualized on a display panel such as a liquid crystal display (LCD). Since the proposed method can generate the elemental image set at high speed, the 3D image can be displayed in real time with user interaction. Figure 6 shows an example of the 3D volume data, a set of elemental images generated by the proposed method, and the reconstructed 3D image.

Fig. 6. Example of (a) 3D volume data, (b) elemental image set generated by the proposed method, and (c) 3D image optically reconstructed using the integral imaging display system.

3. Experimental results

The proposed image space parallel computing for CGII was implemented using MS Visual Studio 2008, the OpenGL library, and OpenCL. The PC hardware consisted of an Intel® Core2 Quad (2.66 GHz) CPU with 4 GB of RAM and an NVIDIA Quadro 4000 graphics card (256 GPU cores). The performance of the proposed method was evaluated by comparing the generation rate of elemental image sets. Five 3D volume data sets, Bucky, Mummy, Male, CTA, and Mouse, were used in the evaluation. Figure 7 shows the 3D volume data, the generated elemental images, and the displayed 3D images.

Fig. 7. 3D objects and displayed 3D images in experiments; (a) Bucky: 32 × 32 × 32, (b) Mummy: 256 × 128 × 128, (c) Male: 128 × 256 × 256, (d) CTA: 512 × 512 × 79 and (e) Mouse: 512 × 512 × 512.

Table 1 compares the processing speed of the GPU & Octree method proposed by Jang et al. [11] with that of our proposed method when the lens array consists of 20 × 20 lenses and each elemental image has 25 × 25 pixels. The proposed method is much faster than the GPU & Octree method in all cases considered.

Table 1. The measurement result of generation for CGII

Object (data size)                GPU & Octree [sec / fps]    Proposed [sec / fps]
Bucky (32 × 32 × 32 pixels)       27.033 / 0.038              0.011 / 90.91
Mummy (256 × 128 × 128 pixels)    16.053 / 0.062              0.028 / 35.71
Male (128 × 256 × 256 pixels)     13.53 / 0.074               0.031 / 32.26
CTA (512 × 512 × 79 pixels)       14.87 / 0.067               0.034 / 29.41
Mouse (512 × 512 × 512 pixels)    Out of memory               0.036 / 27.78
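The frame rates in Table 1 are the reciprocals of the per-frame times, and the ratio of the two time columns gives the speedup. The short sketch below reproduces that arithmetic from the tabulated times; the helper names `fps` and `speedup` and the `TABLE_1` dictionary are introduced here for illustration only, and the Mouse row is omitted because the GPU & Octree method ran out of memory.

```python
def fps(seconds_per_frame):
    """Frame rate as the reciprocal of the time needed for one elemental image set."""
    return 1.0 / seconds_per_frame


def speedup(baseline_seconds, proposed_seconds):
    """How many times faster the proposed method is than the baseline."""
    return baseline_seconds / proposed_seconds


# (GPU & Octree seconds, proposed seconds), copied from Table 1.
TABLE_1 = {
    "Bucky": (27.033, 0.011),
    "Mummy": (16.053, 0.028),
    "Male": (13.53, 0.031),
    "CTA": (14.87, 0.034),
}

for name, (baseline, proposed) in TABLE_1.items():
    print(f"{name}: {fps(proposed):.2f} fps, about {speedup(baseline, proposed):.0f}x faster")
```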

Except for the GPU & Octree method [11], the other CGII pickup methods, such as VVR, are CPU based rendering methods, while the proposed method is GPU based. Because of the architectural difference between the GPU and the CPU, a CPU based method fundamentally takes much longer than the proposed method. In the VVR experiment, it was reported that generating 21 × 21 elemental images from a mesh model with about 15000 vertices takes 0.125 sec [6]. However, VVR would take much longer if 512 × 512 × 512 3D volume data were used instead of the mesh model, because such volume data contains far more data than the mesh model. Therefore, we compared the performance of the proposed method only with the previous GPU & Octree method in the table.

Figure 8 shows the processing time under various conditions. As shown in Fig. 8(a), the processing time for the CTA (512 × 512 × 79) data in the previous GPU & Octree method increases quickly as the number of elemental lenses increases, while it remains at a much smaller value in the proposed method. Figure 8(b) shows the processing time of the proposed method for the larger (512 × 512 × 512) 3D volume data. Even in the worst case, where the number of elemental lenses is 100 × 100, the processing time does not exceed 0.8 sec. Figure 8(c) shows the processing time for different numbers of pixels per elemental image. Again, the worst-case processing time, where each elemental image contains 50 × 50 pixels and the number of elemental lenses is 50 × 50, is smaller than 0.8 sec. With the proposed method, it is possible to generate elemental images interactively in real time.

Fig. 8. Processing time of elemental image set generation for (a) various numbers of elemental lenses, (b) input data size, and (c) various numbers of pixels per elemental image.

Figure 9 shows a screen shot of the elemental images in the experiment. During the experiment, the worst-case rate was 24.39 fps (0.041 sec per frame) for large 512 × 512 × 512 input volume data, where the number of elemental lenses is 200 × 200 and the set of elemental images consists of 600 × 600 pixels, as shown in Fig. 9(a). In Fig. 9(b), the processing time was similar to that of Fig. 9(a) when the number of elemental lenses is 30 × 30 and each elemental lens covers 20 × 20 pixels.

Fig. 9. Generation time of the elemental images for 512 × 512 × 512 input 3D volume data with (a) a 200 × 200 lens array in which each lens has 3 × 3 pixels and (b) a 30 × 30 lens array in which each lens has 20 × 20 pixels (Media 1).

In this paper, the processing time is proportional to the total resolution of the set of elemental images. Because the elemental images have different resolutions, Fig. 8 and Fig. 9 may appear inconsistent with each other. However, in Fig. 8(c) and Fig. 9(b), the total resolution of the set of elemental images is the same as in Fig. 9(a), namely 600 × 600 pixels, and the same result as in Fig. 9(a) is obtained.

4. Conclusion

An OpenCL based GPU parallel processing method was proposed. While the previous CPU based methods require a long processing time that increases linearly with the number of elemental images, the parallel processing scheme of the proposed GPU based method reduces the processing time significantly. Using the proposed method, 24.39 fps (0.041 sec per frame) was achieved in generating elemental images for 512 × 512 × 512 3D volume data when the number of elemental lenses is 200 × 200 and each elemental image has 3 × 3 pixels. With the proposed method, it is possible to implement a real-time, user-interactive integral imaging 3D display system.

Acknowledgments

This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (Grant 2011-0025849) and by a grant of the Korean Ministry of Education, Science and Technology (Regional Core Research Program / Chungbuk BIT Research-Oriented University Consortium).
