Scalable hologram video coding for adaptive transmitting service

Young-Ho Seo,¹ Yoon-Hyuk Lee,² Ji-Sang Yoo,³ and Dong-Wook Kim²,*

¹College of Liberal Arts, Kwangwoon University, 447-1, Wolgye-1Dong, Nowon-Gu, Seoul 139-701, South Korea
²Department of Electronic Materials Engineering, Kwangwoon University, 447-1, Wolgye-1Dong, Nowon-Gu, Seoul 139-701, South Korea
³Department of Electronic Engineering, Kwangwoon University, 447-1, Wolgye-1Dong, Nowon-Gu, Seoul 139-701, South Korea

*Corresponding author: [email protected]

Received 17 July 2012; revised 16 October 2012; accepted 17 October 2012; posted 18 October 2012 (Doc. ID 172779); published 27 November 2012
This paper discusses processing techniques for an adaptive digital holographic video service in various reconstruction environments and proposes two new scalable coding schemes. The proposed schemes are constructed according to the hologram generation or acquisition scheme: hologram-based resolution-scalable coding (HRS) and light source-based signal-to-noise ratio scalable coding (LSS). HRS is applied to holograms that are already acquired or generated, while LSS is applied to the light sources before digital holograms are generated. In the LSS scheme, the light source information is losslessly coded because it is too important to lose, while the HRS scheme adopts a lossy coding method. In an experiment, we provide eight stages of an HRS scheme whose data compression ratios range from 1∶1 to 100∶1 for each layer. For LSS, scalable coding schemes with 4 layers and with 16 layers are provided. We experimentally show that the proposed techniques make it possible to serve a digital hologram video adaptively to various displays with different resolutions, computation capabilities of the receiver side, or bandwidths of the network. © 2012 Optical Society of America
OCIS codes: 090.1760, 090.1995.
1. Introduction
APPLIED OPTICS / Vol. 52, No. 1 / 1 January 2013

Holography is an ideal visualization system that can exactly reconstruct an original three-dimensional (3D) object in space, and many people regard hologram service as the final goal for 3D video processing technologies. The original image can be displayed naturally without restrictions on the viewing position within a predefined viewing range. As 3D technology becomes common to the public, people request more and more realistic images, and it cannot be denied that the solution should involve holography [1,2]. There can be various types of hologram services and various application areas for holographic technologies, such as advertisement, video conference,
broadcasting service, education, military simulation, and training. These applications have different network environments, display resolutions, computation capabilities of receivers, etc. Therefore, holographic services should be able to adapt to this variation through a scalable holographic video coding (SHVC) technique. The scalable video coding (SVC) technique for 2D video has been standardized within MPEG-4. The Moving Picture Experts Group (MPEG) of the International Organization for Standardization/International Electrotechnical Commission and the Video Coding Experts Group of the International Telecommunication Union Telecommunication Standardization Sector have been leading the standardization process for 2D scalable coding under the name MPEG-4 SVC or H.264 Scalable Extension. MPEG-4 SVC, a new scalable coding technique, began as MPEG-21 Part 13. In 2003, it was changed
to MPEG-4 SVC, or an extension of the H.264 standard, and was finally named MPEG-4 Part 10 Amd. 1 [3,4]. Because a hologram is completely different from a 2D image or video, it is not appropriate to apply the scalable coding technique for 2D video directly. Thus, it is necessary to develop a scalable coding technique as well as an effective data compression technique for holograms by considering their characteristics [5]. Since the beginning of the 1990s, various coding techniques for digital holograms have been researched. Yoshikawa considered that the resolution of the reconstructed image from a hologram is too high for the human visual system and proposed a method to restrict the resolution of a hologram to reduce the amount of information by using an interpolation method [6,7]. Also, he segmented a hologram one-dimensionally into several subholograms, performed discrete cosine transforms (DCTs) on them, and applied video compression standards such as MPEG-1 or MPEG-2 to compress the DCT-transformed subholograms [8,9]. Javidi proposed a lossless compression method [10] and combined it with lossy coding techniques [11]. Liebling proposed a Fresnelet-based transformation method to decompose digital holograms and attempted to compress them [12]. A compression scheme that quantizes a complex bitstream by a bit-packing operation for real-time networking has also been proposed [13]. Recent research has proposed various schemes that use digital signal processing techniques in hologram data compression. In [14,15], a digital hologram is segmented into several subholograms and transformed to increase the correlation among pixels. Then a 2D compression technique such as H.264 is applied to the transformed subhologram sequence. Reference [16] attempted to form an integrated hologram from a set of subholograms and to compress the difference information between the subholograms and the corresponding parts of the integrated image.
A method using motion-compensated temporal filtering (MCTF) to increase the compression ratio has been proposed [17]. Also, [18] used a 2D Mallat-tree wavelet transform and compressed the resulting subbands. This paper proposes an SVC method for digital hologram video (SHVC) that can reconstruct images in various environments at the receiver. It contains two schemes. The first is for a digital hologram acquired by digital equipment or generated by a computer-generated hologram (CGH) technique. It uses the fact that each pixel of a hologram retains information for all the light sources of an object to form a hologram-based resolution-scalable coding (HRS) scheme. The other scheme is for the case when the hologram is generated at the receiver by a CGH technique. In this case, only the information on the light sources of the object needed to generate digital holograms is sent adaptively to the receiver. A light source-based signal-to-noise ratio (SNR) scalable coding (LSS) scheme is proposed for this case, which uses the fact that a part of the light sources can generate a digital hologram and reconstruct the object.
This paper is composed as follows. After discussing the system structure for holographic video service in Section 2, our SHVC schemes are proposed in Section 3. The proposed schemes are examined through experimentation, and the results are shown in Section 4. Finally, this paper is concluded in Section 5 based on the experimental results.

2. Holographic Video Service

A. Holographic Service System
Recently, as research on digital holographic technologies has become more active, CGH-based techniques have gathered more and more attention for use in hologram services for various applications. Four types can be considered, as in Fig. 1. Figure 1(a) shows the classical digital holography technique. It captures a digital hologram with digital equipment such as a charge-coupled device camera, and the result is transmitted or stored after signal processing or compression. The original object is reconstructed with an optical system including a spatial light modulator (SLM). Figure 1(b) uses a data acquisition system consisting of a depth camera and an RGB camera. This system acquires the depth information with the depth camera and the chrominance and/or luminance information with the RGB camera. The information acquired for the light sources is usually preprocessed to increase the quality, and the results are used to generate digital holograms with a CGH technique. Figure 1(c) is similar to Fig. 1(b) in that it uses depth information, but it differs in how the depth information is acquired; that is, it performs a stereo matching process to obtain disparities, which are converted to depths. Figure 1(d) shows the use of 3D graphic information to obtain digital holograms. The system structures in Figs. 1(b), 1(c), or 1(d) can be changed depending on which side the CGH process is performed at (transmitter side or receiver side). In general, the receiver has lower computation capability than the transmitter. In this case, digital holograms are generated and compressed at the transmitter. However, when the receiver has enough computational power, the CGH processes can be performed at the receiver. In this case, only the data to be used in CGH generation, such as the light source information, needs to be compressed and sent. Compressing a digital hologram is much more complicated than compressing the light source data. In the past, we had used the system in Fig. 1(c) to implement a holographic video service system for real objects. Since it was too hard to obtain not only good calibration and rectification results but also good depth information from a stereo matching process, we have recently changed to the system in Fig. 1(b) to implement the real-time holographic video service system [19]. Commercial depth cameras can acquire high-quality depth information in some restricted environments, but they still have problems regarding distance, interference, and scattering of incident waves. However, those problems may be solved soon, considering that various depth cameras are currently being competitively developed and commercialized.

Fig. 1. Structures of a digital holographic video service system using: (a) an optical system, (b) a depth+RGB camera system, (c) an RGB camera system, and (d) 3D graphic modeling.
B. Scalable Video Coding

An SVC is adaptable to various forms of communication and provides an adaptive coding capability that maximizes coding efficiency with respect to space, time, and image quality. SVC decoders can selectively decode a part of a received bitstream according to their computational capability, or receive bitstreams selectively according to the network conditions. A bitstream can provide various spatial, temporal, and image-quality resolutions [3]. MPEG-4 SVC can provide spatial, temporal, and fine granularity scalability (FGS). For spatial SVC, the transmitter encodes and transmits frames in various resolutions from the base layer to the highest layer through subsequent higher enhancement layers. Then the receiver can selectively decode a part of the bitstream. Temporal SVC is performed in units of groups of pictures (GOPs) by deleting the B pictures (hierarchical B picture scheme) or by using MCTF. It provides various frame rates by assigning a different temporal layer to each frame. FGS is an advanced SNR SVC. It is used to adjust the detailed bit rate versus image quality by adjusting the distortion in each bit plane of the quantized DCT coefficients with a quantization parameter [3,4].

C. Some Characteristics of a Digital Hologram
A digital hologram looks much different from an ordinary 2D image at a glance. Therefore, it is necessary to understand the characteristics of a digital hologram for signal processing. Scalable coding schemes need to use the characteristics of pixels, localized regions, and frequency-domain data of a digital hologram. These have been analyzed already in previous research [14,15] and are summarized here. In scalable coding for 2D images, images are usually subsampled by the units of sample or pixel, and the result is encoded. When the subsampled and encoded result is reconstructed with the same resolution as the original image, it is usually somewhat blurred with some high-frequency components removed or attenuated. However, for a digital hologram, this kind of subsampling may lose much of the original image or the original object. We can verify that adjacent pixels in a digital hologram are so closely related in reconstruction that they help each other to form the right wavefront of an area or the whole object image. This
assures that the scheme for 2D video is not proper for digital hologram video. If a co-centered part is cropped out from a digital hologram and reconstructed, we can obtain the same original object, though the resolution is decreased. A part of a hologram retains all the holographic information at that viewpoint, such that reconstructing it results in an image at that viewpoint. This property of a hologram is totally different from that of a 2D image, and it opens the possibility of forming a spatially scalable coding scheme for the different viewpoints. That is, while a 2D image is subsampled by scaling down for spatial scalability, a hologram can be regionally segmented or divided for spatial scalability. A hologram looks like noise when viewed as a 2D image, and its frequency characteristics are very different from those of a 2D image. When coefficients in the frequency domain of a digital hologram are generated by the DCT or the discrete wavelet transform (DWT), the result shows a much different energy distribution from that of a 2D image. The lowest-frequency subband or DC coefficient has the highest energy, as in a 2D image, but the tendency of the energy toward the higher-frequency subbands is very different. This means that it is not appropriate to use a transform tool for 2D images unchanged as an analysis tool for transforming a digital hologram to the frequency domain, and we need some additional processing steps.

D. CGH
The intensity I from the interference of two light waves, the object wave O and the reference wave R, is defined by Eq. (1):

I = |O + R|^2 = |O|^2 + |R|^2 + 2|O||R|\cos(\phi_o - \phi_r). (1)

In Eq. (1), \phi_o and \phi_r are the phases of the object wave and the reference wave, respectively. The term |O|^2 is the intensity of the object wave, and |R|^2 is the intensity of the reference wave. The term 2|O||R|\cos(\phi_o - \phi_r) is the interference result. Because the intensity of a hologram pixel is the sum of the effects from all the light sources, that is, all the object pixels, the actual value of a hologram pixel I_\alpha is given by Eq. (2), which is also the definition of the CGH:

I_\alpha \propto \sum_{j=1}^{N} A_j \cos\Big(k\sqrt{(p_\alpha x_\alpha - p_j x_j)^2 + (p_\alpha y_\alpha - p_j y_j)^2 + z_j^2}\Big), (2)

where (x_\alpha, y_\alpha) and (x_j, y_j, z_j) are the positions of the hologram pixel and the light source, respectively; p_\alpha and p_j are the pixel pitches of the hologram plane and the object plane, respectively; A_j corresponds to 2|O||R| in the third term on the right side of Eq. (1) and depends on the intensity of the light from the object pixel (the intensity of the reference light is assumed to be constant, and the same light is assumed to be used for the object wave and the reference wave); k is the wave number; and N is the number of object pixels. If the N light sources are separated into two parts (1 ∼ M and M+1 ∼ N), the CGH equation is represented as Eq. (3), in which R_{CGH} is defined by Eq. (4):

I_\alpha \propto \sum_{j=1}^{N} A_j \cos R_{CGH} = \sum_{j=1}^{M} A_j \cos R_{CGH} + \sum_{j=M+1}^{N} A_j \cos R_{CGH}, (3)

R_{CGH} = k\sqrt{(p_\alpha x_\alpha - p_j x_j)^2 + (p_\alpha y_\alpha - p_j y_j)^2 + z_j^2}. (4)

3. Scalable Coding Schemes for Digital Hologram Video

In this section, we propose two types of SVC schemes for digital hologram video service. The first is for the case when a digital hologram video is already prepared. In this case, hologram-based resolution or spatial SVC (HRS) can be applied. The second is for the case when the digital hologram video is generated at the receiver. In this case, scalability is applied to the light source information, which is the light source-based SNR SVC (LSS). The main criterion for classifying the two is where the digital hologram video is generated.

A. Hologram-Based Resolution SVC
First, we propose an HRS scheme that uses the localized characteristics of a digital hologram and is applied when a digital hologram video is generated or captured at the transmitter. It is depicted in Fig. 2.

1. Encoding

As in any SVC, SHVC also depends mainly on the encoding process. The first step in HRS encoding is dividing each digital hologram frame for SVC. For this step, we crop out a co-centered part of a digital hologram (Fig. 2 shows only one cropping process). The cropped hologram is regarded as the base layer, and the rest forms the enhancement layers. Then, an object with lower resolution can be reconstructed if only the base layer is included, and the higher-resolution components of the original object can be obtained if the enhancement layers are reconstructed in addition to the base layer. The reason why the cropping should be co-centered is that the reconstructed image from the cropped hologram should have the same viewpoint as the original; if the center positions are not the same, the viewpoints would be different. For actual SVC, several co-centered cropping processes are performed. An example is shown in Fig. 3, which is the case of eight layers including the base
Fig. 2. HRS scheme: (a) encoding and (b) decoding.
layer. The resolution of the original digital hologram is 1024 × 1024 [pixel²], which is divided into 16 × 16 blocks, each with a resolution of 64 × 64 [pixel²]. The base layer (Level 0) takes the four (2 × 2) blocks at the center. Each higher level takes one more block both horizontally and vertically, which results in 4 × 4 blocks in Level 1, 6 × 6 blocks in Level 2, and so on. For SVC, a layer consists of the blocks satisfying Eq. (5):

L_{k+1} = B_{level\,k+1} - B_{level\,k}, (5)

where L_k and B_{level\,k} are layer k and the set of blocks included in Level k, respectively. The number of pixels in a block affects the efficiency of the data compression process, which will be explained later. The block size of 64 × 64 [pixel²] was chosen empirically. The next step of the SVC encoding is segmentation and transformation. Because each layer has already been segmented into blocks, we use each block as a segment. Thus, each segment or block is transformed with a 2D DCT or 2D DWT to convert it into the corresponding frequency-domain data. Then, the resulting
blocks are grouped according to the structure of the SVC layers such that the blocks in each layer are grouped into one GOP. The blocks in each GOP are rearranged in a spiral-scanning manner to form a sequence by regarding each block as a 2D frame. The grouping and sequencing scheme is shown in Fig. 4. The final step in the HRS encoding is to apply each GOP from the sequencing to a 2D video compression tool such as H.264. The data compression itself is beyond the scope of this paper; refer to [14–17] for the omitted details.

2. Decoding

The decoding process of HRS is performed in exactly the reverse order of the encoding, as shown in Fig. 2(b). For each layer, the corresponding bitstream is taken and decompressed, which results in the corresponding sequence of blocks. These blocks are inverse-transformed to reconvert them to hologram data, and the results correspond to the hologram blocks from Eq. (1).
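The co-centered layer structure of Fig. 3 and the layer definition of Eq. (5) can be sketched as follows. This is our own illustrative Python, not the paper's implementation: the 64 × 64 block size and eight layers follow the example in the text, while the simple sorted block ordering stands in for the spiral scan.

```python
import numpy as np

B = 64       # block size in pixels (64 x 64, as in the text)
GRID = 16    # 16 x 16 blocks for a 1024 x 1024 hologram frame

def level_blocks(k):
    """Block coordinates of Level k: the co-centered (2k+2) x (2k+2) square."""
    half = GRID // 2
    lo, hi = half - (k + 1), half + (k + 1)
    return {(r, c) for r in range(lo, hi) for c in range(lo, hi)}

def layer_blocks(k):
    """Eq. (5): layer k is the blocks of Level k minus those of Level k-1."""
    return level_blocks(k) if k == 0 else level_blocks(k) - level_blocks(k - 1)

# Group each layer's blocks into one GOP (sorted order stands in for the
# spiral scan used in the paper).
holo = np.random.rand(GRID * B, GRID * B)
gops = [[holo[r * B:(r + 1) * B, c * B:(c + 1) * B]
         for (r, c) in sorted(layer_blocks(k))]
        for k in range(8)]

assert len(layer_blocks(0)) == 4                          # 2 x 2 center blocks
assert len(layer_blocks(1)) == 12                         # 4 x 4 minus 2 x 2
assert sum(len(g) for g in gops) == GRID * GRID           # all 256 blocks used
```

Each inner list would then be fed, block by block, to the 2D video codec as one GOP.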
To form an enhanced digital hologram of level k, the hologram blocks for layer k are added to all the hologram blocks for the layers lower than k by spiral extension, in the same manner as in the encoding process.

Fig. 3. Cropping method for the various layers.

B. Light Source-Based SNR SVC

The other scheme is the LSS, which provides an adaptive service with scalability in the amount of light sources for CGH generation at the receiver; it is described in Fig. 5. We assumed for this scheme that a digital hologram is generated with a CGH method as in Fig. 1(b) or 1(c), and not directly acquired by capturing an interference pattern. Of course, two cases are possible according to where the digital holograms are generated, at the transmitter or at the receiver, and for each case the scheme should be changed appropriately. However, we only deal with the case when the CGH is generated at the receiver. Therefore, the actual data to be sent to the receiver are the light source information necessary for CGH generation, assuming that the receiver has enough computational power to do so [20].
1. LSS Encoding

As shown in Fig. 5(a), LSS encoding is relatively simple compared to HRS. First, it simply divides the light source information, the depth information and the RGB information, evenly. An example for the case of four layers (n = 4) is shown in Fig. 6. In this example, only one of the depth and RGB information is shown, but the other is processed in the same way. In both Figs. 6(a) and 6(b), the encoding process evenly divides the original information into four pieces of subinformation by subsampling one pixel out of every two in both the horizontal and the vertical direction. The resulting amount of information in each of them is 1∕4 (1∕n, in general) of the original. Each piece of subinformation acts as a layer and is compressively encoded to be transmitted to the receiver side. Note that each piece of light source information is very important in CGH generation, so the layers are encoded with a lossless compression tool. The CGH process requires a depth image and an RGB or luminance image, which together provide the 3D positional information as well as the color or luminance information: the RGB or luminance image provides the 2D coordinates along with the color or luminance, and the depth image provides the remaining coordinate, the distance of the corresponding pixel from the observing point. Therefore, it is possible to divide the light source information into as many parts as we want. That is, when separating the light sources, one takes a pixel from the RGB or luminance image and the corresponding depth pixel from the depth image. If one takes one pixel out of every two pixels horizontally and vertically (subsampling by 2), the resulting image size is 1∕4 of the original. Of course, the subsampling should be done identically for both images.

Fig. 4. Pre-process for compression.

Fig. 5. LSS coding scheme: (a) encoding, (b) decoding 1, and (c) decoding 2.

2. LSS Decoding

In contrast to HRS, LSS decoding has two options, as shown in Figs. 5(b) and 5(c). The first option (decoding 1) in Fig. 5(b) can be regarded as ordinary SVC decoding: to generate an enhanced hologram of level k, all the layers of the light source information corresponding to layer k and the lower layers are decoded and placed at their corresponding positions.
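The even 2 × 2 division into four layers and the placement of decoded layers back at their original positions can be sketched as follows. This is a minimal illustration of the data movement only (function names are ours); the real codec applies lossless compression to each layer and processes the depth and RGB images identically.

```python
import numpy as np

def split_layers(img):
    """Polyphase 2x2 split: four layers, each 1/4 of the original size."""
    return [img[dy::2, dx::2] for dy in (0, 1) for dx in (0, 1)]

def place_layers(layers, shape, n_used):
    """Decoding 1: put the first n_used layers back at their original
    sampling positions; still-missing positions are filled with 0."""
    out = np.zeros(shape, dtype=layers[0].dtype)
    offsets = [(0, 0), (0, 1), (1, 0), (1, 1)]
    for (dy, dx), lay in list(zip(offsets, layers))[:n_used]:
        out[dy::2, dx::2] = lay
    return out

depth = np.arange(16, dtype=float).reshape(4, 4)   # toy 4x4 depth image
layers = split_layers(depth)

base_only = place_layers(layers, depth.shape, n_used=1)  # base layer only
full = place_layers(layers, depth.shape, n_used=4)       # all four layers
assert np.array_equal(full, depth)                       # lossless round trip
```

With all four layers the original light source information is recovered exactly, which is why lossless coding of each layer suffices.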
"Zero up-sampling" means that the pixel values in each layer are placed in exactly the same positions as the ones before subsampling in the encoding process, and the empty positions are filled with 0. This is presented in more detail in the example of Fig. 6(a). The second option is to fill the empty light source pixels by linear, bilinear, bilateral, or bicubic interpolation, as in Fig. 5(c). The interpolation is performed after adding all the lower-layer information. An example is depicted in Fig. 6(b). For example, to form the data for enhanced hologram 1, the layer 1 data are added to the base layer data (positioned just to the right of the base layer pixels), which results in all the data in the first and third rows being exactly the received data. Then the second and fourth rows are interpolated to form the final data. The second decoding option (decoding 2) may show better image quality than the first (decoding 1) because it retains more light sources, even though they are not exactly the same as the original ones. However, this option needs additional cost for calculating the interpolation and the CGHs. The holographic image is affected by the interpolated light sources, so the efficiency of the scheme depends highly on the exactness of the interpolation.

Fig. 6. Example of LSS codec scheme: (a) decoding 1 and (b) decoding 2.

4. Experimental Results

This section is devoted to implementing the proposed SHVC schemes and experimenting with several digital test holograms or videos to show the performance of the proposed schemes.

A. Experimental Environments and Implementation

The proposed SHVC schemes were implemented in C++. CGH generation, encoding, decoding, and reconstruction by Fresnel diffraction were additionally implemented on a general-purpose graphics processing unit (GPGPU) based on CUDA [21]. This was done because a CPU takes too much processing time for CGH generation and encoding, while the GPGPU can perform them in almost real time. Because our optical equipment is not good enough to show the properties of the proposed schemes, we prepared software reconstruction as well. Thus, two sets of parameters were used, one for software reconstruction and the other for optical reconstruction. The two parameter sets are shown in Table 1.

Table 1. Two Parameter Sets for the Experiments

Parameter        S/W Reconstruction     Optical Reconstruction
Distance         100 cm                 100 cm
Hologram size    1024 × 1024            1280 × 1024
Pixel pitch (p)  10.4 μm                13.6 μm
Wavelength (λ)   633 nm (red laser)     532 nm (green laser)

Figure 7 shows the four test data (Rabbit, Hyunjin, Ballet, and Brahms) and the resulting objects from S/W reconstruction. Rabbit and Brahms are still images, not videos, but they are included for more objective data. Also, because there is no commonly used test data for digital holographic video, we chose one sequence, Ballet, from the MPEG 2D multiview video test set, which retains 10 viewpoints with RGB and depth information for each viewpoint. The Hyunjin data is a self-made video that includes both depth and luminance information. Note that only the digital hologram data are provided for Brahms, so we had no choice except to include only the digital hologram in Fig. 7(g). Figure 7(e) is the result from stereo matching, but we captured Fig. 7(c) with a depth camera.

Fig. 7. Images/videos used in the experiments: Rabbit's (a) depth and (b) reconstructed image; Hyunjin's (c) depth and (d) reconstructed image; Ballet's (e) depth and (f) reconstructed image; Brahms's (g) digital hologram and (h) reconstructed image.

Fig. 8. (Color online) PSNR values of reconstructed images after HRS scheme: (a) Rabbit, (b) Hyunjin, (c) Ballet, and (d) Brahms.

B. Results for HRS Scheme

As in Section 3, layer 0 indicates the base layer, and layer 7 corresponds to the highest layer, which includes the same amount of information as the original. Note that we have imposed eight layers for the HRS scheme. Thus, each layer corresponds to the local region shown in Fig. 3.
Figure 8 shows the numerical results, whose horizontal axis is the compression ratio and whose vertical axis is the peak signal-to-noise ratio (PSNR). Although it is not clear that the PSNR value can be an absolute estimate of the quality of the reconstructed object image, as it is for a 2D video, it can be a good criterion for observing tendencies. PSNR is defined as Eq. (6):

PSNR = 10 \log_{10}\left(\frac{255^2}{\frac{1}{mn}\sum_{i=0}^{m-1}\sum_{j=0}^{n-1}[I(i,j) - I'(i,j)]^2}\right). (6)

In Eq. (6), m and n are the horizontal and vertical sizes of the image, respectively, and I(i,j) and I'(i,j) are the pixel values of the original and the modified image, respectively. PSNR cannot be an absolute quality measure for a holographic image; a difference in PSNR is not exactly reflected in the image quality. However, it is useful for observing the tendency of the quality of the reconstructed object: a higher PSNR means better quality.
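Eq. (6) can be implemented directly; the following small sketch uses toy values of our own (not from the experiments) to exercise it.

```python
import numpy as np

def psnr(orig, modified):
    """Eq. (6): PSNR of an m x n image with an 8-bit peak value of 255."""
    mse = np.mean((orig.astype(float) - modified.astype(float)) ** 2)
    if mse == 0:
        return float("inf")       # identical images
    return 10.0 * np.log10(255.0 ** 2 / mse)

a = np.full((64, 64), 100.0)      # toy "original"
b = a + 10.0                      # uniform error of 10 gray levels -> MSE = 100
print(round(psnr(a, b), 2))       # 28.13 (dB)
```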
Fig. 9. Results from HRS scheme for Hyunjin.
Fig. 10. Results from reconstructing and increasing the resolution to the highest layer for Rabbit and 10∶1 compression ratio: (a) Layer 0, (b) Layer 1, (c) Layer 2, (d) Layer 3, (e) Layer 4, (f) Layer 5, (g) Layer 6, and (h) Layer 7 (highest resolution).
The depth information of Rabbit in Fig. 8(a) and Hyunjin in Fig. 8(b) are relatively exact. Thus, they show a similar tendency of change in PSNR value to the change in compression ratio and to the change of layers; they smoothly decrease in PSNR with increasing compression ratio and decreasing
enhancement level, as expected. However, because the depth information of Ballet in Fig. 8(c) was extracted by stereo matching, it could not express the exact 3D shape of the object. This inherently erroneous information is reflected in the HRS scheme such that layer 4 showed somewhat different
Fig. 11. Examples of optically acquired hologram Brahms for layer 7 with the compression ratio of (a) 1∶1, (b) 20∶1, (c) 40∶1, (d) 60∶1, (e) 80∶1, and (f) 100∶1.
Fig. 12. LSS coding process and the results for Rabbit.
characteristics from the others. Also, it more or less breaks the tendency of decreasing PSNR in response to an increased compression ratio. Brahms retains much noise, as shown in Figs. 7(g) and 7(h), which resulted in a different tendency of PSNR change with the compression ratio. Thus, it seems necessary for the optically obtained hologram to undergo a preprocessing technique to remove the noise before signal processing. The values in Fig. 8 are the average values over all frames in the sequence. Example images resulting from reconstruction after scalable encoding and decoding by the HRS
scheme are shown in Fig. 9 for both the compression ratio and the scalability layer. Of course, the results for a specific layer include all the lower-layer information. As the reconstructed images in this figure were calculated using the Fresnel transform, the lower-layer images became smaller. Here, we show the results only for the compression ratios from 1∶1 to 100∶1 in steps of 20 for all the layers (layers 0–7). As one can see from the figures, the image resolution becomes higher as the layer increases. Also, the image quality becomes worse as the compression ratio increases. The PSNR
Fig. 13. Results from combining CGHs to form an enhanced hologram for Ballet.
Fig. 14. Results from more detailed experiments for the relationship between the amount of information and the image quality with Hyunjin.
value of layer 7 at the compression ratio of 100∶1 is about 17.1 dB, while that of the same layer at the compression ratio of 1∶1 is about 37.8 dB. One can see that the visual image degradation is not as large as the PSNR difference suggests, which seems to be a characteristic of the compression, not of the HRS. Figure 10 shows the Rabbit example images resulting from reconstructing and increasing the resolutions to that of the highest level, the same resolution as the original. In the case that the receiver does not have much computational power but has good display resolution, it can receive lower-layered data and increase the resolution to fit its display. As one can see from the figures, the image quality improves as the layer level increases. From Figs. 9 and 10, we can say that our HRS scheme can provide 90 resolution-scalable digital holographic video services according to the network capacity, the computational capability, and the display resolution of the receiver. Figure 11 shows reconstructed example images resulting from various compression ratios with the same layer information (layer 7). Because of contamination by noise from the acquisition process, the image without compression already shows much degradation in itself. Thus, it is not easy to see the image quality
degradation as the compression ratio increases, even if some is recognizable.

C. Results for LSS Scheme
The results from the experiments for the LSS scheme are described in Fig. 12. In encoding, the original light source information is divided into four pieces of subinformation, each of which is encoded separately. In CGH generation at the receiver, the subinformation corresponding to a layer and all the subinformation for the lower layers are added by placing them at their corresponding positions, and the results are used to generate a CGH. Finally, the CGH is reconstructed. The display resolutions are assumed to be the same because, in contrast to HRS, the LSS scheme adjusts the amount of light source information, not the resolution. A smaller amount of information does not always degrade the image in proportion to the information insufficiency. However, from the figures, we can recognize that the higher the amount of information, the better the image quality. The example results in Fig. 13 used the same encoding process as that of Fig. 12 but differ in the CGH generation and in forming the higher layers. For Fig. 13, after decoding each layer of light source subinformation, the CGH corresponding to each layer is generated separately. Then, to form an enhanced
A265
Fig. 15. Reconstructed results from with and without interpolation: (a) original; after 1∕4 down-sampling, (b) without interpolation “zero up-sampling,” (c) duplicating the left pixel, and (d) bi-cubic interpolation.
hologram corresponding to a layer, the CGHs corresponding to the layer and all the lower layers are added. Finally, the resulting hologram is reconstructed. It is possible because the CGH generation process is linear. Thus, the same result
Fig. 16. (Color online) Results from optical reconstruction for HRS scheme (Rabbit). A266
APPLIED OPTICS / Vol. 52, No. 1 / 1 January 2013
is obtained even if the CGH generation process is performed before the information combining process. To show how the image quality changes as the amount of light source information increases in detail, we have divided the information into 16 pieces of subinformation and compared the layer images, which are shown in Fig. 14. The information increases linearly from the leftmost top to the rightmost bottom, but the image quality does not seem to follow this linearity exactly. Figure 15 shows some example results from including the interpolation process for each piece of layered information after decoding for three test images, Rabbit, Hyunjin, and Ballet. Figure 15(a) shows the reconstructed images for the originals, and Figs. 15(b)–15(d) are the ones for the 1∕4 down-sampled information. Figure 15(b) shows the results without interpolation (zero up-sampling), Fig. 15(c) shows the ones from interpolation by duplicating the left pixel values to the empty ones on the right, and Fig. 15(d) shows the results from bicubic interpolation. From Figs. 15(c) and 15(d) we can see that the images look more filledup, but the quality is not clearly improved. This means that the interpolation methods for 2D image do not always improve the image quality in digital hologram reconstruction, which needs more research.
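Two of the up-sampling options compared in Fig. 15 can be sketched on a small toy array (invented data, not the paper's decoded hologram information): zero up-sampling and duplicating the neighboring kept pixel into the empty positions. Bicubic interpolation, option (d), would instead fit a cubic spline through the kept samples (e.g., via an image-processing library) and is omitted here to keep the sketch dependency-free.

```python
# Toy comparison of up-sampling options for 1/4 down-sampled information:
# (b) zero up-sampling and (c) neighbor duplication. Data are invented.
import numpy as np

original = np.arange(64, dtype=float).reshape(8, 8)
down = original[::2, ::2]        # keep every other row/column: 1/4 of the samples

# (b) zero up-sampling: restore the kept samples, leave empty positions at zero
zero_up = np.zeros_like(original)
zero_up[::2, ::2] = down

# (c) duplicate the left/upper kept neighbor into the empty positions
dup_up = np.repeat(np.repeat(down, 2, axis=0), 2, axis=1)

mse_zero = np.mean((original - zero_up) ** 2)
mse_dup = np.mean((original - dup_up) ** 2)
print(mse_zero > mse_dup)        # duplication fills the gaps, so its 2D MSE is lower
```

As the paper observes, a lower 2D MSE does not necessarily translate into better reconstructed hologram quality; this sketch only illustrates the filling operations themselves.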
Fig. 17. (Color online) Results from optical reconstruction for LSS scheme (Hyunjin).
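The linearity of CGH generation that the LSS scheme relies on (summing per-layer CGHs equals the CGH of the combined light sources) can be checked with a minimal numerical sketch. The point-source superposition model, grid size, wavelength, and pixel pitch below are illustrative assumptions, not the paper's CGH parameters.

```python
# Sketch of the LSS linearity property: a CGH computed from all light-source
# points equals the sum of CGHs computed from disjoint subsets (layers).
import numpy as np

def cgh(points, nx=64, ny=64, wavelength=633e-9, pitch=10e-6):
    """Toy CGH: superpose a spherical-wave contribution from each point source.
    points: rows of (x, y, z, amplitude); returns a complex fringe field."""
    xs = (np.arange(nx) - nx / 2) * pitch
    ys = (np.arange(ny) - ny / 2) * pitch
    X, Y = np.meshgrid(xs, ys)
    k = 2 * np.pi / wavelength
    field = np.zeros((ny, nx), dtype=complex)
    for x, y, z, a in points:
        r = np.sqrt((X - x) ** 2 + (Y - y) ** 2 + z ** 2)
        field += a * np.exp(1j * k * r) / r
    return field

rng = np.random.default_rng(0)
sources = np.column_stack([
    rng.uniform(-1e-4, 1e-4, (100, 2)),   # x, y positions
    rng.uniform(0.1, 0.2, (100, 1)),      # z distances
    rng.uniform(0.5, 1.0, (100, 1)),      # amplitudes
])

# Split the light-source information into four LSS layers by subsampling.
layers = [sources[i::4] for i in range(4)]

# Generating one CGH from all sources ...
full = cgh(sources)
# ... matches summing the CGHs generated per layer: the order is free.
summed = sum(cgh(layer) for layer in layers)
assert np.allclose(full, summed)
```

This is why the receiver may either combine the decoded layers first and generate one CGH, or generate per-layer CGHs and add them, as in Figs. 12 and 13.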
Fig. 18. (Color online) Optical equipment.

D. Results from Optical Reconstruction

Figures 16 and 17 show the images captured from optical reconstruction. Figure 16 shows the HRS results for the Rabbit object; all layers are used at 1∶1, 20∶1, and 100∶1 compression. Displaying the HRS results requires various SLMs with different resolutions. Since we have only the SLM shown in Table 1, we could not carry out optical reconstruction experiments for all layers. Figure 17 shows the LSS results for Hyunjin. The light source for Hyunjin is separated into 16 layers, and the layers are combined successively to form the higher layers, following the same process as Fig. 14. Since our optical equipment is not as good as that specified in Table 1, the optical results are somewhat blurred, but they could be enhanced with higher-performance equipment. The implementation was verified in the prototype optical system shown in Fig. 18.

5. Conclusion

In this paper, we have proposed two SVC schemes for digital hologram service. The first is HRS, which is applied to a digital hologram video that has already been acquired or generated. The other is the LSS scheme, which is used when digital holograms are generated at the receiver. HRS forms an SHVC layer from the residual subhologram that remains after cropping out a co-centered part of the original hologram for the next lower layer. A layer of LSS consists of one of the P × Q subsampled pieces of light source information (in this case the total number of layers is P × Q). Especially for the LSS scheme, we proposed an additional decoding scheme that fills the absent source information with an interpolation method. In the experiments, we showed an eight-layered HRS scheme with compression ratios ranging from 1∶1 to 100∶1, increasing by 10 for each layer, which could provide 90 scalable holographic video services. The experimental results showed that the resolution becomes higher as the layer level increases and that the object quality decreases as the compression ratio increases, as expected. Also, we showed that the object quality increases with the number of layers if the resolutions are upscaled to that of the highest layer. For LSS, we experimented with the four-layer and 16-layer cases. The results showed that the object quality increases as the number of layers increases, as expected. However, for the cases including interpolation, the
resulting object quality did not always seem better than that without interpolation. Consequently, we are confident that the proposed SHVC schemes can provide scalable digital holographic video for various environments. Thus, we expect that the proposed schemes can serve as a good basis for further research on SHVC, as well as for real applications such as holographic TV. Finding an optimal interpolation method that increases the object quality will be one of our future research topics. A more efficient compressive coding scheme for digital holograms and video is another topic for future work.

This work was supported by the IT R&D program of KEIT (KI002058, Signal Processing Elements and their SoC Developments to Realize the Integrated Service System for Interactive Digital Holograms).
References
1. B. Javidi and F. Okano, eds., Three Dimensional Television, Video, and Display Technologies (Springer, 2002).
2. P. Hariharan, Basics of Holography (Cambridge University, 2002).
3. F. Wu, S. Li, and Y.-Q. Zhang, "A framework for efficient progressive fine granularity scalable video coding," IEEE Trans. Circuits Syst. Video Technol. 11, 332–344 (2001).
4. J. Reichel, H. Schwarz, and M. Wien, "Scalable Video Coding—Working Draft 1," Doc. JVT-N020 (2005).
5. Y.-H. Seo, H.-J. Choi, and D.-W. Kim, "3D scanning-based compression technique for digital hologram video," Signal Process. Image Commun. 22, 144–156 (2007).
6. H. Yoshikawa and K. Sasaki, "Information reduction by limited resolution for electro-holographic display," Proc. SPIE 1914, 206–211 (1993).
7. H. Yoshikawa and K. Sasaki, "Image scaling for electro-holographic display," Proc. SPIE 2176, 12–22 (1994).
8. H. Yoshikawa, "Digital holographic signal processing," in Proceedings of TAO First International Symposium on Three Dimensional Image Communication Technologies (TAO, 1993), paper S-4-2.
9. H. Yoshikawa and J. Tamai, "Holographic image compression by motion picture coding," Proc. SPIE 2652, 2–9 (1996).
10. T. J. Naughton and B. Javidi, "Compression of encrypted three-dimensional objects using digital holography," Opt. Eng. 43, 2233–2238 (2004).
11. T. J. Naughton, Y. Frauel, E. Tajahuerce, and B. Javidi, "Compression of digital holograms for three-dimensional object reconstruction and recognition," Appl. Opt. 41, 4124–4132 (2002).
12. M. Liebling, T. Blu, and M. Unser, "Fresnelets: new multiresolution wavelet bases for digital holography," IEEE Trans. Image Process. 12, 29–43 (2003).
13. O. Matoba, T. J. Naughton, Y. Frauel, N. Bertaux, and B. Javidi, "Real-time three-dimensional object reconstruction by use of a phase-encoded digital hologram," Appl. Opt. 41, 6187–6192 (2002).
14. Y.-H. Seo, H.-J. Choi, and D.-W. Kim, "Lossy coding technique for digital holographic signal," Opt. Eng. 45, 065802 (2006).
15. Y.-H. Seo, H.-J. Choi, and D.-W. Kim, "3D scanning-based compression technique for digital hologram video," Signal Process. Image Commun. 22, 144–156 (2007).
16. Y.-H. Seo, H.-J. Choi, J.-W. Bae, H.-C. Kang, S.-H. Lee, J.-S. Yoo, and D.-W. Kim, "A new coding technique for digital holographic video using multi-view prediction," IEICE Trans. Inf. Syst. E90-D, 118–125 (2007).
17. Y.-H. Seo, H.-J. Choi, J.-S. Yoo, and D.-W. Kim, "Digital hologram compression technique by eliminating spatial correlations based on MCTF," Opt. Commun. 283, 4261–4270 (2010).
18. L. T. Bang, Z. Ali, P. D. Quang, J.-H. Park, and N. Kim, "Compression of digital hologram for three-dimensional object using Wavelet–Bandelets transform," Opt. Express 19, 8019–8031 (2011).
19. http://www.youtube.com/watch?v=WpI0PWALdLE&feature=plcp.
20. Y.-H. Seo, H.-J. Choi, J.-S. Yoo, and D.-W. Kim, "Cell-based hardware architecture for full-parallel generation algorithm of digital holograms," Opt. Express 19, 8750–8761 (2011).
21. http://developer.nvidia.com/category/zone/cuda-zone.