IEICE TRANS. FUNDAMENTALS, VOL.E88–A, NO.10 OCTOBER 2005

2835

PAPER

Special Section on Information Theory and Its Applications

The Adaptive Distributed Source Coding of Multi-View Images in Camera Sensor Networks
Mehrdad PANAHPOUR TEHRANI†a), Toshiaki FUJII††, Members, and Masayuki TANIMOTO††, Fellow

SUMMARY We show that distributed source coding of multi-view images in camera sensor networks (CSNs) using adaptive modules can come close to the Slepian-Wolf bound. In a systematic scenario with limited node abilities, the work of Slepian and Wolf suggests that it is possible to encode statistically dependent signals in a distributed manner at the same rate as a system in which the signals are jointly encoded. We consider three kinds of statistically dependent nodes (PN, CN, and CNs). Different distributed architecture solutions are proposed based on a parent node (PN) and child node (CN) framework. A PN sends the whole image, whereas a CNs/CN sends only part of it, using adaptive coding based on an adaptive module-operation at a rate close to the theoretical bound, H(CNs|PN)/H(CN|PN,CNs). A CNs sends a sub-sampled image and encodes the rest of the image, whereas a CN encodes the whole image. In other words, the proposed scheme allows independent encoding and joint decoding of views. Experimental results show performance close to the information-theoretic limit. Furthermore, the proposed architecture with the adaptive scheme shows a significant improvement over previous work.
key words: distributed source coding, camera sensor networks, multi-view images, Slepian-Wolf theory, parent node, child node

1. Introduction

Multi-view images of a scene can be used for several applications, ranging from free viewpoint television (FTV) [1] to surveillance. In FTV, the user can freely control the viewpoint position in any dynamic real-world scene. Such a system can cover only a limited space. To expand the coverage to a wider area, a distributed sensor network can be used. In the Smart Dust paradigm, hundreds or thousands of sensor nodes of cubic-millimeter dimension can be scattered about any desired environment. They are used for tasks such as surveillance, widespread environmental sampling, security, and health monitoring, and are expected to make human life safer and easier [2], [3]. The CSN (Camera Sensor Network) [4] was introduced as an extension of the FTV system; it has hundreds of cameras distributed throughout the environment and performs different tasks, such as data fusion for further processing and arbitrary viewpoint generation. In this research, we are interested in the data fusion application of CSNs.

A CSN presents a significant trade-off between the power consumed by processing and that consumed by communication. Communication power costs can vastly exceed the demands of today's power-efficient processors. As a result, developers generally strive to process information locally to reduce the amount of data transmitted. The distributed architecture was designed to help the sensor network capitalize on the collective behavior of these complex systems, increasing the communication load only when doing so is optimal.

Due to the enormous size of multi-view images in CSNs, coding is one of the challenges in building such applications. Multi-view images are usually highly correlated in the spatial domain, and therefore spatial redundancy can be removed by encoding the information "differentially" with respect to an appropriate "reference" [5]. The disadvantage of this method is the extra communication overhead between nodes. In a scenario with limited processing and communication abilities, this method is not preferable. Work by Slepian and Wolf [6] shows that even if the sources are encoded independently, they can be fully reconstructed under certain conditions. In other words, the Slepian-Wolf theorem suggests that it is possible to encode statistically dependent signals in a distributed manner at the same rate as a system in which the signals are jointly encoded. Therefore, distributed source coding is preferable if there is a major constraint on individual camera node performance (i.e., the energy consumed by sensing and communication operations). However, approaching the Slepian-Wolf bound is still a challenging issue. Some work has been carried out [7]–[12] on designing distributed source coding, but its performance is not close to the information-theoretic bound. Aaron et al. [13] proposed compression with side information using turbo codes. This method approaches the theoretical bound of Slepian and Wolf for video sequences; however, it resembles our method only loosely. Following the work of Aaron, Zhu et al. [11] proposed a distributed compression method for a large number of cameras using Wyner-Ziv [14], [15] coding, which is a lossy scheme and cannot be compared with the method proposed in this paper.

This paper introduces a lossless adaptive distributed source coding method [16], [17] without inter-node communication for multi-view images in CSNs, similar to the distributed source coding method proposed in [18]–[20], based on a module-operation and a parent node (PN) and child node (CN) framework. A parent node encodes the whole image, whereas a child node encodes it only partially. The coding scheme allows independent encoding and joint decoding of each view. To perform the decoding task, disparity estimation is employed to compensate for the scene geometry [21], [22] and provide the side information. Experimental results show performance close to the information-theoretic limit. Furthermore, the distributed source coding with the adaptive scheme shows a significant improvement over previous work [18], [19].

The remainder of the paper is organized as follows: Sect. 2 defines CSNs and their configuration. Section 3 describes the coding architectures. The coding method is explained in Sect. 4. Section 5 shows the experimental results. Finally, Sect. 6 and Sect. 7 present the discussion and conclusions of this research.

Manuscript received January 24, 2005. Manuscript revised April 22, 2005. Final manuscript received June 15, 2005.
† The author is with the Information Technology Center, Nagoya University, Nagoya-shi, 464-8601 Japan.
†† The authors are with the Department of Electrical Engineering and Computer Science, Graduate School of Engineering, Nagoya University, Nagoya-shi, 464-8603 Japan.
a) E-mail: [email protected]
DOI: 10.1093/ietfec/e88–a.10.2835
Copyright © 2005 The Institute of Electronics, Information and Communication Engineers

2. Camera Sensor Networks

In this research, each camera is connected to an individual processor, and each node is able to communicate with a central node. Each camera with its processing and communication devices is called a camera sensor node. For each cluster of sensor nodes, a central node is assigned for user interface and network management. The central node receives the encoded data from the sensor nodes, jointly decodes them, and delivers the generated view to the user or to further processing.

The camera configuration affects the total processing task, which depends heavily on the geometry compensation at the joint decoder side. The coding process is valid for all cases (dense and coarse CSNs). Dense CSNs are suitable for real-time operation. Coarse CSNs cannot perform in real time, because the geometry compensation needs a large amount of computation time to perform satisfactorily. Obviously, if the cameras are set too far apart, the coding tasks cannot be performed at all, because there will not be any correlation or overlapping part among camera nodes to provide the side information at the decoder side.

Fig. 1 Distributed source coding architectures: (a) PPP, (b) PCP, (c) CsPCs.

Fig. 2 Distributed source coding architectures, CsCsCs.

3. Coding Architectures

The cameras in the CSN system are grouped into correlated clusters, as mentioned in Sect. 2. The sensor nodes in the CSN are set on a line at equal intervals. Each cluster of nodes is coded independently. The cluster size corresponds to the maximum allowable disparity, i.e., the maximum distance of a camera pair. Figure 1 shows three coding architectures without inter-node communication. Depending on the coding architecture, a CN may have a sub-sampled image (i.e., a syndrome image). "P" represents a PN. An encoded CN that includes a syndrome image is denoted "CNs" or "Cs," and one that does not include the syndrome image is denoted "CN" or "C." Figure 1(a) shows the individual coding of multi-view images; therefore, all nodes are PNs and the coding architecture is named PPP. Figure 1(b) demonstrates coding with two PNs and is called PCP. Figure 1(c) illustrates a distributed source coding with just one PN, which is located in the middle of the cluster. This architecture is called CsPCs. The arrows in Fig. 1 show the node to be used to decode a CN at the joint decoder. In CsPCs, the PN in the middle provides side information to decode each CNs in the cluster. All CN nodes in PCP are decoded using virtual PNs (vPN), which are generated by the two PNs at the outer border of a cluster.

In the CsPCs architecture, the P can be replaced by a Cs, which makes the decoding of the CNs independent of a PN, as shown in Fig. 2. This coding architecture is called CsCsCs. In CsCsCs, each CNs is decoded by using its neighbor. Furthermore, in the PCP architecture, the P can be replaced by a Cs, which yields two further architectures (i.e., PCCs and CsCCs), as shown in Fig. 3. In PCCs (Fig. 3(a)) the Cs node is decoded using the PN, as in the CsPCs architecture, and then the CN nodes in the middle are decoded as in the PCP architecture. In CsCCs (Fig. 3(b)) the Cs nodes are decoded using the other Cs node, as in the CsCsCs architecture, and then the C nodes in the middle are decoded as in the PCP architecture.

Fig. 3 Distributed source coding architectures: (a) PCCs, (b) CsCCs.

4. Coding Method

Before describing the encoding/decoding algorithms, it is essential to define the variables used throughout the following sections. "n × m" describes a block to be encoded. "D" stands for the maximum gray-level bound that is imposed on the image coding of each block. "Maximum disparity" stands for the number of pixels required to find all correspondences in a stereo setup; it also defines the size of a cluster.

Figure 4 shows a block diagram of the proposed distributed source coder. Although the CN/CNs and PN are drawn apart, in practice they are arranged one after another as shown in Fig. 1, Fig. 2, and Fig. 3. Since there is no inter-node communication among the cameras, the CN/CNs views are encoded independently at each node. The encoded data is transmitted to the joint decoder. At the joint decoder, the side information from the PN is provided by the scene geometry, which is obtained by the area-based matching method of [21], [22].

4.1 Encoding

In the encoding algorithm, the PN images are transmitted without any coding. As shown in Fig. 5, in the CsPCs, CsCsCs, PCCs, and CsCCs architectures the CNs image is sub-sampled according to the block size and forms the so-called "syndrome image" (this is not done in a CN). The remaining pixels (i.e., all pixels of a block in a CN) build an image called the "coset image." An example of the coding is also shown in the figure. The encoding of a coset image works as follows: the coset pixels of each block are encoded with a "D" value. The adaptive value of "D" is decided by using the average absolute gradients of a block in the vertical and horizontal directions, which correspond to the spatial frequency of the scene. In the adaptive coding scheme, the higher the spatial frequency, the higher the "D" value used. Based on the range in which the measured average gradient lies, the adaptive "D" used to encode a block is obtained. Table 1 shows a look-up table for deciding the adaptive "D" for each range. After choosing the adaptive "D" value, the pixels in a block are further encoded by applying a module-operation with modulus "D" to the pixel values. The same algorithm is applied to the other blocks in the image. The image encoded using the adaptive scheme is called the "coset image." The quality of the encoded image can be controlled by changing the average "D" value used for an image at the encoder side; this is done by multiplying the measured average gradient by a linear weighting factor (i.e., ≥ 0). Figure 6 shows Cube&Doll (Fig. 6(a), original), the result of a fixed encoder with D = 16 (Fig. 6(b), scaled by 8), and the adaptive scheme with 8 × 8 block size and Dave = 23 (Fig. 6(c), scaled by 4). Comparison of Fig. 6(b) and Fig. 6(c) shows a significant reduction of information.

4.2 Decoding

After receiving the encoded data of the CN/CNs and the full information of the PN at the joint decoder, the coset images of the CN/CNs must be reconstructed. Because a module-operation was applied to form the coset image, decoding is necessary. This is done by using side information from the estimated scene geometry (this is not needed for all blocks in a CNs). The scene geometry is obtained using a block-based matching method [21], [22]. Geometry compensation for decoding a CN and a CNs is as follows:

• CN: A vPN image is generated at the location of each CN by using two PNs in PCP, a PN and a decoded CNs in PCCs, and two decoded CNs in CsCCs.
The CN image pixels are decoded by using the vPN's pixels at the same location.

Fig. 4 Distributed source coder in general.

Fig. 5 Encoding method of CN/CNs with example.

Table 1 Look-up table for adaptive distributed source coding.

Gradient range | 0 | 1 | 2–3 | 4–7 | 8–15 | 16–31 | 32–63 | 64–127 | ≥128
D              | 1 | 2 | 4   | 8   | 16   | 32    | 64    | 128    | 256
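The adaptive choice of "D" from Table 1 and the module-operation of Sect. 4.1 can be sketched as follows. This is a minimal sketch under our reading of the paper: the exact gradient averaging and the function names (`adaptive_d`, `encode_coset`) are our own assumptions, and the syndrome sub-sampling of a CNs is omitted.

```python
import numpy as np

# Table 1, Sect. 4.1: upper bound of each average-gradient range and the
# corresponding modulus "D". A gradient >= 128 falls through to D = 256.
GRADIENT_BOUNDS = (0, 1, 3, 7, 15, 31, 63, 127)
D_VALUES = (1, 2, 4, 8, 16, 32, 64, 128, 256)

def adaptive_d(block, weight=1.0):
    """Pick "D" from the average absolute gradients of a block (vertical and
    horizontal), scaled by the linear quality-control weight (>= 0)."""
    b = block.astype(np.int64)
    gv = np.abs(np.diff(b, axis=0)).mean() if b.shape[0] > 1 else 0.0
    gh = np.abs(np.diff(b, axis=1)).mean() if b.shape[1] > 1 else 0.0
    g = weight * (gv + gh) / 2.0
    for bound, d in zip(GRADIENT_BOUNDS, D_VALUES):
        if g <= bound:
            return d
    return D_VALUES[-1]

def encode_coset(image, n=8, m=8, weight=1.0):
    """Module-operation encoder: each n-by-m block is replaced by its pixel
    values modulo the block's adaptive "D"."""
    coset = np.empty_like(image)
    for i in range(0, image.shape[0], n):
        for j in range(0, image.shape[1], m):
            block = image[i:i + n, j:j + m]
            coset[i:i + n, j:j + m] = block.astype(np.int64) % adaptive_d(block, weight)
    return coset
```

A flat block yields D = 1 (coset all zero, maximum compression), while a high-frequency block yields D = 256 (values pass through unchanged), matching the "higher spatial frequency, higher D" rule above.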

Fig. 6 (a) Original image of Cube&Doll, (b) encoded view of a fixed encoder for D = 16 (scaled by 8), and (c) adaptive coded view with 8 × 8 block size and Dave = 23 (scaled by 4).


• CNs: To obtain the side information to decode a block, there are three methods:
– (a) For each block, the side information is provided by linear interpolation of the up-sampled syndrome image, which is called USI.
– (b) For each block in a CNs, the corresponding pixels in the PN/CNs are found in CsPCs/CsCsCs by geometry compensation using the syndrome image.
– (c) A combination of methods (a) and (b), which outperforms either method alone in our experiments. In this method, the choice between (a) and (b) is made by measuring the gradient range of a 2 × 2 block that includes the syndrome pixel of the block to be decoded and the three other syndrome pixels in its neighborhood. If the gradient range is larger than 125 (in gray scale), method (a) is used; for gradient ranges less than 125, method (b) performs better. The threshold value of 125 was obtained as an optimal value through experiments on two data sets (i.e., Cube&Doll and Tsukuba). Note that the geometry compensation method affects the threshold value.

After obtaining the side information, the values of the coset image are decoded by applying an "inverse" module-operation, which is not unique. Therefore, the solution is chosen that minimizes the distance to the corresponding pixel of the other image (i.e., syndrome pixels of the CNs, PN, vPN, or USI). Equation (1) and Eq. (2) show the decoding procedure for a given "D" value:

C(c') = {kD + c' | k ∈ {0, 1, 2, ...}, kD + c' < 256}   (1)

c = arg min_{v ∈ C(c')} |v − s|   (2)

where "C(c')" is the set of all candidate values for the coset pixel c'. The value in C(c') closest to the side information "s" is chosen as the decoded value of the child-node coset pixel "c."

Example of decoding c' = 3 for D = 16 with s = 150:
c' = 3 → C(c') = {3, 3 + 16, 3 + 2 × 16, 3 + 3 × 16, ...} = {3, 19, 35, 51, 67, 83, 99, 115, 131, 147, 163, 179, 195, 211, 227, 243}
c = arg min_{v ∈ C(c')} |v − 150| → c = 147

However, decoding of the coset image is not possible if the "D" value for each block is not known. To solve this problem, there are three options:
1. Sending the "D" value for each block from the encoder to the decoder.
2. Estimating the "D" value from the vPN/USI image (i.e., the side information): Table 1 is applied to the vPN/USI image at the decoder to estimate "D."
3. Estimating the "D" value from the CN/CNs image (i.e., the coset image): the maximum coset value of each block indicates the range, and "D" is then decided using Table 1.

The first method is not preferable because of its overhead on the transmission rate. Therefore, we would like to estimate the "D" value of each block in order to decode the coset image. Experimental results on different block sizes and image scenes show that the performance of the third method is nearly the same as that of the first. Hence, we propose to use the third method for decoding. The coding flowchart is shown in Fig. 7; it summarizes the coding algorithms described in Sect. 4.1 and Sect. 4.2.

Fig. 7 Flowchart of (a) Encoder and (b) Decoder (i.e., (c) Joint Decoding) of adaptive distributed source coding.
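The inverse module-operation of Eq. (1) and Eq. (2) can be sketched per pixel as follows; the function name `decode_coset_pixel` is our own, and the gray-level range [0, 256) is taken from Eq. (1).

```python
def decode_coset_pixel(c_prime, d, s):
    """Invert the module-operation for one coset pixel.

    Candidates C(c') = {kD + c' | kD + c' < 256} (Eq. (1)); the candidate
    closest to the side-information pixel s is returned (Eq. (2))."""
    candidates = range(c_prime, 256, d)
    return min(candidates, key=lambda v: abs(v - s))

# Worked example from the text: c' = 3, D = 16, s = 150
print(decode_coset_pixel(3, 16, 150))  # -> 147
```

The ambiguity of the inverse is resolved exactly as in the text: among the 256/D candidates, the one nearest the geometry-compensated side information is kept.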


Note that the proposed coding scheme is lossless in principle; in practice, however, errors occur. The "D" value represents the maximum gray-level difference between two corresponding locations in two images. Hence, if we had the correct value of "D" at the encoder for each pixel, and if we knew the correct location of the corresponding point of the encoded pixel at the decoder, we could decode the encoded image without any loss or error. The proposed adaptive scheme, which is developed based on this assumption, is therefore lossless. However, errors occur in the coding procedure, because the correct prediction of the "D" value for each block is not possible: the corresponding point value is not available at the encoder (i.e., there is no communication among nodes in a distributed source coding scheme), and the correct location of the corresponding point must be found at the decoder (i.e., joint decoding). In addition, an image-based method is used in the geometry compensation block of the decoder to find the corresponding value, so we cannot perfectly find the block that matches the encoded block.

Fig. 8 CNs PSNR vs. "D" for adaptive and fixed coders in comparison with USI in CsPCs and PCCs: (a) Cube&Doll, 45.7 dB for average D = 15; (b) Tsukuba, 43.2 dB for average D = 35.

5. Experiment

In this experiment, the data set consists of 3 views with 320 × 240 pixels per view of Cube&Doll, and 3 views with 384 × 288 pixels per view of Tsukuba. The maximum disparity between the two furthest views is 30 pixels. All experiments are carried out on the luminance component only. The experiment is divided into two parts. The first part shows the performance of the adaptive source coder in comparison with a conventional coder. The second part examines the performance of the adaptive source coder with respect to the Slepian-Wolf bound.

5.1 Comparison of Fixed and Adaptive Coders

In this section the CN and CNs decoding performances of the adaptive coder are compared with those of a fixed coder. The reconstruction performance of the adaptive coder for both test data sets is compared with a coder that uses the same "D" value for all pixels in the CN image (i.e., the fixed coder [18]–[20]). The reconstruction quality of a CN/CNs is measured in terms of the peak signal-to-noise ratio (PSNR).

5.1.1 CNs

This part of the experiment considers the performance of a CNs in the CsPCs, CsCsCs, PCCs, and CsCCs architectures. The block size in the adaptive coders is 2 × 2. Due to the extra information of the coset image in a CNs, the adaptive distributed source coder should perform better than the up-sampled and linearly interpolated view of the syndrome image (USI). Therefore, Fig. 8 compares the decoded CNs quality with the USI quality when decoded by a PN (in CsPCs and PCCs), whereas Fig. 9 shows the comparison when decoded by a CNs (CsCsCs and CsCCs architectures).

Fig. 9 CNs PSNR vs. "D" for adaptive and fixed coders in comparison with USI in CsCsCs and CsCCs: (a) Cube&Doll, 45.6 dB for average D = 15; (b) Tsukuba, 43.05 dB for average D = 48.

5.1.2 CN

This part of the experiment considers the performance of a CN in the PCP, PCCs, and CsCCs architectures. The block size in the adaptive coders is 4 × 4. Due to the geometry compensation at the joint decoder (i.e., the vPN image as side information), the adaptive distributed source coder should perform better than the vPN. Therefore, Fig. 10 compares the decoded CN quality with the vPN quality when decoded by using two PNs (in PCP). These figures show PSNR curves vs. "D" for adaptive and fixed coders in comparison with the vPN, and examples of the reconstructed CN image after decoding. In the adaptive scheme, the "D" value denotes the average "D" value. The adaptive coder gains over the vPN/USI quality at a lower average "D" value than the fixed coder does. Furthermore, the experimental result for Tsukuba shows a significant improvement due to the complexity of the scene. Note that the performance of the CN in the other architectures (PCCs and CsCCs) is not evaluated, owing to its similarity to the performance shown.

Fig. 10 CN PSNR vs. "D" for adaptive and fixed coders in comparison with vPN in PCP: (a) Cube&Doll, 36.37 dB for average D = 16; (b) Tsukuba, 27.51 dB for average D = 48.

Fig. 11 Statistical dependence of PN and CNs with entropy in CsPCs.

Fig. 12 Inefficiency of total rate achieved by the adaptive scheme as compared to Slepian-Wolf bound in CsPCs.

5.2 Slepian-Wolf Bound

In view of the aforementioned importance of adaptive distributed source coding of multi-view images in CSNs, the information-theoretic bound of Slepian and Wolf is appealing. In this part of the experiment, the CN and CNs rates are compared with the rate given by Slepian and Wolf. The performance of the coding in all architectures is evaluated in terms of the "total entropy inefficiency." This is the absolute value of one minus the ratio of the rate achieved in the experiment to the ideal rate given by Slepian-Wolf theory, as shown in Eq. (3). A zero value of "TotalEntr.Ineff." means that the encoder perfectly reaches the theoretical Slepian-Wolf bound.

TotalEntr.Ineff. = |1 − AchievedRate / SlepianWolfRate| × 100   (3)

5.2.1 CNs

H(PN) and H(CNs) are the entropies of the PN and CNs images of a cluster in the CsPCs architecture. Due to the coding architecture, H(PN,CNs), the joint entropy of PN and CNs, must be measured. After measuring these values, H(CNs|PN) = H(PN,CNs) − H(PN) can be calculated, as shown in Fig. 11. Figure 12 shows the total entropy inefficiency for Cube&Doll and Tsukuba in the CsPCs architecture, computed as in Eq. (4):

TotalEntr.Ineff. = |1 − (Rx + H(PN)) / H(CNs,PN)| × 100   (4)
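The inefficiency measure of Eq. (3), which Eq. (4) (and later Eqs. (6) and (7)) instantiate with concrete rates and entropies, is simple arithmetic; a minimal sketch, with a function name of our own choosing:

```python
def total_entropy_inefficiency(achieved_rate, slepian_wolf_rate):
    """Eq. (3): |1 - achieved / ideal| * 100, as a percentage.
    Both rates are in the same units (e.g., bits per pixel)."""
    return abs(1.0 - achieved_rate / slepian_wolf_rate) * 100.0
```

For example, an achieved rate of 4.9 bits/pixel against an ideal rate of 5.0 bits/pixel gives an inefficiency of about 2%; these numbers are illustrative only, not taken from the experiments.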

The entropies of the encoded images are measured individually. Note that since the experiment reports the "TotalEntr.Ineff.," we only need to measure the joint entropy and the entropy of each image or encoded image individually. The entropy of one image and the conditional entropy of two images can be measured as follows, without any stochastic model:
1. For each grey level (bin) in the histogram, compute the frequency f(i) = (number of pixels in bin(i)) / (total pixels in the image).
2. The entropy H(N) of image N is the sum over all bins of −f(i) × log f(i).
3. The joint entropy of two images, H(N1,N2), is the sum over all bins of −f(i, j) × log f(i, j), where f(i, j) is taken from the joint histogram.
4. The conditional entropy of N1 given N2 is H(N1|N2) = H(N1,N2) − H(N2).
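The four measurement steps above can be sketched as follows (histogram-based, base-2 logarithms, 8-bit images; the function names are our own):

```python
import numpy as np

def entropy(img):
    """Step 1-2: H(N) in bits/pixel from the grey-level histogram."""
    f = np.bincount(img.ravel(), minlength=256) / img.size
    f = f[f > 0]                       # 0 * log(0) is taken as 0
    return float(-(f * np.log2(f)).sum())

def joint_entropy(img1, img2):
    """Step 3: H(N1,N2) from the joint histogram of co-located pixel pairs."""
    pairs = img1.ravel().astype(np.int64) * 256 + img2.ravel()
    f = np.bincount(pairs, minlength=256 * 256) / img1.size
    f = f[f > 0]
    return float(-(f * np.log2(f)).sum())

def conditional_entropy(img1, img2):
    """Step 4: H(N1|N2) = H(N1,N2) - H(N2)."""
    return joint_entropy(img1, img2) - entropy(img2)
```

As a sanity check, an image conditioned on itself has zero conditional entropy, and a two-level image with equally frequent levels has H = 1 bit/pixel.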

Another architecture is CsCsCs. H(CNs1) and H(CNs2) are the entropies of the two child nodes, as shown in Fig. 13. Due to the coding architecture, H(CNs1,CNs2) is the joint entropy of CNs1 and CNs2. The rate required to approach the Slepian-Wolf bound must satisfy Eq. (5) in practice, as shown in Fig. 13:

H(CNs1,CNs2) = Rx1 + Rx2   (5)

where Rx1 and Rx2 are the transmission rates of CNs1 and CNs2, respectively. Figure 14 shows the total entropy inefficiency for Cube&Doll and Tsukuba in the CsCsCs architecture, calculated as in Eq. (6):

TotalEntr.Ineff. = |1 − (Rx1 + Rx2) / H(CNs1,CNs2)| × 100   (6)

The results in the figures show that by using the adaptive coding scheme the total rate can come close to the Slepian-Wolf bound. In the Cube&Doll image, when the average "D = 32," the rate Rx completely satisfies the theoretical rate, with a 22.72 dB gain over the USI. Obviously, Tsukuba has a more complicated scene than Cube&Doll; nevertheless, its rate inefficiency is 2% at "D = 48." According to Fig. 9, in Tsukuba the minimum average "D" value required for efficient decoding with gain over the USI has decreased much more than that of Cube&Doll.

Fig. 13 Statistical dependence of CNs with entropy in CsCsCs.

Fig. 14 Inefficiency of total rate achieved by the adaptive scheme as compared to the Slepian-Wolf bound in CsCsCs.

5.2.2 CN

H(PN), H(CNs), and H(CN) are the entropies of the images in a cluster of the PCCs architecture. Due to the coding architecture, H(PN,CNs,CN), the joint entropy of PN, CNs, and CN, and H(PN,CNs), the joint entropy of PN and CNs, must be measured. After measuring these values, H(CN|CNs,PN) = H(PN,CNs,CN) − H(PN,CNs) can be calculated, as shown in Fig. 15. The H(CN|CNs,PN) value is the point to which the rate Rx of the CN should be close after it is encoded (i.e., as a coset image). The rates obtained for the CNs are those obtained in Sect. 5.2.1. Figure 16 shows the total entropy inefficiency for Cube&Doll and Tsukuba. The total entropy inefficiency for PCCs is calculated as in Eq. (7):

TotalEntr.Ineff. = |1 − (Rx + Ry + H(PN)) / H(PN,CNs,CN)| × 100   (7)

Fig. 15 Statistical dependence of PN, CNs, and CN with entropy.

Fig. 16 Inefficiency of total rate achieved by the adaptive scheme as compared to the Slepian-Wolf bound in PCCs.

As shown in Fig. 16, in the Cube&Doll image, when the average "D = 22," the "TotalEntr.Ineff." is 1%, with a 4.5 dB gain over the vPN. Tsukuba has a more complicated scene than Cube&Doll, yet its rate inefficiency approaches zero for "D = 32." As Fig. 10 shows, in Tsukuba the minimum average "D" value required for efficient decoding with gain over the vPN has decreased significantly in comparison with that of Cube&Doll.

Considering the performance of the decoded CN/CNs together with the achieved rates in the graphs shows that, by applying the adaptive distributed source coding, the quality of the decoded result can be improved while the Slepian-Wolf bound is satisfied. As shown throughout the experimental part, the coding performance depends on the value of "D," which is decided by the algorithm explained in Sect. 4.1 using the gradient-range measurement of a block. Note that the obtained average "D" value is always the value at which the lowest total entropy inefficiency is obtained in Fig. 12, Fig. 14, and Fig. 16. Therefore, the proposed algorithm predicts the best "D" value for coding a given block close to the Slepian-Wolf bound. Given this explanation of the optimal value of "D," the adaptive coder should show a significant improvement over the vPN, the USI, and the fixed coder at the predicted value of "D," as shown in Fig. 8, Fig. 9, and Fig. 10. We emphasize that this value of "D" is the optimal value for an optimized coding result. Note that the coding performance for values of "D" larger than the optimal value is better, but the deviation from the Slepian-Wolf bound increases (i.e., more information is transmitted), which is not preferable. The adaptive distributed coding scheme can also perform better than the vPN, the USI, and the fixed coder for values somewhat below the optimal "D," but the deviation from the Slepian-Wolf bound again increases (i.e., less information is transmitted), which is also not preferable.

6. Discussion

The adaptive distributed source coding in CSNs can generally be evaluated in terms of quality and Slepian-Wolf bound performance. In this section, in addition to the aforementioned terms, the proposed method is discussed with regard to its compatibility with a generalized coding architecture of more than three nodes, with a random arrangement of cameras in CSNs, and with further compression of the coset images.

• Generalization to more than three nodes: The coding scheme can be applied to architectures of more than three nodes by carefully deciding the block sizes and the "D" values at the encoder. Larger block sizes and smaller "D" values should be used if the number of nodes in a cluster is increased or the camera interval is decreased.

• Random arrangement of cameras: The adaptive distributed source coding can be applied to a random arrangement of cameras. Note that in such applications the overlapping parts of the cameras' viewing ranges in a cluster must be found by using the camera parameters [18].

• Compression of coset images: In addition, a further compression scheme is needed for the coset images, which is considered a future research issue. Normal JPEG compression of the coset images is useful, as shown in [19], [20]; however, we believe a better compression of coset images can be developed based on the coset image statistics.

7. Conclusion

This paper introduced an encoder/joint-decoder scheme based on an adaptive module-operation for asymmetric distributed source coding of multi-view images in camera sensor networks, considering the spatial frequency of the area to which the encoding algorithm is applied. Each CN/CNs is encoded without knowledge of the other nodes. At the encoder, the desired quality can be controlled through the "D" value. The output is the module image (i.e., the coset image). The decoder performs an inverse module-operation by estimating the "D" value using the received encoded CN/CNs and the side information provided by geometry compensation. The adaptive distributed source coding can approach the Slepian-Wolf bound while controlling the quality of the encoding. Furthermore, its performance shows a significant improvement over the conventional coding scheme, with a smaller average "D" value. The coding scheme can be applied to random arrangements and to architectures of more than three nodes, given the camera parameters and a careful choice of the block sizes and "D" values at the encoder. In addition, a suitable compression scheme is needed for the coset images, which is considered a future research issue.

Acknowledgement

This work has been supported by the SCOPE Fund project, Ministry of Internal Affairs and Communication, Japan (ref. No.: 041306003).

References

[1] P. Na Bangchang, T. Fujii, and M. Tanimoto, “Experimental system of free viewpoint television,” Proc. IS&T/SPIE Symposium on Electronic Imaging, vol.5006-66, pp.554–563, Santa Clara, CA, USA, Jan. 2003.
[2] J.M. Kahn, R.H. Katz, and K.S.J. Pister, “Mobile networking for smart dust,” ACM/IEEE International Conference on Mobile Computing and Networking, vol.2, no.3, pp.188–196, Seattle, WA, Aug. 1999.
[3] C.C. Shen, C. Srisathapornphat, and C. Jaikaeo, “Sensor information networking architecture and applications,” IEEE Pers. Commun., vol.8, no.4, pp.52–59, Aug. 2001.
[4] M.P. Tehrani, P. Na Bangchang, T. Fujii, and M. Tanimoto, “The optimization of distributed processing for arbitrary view generation in camera sensor networks,” IEICE Trans. Fundamentals, vol.E87-A, no.8, pp.1863–1870, Aug. 2004.
[5] P. Na Bangchang, T. Fujii, and M. Tanimoto, “Ray-space data compression using spatial and temporal disparity compensation,” IWAIT 2004, pp.171–175, Jan. 2004.
[6] D. Slepian and J.K. Wolf, “Noiseless coding of correlated information sources,” IEEE Trans. Inf. Theory, vol.IT-19, no.4, pp.471–480, July 1973.
[7] S.S. Pradhan and K. Ramchandran, “Distributed source coding using syndromes (DISCUS): Design and construction,” Proc. IEEE Data Compression Conference, pp.157–167, Snowbird, UT, March 1999.
[8] S.S. Pradhan and K. Ramchandran, “Distributed source coding: Symmetric rates and application to sensor networks,” Proc. IEEE Data Compression Conference, pp.363–372, Snowbird, UT, March 2000.
[9] X. Wang and M. Orchard, “Design of trellis codes for source coding with side information at the decoder,” Proc. IEEE Data Compression Conference, pp.361–370, Snowbird, UT, March 2001.
[10] J. Kusuma, L. Doherty, and K. Ramchandran, “Distributed compression for sensor networks,” Proc. IEEE International Conference on Image Processing, ICIP 2001, vol.1, pp.82–85, 2001.
[11] X. Zhu, A. Aaron, and B. Girod, “Distributed compression for large camera arrays,” Proc. IEEE Workshop on Statistical Signal Processing, SSP-2003, St. Louis, Missouri, USA, Sept. 2003.
[12] A. Sehgal, A. Jagmohan, and N. Ahuja, “Wyner-Ziv coding of video: An error-resilient compression framework,” IEEE Trans. Multimedia, vol.6, no.2, pp.249–258, April 2004.
[13] A. Aaron and B. Girod, “Compression with side information using turbo codes,” Proc. IEEE Data Compression Conference, DCC 2002, Snowbird, UT, April 2002.
[14] A. Wyner and J. Ziv, “The rate-distortion function for source coding with side information at the decoder,” IEEE Trans. Inf. Theory, vol.IT-22, no.1, pp.1–10, Jan. 1976.
[15] A. Aaron, E. Setton, and B. Girod, “Towards practical Wyner-Ziv coding of video,” Proc. IEEE International Conference on Image Processing, ICIP-2003, Barcelona, Spain, Sept. 2003.
[16] M.P. Tehrani, M. Droese, T. Fujii, and M. Tanimoto, “Adaptive distributed source coding for multi-view images,” Proc. PCM 2004, LNCS 3333, pp.249–256, Tokyo, Japan, Dec. 2004.
[17] M.P. Tehrani, M. Droese, T. Fujii, and M. Tanimoto, “Effect of quality control on adaptive distributed source coding for multi-view images,” Proc. Forum on Information Technology, FIT 2004, M-015, pp.121–122, Kyoto, Japan, Sept. 2004.
[18] M.P. Tehrani, T. Fujii, and M. Tanimoto, “A distributed source coding for ITS,” Forum on Information Technology, FIT 2002, pp.389–390, Sept. 2002.
[19] M.P. Tehrani, T. Fujii, and M. Tanimoto, “Distributed source coding of multi-view images,” Proc. IS&T/SPIE Symposium on Electronic Imaging, VCIP 2004, vol.5308, no.31, pp.300–309, San Jose, CA, USA, Jan. 2004.
[20] M.P. Tehrani, M. Droese, T. Fujii, and M. Tanimoto, “Distributed source coding architectures for multi-view images,” J. Institute of Image Information and Television Engineers (ITE), vol.58, no.10, pp.107–110, Oct. 2004.
[21] A. Nakanishi, T. Fujii, T. Kimoto, and M. Tanimoto, “Ray-space data interpolation by adaptive filtering using locus of corresponding points on epipolar plane image,” J. Institute of Image Information and Television Engineers (ITE), vol.56, no.8, pp.1321–1327, Aug. 2002.
[22] M. Droese, T. Fujii, and M. Tanimoto, “Ray-space interpolation based on filtering in disparity domain,” Proc. 3D Image Conference 2004, pp.29–30, Tokyo, Japan, June 2004.

Mehrdad Panahpour Tehrani received the B.E., M.E., and Dr.E. degrees in electrical engineering from Tehran Polytechnics University, Iran in 1997, Tarbiat Modarres University, Iran in 2000, and Nagoya University, Japan in 2004, respectively. Currently, he is a post-doctoral fellow at the Information Technology Center, Nagoya University, Japan. His research interests are 3D audio/video processing and communication in camera sensor networks.

Toshiaki Fujii received the B.E., M.E., and Dr.E. degrees in electrical engineering from the University of Tokyo, Japan, in 1990, 1992, and 1995, respectively. He is currently an Associate Professor in the Graduate School of Engineering, Nagoya University, Japan. His research interests include 3D image processing and 3D visual communications.

Masayuki Tanimoto received the B.E., M.E. and Dr.E. degrees in electronic engineering from the University of Tokyo, Tokyo, Japan, in 1970, 1972 and 1976, respectively. He joined Nagoya University in 1976 and has been involved in research on visual communication and communication systems. Since 1991, he has been a Professor in the Department of Information Electronics, Nagoya University. His current research interests include image coding, multidimensional signal processing, video processing, 3D and HD images, and multimedia systems.