Distributed source coding theorem based region of interest image compression method

G. Ding, F. Yang, Q. Dai and W. Xu

A new and flexible region of interest (ROI) image compression method is proposed, based on the distributed source coding theorem with side information. In the proposed method, the reconstructed low quality ROI is regarded as a noisy version of the original ROI. Therefore, along with the parity bits from the original ROI, the reconstructed low quality ROI can be used as side information by a turbo decoder to decode the high quality ROI. Experimental results show that the proposed method, which applies Wyner-Ziv coding to ROI compression, can greatly improve the compression efficiency as well as efficiently protect the ROI against bit errors.
Introduction: Region-of-interest (ROI) image compression is a new feature in JPEG2000, which allows the ROI to be encoded with better quality than the rest of the image, i.e. the background (BG). Two kinds of ROI coding methods are included in the standard: the scaling based method and the maximum shift (maxshift) method. In the scaling based method, the wavelet transform is applied to the image at the encoder and the resulting coefficients not associated with the ROI are scaled down (shifted down) so that the ROI-associated bits are placed in higher bit-planes. During the embedded bit-plane coding process, the bits in the higher bit-planes are placed before those in the lower bit-planes. The scaling value and the shape information of the ROIs are also added to the encoded bit-stream. At the decoder, the bit-planes are reconstructed and the non-ROI coefficients are scaled up to their original bit-planes before the inverse wavelet transform is applied. If the encoded bit-stream is truncated, or the encoding/decoding process is terminated before the image is fully encoded/decoded, the ROIs will have a higher quality than the BG. The relative importance of the ROIs and the BG is determined by the scaling value, which defines the number of bit-planes to be shifted.

The maxshift method may be considered a particular case of the general scaling based method in which the scaling value is so large that the BG and ROI bit-planes do not overlap. It therefore requires no shape coding and no shape information to be explicitly transmitted to the decoder: the nonzero ROI and BG coefficients can be identified simply by their magnitudes, since all coefficients lower than the scaling value are known to belong to the BG.

There are two major drawbacks to these methods. First, they significantly reduce the compression efficiency by increasing the dynamic range (number of bit-planes) of the wavelet coefficients. Secondly, they offer no special protection for the ROI against bit errors in image communication applications, e.g. remote medical treatment and satellite communication. When bit errors occur, the reconstruction quality of the ROI and the BG degrades at the same time. In this Letter, a new and flexible ROI image compression method is proposed, which applies Wyner-Ziv coding to ROI compression and can greatly improve the compression efficiency as well as efficiently protect the ROI against bit errors.
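The maxshift idea and its first drawback can be illustrated with a minimal sketch on a handful of integer coefficients. This is not the JPEG2000 implementation: the function names are hypothetical, the coefficients are a flat list rather than wavelet subbands, and the ROI is scaled up rather than the BG scaled down, which is assumed here to be equivalent for the purpose of illustration.

```python
def maxshift_encode(coeffs, roi_mask):
    """Scale ROI coefficients so that they lie above every BG bit-plane.

    Returns the shifted coefficients and the scaling value s (in bit-planes).
    """
    # Largest BG magnitude decides how far the ROI must be shifted.
    bg_max = max((abs(c) for c, in_roi in zip(coeffs, roi_mask) if not in_roi),
                 default=0)
    s = bg_max.bit_length()  # smallest s such that 2**s > bg_max
    shifted = [c * (1 << s) if in_roi else c
               for c, in_roi in zip(coeffs, roi_mask)]
    return shifted, s


def maxshift_decode(shifted, s):
    """Recover the coefficients and ROI membership from magnitudes alone."""
    threshold = 1 << s
    coeffs, detected = [], []
    for c in shifted:
        in_roi = abs(c) >= threshold  # below the scaling value means BG
        coeffs.append(c // threshold if in_roi else c)
        detected.append(in_roi)
    return coeffs, detected


def bit_planes(coeffs):
    """Number of magnitude bit-planes needed to represent the coefficients."""
    return max(abs(c) for c in coeffs).bit_length()


coeffs = [5, -3, 0, 7, -12, 2]
roi_mask = [True, False, True, False, True, False]
shifted, s = maxshift_encode(coeffs, roi_mask)
restored, detected = maxshift_decode(shifted, s)
print(s, shifted, restored, detected)
print(bit_planes(coeffs), bit_planes(shifted))
```

No mask is passed to the decoder, yet the nonzero ROI coefficients are recovered. The cost is that the number of bit-planes grows from 4 to 7 in this toy example, which is the dynamic range increase named as the first drawback.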
In the proposed ROI compression method, the reconstructed low quality ROI is regarded as a noisy version of the original ROI. Therefore, along with the parity bits from the original ROI, the reconstructed low quality ROI can be used as side information by the channel decoder to decode the high quality ROI. Fig. 1 depicts the proposed ROI compression method. At the encoder, the input image is first transformed by the DWT. The DWT coefficients are then encoded in two channels. In one channel, the coefficients of the full image are quantised and encoded by the JPEG2000 standard encoder to generate the low bit-rate bit-stream without ROI coding. In the other channel, the coefficients of the ROI are quantised by a nested scalar quantiser (NSQ) and encoded by a rate compatible punctured turbo (RCPT) encoder to generate the parity bit-stream of the ROI. Finally, the two bit-streams are fed into the bit-rate allocator, which allocates the bit rate between them. At the decoder, the JPEG2000 decoder first decodes the low quality image. The parity bit-stream and the low quality ROI are then sent to the Wyner-Ziv decoder to decode a higher quality ROI. Finally, the low quality BG and the high quality ROI are combined to form the reconstructed image with a high quality ROI.
Fig. 1 Block diagram of proposed ROI compression method
Encoder: input image, DWT, quantiser, JPEG2000 encoder (bit-stream of JPEG2000); original ROI X(q), NSQ, bit-plane, RCPT encoder (Wyner-Ziv encoder, parity bit-stream of ROI); bit-rate allocator
Decoder: JPEG2000 decoder, dequantiser, low quality ROI X(aRg) as side information (SI), Wyner-Ziv decoder, estimation, IDWT, reconstructed image with high quality ROI
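The role of the side information in the Wyner-Ziv channel can be sketched with a textbook nested scalar quantiser. The Letter does not specify its NSQ, so the modulo-coset construction, the function names and the step size below are all assumptions. The RCPT turbo coding stage is omitted, so the coset index is handed to the decoder directly instead of being protected by parity bits.

```python
def nsq_encode(x, step, nesting_ratio):
    """Quantise x with the fine step and keep only the coset index.

    Only the index modulo nesting_ratio is kept, so fewer bits are needed
    than for the full quantisation index.
    """
    index = round(x / step)
    return index % nesting_ratio


def nsq_decode(coset, side_info, step, nesting_ratio):
    """Pick the member of the coset that lies closest to the side information."""
    # Coset members are coset + k * nesting_ratio for integer k.
    k = round((side_info / step - coset) / nesting_ratio)
    return (coset + k * nesting_ratio) * step


x = 37.2           # hypothetical original ROI coefficient
y = 35.0           # hypothetical low quality reconstruction (side information)
coset = nsq_encode(x, step=2.0, nesting_ratio=8)
print(coset, nsq_decode(coset, y, step=2.0, nesting_ratio=8))
```

Decoding succeeds here because the side information is within half a coset spacing of the quantised value. With much poorer side information (for example 20.0 in place of 35.0) the decoder would select the wrong coset member, which is why the quality of the low quality ROI matters.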
Proposed ROI image compression method: In the 1970s, Slepian and Wolf established the information-theoretic bounds for distributed lossless coding [1]. Wyner and Ziv soon thereafter extended the Slepian-Wolf theorem to the case of lossy compression [2]. These coding theories give the surprising insight that efficient data compression can also be achieved by exploiting source statistics partially or wholly at the decoder only. Coding algorithms that build upon these theorems are generally referred to as distributed source coding (DSC) algorithms [3]. It has also been proved [4] that a source is successively refinable in the Wyner-Ziv setting as long as the difference between the source and the side information is Gaussian and independent of the side information. Based on these established DSC theorems on source coding with side information, a new and flexible ROI image compression method is proposed, which applies Wyner-Ziv coding to ROI compression and can greatly improve the compression efficiency as well as efficiently protect the ROI against bit errors.

To obtain the highest quality reconstructed image at a given bit rate, the bit rates are controlled as follows. Let Rg be the goal bit rate (bits per pixel). Let So and Si be, respectively, the size of the original image and the size of the ROI. Let C = So/Si be the ratio of the image size to the ROI size. Let X(q) and X(aRg) be the original value of the ROI and the reconstructed value of the ROI at the bit rate aRg, where a is the ratio factor, as shown in Fig. 1. According to the above definitions, the number of bits for ROI compression, Bit_ROI, should be:

$$\mathrm{Bit}_{ROI} = (R_g - aR_g)\,S_o \qquad (1)$$

To have the highest quality reconstructed image at the bit rate Rg, a should satisfy:

$$(R_g - aR_g)\,S_o \ge H(X(q)\,|\,X(aR_g))\,S_i \qquad (2)$$

where H(X(q)|X(aRg)) is the conditional entropy. Dividing both sides of (2) by Si and seeking the value of a at which the two sides are closest, formula (2) can be rewritten as:

$$\min_{a} \left\| H(X(q)\,|\,X(aR_g)) - C\,(R_g - aR_g) \right\| \qquad (3)$$
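The selection of the ratio factor by formula (3) can be sketched as a small grid search over the discrete values 0.0 to 1.0 in steps of 0.1 that the Letter goes on to describe. The function names are hypothetical, and the conditional entropy is supplied by the caller as a function of a, since the Letter does not say how it is estimated. The linear entropy model and the image and ROI sizes in the usage lines are toy values chosen only to make the example concrete.

```python
def select_ratio_factor(goal_rate, image_size, roi_size, cond_entropy):
    """Choose the ratio factor a on a 0.1 grid by minimising formula (3).

    cond_entropy(a) is assumed to return H(X(q) | X(a * Rg)) in bits per
    ROI pixel. Returns (a, JPEG2000 bit rate, Wyner-Ziv encoder bits).
    """
    size_ratio = image_size / roi_size            # C = So / Si
    candidates = [i / 10 for i in range(11)]      # 0.0, 0.1, ..., 1.0

    def mismatch(a):
        # | H(X(q) | X(aRg)) - C (Rg - aRg) | from formula (3)
        return abs(cond_entropy(a) - size_ratio * (goal_rate - a * goal_rate))

    best_a = min(candidates, key=mismatch)
    jpeg2000_rate = best_a * goal_rate                          # aRg, in bpp
    wz_bits = (goal_rate - best_a * goal_rate) * image_size     # formula (1)
    return best_a, jpeg2000_rate, wz_bits


def toy_entropy(a):
    """Hypothetical model: better side information lowers the entropy."""
    return 2.0 - 1.5 * a


a, rate, bits = select_ratio_factor(goal_rate=1.0, image_size=256 * 256,
                                    roi_size=128 * 128, cond_entropy=toy_entropy)
print(a, rate, bits)
```

With these toy values the mismatch vanishes at a = 0.8, so 0.8 bpp goes to the JPEG2000 coder and the remaining 0.2 bpp of the whole-image budget is spent on ROI parity bits.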
To simplify the computation of the ratio factor a, it is restricted to the discrete values from 0.0 to 1.0 with a step size of 0.1, i.e. the selected a is the grid value that satisfies formula (3). The bit rate of the JPEG2000 coder is then aRg and the number of output bits of the Wyner-Ziv encoder is (Rg − aRg) So. Moreover, the proposed method can flexibly control the relative importance of the ROI and the BG by adjusting the ratio factor a.

Experimental results: The performance of the proposed method has been evaluated with two 256 × 256 grey-level images: ‘Girl’ and ‘Lena’. The 9/7 filters were adopted for the test images with three levels of decomposition. The nesting ratio is 8. We compared the coding efficiency of the proposed method with that of the maxshift method and of compression with no ROI. The ROI comprises the face area and is about one-sixth of the image size. Fig. 2 shows some experimental results for the two test images. As with the general scaling based method and the maxshift method, the coding efficiency of the proposed method decreases in comparison with compression without any ROI coding. This is mainly because the parity bits of the ROI need to be stored and transmitted. However, compared with the maxshift method, the proposed method improves the PSNR by approximately 1–2 dB. The reason is that, in the maxshift method, bit-plane shifting increases the dynamic range of the wavelet coefficients being encoded. In the proposed method, no bit-plane shifting is performed, and only the parity bits of the ROI are stored and transmitted.

ELECTRONICS LETTERS 27th October 2005 Vol. 41 No. 22
At 0.05 bpp (Fig. 3), although the PSNR of the whole image is higher than with the maxshift method, the PSNR of the ROI is lower. This is the main drawback of the proposed method.
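The two figures of merit compared here, the PSNR of the whole image and the PSNR of the ROI only, can be computed with one helper and an optional mask. This is a minimal sketch that assumes 8-bit grey-level images stored as flat lists and the usual definition PSNR = 10 log10(peak² / MSE). The function name and the pixel values are hypothetical and are not taken from the Letter's experiments.

```python
import math


def psnr(original, reconstructed, mask=None, peak=255.0):
    """PSNR in dB over all pixels, or only over pixels where mask is True."""
    if mask is None:
        mask = [True] * len(original)
    errors = [(o - r) ** 2
              for o, r, m in zip(original, reconstructed, mask) if m]
    if not errors:
        raise ValueError("mask selects no pixels")
    mse = sum(errors) / len(errors)
    if mse == 0:
        return math.inf  # identical pixels: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


original = [100, 100, 100, 100]
reconstructed = [100, 102, 98, 100]
roi_mask = [False, True, True, False]   # hypothetical ROI: the two middle pixels

print(psnr(original, reconstructed))             # whole image
print(psnr(original, reconstructed, roi_mask))   # ROI only
```

Evaluating the same pair of images with and without the mask is what allows the whole-image PSNR and the ROI PSNR to move in opposite directions, as they do in the comparison above.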
Fig. 3 Simulation results of ‘Lena’ reconstructed by two methods at 0.05 bpp
a Proposed method: PSNR of whole image and ROI are 17.16 and 33.58 dB, respectively
b Maxshift method: PSNR of whole image and ROI are 15.26 and 34.25 dB, respectively
Conclusion: We propose a technique to encode the region of interest for image compression, which is based on distributed source coding theorems. Experimental results and comparison with the current maxshift method demonstrate the advantage of using Wyner-Ziv coding. We expect this idea to be valuable for future research in ROI image coding and its applications. However, there are still some questions to be considered, which will require further work; e.g. how to improve the flexibility of the proposed method is an open question.
Acknowledgment: The authors acknowledge the support received from the Important National Natural Science Foundation of China (no. 60432030).
© IEE 2005
Electronics Letters online no: 20052601
doi: 10.1049/el:20052601
Fig. 2 PSNR (dB) against bit rate (bits per pixel): comparison between compression without ROI, proposed method and maxshift method, for ‘Girl’ and ‘Lena’ images
a Girl
b Lena
Because Wyner-Ziv decoding needs a low quality reconstructed image as ‘side information’, the proposed method has to transmit a low quality image before the parity bits of the ROI. At low bit rates, this should result in a lower quality reconstructed ROI than with the maxshift method. Fig. 3 shows the results for the ‘Lena’ image at 0.05 bpp.

18 July 2005

G. Ding, F. Yang, Q. Dai and W. Xu (Broadband Networks & Digital Media Laboratory of Automation Department, Tsinghua University, Beijing 100084, People’s Republic of China)

E-mail: [email protected]

References

1 Slepian, D., and Wolf, J.: ‘Noiseless coding of correlated information sources’, IEEE Trans. Inf. Theory, 1973, 19, pp. 471–480
2 Wyner, A., and Ziv, J.: ‘The rate-distortion function for source coding with side information at the decoder’, IEEE Trans. Inf. Theory, 1976, 22, pp. 1–10
3 Girod, B., Aaron, A., Rane, S., and Rebollo-Monedero, D.: ‘Distributed video coding’, Proc. IEEE, January 2005, 93, (1), pp. 71–83
4 Steinberg, Y., and Merhav, N.: ‘On successive refinement for the Wyner-Ziv problem’, IEEE Trans. Inf. Theory, 2004, 50, (8), pp. 1636–1654