Progressive-Fidelity Image Transmission for Telebrowsing: An Efficient Implementation
M.F. López, V.G. Ruiz, J.J. Fernández, I. García*
Computer Architecture & Electronics Dpt., University of Almería, 04120 Almería, Spain.
Abstract
In this work, a progressive lossless image codec named WLPC (Wavelet based Lossless Progressive Codec) is described and compared to the lossy image compression standard P-JPEG (Progressive JPEG). An efficient multithreaded software architecture for WLPC is proposed. Objective and subjective evaluations of P-JPEG and WLPC are carried out using a set of standard images. Numerical and visual results show that, throughout the whole progressive transmission process, WLPC obtains better quality images than P-JPEG. This fact, along with the capability of providing lossless reconstructions, makes WLPC more appropriate for environments such as teleastronomy, telemicroscopy and telemedicine.
Keywords: SPIHT, image compression, image transmission, progressive image displaying, S+P wavelet transform.
1 Introduction
Researchers from scientific fields such as medicine, microscopy or astronomy, among others, are increasingly interested in using the Internet to acquire images from large instruments which are only available in a few research centers. Currently, telemedicine [2, 7], telemicroscopy [8, 14, 4, 5] and teleastronomy [9, 6] are feasible tools for the scientific community, allowing one or several remote users to control devices in real time and/or to share the images obtained from the current experiment or to consult databases of images obtained from previous experiments. Independently of the research field and the instrument to be remotely controlled, one of the problems to be solved consists of finding a mechanism for transmitting large amounts of information (usually large images) as fast as possible.
*This work was supported by the Ministry of Education of Spain (CICYT TIC99-0361).
Figure 1: Timing of non-progressive and progressive transmission of images.
Apart from using a high transmission rate, a remote visualization procedure can be accelerated by using an appropriate compressor. Most of the telemetry and remote control systems in telemedicine, teleastronomy or telemicroscopy use the standard JPEG compressor/decompressor. An image transmission is called non-progressive when every element of the transmitted data reaching the receiver contains information about only a small piece of the image. The image at the receiver is usually visualized by rows or by columns, so halfway through the transmission only half of the image can be visualized at the receiver. On the contrary, a progressive image transmission allows visualization of a full size image even when only a small part of the full information has reached the receiver; this full size image is an approximate version of the original image. The greater the amount of data received, the more similar the decompressed image is to the original one. In a progressive image transmission, every element of the transmitted data contains information for refining the image globally, instead of locally as a non-progressive transmitter does. Figure 1 shows the timing of a progressive and a non-progressive image transmission. In this model it is assumed that: (a) the compression and decompression procedures can be overlapped with the image transmission procedure; (b) the time spent to compress N bits is shorter than the time spent to transmit N bits; (c) the compression times and rates of the progressive and the non-progressive compressors are the same. Note that the compression/decompression procedures are performed in several stages along the transmission to minimize the total transmission time and the size of the memory buffers. It can be seen that, using a non-progressive compressor, a full size image can only be visualized after $t_{np} = t_c + t_I + t_d$ seconds, where $t_c$ is the compressor latency, i.e. the time spent by the compressor
to fill the sending buffer, $t_I$ is the time needed for the full compressed image to reach the receiver and $t_d$ is the decompressor latency. However, when a progressive compressor is used, a preliminary version of the full size image can be visualized at the receiver after $t_p = t_c + t_B + t_d$ seconds, where $t_B$ is the time spent to transmit the sending buffer. It should be noticed that after $t_{np}$ seconds both systems (non-progressive and progressive) are able to visualize the original image at the receiver. However, during this time the progressive model shows a sequence of $I/B$ (where $I$ is the size of the compressed image and $B$ the size of the buffer) full size images whose similarity to the original image increases along the sequence.
Progressive image transmission is a desirable feature because it provides the capability of interrupting a transmission when the quality of an image has reached an acceptable level or when the user decides that the received image is not interesting. Similarly, the receiver can make a decision based on a rough reproduction of the image and interact with the remote device to obtain a new image, or to recover a higher fidelity (or even exact) replica of only a part of the image. Nowadays, in the world of image transmission, JPEG is the most popular coding scheme for continuous-tone still images. In such environments, a progressive operating mode of JPEG (P-JPEG) [13] is more suitable when a preliminary version of the transmitted image must arrive at the receiver as soon as possible. However, P-JPEG is a lossy compressor, so when the user needs to recover the exact original image, it must be transmitted again using a lossless operating mode of some compressor (JPEG or any other). In this paper, P-JPEG as a progressive compressor is compared to a Wavelet based Lossless Progressive Codec (WLPC). Section 2 presents a brief description of WLPC, Section 3 specifies the software architecture for WLPC, and Sections 4 and 5 present results and conclusions.
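As a concrete illustration of this timing model, the following Python sketch computes $t_{np}$, $t_p$ and the number of intermediate reconstructions $I/B$ for hypothetical values of the image size, buffer size and channel rate (none of these values are taken from the experiments reported later):

def transmission_times(image_bits, buffer_bits, rate_bps, t_c, t_d):
    """Return (t_np, t_p, refinements) for the timing model of Figure 1."""
    t_I = image_bits / rate_bps               # time to transmit the whole compressed image
    t_B = buffer_bits / rate_bps              # time to transmit one sending buffer
    t_np = t_c + t_I + t_d                    # first full-size image, non-progressive case
    t_p = t_c + t_B + t_d                     # first approximate full-size image, progressive case
    refinements = image_bits // buffer_bits   # number of progressively refined images (I/B)
    return t_np, t_p, refinements

# Hypothetical numbers: a 100 KB compressed image, 4 KB buffers, a 10 kbit/s channel
# and 0.7 s codec latencies (values chosen only for illustration).
t_np, t_p, n = transmission_times(800_000, 32_000, 10_000, t_c=0.7, t_d=0.7)
print(f"non-progressive: first view after {t_np:.1f} s")
print(f"progressive:     first view after {t_p:.1f} s, followed by {n} refinements")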
2 Multiresolution Progressive Image Transmission
Transformation is a key stage in a wide spectrum of image compression techniques. Transforming an image provides a spectral representation of its information so that, in general, most of the information is concentrated in relatively few coefficients. Wavelet transforms have recently arisen as a powerful mathematical tool in many image processing applications, and specifically in image compression. One of the main distinctive features of the wavelet transform is its ability to provide a multiresolution spectral decomposition of the image in terms of a certain kernel function. This means that a wavelet decomposition allows us to build
variable resolution reconstructions where the most important objects of the image can be represented with higher resolution. On the other hand, the kernel function (in contrast to the waves of the Fourier or cosine transform) may be defined to be more suitable for representing visual information. Transform-based progressive image transmission systems are structured into two main blocks. The first is the transformation stage, which plays an important role in decorrelating and compacting the image data by means of the spectral decomposition. The second is a progressive-fidelity encoding stage, which is applied to the transform coefficients to create an efficient code-stream in such a way that the image quality is gradually improved until a perfect reconstruction is obtained.
2.1 Discrete Wavelet Transforms. The S+P Transform
In this work, the discrete wavelet transform known as the S+P transform (Sequential transform + Prediction) [3, 10, 12] has been used. This is an integer multiresolution transform which is based on the S transform and a subsequent prediction stage. The S+P transform proves to be very suitable for progressive transmission because the relevant information can be described with a few transform coefficients [12]. Moreover, the transformation can be efficiently performed due to its linear computational complexity. The S transform (Sequential transform) is an integer discrete wavelet transform which is similar to the Haar multiresolution image representation [1]. The S transform of a discrete signal $c[n]$, $n = 0, \ldots, N-1$, with an even number of samples $N$, is defined as the pair of sequences [12]:

$l[n] = \lfloor (c[2n] + c[2n+1])/2 \rfloor$,  $n = 0, \ldots, N/2 - 1$    (1)
$h[n] = c[2n] - c[2n+1]$,  $n = 0, \ldots, N/2 - 1$

where $\lfloor \cdot \rfloor$ represents downward truncation. Similarly, the inverse S transform is:

$c[2n] = l[n] + \lfloor (h[n]+1)/2 \rfloor$,  $n = 0, \ldots, N/2 - 1$
$c[2n+1] = c[2n] - h[n]$,  $n = 0, \ldots, N/2 - 1$
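As an illustration of equation (1), the following Python sketch implements the forward and inverse S transform for an even-length sequence of integer samples (an independent sketch for clarity, not the authors' implementation):

def s_transform(c):
    """Forward S transform: split an even-length integer sequence into (l, h)."""
    assert len(c) % 2 == 0, "the S transform needs an even number of samples"
    l = [(c[2 * n] + c[2 * n + 1]) // 2 for n in range(len(c) // 2)]  # truncated mean
    h = [c[2 * n] - c[2 * n + 1] for n in range(len(c) // 2)]         # difference
    return l, h

def inverse_s_transform(l, h):
    """Inverse S transform: recover the original sequence exactly from (l, h)."""
    c = [0] * (2 * len(l))
    for n in range(len(l)):
        c[2 * n] = l[n] + (h[n] + 1) // 2      # Python's // performs downward truncation
        c[2 * n + 1] = c[2 * n] - h[n]
    return c

# Round trip on a small example: the integer transform is perfectly reversible.
c = [10, 12, 7, 7, 200, 198, 0, 255]
l, h = s_transform(c)
assert inverse_s_transform(l, h) == c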
In the framework of filter bank theory, the S transform corresponds to a subband decomposition [15], where $l[n]$ and $h[n]$ are the outputs of a low-pass filter and a high-pass filter, respectively, applied to the original sequence $c[n]$. The two-dimensional (2D) transform is computed by applying the transformation (1) sequentially to the rows and columns of the image, as shown in Figure 2. As a consequence, the image is decomposed into quadrants corresponding to four subbands (low-low or ll, low-high or lh, high-low or hl, and high-high or hh). The ll subband typically contains most of the energy. As energy compaction is central to compression, the subband decomposition mechanism is applied repeatedly to the ll subband for a number of levels, as depicted in Figure 2, resulting in a hierarchical pyramid structure [12].
Figure 2: Construction of an image multiresolution pyramid.
The S transform is simple and can be computed very efficiently. Nevertheless, it still leaves a residual correlation among the high-pass coefficients. Predictive coding therefore arises as a very suitable method to further decorrelate the high-pass subbands obtained by the S transform [12]. As a consequence, the S+P transform is obtained as the combination of the S transform and a subsequent prediction stage, as described in [12].
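The row/column application of the transform and the recursion on the ll quadrant can be sketched as follows. This is a simplified NumPy illustration that omits the prediction stage of the full S+P transform and assumes image dimensions divisible by 2 raised to the number of levels:

import numpy as np

def s_rows(a):
    """Apply the 1D S transform of Eq. (1) to every row of a 2D integer array."""
    even, odd = a[:, 0::2], a[:, 1::2]
    low = (even + odd) // 2          # low-pass half (truncated mean)
    high = even - odd                # high-pass half (difference)
    return np.hstack([low, high])    # [ low | high ] along each row

def s_pyramid(image, levels):
    """Build a multiresolution pyramid: transform rows, then columns,
    and repeat the decomposition on the ll (top-left) quadrant."""
    out = np.asarray(image, dtype=np.int64).copy()
    rows, cols = out.shape
    for _ in range(levels):
        sub = s_rows(out[:rows, :cols])      # transform the rows
        sub = s_rows(sub.T).T                # transform the columns
        out[:rows, :cols] = sub
        rows, cols = rows // 2, cols // 2    # recurse on the ll subband
    return out

# Example: 3-level pyramid of a small synthetic 8x8 image.
pyramid = s_pyramid(np.arange(64).reshape(8, 8), levels=3)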
2.2 Coding and Transmission of the Coefficients. SPIHT
The underlying idea of progressive image transmission is to transmit the most important information first. The importance of a piece of information is usually evaluated in terms of a distortion measure of the reconstructed image. In wavelet-based progressive image transmission, the information to be transmitted is the set of spectral coefficients provided by the wavelet transform. The mean-squared error (MSE) is typically used as the distortion measure. Transmitting the wavelet coefficients in decreasing order of magnitude yields the minimum MSE for the reconstructed image [12]. Nevertheless, a bit-plane ordering transmission strategy has a similar behaviour (in terms of reconstruction distortion) and, furthermore, presents two additional interesting properties:
1. There is no need to order the coefficients according to their magnitude before transmitting, and thus the latency to start the transmission is reduced.
2. The reconstruction distortion is further minimized since the most significant bits are transmitted first.
On the other hand, the wavelet multiresolution representation has the property that there exists a spatial self-similarity relationship among the coefficients at different levels and subbands of the hierarchical pyramid structure of the wavelet decomposition (see Figure 2). SPIHT (Set Partitioning In Hierarchical Trees) is an efficient compression algorithm that takes advantage of this spatial similarity to compress the coefficients in a bit-plane ordering [11].
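To illustrate the idea of bit-plane ordered transmission, the following sketch encodes a small list of coefficients one bit plane at a time, most significant plane first, and shows how the reconstruction is refined as symbols arrive. It is a naive illustration only; it does not implement SPIHT's set partitioning of spatial-orientation trees or its significance/sign coding:

def bitplane_encode(coeffs):
    """Yield (plane, index, bit, sign) symbols, most significant bit plane first."""
    max_mag = max((abs(c) for c in coeffs), default=0)
    top = max_mag.bit_length() - 1
    for plane in range(top, -1, -1):
        for i, c in enumerate(coeffs):
            yield plane, i, (abs(c) >> plane) & 1, c < 0

def bitplane_decode(symbols, n):
    """Rebuild an approximation that improves as more symbols arrive."""
    approx = [0] * n
    for plane, i, bit, negative in symbols:
        if bit:
            approx[i] += -(1 << plane) if negative else (1 << plane)
        yield list(approx)              # one refined full reconstruction per received symbol

coeffs = [37, -5, 12, 0, -20, 3, 1, -9]
snapshots = list(bitplane_decode(bitplane_encode(coeffs), len(coeffs)))
assert snapshots[-1] == coeffs          # exact once every bit plane has arrived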
3 Software Architecture for Progressive Transmission
The software architecture implementing WLPC is oriented to the transmission of raw images obtained with an instrument which is remotely controlled. Our implementation of WLPC is based on the well-known client-server software architecture. A server process running the WLPC coder executes at the host acquiring the images and, at the remote host, a client process running the WLPC decoder is executed by the user. The server process accepts requests from the remote users and sends the progressively compressed images. The WLPC coder executes two sequential steps: (1) the S+P transform of the whole image and (2) the SPIHT coding and bit-stream transmission. Most of the latency of the WLPC coder is contributed by the first step. In the second step, the SPIHT coding makes intensive use of the CPU whereas the bit-stream transmission is an I/O process; both of them can run concurrently, so they may be implemented by means of two threads for an optimal execution. These threads communicate with each other through a memory buffer: the SPIHT thread writes data into the buffer and the transmitting thread reads from the buffer and writes on the client socket. The client process executes three concurrent tasks: (1) receiving the compressed bit-stream (an I/O task), (2) SPIHT decoding (a CPU task) and (3) the inverse S+P transform and displaying. All of them are implemented with threads which communicate with one another through two memory buffers. The third task is executed in an iterative way because it is desirable to make as many image reconstructions as possible while data are being received and decoded. Note that most of the latency of the WLPC decoder is contributed by one iteration of the third task.
As this architecture is based on the client-server paradigm, one parameter to be studied is the size of the data packet. In general, big packets minimize the total transmission time, although they increase the latency of the initial lossy progressive visualization. Small packets improve this initial visualization but increase the total transmission time for lossless requests. A variable packet size, tuned for the particular application and the characteristics of the communication channel, would be the most suitable choice.
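A minimal sketch of the two-thread server side described above (a CPU-bound coding thread and an I/O-bound transmitting thread communicating through a bounded memory buffer) could look as follows in Python. The function spiht_encode_chunks is a hypothetical placeholder for the actual WLPC coder, and the constants are illustrative:

import queue
import threading

BUFFER_SLOTS = 8        # capacity of the memory buffer shared by the two threads
PACKET_SIZE = 4096      # packet size in bytes (the tunable parameter discussed above)

def spiht_encode_chunks(image_bytes, packet_size):
    """Hypothetical stand-in for the SPIHT coder: yields the bit-stream in packets."""
    for i in range(0, len(image_bytes), packet_size):
        yield image_bytes[i:i + packet_size]

def coder_thread(image_bytes, buf):
    """CPU-bound producer: encodes and writes packets into the bounded buffer."""
    for packet in spiht_encode_chunks(image_bytes, PACKET_SIZE):
        buf.put(packet)                 # blocks while the buffer is full
    buf.put(None)                       # end-of-stream marker

def sender_thread(conn, buf):
    """I/O-bound consumer: reads packets from the buffer and writes them to the socket."""
    while (packet := buf.get()) is not None:
        conn.sendall(packet)

def serve(conn, image_bytes):
    """Run both threads concurrently for one client request."""
    buf = queue.Queue(maxsize=BUFFER_SLOTS)
    workers = [threading.Thread(target=coder_thread, args=(image_bytes, buf)),
               threading.Thread(target=sender_thread, args=(conn, buf))]
    for w in workers:
        w.start()
    for w in workers:
        w.join()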
4 Results
This section is intended to compare P-JPEG and WLPC as efficient tools for transmitting images. Comparisons were made using a set of standard images1 in the compression field (all the images have 512 × 512 pixels with 8 bpp). In Figure 3, PSNR values are depicted, for both progressive compressors, as a function of the time elapsed since the client performs a request. Evaluations of these progressive transmission methods were the result of simulating an image transmission through a network with a constant bit-rate of 10 Kbits/s and a channel latency of 0.5 s. The latency of the WLPC compressor/decompressor was 0.7 s. For P-JPEG, the Linux cjpeg/djpeg implementation was used, and the compression and decompression times were 0.5 s and 0.2 s, respectively. The P-JPEG quality was set to 36 to minimize the initial visualization time. The graphs in Figure 3 refer to four different standard images (lena, barb, boats and goldhill). It can be seen that the delay for receiving the first group of bits is longer for WLPC than for P-JPEG. However, from the very beginning of the data reception, the PSNR values from WLPC are greater than those from P-JPEG, and this remains so along the whole progressive transmission. Also notice that the PSNR values of WLPC always evolve in an upward trend, while those obtained from P-JPEG do not always increase; they sometimes maintain the same value for a while or even decrease. This behaviour is possibly due to the row-by-row (multi-scan) mode of progressive transmission used by P-JPEG. Although the objective comparison between WLPC and P-JPEG using the PSNR values shows that WLPC slightly outperforms P-JPEG in all the cases, a subjective comparison provides a better overview of the advantages of a WLPC transmission. In Figure 4, the lena image is shown at three different stages of the progressive transmission using P-JPEG and WLPC. Images from top to bottom are those obtained after 2.2, 2.8 and 5.13 s, respectively. To compare the quality of the images, a zoom of the right-side eye of lena has been drawn. Notice that visualizing the full size image with P-JPEG takes some time, so at the beginning of the transmission no information about the bottom part of the image is obtained. On the contrary, WLPC is able to reconstruct a variable resolution image when only the first few bits have been received.
1 http://www.ace.ual.es/~vruiz/test images.
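For reference, the PSNR measure used in the objective comparison of Figure 3 can be computed as in the following sketch (assuming 8 bpp images stored as NumPy arrays; this is not taken from the authors' code):

import numpy as np

def psnr(original, reconstruction, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two images of the same shape."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(reconstruction, dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0.0:
        return float("inf")             # identical images (lossless reconstruction)
    return 10.0 * np.log10(peak ** 2 / mse)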
[Four plots, one per test image (lena, barb, boats, goldhill): PSNR (dB) versus time (s) for WLPC and P-JPEG.]
Figure 3: Objective comparison between P-JPEG and WLPC: PSNR (dB) vs transmission time (in seconds) for four different images.
5 Conclusions and Future Work
The results have shown that WLPC has three main advantages compared to P-JPEG: (1) from the very beginning of the transmission, a full size image can be displayed; (2) better PSNR values are obtained; and (3) at the end of the transmission the exact image is recovered. However, from our point of view, an improved version of WLPC could reduce the latency of the codec by overlapping the computation of the wavelet transform with the encoding/decoding procedure. Taking into account that the bottleneck is the channel, better compression ratios, and thus faster transmission, can be obtained when more powerful progressive encoders are used.
References
[1] E.H. Adelson, E. Simoncelli, and R. Hingorani. Orthogonal Pyramid Transforms for Image Coding. Proc. SPIE, 845:50–58, 1987.
Figure 4: Subjective comparison between P-JPEG (left) and WLPC (right) at different transmission instants (2.2, 2.8 and 5.13 s) for the lena image.
[2] R. Bashshur. On the definition and evaluation of telemedicine. Telemedicine, 1:19–30, 1995.
[3] H. Blume and A. Fand. Medical Imaging III: Image Capture and Display. Proc. SPIE of Medical Imaging, 1091:2–2, 1989.
[4] K. Burton and D.L. Farkas. Net Progress. Nature, 391:540–541, 1998.
[5] M. Hadida-Hassan, S.J. Young, S.T. Peltier, M. Wong, S. Lamont, and M.H. Ellisman. Web-based Telemicroscopy. J. Struct. Biol., 125:235–245, 1998.
[6] R. Kibrick, A. Conrad, and A. Perala. Through the Far Looking Glass: Collaborative Remote Observing with the W.M. Keck Observatory. ACM Interactions Magazine, 3:32–39, 1998.
[7] F.K. Mathiesen. Web Technology - The Future of Teleradiology? Comp. Meth. Prog. Biomed., 66(1):87–90, 2001.
[8] R. Maturo, G. Kath, R. Zeigler, and P. Meechan. Control of a Remote Microscope Over the Internet. Biocomputing, 22(6), 1997.
[9] G.M. Olson, D.E. Atkins, R. Clauer, T.A. Finholt, F. Jahanian, T.L. Killen, A. Prakash, and T. Weymouth. The Upper Atmospheric Research Collaboratory. ACM Interactions Magazine, 3:48–55, 1998.
[10] A. Said and W.A. Pearlman. Reversible Image Compression via Multiresolution Representation and Predictive Coding. In Proc. SPIE Conf. Visual Comm. and Image Proc., volume 2094, pages 664–674, 1993.
[11] A. Said and W.A. Pearlman. A New Fast and Efficient Image Codec Based on Set Partitioning in Hierarchical Trees. IEEE Trans. Circuits and Syst. for Video Technol., 6:243–250, 1996.
[12] A. Said and W.A. Pearlman. An Image Multiresolution Representation for Lossless and Lossy Compression. IEEE Trans. on Image Proc., 5(9):1303–1310, 1996.
[13] G.K. Wallace. The JPEG Still Picture Compression Standard. Communications of the ACM, 34(4):30–40, 1991.
[14] G. Wolf, D. Petersen, M. Dietel, and I. Petersen. Telemicroscopy via the Internet. Nature, 391:613–614, 1998.
[15] J.W. Woods and S.D. O'Neil. Subband Coding of Images. IEEE Trans. Acoustics, Speech and Signal Proc. (ASSP), 34(5):1278–1288, 1986.