ISSN 1064-2269, Journal of Communications Technology and Electronics, 2008, Vol. 53, No. 6, pp. 642–650. © Pleiades Publishing, Inc., 2008. Published in Russian in Radiotekhnika i Elektronika, 2008, Vol. 53, No. 6 pp. 676–685.

THEORY AND METHODS OF SIGNAL PROCESSING

A Modified-Set Partitioning in Hierarchical Trees Algorithm for Real-Time Image Compression¹

M. Akter(a), M. B. I. Reaz(b), F. Mohd-Yasin(a), and F. Choong(a)

(a) Faculty of Engineering, Multimedia University, 63100 Cyberjaya, Selangor, Malaysia
(b) Department of Electrical and Computer Engineering, International Islamic University Malaysia, 53100, Gombak, Kuala Lumpur, Malaysia
e-mail: [email protected]
Received January 31, 2007

Abstract—Among all algorithms based on the wavelet transform and zerotree quantization, Said and Pearlman's set partitioning in hierarchical trees (SPIHT) algorithm is well known for its simplicity and efficiency. However, SPIHT's high memory requirement is a major drawback for hardware implementation. In this study, we present a modification of SPIHT, named modified SPIHT (MSPIHT), which requires less execution time at a low bit rate and less working memory than SPIHT. MSPIHT uses one list to store the coordinates of wavelet coefficients instead of the three lists of SPIHT; defines two terms, the number of error bits and the absolute zerotree; and merges the sorting pass and the refinement pass into one scan pass. Comparison of MSPIHT with SPIHT on different test images shows that, for a 512 × 512 grayscale image, MSPIHT reduces the coding time by a factor of up to 7 and the decoding time by a factor of up to 11 at a low bit rate; saves at least 0.5625 MB of memory; and causes only a minor reduction in the peak signal-to-noise ratio (PSNR), thereby making it highly promising for real-time and memory-limited mobile communications.
PACS numbers: 07.05.Pj, 42.30.-d
DOI: 10.1134/S1064226908060065

INTRODUCTION

Image compression is now widely used to save transmission time and storage space. Among the various compression techniques, transform coding is a favorite. In the past decade, the discrete cosine transform (DCT) [1] has been the most popular transform because it provides near-optimal performance and can be implemented at a reasonable cost. Recently, however, the discrete wavelet transform (DWT) [2] has been widely used because it avoids the blocking effect introduced by the DCT and is well suited to multiresolution analysis. By taking advantage of the DWT, the zerotree coding technique has been shown to be not only computationally simple but also very effective in compression. In addition, its embedded coding property is beneficial to progressive transmission.

Xiong et al. [3] developed a very efficient space-frequency quantization scheme that uses a rate–distortion criterion to jointly optimize zerotree quantization and scalar frequency quantization. The original zerotree algorithm, the embedded zerotree wavelet (EZW) coder, was introduced by Shapiro [4]. Said and Pearlman [5] further enhanced the performance of EZW by presenting a more efficient and faster implementation called set partitioning in hierarchical trees (SPIHT). SPIHT is one of the best algorithms in terms of the peak signal-to-noise ratio (PSNR) and execution time. Knipe et al. [6] proposed a vector-quantization-based SPIHT named VSPIHT. Later, Mukherjee and Mitra [7] studied VSPIHT extensively by using successive-refinement Voronoi lattice vector quantization, which yields a reconstructed image of good quality in terms of the PSNR. However, this output quality comes at the cost of a more complex vector quantizer (VQ) design and more memory to store the codebook.

¹ The text was submitted by the authors in English.

The MPEG-4 standard uses zerotree coding in its visual texture mode [8] because an EZW-based coder provides efficient coding of still images and visual textures. An EZW-based coder can also provide spatial and quality scalabilities, which are desired functionalities of the MPEG-4 standard [9] and the JPEG2000 standard [10–12]. Given the usefulness of this coder, the classified zerotree wavelet (CZW) [13], the listless zerotree coder (LZC) [14], and the low-memory zerotree coder (LMZC) [15] coding algorithms have been developed on the basis of the zerotree theory [16].

One way to reduce the memory requirement is to use other coding algorithms, for example, embedded block coding with optimized truncation of the embedded bit streams (EBCOT) [17] and space-frequency decomposition (SFD) [18]. In EBCOT, each subband is partitioned into relatively small blocks of samples, called code blocks, and a separate, highly scalable bit stream is generated for each code block. Since relatively small code blocks can be compressed independently, EBCOT needs only a small buffer for rate–distortion optimization. However, a large buffer is required for global optimization because a rate-control scheme has to decide which coding passes of a code block are included in the output bit stream [11]; otherwise, the rate–distortion information for each truncation point of the coding passes of each code block must at least be saved. In SFD, the wavelet tree is trimmed so that the working memory can be reduced. However, both algorithms are based on rate–distortion optimization; therefore, a modest amount of computation is required to find the optimal operating point on the rate–distortion curve.

The previous zerotree coders, in turn, generally require lists to store the states of coefficients and the coordinates of partitioning sets during coding, which leads to a high memory requirement and a high cost of hardware realization. For example, SPIHT requires three lists: the list of significant pixels (LSP), the list of insignificant pixels (LIP), and the list of insignificant sets (LIS). For a 512 × 512 grayscale image, each list entry requires at least 18 bits to store the coordinates: 9 bits for the column index in the range 0 to 511 and 9 bits for the row index. Since the total number of list entries is approximately twice the total number of coefficients, the total memory required is 1.125 MB. If 4-byte integer variables are used to store the coordinates, about 2 MB is required. In addition to this high memory requirement, another drawback of SPIHT is that the number of entries increases as the coded bit rate increases.
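The memory figures quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and parameters are hypothetical, 1 MB is taken as 2^20 bytes, and the 2 MB case assumes that one 4-byte integer holds a complete list entry.

```python
import math

def list_memory_bytes(rows, cols, entries_per_coeff=2.0, bits_per_entry=None):
    """Estimate the memory needed to store coordinate lists.

    By default an entry is packed into ceil(log2(rows)) + ceil(log2(cols)) bits,
    and the lists are assumed to hold about two entries per coefficient.
    """
    if bits_per_entry is None:
        bits_per_entry = math.ceil(math.log2(rows)) + math.ceil(math.log2(cols))
    entries = entries_per_coeff * rows * cols
    return entries * bits_per_entry / 8

MB = 1024 * 1024

# 18-bit packed entries (9 bits per coordinate for a 512 x 512 image)
packed_mb = list_memory_bytes(512, 512) / MB
# Assumption: each entry is stored in one 4-byte (32-bit) integer
int_mb = list_memory_bytes(512, 512, bits_per_entry=32) / MB

print(f"packed entries: {packed_mb} MB, 4-byte entries: {int_mb} MB")
```

Under these assumptions, the packed case gives 1.125 MB and the 4-byte case gives 2 MB, in agreement with the estimates in the text.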
Enough memory must therefore be provided for coding at various bit rates. The three temporary lists (LIP, LSP, and LIS) are a powerful way to improve the codec's efficiency, but their high memory consumption is the major drawback of SPIHT. In addition, elements are frequently inserted into and deleted from the lists during the coding process, and these operations greatly increase the coding time as the lists grow. To implement the SPIHT algorithm in hardware, a successful low-memory solution must be provided. Hence, a modified SPIHT (MSPIHT) algorithm is proposed in this work to overcome the drawbacks of the original SPIHT algorithm.

The proposed MSPIHT algorithm modifies SPIHT with two concepts: the number of error bits and the absolute zerotree. The number of error bits, defined before encoding, indicates the number of bits that are finally omitted. During coding, when a wavelet coefficient is found to be significant or insignificant, its last error bits are omitted, while the rest of its bits are output. In addition, the coordinates of the coefficient are not stored in the LSP and LIP for further processing. Therefore, MSPIHT becomes a low-memory alternative to the SPIHT algorithm by eliminating the temporary lists LSP and LIP.

Moreover, in SPIHT coding, the coordinates of the zerotree roots are stored in the LIS and are not removed, which results in rapid expansion of the LIS. The absolute zerotree is a good solution to this problem. If the magnitudes of all descendants of a zerotree are lower than a fixed value, the zerotree becomes an absolute zerotree and can never be significant in the later scan passes. The coordinates of absolute zerotrees are not stored in the LIS, so the LIS is shortened. As a result, the processing time decreases and meets real-time requirements. The decoder follows the same execution path, and the inverse DWT reconstructs the image from the decoded data.

This paper is organized as follows. The following section briefly describes the SPIHT algorithm, together with the wavelet transform and bit-plane coding, for completeness. Section 3 addresses the coding algorithm of MSPIHT. In Section 4, the implementation and experimental results are presented. Finally, concluding remarks are made in Section 5.

CODING METHODOLOGY

The SPIHT algorithm is very efficient in transmitting ordering information and essentially involves a scalar quantization operation. The essence of set partitioning is to first classify the elemental coding units on the basis of their magnitude and then to quantize them in a successive-refinement framework. The elemental coding units are scalar wavelet coefficients.

Wavelet Transformed Images

When embedded zerotree-like algorithms are applied, a wavelet transform is first performed on the image. The result is a multiscale representation. The transform reduces the correlation between neighboring pixels, and the energy of the original image is concentrated in the lowest frequency band of the transformed image.
Additionally, self-similarities between different scales, which result from the recursive application of the wavelet transform step to the low-frequency band, can be observed. On the basis of these facts, good compression performance can be achieved if the first-transmitted coefficients represent most of the image energy.

Fig. 1. Bit-plane coding. (Bar chart of coefficient bits by bit plane; axes: bit, coefficient magnitude.)

The 2D fast wavelet transform of a discretely sampled image is computed with the same scheme as in the 1D case. First, each row of the image array is decomposed, which results in an image whose horizontal resolution is reduced by a factor of 2 and whose scale is doubled. The high-pass (wavelet filter) component of the decomposition characterizes the high-frequency information with horizontal orientation. Next, the high-pass and low-pass subimages obtained by the row decomposition are each filtered columnwise to obtain four subimages corresponding to low-low-pass, low-high-pass, high-low-pass, and high-high-pass row–column filtering. A two-band perfect-reconstruction filter bank should satisfy the equations

$$ g[n] = (-1)^{n}\,\tilde{h}[1-n] \quad \text{and} \quad \tilde{g}[n] = (-1)^{n}\,h[1-n], \tag{1} $$

where $g[n]$, $h[n]$, $\tilde{g}[n]$, and $\tilde{h}[n]$ represent the analysis high-pass filter, the analysis low-pass filter, the synthesis high-pass filter, and the synthesis low-pass filter, respectively.
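The row-then-column scheme described above can be sketched as follows. This is a minimal illustration, not the transform used in the experiments: it assumes the two-tap orthonormal Haar filters as a stand-in for a longer filter bank, ignores boundary extension, and uses hypothetical function names. The subband keys follow the row–column order of the sentence above (for example, "LH" means low-pass rows and high-pass columns); whether this matches the HLi/LHi labels of a particular codec is an assumption.

```python
import math

def haar_split(signal):
    """Split an even-length 1D sequence into low-pass and high-pass halves."""
    s = 1.0 / math.sqrt(2.0)
    low = [(signal[k] + signal[k + 1]) * s for k in range(0, len(signal), 2)]
    high = [(signal[k] - signal[k + 1]) * s for k in range(0, len(signal), 2)]
    return low, high

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def column_pass(subimage):
    """Filter every column and return the (low, high) subimages."""
    lows, highs = [], []
    for col in transpose(subimage):
        lo, hi = haar_split(col)
        lows.append(lo)
        highs.append(hi)
    return transpose(lows), transpose(highs)

def dwt2_one_level(image):
    """One decomposition level: filter the rows first, then the columns."""
    row_low, row_high = [], []
    for row in image:
        lo, hi = haar_split(row)
        row_low.append(lo)    # horizontal resolution is halved
        row_high.append(hi)
    ll, lh = column_pass(row_low)
    hl, hh = column_pass(row_high)
    return {"LL": ll, "LH": lh, "HL": hl, "HH": hh}

def energy(matrix):
    return sum(v * v for row in matrix for v in row)

# A smooth test image: constant over each 2 x 2 block
image = [[10, 10, 12, 12],
         [10, 10, 12, 12],
         [20, 20, 22, 22],
         [20, 20, 22, 22]]
bands = dwt2_one_level(image)
for name, band in bands.items():
    print(name, [[round(v, 3) for v in row] for row in band])
```

For this smooth test image, all of the energy ends up in the low-low subband, which illustrates why transmitting the low-frequency coefficients first is effective. Applying `dwt2_one_level` again to `bands["LL"]` would give the next pyramid level.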

Fig. 2. Examples of the SPIHT tree structure in a typical three-scale pyramidal decomposition of an image (subbands LL3, HL3, LH3, HH3, HL2, LH2, HH2, HL1, LH1, and HH1, arranged in Layer1 to Layer4). The arrows are oriented from the parent node to its offspring.

Bit-Plane Coding

The overall encoding procedure is basically a kind of bit-plane coding, in which the kth bits of the coefficients constitute a bit plane. In general, a bit-plane encoder starts coding with the most significant bit of each coefficient. When all those bits are coded, the next bit plane is considered, until the least significant bits are reached. Within a bit plane, the bits of the coefficients with the largest magnitude come first. This ordering is illustrated in Fig. 1 as a bar chart [19]. The coefficients are shown in decreasing order from left to right. Each coefficient is represented with eight bits, where the least significant bit is in front. As a consequence, such a bit-plane coding scheme encodes first the information that is most important in terms of compression efficiency. On the other hand, the ordering information, which can reduce the compression effect, has to be stored in the system memory or transmitted via a channel.

SPIHT Coder

For completeness, we briefly introduce the SPIHT coding algorithm in this section. More details of SPIHT can be found in [5]. Figure 2 illustrates the wavelet tree structure of a typical three-scale pyramidal decomposition of an image. The image is generated by three stages of a 2D DWT. The notations LLi, HLi, LHi, and HHi denote the output channels from the ith stage. The parent–offspring dependency of the tree structures is also shown. Each node has either four offspring or no offspring. The nodes without any offspring are located on Layer1 (i.e., the bands HL1, LH1, and HH1), and some of them are located on the highest layer. (One of them is indicated with the * in Fig. 2.) The main idea is based on the partitioning of sets, which consist of coefficients or representatives of entire subtrees. The coefficients of a wavelet-transformed image are classified into three sets:
(i) the LIP (list of insignificant pixels), which contains the coordinates of the coefficients that are insignificant with respect to the current threshold th,

(ii) the LSP (list of significant pixels), which contains the coordinates of the coefficients that are significant with respect to the threshold, and
(iii) the LIS (list of insignificant sets), which contains the coordinates of the roots of insignificant subtrees.

During the compression procedure, the sets of coefficients in the LIS are refined and, if coefficients become significant, they are moved from the LIP to the LSP.

We call a node (i.e., a transformed coefficient) C(i, j) on a coarse scale a parent. All nodes on the next finer scale with the same spatial location and with similar orientation are called children; this set is denoted O(i, j). More precisely,

O(i, j) = {C(2i, 2j), C(2i, 2j + 1), C(2i + 1, 2j), C(2i + 1, 2j + 1)},

except for the nodes at the highest layer (Layer4 in Fig. 2) and the lowest layer (Layer1 in Fig. 2). All nodes at all finer scales with the same spatial location and with similar orientation are called descendants, denoted D(i, j). The set L(i, j) is defined as L(i, j) = D(i, j) – O(i, j), and the set H is the group of coordinates of all the tree roots (nodes in the highest layer). In addition, we refer to a node or a set as significant if the significance test

$$ S_n(X(i,j)) = \begin{cases} 1, & \text{if } \max\limits_{C(k,l) \in X(i,j)} |C(k,l)| \ge 2^{n}, \\ 0, & \text{otherwise,} \end{cases} \tag{2} $$

returns 1, where X(i, j) represents C(i, j), D(i, j), or L(i, j). This equation indicates that the significance test result is 1 if the coefficient with the maximum magnitude in a set is significant. If D(i, j) is significant, it is partitioned into O(i, j) and L(i, j), if L(i, j) exists. If D(i, j) is not significant, it is a zerotree of type A. If L(i, j) is significant, it is partitioned into {D(2i, 2j), D(2i, 2j + 1), D(2i + 1, 2j), D(2i + 1, 2j + 1)}, except for the coordinates in the highest layer. Otherwise, it is a zerotree of type B. If we encounter a zerotree, we code it as a single zerotree symbol and avoid coding all its nodes.

The nodes are scanned in order of importance, so that no child is scanned before its parent. Therefore, scanning begins with the nodes C(i, j) for (i, j) ∈ H and the sets D(i, j) for (i, j) ∈ H. The result of the significance test for a node or for a set is coded. In addition, for each significant node C(i, j), its sign bit is also coded. The process begins with setting the LSP as an empty list, adding the coordinates (i, j) ∈ H to the LIP, adding those with descendants to the LIS as type A entries, and outputting the maximum value of n, which is obtained from

$$ n = \left\lfloor \log_2 \max_{(i,j)} |C(i,j)| \right\rfloor. \tag{3} $$

Then, two passes, the sorting pass and the refinement pass, are performed for every value of n. In the sorting pass, we scan each C(i, j) in the LIP and each D(i, j) or L(i, j) in the LIS, extract the significant nodes, and append them to the end of the LSP. In the refinement pass, another bit of precision is added to the magnitudes of the nodes in the LSP. We then decrease n by one, i.e., cut the threshold in half, and repeat the two passes, beginning with the sorting pass, until the bit budget is exhausted.

The algorithm described above does not consider the statistical dependence between adjacent nodes and between adjacent sets. To increase the coding efficiency, the significance values of 2 × 2 adjacent nodes (the nodes with the same parent) are grouped and coded as a single symbol by the arithmetic-coding algorithm.

In general, the decoder duplicates the execution path of the encoder, as is the case in Shapiro's algorithm. To ensure this behavior, the coder sends the result of a binary decision to the decoder before a branch is taken in the algorithm. Thus, all decisions of the decoder are based on the received bits. The name of the algorithm is composed of the words set and partitioning; the sets O(i, j), D(i, j), and L(i, j) have been mentioned already. Another advantage of SPIHT in comparison to the EZW algorithm is that the complete output of the encoder is binary. Therefore, the target compression rate can be achieved exactly, and arithmetic encoders for binary alphabets can be applied to further increase the compression efficiency.
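The set definitions and Eqs. (2) and (3) can be made concrete with a small sketch. It is a simplified illustration under stated assumptions: the coefficients sit in a square array, the function names are hypothetical, and the special parent–offspring rule of the highest layer is not modeled, so the helpers should not be called for the root at (0, 0).

```python
import math

def offspring(i, j, size):
    """O(i, j): the four children of node (i, j), or [] on the finest layer."""
    if 2 * i >= size or 2 * j >= size:
        return []
    return [(2 * i, 2 * j), (2 * i, 2 * j + 1),
            (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]

def descendants(i, j, size):
    """D(i, j): all nodes below (i, j), collected layer by layer."""
    found, frontier = [], offspring(i, j, size)
    while frontier:
        found.extend(frontier)
        frontier = [c for (a, b) in frontier for c in offspring(a, b, size)]
    return found

def significance(coeffs, nodes, n):
    """S_n of Eq. (2): 1 if any listed coefficient has magnitude >= 2**n."""
    peak = max((abs(coeffs[a][b]) for (a, b) in nodes), default=0)
    return int(peak >= 2 ** n)

def initial_n(coeffs):
    """Eq. (3): floor(log2) of the largest coefficient magnitude."""
    peak = max(abs(c) for row in coeffs for c in row)
    return int(math.floor(math.log2(peak)))

# Toy 8 x 8 coefficient array with a few nonzero entries
size = 8
coeffs = [[0] * size for _ in range(size)]
coeffs[0][0] = 63
coeffs[1][1] = -34
coeffs[2][3] = 10   # a child of (1, 1)
coeffs[5][6] = 3    # a grandchild of (1, 1)

o_set = offspring(1, 1, size)
d_set = descendants(1, 1, size)
l_set = [p for p in d_set if p not in o_set]   # L(i, j) = D(i, j) - O(i, j)

n = initial_n(coeffs)
print("initial n:", n)
print("S_n(C(1,1)):", significance(coeffs, [(1, 1)], n))
print("S_n(D(1,1)):", significance(coeffs, d_set, n))
```

With these toy values, the node (1, 1) is significant in the first pass while its descendant set is still a zerotree of type A; D(1, 1) becomes significant only when n has dropped to 3.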

MODIFIED-SPIHT CODER

In SPIHT, the use of three temporary lists is a powerful way to improve the codec's efficiency. However, the lists are quite memory consuming, which is a major drawback of the SPIHT algorithm. In addition, elements are often inserted into or deleted from the lists during coding, and these frequent operations greatly increase the coding time as the lists grow. To realize the SPIHT algorithm in real time for mobile communications, a fast, low-memory solution must be provided. In the MSPIHT algorithm, the sorting pass and the refinement pass are combined into one scan pass, and the coordinates of wavelet coefficients are never stored in an LSP or LIP; there are no such lists in this algorithm. Below we present the two concepts used to modify the original SPIHT algorithm: the absolute zerotree and the number of error bits.

After wavelet decomposition, most of the significant coefficients are concentrated in the low-pass subbands. Moreover, the magnitudes of the transform coefficients decrease rapidly as the pyramid level decreases. Through extensive experiments, we found that the coefficients in many sets are so small that these trees always remain zerotrees before the expected compression ratio is reached. In SPIHT coding, the coordinates of these zerotree roots are stored in the LIS and are never removed. As a result, the LIS expands rapidly. The introduction of an absolute zerotree is a simple solution to this problem. We define $\mu_e$ as the number of truncated error bits. If the magnitudes of all descendants of a zerotree are lower than $2^{\mu_e}$, it becomes an absolute zerotree and will never be significant in the later scan passes. The coordinates of absolute zerotrees need not be stored in the LIS, so the LIS is shortened. Because absolute zerotrees are no longer scanned, the coding time is likewise greatly reduced. The notation Zero indicates whether a set is an absolute zerotree: the value 0 means that the set is an absolute zerotree, and the value 1 means that it is not:

$$ \mathrm{Zero}(X(i,j)) = \begin{cases} 1, & \text{if } \max\limits_{C(k,l) \in X(i,j)} |C(k,l)| \ge 2^{\mu_e}, \\ 0, & \text{otherwise.} \end{cases} \tag{4} $$
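The absolute-zerotree test can be sketched in a few lines. The sketch follows the convention stated in the text (the value 0 marks an absolute zerotree); the function names, the dictionary layout, and the toy magnitudes are hypothetical and serve only to show how such sets could be kept out of the LIS.

```python
def zero(magnitudes, mu_e):
    """Zero test: 0 if every magnitude is below 2**mu_e (absolute zerotree), else 1."""
    peak = max((abs(m) for m in magnitudes), default=0)
    return int(peak >= 2 ** mu_e)

def prune_lis(candidate_sets, mu_e):
    """Keep only the roots whose descendant sets are not absolute zerotrees."""
    return [root for root, mags in candidate_sets.items() if zero(mags, mu_e) == 1]

# Hypothetical descendant coefficients for three candidate roots
candidate_sets = {
    (1, 1): [10, 3, 0, -2],
    (1, 2): [7, -5, 1, 0],
    (2, 1): [0, 0, 0, 0],
}

mu_e = 3  # coefficients below 2**3 = 8 lie entirely within the omitted error bits
kept = prune_lis(candidate_sets, mu_e)
print("roots kept in the LIS:", kept)
```

In this toy case only the root (1, 1) stays in the list; the other two sets can never contribute a bit above the truncation level, so storing and rescanning them would be wasted work.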

In Eq. (4), X(i, j) represents C(i, j), D(i, j), or L(i, j).

The number of error bits, defined before encoding, indicates the number of bits that are finally omitted. During coding, when a wavelet coefficient is found to be significant or insignificant, its last error bits are omitted and the rest of its bits are output. In addition, the coordinates of the coefficient are not stored in an LSP or LIP for further processing. Therefore, MSPIHT becomes a low-memory version of the SPIHT algorithm by eliminating the temporary lists LSP and LIP. The entries of the LIS are of two types. Type A entries contain pixel coordinates (i, j) for which all the elements of the set D(i, j) are insignificant, while type B entries contain pixel coordinates (i, j) for which all the elements of the set L(i, j) are insignificant. In MSPIHT, the sorting and refinement passes are combined to reduce the execution time.

Before coding, MSPIHT initializes the number of error bits. The initial value of n is set to the index of the most significant bit (MSB) of the largest wavelet-coefficient magnitude. The coding process starts by adding the coordinates (i, j) ∈ H that have descendants to the LIS as type A entries; for each (i, j) ∈ H, the last error bits of C(i, j) are omitted, and the rest of the bits of that pixel and its sign bit are output. In the first coding pass, the magnitude and sign information of the coefficients is coded. For simplicity, we assume that each coefficient has four offspring. The adjacent coefficients with the same parent are grouped and coded together to remove redundancy. The same scheme is used to code the significance of adjacent sets.

MSPIHT begins by coding the coefficients in the highest layer (i.e., LL3 in Fig. 2). For each coordinate (i, j), it codes the individual coefficient C(i, j) and the descendant set D(i, j). Whenever it codes C(i, j), it checks whether C(i, j) became significant in a previous coding pass. After coding C(i, j), MSPIHT checks the significance of D(i, j): if D(i, j) was insignificant in the previous coding pass, it evaluates Sn(D(i, j)). If Sn(D(i, j)) = 1, the set D(i, j) is partitioned into O(i, j) and L(i, j), if L(i, j) exists; otherwise, the entry (i, j) is removed from the LIS. Each entry (k, l) of O(i, j) undergoes the significance test with respect to the threshold of the current coding pass. If Zero(k, l) = 1 according to Eq. (4), the coder outputs a 1 bit and the sign bit. If not, it treats the entire branch as an absolute zerotree and finishes the entire branch. If L(i, j) exists and is significant, (i, j) is moved to the end of the LIS as an entry of type B; otherwise, the entry (i, j) is removed from the LIS. If L(i, j) is insignificant, the entry is of type B. The coder then checks whether Sn(L(i, j)) is significant. If it is, L(i, j) is partitioned into the sets D(k, l). For each entry (k, l), the coder outputs Zero(D(k, l)). If Zero(D(k, l)) = 1, the entry (k, l) is added to the LIS as an entry of type A. Otherwise, the entry (i, j) is removed from the LIS. Afterward, for the next pass, the value of n is decreased by 1. The process continues until the bit budget is exhausted.

IMPLEMENTATION AND EXPERIMENTAL RESULTS

This section presents an implementation of the MSPIHT coder and its performance in terms of compression and visual quality. In this set of experiments, SPIHT is chosen as the reference zerotree-based image coder, and the MSPIHT coder developed here is applied without arithmetic coding. Among the class of zerotree-based encoders, this algorithm shows excellent performance without vulnerable arithmetic coding. A 26-bit header is stored in each coded file to record the image width (9 bits), the image height (9 bits), the number of decomposition scales (4 bits), and the initial threshold value (4 bits).
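A 26-bit header with these field widths could be packed as in the sketch below. Only the field widths come from the text; the field order, the function names, and the choice to store width and height minus one (so that 512 fits into 9 bits) are assumptions made for illustration.

```python
# Field widths in bits, as listed in the text: 9 + 9 + 4 + 4 = 26
FIELD_BITS = (9, 9, 4, 4)

def pack_header(width, height, scales, threshold_index):
    """Pack the four header fields into one 26-bit integer.

    Assumption: width and height are stored minus one, so values 1..512 fit in 9 bits.
    """
    values = (width - 1, height - 1, scales, threshold_index)
    word = 0
    for value, bits in zip(values, FIELD_BITS):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"value {value} does not fit in {bits} bits")
        word = (word << bits) | value
    return word

def unpack_header(word):
    """Inverse of pack_header: recover (width, height, scales, threshold_index)."""
    threshold_index = word & 0xF
    scales = (word >> 4) & 0xF
    height = ((word >> 8) & 0x1FF) + 1
    width = ((word >> 17) & 0x1FF) + 1
    return width, height, scales, threshold_index

header = pack_header(512, 512, 6, 13)
print(f"{header:026b}")
print(unpack_header(header))
```

The example values (six decomposition scales, threshold index 13) are placeholders; the round trip through `unpack_header` returns the same four fields.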
The wavelet pyramids are constructed with a 9/7 filter bank, and 512 × 512 grayscale images with 8 bpp are used in the experiments. Six-level decompositions are constructed with symmetric extension at the image edges. The 9/7 filter coefficients are listed in Table 1.

Table 1. Filter coefficients of the 9/7 filter: h[n] = h[–n] and g[n] = g[–n]

        n = 0        n = 1        n = 2        n = 3        n = 4
h[n]    0.8526987    0.3774027   –0.1106240   –0.0238493    0.0378288
g[n]   –0.7884849    0.4180924    0.0406898   –0.0645391    –

We compare our algorithm with the original SPIHT algorithm according to four criteria: the PSNR of the reconstructed image, the compression ratio (CR), the memory used to store the lists, and the CPU time of both the coding and decoding processes (not including the time of wavelet decomposition). We analyze the algorithm at low bit rates because, at higher bit rates, both the file size and the processing time are high and do not meet the real-time condition for image transmission.

SPIHT is taken as the reference against which the reconstructed image quality of the proposed algorithm is compared in terms of the PSNR. The visual quality of the compressed image is evaluated at the different bit rates obtained with the MSPIHT algorithm. It is important to observe that the bit rates are not entropy estimates; they are calculated from the actual size of the compressed file. By virtue of progressive transmission, the set of distortions is obtained from the same file: the decoder reads the first bytes of the file up to the desired rate, calculates the inverse subband transform, and then compares the recovered image with the original. The distortion is measured by the PSNR as

$$ \mathrm{PSNR} = 10 \log_{10}\!\left( \frac{255^{2}}{\mathrm{MSE}} \right)\ \text{dB}, \tag{5} $$

where MSE denotes the mean-square error between the original and reconstructed images.

In this study, four very popular test images are used to measure the rate–distortion performance of the reconstructed image: Lena, Goldhill, Boat, and Mandrill. Figure 3 compares the rate–distortion performance of the proposed MSPIHT algorithm with that of the original SPIHT. For the Mandrill and Boat images, the PSNR values of MSPIHT are lower than those of SPIHT by 0.27 dB and 0.57 dB on average, respectively. This amount of error is acceptable for real-time applications.
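The PSNR of Eq. (5) can be computed as in the following sketch. It assumes 8-bit images given as nested lists of equal size; the function names and the tiny 2 × 2 test data are hypothetical.

```python
import math

def mse(original, reconstructed):
    """Mean-square error between two images given as equally sized nested lists."""
    total, count = 0.0, 0
    for row_o, row_r in zip(original, reconstructed):
        for a, b in zip(row_o, row_r):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(original, reconstructed, peak=255):
    """PSNR in dB as in Eq. (5); returns infinity for identical images."""
    err = mse(original, reconstructed)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

original = [[100, 110], [120, 130]]
reconstructed = [[101, 108], [120, 133]]

print("MSE :", mse(original, reconstructed))
print("PSNR:", round(psnr(original, reconstructed), 2), "dB")
```

Because the PSNR is logarithmic in the MSE, an average difference of a few tenths of a decibel, such as the 0.27 dB and 0.57 dB reported above, corresponds to a change in MSE of roughly 6% to 14%.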

Fig. 3. Comparison of different coding algorithms in terms of PSNR values for the (a) Lena, (b) Goldhill, (c) Boat, and (d) Mandrill images. (Plots of PSNR in dB versus bit rate from 0 to 0.50 bpp for MSPIHT and SPIHT.)

In real-time applications, processing time and output file size are more important than a small loss in PSNR, because the visual quality of the reconstructed image does not show much difference. Among the class of zerotree coders, this algorithm shows excellent performance without vulnerable arithmetic coding. In fact, arithmetic coding, in general, can improve the rate–distortion performance by 1.0 dB, but only by about 0.4–0.6 dB in SPIHT. The desirable advantage of MSPIHT over SPIHT is the possibility of avoiding arithmetic coding by tolerating an additional 0.57 dB degradation in PSNR. In a noisy channel, this entropy coding could cause difficulties in decoding. MSPIHT uses the same strategy as SPIHT to code the states of adjacent pixels and adjacent sets together, so the computational complexity of MSPIHT and SPIHT is the same from this point of view. In addition, the CR and the coded file size of the two algorithms are equal because the same maximum number of bits of the coded image is used to control the coding process. Table 2 compares the coded file size in bytes and the CR for the Lena image. The file size of the coded image is in the byte range, thus allowing transmission of the entire image in only a few packets.

Table 2. Coded file size and CR at different bit rates for the Lena image (512 × 512 8-bit grayscale image)

Bit rate    File size in bytes      Compression ratio (CR)
(bpp)       MSPIHT      SPIHT       MSPIHT      SPIHT
0.0075      246         246         1066        1066
0.0154      504         504         519         519
0.0380      1245        1245        210         210
0.0625      2048        2048        128         128
0.0879      2880        2880        91          91

The proposed MSPIHT algorithm saves 0.5625 MB of working memory with respect to the original SPIHT, thus leading to low-cost hardware implementation. The detailed calculation is given below:

$$ T_{le} = 2\,T_{bits}, \qquad T_{bits} = (\log_2 N + \log_2 M)\, N M P_{bytes}, \tag{6} $$

where $T_{le}$ is the total number of list entries, $T_{bits}$ is the total number of bits for all coefficients, N is the number of image pixels in a row, M is the number of image pixels in a column, and $P_{bytes}$ is the number of bytes in each pixel. For N = M = 512 and $P_{bytes}$ = 1, $T_{bits}$ = 18 × 512 × 512 = 4 718 592 bits, i.e., 0.5625 MB. Thus, the working memory required to store the lists of SPIHT for a 512 × 512 8-bit grayscale image is 1.125 MB. In MSPIHT, the working memory required to store the only list (LIS) is 0.5625 MB.

In addition to requiring less working memory, the MSPIHT algorithm is more suitable than SPIHT for incorporation into a plug-in program for an Internet browser owing to its faster coding and decoding. Since MSPIHT requires less memory during coding and decoding and its source code is comparatively less complex, its program footprint is smaller than that of SPIHT. From the user's point of view, this feature is clearly beneficial.

Fig. 4. Images obtained via modified SPIHT without arithmetic coding: (a) rate = 0.0075 bpp, CR = 1066, PSNR = 21.56 dB; (b) rate = 0.045 bpp, CR = 178, PSNR = 26.33 dB; (c) rate = 0.055 bpp, CR = 145, PSNR = 27.21 dB; (d) original image.

The main improvement of MSPIHT in comparison to SPIHT is the CPU time. Table 3 lists the coding and decoding times of the two algorithms for the Mandrill, Lena, and Goldhill images. As shown in Table 3, at a bit rate of 0.0075 bpp, the MSPIHT encoder is 7 times faster than that of SPIHT for the 512 × 512 grayscale Mandrill image and 5 times faster for the Lena image. The decoder is 11 times faster for the Mandrill image and 9 times faster for the Lena image at the same bit rate. By taking advantage of the absolute zerotree structure, the MSPIHT algorithm reduces the number of entries in the LIS and thus the execution time of coding and decoding. It satisfies the common requirements of real-time imaging over a wireless channel. MSPIHT requires less CPU time than SPIHT and therefore has lower computational complexity, which depends on the number of mathematical operations in the entire algorithm, including addition, subtraction, division, multiplication, and shift operations. Usually, a more complex compression algorithm takes a longer time to execute.
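The speedup factors quoted for the Mandrill and Lena images follow directly from the CPU times in Table 3 at 0.0075 bpp. The short sketch below only redoes that arithmetic; the dictionary layout and function name are hypothetical.

```python
# CPU times in seconds at 0.0075 bpp, taken from Table 3:
# (MSPIHT code, MSPIHT decode, SPIHT code, SPIHT decode)
cpu_times = {
    "Mandrill": (0.06, 0.02, 0.44, 0.22),
    "Lena": (0.1094, 0.0313, 0.50, 0.28),
}

def speedups(times):
    """Return (encoder speedup, decoder speedup) of MSPIHT relative to SPIHT."""
    m_code, m_decode, s_code, s_decode = times
    return s_code / m_code, s_decode / m_decode

results = {name: speedups(times) for name, times in cpu_times.items()}
for name, (enc, dec) in results.items():
    print(f"{name}: encoder {enc:.1f}x, decoder {dec:.1f}x")
```

The ratios come out at about 7.3 and 11.0 for Mandrill and about 4.6 and 8.9 for Lena, which the text rounds to 7, 11, 5, and 9.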
Because MSPIHT takes less time to code and decode than arithmetic-coded SPIHT, it is clear that this algorithm is less complex and perfect for real-time mobile communications. Note that the experimental image is 512 × 512, which is quite large among all usual image formats for mobile communications. Figure 4 shows the MSPIHT reconstructed image of Lena images at bit rates of 0.0075 bpp, 0.045 bpp, 0.055 bpp and the original image. As shown in Fig. 4, these images show good visual quality at these low bit rates and approximately no subject difference from the original image. We cannot see much difference in the visual quality between the original image and the reconstructed image. CONCLUSIONS In this paper, a fast, efficient, low-memory real-time image-compression algorithm has been proposed. For a 512 × 512 grayscale image, this algorithm has saved 0.5625 MB of working memory. Our experimental results for the Mandrill, Lena, and Goldhill images


Table 3. Comparison of CPU (Pentium IV, 1.7 GHz, 256 RAM) time in seconds at different bit rates for the following images (512 × 512 8-bit grayscale images): (a) Mandrill, (b) Lena, (c) Goldhill

(a) Mandrill, CPU time in seconds

Bit rate (bpp)   MSPIHT code   MSPIHT decode   SPIHT code   SPIHT decode
0.0075           0.06          0.02            0.44         0.22
0.0154           0.20          0.05            0.50         0.27
0.0380           0.21          0.06            0.54         0.32
0.045            0.23          0.06            0.55         0.33
0.055            0.31          0.08            0.56         0.40

(b) Lena, CPU time in seconds

Bit rate (bpp)   MSPIHT code   MSPIHT decode   SPIHT code   SPIHT decode
0.0075           0.1094        0.0313          0.50         0.28
0.0154           0.2813        0.0781          0.52         0.27
0.0380           0.2969        0.0938          0.54         0.28
0.045            0.5060        0.1509          0.55         0.32
0.055            0.5581        0.2500          0.55         0.32

(c) Goldhill, CPU time in seconds

Bit rate (bpp)   MSPIHT code   MSPIHT decode   SPIHT code   SPIHT decode
0.0075           0.09          0.02            0.38         0.28
0.0154           0.20          0.06            0.40         0.27
0.0380           0.23          0.08            0.44         0.33
0.045            0.52          0.23            0.55         0.32
0.055            0.53          0.23            0.60         0.39

show that the algorithm is fast in terms of CPU time at low bit rates, a result that satisfies the requirement for real-time mobile communications. At 0.0075 bpp, the encoder is at most 7 times faster and the decoder at most 11 times faster than those of SPIHT. Moreover, the proposed coder preserves most of the merits of SPIHT (such as simple computation, effective compression, and embedded coding), thereby showing great promise for real-time applications. The solution in this paper is suited not only to implementations of the SPIHT algorithm but also to the majority of applications in which images must be acquired, compressed, and stored.


REFERENCES
1. C. N. Taylor and S. Dey, in Proc. IEEE Int. Conf. on Communications, Helsinki, 2001 (IEEE, New York, 2001), Vol. 6, p. 1925.
2. J. Ritter and P. Molitor, in Proc. 2001 ACM/SIGDA 9th Int. Symp. Field Programmable Gate Arrays, Monterey, 2001 (ACM Press, New York, 2001), p. 201.
3. Z. Xiong, C. Herley, K. Ramchandran, and M. T. Orchard, in Proc. IEEE Int. Conf. Image Processing, Washington, DC, 1995 (IEEE, New York, 1995), p. 614.
4. J. M. Shapiro, IEEE Trans. Signal Process. 41, 3445 (1993).
5. A. Said and W. A. Pearlman, IEEE Trans. Circuits Syst. Video Technol. 6, 243 (1996).
6. J. Knipe, X. Li, and B. Han, IEEE Trans. Signal Process. 46, 239 (1998).
7. D. Mukherjee and S. K. Mitra, in Proc. IEEE Int. Conf. Image Processing, Chicago, 1998 (IEEE, New York, 1998), Vol. 1, p. 107.
8. Overview of the MPEG-4 Version 1 Standard, ISO/IEC JTC1/SC29/WG11 N1909 (MPEG97), Ed. by R. Koenen (Int. Org. for Standardization and Int. Electrotechn. Comm., Geneva, 1997).
9. T. Sikora, IEEE Trans. Circuits Syst. Video Technol. 7, 19 (1997).
10. ISO/IEC, ISO/IEC 15444-1, Information Technology JPEG2000 Image Coding System (Int. Org. for Standardization and Int. Electrotechn. Comm., Geneva, 2000).
11. M. D. Adams, The JPEG-2000 Still Image Compression Standard, ISO/IEC JTC1/SC29/WG1 N2412 (Int. Org. for Standardization and Int. Electrotechn. Comm., Geneva, 2001).
12. C. Christopoulos, A. Skodras, and T. Ebrahimi, IEEE Trans. Commun. Electron. 46, 1103 (2000).
13. T. Kim, S. Choi, R. E. Van Dyck, and N. K. Bose, IEEE Trans. Circuits Syst. Video Technol. 11, 1022 (2001).
14. W. K. Lin and N. Burgess, in Proc. Information, Decision, and Control Conf. (ICD-99), Adelaide, 8–10 Feb. 1999 (IEEE, New York, 1999), p. 91.
15. C.-Y. Su and B.-F. Wu, IEEE Trans. Image Process. 12, 271 (2003).
16. J. Li and J. S. Jin, Electron. Lett. 33, 1305 (1997).
17. D. Taubman, IEEE Trans. Image Process. 9, 1158 (2000).
18. Z. Xiong, K. Ramchandran, and M. T. Orchard, IEEE Trans. Image Process. 6, 677 (1997).
19. J. Ritter, PhD Thesis (Martin Luther University, Halle-Wittenberg, 2002); http://deposit.ddb.de/cgi-bin/dokserv?idn=967407710
