
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 24, 691-707 (2008)

A Strategic Decomposition for Adaptive Image Transmission

RONG-CHI CHANG, TIMOTHY K. SHIH* AND HUI-HUANG HSU*
Department of Digital Media Design, Asia University, Taichung, 413 Taiwan
E-mail: [email protected]
* Department of Computer Science and Information Engineering, Tamkang University, Tamsui, 151 Taiwan
E-mail: {tshih; hhsu}@cs.tku.edu.tw

Progressive image transmission (PIT) transmits the most significant portion of a picture first, followed by its less important portions. The mechanism can be used in Web-based applications while users are browsing images. However, most PIT methods use the same pixel interpolation scheme for the entire picture, without considering the differences among image blocks. This paper analyzes the efficiency of pixel interpolation schemes and tests several decomposition mechanisms. The result is an adaptive image transmission scheme that takes the differences among picture portions into consideration. The study tested 200 pictures in different categories and with different parameters. The overall bit rates can be reduced significantly while keeping good PSNR values and user satisfaction. The visual result is superior to progressive JPEG on both objective (quantitative) and subjective (human) measures. An error recovery procedure is also implemented for cases in which the transmitted pictures need to be fully recovered.

Keywords: progressive image transmission, bit plane method, pixel interpolation, image coding, network applications

Received March 13, 2006; revised July 11 & September 5, 2006; accepted October 26, 2006. Communicated by Shih-Fu Chang.

1. INTRODUCTION

Browsing documents and pictures on the Internet has become a common activity. However, users are sometimes frustrated when downloading high-quality pictures over a slow connection. Fast downloading of high-quality images is of increasing importance in some business applications, including photo agencies, geographical information systems, medical databases, distance learning, and even real estate. Often, the image presented is not the image the user wanted. It is therefore useful to let users glance at a picture first: if the image is not important, the user can save time by aborting the transmission and jumping to the next item instead of waiting for the full picture. This problem can be alleviated by Progressive Image Transmission (PIT) techniques. In general, a PIT scheme divides the original image into several parts. The sender transmits the image to the receiver in several stages, and the receiver combines the data from all stages so that the reconstructed image goes from initially blurred to progressively clear. If the reconstructed image is good enough for a user to decide to abort, the receiver can interrupt the transmission. The primary advantage of PIT is that the gross structural information of the image appears immediately at the beginning of transmission, so that the user can decide whether further transmission is necessary.

Available PIT mechanisms and systems can be categorized into transform domain [1], spatial domain [2], and pyramid-structured progressive transmission [8]. In the transform domain, an image undergoes block compression and the transformed coefficients are transmitted progressively in order of relative importance (e.g., Progressive JPEG). Alternatively, a basic and intuitive method for progressive image transmission in the spatial domain is the Bit Plane Method (BPM) [4, 10]. In this scheme, the final transmitted image is the same as the original. However, the high transmission bit rate is a major disadvantage of BPM. Because of this drawback, lossy PIT techniques have received more attention. In our earlier approach [3], we improved the BPM method by color guessing to provide a fast PIT scheme. The Guessing by Neighbors (GBN) method [3] uses interleaved pixels for transmission. Fifty percent of the pixels are transmitted while the other fifty percent are "guessed". The results show that the average transmission bit rate is lower than that of BPM, with one hundred percent accuracy after an error recovery procedure.

In this paper, a new method called Adaptive Image Transmission (AIT) is proposed, based on the strategic decomposition of an image. The AIT scheme includes a mechanism that adjusts to the size of pixel blocks. The experiments were conducted with a complete test on different types of images (i.e., cartoon drawings, paintings, and photos), on multiple sizes of pixel blocks, and on combinations of transmission orders.
The objective (quantitative) [6] and subjective (qualitative) measurements are used to evaluate and compare our results with BPM, progressive JPEG, and our earlier method (i.e., GBN) [3]. The final results show that, with different strategies, the new method can be used as an adaptive mechanism, which has a very low bit rate and good PSNR values. The experiment result of the AIT method is better than BPM, progressive JPEG (PJPEG), and GBN. An AIT tool is employed for subjective user testing. Practical experience from users shows that the proposed method is very effective and can be used in Web browsers. The rest of this paper is organized as follows: Before describing the proposed method, we will briefly review some frequently discussed methods in our experiments. Then, we discuss our proposed method, AIT, with different pixel interpolation and decomposition mechanisms. The performance optimization of our PIT method will be given and discussed in section 4. Detailed results and analysis are given in section 5 before our conclusions.

2. RELATED WORK

This section reviews three schemes which are used in the experiments. First, the BPM scheme is introduced to explore the simple but essential concept of a progressive image transmission algorithm. Secondly, the Progressive JPEG scheme is presented to explain the important ideas of compression algorithms. Finally, the pixel interpolation scheme based on the GBN scheme is described as an effective PIT algorithm that can be applied dynamically to different portions of a picture.


2.1 The BPM Scheme

Bit Plane Method (BPM) is a simple method that does not rely on any encoding process or complex transmission algorithm. The transmitter sends one bit for each pixel in each stage, and the transmitted bits are ordered from the most significant bit (MSB) to the least significant bit (LSB). The receiver receives either a ‘0’ or a ‘1’ for each pixel at each run and reconstructs each pixel using the median as the predicted value [3]. The main idea of BPM is to transmit each bit-plane to the receiver progressively. Among these bit-planes, Plane 7 consists of all the MSBs, which give the image its main visual sense, whereas Plane 0 consists of all the LSBs of the image. Hence Plane 7 should be the first bit-plane to be transmitted to the receiver. If a clearer image is required, the successive bit-planes (i.e., Plane 6, Plane 5, etc.) can be transmitted in later stages. After eight stages, the whole image has been transmitted to the receiver.

2.2 The Progressive JPEG Scheme

A traditional JPEG file is stored as a single top-to-bottom scan of the image. The progressive JPEG scheme, in contrast, divides the image into multiple scans. Subsequent scans gradually improve the visual quality of the picture. Each scan adds to the data already transmitted, so that the total storage requirement is roughly the same as for a traditional JPEG image. In general, in the progressive JPEG scheme, an image is divided into non-overlapping 8 × 8 square blocks and each block is transformed with the Discrete Cosine Transform (DCT). The quantization for each of the 64 DCT coefficients is specified in a quantization table (also called the codebook), which is used by all blocks. All quantized DCT coefficients of a block are divided into 10 transmission stages according to their order of importance [5]. The sender transmits the DCT coefficients of each stage for each block individually. Then, the receiver reconstructs the image based on the received coefficients and the quantization table. The order of the transmitted DCT coefficients follows the Zig-Zag scan order. The advantage of Progressive JPEG is that it gives viewers a rough idea of what the actual image looks like and gradually improves the quality. The disadvantage is that each scan takes about the same amount of computation to display as a whole traditional JPEG file would. Consequently, progressive JPEG only works well if the decoder is fast compared to the communication link.

2.3 The GBN Method

In our previous studies [3], we proposed the Guessing by Neighbors (GBN) method, which is based on the interleaving strategy with an extension to a two-dimensional stream topology. The GBN method divides an image into two streams, the even stream and the odd stream. According to our definition, an even stream contains a sequence of non-adjacent pixels where the sum of the two indices of each pixel is an even number. An odd stream, on the other hand, has odd index sums. The New GBN method is proposed to provide multiple resolutions for each pixel box. This new method is built on the GBN scheme. A similar error recovery procedure is also employed in the newly proposed method. Table 1 illustrates a pixel box



Table 1. A pixel box of size (n, m).

P1,1   P2,1   …   Pi,1   …   Pn,1
P1,2   …      …   …      …   Pn,2
…      …      …   …      …   …
P1,j   …      …   Pi,j   …   Pn,j
…      …      …   …      …   …
P1,m   P2,m   …   Pi,m   …   Pn,m

of size n by m. In a normal situation, n is equal to m. The pixels transmitted are P1,1, P1,m, Pn,1, and Pn,m (i.e., the pixels on the four corners). There are two steps to compute a guess function. In addition to the 4 corner pixels, a pixel box has surrounding pixels and interior pixels. A surrounding pixel is located on the four boundaries of a pixel box. Other pixels are all interior pixels. Below is the formula provided to compute the guess values of surrounding pixels:

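$$
\begin{aligned}
\forall\, 1 < i < n:\quad & P_{i,1} = a_1 \times P_{1,1} + b_1 \times P_{n,1},\\
\forall\, 1 < i < n:\quad & P_{i,m} = a_1 \times P_{1,m} + b_1 \times P_{n,m},\\
\forall\, 1 < j < m:\quad & P_{1,j} = a_2 \times P_{1,1} + b_2 \times P_{1,m},\\
\forall\, 1 < j < m:\quad & P_{n,j} = a_2 \times P_{n,1} + b_2 \times P_{n,m},
\end{aligned}
\tag{1}
$$

where

$$
a_1 = (D - i + 1)/D,\quad a_2 = (D - j + 1)/D,\quad b_1 = (i - 1)/D,\quad b_2 = (j - 1)/D,
$$

and D is the distance index.

To compute the guess value of interior pixels:

$$
P_{i,j} = \frac{(a_1 \times P_{1,j} + b_1 \times P_{n,j}) + (a_2 \times P_{i,1} + b_2 \times P_{i,m})}{2},\quad \forall\, 1 < i < n,\ \forall\, 1 < j < m.
\tag{2}
$$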
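As an illustration of Eqs. (1) and (2), the following Python sketch fills a pixel box from its four corner values. It is a minimal sketch and not the authors' implementation: the function name gbn_interpolate is hypothetical, and the distance index D is assumed to be n − 1 along i and m − 1 along j, which makes the weights reduce to the corner values at the ends of each boundary.

```python
def gbn_interpolate(p11, p1m, pn1, pnm, n, m):
    """Fill an n-by-m pixel box from its four corners with Eqs. (1) and (2).

    box[i - 1][j - 1] holds P(i, j) for 1 <= i <= n and 1 <= j <= m.
    Assumption: the distance index D is n - 1 along i and m - 1 along j.
    """
    d_i, d_j = n - 1, m - 1
    box = [[0.0] * m for _ in range(n)]
    box[0][0], box[0][m - 1] = p11, p1m
    box[n - 1][0], box[n - 1][m - 1] = pn1, pnm

    def weights(k, d):
        # (a, b) = ((D - k + 1) / D, (k - 1) / D); a + b is always 1.
        return (d - k + 1) / d, (k - 1) / d

    # Eq. (1): surrounding pixels on the four boundaries of the box.
    for i in range(2, n):
        a1, b1 = weights(i, d_i)
        box[i - 1][0] = a1 * p11 + b1 * pn1
        box[i - 1][m - 1] = a1 * p1m + b1 * pnm
    for j in range(2, m):
        a2, b2 = weights(j, d_j)
        box[0][j - 1] = a2 * p11 + b2 * p1m
        box[n - 1][j - 1] = a2 * pn1 + b2 * pnm

    # Eq. (2): interior pixels average the two boundary interpolations.
    for i in range(2, n):
        a1, b1 = weights(i, d_i)
        for j in range(2, m):
            a2, b2 = weights(j, d_j)
            along_i = a1 * box[0][j - 1] + b1 * box[n - 1][j - 1]
            along_j = a2 * box[i - 1][0] + b2 * box[i - 1][m - 1]
            box[i - 1][j - 1] = (along_i + along_j) / 2
    return box
```

For a 3 by 3 box with corners P1,1 = P1,m = 0 and Pn,1 = Pn,m = 100, the sketch yields 50 for the center pixel and for the two boundary pixels with i = 2.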

Thus, the 4 corner pixels are used as the base of the guessing function. When these pixels are transmitted, only a few bits of each pixel are sent in each step. The number of bits depends on the format of the picture (e.g., 8-bit gray level image, or 24-bit true color image). One important point is that, on the receiver side, the pixel values are re-calculated after each transmission step with a correction procedure. Thus, the bits of a pixel that were transmitted in previous steps are not changed in the current step.
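The stage-by-stage refinement of a single pixel value, as used by BPM in section 2.1 and by the corner pixels here, can be sketched as follows. This is an illustrative sketch under one assumption: "using the median as the predicted value" is read as taking the midpoint of the range still allowed by the bits received so far. The function names are hypothetical.

```python
def msb_first_bits(value, depth=8):
    """Bits of value ordered from the MSB (Plane 7) to the LSB (Plane 0)."""
    return [(value >> k) & 1 for k in range(depth - 1, -1, -1)]


def predict(received, depth=8):
    """Predict a pixel value from the bits received so far.

    The received bits fix the high-order part of the value; the bits
    not yet received are replaced by the midpoint of their range
    (assumed reading of the median prediction).
    """
    known = 0
    for bit in received:
        known = (known << 1) | bit
    missing = depth - len(received)
    if missing == 0:
        return known
    return (known << missing) + (1 << (missing - 1))
```

For the 8-bit value 200 (binary 11001000), the predictions after 1, 2, 4, and 8 bits are 192, 224, 200, and 200. Bits received in earlier stages are never revised; each stage only narrows the remaining range.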

3. THE PROPOSED METHOD

The proposed method, Adaptive Image Transmission (AIT), consists of an edge crispening scheme and a region number reduction function. The picture is initially decomposed into regions, and a complete quad-tree is used to store the decomposition information. Each node of a complete quad-tree, except the leaves, has 4 child nodes. Based on the distribution of crispened edges and a number reduction function over dominant color regions, a block is evenly subdivided into 4 sub-blocks. The strategic



Fig. 1. Flowchart of the proposed AIT scheme.

image decomposition and reconstruction processes of AIT are discussed in the following subsections. An overview of the proposed image transmission method is shown in Fig. 1. After color transformation to gray levels, the Sobel operator is applied and a binary edge image is calculated (the threshold is set to 128). The distribution of edges is used in decomposition, followed by a region number reduction scheme. After transmission, images can be fully reconstructed. Detailed algorithms and examples are given in this section.

3.1 Strategic Image Decomposition

3.1.1 Edge crispening

The Sobel edge crispening operator performs a 2-D spatial gradient measurement on an image. It is used to find the approximate absolute gradient magnitude at each point of an input gray-scale image. The resulting binary image is used to decompose the color picture according to the complexity of the edge distribution. If the percentage of white dots in a region of the binary image is higher than a threshold, α1, the region should be further decomposed. A complete quad-tree is used to store the subdivision structure. The example in Fig. 2 (a) shows a color picture decomposed on the basis of edge distribution. As the picture shows, this decomposition does not consider the continuation of color and texture among regions. Moreover, the decomposed regions in the lower portion of the picture are too small. Solutions to these problems are discussed in the following section.

3.1.2 Dominant color set and mean color

Regions of the same or similar color should be combined. However, it is difficult to decide if two regions contain the same object unless color or textural information is



(a) Decomposed color image based on edge distribution.
(b) Decomposed color image based on edge distribution and dominant color with region number reduction.
Fig. 2. An example of decomposed color image based on different strategies.

considered. A dominant color set of a region is defined by using two thresholds. Let α2 be a threshold on the percentage of pixels, and let β1 be a threshold of color distance. A naive color distance is the difference of hues between two pixels. Another color distance is a combination of hue, saturation, and intensity (or a combination of the red, green, and blue values if the RGB color space is used). The mean color of a dominant color set can then be computed for each region by taking the average of all colors in the set. However, a region may have no dominant color set if it fails to satisfy α2. Regions holding the same object have the same dominant color set in most cases. Two or more regions are said to be in continuation if the mean colors of their dominant color sets differ by at most β2, where β2 is also a threshold of color distance. Region continuation is the main concept behind combining small regions into a big region, which could represent an object or part of an object. This process is called region number reduction, since it reduces the number of regions in a picture.

3.1.3 Recursive image decomposition and region number reduction

The recursive decomposition algorithm relies on the distribution of edges and on region continuation. Fig. 3 summarizes the algorithm. The recursion stops if the block is too small; a minimum block size is therefore defined as a threshold. If the region is larger than this threshold, it is divided evenly into four quadrants, as required by the complete quad-tree. Then, we check whether each quadrant has a dominant color set with its mean color, as discussed in the previous sections. If a quadrant has no dominant color set, the variation in that quadrant is probably large (too much detail), so the decomposition algorithm is called recursively on that quadrant. If none of the 4 quadrants has a dominant color set, the algorithm terminates at the current recursion level (the decomposition has already been performed at the next level). Alternatively, if the mean colors of any two consecutive quadrants differ by less than β2, there is a region continuation and no decomposition is required at the current level. If there is no region continuation, we check the percentage of edge distribution in the whole region R and, if it exceeds α1, decompose the whole region.



Algorithm RegionDecomposition(R: Image Region)
{
    If size of region R is too small Then Return
    Subdivide R into 4 regions, R1, R2, R3, and R4 equally
    Compute the dominant color sets and mean colors of the 4 regions, using β1 and α2
    Forall r in {R1, R2, R3, R4}
        If r has no dominant color set Then Call RegionDecomposition(r)
    If all 4 regions have no dominant color Then
        Return /* decomposition complete */
    Else If the mean color distances of any two consecutive regions < β2 Then
        Return /* region number reduction by color */
    Else If percentage of edge distribution of R > α1 Then /* decomposition */
        Call RegionDecomposition(R1), RegionDecomposition(R2), RegionDecomposition(R3), RegionDecomposition(R4)
    Else
        Return /* region number reduction by edges */
}
Fig. 3. A region decomposition algorithm.
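One possible reading of Fig. 3 can be sketched in Python as below. Several points are assumptions made only for illustration and are not stated in the paper: the image is a single gray channel, the color distance is the absolute gray-level difference, the dominant color set is taken as the pixels within β1 of the region's median, "consecutive" quadrants are neighbors in clockwise order, and a region that is not subdivided is recorded as one leaf block. All function and parameter names (dominant_mean, edge_ratio, decompose, min_size) are hypothetical.

```python
from statistics import median


def dominant_mean(gray, rect, beta1, alpha2):
    """Mean of the dominant color set of rect, or None if there is none.

    Assumption: the set is the pixels within beta1 of the region median,
    and it exists only if it covers at least a fraction alpha2 of rect.
    """
    x, y, w, h = rect
    vals = [gray[r][c] for r in range(y, y + h) for c in range(x, x + w)]
    med = median(vals)
    close = [v for v in vals if abs(v - med) <= beta1]
    if len(close) < alpha2 * len(vals):
        return None
    return sum(close) / len(close)


def edge_ratio(edges, rect):
    """Fraction of white (1) dots of the binary edge image inside rect."""
    x, y, w, h = rect
    white = sum(edges[r][c] for r in range(y, y + h) for c in range(x, x + w))
    return white / (w * h)


def decompose(gray, edges, rect, alpha1, alpha2, beta1, beta2, min_size, leaves):
    """Append the leaf blocks (x, y, w, h) of rect to leaves."""
    x, y, w, h = rect
    if w <= min_size or h <= min_size:          # region too small
        leaves.append(rect)
        return
    hw, hh = w // 2, h // 2
    quads = [(x, y, hw, hh), (x + hw, y, w - hw, hh),
             (x + hw, y + hh, w - hw, h - hh), (x, y + hh, hw, h - hh)]
    means = [dominant_mean(gray, q, beta1, alpha2) for q in quads]

    def recurse(q):
        decompose(gray, edges, q, alpha1, alpha2, beta1, beta2, min_size, leaves)

    continuation = any(
        a is not None and b is not None and abs(a - b) < beta2
        for a, b in zip(means, means[1:] + means[:1]))
    no_dominant_anywhere = all(m is None for m in means)
    if no_dominant_anywhere or (not continuation
                                and edge_ratio(edges, rect) > alpha1):
        for q in quads:                          # decompose every quadrant
            recurse(q)
    elif all(m is not None for m in means):
        leaves.append(rect)                      # region number reduction
    else:
        for q, m in zip(quads, means):           # only detailed quadrants recurse
            if m is None:
                recurse(q)
            else:
                leaves.append(q)
```

Under these assumptions, a uniform 8 × 8 image stays a single block, while an image whose four quadrants alternate between two distant gray levels is split into its four quadrants only when the edge ratio of the whole region exceeds α1.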

3.2 Adaptive Image Transmission

Pixel interpolation relies on an important property of still images: the continuation of color distribution in a small area. Most pictures contain regions of the same (or similar) color which are larger than a few pixels. This property allows us to use interpolation or extrapolation mechanisms to guess or restore missing pixels. In this study, a new pixel interpolation scheme is presented that adapts to the different characteristics of pictures and allows an adaptive color interpolation. This pixel interpolation mechanism was constructed from the New GBN method by using edge distribution and dominant color with region number reduction. The decomposed picture is stored in a complete quad-tree, which can be represented using an array (for the sake of simplicity). Since larger blocks have less detail according to this decomposition strategy, they can be transmitted at a very low bit rate. Smaller blocks with details can use more bits for transmission, which preserves details better. Overheads exist in the transmission, both in CPU time and memory space. The decomposition structure must be stored and transmitted first, followed by the corner pixels and additional error recovery data. On the receiver side, the structure is built before the surrounding and interior pixels are interpolated from the 4 corner pixels. In the case of a wrong guess, error recovery data is used. Additionally, extra parameters are needed to achieve the best performance at a low bit rate. The error recovery procedure and the performance optimization procedure are discussed in the following sections.

3.3 Error Recovery

The Error Correction Procedure is a very important component of the AIT algorithm. An error correction stream is therefore defined in this study to represent miss-guessed



Table 2. An example of error syndrome.

Column syndrome: 0 0 1 1 1 0
Row syndrome:    1 0 1 1

Fig. 4. An example of error recovery process (the first step).

situations during the Error Correction Procedure. This error correction consists of two parts: the error syndrome and the recovery data stream. As shown in Table 2, a miss-guessed pixel is marked with an “¯”. If one or more miss-guessed pixels are in a column, the corresponding bit is marked as a “1” in the column syndrome. Otherwise, the bit is marked as a “0”. The same concept is applied to rows. The positions of miss-guessed pixels can be stored in a position matrix, whose numbers of rows and columns equal the numbers of “1”s in the row syndrome and the column syndrome, respectively. For instance, for the example in Table 6 the positions are represented by a 4 by 3 matrix (4 columns, 3 rows), written row by row as (1101, 0010, 1011). Note that a “1” represents a miss-guessed pixel. In summary, an error syndrome is composed of a column syndrome, a row syndrome, and a position matrix. The corresponding recovery data stream contains the actual pixel values. The number of values is equal to the size of the position matrix. On the receiver side, the correction process takes the error syndrome and fixes the miss-guessed pixels with the values in the recovery data stream. Fig. 4 illustrates an example of the error recovery process.

The Transmission Process:

Step T-1: Assume that there is an error-correcting bit-plane (shown in Table 3). The correcting stream is 10101 00010 10011 00000 00000 (row-major distribution). In this example, the number of 0’s is greater than the number of 1’s, and there are two rows containing all zeros (i.e., zero-rows).

Step T-2: A row and a column are added as bit marks. These bits are called marked bits. If all bits in a row/column are 0, the marked bit is 0. Otherwise, 1 is used. Table 4 shows an example following the case in Table 3.


Table 3. An example for error-correcting bit-plane.

1 0 1 0 1
0 0 0 1 0
1 0 0 1 1
0 0 0 0 0
0 0 0 0 0

Table 4. An error-correcting bit-plane with marked bits.

Column bits    1 0 1 1 1
Row bits
    1          1 0 1 0 1
    1          0 0 0 1 0
    1          1 0 0 1 1
    0          0 0 0 0 0
    0          0 0 0 0 0

Table 5. The simplified error-correcting bit-plane after removing zero-rows and zero-columns with row/column id.

Column id      0 2 3 4
Row id
    0          1 1 0 1
    1          0 0 1 0
    2          1 0 1 1

Table 6. The simplified error-correcting bit-plane after removing zero-rows and zero-columns without row/column id.

1 1 0 1
0 0 1 0
1 0 1 1

Table 7. The comparison of correcting-bits.

Error-correcting bit-plane for previous method:
    10101 00010 10011 00000 00000
    Bits transmitted: 25 bits

Error-correcting bit-plane for the new method:
    Marked bits (fixed length):                 10111 11100
    Simplified EC bit-plane (variable length):  1101 0010 1011
    Bits transmitted: 10 + 12 = 22 bits

Step T-3: The marked bits are recorded and, at the same time, the zero-rows and zero-columns are removed. The resulting error-correcting bit-plane is called the simplified EC (error-correcting) bit-plane. Examples are shown in Tables 5 and 6.

Step T-4: Together, the marked bits and the simplified EC bit-plane are sufficient to recover all the correcting bits. Table 7 compares the number of correcting bits needed by our previous method and by our new method. The marked bits and the simplified EC bit-plane are then transmitted to the receiver side.

The Receiving Process:

Step R-1: After receiving the plane, the algorithm reads the fixed-length marked bits (i.e., 10111 11100) and builds a temporary error-correcting bit-plane as in Table 8 (with some unknown values, ‘?’) according to the marked bits. If a marked bit is zero, the corresponding row/column is filled with zeros.



Table 8. The first step of error-correcting bit-plane.

Column bit     1 0 1 1 1
Row bit
    1          ? 0 ? ? ?
    1          ? 0 ? ? ?
    1          ? 0 ? ? ?
    0          0 0 0 0 0
    0          0 0 0 0 0

Step R-2: For every position whose row bit and column bit are both 1, the “?” value is replaced, in order, by the next bit of the simplified error-correcting bit-plane in Table 6. The original error-correcting bit-plane is thereby reconstructed: Table 8, filled in with Table 6, is the same as Table 3.
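Steps T-1 to T-4 and R-1 to R-2 can be sketched in Python as follows. This is a minimal sketch rather than the authors' code; the function names compress_plane and restore_plane are hypothetical, and the bit-plane is assumed to be given as a list of rows of 0/1 values.

```python
def compress_plane(plane):
    """Steps T-2 and T-3: marked bits plus the simplified EC bit-plane."""
    n_rows, n_cols = len(plane), len(plane[0])
    row_bits = [1 if any(row) else 0 for row in plane]
    col_bits = [1 if any(plane[r][c] for r in range(n_rows)) else 0
                for c in range(n_cols)]
    # Drop zero-rows and zero-columns.
    simplified = [[plane[r][c] for c in range(n_cols) if col_bits[c]]
                  for r in range(n_rows) if row_bits[r]]
    return col_bits, row_bits, simplified


def restore_plane(col_bits, row_bits, simplified):
    """Steps R-1 and R-2: rebuild the full bit-plane on the receiver side."""
    stream = iter(bit for row in simplified for bit in row)
    # A cell is 0 if its row bit or column bit is 0; otherwise it takes
    # the next bit of the simplified plane in row-major order.
    return [[next(stream) if rb and cb else 0 for cb in col_bits]
            for rb in row_bits]


TABLE_3 = [
    [1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
```

Applied to the bit-plane of Table 3, compress_plane returns the column bits 10111, the row bits 11100, and the simplified plane 1101 0010 1011, for 10 + 12 = 22 bits in total, and restore_plane reproduces Table 3 from them.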

4. PERFORMANCE OPTIMIZATION

Human visual acuity and improvement of picture quality are two essential considerations for the proposed methods. In some special cases, the PSNR value of a transmitted image is high, but the picture does not look clear to a human viewer. Optimization schemes are employed to address this problem. As an example of the AIT method, Fig. 5 shows an image decomposed by edge distribution and dominant color with region number reduction. There are several large blocks in this example. In order to improve the performance of the transmission, the Adaptive Image Transmission (AIT) algorithm needs to be adjusted. First, the user (or a system designer who uses our scheme) needs to set a block size threshold according to experiments. This threshold is decided based on the subdivision results. For instance, in Fig. 6 (c), the block size threshold is set to 6.25% of the image size. If a block is larger than 6.25% of the image, we use the full-byte transmission scheme. The 4 blocks at the top of Fig. 5 (b) are handled with the full-byte transmission scheme. In short, a block larger than this threshold is called a Large Block. When a Large Block is processed, only the 4 corner pixels of the block are transmitted, including all bits of each corner pixel. The transmission order of these bits is from the MSB to the LSB. After the transmission, the interpolation scheme is used to predict the non-transmitted pixels. Fig. 6 presents a comparison of the results transmitted by the BPM method, the AIT method, and the optimized AIT method from the first to the fourth stage. The PSNR values and bit rates of the three tested methods are given in Table 9. The optimized AIT method has a higher PSNR than the AIT method at every stage. Compared with BPM, its PSNR is higher in the first two stages and slightly lower in the last two, at a much lower bit rate. The experiments show that the optimization scheme yields an improved result.

Table 9 also shows an interesting result for the AIT and optimized AIT methods: optimized AIT has a higher PSNR value than AIT at every stage, with a nearly equal transmission rate in the first stage and a slightly lower one in stages 2 to 4. In general, if a picture has many large blocks, and the pixels in each block have similar colors, the optimization scheme works very well. However, the optimization cannot be used as a default scheme, since the nature of the color distribution differs from picture to picture.



(a) Original picture. (b) Decomposed picture.
Fig. 5. An example of image decomposed by the AIT method.

Rows: (a) BPM method, (b) AIT method, (c) Optimized AIT method. Columns: stages 1, 2, 3, 4.
Fig. 6. The reconstructed test images from the first to the fourth stage (evaluation is shown in Table 9).

Table 9. Comparison of large block transmissions using different schemes.

         BPM               AIT               Optimized AIT
Stage    PSNR     Rate     PSNR     Rate     PSNR     Rate
1        17.56    12.5%    16.72    3.61%    17.76    3.64%
2        22.24    12.5%    20.94    5.06%    22.77    5.01%
3        28.52    12.5%    26.73    6.11%    28.28    6.07%
4        34.49    12.5%    32.16    7.01%    33.58    6.97%

5. EXPERIMENTAL RESULTS AND ANALYSIS

Three portions of the experiments are presented in this section. The first part compares the proposed method with previous methods, including the Bit-Plane Method (BPM) [2] and the progressive JPEG (PJPEG) [5], for 24-bit color images. The second and third parts evaluate the picture quality of the reconstructed images with objective (quantitative) and subjective (human) measures [7], respectively. All of these experiments used the same image database, which included 200 pictures of approximately 480 × 360 pixels in 24-bit BMP format.



5.1 Simulation Results of Different PIT Schemes

The Bit Plane Method (BPM), Progressive JPEG (PJPEG), and Adaptive Image Transmission (AIT) methods were all applied to progressive image transmission in the experiments. The quality of the reconstructed image was compared with the transmission rate on the test images. The transmission bit rate (R) is defined to estimate the bits transmitted at each stage (S), individually or accumulated (shown in the columns under ‘R’). The transmission rate is defined as:

$$
R = \frac{B}{P} \times 100\% \tag{3}
$$

where P is the total number of pixels for each stage in an image, and B is the total number of transmitted/accumulated bits for each stage in the image.

Table 10. The PSNR values and transmission bit rates at each stage (480 × 360 color image size).

         PSNR (dB)                  Transmission bit rates (%)
                                    BPM            PJPEG          AIT
Stage    BPM      PJPEG    AIT      R(s)   T(s)    R(s)   T(s)    R(s)   T(s)
1        17.25    21.77    17.31    12.5   12.5    2.33   2.33    3.83   3.83
2        22.94    23.90    22.89    12.5   25.0    3.39   5.72    5.72   9.55
3        28.46    25.15    26.63    12.5   37.5    4.91   10.63   6.55   16.1
4        33.63    28.36    33.59    12.5   50.0    5.79   16.42   5.02   21.12
5        39.72    32.72    40.01    12.5   62.5    5.95   22.37   6.42   27.54
6        46.33    36.80    46.37    12.5   75.0    6.72   29.09   6.18   33.72

To compare the performance of PIT methods, three factors were considered: the quality of the reconstructed image, the transmission bit rate of each stage, and the accumulated bit rate after each stage. Table 10 lists the PSNR values and transmission bit rates of the test images for the BPM, PJPEG, and AIT methods. Comparing BPM and AIT, the PSNR values of AIT are close to those of BPM at every stage except the third (26.63 dB versus 28.46 dB), and slightly higher in stages 1, 5, and 6. Comparing PJPEG and AIT, the PSNR values of PJPEG are greater than those of AIT in the first two stages. However, the PSNR values of the BPM and AIT methods exceed 30 dB in the fourth stage, which is better than the PJPEG method. This result satisfies the PIT requirement that the reconstructed image should be clear enough for the receiver to decide whether to see the entire picture with a better quality. Furthermore, the transmission rates of the test images for the BPM, PJPEG, and AIT methods are shown in Table 10, where R(s) stands for the individual transmission rate in each stage and T(s) stands for the accumulated transmission rate up to each stage. The transmission rate of our method (i.e., AIT) is lower than that of the BPM method and, in some stages (e.g., stages 4 and 6), lower than that of PJPEG. Although AIT does not in general achieve a lower rate than PJPEG, PJPEG cannot achieve 100% recovery, whereas AIT is able to fully recover the image.
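The relation between the per-stage rates R(s) and the accumulated rates T(s) in Table 10 is a running sum, which the following Python sketch reproduces. The function names are hypothetical. The call rate_percent(1, 8) treats the denominator of Eq. (3) as a count of bits per sample, which is an assumption made only to illustrate where the 12.5% per stage of BPM (one bit-plane out of eight) comes from.

```python
from itertools import accumulate


def rate_percent(bits_sent, total):
    """Eq. (3): R = B / P * 100%."""
    return 100.0 * bits_sent / total


def accumulated(stage_rates):
    """T(s): running sum of the per-stage rates R(s), rounded to 2 decimals."""
    return [round(t, 2) for t in accumulate(stage_rates)]


# Per-stage rates R(s) in percent, copied from Table 10.
PJPEG_R = [2.33, 3.39, 4.91, 5.79, 5.95, 6.72]
AIT_R = [3.83, 5.72, 6.55, 5.02, 6.42, 6.18]
```

Here accumulated(PJPEG_R) gives 2.33, 5.72, 10.63, 16.42, 22.37, 29.09 and accumulated(AIT_R) gives 3.83, 9.55, 16.1, 21.12, 27.54, 33.72, matching the T(s) columns of Table 10.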



5.2 The Objective Evaluation Method

The Bit Plane Method (BPM), the progressive JPEG, and the proposed AIT method were employed to transmit and reconstruct 200 images in the experiment. The quality of a reconstructed image was evaluated by the PSNR. Since it was difficult to manually adjust the PSNR values for the progressive JPEG scheme, the transmission results of the three methods at an average PSNR value near 27 dB were chosen, to obtain a fair comparison of transmission bit rates. Table 11 shows the average PSNR values and transmission rates. The results of the BPM and the AIT methods are obtained at the third stage of image transmission. The objective (quantitative) evaluation showed that the proposed AIT scheme was superior to both the BPM and the progressive JPEG, with comparable PSNR values and a lower average transmission bit rate.

Table 11. The average PSNR values and bit rates of test images (200 pictures).

             BPM      PJPEG    AIT
PSNR (dB)    28.46    26.76    26.63
Rate         0.375    0.222    0.161
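The paper does not spell out the PSNR formula it uses. The following Python sketch assumes the usual definition for 8-bit samples, PSNR = 10 log10(255² / MSE), where MSE is the mean squared error between the original and the reconstructed image. The function name psnr and the list-of-rows image representation are hypothetical.

```python
import math


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are lists of rows of sample values. Assumes the common
    definition 10 * log10(peak^2 / MSE) with peak = 255 for 8-bit data.
    """
    squared_error = 0
    count = 0
    for row_o, row_r in zip(original, reconstructed):
        for a, b in zip(row_o, row_r):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)
```

Under this definition, an image that differs from the original by 10 gray levels at every sample has an MSE of 100 and a PSNR of about 28.1 dB, which is in the range reported for the third transmission stage.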

5.3 The Subjective Evaluation Method

The picture quality of the images reconstructed by the BPM, the progressive JPEG, and the proposed AIT algorithm was evaluated by human observers, i.e., by the Human Visual System (HVS). 24-bit BMP images of approximately 480 × 360 pixels were used in the experiments. The observers were students from three colleges of our university: the College of Business, the College of Sciences, and the College of Engineering. Information about the participants is shown in Table 12.

Table 12. Information about the participants.

                          Number of     Sex               Ages
Department                students      Male    Female    19    20     21    22
College of Business       64            31      33        11    40     8     5
College of Sciences       60            43      17        8     41     7     4
College of Engineering    64            40      24        9     42     6     7
Summary                   188           114     74        28    123    21    16

These students from different colleges were recruited as volunteers rather than selected by major. Each observer was asked to vote on the transmitted images, keeping the original picture in mind. The subjective evaluation was conducted using the Mean Opinion Score (MOS) [9], which uses a five-grade impairment scale with a description of each grade, as given in Table 13. A voting system on the Internet was

704


Table 13. Subjective scales for image quality. Scales 5 4 3 2 1

Goodness Excellent Good Fair Poor Bad

Impairment Imperceptible Perceptible, but not annoying Slightly annoying Annoying Very annoying

Table 14. Assessment results of MOS. Method MOS value

BPM 3.616

PJPEG 2.518

AIT 3.259

developed to evaluate the subjective assessment of picture quality. Through this voting system, three image transmission methods were scored based on a polling policy. This polling policy is shown as follows: 1. Every participant is asked to evaluate 20 test screens. Each test screen contains three pictures with the same images. These three pictures are conducted by three different image transmission methods (including BPM, PJPEG, and AIT). 2. During a test session, a series of images occurs in a random order and the participant has no clue as to which one of image transmission method is used for each test image. In addition, the same test sequence is restricted in two successive presentations. Every participant is given the same kind of equipment (such as light, computer screen, and the internet) for each test. 3. No time limit is involved during a test session. All of the participants are required to vote for the best image quality based on their observations. A subjective picture quality measure was calculated for all images. The results for MOS are presented in Table 14 for each test image and each image transmission method. Subjective (human) evaluation showed that our AIT method was superior to the progressive JPEG and was close to the BPM. However, the transmission rate of the proposed method was lower than BPM. 5.4 Analysis
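The MOS values analyzed in this section are averages of grades on the five-point scale of Table 13. As a minimal sketch, assuming that each observer assigns one grade from 1 to 5 to each method's picture, the aggregation could look as follows. The vote layout, the function name `mean_opinion_scores`, and the sample votes are hypothetical and are not data from the experiment.

```python
from statistics import mean

def mean_opinion_scores(votes):
    """Average the 1-5 grades per transmission method.

    `votes` is a list of (method, grade) pairs; this layout is an
    assumption made for illustration only.
    """
    by_method = {}
    for method, grade in votes:
        if grade not in (1, 2, 3, 4, 5):
            raise ValueError(f"grade must be an integer from 1 to 5, got {grade!r}")
        by_method.setdefault(method, []).append(grade)
    return {method: mean(grades) for method, grades in by_method.items()}

# Made-up votes for illustration; these are NOT results from the paper.
example_votes = [
    ("BPM", 4), ("BPM", 3),
    ("PJPEG", 2), ("PJPEG", 3),
    ("AIT", 3), ("AIT", 4),
]
print(mean_opinion_scores(example_votes))
```

In the actual experiment, each of the 188 observers evaluated 20 test screens, so each method's MOS would be averaged over many more grades than in this toy input.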

Table 11 lists the average PSNR values and the transmission rates of the 200 test images for the BPM, progressive JPEG, and our AIT method. The average PSNR value of the BPM (28.46 dB) is greater than the average PSNR values of progressive JPEG (26.76 dB) and the AIT method (26.63 dB) in the third transmission stage. The average PSNR value of progressive JPEG is close to that of the AIT method. However, these PSNR differences do not result in much visual difference. More importantly, the transmission rate of our AIT method (16.1%) is lower than those of the BPM (37.5%) and progressive JPEG (22.2%).

Table 14 shows the subjective quality results, expressed as mean opinion scores (MOS), for the BPM, progressive JPEG, and the AIT method. The MOS values were obtained from an experiment involving 188 observers. The testing methodology was the double-stimulus impairment scale method [7], with the five-grade impairment scale listed in Table 13. The MOS value of the BPM (3.616) is higher than the MOS values of progressive JPEG (2.518) and the AIT method (3.259). According to the results presented in Tables 11 and 14, progressive JPEG seems to provide the same picture quality as the AIT method when only PSNR is considered. However, when the visual picture quality quantified by MOS is included, the result changes: the AIT method provides better visual picture quality than progressive JPEG. PSNR is commonly used to evaluate picture quality, but it does not account for local features or human perception of the reconstructed image, so it is not always a perceptually meaningful measure. This is a clear example that PSNR cannot be used as a definitive measure of picture quality.

On the other hand, the transmission rate of the BPM (37.5%) is higher than those of both progressive JPEG (22.2%) and the AIT method (16.1%); the AIT method has the lowest transmission rate of the three. In general, a higher PSNR value means that the reconstructed image is closer to the original image. The BPM results have good overall picture quality at the third transmission stage in both the PSNR and the MOS measures. However, the average transmission rate of the BPM is much higher. Furthermore, the AIT method has a lower transmission rate and a higher MOS value than progressive JPEG. In particular, the MOS value of the AIT method (3.259) is higher than 3.0, so, according to the literature [6], user satisfaction is guaranteed. In a lossy compression system, a progressive image transmission algorithm must trade off transmission rate against picture quality, since higher compression rates generally produce lower picture quality. Nevertheless, the proposed AIT method appears to achieve high transmission performance with reasonable user satisfaction.
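The rate comparison above can be made concrete with a short calculation on the Table 11 averages. The sketch below is only an arithmetic illustration; the dictionary layout and the helper name `relative_saving` are assumptions introduced here.

```python
# Average results copied from Table 11 (200 test images).
results = {
    "BPM":   {"psnr_db": 28.46, "rate": 0.375},
    "PJPEG": {"psnr_db": 26.76, "rate": 0.222},
    "AIT":   {"psnr_db": 26.63, "rate": 0.161},
}

def relative_saving(rate, baseline_rate):
    """Fraction of the baseline's transmitted data that is saved."""
    return (baseline_rate - rate) / baseline_rate

ait = results["AIT"]
for name in ("BPM", "PJPEG"):
    base = results[name]
    saving = relative_saving(ait["rate"], base["rate"])
    psnr_gap = base["psnr_db"] - ait["psnr_db"]
    print(f"AIT vs {name}: {saving:.1%} less data, {psnr_gap:.2f} dB lower PSNR")
```

With these averages, the AIT method transmits roughly 57% less data than the BPM and roughly 27% less than progressive JPEG, at an average PSNR about 1.83 dB and 0.13 dB lower, respectively.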

6. CONCLUSIONS

A good progressive image transmission method should have a low transmission rate with high picture quality. The experimental results indicate that the proposed AIT scheme meets both criteria. The subjective measurements of picture quality for several PIT techniques also show that PSNR, the traditional objective measure of picture quality, is not adequate as a perceptually meaningful measure for some of the tested PIT systems. The contribution of this study is to show that a block decomposition which considers the different situations in different pictures (and picture regions) can improve the overall performance of progressive image transmission. Moreover, the methods were tested on different categories of pictures, such as photos, paintings, and cartoon drawings. The transmission rate is very low, especially for cartoons. With the performance optimization mechanism, the transmission rate is even lower. Another advantage of our mechanism is that users can choose a portion of the transmitted image as a focus area (i.e., a region of interest, ROI), which can then be transmitted at a higher bit rate. The decomposition strategies can also be used in several other applications, such as still image compression and automatic picture inpainting. Combined with improved motion estimation techniques, the proposed method can also be used in video compression.

In conclusion, the advantages listed above are largely supported by the results in the previous section. In our future work, we will investigate how to apply the techniques on PDAs and mobile devices. Although the bandwidth of the Internet and of local area networks has improved, it is still hard to satisfy the needs of progressive transmission mechanisms. Similarly, in wireless communication, even with 3G technology, it is difficult for users to browse pictures smoothly on PDAs and mobile devices. Therefore, it is important for future studies in the field of multimedia communication to explore further improvements to these techniques for PDAs and mobile devices.

ACKNOWLEDGEMENTS

We would like to express special thanks to Mr. Louis Lin, who provided effective technical support for our experiments. We would also like to thank the reviewers for their valuable comments and advice, which helped us improve the quality of this paper.

REFERENCES

1. M. Accame and F. Granelli, “Hierarchical progressive image coding controlled by a region based approach,” IEEE Transactions on Consumer Electronics, Vol. 45, 1999, pp. 13-20.
2. C. C. Chang, F. C. Shine, and T. S. Chen, “A new scheme of progressive image transmission based on bit-plane method,” in Proceedings of Asia-Pacific Conference on Communications and Fourth Optoelectronics and Communications Conference, Vol. 2, 1999, pp. 892-895.
3. C. C. Chang, T. K. Shih, and I. C. Lin, “An efficient progressive image transmission method based on guessing by neighbors,” The Visual Computer: International Journal of Computer Graphics, Vol. 18, 2002, pp. 341-353.
4. C. C. Chang and M. N. Wu, “A color image progressive transmission method by common bit map block truncation coding approach,” in Proceedings of International Conference on Communication Technology, Vol. 2, 2003, pp. 1774-1778.
5. T. S. Chen and C. Y. Lin, “A new improvement of JPEG progressive image transmission using weight table of quantized DCT coefficient bits,” in Proceedings of IEEE Pacific Rim Conference on Multimedia, 2002, pp. 720-728.
6. S. Grgić, M. Grgić, and M. Mrak, “Reliability of objective picture quality measures,” Journal of Electrical Engineering, Vol. 55, 2004, pp. 3-10.
7. ITU, “Methodology for the subjective assessment of the quality of television pictures,” ITU-R Recommendation BT.500-9, 1998.
8. J. H. Kim and W. J. Song, “Pyramid-structured progressive image transmission using quantization error delivery in transform domains,” IEE Proceedings on Vision, Image and Signal Processing, Vol. 143, 1996, pp. 132-136.
9. D. Schilling and P. C. Cosman, “Image quality evaluation based on recognition times for fast image browsing applications,” IEEE Transactions on Multimedia, Vol. 4, 2002, pp. 320-331.
10. K. H. Tzou, “Progressive image transmission: a review and comparison of techniques,” Optical Engineering, Vol. 26, 1987, pp. 581-589.

Rong-Chi Chang (張榮吉) is an Assistant Professor of Digital Media Design at Asia University, Taiwan. He specializes in image restoration and also works as the program coordinator of applied computer science in the Multimedia Computing Laboratory. He earned both M.S. and Ph.D. degrees from Tamkang University in Computer Science and Engineering. His research focuses on multimedia computing, computer networks, and image processing.

Timothy K. Shih (施國琛) is a Professor of the Department of Computer Science and Information Engineering at Tamkang University, Taiwan. As a senior member of IEEE, Dr. Shih joins the Educational Activities Board of the Computer Society. His current research interests include multimedia computing and distance learning. He is the founder and co-editor-in-chief of the International Journal of Distance Education Technologies. He is also the associate editor of IEEE Transactions on Multimedia and ACM Transactions on Internet Technology.

Hui-Huang Hsu (許輝煌) received the B.E. degree from Tamkang University, Taipei, Taiwan in 1987, and the M.S. and Ph.D. degrees from The University of Florida, Florida, USA in 1991 and 1994, respectively. All the degrees are in electrical engineering. He joined the Department of Computer Science and Information Engineering, Tamkang University in 2003. His current research interests are in the areas of multimedia processing, bioinformatics, machine learning, data mining, and e-learning.