International Journal of Innovative Computing, Information and Control Volume 2, Number 2, April 2006
© 2006 ICIC International, ISSN 1349-4198, pp. 387-398
IMPROVING THE PERFORMANCE OF FRACTAL IMAGE CODING Zhiyuan Zhang and Yao Zhao Institute of Information Science Beijing Jiaotong University Beijing 100044, China
Received December 2004; revised March 2005

Abstract. This paper presents a new fractal image coding (FIC) scheme that exploits the self-similarity at the same resolution scale in natural images. The new scheme assures the convergence of the FIC transforms without limiting conditions such as those in Zhao's scheme, and we give a convergence proof for it. The scheme also feeds the coding results back recursively to update the domain pools during the coding process, which improves the decoded image quality. At the end of the coding process, the "climbing mountain" method is used to adjust the parameters and further improve the decoded image quality. Experimental results show that our scheme achieves a better rate-distortion curve than the conventional FIC scheme.

Keywords: Fractal image coding, Compound convergence
1. Introduction. Image compression is the process of encoding image data so that it takes less storage space or less transmission time than the uncompressed data. This is possible because most real-world images are highly redundant, or are not most concisely represented in their human-interpretable form. Many image compression schemes exist today, such as vector quantization, DCT, DWT and FIC. The scheme in this paper is a type of FIC.

FIC was first introduced by Barnsley and Sloan [1]. Afterwards, Jacquin [2] devised the first practical fractal coder with block-based transformations. Many fractal coders have been devised since, among which Jacquin's method [2] and Fisher's conventional quadtree method [3] are well-known and successful approaches.

In Jacquin's scheme, the original image is partitioned into non-overlapping blocks called range blocks and overlapping blocks called domain blocks. The range blocks tile the whole image, and the arbitrarily located domain blocks are twice as large as the range blocks. Each range block is coded by a reference to a suitable domain block and by some transformation parameters. These parameters describe how the referenced image part has to be adjusted in contrast and brightness to give a good approximation of the range block to be encoded. The decoding procedure is comparatively simple: all the affine transforms are decoded and applied iteratively to an arbitrary initial image.

The mathematical principles of FIC are two theorems: the Fixed Point Theorem and the Collage Theorem.

Theorem 1.1. (Fixed Point Theorem) Let $(X, d)$ denote a complete metric space, where $d$ is a given distortion metric. A transformation $T$ is called a contractive transformation in $(X, d)$ if, for any two points $\mu, \nu \in X$,

$$d(T(\mu), T(\nu)) \le s \cdot d(\mu, \nu) \tag{1}$$

where $0 \le s < 1$; $s$ is called the contraction factor. Every contractive transformation has a fixed point $x^*$, i.e.

$$T(x^*) = x^* \tag{2}$$

and the fixed point can be obtained from any point $x$:

$$\lim_{n \to \infty} T^n(x) = x^* \tag{3}$$

Theorem 1.2. (Collage Theorem) Let $(X, d)$ be a metric space with a contraction operator $T : X \to X$ with contraction factor $s$ and fixed point $x^*$. Then for any $x \in X$

$$d(x, x^*) \le \frac{1}{1-s} \cdot d(x, T(x)) \tag{4}$$
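As a minimal illustration of Theorems 1.1 and 1.2, the following sketch iterates a toy contractive map on the real line and checks inequality (4). The map `T`, its contraction factor `S = 0.5` and the helper names are assumptions made only for this example; they are not part of the coding scheme.

```python
def T(x: float) -> float:
    """Toy contractive map on the real line (hypothetical example)."""
    return 0.5 * x + 3.0


S = 0.5                        # contraction factor of T
FIXED_POINT = 3.0 / (1.0 - S)  # solves x = 0.5 * x + 3, i.e. x* = 6


def iterate(f, x0: float, n: int) -> float:
    """Apply f to x0 n times, as in equation (3)."""
    x = x0
    for _ in range(n):
        x = f(x)
    return x


def collage_bound(x: float, f, s: float) -> float:
    """Right-hand side of inequality (4) with d(a, b) = |a - b|."""
    return abs(x - f(x)) / (1.0 - s)


if __name__ == "__main__":
    # Any starting point converges to the same fixed point.
    print(iterate(T, 100.0, 60), iterate(T, -50.0, 60))
    # The distance to the fixed point never exceeds the collage bound.
    print(abs(10.0 - FIXED_POINT), collage_bound(10.0, T, S))
```

In FIC the point `x` plays the role of the original image and `T` the role of the stored transform, so the decoder only needs to run the iteration.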
Let $(X, d)$ denote a complete metric space whose elements are digital images, with $d$ a given metric. The original image $x_{orig}$ is one element of the space. The fractal coding procedure for $x_{orig}$ is to construct a transformation $T : X \to X$ which satisfies the following conditions:
(i) $T$ is a contractive transform;
(ii) $T(x_{orig}) \approx x_{orig}$.

By condition (i), the Fixed Point Theorem ensures that $T$ has a unique fixed point and that the fixed point can be found by iterating $T$. Condition (ii) tells us that $x_{orig}$ is an approximate fixed point of $T$. According to the Collage Theorem, $x_{orig}$ can be approximately reconstructed by applying $T$ iteratively to any initial image $x$. If $T$ can be stored compactly, it serves as the compressed data of $x_{orig}$; in this sense $x_{orig}$ is compressed.

Although fractal image coding (FIC) has been investigated since the 1980s, some problems remain.

1.1. Problem 1: the self-similarity at the same resolution in natural images. In conventional FIC, the domain block is larger than the range block for convergence purposes. Conventional FIC therefore exploits the self-similarity across different resolution scales in natural images. However, self-similarity at the same resolution scale is the most common kind in natural images, and conventional FIC does not use it. Paper [4] proposes a scheme that uses the self-similarity at the same resolution scale, but for convergence purposes the method imposes some limiting conditions on the search for a fitting same-sized domain block: when a same-sized block mapping is found, it must check whether the mapping is cyclic before deciding to use it. These conditions increase the complexity of the FIC coder and reduce the benefit of using the self-similarity at the same resolution scale. In this paper we propose a new scheme that exploits the self-similarity at the same resolution scale in natural images with lower complexity. The results show that our method achieves better rate-distortion performance than the traditional schemes.
1.2. Problem 2: the difference brought by the collage theorem. The mathematical foundations of FIC are the Fixed Point Theorem and the Collage Theorem. The Collage Theorem means that if the error between the original image and its collage image is small enough, then the error between the decoded image and the original image is also small. This is a sufficient condition but not a necessary one, since the difference between the original image and the decoded image is not directly determined by the difference between the image and its collage. So in the encoding procedure, minimizing the difference between the original image and its collage usually does not minimize the difference between the decoded image and the original image. Inequality (4) only gives an upper bound on the error between the original image and the decoded image. When $s$ is close to one, this upper bound may be very large, which means we cannot be sure that the error between the original image and the decoded image is small enough.

To overcome this drawback, we adopt a "recursive coding scheme" in which the original image is encoded and decoded during the coding process. For each range block, the fractal transform is obtained using conventional FIC, and the domain pools are then updated with the decoded image before the next range block is coded. Using this method, we achieve coding gains of 0.2-0.4 dB over the traditional Fisher's quadtree scheme.

1.3. Problem 3: the difference brought by the iterated function systems. In all traditional FIC schemes, the parameters of each range block are computed independently. But our purpose is to find an optimal system to represent the original image, and a system whose parameters are computed independently is in general not optimal. We therefore use the "climbing mountain" scheme to adjust the parameters already obtained, so that we can find the optimal or a sub-optimal system and further reduce the difference between the original image and the decoded image. In this way, we obtain coding gains of 0.1-0.2 dB over the traditional Fisher's quadtree scheme.

Based on the above problems, we propose a new scheme that improves coding performance significantly. Compared with the quadtree scheme [3], better rate-distortion performance is achieved: our improved scheme achieves coding gains of 0.3-2 dB over the traditional Fisher's quadtree scheme for compression ratios larger than 8:1. Meanwhile, the computing complexity is greatly reduced compared with [4].

The remainder of this paper is organized as follows. Section 2 describes our scheme in detail. Section 3 presents some experimental results. Lastly, Section 4 summarizes the paper.

2. Our Scheme to Improve the Performance of FIC. In Section 1, we discussed three problems existing in FIC. In this section, we explain in detail how our scheme addresses each of them. The total procedure is shown in Figure 1. The original image I is partitioned into range blocks. For each range block, we find either a same-sized domain block in the coded image or a larger domain block in the total domain image, and compute the corresponding parameters. We also use a recursive coding scheme to improve the performance of the coder. After all the range blocks are encoded, we use the "climbing mountain" method to adjust the parameters obtained and further improve the performance of the coder.
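The per-block decision described above can be sketched on a one-dimensional toy signal. This is only an illustration under several assumptions that are not taken from the paper: the function names, the use of the original signal rather than the decoded domain image as the domain pool, pair averaging to shrink twice-sized domains, and a plain least-squares fit of scale and offset.

```python
from math import sqrt


def rms(a, b):
    """Root-mean-square error between two equally long blocks."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def fit_scale_offset(d, r):
    """Least-squares scale s and offset o such that s * d + o approximates r."""
    n = len(d)
    sd, sr = sum(d), sum(r)
    sdd = sum(x * x for x in d)
    sdr = sum(x * y for x, y in zip(d, r))
    den = n * sdd - sd * sd
    s = 0.0 if den == 0 else (n * sdr - sd * sr) / den
    return s, (sr - s * sd) / n


def encode_block(signal, start, size, tol1, tol2):
    """Return a mapping for signal[start:start+size], or None if none fits."""
    target = signal[start:start + size]
    # Step 1: same-sized domains, restricted to the already coded samples.
    best = None
    for p in range(0, start - size + 1):
        err = rms(signal[p:p + size], target)
        if best is None or err < best[0]:
            best = (err, p)
    if best is not None and best[0] <= tol2:
        return {"kind": "same", "pos": best[1]}  # pure translation
    # Step 2: twice-sized domains anywhere, shrunk by averaging pairs.
    best = None
    for p in range(0, len(signal) - 2 * size + 1):
        d = [(signal[p + 2 * i] + signal[p + 2 * i + 1]) / 2 for i in range(size)]
        s, o = fit_scale_offset(d, target)
        if abs(s) >= 1.0:
            continue  # keep the gray-level map contractive
        err = rms([s * x + o for x in d], target)
        if best is None or err < best[0]:
            best = (err, p, s, o)
    if best is not None and best[0] <= tol1:
        return {"kind": "twice", "pos": best[1], "scale": best[2], "offset": best[3]}
    return None  # a quadtree coder would split the block here


if __name__ == "__main__":
    sig = [1, 2, 3, 4, 1, 2, 3, 4]
    print(encode_block(sig, 0, 4, tol1=8, tol2=4))
    print(encode_block(sig, 4, 4, tol1=8, tol2=4))
```

The first block has no coded area to draw on, so it falls back to a twice-sized domain; the second block finds an exact same-sized match in the coded area and stores only a position.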
Figure 1. The total procedure of our scheme

Section 2.1 presents our method for exploiting the self-similarity at the same resolution scale in natural images. The recursive coding scheme in Section 2.2 and the "climbing mountain" method in Section 2.3 reduce the difference between the original image and the decoded image.

2.1. Exploit the self-similarity at the same resolution in natural images. As discussed in Section 1.1, self-similarity at the same resolution scale is the most common kind in natural images, so it is worthwhile to use it in FIC. Unfortunately, this type of self-similarity was not used in traditional FIC schemes until Bedford et al. first identified the problem. In paper [5], Bedford et al. use domain blocks of the same size as the range blocks to address it directly. But Zhao in paper [4] finds that there is a convergence problem if the self-similarity at the same resolution scale is exploited as in [5], and gives a method to overcome it. In that method, when a range block finds a same-sized domain block, the coder checks whether the block has been mapped from another, larger domain block and whether the mapping is cyclic before deciding to use it. In this way, the convergence of the transform is assured, but some limiting conditions must be imposed at the same time. These conditions increase the complexity of the method and decrease the performance of the coder, so we propose a new improved scheme. In our scheme, we exploit the self-similarity at the same resolution scale in natural images without restrictions like those in paper [4].

Our scheme is illustrated in Figure 2. The basic idea is to construct compound transforms that are eventually contractive. For each range block, we first look for a suitable same-sized domain block. Note that the same-sized domain block must be chosen in the coded image area.
If no suitable same-sized domain block can be found in the coded image area, we search for a larger domain block in the whole image, as in the traditional FIC scheme. For example, in Figure 2 the shaded area is the coded area. A range block is mapped from a same-sized domain block that lies in the coded image area; that is, the range blocks containing the pixels of this domain block have already been mapped from larger domain blocks. In this case, the compound transform involved is contractive, and its convergence can be proved theoretically.

Figure 2. A compound transform that is eventually contractive (the shaded area is the coded image area)

In FIC, a map from a domain block to a range block can be written as an affine transform. In natural images, when two blocks are similar to each other, they usually have the same orientation. So in the same-sized mapping proposed here, the transform involved is only a position translation without rotation or flipping. In Figure 2, if a range block is encoded with a same-sized domain block, the relation between a point $(X, Y, Z)$ in block B1 and the corresponding point $(x_1, y_1, z_1)$ in block B2 can be expressed as an affine transform:

$$\begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \alpha_1 \end{pmatrix} \begin{pmatrix} x_1 \\ y_1 \\ z_1 \end{pmatrix} + \begin{pmatrix} e_1 \\ f_1 \\ 0 \end{pmatrix} \tag{5}$$

where $(x_1, y_1)$ is a pixel position, $z_1$ is the pixel gray value, $(X, Y)$ is the transformed pixel position, $Z$ is the transformed pixel value, $e_1, f_1$ are the parameters of the transform denoting the position shift, and $\alpha_1$ is the scale factor of the gray level, $0 \le \alpha_1 < 1$. The affine transform only shifts the block in position without shrinking it in size.

In the case shown in Figure 2, we know that the pixel $(x_1, y_1, z_1)$ is itself mapped from another point by a contractive transform:

$$\begin{pmatrix} x_1 \\ y_1 \\ z_1 \end{pmatrix} = \begin{pmatrix} a_1 & b_1 & 0 \\ c_1 & d_1 & 0 \\ 0 & 0 & \alpha_2 \end{pmatrix} \begin{pmatrix} x_2 \\ y_2 \\ z_2 \end{pmatrix} + \begin{pmatrix} e_2 \\ f_2 \\ o_2 \end{pmatrix} \tag{6}$$

where the parameters $a_1, b_1, c_1, d_1, e_2, f_2$ make the affine transform contractive in the X-Y plane, $\alpha_2$ ($|\alpha_2| < 1$) makes it contractive in the gray level, and $o_2$ is a gray level offset. From equations (5) and (6), we get

$$\begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = \begin{pmatrix} a_1 & b_1 & 0 \\ c_1 & d_1 & 0 \\ 0 & 0 & \alpha_1 \cdot \alpha_2 \end{pmatrix} \begin{pmatrix} x_2 \\ y_2 \\ z_2 \end{pmatrix} + \begin{pmatrix} e_1 + e_2 \\ f_1 + f_2 \\ \alpha_1 \cdot o_2 \end{pmatrix} \tag{7}$$

where $|\alpha_1 \cdot \alpha_2| < 1$.
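The composition of the same-sized mapping (5) with the contractive mapping (6) can be checked numerically. The sketch below is a small illustration with made-up parameter values (all numbers and function names are assumptions for this example); it composes the two affine maps and compares the result with applying them one after the other.

```python
def apply_affine(matrix, offset, point):
    """Return matrix * point + offset for a 3x3 matrix and 3-vectors."""
    return tuple(
        sum(m * p for m, p in zip(row, point)) + b
        for row, b in zip(matrix, offset)
    )


def compose(m_outer, b_outer, m_inner, b_inner):
    """Affine map equal to outer(inner(p)): M_o M_i p + (M_o b_i + b_o)."""
    m = [
        [sum(m_outer[i][k] * m_inner[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]
    b = [
        sum(m_outer[i][k] * b_inner[k] for k in range(3)) + b_outer[i]
        for i in range(3)
    ]
    return m, b


# Same-sized mapping, form of equation (5): alpha1 = 0.5, shift (4, -2).
M5 = [[1, 0, 0], [0, 1, 0], [0, 0, 0.5]]
B5 = [4, -2, 0]
# Contractive mapping, form of equation (6): halves positions, alpha2 = 0.75, o2 = 16.
M6 = [[0.5, 0, 0], [0, 0.5, 0], [0, 0, 0.75]]
B6 = [10, 20, 16]

M7, B7 = compose(M5, B5, M6, B6)

if __name__ == "__main__":
    p = (8, 6, 100)
    print(apply_affine(M5, B5, apply_affine(M6, B6, p)))
    print(apply_affine(M7, B7, p))
    # Gray-level scale is alpha1 * alpha2; gray-level offset is alpha1 * o2,
    # which reduces to o2 when alpha1 = 1.
    print(M7[2][2], B7)
```

With these values the composite gray-level scale is 0.375 and the composite offset vector is (14, 18, 8), so the compound map stays contractive in the gray level.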
In the compound transform $T = T_1 \circ T_2$, $T_1$ is a twice-sized mapping and $T_2$ is a same-sized mapping. This transform is clearly contractive: even though $T_2$ alone is not a contractive mapping, the compound transform $T_1 \circ T_2$ is. The example in Figure 2 contains only one same-sized block mapping in the compound transform. We can extend the idea to fully exploit the similarity at the same scale by constructing a compound transform $T = T_1 \circ T_2 \circ \cdots \circ T_k$, where $T_1, T_2, \cdots, T_{k-1}$ are all same-sized block mappings and only $T_k$ is a contractive transform. It is easy to prove that $T$ is also contractive.

On average, the range blocks encoded with a same-sized domain block make up about one-third of all range blocks. When two same-sized blocks are similar to each other, they usually have the same orientation, and we assume that they also have the same gray-level variation. Under this hypothesis, the scale coefficient (s) equals 1 and the offset coefficient (o) equals 0. So in the same-sized mapping proposed here, the transform involved is only a position translation, without rotation, flipping or any other coefficients, and only the position of the same-sized domain block needs to be stored. In this way, we can significantly decrease the bitrate. Of course, we need 1 bit to tag where the domain block comes from, which slightly reduces the improvement.

2.2. Updating the domain pools to improve the performance of FIC. As discussed in Section 1.2, traditional fractal image coding (FIC) minimizes only the error between the original image and the collage image. However, our purpose is to minimize the error between the decoded image and the original image. From (4), we know that minimizing the error between the original image and the collage image does not mean minimizing the error between the original image and the decoded image.
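As a worked illustration of inequality (4), assume a collage error of $d(x_{orig}, T(x_{orig})) = 2$ gray levels; this value is hypothetical and chosen only for the example. The guaranteed bound on the decoding error then depends on the contraction factor as follows:

```latex
\[
d(x_{orig}, x^*) \le \frac{2}{1-s}: \qquad
s = 0.5 \Rightarrow 4, \qquad
s = 0.9 \Rightarrow 20, \qquad
s = 0.99 \Rightarrow 200 .
\]
```

The same collage error therefore guarantees very little about the decoded image once the contraction factor approaches 1.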
When the contraction factor $s$ is close to 1, the difference between the two errors may be very large. In this subsection we try to solve this problem.

As discussed above, we would like to minimize the difference between the decoded image and the original image during the encoding process. Strictly speaking this is impossible, since the transforms are obtained only after the image is encoded, and the decoded image can be reconstructed only after the transforms are obtained. Fortunately, in an FIC scheme we encode the original image one range block after another, so the affine mappings of the range blocks are obtained in succession. We can therefore build an approximate decoded image, which we call the domain image, by iteratively transforming the original image using the transforms obtained so far, and then use this approximate decoded image to construct the domain pools. As the encoding procedure proceeds, the error between the original image and the decoded image is gradually reduced.

2.3. Adjusting the parameters to improve the performance of FIC. The mathematical essence of fractal image coding is to find an Iterated Function System (IFS) whose fixed point is the original image. Each part of a system is related to the others, and an FIC code is such a system, built from all of its parameters. The parameters of one block therefore affect the other blocks: it is all block parameters together, not each independently, that determine the decoded image. Unfortunately, the parameters of each range block are computed independently in all previous schemes, so the system obtained is not the optimal system. For this reason, we devise a parameter adjusting scheme that tries to find the optimal parameters of the encoded image as a whole.
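The parameter adjustment of this subsection can be sketched as a simple randomized hill climb. In the sketch below, `quality` stands in for "decode with these scale coefficients and compute the PSNR"; that callback, the function name `climb`, the fixed random seed and the stopping rule (stop after one full pass without improvement, or after `max_rounds` passes) are assumptions made for this illustration.

```python
import random


def climb(scales, quality, delta=0.05, max_rounds=100, rng=None):
    """Randomly nudge each scale coefficient by +/- delta; keep improvements.

    scales  -- list of scale coefficients, one per adjusted mapping
    quality -- callable returning a score to maximize (e.g. PSNR in dB)
    """
    rng = rng or random.Random(0)
    best = quality(scales)
    for _ in range(max_rounds):
        improved = False
        for i in range(len(scales)):
            # Draw 0 or 1: add delta on 0, subtract delta on 1.
            step = delta if rng.randint(0, 1) == 0 else -delta
            trial = list(scales)
            trial[i] += step
            score = quality(trial)
            if score > best:  # the adjustment is effective, so keep it
                scales, best, improved = trial, score, True
        if not improved:
            break
    return scales, best


if __name__ == "__main__":
    # Toy quality function with a known optimum (stand-in for decoding + PSNR).
    target = [0.5, 0.7]
    toy_quality = lambda s: -sum((a - b) ** 2 for a, b in zip(s, target))
    print(climb([0.3, 0.9], toy_quality))
```

Because a trial is accepted only when the score increases, the returned score is never worse than the starting one. In a real coder the other parameters, such as the offset, would be recomputed for each trial inside `quality`.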
Figure 3. The flow chart of the "climbing mountain" method to adjust the parameters

The basic flow chart of our method is shown in Figure 3. First, for each mapping, we add δ to or subtract δ from the scale coefficient at random: we generate 0 or 1 randomly, add δ to the scale coefficient when we get 0 and subtract δ when we get 1, where δ ∈ (0, 0.2] is a preselected variable. We then compute the other parameters according to the new scale coefficient. Using these parameters we obtain a decoded image and compute its PSNR, which measures the difference between the decoded image and the original image. The PSNR tells us whether the adjustment is effective: if the PSNR after adjusting is larger than the PSNR before adjusting, the adjustment is effective; otherwise it is not. For an effective adjustment, we take the corresponding parameters as our new system parameters. We repeat this process until the PSNR no longer changes or the number of repetitions reaches a preselected threshold. In this way, we get the optimal or a sub-optimal system, and the difference between the original image and the decoded image is gradually reduced.

Experimental results show that we need not change the parameters of all mappings. We therefore compute the errors between the range blocks and the corresponding decoded image blocks. The larger the error, the higher the priority we give to adjusting the corresponding parameters, so we sort the errors in descending order and adjust the parameters of the leading mappings in this order. In our method, we adjust the parameters of 20 mappings and δ is 0.05. Experiments show that we can obtain coding gains of 0.1-0.2 dB.

2.4. The implementation of our scheme. Our scheme can be used in any FIC; in this paper we use it with Fisher's quadtree method.
The choices of the scale and offset coefficients, their storage sizes, and the minimum and maximum partition depths have a direct influence on performance and speed. Once these parameters are fixed, the rate-distortion curve can be drawn by varying the thresholds tol1 and tol2. Here tol1 is a preselected threshold for the root-mean-square (RMS) error between the original image block and the collage image block in a twice-sized block mapping, and tol2 is the preselected threshold for the RMS error in a same-sized block mapping. A larger collage error corresponds to a lower bitrate, a lower PSNR and a quicker encoder, so the thresholds should be selected carefully.

During coding, we use variable-length coding to store the position of the domain block. For each range block, we index the domain pool in a fixed way and store the index of the corresponding domain block in the pool instead of its position. The length of the position code therefore changes with the size of the domain pool, which saves some storage space.

3. Experimental Results. In this section, we test the compression ratios and the quality of the compressed images obtained with our scheme and with the traditional quadtree scheme.
In the following experiments, the Lena 256 × 256 × 8 image is used as the test image, with the following parameter settings. The maximum partition depth is 6 and the minimum partition depth is 4; that is, range blocks contain from 16 pixels (4 × 4) to 256 pixels (16 × 16). We use 3 bits to store the orientation information, 5 bits to store the scale and 7 bits to store the offset. The orientation, scale and offset information is used only in twice-sized block mappings. The number of bits used for the coordinates of the domain blocks varies during the coding process.

Figure 4(a) and Figure 4(b) show the original Lena image and the corresponding partition. Figure 4(c) and Figure 4(d) show the decoded Lena image using Fisher's scheme and our improved scheme. At similar reconstruction quality, our scheme achieves a higher compression ratio.
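The bit budget per mapping implied by these settings can be sketched as follows. The function names are hypothetical, the index length is assumed to be the smallest number of bits that can address the current domain pool, and the bits spent on the quadtree structure itself are not counted.

```python
def index_bits(pool_size: int) -> int:
    """Smallest number of bits that can address pool_size entries."""
    return 0 if pool_size <= 1 else (pool_size - 1).bit_length()


def mapping_bits(kind: str, pool_size: int,
                 orient_bits: int = 3, scale_bits: int = 5,
                 offset_bits: int = 7) -> int:
    """Bits for one range block: 1 tag bit, the domain index, and for
    twice-sized mappings also orientation, scale and offset."""
    bits = 1 + index_bits(pool_size)
    if kind == "twice":
        bits += orient_bits + scale_bits + offset_bits
    return bits


if __name__ == "__main__":
    # With a pool of 1024 candidates, a same-sized mapping is much cheaper.
    print(mapping_bits("same", 1024), mapping_bits("twice", 1024))
```

Under these assumptions a same-sized mapping saves the 15 bits of orientation, scale and offset, which is where the bitrate reduction of the same-sized mappings comes from.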
Figure 4. (a) The original Lena 256 × 256 × 8 image; (b) Partition corresponding to our scheme; (c) Decoded Lena using the quadtree scheme at CR = 9.10 and PSNR = 31.23 dB (without full domain search); (d) Decoded Lena using our improved scheme at CR = 10.01 and PSNR = 31.29 dB (without full domain search)
3.1. Experiment 1. This experiment shows the performance of using the "climbing mountain" method to adjust the parameters of the system. The curves of peak signal-to-noise ratio (PSNR) versus compression ratio (CR) in Figure 5 show the result. We achieve coding gains of 0.1-0.2 dB compared to the traditional Fisher's quadtree scheme.

Figure 5. Comparison of our adjusting parameters scheme and the quadtree scheme, without full domain search

3.2. Experiment 2. As the collage error thresholds tol1 and tol2 vary, the CR versus PSNR curves for the Lena image in Figure 6 are obtained with our improved scheme and the conventional quadtree scheme, both without full domain search. Figure 7 shows the CR versus PSNR curves for the Lena image for our improved scheme and the conventional quadtree scheme, both with full domain search. Our scheme yields better rate-distortion performance than the quadtree scheme, with coding gains of 0.3-2 dB for compression ratios larger than 8:1. This demonstrates that our method leads to significant gains over the rigid quadtree-based approach.

3.3. Experiment 3. This experiment compares the performance of the scheme proposed by Zhao in [4] with our scheme. The curves of PSNR versus CR in Figure 8 summarize the results. From Figure 8, we can see that our scheme gives almost the same results as the scheme proposed by Zhao. However, as follows from the description of our scheme, it has lower computing complexity than the scheme in [4].

3.4. Experiment 4. The coding results of three standard test images, Lena, Girl and Boat, are analyzed to show the advantage of our improved scheme over the quadtree scheme; refer to Table 1.
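For reference, the PSNR used in these comparisons can be computed as in the sketch below, which assumes the usual definition for 8-bit images (peak value 255) and represents an image as a flat list of gray values; the function names are chosen for this example.

```python
from math import log10


def mse(original, decoded):
    """Mean squared error between two equally sized gray-level images."""
    return sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)


def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(original, decoded)
    if err == 0:
        return float("inf")
    return 10.0 * log10(peak * peak / err)


if __name__ == "__main__":
    original = [100, 100, 100, 100]
    decoded = [110, 110, 110, 110]  # every pixel off by 10 gray levels
    print(psnr(original, decoded))
```

A gain of 0.1-0.2 dB corresponds to a reduction of the mean squared error by roughly 2 to 5 percent.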
Figure 6. Comparison of our improved scheme and the quadtree scheme, without full domain search

Figure 7. Comparison of our improved scheme and the quadtree scheme, with full domain search

Figure 8. Comparison of our improved scheme and the scheme proposed by Zhao in [4]

About one-third of the range blocks are coded from same-sized domain blocks, although the exact percentage is determined by tol2. This saves bits in favor of compression efficiency. Of course, the fidelity of the decoded image is slightly reduced, but with the methods proposed in Sections 2.2 and 2.3 the loss becomes very small. So our coder outperforms the conventional one with respect to both rate-distortion performance and visual assessment. Table 2 also gives the comparison between our scheme and the DCT transformation for the three 256 × 256 × 8 test images Lena, Girl and Boat.

Table 1. The coding results of three standard test images (tol1 = 8, tol2 = 4, sbits = 5, obits = 8, minpart = 4, maxpart = 6)

| Image | Number of blocks from twice-sized domain blocks | Number of blocks from same-sized domain blocks | CR | PSNR (dB) |
| Lena | 1558 | 723 | 10.01 | 31.30 |
| Girl | 1400 | 578 | 11.37 | 33.54 |
| Boat | 2066 | 681 | 8.20 | 29.55 |
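The figures in Table 1 can be cross-checked with a little arithmetic. The sketch below computes the share of same-sized mappings and the average number of bits spent per range block; it assumes that CR is measured against the raw 256 × 256 × 8 image, and the helper names are chosen for this example.

```python
# (twice-sized blocks, same-sized blocks, compression ratio) from Table 1
TABLE_1 = {
    "Lena": (1558, 723, 10.01),
    "Girl": (1400, 578, 11.37),
    "Boat": (2066, 681, 8.20),
}


def same_sized_fraction(twice: int, same: int) -> float:
    """Share of range blocks coded from same-sized domain blocks."""
    return same / (twice + same)


def average_bits_per_block(twice: int, same: int, cr: float,
                           width: int = 256, height: int = 256,
                           depth: int = 8) -> float:
    """Compressed size in bits divided by the number of range blocks."""
    return width * height * depth / cr / (twice + same)


if __name__ == "__main__":
    for name, (twice, same, cr) in TABLE_1.items():
        print(name,
              round(same_sized_fraction(twice, same), 3),
              round(average_bits_per_block(twice, same, cr), 2))
```

For Lena this gives a same-sized share of about 0.32 and roughly 23 bits per range block on average.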
Table 2. Comparison of three images between the DCT transformation (block-partitioned, computed using Vcdemo) and our improved scheme with full domain search

| Image | DCT transformation PSNR (dB) | DCT transformation CR | Our improved scheme PSNR (dB) | Our improved scheme CR |
| Lena | 31.5 | 8.1 | 31.69 | 14.14 |
| Girl | 35.6 | 7.84 | 36.12 | 9.74 |
| Boat | 29.8 | 7.9 | 31.04 | 10.83 |
4. Conclusions. In this paper, we have presented a new scheme that can be built on any conventional fractal image coding scheme. Our scheme exploits the self-similarity at the same resolution in the image without restrictions, and uses two methods to reduce the difference between the original image and the decoded image. The experimental results show that our scheme is a better solution for fractal image coding than the conventional one.

To improve the fidelity of coding, we use the "climbing mountain" method. The result is not optimal, however, and other methods such as a genetic algorithm (GA) could also be used to adjust the parameters. The relationship between tol1 and tol2 should also be investigated further. Following the lead of publications [6]-[12], these further improvements are currently being investigated.

Compared with successful techniques (e.g. wavelets, run-length coding, adaptive arithmetic coding) that are integrated into the still image coding standards JPEG, JBIG and JPEG2000, fractal image coding is still far from industrial application due to some limitations. Further investigation is therefore worthwhile.

Acknowledgment. This work is partially supported by the National Natural Science Foundation of China (No.60172062, No.60373028), the Fok Ying Tong Education Foundation and the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry. The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.

REFERENCES
[1] Barnsley, M. F. and A. D. Sloan, A better way to compress images, Byte, pp.215-223, 1988.
[2] Jacquin, A. E., Image coding based on a fractal theory of iterated contractive image transformations, IEEE Trans. Image Processing, vol.1, pp.18-30, 1992.
[3] Fisher, Y., Fractal Image Coding - Theory and Application, Springer, New York, 1994.
[4] Zhao, Y., H. X. Wang and B. Z. Yuan, Multiple same-sized block mapping for recursive fractal image coding, Opt. Eng., vol.41, no.2, pp.328-334, 2002.
[5] Bedford, T., F. M. Dekking, M. Breeuwer, M. S. Keane and D. van Schooneveld, Fractal coding of monochrome images, Signal Processing: Image Communication, vol.6, pp.405-419, 1994.
[6] Jackson, D. J. and T. Blom, Fractal image compression using a circulating pipeline computation model, OUP Computer Journal, vol.39, 1996.
[7] Horowitz, F. G., D. Bone and P. Veldkamp, Karhunen-Loeve based iterated function system encodings, Proc. of the 1996 International Picture Coding Symposium, Melbourne, Australia, vol.2, pp.409-413, 1996.
[8] Monro, D. M. and F. Dudbridge, Fractal approximation of image blocks, Proc. of the International Conference on Acoustics, Speech and Signal Processing, vol.3, pp.23-26, 1992.
[9] Hürtgen, B., Contractivity of fractal transforms for image coding, Electronics Letters, vol.29, no.20, pp.1749-1750, 1993.
[10] Saupe, D. and M. Ruhl, Evolutionary fractal image coding, Proc. of the IEEE International Conference on Image Processing, Lausanne, Switzerland, vol.I, pp.129-132, 1996.
[11] Oien, G. E. and Z. Baharava, New improved collage theorem with applications to multiresolution fractal image coding, Proc. of the 1994 IEEE International Conference on Signal Processing, vol.5, pp.19-22, 1994.
[12] Sun, Y. D. and Y. Zhao, A parallel implementation of improved fractal image coding based on tree topology, Chinese Journal of Electronics, vol.2, no.2, 2003.