Towards Intrinsic Evolvable Hardware for Predictive Lossless Image Compression

Jingsong He (1,2), Xin Yao (1,3), and Jian Tang (2)

(1) Nature Inspired Computation and Applications Laboratory (NICAL), University of Science and Technology of China
(2) Department of Electronic Science and Technology, University of Science and Technology of China
(3) School of Computer Science, University of Birmingham

[email protected], [email protected], [email protected]
Abstract. This paper presents a novel method for predictive lossless image compression based on evolving a set of switches, which can be implemented easily in intrinsic evolvable hardware mode. A set of compound mutations for binary chromosomes is proposed, combining local asexual reproduction with a search over multiple mean step sizes, and a gradual-approach method is devised for evolving larger scale images. Experimental results show that the proposed method greatly reduces computing time and scales to images up to 70 times larger, with the computing time growing comparatively slowly.
1 Introduction
As a newly emerged field, evolvable hardware (EHW) may provide new mechanisms for automatic circuit design and optimization, and for circuits that adapt to their environment by themselves. Recent studies on applications of evolvable hardware address an important issue: seeking valuable applications of evolvable hardware and discovering new problems together with their solutions. Indeed, studying problems by combining intelligent computation with real-world applications is one of the most important methodologies in the field of evolvable hardware. Among various applications, adaptive lossless image compression is a typical application of evolvable hardware. T. Higuchi et al. [1] first presented an evolvable chip, implemented by a special functional FPGA (F2PGA) together with their variable length genetic algorithm (VGA), for compressing images; this has been regarded as the only non-toy problem in the early research in the field of evolvable hardware [2]. Afterwards, they presented further research on evolving large scale images (up to 315MB; the computing time was not reported) in [3]. In a different line of study, A. Fukunaga et al. [2][4] presented a new prototype system based on genetic programming (GP) to solve the problem of such an implementation on a conventional
This work is partially supported by the National Natural Science Foundation of China through Grant No. 60573170 and Grant No. 60428202.
FPGA. Reference [5] shows that the GP-based model can enlarge the size of processable images from 64 Kbytes up to 2 Mbytes, while reducing the computing time from 2 hours [4] down to 5 minutes (in simulation mode). However, both of the above models have limitations in their functionality: the VGA for optimizing templates is executed on a host computer [3], and compiling a chromosome into an assessable circuit usually takes about half an hour [2]. Clearly, the problems behind them are serious. In this sense, intrinsic EHW may be more suitable for real-time applications [6], especially for the task of predictive lossless image compression. This paper proposes a novel and simple evolutionary technique for predictive lossless image compression, in which a genetic algorithm with a small population is used to reduce circuit resources, and the parameters of the predictive function are discretized; the chromosome therefore corresponds to a set of circuit switches and can be evolved on chip directly. Experimental results show that the proposed method can process much larger images while reducing the computing time efficiently.
2 Lossless Image Compression Through Evolving a Set of Switches
The characteristic of the extrinsic evolvable hardware mode is the use of symbolic expressions in the chromosome, where each symbol corresponds to a real circuit produced by conventional design. The advantage of the extrinsic mode is that many commercial compilers and circuit resources from conventional design can be employed directly. Its disadvantage for real-time control and adaptation is that it can be detrimental to online tasks, since the extrinsic mode needs an extra device to host the compiler software and incurs considerable compiling and download time. For the task of implementing self-adaptive real-time lossless image compression on a chip, if the control parameters can be described as a set of switches, then evolving the states of those switches can easily be carried out inside the chip in intrinsic mode.
2.1 The Binary Coding Mechanism
To obtain a fixed-length chromosome with binary coding, the exponential function is used here to approximate the predictive function. Thus the 4-neighbor predictive mode can be written as

\[
\hat{x}_k = \sum_{i=1}^{4} x_{k,i}\, e^{-\alpha_i d(x_k, x_{k,i})}, \tag{1}
\]
where \(\hat{x}_k\) denotes the kth predicted value, \(x_{k,i}\) denotes the pixel value of the ith neighbor of \(x_k\), \(\alpha_i\) denotes the ith parameter of the interpolating function, and \(d(x_k, x_{k,i})\) denotes the distance between the current point and its ith neighbor.
Therefore, the task of prediction becomes minimizing the following objective function

\[
f(x) = \sum_{k=1}^{N} \left\| \hat{x}_k - x_k \right\|^2
     = \sum_{k=1}^{N} \Big( x_k - \sum_{i=1}^{4} x_{k,i}\, e^{-\alpha_i d(x_k, x_{k,i})} \Big)^2, \tag{2}
\]
where N is the number of pixels used for prediction. For convenience, assume that the interpolating function only affects immediately neighboring pixels, i.e. \(d(x_k, x_{k,i})\) equals one uniformly. Hence (2) becomes

\[
f(x) = \sum_{k=1}^{N} \Big( x_k - \sum_{i=1}^{4} x_{k,i}\, e^{-\alpha_i} \Big)^2. \tag{3}
\]
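For illustration, a minimal NumPy sketch of Eq. (3) might look as follows. The choice of the four causal neighbors (left, upper-left, upper, upper-right) is an assumption, since the paper does not fix the neighbor layout.

```python
import numpy as np

def prediction_error(image, coeffs):
    """Eq. (3): sum of squared 4-neighbor prediction errors.

    coeffs[i] plays the role of exp(-alpha_i). The four neighbors are
    assumed to be left, upper-left, upper, and upper-right (a common
    causal layout; the paper does not specify which four are used).
    """
    x = image.astype(float)
    neighbors = (x[1:-1, :-2],   # left
                 x[:-2, :-2],    # upper-left
                 x[:-2, 1:-1],   # upper
                 x[:-2, 2:])     # upper-right
    pred = sum(c * v for c, v in zip(coeffs, neighbors))
    err = x[1:-1, 1:-1] - pred
    return float((err ** 2).sum())
```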
Hence the problem of lossless image compression becomes a parameter optimization problem with four variables \(\alpha_1, \alpha_2, \alpha_3, \alpha_4\). Obviously, many numerical/function optimization algorithms, e.g., IFEP [7] and StGA [8], can solve this kind of problem well. However, an optimization task in the field of evolvable hardware (especially in intrinsic EHW mode) is quite different from one implemented in software. To make the problem of Eq. (3) easy to implement as intrinsic EHW, the parameters in Eq. (3) can be coded as a set of binary strings, so that the value of the exponential function can be calculated on chip by querying an embedded look-up table (LUT). For example, the strings "101101, 011100, 101100, 010101" can represent the values "0.8125, -1.7500, 0.7500, -1.3125" of \(\alpha_1, \alpha_2, \alpha_3, \alpha_4\) respectively; the candidate predictive function in the form of Eq. (1) can then be expressed as the linear function x = 2.25353478721321 x1 + 0.173773943450445 x2 + 2.11700001661267 x3 + 0.269146348729184 x4. It is apparent that 1) the size of the LUT is \(2^{L/4}\), where L is the length of the binary string; and 2) this coding method allows the task of predictive lossless image compression to be implemented on chip easily.
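As an illustration of this coding, the following Python sketch decodes a 24-bit chromosome into the four parameter values and the corresponding predictor coefficients. The 6-bit field format (a leading sign bit followed by a five-bit magnitude in steps of 1/16) is an assumption chosen because it reproduces the example strings above, and math.exp stands in for the embedded LUT; note that the example coefficients correspond to exponentiating the decoded values directly.

```python
import math

# Hypothetical field format, chosen to reproduce the example above
# ("101101" -> 0.8125, "011100" -> -1.75, etc.): the leading bit gives
# the sign (1 -> +, 0 -> -), the other five bits a magnitude in 1/16 steps.
def decode_field(bits):
    sign = 1.0 if bits[0] == '1' else -1.0
    return sign * int(bits[1:], 2) / 16.0

def decode_chromosome(chromosome):
    """Split a 24-bit string into four 6-bit fields and decode them."""
    assert len(chromosome) == 24
    return [decode_field(chromosome[i:i + 6]) for i in range(0, 24, 6)]

def predictor_coefficients(chromosome):
    """On chip the exponential would come from the embedded LUT indexed
    by the 6-bit field; math.exp stands in for that table here."""
    return [math.exp(a) for a in decode_chromosome(chromosome)]

print(predictor_coefficients("101101011100101100010101"))
# -> [2.2535..., 0.1737..., 2.1170..., 0.2691...], matching the text
```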
2.2 The Binary Evolutionary Programming Algorithm
Two important notions help in constructing a binary evolutionary programming algorithm. One is the technique of asexual reproduction with local selection presented in StGA; the other is the idea, discussed in IFEP, that long jumps can help generate an offspring in the neighborhood of the global minimum. To minimize the population size and thereby reduce circuit resources as much as possible, the notion of asexual reproduction and the idea of using long jumps are combined here to form the binary evolutionary programming algorithm. Technically, the asexual reproduction can be repeated step by step with a series of compound mutations as shown in Fig. 1, where h0 is the parent, h1–h5 are offspring, and pm1 and pm2 are mutation probabilities. Empirically, pm1 and pm2 can usually be taken as 1/3 and 1/9 respectively. Thereby, the average differences (search step sizes in Hamming distance) between the parent and each offspring can
Fig. 1. Illustration of the asexual dendriform reproduction, where h0 is the parent; h1, h2, h3, h4, and h5 are offspring; and pm1 and pm2 are the probabilities of mutation.

be calculated to be 1/3, 1/9, 4/9, 13/27, and 16/81 of the length of the bit string, respectively. Apparently, the local selection mechanism of StGA is absorbed, but its random generation method is replaced by a mixed search with five different mean mutation step sizes; the characteristic of IFEP, using two different mean step sizes for mutation (Gaussian mutation and Cauchy mutation), is also absorbed, but the jumps are made both more frequent and deeper. Since the local dendriform reproduction shown in Fig. 1 mixes relatively large and relatively small mutations, it can be regarded as a hybrid of local and global search. For scalability reasons, we treat it here as an independent evolution procedure and denote it as binEP.
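As a check on the step sizes quoted above, a bit differs from the parent after m compounded mutation passes exactly when it is flipped an odd number of times; the short derivation below (not in the original) recovers all five values.

```latex
% Probability that a single bit differs after m independent mutation
% passes, each flipping it with probability p (odd number of flips):
\[
  P(m,p) \;=\; \tfrac{1}{2}\left(1 - (1-2p)^{m}\right)
\]
% With p_{m1} = 1/3 and p_{m2} = 1/9 this yields the five mean step sizes:
% h1: P(1, 1/3) = 1/3        h2: P(1, 1/9) = 1/9
% h3: P(2, 1/3) = 4/9        h4: P(3, 1/3) = 13/27
% h5: P(2, 1/9) = 16/81
```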
The basic procedure of binEP is summarized as follows (a software sketch of binEP and BinEP is given after the BinEP procedure below).

Step 1. Initialize an L-bit binary string randomly and denote it as h0. Set C = 1. Each bit of h0 is taken as a circuit switch controlling the real-valued parameters of the predictive function. Evaluate the fitness of h0.
Step 2. Generate five offspring asexually as follows: mutate h0 with probability pm1 → h1; mutate h0 with probability pm2 → h2; mutate h1 with probability pm1 → h3; mutate h3 with probability pm1 → h4; mutate h2 with probability pm2 → h5; where 1 > pm1 > pm2, and pm1, pm2 are the probabilities that each binary bit is mutated, i.e. flipped from 0 to 1 or from 1 to 0.
Step 3. Evaluate the fitness of each offspring by Eq. (3).
Step 4. Select the best of {h1, h2, h3, h4, h5} and denote it as h0'. If h0' is better than or equal to h0, substitute h0' for h0 as the parent of the next generation.
Step 5. Stop if the halting criterion is satisfied; otherwise set C = C + 1 and go to Step 2.

The sampling and interpolating mechanism from the field of signal processing makes the scalability problem simple and direct to solve. To reduce the computational time on large amounts of data, we use a pyramidal fitness evaluation as follows; for convenience, we denote the resulting algorithm as BinEP.
Step 1. Segment the predictive area of an m × n image into a set of l × k templates, where l < m, k < n.
Step 2. Take the set of mean values of the l × k templates as the input data and execute binEP.
Step 3. If the fitness has not improved after some generations, reduce l and k randomly and go to Step 2. Stop when the fitness has not improved after some generations with l = 1 and k = 1.

Obviously, in BinEP the changing of the input data shifts the environment of binEP from coarse granularity to fine granularity. Since a coarse expression of the search space makes the objective problem easier to approach, BinEP supplies a kind of greedy, gradual approach while handling the scalability problem at the same time.
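The sketch below, in Python, is one possible reading of binEP and the BinEP shrinking loop. The stall-based halting rule, the strict-improvement stall reset, and the way the template fitness is constructed are assumptions, since the paper leaves the halting criterion and the reduction schedule open.

```python
import random

PM1, PM2 = 1 / 3, 1 / 9   # per-bit mutation probabilities from the text

def mutate(bits, pm):
    """Flip each bit independently with probability pm (Step 2)."""
    flip = {'0': '1', '1': '0'}
    return ''.join(flip[b] if random.random() < pm else b for b in bits)

def binep(fitness, length=24, max_stall=50):
    """binEP: the dendriform reproduction of Fig. 1 with best-of-five
    selection. `fitness` maps a bit string to Eq. (3) and is minimized;
    the stall-based halting rule (max_stall) is an assumption."""
    h0 = ''.join(random.choice('01') for _ in range(length))
    f0, stall = fitness(h0), 0
    while stall < max_stall:
        h1 = mutate(h0, PM1)   # mean step 1/3 of the string length
        h2 = mutate(h0, PM2)   # mean step 1/9
        h3 = mutate(h1, PM1)   # mean step 4/9
        h4 = mutate(h3, PM1)   # mean step 13/27
        h5 = mutate(h2, PM2)   # mean step 16/81
        fb, hb = min((fitness(h), h) for h in (h1, h2, h3, h4, h5))
        stall = 0 if fb < f0 else stall + 1
        if fb <= f0:           # Step 4: accept improvements and ties
            h0, f0 = hb, fb
    return h0, f0

def binep_pyramid(make_fitness, l=10, k=10):
    """BinEP: run binEP on l x k template means, shrinking toward 1 x 1.
    make_fitness(l, k) is assumed to build Eq. (3) over the template means."""
    while True:
        h, f = binep(make_fitness(l, k))
        if l == 1 and k == 1:
            return h, f
        l = random.randint(1, l - 1) if l > 1 else 1  # Step 3: reduce randomly
        k = random.randint(1, k - 1) if k > 1 else 1
```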
3 Simulation Experiments
The proposed BinEP is evaluated by comparison with Huffman coding and lossless JPEG2000. The images used are Lena (shown in Fig. 2) and a number of science images from NASA [9] (these images are used because they are publicly available, so the work in this paper can be validated by others). In the experiments, the initial parameters of BinEP are l = 10, k = 10, pm1 = 1/3, pm2 = 1/9, and the length of the chromosome is 24 bits. The size of a compressed image is calculated as Sizeof(compressed Error) + Sizeof(compressed Border) + Sizeof(Binary string).
Fig. 2. The 256 × 256 image of Lena, and the 1374 × 889 science image PIA04349
The experiments are divided into two parts with different aims. The first compares with the GP-based method on the Lena image, since this comparison exhibits the basic efficiency of the methods. The second evaluates large scale images to examine the scalability of BinEP. Although the proposed evolutionary technique can be implemented on our DSP development platform (with a TMS320VC5402), in order to record the evolving time for comparison of computational effort, all results are averaged over 50 runs of software simulation on an AMD 1.2 GHz CPU.
3.1 Evaluation on a Small Size Image
Experimental results on Lena are summarized in Table 1, where two parameter settings for BinEP, with 3 × 3 and 1 × 1 templates (the latter meaning BinEP runs without the gradual-approach technique, i.e., binEP), are listed for analysis. Clearly, the optimization performance of BinEP is better than both the GP-based method and lossless JPEG2000. It is also clear that BinEP and binEP have the same optimization performance, while the computing time of BinEP is much less than that of binEP. That is, the proposed gradual-approach technique for large amounts of input data is validated as feasible on a small size image.

Table 1. Comparison between BinEP and the GP-based method on the Lena image (256 × 256, grey, 65,536 bytes), where BinEP uses a 3 × 3 template, and binEP means BinEP with a 1 × 1 template (i.e., BinEP without the gradual-approach technique). In both cases the error matrix is coded with the Huffman code technique.

Methods     Compressed Size   Computing Time
GP          43,154 bytes      169.0 ± 0.1 sec.
binEP       40,247 bytes      11.23 ± 0.1 sec.
BinEP       40,247 bytes      3.080 ± 0.1 sec.
JPEG2000    44,090 bytes      < 1 sec.

3.2 Evaluation on Large Scale Images
In this experiment, the template used in BinEP is 10 × 10. Two lossless coding techniques are used to code the evolved error matrix: one is the Huffman code used in the earlier approaches of [2] and [4]; the other is lossless JPEG2000, newly investigated in this paper. The experimental results are listed in Table 2, where the results of compressing the original images directly with the Huffman code and with lossless JPEG2000 are also given. Apparently, both the compression results and the computing time of BinEP on large scale images are better and acceptable. The difficulty for BinEP on very large images, such as PIA07335, lies in the later period of evolution, when the template has shrunk to one pixel; at that point the computing time cannot be reduced, since the evolved data comprise the entire image. Taking the result of binEP (without the gradual-approach technique) on the 256 × 256 Lena image as the reference, the efficiency of BinEP on larger scale images can be summarized approximately as shown in Fig. 3. BinEP remains satisfactory for image sizes increased by up to about 70 times, with the computing time increasing relatively more slowly than the image size.
3.3 General Comparison Between BinEP and DRC
The Dispersed Reference Compression (DRC) [3] is a newly developed method descended from [1]. Essentially, the ideas of the proposed method and DRC are quite different.
Table 2. Comparison between BinEP, Huffman code, and lossless JPEG2000 on large scale images. BinEPv1 means the Huffman code is used for coding the error matrix; BinEPv2 means lossless JPEG2000 is used for coding the error matrix.

Image Name  Huffman (bytes)  JPEG2000 (bytes)  BinEPv1 (bytes)  BinEPv2 (bytes)  Evolving Time (sec.)
PIA07335    5,578,020        2,515,657         2,311,920        2,389,501        266.57 ± 3
PIA07217    6,286,500        2,031,297         1,918,890        1,983,626        214.89 ± 3
PIA05578    5,522,256        1,540,111         1,698,480        1,444,500        161.81 ± 4
PIA07225    4,919,459        2,108,906         2,151,157        2,016,007        121.58 ± 2
PIA07096    4,247,303        1,938,940         1,637,002        1,886,143        132.92 ± 3
PIA07343    2,270,780        1,935,780         1,762,368        1,814,022        75.50 ± 2
PIA07227    1,955,670        1,622,853         1,552,944        1,550,086        52.23 ± 3
PIA04349    1,115,492        789,619           720,341          735,189          32.32 ± 3
PIA05202    816,127          576,951           534,536          541,220          25.14 ± 2
PIA06322    731,328          529,634           481,381          492,653          37.03 ± 2

Note: the sizes of the images, from top to bottom, are 3000 × 2400, 3000 × 2400, 2400 × 2400, 2841 × 1846, 3000 × 1688, 2104 × 1726, 1320 × 1840, 1374 × 889, 1065 × 771, and 1239 × 805, respectively.
Fig. 3. Approximate estimation of the efficiency of BinEP on larger scale images, where the reference image size is 256 × 256. The vertical axis is the increase factor of computing time, and the horizontal axis is the increase factor of image scale.
On the issue of evolving large scale images, [3] presents a number of compression-ratio results but gives no details about computing time. It is therefore hard to make an exact experimental comparison between BinEP and DRC, even though computing time is the most important issue for evolvable hardware, especially for real-time applications. For the task of predictive lossless image compression itself, intrinsic EHW is more attractive than extrinsic EHW. The reason is very simple and clear: with extrinsic EHW, the images have already been evolved by software (a simulator), and repeating on hardware work already done in software has little practical meaning
beyond study purposes. On this issue, the proposed method is more advantageous, since its problem representation is suited to the intrinsic EHW mode.
4 Conclusions
This paper proposes an intrinsic EHW model for predictive lossless image compression that can be implemented on a chip directly, in contrast to the extrinsic EHW models used in [1]–[5]. For the problem of evolving large scale images, this paper presents a binary evolutionary programming algorithm (BinEP) that combines the technique of local asexual reproduction with the idea of mixed mutations. The proposed evolutionary technique is suitable for implementation in intrinsic evolvable hardware mode. Experimental results show that the proposed method greatly reduces computing time and can scale to images up to 70 times larger, with the computing time increasing relatively more slowly than the image size.
References

1. Higuchi, T., Murakawa, M., Iwata, M., Kajitani, I., Liu, W., Salami, M.: Evolvable hardware at function level. In: IEEE International Conference on Evolutionary Computation (1997) 187–192
2. Fukunaga, A., Hayworth, K., Stoica, A.: Evolvable hardware for spacecraft autonomy. In: IEEE Aerospace Conference, vol. 3 (1998) 135–143
3. Sakanashi, H., Iwata, M., Higuchi, T.: A lossless compression method for halftone images using evolvable hardware. In: Evolvable Systems: From Biology to Hardware, LNCS 2210, Springer (2001) 314–326
4. Fukunaga, A., Stechert, A.: Evolving nonlinear predictive models for lossless image compression with genetic programming. In: Proc. of the Third Annual Genetic Programming Conference, Wisconsin (1998)
5. He, J., Wang, X., Zhang, M., Wang, J., Fang, Q.: New research on scalability of lossless image compression by GP engine. In: The 2005 NASA/DoD Conference on Evolvable Hardware, Washington DC, USA (2005) 160–164
6. Yao, X., Higuchi, T.: Promises and challenges of evolvable hardware. IEEE Trans. on Systems, Man, and Cybernetics, Part C: Applications and Reviews 29(1) (1999)
7. Yao, X., Liu, Y., Lin, G.: Evolutionary programming made faster. IEEE Trans. on Evolutionary Computation 3(2) (1999) 82–102
8. Tu, Z., Lu, Y.: A robust stochastic genetic algorithm (StGA) for global numerical optimization. IEEE Trans. on Evolutionary Computation 8(5) (2004) 456–470
9. http://photojournal.jpl.nasa.gov/gallery/universe