Progressive Very Low Bit Rate Image Coding

C. A. Christopoulos1, A. N. Skodras2, W. Philips3, J. Cornelis1, A. G. Constantinides4

1 Vrije Universiteit Brussel, VUB-ETRO (IRIS), Pleinlaan 2, 1050 Brussels, Belgium
2 University of Patras, Electronics Laboratory, Patras 26110, Greece
3 University of Gent, Sint-Pietersnieuwstraat 41, 9000 Gent, Belgium
4 Imperial College, Dept. of Electrical Engineering, London SW7 2BT, UK
E-mail: [email protected]

Abstract: This paper describes progressive very low bit rate coding using segmented images. A segmentation algorithm which gives good results for coding purposes is presented and compared with existing segmentation methods used for the same purpose. Progressive image coding is achieved with a recently proposed set of weakly separable basis functions.

1 Introduction

Segmented Image Coding (SIC) is a relatively new image compression technique [1]. In SIC, the image is partitioned into regions of slowly varying intensity. The contours separating the regions are coded, e.g. by chain codes, while the image intensity inside each region is approximated by a linear combination of basis functions. At high compression ratios, SIC yields better subjective image quality than block transform coding (for example JPEG), because the objectionable blocking effects are avoided [2]. For this reason, SIC is ideal for very low bit rate coding and progressive transmission of images.

The actual performance of SIC depends highly on the segmentation algorithm used. For coding purposes, the segmentation should have the following properties: (a) it should be possible to control the number of regions, and hence the amount of detail in the segmented image, (b) the produced regions must be smooth, with slowly varying image intensity, so that they can be reconstructed with low-degree polynomials, (c) few small regions should be produced, since coding many small regions decreases the compression ratio, and (d) the contours should be smooth so that they can be coded efficiently.

In the newest SIC techniques [3,4], the basis functions within a given region are orthonormal. With an orthonormal basis, the coefficients in the linear expansion can be obtained independently of one another, with fewer and numerically more stable computations [3]. Even so, the computational requirements of SIC are very high, because the orthonormal basis for a region depends on its shape and size, and consequently a new basis must be constructed for every single region. In this paper, the orthogonal basis functions of Philips [4] are used. They produce results similar to those of the basis functions of Gilge et al. [3], but the set of basis functions can be computed faster because they are weakly separable.
For an image of size 256x256 pixels, they require 5 times less memory and can be computed 8 to 30 times faster than those of [3] (the exact speed-up depends on the number of basis functions computed in each region [5]). The way the region intensity is approximated is reviewed in the following sections. A new segmentation algorithm which takes into account the edge information in the image is also given.

The number of contour points is very important in SIC applications, because it determines the compression ratio at low bit rates. In this paper it is assumed that no more than 1.6 bits per contour pixel (bpcp) are needed to code the contours, even though the approximation method described in [6] requires only 1.2 bpcp. Our experiments show that at high compression, 60-80 percent of the total bits are spent on coding the contours. Minimizing the number of contour points is therefore an important aspect of developing efficient SIC techniques.

2 The new segmentation algorithm

2.1 Theory

The importance of edges has been recognized by many researchers [7]. To reduce the risk of losing important edge information, the segmentation algorithm starts by producing an over-segmented image. The over-segmented image is produced with the "edgmentation" algorithm described in [8]. Edgmentation is a modified split-and-merge algorithm in which the splitting step is performed according to the edge information, and not according to the quad-tree of the image, resulting in much more natural borders. The splitting step consists of a steepest path growing algorithm applied to the reverse of the edge image (the edge image is the square of the gradient image). The local maxima in the reverse of the edge image are defined as roots. Pixels are linked to one another along the steepest path in the direction of the maximum of the reverse edge image, until each pixel is linked to a root. All pixels linked to the same root are attributed the gray value of the root and define a micro-segment. In mathematical form, if the input image is expressed as a continuous function $z = f(x, y)$, then the edge image is $z' = |\nabla f|^2$.

A lot of small segments (micro-segments) are produced by the splitting step. Usually these segments are too small to be meaningful. Since adjacent micro-segments could be part of the same object, they often have to be merged. Our tests showed that a merging cost function based only on the absolute difference of the mean values of adjacent regions is not appropriate. Such a cost function results in relatively big regions which are easy to code, but a lot of important edge information is lost. Since edges are important at high compression and must be retained as much as possible, we define a new cost function. More specifically, the gradient image used in the splitting step of the edgmentation algorithm is used again in the region merging. A function $CF(i, j)$ is defined as the cost of merging region $j$ into region $i$ (if merging is performed, the label of region $j$ is changed and made equal to the label of region $i$) as follows:

$$CF(i,j) = \frac{\text{gradient}(i,j)}{\text{shared.contour.length}(i,j)} \cdot \frac{\text{size}(j)}{\text{shared.contour.length}(i,j)} \cdot \text{difference.in.mean.value}(i,j) \qquad (1)$$

The cost function is calculated for all regions produced by the edgmentation algorithm. The regions with minimum cost are merged first. After merging, the costs are recalculated and the procedure continues until the desired number of regions is obtained. The cost function of eq. (1) favours merging of neighbouring regions with long shared contours, small sizes (number of points) and weak separation. The gradient is calculated along the shared contour between two neighbouring regions. The ratio gradient(i,j)/shared.contour.length(i,j) (where gradient(i,j) is the sum of the gradient magnitudes along the common boundary of regions i and j) is a measure of the strength of the edge. The ratio size(j)/shared.contour.length(i,j) is included to select, among the regions of small size, those with the longer common boundary.

It is easy to see that when the differences in mean value between adjacent regions are similar, the regions with the weakest boundary will be merged first. Experiments showed that the cost function defined in eq. (1) gives a much better image representation, from the point of view of subjective image quality, than a cost function based exclusively on the difference in mean gray value of adjacent regions.
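The merging step driven by eq. (1) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `boundary_stats` and `merge_regions`, the use of 4-connectivity, and the choice to take the gradient along the shared contour as the average of the gradient magnitudes of the two pixels on either side are all assumptions made here. All costs are recomputed from scratch after every merge, which is simple but slow.

```python
import numpy as np

def boundary_stats(labels, grad):
    """Return {(a, b): [length, gradient_sum]} for 4-adjacent region pairs (a < b)."""
    stats = {}
    pairs = (
        (labels[:, :-1], labels[:, 1:], grad[:, :-1], grad[:, 1:]),  # horizontal neighbours
        (labels[:-1, :], labels[1:, :], grad[:-1, :], grad[1:, :]),  # vertical neighbours
    )
    for la, lb, ga, gb in pairs:
        mask = la != lb  # pixel pairs that straddle a region boundary
        for a, b, g in zip(la[mask], lb[mask], 0.5 * (ga[mask] + gb[mask])):
            key = (int(min(a, b)), int(max(a, b)))
            entry = stats.setdefault(key, [0, 0.0])
            entry[0] += 1         # shared contour length (assumed: count of pixel pairs)
            entry[1] += float(g)  # summed gradient magnitude along the contour
    return stats

def merge_regions(image, labels, grad, n_regions):
    """Greedily merge the pair with the smallest CF(i, j) until n_regions remain."""
    labels = labels.copy()
    while len(np.unique(labels)) > n_regions:
        stats = boundary_stats(labels, grad)
        ids, counts = np.unique(labels, return_counts=True)
        size = dict(zip(ids.tolist(), counts.tolist()))
        mean = {k: image[labels == k].mean() for k in size}
        best = None
        for (a, b), (length, gsum) in stats.items():
            diff = abs(mean[a] - mean[b])
            for i, j in ((a, b), (b, a)):  # CF is not symmetric: size(j) is used
                cost = (gsum / length) * (size[j] / length) * diff
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        if best is None:  # no adjacent regions left to merge
            break
        _, i, j = best
        labels[labels == j] = i  # region j takes the label of region i
    return labels

# Toy example: a flat region, a thin weakly separated strip, and a bright region.
image = np.tile(np.array([10, 10, 10, 12, 200, 200], dtype=float), (4, 1))
labels = np.tile(np.array([0, 0, 0, 1, 2, 2]), (4, 1))
gy, gx = np.gradient(image)
grad = np.hypot(gx, gy)
merged = merge_regions(image, labels, grad, n_regions=2)
```

In the toy example the thin strip (label 1) is small, weakly separated from the flat region on its left and strongly separated from the bright region on its right, so it is absorbed into label 0 while the strong edge is kept.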

2.2 Segmentation results and comparisons

The results of the proposed method applied to the "girl" image (Fig. 1) are shown in Figures 2.1, 2.2 and 2.3. Fig. 2.1 shows the over-segmented image (about 1300 regions). Fig. 2.2 shows the segmented image after applying the merging cost function of eq. (1) and Fig. 2.3 shows the result when all regions are filled with their mean value. The number of regions is 200 and the number of contour points is 7429 (compression is about 39:1 when 8 bits are used for storing the mean intensity of each region).

The results of the proposed segmentation algorithm are compared with the results produced by two other segmentation methods for coding purposes, namely the Recursive Shortest Spanning Tree (RSST) algorithm [2] and Hierarchical Segmentation using compound Gauss-Markov Random Fields [9].

The RSST algorithm maps the original image onto a region graph, so that each region initially contains only one pixel. Sorted weights, associated with the links between neighbouring image regions, are used to decide which link will be eliminated and therefore which regions will be merged. Merged regions are attributed the mean gray value of their pixels. After each merge (a cost function which minimizes the sum squared error (SSE) between the original and the segmented image is used in [2]), the link weights are recalculated and re-sorted. The number of regions is therefore progressively reduced from $N^2$ (in an image of size $N \times N$) to the desired value. Desired property (a) of the segmentation algorithm is fulfilled when the RSST algorithm is used. The results of the RSST algorithm applied to the "girl" image are shown in Fig. 3.1 and 3.2 (45 regions, 8288 contour points, compression 38.5:1). In this image, the eye and part of the hair have been regarded by the algorithm as one region, resulting in a bad representation of the image.
This can cause problems (as will be shown later) in the approximation of the region intensities with polynomials, since polynomials are only good at representing smooth regions [3]. The large number of contour points can be explained by looking at Fig. 3.1. It is clear that the shape of the produced contours is very rough, which is to be expected, since the algorithm starts from the pixel level. Desired properties (b), (c) and (d) for the segmentation algorithm are therefore not met by the RSST algorithm.

In the method of Marques et al. [9], the image is initially decomposed into several levels of different resolution. Let us define a given level $l$ of the decomposition, $z(l) = \{z_{i,j}(l)\}$, as a realization of a stochastic process $Z(l) = \{Z_{i,j}(l)\}$. The location of the regions is governed by a homogeneous Gibbs-Markov Random Field (GMRF) $Q(l) = \{Q_{i,j}(l)\}$. The decomposition chosen in [9] is the Gaussian pyramid. At each level of the pyramid, the image is modelled by a Compound Gauss-Markov Random Field and the segmentation is obtained by using a Maximum a Posteriori criterion which maximizes the joint distribution $P(Y(l), Q(l))$. The segmentation is carried out first at the top level of the pyramid. Once a level $l$ has been segmented, this segmentation is projected onto the next lower level $l-1$ and used as a first estimate of the final segmentation at this level. The process is repeated until the segmentation at the bottom level $l = 0$ has been found. Merging is based on the absolute difference of the mean gray values of adjacent regions.

The results of the algorithm discussed in [9] applied to the "girl" image are shown in Fig. 4.1 and 4.2 (184 regions, 8103 contour points, compression 36.5:1). Since the number of regions cannot be specified in this algorithm, it was chosen by trial and error. In this algorithm, a large number of the produced regions do not correspond to real regions in the scene. An image of better quality could have been obtained if the image had been segmented into a larger number of regions, but the number of contour points would have increased, resulting in a smaller compression ratio. Desired properties (a) and (b) are not met by this segmentation algorithm.
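The compression ratios quoted for the mean-filled segmentations can be checked with a short calculation. The sketch below assumes a 256x256 image with 8 bits per pixel (the image size is an assumption taken from the example in the Introduction), 1.6 bits per contour point and 8 bits per region for the mean intensity. The function name `sic_compression_ratio` is hypothetical.

```python
def sic_compression_ratio(n_regions, n_contour_points, width=256, height=256,
                          bits_per_pixel=8, bpcp=1.6, bits_per_region=8):
    """Ratio of raw image bits to (contour bits + one mean value per region)."""
    original_bits = width * height * bits_per_pixel
    coded_bits = n_contour_points * bpcp + n_regions * bits_per_region
    return original_bits / coded_bits

# (regions, contour points) as reported for the "girl" image
proposed = sic_compression_ratio(200, 7429)  # about 38.9
rsst = sic_compression_ratio(45, 8288)       # about 38.5
marques = sic_compression_ratio(184, 8103)   # about 36.3
```

Under these assumptions the first two values agree with the quoted figures of about 39:1 and 38.5:1, and the third comes out slightly below the quoted 36.5:1. In all three cases the contour bits dominate the total, which is why the number of contour points matters so much.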

3 The progressive image transmission scheme

In SIC, the image intensity $f(x,y)$ of a region $\Omega$ is approximated by a weighted sum of orthogonal base functions $P_{m,n}(x,y)$:

$$f(x,y) \approx f'(x,y) = \sum_{(m,n) \in I} A_{m,n} P_{m,n}(x,y)$$

where $I$ is a suitably chosen index set and the coefficients $A_{m,n}$ are determined such that the total squared approximation error between $f'$ and $f$ is minimal (least squares approximation). As the base functions are normalized and mutually orthogonal, the least squares coefficients are $A_{m,n} = f \cdot P_{m,n}$, i.e. the inner product of $f$ and $P_{m,n}$ over $\Omega$. The method to find the approximation of the image intensity in a region is fully described in [4]. Parts of this description are reviewed here and results are given.

Let $\Omega$ be the union of $N$ different rows $R_j = \{(x,y) \in \Omega : y = y_j\}$, $1 \le j \le N$, let $n_j$ be the number of pixels in $R_j$, and let us assume that the rows are indexed such that $n_i \le n_j$ if $i > j$. Using this notation, it can be shown that the functions in $S = \{x^i y^j : 0 \le j < N,\ 0 \le i < n_j\}$ are linearly independent on $\Omega$ [4]. Suppose also that $B = \{P_{i,j}(x,y) : 0 \le j < N,\ 0 \le i < n_j\}$ is the base obtained by orthogonalizing the functions in $S$ in the lexicographical order $1, y, y^2, \ldots, x, xy, xy^2, \ldots$. It can be shown that $B$ is weakly separable [4], and more specifically that $P_{i,j}(x,y) = P_{i,0}(x,y)\, Q_{i,j}(y)$, where $Q_{i,j}(y)$ is a polynomial of degree $j$ in $y$. Also, $P_{i,0}(x, y_j)$ is proportional to the discrete Legendre polynomial $p_{i,j}(x)$ of degree $i$ associated with the row $R_j$, i.e. $P_{i,0}(x, y_j) = s_{i,j}\, p_{i,j}(x)$. The functions $Q_{i,j}(y)$, $j = 0, 1, \ldots$ and $p_{i,j}(x)$, $i = 0, 1, \ldots$ satisfy a three-term recurrence relation and can therefore be generated quickly [4]. The coefficients can be computed as:

$$A_{i,j} = f \cdot P_{i,j} = \sum_{k=1}^{N} Q_{i,j}(y_k)\, s_{i,k}\, B_{i,k} \quad \text{where} \quad B_{i,k} = \sum_{x \in R_k} p_{i,k}(x)\, f(x, y_k) \qquad (2)$$
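The idea of a region-dependent orthonormal base and of progressive reconstruction from its coefficients can be illustrated with a direct, slow construction. The sketch below does not implement the fast weakly separable algorithm of [4]: it orthonormalises monomials with modified Gram-Schmidt, restricts the index set to total degree at most `degree`, and rescales the coordinates to [-1, 1] for numerical stability. These choices, and the function name `orthonormal_basis`, are assumptions made for illustration.

```python
import numpy as np

def orthonormal_basis(mask, degree, tol=1e-10):
    """Orthonormalise x^i y^j (i + j <= degree) on the pixels where mask is True.

    Returns (ys, xs, P); column k of P is the k-th base function sampled
    at the region's pixels, in the order 1, y, y^2, ..., x, xy, ...
    """
    ys, xs = np.nonzero(mask)

    def rescale(c):
        span = c.max() - c.min()
        return 2.0 * (c - c.min()) / span - 1.0 if span else np.zeros(len(c))

    x, y = rescale(xs), rescale(ys)
    basis = []
    for i in range(degree + 1):
        for j in range(degree + 1 - i):
            v = x**i * y**j
            for p in basis:  # modified Gram-Schmidt: remove earlier components
                v = v - np.dot(p, v) * p
            norm = np.linalg.norm(v)
            if norm > tol:   # skip monomials that are dependent on this region
                basis.append(v / norm)
    return ys, xs, np.column_stack(basis)

# A disc-shaped region inside a 16x16 block, with a smooth intensity.
yy, xx = np.mgrid[0:16, 0:16]
mask = (xx - 7.5) ** 2 + (yy - 7.5) ** 2 <= 36.0
ys, xs, P = orthonormal_basis(mask, degree=2)
f = 50.0 + 3.0 * xs + 2.0 * ys + 0.1 * xs * ys

A = P.T @ f  # least squares coefficients A = f . P
# Progressive reconstruction: use the first k coefficients only.
errors = [np.linalg.norm(f - P[:, :k] @ A[:k]) for k in range(1, len(A) + 1)]
```

Because the base is orthonormal, each coefficient is computed independently of the others, and sending one more coefficient can only reduce the approximation error. Here the intensity is itself a polynomial of total degree 2, so the last entry of `errors` is zero up to rounding.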

Figures 5.1, 5.2 and 5.3 show the reconstructed image "girl" after compression at 25:1, 26:1 and 25:1 respectively, with the segmentation methods of [2] and [9] and the algorithm proposed in this paper. In each case, the number of coefficients $N_c$ for a region with $n_p$ pixels was determined as $N_c = \min(\alpha \cdot n_p, Nb_{max})$, where $0 < \alpha \le 1$ and $Nb_{max}$ is the maximum number of coefficients to be computed in each region. This strategy assigns more coefficients to larger regions, because more degrees of freedom are generally needed to represent large regions with the same accuracy as small regions. In addition, it limits the maximum number of base functions to be calculated in a region, and consequently the computational and memory requirements. It can be observed that some important details have been lost in the reconstructed images of Figures 5.1 and 5.2 because of the poor initial segmentation. Comparison with the results obtained with the proposed method shows that taking the edge information into account in both the splitting and the merging parts of the algorithm improves the quality of the reconstructed image.
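The coefficient allocation rule can be written in a few lines. The text does not say how a non-integer product of α and the region size is handled, so rounding down and keeping at least one coefficient per region (the mean value) are assumptions of this sketch, as is the function name `num_coefficients`.

```python
import math

def num_coefficients(n_pixels, alpha, nb_max):
    """Nc = min(alpha * np, Nb_max), rounded down, with at least one coefficient."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    return max(1, min(math.floor(alpha * n_pixels), nb_max))

large = num_coefficients(1000, alpha=0.125, nb_max=30)  # capped at Nb_max
medium = num_coefficients(200, alpha=0.125, nb_max=30)  # proportional to region size
tiny = num_coefficients(4, alpha=0.125, nb_max=30)      # falls back to one coefficient
```

With these illustrative parameter values the three regions receive 30, 25 and 1 coefficients respectively.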

4 Discussion

The segmentation algorithm presented here integrates region growing and edge detection. Both the splitting and the merging steps of the algorithm use the edge information of the image, and this gives better results than other existing segmentation algorithms for SIC. The algorithm has the desired properties (a), (b), (c) and partly (d) and can, in our opinion, still be improved, e.g. by applying contour smoothing techniques such as the one proposed in [10] to improve the visual appearance of the boundaries and to reduce the number of contour points.

The algorithm differs significantly from that in [7], which also integrates region growing and edge detection. In [7] the images are segmented by split-and-merge based on quadtree decomposition, and in a second stage an objective function is used to detect and remove artefacts introduced by the quadtree segmentation algorithm (artificial region boundaries, etc.). Our splitting and merging steps are different from the ones given in [7]; they do not rely on the quadtree image representation.

The method was also evaluated on other images, such as 'peppers', 'plane' and frames of the sequence "Miss America". Improved results can be achieved if the parameters $\alpha$ and $Nb_{max}$ are optimized for a specific compression factor. For example, in each region, we can start from a small number of basis functions and increase it until a desired quality (based on MSE, for example) of the reconstructed image is obtained. Our experiments indicate that such an approach gives much better results, both in compression and in image quality. The drawback of such an approach is that the computation time increases dramatically. Additionally, if a model of the image is available, the background of the image can be approximated with the mean gray value only, and more basis functions can then be used for approximating the significant portions of the image, without decreasing the compression ratio.
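The quality-driven choice of the number of basis functions per region can be sketched using the orthonormality of the base: the squared error left after keeping the first K coefficients equals the energy of the region intensity minus the sum of the squares of those coefficients. The function below relies on that identity and is an illustration only, not the procedure used in the experiments. The name `coefficients_needed` and its arguments are hypothetical.

```python
import numpy as np

def coefficients_needed(coeffs, region_energy, n_pixels, target_mse):
    """Smallest K whose first K coefficients reach target_mse.

    coeffs        -- coefficients with respect to an orthonormal base, in
                     transmission order
    region_energy -- sum of f(x, y)^2 over the region
    Returns len(coeffs) if the target cannot be reached.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    kept_energy = np.concatenate(([0.0], np.cumsum(coeffs ** 2)))
    # Residual energy after keeping K = 0, 1, ..., len(coeffs) coefficients.
    mse = np.maximum(region_energy - kept_energy, 0.0) / n_pixels
    reached = np.nonzero(mse <= target_mse)[0]
    return int(reached[0]) if reached.size else len(coeffs)

# Four coefficients for a 4-pixel region whose total energy is 111.
k_loose = coefficients_needed([10.0, 3.0, 1.0, 0.5], 111.0, 4, target_mse=1.0)
k_tight = coefficients_needed([10.0, 3.0, 1.0, 0.5], 111.0, 4, target_mse=0.1)
```

In this toy case two coefficients are enough for the looser target, while the tighter target cannot be met and all four coefficients are used. This check costs little once the coefficients are known; the expensive part noted in the text is computing additional base functions for every region.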

Conclusions

The results presented in this paper show the importance of the segmentation algorithm in Segmented Image Coding. A segmentation algorithm suitable for SIC has been presented and progressive image transmission using weakly separable basis functions has been described.

Acknowledgements

This work was supported by the EC Human Capital and Mobility programme under the contracts ERBCHBGCT930260 and ERBCHRXCT930382 and by the NFWO contract 9.0051.93.

References

[1] M. Kunt, M. Benard, R. Leonardi, "Recent results in high-compression image coding", IEEE Trans. Circuits and Systems, Vol. 34, November 1987, pp. 1306-1336.
[2] M.J. Biggar, O.J. Morris, A.G. Constantinides, "Segmented image coding: performance comparison with the discrete cosine transform", IEE Proceedings, Part F, Vol. 135, No. 2, April 1988, pp. 121-132.
[3] M. Gilge, T. Engelhardt, R. Mehlan, "Coding of arbitrarily shaped image segments based on a generalized orthogonal transform", Signal Processing: Image Communication, Vol. 1, No. 2, October 1989, pp. 153-180.
[4] W. Philips, "Weakly separable bases for fast segmented image coding", Proceedings of SPIE Hybrid Image and Signal Processing IV, Vol. 2238, April 1994, Orlando, Florida, pp. 153-163.
[5] W. Philips, C.A. Christopoulos, "Fast segmented image coding using weakly separable bases", Proceedings of ICASSP 94, Adelaide, Australia, 19-22 April 1994, pp. V:345-348.
[6] M. Eden, M. Kocher, "On the performance of a contour coding algorithm in the context of image coding - Part I: contour segment coding", Signal Processing, Vol. 8, No. 4, July 1985, pp. 331-386.
[7] T. Pavlidis, Y.T. Liow, "Integrating region growing and edge detection", IEEE Trans. on PAMI, Vol. 12, No. 3, March 1990, pp. 225-233.
[8] R. Deklerck, J. Cornelis, M. Bister, "Segmentation of medical images", Image and Vision Computing, Vol. 11, No. 8, October 1993, pp. 486-503.
[9] F. Marques, J. Cunillera, A. Gasull, "Hierarchical segmentation using compound-Markov random fields", Proceedings of ICASSP 92, Vol. 3, San Francisco, March 1992, pp. 53-56.
[10] C. Gu, M. Kunt, "Contour simplification by a new non-linear filter for region-based coding", Proc. of Visual Communication and Image Processing, VCIP 94, Chicago, USA, Sept. 1994.

Figure 1 The original image 'girl'

Figure 2.1 Over-segmented image (1300 regions)

Figure 2.2 Result of region merging (200 regions)

Figure 2.3 The image of Fig. 2.2 with each region filled with its mean intensity

Figure 3.1 Image segmented with [2] (45 regions)

Figure 3.2 The image of Fig. 3.1 with each region filled with its mean intensity

Figure 4.1 Image 'girl' segmented with the algorithm of [9] (184 regions)

Figure 4.2 The image of Fig. 4.1 with each region filled with its mean intensity

Figure 5.1 Image of Fig. 3.2 reconstructed after compression 25:1

Figure 5.2 Image of Fig. 4.2 reconstructed after compression 26:1

Figure 5.3 Image of Fig. 2.3 reconstructed after compression 25:1
