A Feature-Based Scalable Codec for Image Compression

Lun-Chia Kuo and Sheng-Jyh Wang
Department of Electronics Engineering and Institute of Electronics
National Chiao Tung University (NCTU), Hsinchu, Taiwan, R.O.C.
E-Mail: [email protected]

Abstract

This paper presents a feature-based scheme for image compression. In the proposed method, the features in an image are recorded as three-dimensional verge points, which are defined as high-curvature points on the image surface. The verge-point representation is further condensed by B-spline control-point approximation. By encoding and transmitting these control points, the features of the image can be recovered at the decoder side. Furthermore, by selecting different coding modes, the proposed method offers several kinds of scalability, including SNR scalability, spatial scalability, and shape scalability. In addition, the coding technique is well suited to region-of-interest selection and transmission. Experimental results are provided to demonstrate the scalability and interactivity of the proposed method.

1. Introduction

To transmit bulky image data over limited network bandwidth, different image coding approaches have been proposed [1]-[4]. In a network environment, scalable image coding is useful for coping with fluctuations in the available bandwidth. Moreover, since the objects in an image carry more information for network users than the raw pixel array does, it is helpful if those objects can be conveniently identified and selected. In [5], we proposed a method to represent images using verge points on the image surface. In this paper, building on [5], we further explore its application to image coding and propose an image coding scheme with two main features: scalability and interactivity.

The overall structure of the proposed codec is shown in Fig. 1. In the first phase, the three-dimensional feature curves Ω_v are extracted from the image surface and approximated by a set of B-spline control points Ω_b. By arranging Ω_b in different hierarchical forms, scalable transmission and region-of-interest transmission can be achieved. To support interactive transmission, scaling and bitplane-shifting factors are incorporated to create different visual effects at the decoder side. Three quantizers, Qk, Qd, and Qi, are included to strike a balance between visual quality and required bandwidth.

Fig. 1. Block diagram of the proposed image codec.

2. Quantizer Determination

As mentioned above, three quantizers, Qk, Qd, and Qi, are included in the proposed scheme to strike a balance between visual quality and required bandwidth. These parameters can be either specified by the user or determined adaptively.

2.1. Curvature Quantizer Qk

The curvature quantizer Qk determines how much detail is transmitted to the decoder side. A smaller Qk gives a better-quality image at the cost of higher bandwidth; conversely, a larger Qk requires less bandwidth but sacrifices image quality. Let QS denote the set of allowed curvature values. To determine Qk automatically, we apply the elements of QS in ascending order to the test image and obtain the relation between the number of verge points Re and the mean square error De of the reconstructed image. This relation is used to estimate the rate-distortion (R-D) curve. We then find the knee point, that is, the place where the R-D curve bends most, in the estimated R-D diagram. The curvature value that produces the knee point is taken as the suggested curvature quantizer $Q_k^{opt}$, which is defined as

$$Q_k^{opt} = \arg\max_{j \in QS} \frac{\dfrac{d^2 D_e}{dR_e^2}(j)}{1 + \dfrac{dD_e}{dR_e}(j)}, \qquad (1)$$

where j indicates the testing point on the estimated R-D curve.

Proceedings of the 19th International Conference on Advanced Information Networking and Applications (AINA’05) 1550-445X/05 $20.00 © 2005 IEEE

2.2. Shape Quantizer Qd

The shape quantizer Qd determines the maximum allowed position distortion between the verge curves and the approximating B-spline curves. Different features in an image may have different visual impact. Curves with higher contrast and greater length, such as the contour of a highlighted human face against a dark background, tend to have a larger visual impact. Conversely, curves with lower contrast and shorter length, such as vague textures on cloth, tend to have less visual impact. Therefore, we propose a scheme that adaptively selects Qd based on the accumulated contrast AC, which is defined as the sum of all the curvature magnitudes along a curve. For the j-th curve in Ω_v, the corresponding Qd is defined as

$$Q_d(j) = \left(\frac{AC_{max}}{AC(j)}\right)^{\alpha} \times QD_{const}, \qquad (2)$$

where AC_max denotes the maximum accumulated contrast in Ω_v. The weighting factor α is empirically set to 0.3, and QD_const is set to 1 in this simulation.

2.3. Intensity Quantizer Qi

The quantizer Qi controls the intensity accuracy of the decoded image relative to the original image. In the proposed codec, since the intensity values of the smooth regions are obtained by iterative linear interpolation, the impact of intensity quantization is less significant as long as Qi is not too large. To obtain Qi adaptively for different test images, we increase Qi step by step and use the quantized verge curves in Ω_v to reconstruct the image. We then search for the knee point to obtain $Q_i^{opt}$, which is estimated using

$$Q_i^{opt} = \arg\max_{j \in [0 \ldots BN-1]} \frac{PSNR^{+}(j) - PSNR^{-}(j)}{PSNR^{+}(j) + PSNR^{-}(j)}, \qquad (3)$$

where BN indicates the number of intensity bitplanes. PSNR+(j) and PSNR-(j) indicate the magnitudes of the PSNR difference on the right-hand side and the left-hand side of a testing bitplane j, respectively.

3. Basic Data Structure

The basic hierarchical data structure of the proposed method is shown in Fig. 2. At the top layer, the header information and the B-spline curves in Ω_b are stored. For each B-spline curve, the curvature sign and the total number of control points are recorded first. Then, the starting control point and the body control points of each curve are encoded separately.

Fig. 2. Basic data structure.

To encode the body control points, we adopt the scheme proposed in [6] and add a linear mode to improve the coding efficiency. This modification is efficient for coding straight lines, especially long ones. To check whether a curve is a straight line, we draw a line from the starting point to the ending point of the curve. If the distortion between the curve and this line is less than Qd, the curve is considered a straight line and is coded using the linear mode. For the intensity component, the quantized intensity value of the starting point is encoded as an absolute value, while the intensity values of the body control points are coded differentially.

Fig. 3. (a) Approximated B-spline curves, where curves with positive and negative curvature values are indicated in red and green, respectively. (b) Decoded image using the proposed method.

Fig. 3(b) shows the image decoded from the B-spline curves in Fig. 3(a), with a PSNR of 28.83 dB and a compression ratio of 30.86, where Qk, Qd, and Qi are set to 2, 1, and 4, respectively.
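As a rough illustration of how the adaptive shape quantizer of Eq. (2) and the linear-mode test described above might fit together, the following Python sketch computes Qd for a curve from its accumulated contrast and then checks whether a curve can be coded as a straight line. The function names, the 2-D point lists, and the sample curvature values are hypothetical. The distortion is assumed here to be the maximum distance from the curve points to the chord, which the text does not specify.

```python
import math

ALPHA = 0.3      # weighting factor alpha in Eq. (2)
QD_CONST = 1.0   # QD_const in Eq. (2)


def accumulated_contrast(curvatures):
    """AC: sum of the curvature magnitudes along one verge curve."""
    return sum(abs(k) for k in curvatures)


def shape_quantizer(ac_j, ac_max, alpha=ALPHA, qd_const=QD_CONST):
    """Eq. (2): Qd(j) = (AC_max / AC(j)) ** alpha * QD_const."""
    return (ac_max / ac_j) ** alpha * qd_const


def point_to_segment_distance(p, a, b):
    """Euclidean distance from the 2-D point p to the segment from a to b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Parameter of the closest point on the segment, clamped to [0, 1].
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def is_straight(points, qd):
    """Linear-mode test. Assumption: distortion = max point-to-chord distance."""
    start, end = points[0], points[-1]
    return max(point_to_segment_distance(p, start, end) for p in points) < qd


# Hypothetical curvature values for a high-contrast and a low-contrast curve.
strong_curve = [4.0, -3.0, 5.0]     # AC = 12
weak_curve = [0.5, -0.25, 0.25]     # AC = 1
ac_max = max(accumulated_contrast(strong_curve), accumulated_contrast(weak_curve))
qd_strong = shape_quantizer(accumulated_contrast(strong_curve), ac_max)
qd_weak = shape_quantizer(accumulated_contrast(weak_curve), ac_max)

# The same slightly bent point sequence, tested against both tolerances.
bent = [(0.0, 0.0), (5.0, 1.5), (10.0, 0.0)]
strong_is_line = is_straight(bent, qd_strong)
weak_is_line = is_straight(bent, qd_weak)
```

Under Eq. (2), the curve with the largest accumulated contrast receives Qd = QD_const, while weaker curves receive a larger tolerance and are therefore more likely to fall into the linear mode.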

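The knee-point search of Section 2.1 can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the bend is measured by a finite-difference estimate of the second derivative of De with respect to Re, keeping only that term of Eq. (1), and the candidate set, rates, and distortions below are made-up numbers rather than measurements from the paper. The names `knee_index` and `select_qk` are hypothetical.

```python
def knee_index(rates, distortions):
    """Index of the interior sample where the sampled R-D curve bends most.

    The bend is the magnitude of a second-derivative estimate on a
    possibly non-uniform grid (an assumed discretization).
    """
    if len(rates) != len(distortions) or len(rates) < 3:
        raise ValueError("need at least three (rate, distortion) samples")
    best_index, best_bend = 1, -1.0
    for i in range(1, len(rates) - 1):
        left_slope = (distortions[i] - distortions[i - 1]) / (rates[i] - rates[i - 1])
        right_slope = (distortions[i + 1] - distortions[i]) / (rates[i + 1] - rates[i])
        bend = abs(2.0 * (right_slope - left_slope) / (rates[i + 1] - rates[i - 1]))
        if bend > best_bend:
            best_index, best_bend = i, bend
    return best_index


def select_qk(candidates, rates, distortions):
    """Return the candidate curvature value that produces the knee point."""
    return candidates[knee_index(rates, distortions)]


# Hypothetical measurements: QS applied in ascending order to a test image.
qs = [1, 2, 3, 4, 5]                      # allowed curvature set QS
re = [5000, 4000, 3000, 2000, 1000]       # number of verge points Re
de = [20.0, 22.0, 26.0, 60.0, 100.0]      # mean square error De
qk_opt = select_qk(qs, re, de)
```

With these sample numbers the distortion starts to rise sharply once fewer than about 3000 verge points are kept, so the sketch selects the third candidate.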

4. Scalable Transmission

Scalable coding is useful for transmitting images over networks with fluctuating bandwidth. By rearranging the B-spline control points, the proposed method offers several kinds of scalability, including SNR scalability, spatial scalability, and shape scalability.

4.1. SNR Scalability

In this paper, SNR scalability is achieved by combining the bitplane scheme with the scale scheme. The data structure for SNR scalability is shown in Fig. 4. For the B-spline curves in the coarse scale CS, the shape component and the first b1 bitplanes of the intensity component are transmitted first, followed by bitplanes b1+1 to b2 of the same curves. Then, for the B-spline curves in the medium scale MS, the shape component and the first b2 bitplanes of the intensity component are transmitted. Similar rules apply to the fine scale FS. Fig. 5(a)-(c) show examples of SNR scalable transmission, where b1, b2, and b3 are set to 1, 3, and 4, respectively. Fig. 10 shows the PSNR of the decoded image with respect to the required bitstream size for different parameter selections for the 512×512 image “peppers”.
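The transmission order described above can be made concrete with a short Python sketch that lists, in order, which component of which scale is sent. The treatment of the fine scale FS is an assumption extrapolated from the statement that similar rules apply: the scales already transmitted are first refined up to bitplane b3, and then the FS curves are sent with their shape and first b3 bitplanes. The function name and the tuple layout are hypothetical.

```python
def snr_scalable_order(scales=("CS", "MS", "FS"), bitplanes=(1, 3, 4)):
    """List the transmission units for SNR scalable coding, in order.

    Each unit is either (scale, "shape") or
    (scale, "bitplanes", first_bitplane, last_bitplane).
    """
    if len(scales) != len(bitplanes):
        raise ValueError("one bitplane count is needed per scale")
    if any(b2 <= b1 for b1, b2 in zip(bitplanes, bitplanes[1:])) or bitplanes[0] < 1:
        raise ValueError("bitplane counts must be positive and increasing")

    order = []
    previous_b = 0
    for level, (scale, b) in enumerate(zip(scales, bitplanes)):
        # Refine the scales sent earlier with their next intensity bitplanes.
        for earlier in scales[:level]:
            order.append((earlier, "bitplanes", previous_b + 1, b))
        # Then send the new scale: shape first, then its first b bitplanes.
        order.append((scale, "shape"))
        order.append((scale, "bitplanes", 1, b))
        previous_b = b
    return order


# Settings used for Fig. 5: b1 = 1, b2 = 3, b3 = 4.
units = snr_scalable_order()
```

Truncating the list after the units of the first, second, or third scale corresponds to the three quality levels (CS, b1), (CS+MS, b2), and (CS+MS+FS, b3).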

Fig. 4. Data arrangement for SNR scalable transmission.

Fig. 5. SNR scalable transmission of the image “peppers”. (a) (CS, b1). (b) (CS+MS, b2). (c) (CS+MS+FS, b3).

4.2. Spatial Scalability

We use the affine-invariance property of B-spline curves to achieve spatial scalability. Given two resolutions, R1 (lower) and R2 (higher), we extract the verge curves in R1 and R2 separately and approximate them with B-spline control points. The control points obtained in R1 are stored in layer R1. We then scale up the shape component of the control points in R1 by a factor of R2/R1, reconstruct the scaled curves, and match them to the corresponding curves in R2. The ratio R2/R1 may be any rational factor. After curve mapping, the control points of the unmapped curves in R2 are stored in layer R2. Fig. 6 shows the procedure for obtaining the highest resolution R3 from the three-layer image hierarchy at the decoder side. Experimental results for the proposed spatially scalable method are shown in Fig. 7(a)-(c).

Fig. 6. Decoding process for the proposed spatial scalability scheme.

Fig. 7. Spatial scalable transmission for the image “peppers”. (a) 256×256. (b) 370×370. (c) 512×512.

4.3. Shape Scalability

Shape scalability helps users retrieve the images they want efficiently. For example, users may be interested in images that contain a specific number of objects. Given a B-spline curve with MB control points, C1, C2, …, CMB, shape scalability is achieved by downsampling these control points. We construct a three-layer shape hierarchy, S1, S2, and S3. The head and tail control points must be recorded in S1. Each remaining control point with index j is classified according to the rule

$$C_j \in \begin{cases} S_1, & \text{if } j \bmod 4 = 1, \\ S_2, & \text{else if } j \bmod 4 = 3, \\ S_3, & \text{otherwise.} \end{cases} \qquad (4)$$

Experimental results for the proposed shape scalability scheme are shown in Fig. 8(a)-(c).

Fig. 8. Examples of shape scalability. (a) S1. (b) S1+S2. (c) S1+S2+S3.
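The classification rule of Eq. (4) for shape scalability can be written directly in code. The Python sketch below assigns the 1-based control-point indices of one curve to the layers S1, S2, and S3; the function name and the dictionary layout are hypothetical.

```python
def shape_layers(num_control_points):
    """Assign control-point indices 1..MB to the shape layers of Eq. (4).

    The head (index 1) and tail (index MB) always go to S1.
    """
    layers = {"S1": [], "S2": [], "S3": []}
    for j in range(1, num_control_points + 1):
        if j == 1 or j == num_control_points or j % 4 == 1:
            layers["S1"].append(j)
        elif j % 4 == 3:
            layers["S2"].append(j)
        else:
            layers["S3"].append(j)
    return layers


# Example: a curve with MB = 10 control points.
layers = shape_layers(10)
```

For ten control points, S1 holds indices 1, 5, 9, and 10, S2 holds 3 and 7, and S3 holds 2, 4, 6, and 8, so each added layer roughly doubles the number of control points available to the decoder.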

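The affine-invariance property that the spatial scalability scheme of Section 4.2 relies on can be checked numerically. The Python sketch below assumes uniform cubic B-splines, which the paper does not state, and shows that scaling the control points by R2/R1 and then evaluating the curve gives the same points as evaluating the curve and then scaling it. All names and control-point values are hypothetical, and the curve-matching step is not covered.

```python
import math


def cubic_bspline_point(p0, p1, p2, p3, t):
    """Point on one uniform cubic B-spline segment, for t in [0, 1]."""
    b0 = (1 - t) ** 3 / 6.0
    b1 = (3 * t ** 3 - 6 * t ** 2 + 4) / 6.0
    b2 = (-3 * t ** 3 + 3 * t ** 2 + 3 * t + 1) / 6.0
    b3 = t ** 3 / 6.0
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


def evaluate_curve(control_points, samples_per_segment=8):
    """Sample every segment of the curve defined by the control points."""
    points = []
    for i in range(len(control_points) - 3):
        segment = control_points[i:i + 4]
        for s in range(samples_per_segment):
            points.append(cubic_bspline_point(*segment, s / samples_per_segment))
    return points


def scale_points(points, factor):
    """Scale the shape component (x, y) of each point by the given factor."""
    return [(factor * x, factor * y) for x, y in points]


# Hypothetical control points of one curve in the lower resolution R1.
control_points_r1 = [(0.0, 0.0), (1.0, 2.0), (3.0, 3.0), (4.0, 0.0), (6.0, 1.0)]
factor = 370 / 256  # R2 / R1, a non-integer ratio

scaled_then_evaluated = evaluate_curve(scale_points(control_points_r1, factor))
evaluated_then_scaled = scale_points(evaluate_curve(control_points_r1), factor)
max_gap = max(
    math.hypot(ax - bx, ay - by)
    for (ax, ay), (bx, by) in zip(scaled_then_evaluated, evaluated_then_scaled)
)
```

Because the two results agree up to floating-point rounding, only the control points of layer R1 need to be scaled at the decoder; the curve itself never has to be resampled.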

5. Interactivity

In the proposed method, it is convenient to select and transmit a “region of interest,” since the image is coded on the basis of its features. Instead of selecting specific regions, we select features of interest, which are approximated by B-spline control points, and reconstruct the region information by interpolation. To perform interactive selection and transmission, we divide the B-spline curve space Ω_b into two categories,

$$\Omega_b = \Omega_b^{\text{Normal Mode}} + \Omega_b^{\text{Feature of Interest}}. \qquad (5)$$

With this decomposition, different effects can be produced by manipulating the curves in $\Omega_b^{\text{Feature of Interest}}$. For example, to change the priority of the selected features, we can use

$$\Omega_b' = \Omega_b^{\text{Normal Mode}} + \text{BitShift}(\Omega_b^{\text{Feature of Interest}}, BN). \qquad (6)$$

The priority of a feature is increased with a positive BN and decreased with a negative BN. Fig. 9(a) shows a decoded image with BN=0. In Fig. 9(b), the selected features are indicated in magenta. Fig. 9(c) shows the image decoded using Fig. 9(b), where the parameter BN is set to 6. In addition, the resolution of the selected features can be changed by

$$\Omega_b' = \Omega_b^{\text{Normal Mode}} + M(\Omega_b^{\text{Feature of Interest}}, SF), \qquad (7)$$

where M(.) represents the shape scaling function, which is similar in concept to the spatial scalability. The parameter SF indicates the scaling factor. Fig. 9(d) shows the scaled B-spline curves (indicated in yellow), and Fig. 9(e) shows the image decoded using Fig. 9(d). In this example, SF is set to 1.4. The scaling and bitplane-shifting procedures can also be applied at the decoder side, which further increases the interactivity of the proposed coding scheme.

Fig. 9. Examples of feature of interest selection and transmission. (a) Original decoded image. (b) Features selected for bitplane shifting. (c) Decoded image with bitplane shifting. (d) Features selected for resolution shifting. (e) Decoded image using (d).

6. Conclusions

In this paper, we propose a feature-based method for scalable and interactive image compression. We extract and link the high-curvature points on the image surface to form three-dimensional verge curves, which are further approximated by B-splines. The features in an image can be recovered at the decoder side by transmitting the B-spline control points. We construct a hierarchical data structure to encode these control points. With this data structure, the visual performance of the proposed method is comparable to that of JPEG. The control points can be rearranged to support several kinds of scalability, such as SNR scalability, spatial scalability, and shape scalability. The proposed method is also suitable for “region of interest” selection and transmission, since the image is coded on the basis of its features. Experimental results are provided to demonstrate the scalability and interactivity of the proposed method.

7. References

[1] H.H. Hsu, T.K. Shih, L.H. Lin, and R.C. Chang, “Adaptive Image Transmission by Strategic Decomposition,” in Proceedings of the 18th International Conference on Advanced Information Networking and Applications, Japan, Mar. 2004.
[2] O. Egger and W. Li, “Subband Coding of Images Using Asymmetrical Filter Banks,” IEEE Trans. Image Processing, Apr. 1995, pp. 478-485.
[3] P.J.L. van Beek, Edge-Based Image Representation and Coding, Ph.D. thesis, Delft University of Technology, 1995.
[4] A. Kaup and T. Aach, “Coding of Segmented Images Using Shape-Independent Basis Functions,” IEEE Trans. Image Processing, July 1998, pp. 937-947.
[5] S.J. Wang, L.C. Kuo, H.H. Jong, and Z.H. Wu, “Representing Images Using Verge Points on Image Surfaces,” to appear in IEEE Trans. Image Processing.
[6] F.W. Meier, G.M. Schuster, and A.K. Katsaggelos, “A Mathematical Model for Shape Coding with B-splines,” Signal Processing: Image Communications, May 2000, pp. 685-701.
