IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 18, NO. 6, JUNE 2009
Hierarchical Segmentation-Based Image Coding Using Hybrid Quad-Binary Trees
Ashraf A. Kassim, Member, IEEE, Wei Siong Lee, and Dornoosh Zonoobi
Abstract—A novel segmentation-based image approximation and coding technique is proposed. A hybrid quad-binary (QB) tree structure is utilized to efficiently model and code geometrical information within images. Compared to other tree-based representations such as wedgelets, the proposed QB-tree-based method is more efficient for a wide range of contour features such as junctions, corners, and ridges, especially at low bit rates.
Index Terms—Beamlets, binary space-partitioning, image coding, platelets, quadtrees, segmentation, wedgelets.
I. INTRODUCTION
The discovery of orientation-selective neural cells [1]–[3] in the mammalian visual system has substantiated the perceptual importance of edge or contour information [4], [5] in human vision. It is well known that image compression techniques based on cosine [6] and wavelet [7] transform coding produce visual artifacts [8], [9] at very low bit rates. These artifacts are generally observed along high-contrast regions such as object edges; they can severely distort structural information and thus impair visual cognitive tasks. In recent years, there has been a growing body of work on image coding techniques that attempt to preserve edge information through methods that can be broadly categorized into edge-adaptive decomposition [10]–[13], directional filtering [14]–[17], and segmentation-based approximation [18]–[20]. The last category is related to a particular approach in image coding that separates the edge/contour and texture information and codes them independently. This idea originated in the 1980s during the development of second-generation image coding techniques [21], [22], which attempt to imitate some of the functions of the human visual system in order to obtain high compression. Examples of these works are [23]–[26] and, more recently, [27]–[30]. An advantage of such an approach is that cross-distortion between edge and texture information can be avoided. At very low bit rates, images can be reconstructed from contour information alone at a relatively low coding cost. This results in artificial, graphic-like images that can still be both visually pleasing and cognitively meaningful.
Manuscript received December 26, 2007; revised February 06, 2009. First published April 24, 2009; current version published May 13, 2009. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Ercan E. Kuruoglu.
The authors are with the Electrical and Computer Engineering Department, National University of Singapore, Singapore.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIP.2009.2017339
In this paper, a novel segmentation-based lossy image coding technique is proposed that addresses the problem of extracting and coding the edge/contour information. The geometric structures within images are modeled and coded by a hybrid quad- and binary (QB) tree structure. Compared to other tree-based representations such as wedgelets, the proposed method is more efficient for a wide range of contour features such as junctions, corners, and ridges, especially at low bit rates.
The article is organized as follows. Section II provides an overview of several segmentation-based image approximation methods that utilize tree structures. In particular, the merits and drawbacks of each of these methods are highlighted in order to motivate the present work. Section III describes the proposed QB-tree structure and a procedure to model geometric information in images. In the next three sections, details of the QB-tree decomposition and approximation are discussed. In Section VII, we discuss the performance of our proposed technique in comparison with other related methods. We provide our concluding remarks in Section VIII.
II. SEGMENTATION-BASED IMAGE APPROXIMATION
Segmentation-based image approximation attempts to represent an image as a union of disjoint sets of regions. Each region is described by the geometry of the region boundaries (geometric information) and the image data within (texture information). Naturally, for simplicity and efficiency, tree structures are utilized to represent and organize this information describing the underlying image. Given a certain rate-distortion constraint, the tree is pruned to obtain a more compact image representation and thereby achieve compression. The two major approaches to segmentation-based image approximation are: 1) region-based and 2) curvilinear-based. The former relies on tree structures to describe the interior of the regions, and the latter produces tree representations that specify the boundaries of regions.
Several examples of these techniques are described below.
A. Region-Based Approximation
1) Region Quadtree Decomposition: Based on the principle of recursive spatial decomposition, the quadtree is a simple hierarchical data structure [31] that provides an efficient framework for computing data subsets. By virtue of its hierarchical structure, the quadtree lends itself to serve as an image approximation tool. In image quadtree decomposition, each partition is recursively subdivided into more homogeneous quadrants. Fig. 1(a) shows an example of image approximation using quadtree decomposition. Through a spatial aggregation of similar regions, the quadtree is able to efficiently describe large areas of homogeneous image regions. However, it is evident
1057-7149/$25.00 © 2009 IEEE
Fig. 2. Wedgelet approximation: Examples of excessive fine partitioning of simple piecewise linear structures. Left: ridges. Middle: corner. Right: junction.
Fig. 1. Examples of various tree-based image approximations. (a) Quadtree. (b) BSP. (c) Wedgelet. The bottom row demonstrates the effects of pruning the corresponding tree representation, leading to different kinds of visual distortion. (d) Quadtree, pruned. (e) BSP, pruned. (f) Wedgelet, pruned.
that the quadtree partitions do not necessarily correspond to maximal homogeneous regions in the image. The representation can be made more compact by merging adjacent partitions (at different scales) [32]–[34] with similar homogeneity. Note that this comes at the expense of more complex tree codes to describe the splitting and merging operations.
2) Binary Space Partitioning Tree: Regions represented by quadtree leaves are geometrically restricted to be square. Consequently, partition merging across scales is necessary for the region boundaries to coincide with image object boundaries. To overcome this limitation, Radha et al. [35] proposed using the binary space partitioning (BSP) tree for image representation. In this approach, the image plane is recursively bi-partitioned along arbitrarily oriented linear boundaries, constrained by a least-squares-error measure or a contour-matching criterion. The recursive partitioning process generates a BSP tree, where intermediate nodes are associated with the partitioning lines, and the leaf nodes represent convex polygonal image regions or cells [see Fig. 1(b)]. Similar to quadtree regions, the image data within each cell can be approximated by low-order polynomials. Pruning [36] a BSP tree merges all the polygonal cells belonging to the trimmed branch. This potentially destroys any edge structure that lies within the cells, which could interrupt the contour continuity of larger structures, as illustrated in Fig. 1(e). For low bit rate image coding, Rabiee et al. [37] proposed a method that uses the projection pursuit encoding technique with a dictionary of spline functions to encode the BSP tree. More recently, binary-tree-based partition methods have also been used for segmentation purposes [38], [39].
B. Curvilinear Data Approximation
1) Edge Quadtree and Beamlets: Analogous to storing image area information in the region quadtree decomposition, the edge quadtree of Shneier [40] stores the edge or contour information of the image. Each partition is obtained such that it contains at most a single linear contour passing through it. This decomposition leads to quadtrees in which long linear edges can be efficiently represented by large partitions. However, finer partitions are required for curvilinear edges and in the vicinity of edge junctions, corners, and ridges. To handle curves with fewer nodes than the edge quadtree, Martin [41] proposed the least-squares quadtree, which approximates a curve with straight lines in the least-squares sense. Donoho et al. [18] proposed the beamlet as a tool to analyze 2-D linear objects using a multiscale (dyadic) collection of line segments. In essence, the beamlet analysis is identical to edge quadtree decomposition. Neither the edge quadtree nor the beamlet decomposition is directly applicable to the general class of images, since they are only able to approximate with lines of constant thickness.
2) Wedgelets: Based on the beamlet analysis, the wedgelet theory [19] was introduced to approximate piecewise-constant images with smooth boundaries by computing an optimal linear bi-segmentation of each quadtree image partition, where each segment describes the mean of the corresponding image data within [see Fig. 1(c)]. Due to its reliance on the quadtree structure, a wedgelet representation shares the same problems and limitations as the previously discussed techniques. Fig. 2 illustrates the problem of excessive fine partitioning in wedgelets. The wedgelet analysis also fails to exploit geometrical structure dependencies between neighboring partitions, which can result in contour continuity problems. This can be resolved by using ideas from the split-merge schemes for improving quadtree representations. For instance, see [42], in which dyadic rectangular tilings (similar to a kd-tree structure) are used, and [36] by Shukla et al., where partitions are merged adaptively for joint coding in a manner similar to the split-join strategy for region quadtrees in Leonardi et al. [43].
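To make the wedgelet idea concrete, the following minimal sketch searches, by brute force, for the linear bi-segmentation of a small square block that minimizes the squared error when each side is replaced by its mean. It is an illustration under simplifying assumptions, not the estimator of [19]: candidate lines join pairs of pixel-corner points on the block border, and the function names (`best_wedgelet`, `split_by_line`, and so on) are hypothetical.

```python
from itertools import combinations


def boundary_points(n):
    """Pixel-corner points (x, y) on the border of an n x n block."""
    pts = set()
    for t in range(n + 1):
        pts.update({(0, t), (n, t), (t, 0), (t, n)})
    return sorted(pts)


def split_by_line(n, p, q):
    """Pixels (row, col) whose centers lie strictly on one side of the line p->q."""
    (x0, y0), (x1, y1) = p, q
    side = set()
    for r in range(n):
        for c in range(n):
            cx, cy = c + 0.5, r + 0.5  # pixel center in (x, y) coordinates
            if (x1 - x0) * (cy - y0) - (y1 - y0) * (cx - x0) > 0:
                side.add((r, c))
    return side


def mean_and_sse(values):
    """Mean of the values and the sum of squared deviations from it."""
    m = sum(values) / len(values)
    return m, sum((v - m) ** 2 for v in values)


def best_wedgelet(block):
    """Return (cut, mean_a, mean_b, error); cut is None if no line helps."""
    n = len(block)
    pixels = [(r, c) for r in range(n) for c in range(n)]
    mean, err = mean_and_sse([block[r][c] for r, c in pixels])
    best = (None, mean, mean, err)  # baseline: one constant for the whole block
    for p, q in combinations(boundary_points(n), 2):
        side = split_by_line(n, p, q)
        if not side or len(side) == len(pixels):
            continue  # the line does not separate the block into two parts
        m_a, e_a = mean_and_sse([block[r][c] for r, c in pixels if (r, c) in side])
        m_b, e_b = mean_and_sse([block[r][c] for r, c in pixels if (r, c) not in side])
        if e_a + e_b < best[3]:
            best = ((p, q), m_a, m_b, e_a + e_b)
    return best
```

On a 4 × 4 block with a vertical step edge, for example four rows of `[10, 10, 50, 50]`, the search returns a cut with zero error and the two means 10 and 50. A block containing a corner or a junction cannot be fitted this way with a single line, which is the situation illustrated in Fig. 2.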
Since image edges often have some degree of curvature, extended wedgelets [44] have been proposed as an improvement, splitting a partition along curves instead of a linear edge. Another improvement over the original wedgelets is to use higher-order polynomials, as in platelets [20], in order to better approximate the interior of each segment. Generally, the slight visual improvement from higher-order (i.e., greater than 2) approximation is not justifiable in view of the additional computation and coding cost.
3) Geometric Wavelets: This is a segmentation-based image coding method by Alani et al. [45] that uses the BSP tree to partition the image and geometric wavelets [46] to approximate the surface in each partition.
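As a reference point for the tree-based methods surveyed in this section, the region quadtree decomposition of Section II-A can be sketched as follows. The sketch assumes a square image with a power-of-two side, uses the sum of squared deviations from the block mean as the homogeneity measure, and splits any block whose measure exceeds a threshold; these choices and all names are illustrative assumptions.

```python
def block_stats(img, r0, c0, size):
    """Mean and sum of squared deviations of a size x size block."""
    vals = [img[r][c] for r in range(r0, r0 + size) for c in range(c0, c0 + size)]
    mean = sum(vals) / len(vals)
    return mean, sum((v - mean) ** 2 for v in vals)


def quadtree(img, r0=0, c0=0, size=None, max_sse=0.0):
    """Return the leaves (row, col, size, mean) of a region quadtree."""
    if size is None:
        size = len(img)
    mean, sse = block_stats(img, r0, c0, size)
    if sse <= max_sse or size == 1:
        return [(r0, c0, size, mean)]  # homogeneous enough: keep as one leaf
    half = size // 2
    leaves = []
    for dr in (0, half):  # recurse into the four quadrants
        for dc in (0, half):
            leaves += quadtree(img, r0 + dr, c0 + dc, half, max_sse)
    return leaves
```

Raising `max_sse` plays the role of pruning: more blocks are accepted as leaves, the tree becomes smaller, and the approximation becomes coarser and blockier, as in Fig. 1(d).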
Fig. 3. Example illustration of the different nodes of the QB-tree (right) and their representation/operation in the image space (left). (Legend: B: branch, TR: twig-root, TB: twig-branch, L: leaf.)
III. QUAD-BINARY (QB) TREE IMAGE REPRESENTATION
In this section, the hybrid quad-binary tree, or QB-tree, is proposed as a novel tree structure for image representation. In essence, QB-tree-based image decomposition is a cross between the BSP-tree and wedgelet techniques. Consider an image that has been recursively quad-partitioned such that each partition contains fewer than a certain threshold number of contours. Each partition that is not homogeneous is subdivided into cells by arbitrary hyperplane cuts such that each cell can be satisfactorily approximated by a 2-D polynomial in the least-squares sense. The result is an ensemble of piecewise-polynomial 2-D functions with piecewise-linear cell boundaries that, ideally, should give a good approximation of contour features in an image. This is similar in nature to the approaches that use split-merge schemes for improving quadtrees (e.g., [36] and [42]). However, the main difference is that these types of tree structures are based on dyadic cubes, and bisecting lines are only used at their leaves.
Fig. 3 illustrates the representations of the different QB-tree nodes in an image partition. The QB-tree has three internal node types (the branch node, the twig-root node, and the twig-branch node) and a terminating node, called the leaf node. The branch node denotes a quad-split of the corresponding rectangular partition. Its parent node is a branch node, and it has four children nodes. The twig-root node denotes a hyperplane cut of its corresponding rectangular partition into polygonal cells, while the twig-branch node denotes a hyperplane cut of its corresponding polygonal cell into subcells. Both of these nodes have two children nodes each. While the parent node of a twig-root node is a branch node, the parent node of a twig-branch node is either a twig-root or a twig-branch node. The leaf node is associated with a 2-D polynomial that spans its corresponding partition, cell, or subcell. The parent node of a leaf node is a branch, twig-root, or twig-branch node.
Additionally, we define the leaf-order of a QB-tree as the maximum number of cells per partition. Thus, a QB-tree of leaf-order 0 is equivalent to a quadtree representation, and a QB-tree of leaf-order 1 corresponds to a wedgelet quadtree. Therefore, the QB-tree can be viewed as a generalization of linear segmentation schemes based on quad-partitioning.
IV. QB-TREE DECOMPOSITION AND APPROXIMATION
Consider partitioning an image uniformly at a given scale as in (1), where the terms are defined by (2) and (3), and the indicator function of (4) selects the pixels belonging to a partition. Each partition is subdivided into polygonal cells, the number of which is the leaf-order of the QB-tree. The objective is to obtain a best least-squares estimate of each partition by polynomials spanning its cells (5), where each polynomial function is defined over the region of one cell, such that the error (6) is minimized. Substituting (5) into (1), a single-scale QB-tree image approximation is given by (7).
For a multiscale approximation of the image, an optimal set of partitions is determined such that it minimizes (8), which involves a certain cost measure function, with (9). In the following, we describe a bottom-up pruning approach to obtain this set. Given an initial QB-tree with a complete quad-subtree, i.e., a single-scale QB-tree decomposition, the decision to prune a particular subtree rooted at a given node is determined by the inequality (10), which is expressed in terms of the set of leaf nodes of a tree and the set of polynomials spanning each cell. A pruned subtree is replaced by a leaf node, and the polynomial of that leaf is substituted as the approximation of the corresponding image partition.
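The node rules of Section III and the bottom-up pruning described in this section can be sketched together as follows. The parent and child constraints are taken from the text; the pruning test, however, is an assumption: the sketch uses a Lagrangian cost (distortion plus a multiplier `lam` times rate) as the cost measure, which is a common choice for tree pruning and not necessarily the criterion of (8)–(10). The field names (`leaf_distortion`, `leaf_rate`, `split_rate`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Number of children required by each node type (Section III).
CHILDREN = {"branch": 4, "twig_root": 2, "twig_branch": 2, "leaf": 0}
# Node types allowed as the parent of each node type (Section III).
PARENTS = {
    "branch": {"branch"},
    "twig_root": {"branch"},
    "twig_branch": {"twig_root", "twig_branch"},
    "leaf": {"branch", "twig_root", "twig_branch"},
}


@dataclass
class Node:
    kind: str
    leaf_distortion: float   # squared error if this region were a single leaf
    leaf_rate: float         # assumed bits to code this region as a single leaf
    split_rate: float = 0.0  # assumed bits to code the quad-split or the cut
    children: List["Node"] = field(default_factory=list)


def is_valid(node: Node, parent_kind: Optional[str] = None) -> bool:
    """Check the child-count and parent-type rules; the root has no parent."""
    if len(node.children) != CHILDREN[node.kind]:
        return False
    if parent_kind is not None and parent_kind not in PARENTS[node.kind]:
        return False
    return all(is_valid(child, node.kind) for child in node.children)


def prune(node: Node, lam: float) -> float:
    """Prune bottom-up in place and return the cost of the resulting subtree."""
    leaf_cost = node.leaf_distortion + lam * node.leaf_rate
    if node.kind == "leaf":
        return leaf_cost
    # Children are pruned first, so their costs are already the best available.
    subtree_cost = lam * node.split_rate + sum(prune(c, lam) for c in node.children)
    if leaf_cost <= subtree_cost:
        # Replace the whole subtree by a single leaf spanning the same region.
        node.kind, node.children = "leaf", []
        return leaf_cost
    return subtree_cost
```

With a small `lam` the tree keeps the cuts that reduce distortion substantially and drops those that barely help; with a large `lam` even the root collapses to a single leaf. Because a leaf may have any internal node type as its parent, replacing a subtree by a leaf never violates the node rules.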
V. FAST QB-TREE DECOMPOSITION
The most computationally demanding step in QB-tree decomposition is the determination of the cell configuration that minimizes (6) for each partition. For an N × N partition, each of the possible ways¹ to bi-partition it requires its own computations to determine the squared error. For QB-tree decomposition, a direct computation of an optimal partition representation at a given leaf-order is not viable, since there is a very large number of possible cell configurations.
A. Cell Numbers and Partition Size
The number of cells permissible on a partition is limited by the partition dimension. For example, it is not practical to consider more than 2 cells when approximating a 2 × 2 partition. Therefore, one can permit fewer cells on finer partitions and more cells on coarser ones. In our experience, 2 to 4 cells for each partition of size ranging from 2 × 2 to 64 × 64 are sufficient to approximate a majority of the contour features.
B. Bounding Box for Cell Division
When a partition is subdivided into cells, the number of possible hyperplane cuts is restricted by the dimension of its bounding cell; i.e., as the subdivision continues, the number of possible ways to divide a cell is reduced because of the smaller discrete area. The number of admissible hyperplane orientations on a cell is therefore determined approximately by the size of the cell's bounding box. Since the bounding box is no larger than the partition that contains it, its dimension can be used to limit the search for the optimal cell division in the subsequent iterations.
C. Bottom-Up Estimation
Consider an image partition and its quad-subsets. The complexity of computing the approximation of the partition, relative to that of computing the approximations of its quad-subsets, is plotted in Fig. 4. It is found that the computation for the approximation of the four sub-partitions is less than that for the one partition containing them, especially for higher leaf-order QB-trees, thereby suggesting that the number of computations needed for coarser-scale approximation can be reduced by estimation from finer-scale approximations. This is feasible if there is some correlation between the approximations for coarser- and finer-scale partitions and many contour features persist across different scales, especially in the QB-tree representations.
Fig. 4. Plot of computation complexity ratio against partition size.
With the knowledge of the finer-scale approximations, it is possible to compute the coarser-scale approximation more efficiently. Given the approximations of the four quad-subsets of a partition, the approximation of the partition can be computed as follows.
• Take the parameters of the hyperplane cuts of each quad-subset.
• For each quad-subset, translate its cut parameters into the coordinates of the parent partition, according to whether the quad-subset lies in the 1st, 2nd, 3rd, or 4th quadrant.
• Form the set of polygonal cells obtained when the parent partition is partitioned by the translated cuts.
• Find the candidate that minimizes the approximation error.
• Obtain the final cut by performing a window search around this candidate.
The above procedures allow us to approximate a partition with a lower order of computation than a direct search.
¹Assuming linear boundaries modeled by Bresenham lines on the discrete N × N partition.
VI. CODING THE QB-TREE DECOMPOSITION
To code the QB-tree structure, each tree node requires bits to distinguish between the different node types. Since a twig-branch node can only occur after a twig-root node, it can share the same symbol with the branch or twig-root node. The information to be encoded with the twig-root and twig-branch nodes consists of the orientations and offsets of the hyperplane cuts, whose bit cost depends on the size of the partition in which the nodes reside. As discussed in Sections V-A and V-B, the range of these parameters for the hyperplanes depends upon the dimension of the bounding partition or cell. Thus, the hyperplane cuts can be coded with a number of bits determined by that dimension.
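The dependence of the coding cost of a cut on the size of its bounding box can be illustrated with a small counting experiment. The sketch below rasterizes Bresenham segments between all pairs of border pixels of a w × h box, counts the distinct segments as a rough proxy for the number of candidate cuts, and converts the count to a fixed-length bit budget. The proxy, the fixed-length code, and the function names are assumptions made for illustration; they are not the cut parameterization or the bit allocation used in the paper.

```python
import math


def bresenham(x0, y0, x1, y1):
    """Integer pixels on the Bresenham line from (x0, y0) to (x1, y1)."""
    points = []
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        points.append((x0, y0))
        if x0 == x1 and y0 == y1:
            return points
        e2 = 2 * err
        if e2 >= dy:  # step in x
            err += dy
            x0 += sx
        if e2 <= dx:  # step in y
            err += dx
            y0 += sy


def border_pixels(w, h):
    """Pixels on the border of a w x h bounding box."""
    return sorted({(x, y) for x in range(w) for y in range(h)
                   if x in (0, w - 1) or y in (0, h - 1)})


def count_candidate_lines(w, h):
    """Number of distinct Bresenham segments joining two border pixels."""
    border = border_pixels(w, h)
    lines = set()
    for i, a in enumerate(border):
        for b in border[i + 1:]:
            lines.add(frozenset(bresenham(*a, *b)))
    return len(lines)


def bits_for_cut(w, h):
    """Fixed-length bits needed to index one candidate segment (assumed code)."""
    return math.ceil(math.log2(max(count_candidate_lines(w, h), 1)))
```

For a 2 × 2 box there are six distinct segments, so three bits suffice under this assumed code, whereas larger boxes admit more segments and need more bits. This is the sense in which constraining a cut to a small bounding box makes it cheaper to code.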
Fig. 5. Original test images used in the experiments. (a) Cameraman1. (b) Cameraman2. (c) Bike, cropped. (d) Lena.
Each leaf node denotes a cell and is associated with a 2-D polynomial of a given order, whose coding cost in bits depends on the number of coefficients and on the precision (bits per coefficient) of the polynomial. In the experiments, the tree codes, hyperplane parameters, and polynomial coefficients are Huffman entropy-coded.
VII. SIMULATIONS AND DISCUSSION
We compared the coding performance of our QB-tree-based method against JPEG2000, BSP, and wedgelet representations. In these experiments, standard test images (some illustrated in Fig. 5) were used. Fig. 6 shows two images coded at 0.15 bpp and their PSNR measurements for the various methods. As expected, the JPEG2000 decoded images exhibit ringing artifacts along high-contrast edges. On the other hand, the BSP, wedgelet, and QB-tree decoded images show high-fidelity reconstruction of image edges. Compared with wedgelets, the QB-tree clearly demonstrates more contour details in its decoded images. A major difference between the images produced by the BSP and QB-tree methods is that the former is better at reconstructing large contours, while the QB-tree is capable of rendering fine-scale features because of its underlying quad-partitioning structure. A drawback of the BSP approach is that it is relatively costly to estimate and code the hyperplane cuts, since their initial bounding box is large, spanning the entire image. In the QB-tree, the initial hyperplane cut is already constrained to the dimension of its corresponding partition.
Fig. 6. Approximations at 0.15 bpp. Bike, 256 × 256: (a) JPEG2000, 16.60 dB. (b) BSP, 17.90 dB. (c) Wedgelet, 15.8 dB. (d) QB-tree leaf-order 3, 18.12 dB. Cameraman1, 256 × 256: (e) JPEG2000, 21.57 dB. (f) BSP, 21.21 dB. (g) Wedgelet, 20.8 dB. (h) QB-tree leaf-order 3, 21.42 dB.
Table I shows the coding performance of our algorithm versus that of the wedgelets method, using the results reported in [47]. Here, for a meaningful comparison, the results of our approach are presented without Huffman compression. It is evident that our approach leads to better PSNR performance using a smaller number of bits. Moreover, our algorithm performs noticeably better in coding images that contain strong features (e.g., House, Pentag). Tables II and III compare the rate-distortion (PSNR) performance of our QB-tree algorithm with that of the PP-BSP method [37] and the more recent geometric wavelets [45] and Prune-Join [36] based methods. From these results, it is evident that our QB-tree and the geometric wavelets based methods are superior at rates below 0.125 bpp based on the PSNR quantitative measure.
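For reference, the two quantities used throughout this comparison can be computed as in the following sketch. It assumes 8-bit grayscale images (peak value 255) and the usual definition of PSNR from the mean squared error; the paper does not spell out its exact definition, so this is an assumption, and the function names are illustrative.

```python
import math


def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    count = 0
    squared_error = 0.0
    for row_o, row_d in zip(original, decoded):
        for o, d in zip(row_o, row_d):
            squared_error += (o - d) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)


def bits_per_pixel(total_bits, width, height):
    """Bit rate in bits per pixel (bpp) for a coded image."""
    return total_bits / (width * height)
```

For instance, a 256 × 256 image coded with 8192 bits has a rate of 0.125 bpp, the rate below which the comparison above is made.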
TABLE I COMPARING THE PERFORMANCE OF THE QB-TREE WITH THE WEDGELETS METHOD ON 256 × 256 TEST IMAGES (WEDGELETS RESULTS OBTAINED FROM [47]), WITHOUT COMPRESSION
TABLE II COMPARING THE PERFORMANCE OF THE QB-TREE WITH PP-BSP, GEOMETRIC WAVELETS, AND THE PRUNE-JOIN QUADTREE METHODS ON LENA (512 × 512). ENTRIES ARE IN PSNR (THE RESULTS OF OTHER METHODS OBTAINED FROM [37], [45], AND [36], RESPECTIVELY)
TABLE III COMPARING THE PERFORMANCE OF THE QB-TREE WITH JPEG2000, GEOMETRIC WAVELETS, AND THE PRUNE-JOIN QUADTREE METHODS ON CAMERAMAN2 (256 × 256). ENTRIES ARE IN PSNR (THE RESULTS OF THE GEOMETRIC WAVELETS AND PRUNE-JOIN METHODS OBTAINED FROM [45] AND [36], RESPECTIVELY)
Fig. 7. Comparison of decoded images. Cameraman2, 256 × 256, coded at 0.125 bpp: (a) Original image. (b) GW, PSNR = 25.07. (c) QB-tree, PSNR = 30.3. Lena, 512 × 512, coded at 0.0625 bpp: (d) Original image. (e) GW, PSNR = 28.72. (f) QB-tree, PSNR = 29.51.
Fig. 7 compares scaled-up versions of the Cameraman2 and Lena images as decoded by geometric wavelets and by the proposed QB-tree. The geometric wavelets approach appears similar to the BSP approach with respect to contour approximation. However, in the former, the partitioned regions are smoother due to the higher-order approximation by wavelets, analogous to the use of high-order polynomials. The difference between these decoded images is similar to that between the corresponding operations of the BSP- and QB-trees, as discussed above. It is evident that our QB-tree method [Fig. 7(c) and (f)] does a better job of reconstructing fine details (such as the camera in Cameraman2 and the hat in Lena) than the geometric wavelets based method [Fig. 7(b) and (e)].
Further, the major differences between the geometric wavelets method and the QB-tree lie in algorithmic complexity and in the former's underlying BSP decomposition. The initial bounding box in the geometric wavelets method is quite large, spanning the entire image. Therefore, the geometric wavelets based method needs to partition the image into fixed-size tiles for separate decomposition. This operation may be undesirable for coding some images (see Table III). In comparison, the QB-tree has an underlying quadtree structure that provides automatic multiscale partitioning and facilitates the bottom-up estimation to reduce computations.
VIII. CONCLUSIONS
The QB-tree is a compromise between the rigidity of the discrete space structure of quadtrees, which allows spatial partitioning for local analysis, and the generality of the BSP tree, which facilitates the creation of more adaptive and accurate representations of image discontinuities. In this paper, we propose a novel image approximation technique using the QB-tree, which is a hybrid structure of the binary and quad-trees. The QB-tree image decomposition is able to:
• avoid excessive fine partitioning over complex linear features, e.g., junctions, corners, bars, and ridges, thereby obtaining more efficient single-scale representations of these features;
• improve visual representations by producing a more meaningful geometric description of images at coarser scales.
From the results presented in this paper, it is shown that our proposed method consistently outperforms other image approximation methods in subjective observations, especially for images that contain significant geometrical structures and at low bit rates.
ACKNOWLEDGMENT
The authors would like to thank Prof. Y. V. Venkatesh for his valuable comments and suggestions.
REFERENCES
[1] D. H. Hubel and T. N. Wiesel, “Receptive fields of single neurones in the cat’s striate cortex,” J. Physiol., vol. 148, pp. 574–591, 1959.
[2] J. A. Movshon, I. D. Thompson, and D. J. Tolhurst, “Spatial summation in the receptive fields of simple cells in the cat’s striate cortex,” J. Physiol. (London), vol. 283, pp. 53–77, 1978.
[3] J. A. Movshon, I. D. Thompson, and D. J. Tolhurst, “Receptive field organization of complex cells in the cat’s striate cortex,” J. Physiol. (London), vol. 283, pp. 79–99, 1978.
[4] D. Marr and S. Ullman, “Directional selectivity and its use in early visual processing,” Proc. Roy. Soc. Lond. B, vol. 208, pp. 151–180, 1981.
[5] A. G. Leventhal, Y. C. Wang, M. T. Schmolesky, and Y. Zhou, “Neural correlates of boundary perception,” Vis. Neurosci., vol. 15, no. 6, pp. 1107–1118, Nov. 1998.
[6] N. Ahmed, T. Natarajan, and K. R. Rao, “Discrete cosine transform,” IEEE Trans. Computer, vol. C-23, no. 1, pp. 90–93, Jan. 1974.
[7] S. Mallat, “Multiresolution approximations and wavelet orthonormal bases of L^2(R),” Trans. Amer. Math. Soc., vol. 315, no. 7, pp. 69–87, 1989.
[8] A. J. Jerri, The Gibbs Phenomenon in Fourier Analysis, Splines and Wavelet Approximations. Dordrecht, The Netherlands: Kluwer, 1998.
[9] M.-Y. Shen and C.-C. J. Kuo, “Review of postprocessing techniques for compression artifact removal,” J. Vis. Commun. Image Rep., vol. 9, no. 1, pp. 2–14, Mar. 1998.
[10] S. Mallat and S. Zhong, “Wavelet transform maxima and multiscale edges,” in Wavelets and Their Applications. Boston, MA: Jones and Bartlett, 1991, pp. 67–104.
[11] P. L. Dragotti and M. Vetterli, “Wavelets footprints: Theory, algorithms and applications,” IEEE Trans. Signal Process., vol. 51, no. 5, pp. 1306–1323, May 2003.
[12] T. F. Chan and H. M. Zhou, “ENO-wavelet transforms and some applications,” in Beyond Wavelets. New York: Academic, 2001, pp. 1–34.
[13] W. S. Lee and A. A. Kassim, “Signal and image approximation using interval wavelet transform,” IEEE Trans. Image Process., vol. 16, no. 1, pp. 46–56, Jan. 2006.
[14] E. J. Candès and D. L. Donoho, “Ridgelets: A key to higher dimensional intermittency?,” Philos. Trans. Roy. Soc. London Ser., vol. 357, no. 1760, pp. 2495–2509, Sep. 1999.
[15] M. Do and M. Vetterli, “The contourlet transform: An efficient directional multiresolution image representation,” IEEE Trans. Image Process., vol. 14, no. 12, pp. 2091–2106, Dec. 2005.
[16] E. J. Candès and D. L. Donoho, “Curvelets: A surprisingly effective nonadaptive representation for objects with edges,” in Curves and Surfaces, A. Cohen, C. Rabut, and L. L. Schumaker, Eds. Nashville, TN: Vanderbilt, 2000.
[17] E. L. Pennec and S. Mallat, “Sparse geometric image representations with bandelets,” IEEE Trans. Image Process., vol. 14, no. 4, pp. 423–438, Apr. 2005.
[18] X. Huo and D. L. Donoho, “Beamlet pyramids: A new form of multiresolution analysis,” in Proc. SPIE, Jul. 2000, vol. 4119, pp. 475–486.
[19] D. L. Donoho, “Wedgelets: Nearly-minimax estimation of edges,” Ann. Statist., vol. 27, pp. 859–897, 1999.
[20] R. Willett and R. Nowak, “Platelets: A multiscale approach for recovering edges and surfaces in photon-limited medical imaging,” IEEE Trans. Med. Imag., vol. 22, no. 3, pp. 332–350, Mar. 2003.
[21] M. Kunt, A. Ikonomopoulos, and M. Kocher, “Second generation image coding,” Proc. IEEE, vol. 73, no. 4, pp. 549–574, Apr. 1985.
[22] M. M. Reid, R. J. Millar, and N. D. Black, “Second-generation image coding: An overview,” ACM Comput. Surv., vol. 29, no. 1, pp. 3–29, Mar. 1997.
[23] M. Kocher and R. Leonardi, “Adaptive region growing technique using polynomial functions for image approximation,” Signal Process., vol. 11, no. 1, pp. 47–60, Jul. 1986.
[24] J. Jang and S. A. Rajala, “Texture segmentation-based image coder incorporating properties of the human visual system,” in Proc. IEEE ICASSP, 1991, pp. 2753–2756.
[25] F. C. Leou and Y. C. Chen, “A contour-based image coding technique with its texture information reconstructed by polyline representation,” Signal Process., vol. 25, no. 1, pp. 81–89, Oct. 1991.
[26] S. Marshall, “Application of image contours to three aspects of image processing: Compression, shape recognition and stereopsis,” IEE Proc. I, Commun., Speech Vis., vol. 139, no. 1, 1992.
[27] F. Meyer, A. Averbuch, and R. Coifman, “Multilayered image representation: Application to image compression,” IEEE Trans. Image Process., vol. 11, no. 9, pp. 1072–1080, Sep. 2002.
[28] N. Sprljan and E. Izquierdo, “New perspectives on image compression using a cartoon—Texture decomposition model,” in Proc. 4th EURASIP Conf. Video/Image Process. Multimedia Comm., Jul. 2003, vol. 1, pp. 359–368.
[29] C. E. Guo, S. C. Zhu, and Y. N. Wu, “Towards a mathematical theory of primal sketch and sketchability,” in Proc. IEEE Conf. Computer Vision, Oct. 1982, vol. 2, pp. 1228–1235.
[30] J.-L. Starck, M. Elad, and D. Donoho, “Image decomposition via the combination of sparse representations and a variational approach,” IEEE Trans. Image Process., vol. 14, no. 10, pp. 1570–1582, Oct. 2005.
[31] H. Samet, “The quadtree and related hierarchical data structures,” ACM Comput. Surv., vol. 16, no. 2, pp. 187–260, Feb. 1984.
[32] S. L. Horowitz and T. Pavlidis, “Picture segmentation by a tree traversal algorithm,” ACM Comput. Surv., vol. 23, no. 2, pp. 368–388, Feb. 1976.
[33] M. Kunt and R. L. M. Benard, “Recent results in high compression image coding,” IEEE Trans. Circuits Syst., vol. CAS-34, no. 11, pp. 1306–1336, Nov. 1987.
[34] O. Kwon and R. Chellappa, “Segmentation-based image compression,” Opt. Eng., vol. 32, no. 7, pp. 1581–1587, Jul. 1993.
[35] H. Radha, M. Vetterli, and R. Leonardi, “Image compression using binary space partitioning trees,” IEEE Trans. Image Process., vol. 5, no. 12, pp. 1610–1624, Dec. 1996.
[36] R. Shukla, P. L. Dragotti, M. N. Do, and M. Vetterli, “Rate-distortion optimized tree-structured compression algorithms for piecewise polynomial images,” IEEE Trans. Image Process., vol. 14, no. 3, pp. 343–359, Mar. 2005.
[37] H. R. Rabiee, R. L. Kashyap, and S. R. Safavian, “Multiresolution segmentation-based image coding with hierarchical data structures,” in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing, May 1996, pp. 1870–1873.
[38] L. Huihai, J. Woods, and M. Ghanbari, “Binary partition tree for semantic object extraction and image segmentation,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 3, pp. 378–383, Mar. 2007.
[39] S. Ghanbari, J. Woods, H. Rabiee, and S. Lucas, “Wavelet domain binary partition trees for semantic object extraction,” IEEE Trans. Image Process., vol. 43, no. 22, pp. 1189–1191, Oct. 2007.
[40] M. Shneier, “Two hierarchical linear feature representations: Edge pyramids and edge quadtrees,” Comput. Graph. Image Process., vol. 17, no. 3, pp. 211–224, Nov. 1981.
[41] J. J. Martin, “Organization of geographical data with quad trees and least square approximation,” in Proc. IEEE Conf. Pattern Recognition Image Process., Jun. 1982, pp. 458–463.
[42] Y. Huang, I. Pollak, M. N. Do, and C. A. Bouman, “Fast search for best representations in multitree dictionaries,” IEEE Trans. Image Process., vol. 15, no. 7, pp. 1779–1793, Jul. 2006.
[43] R. Leonardi and M. Kunt, “Adaptive split and merge for image analysis and coding,” in Proc. SPIE Int. Conf. Applications of Digital Image, 1985, vol. 15, pp. 1–9.
[44] A. Lisowska, “Extended wedgelets—Geometrical wavelets in efficient image coding,” Mach. Graph. Vis., vol. 13, no. 3, pp. 261–274, Mar. 2004.
[45] D. Alani, A. Averbuch, and S. Dekel, “Image coding with geometric wavelets,” IEEE Trans. Image Process., vol. 16, no. 1, pp. 69–77, Jan. 2007.
[46] S. Dekel and D. Leviatan, “Adaptive multivariate approximation using binary space partitions and geometric wavelets,” SIAM J. Numer. Anal., vol. 43, no. 2, pp. 707–732, 2005.
[47] A. Lisowska, “Second order wedgelets in image coding,” in Proc. EUROCON Inter. Conf. Computer as a Tool, Sep. 2007, pp. 237–244.
Ashraf A. Kassim (S’82–M’85) received the B.Eng. (with First Class Honors) and M.Eng. degrees in electrical engineering from the National University of Singapore (NUS) in 1985 and 1987, respectively, and the Ph.D. degree in electrical and computer engineering from Carnegie Mellon University, Pittsburgh, PA, in 1993. From 1986 to 1988, he worked on machine vision systems at Texas Instruments. Since 1993, he has been with the Electrical and Computer Engineering Department, NUS, where he is currently an Associate Professor and Vice Dean of the School of Engineering. His research interests include image analysis, machine vision, video/image processing, and compression.
Wei Siong Lee received the B.Eng. (with First Class Honors) and Ph.D. degrees in electrical and computer engineering from the National University of Singapore (NUS) in 2000 and 2006, respectively. His research interests include image and video processing.
Dornoosh Zonoobi received the B.Sc. degree with honors from Shiraz University, Iran, in 2005, and the M.Eng. degree in computer engineering from the National University of Singapore (NUS) in 2008, where she is currently pursuing the Ph.D. degree. Her research interests include medical image analysis, image coding, signal processing, and machine learning.