Octree-Based Progressive Geometry Encoder
Jingliang Peng and C.-C. Jay Kuo
Integrated Media Systems Center and Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089-2564, USA

ABSTRACT
Among progressive 3D mesh compression algorithms, the kd-tree-based algorithm proposed by Gandoin and Devillers [1] is one of the state-of-the-art algorithms. Based on the observation that this geometry coder incurs a large overhead at high kd-tree levels, we propose an octree-based geometry coder that requires fewer coding bits at high octree levels by applying selective cell subdivision at those levels, leading to better rate-distortion performance at low bit rates. Experimental results show that, compared with the kd-tree-based coder, the proposed 3D geometry coder performs better when the tree is expanded to level 8 or less, but slightly worse for the full tree expansion with 12-bit quantization.

Keywords: 3D mesh compression, progressive mesh coding, octree-based, geometry coder.
1. INTRODUCTION
Graphics data are increasingly used in applications such as video gaming, engineering design, architectural walkthrough, virtual reality, e-commerce and scientific visualization. At the same time, the huge amount of 3D mesh data requires efficient storage, processing and transmission. Therefore, it is essential to compress 3D mesh data efficiently.

For a 3D mesh model, there are typically three kinds of information to encode: connectivity, geometry and attributes. Connectivity data describe the adjacency relationship between vertices; geometry data define the positions of the vertices; and attribute data define surface normals, material reflectance, texture coordinates, etc. Most early work concentrated on connectivity compression, but in most cases geometry data dominate the compressed file size. Thus, a good geometry coder is essential for achieving higher compression efficiency.

Previous research on 3D mesh compression focused on single-rate compression techniques to save bandwidth between the CPU and the graphics card. In a single-rate 3D mesh compression algorithm, all the connectivity and geometry data are compressed and decompressed as a whole. The graphics card cannot render the original mesh until the entire bit stream has been received. Later, with the popularity of the Internet, progressive compression and transmission have been studied intensively. When progressively compressed and transmitted, a 3D mesh can be reconstructed continuously from coarse to fine levels of detail (LODs) by the decoder while the bit stream is being received. Moreover, progressive compression can enhance the interaction capability, since the transmission can be stopped whenever a user finds that the mesh being downloaded is not what he or she wants.

The kd-tree-based approach proposed by Gandoin and Devillers [1] is one of the state-of-the-art geometry coders. It encodes geometry data through an iterative space partitioning based on the kd-tree data structure. However, there is a significant amount of coding overhead at high kd-tree levels. In this work, we propose an octree-based geometry coder with better rate-distortion performance at low bitrates. This paper makes two main contributions. First, the proposed method requires a much lower coding cost at high tree levels than the kd-tree-based approach of Gandoin and Devillers [1]. Second, we present a rate control scheme based on selective cell expansion.

The rest of this paper is organized as follows. A brief review of previous work is given in Section 2. The original kd-tree-based approach is described in Section 3. The proposed octree-based algorithm is presented in detail in Section 4. Experimental results are shown in Section 5. Finally, concluding remarks are given in Section 6.
2. REVIEW OF PREVIOUS WORK
2.1. Single-Rate Mesh Compression
Most early work concentrated on single-rate mesh compression. Typically, the connectivity data and geometry data are encoded separately, and the coding order of geometry data is determined by the underlying connectivity coding scheme. Deering [2] first proposed compressing 3D meshes with the generalized triangular mesh approach, but its coding efficiency is low. Taubin and Rossignac [3] proposed a topological surgery approach to encode mesh connectivity. Its disadvantage is that it demands a large memory buffer because of its global random vertex access at the decoding stage. Later approaches encode the mesh connectivity through triangle conquest, including the cut-border machine [4], the edgebreaker [5], and the improved edgebreaker [6, 7], where each time a triangle is conquered, the corresponding opcode is output and entropy encoded. Another class of algorithms consists of valence-driven approaches, including [8, 9]. These conquer a mesh vertex by vertex and output and encode the vertex valences, exploiting the observation that the vertex valences concentrate around 6 in a typical mesh. The best connectivity coding schemes so far are valence-driven approaches, with a typical connectivity cost of less than 2 bits per vertex (bpv).

For geometry coding, most single-rate mesh compression algorithms adopt a quantization, prediction and residual coding procedure. In quantization, 8 to 16 bits per coordinate are typically used, and the residual is entropy coded. For the prediction scheme, linear predictive coding (LPC) has been widely used. Recently, vector quantization (VQ) approaches [10, 11] have also been proposed for geometry compression. The best achievable bitrate so far for single-rate geometry coding is between 6 and 10 bpv at a quantization resolution of 8 bits per coordinate.
2.2. Progressive Mesh Compression
Progressive mesh compression is closely related to mesh simplification. Typically, a 3D mesh is first simplified to a base mesh; the progressive mesh encoder then encodes the base mesh and the reverse of the simplification operations, from which the original mesh can be restored. Generally speaking, a progressive coder is less efficient than a single-rate coder in terms of coding gain, since it cannot exploit the correlation in mesh data as freely as the single-rate coder.

Hoppe [12] first introduced the concept of the progressive mesh (PM), where the input mesh is simplified through a sequence of edge collapse operations, and then the base mesh and the sequence of vertex splits (the reverse of edge collapses) are encoded. Later, several variants of progressive meshes were introduced, including progressive simplicial complex (PSC) [13], progressive forest split (PFS) [14], and compressed progressive mesh (CPM) [15]. These use generalized vertex splits or group vertex splits into batches, and they achieve greater coding gain and/or deal with meshes of more general topology type. Li and Kuo [16] introduced the concept of embedded coding to encode connectivity and geometry data in an interwoven manner. The geometry data as well as the connectivity data are encoded progressively, so a better rate-distortion performance is achieved. Alliez and Desbrun [17] proposed a valence-based approach. Observing that the entropy of mesh connectivity depends on the distribution of vertex valences, they iteratively applied a valence-driven decimating conquest and a cleaning conquest in pairs to obtain multi-resolution meshes. The vertex valences are output and entropy encoded during this process. Karni and Gotsman [18] used the spectral theory on meshes [19] to compress geometry data; theirs is a pure geometry coder that assumes the mesh connectivity is already known before geometry coding. Khodakovsky et al. [20] proposed a progressive compression algorithm based on the wavelet transform. It first remeshes an arbitrary manifold mesh into a semi-regular mesh. The base mesh and the wavelets corresponding to differences between adjacent LODs are coded with a modified SPIHT algorithm [21], but the original connectivity is lost.

Gandoin and Devillers [1] proposed a kd-tree-based approach which encodes geometry data through kd-tree-based cell subdivision and encodes connectivity data through generalized vertex splits. The kd-tree-based algorithm is one of the state-of-the-art progressive coders, and its geometry coder is described in detail in Section 3. To the best of our knowledge, the kd-tree-based approach [1] produces the best results among the progressive mesh coders that encode both connectivity and geometry such that the mesh can be restored losslessly at full resolution, in the sense that the quantization error of the geometry data is negligible. On average, its connectivity coding cost is around 3.5 bpv and its geometry coding cost is around 15.7 bpv at a quantization resolution of 12 bits per coordinate.
3. KD-TREE-BASED GEOMETRY CODER
In most mesh compression techniques, geometry coding is guided by the underlying connectivity coding. Gandoin and Devillers [1] proposed the opposite strategy, where the connectivity coding is guided by the geometry coding. Their algorithm works in two passes: the first pass encodes geometry data progressively; the second pass encodes the connectivity changes between each pair of successive LODs.
Figure 1. Illustration of the kd-tree-based geometry coding in the 2D case.
For geometry coding, their algorithm employs a kd-tree decomposition based on cell subdivisions [22]. Each time it subdivides a cell into two child cells, it encodes the number of vertices in one of the two child cells. If the parent cell contains $p$ vertices, the number of vertices in one of the child cells can be encoded using $\log_2(p + 1)$ bits with an arithmetic coder [23]. This subdivision is applied recursively until each non-empty cell is small enough to contain only one vertex and to enable a sufficiently precise reconstruction of the vertex position.

Fig. 1 illustrates the geometry coding process with a 2D example. First, the total number of vertices, 7, is encoded using a fixed number of bits (32 in this example). Then, the cell is divided vertically into two child cells, and the number of vertices in the left child cell, 4, is encoded using $\log_2(7 + 1)$ bits. Note that the number of vertices in the right child cell is not encoded, since it can be deduced from the number of vertices in the parent cell and the number of vertices in the left child cell. The left and right child cells are then each divided horizontally, the numbers of vertices in the upper grandchild cells are encoded, and so forth.

To further improve the coding efficiency, prediction is used. When a cell is to be split, the possible number of vertices in one of the two new child cells typically follows a discrete Gaussian law centered at some value predicted from the relative vertex densities in the parent cell's neighborhood. On average, this kind of prediction is reported to improve the coding efficiency by a further 5% or so.

For connectivity coding, their algorithm encodes the topology change after each cell subdivision using one of two operations: the vertex split [12] or the generalized vertex split [13]. The advantage is that the vertex to split is implied by the cell subdivision, which saves the cost of coding the vertex index.

On average, this scheme requires 3.5 bpv for connectivity coding and 15.7 bpv for geometry coding at 10 or 12-bit quantization resolution.
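The cost model described in this section can be illustrated with a minimal 2D sketch. It is not the implementation of [1]: the function name `kd_tree_bits`, the alternating split order, the assumption of distinct integer vertex positions, and the choice to subdivide every non-empty cell down to unit size are all assumptions made for illustration, and neither the prediction step nor the fixed 32-bit vertex count is modeled.

```python
import math

def kd_tree_bits(points, x0, y0, w, h, split_x=True):
    """Estimate the bits spent on vertex counts while splitting one 2D cell.

    points: distinct integer (x, y) pairs inside [x0, x0+w) x [y0, y0+h).
    Hypothetical helper; each split is charged log2(p + 1) bits.
    """
    p = len(points)
    # Empty cells and unit cells are not subdivided any further.
    if p == 0 or (w == 1 and h == 1):
        return 0.0
    # A side of length 1 cannot be halved, so split along the other axis.
    if w == 1:
        split_x = False
    elif h == 1:
        split_x = True
    bits = math.log2(p + 1)  # cost of coding the vertex count of one child
    if split_x:
        half = w // 2
        first = [q for q in points if q[0] < x0 + half]
        second = [q for q in points if q[0] >= x0 + half]
        return (bits
                + kd_tree_bits(first, x0, y0, half, h, False)
                + kd_tree_bits(second, x0 + half, y0, w - half, h, False))
    half = h // 2
    first = [q for q in points if q[1] < y0 + half]
    second = [q for q in points if q[1] >= y0 + half]
    return (bits
            + kd_tree_bits(first, x0, y0, w, half, True)
            + kd_tree_bits(second, x0, y0 + half, w, h - half, True))
```

Under these assumptions, a single vertex in a 4 x 4 grid costs one bit per split, i.e. 4 bits in total, while cells holding many vertices are charged more per split because $\log_2(p + 1)$ grows with $p$.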
4. OCTREE-BASED GEOMETRY CODER
4.1. Basic Approach
The original kd-tree-based geometry coder has a high overhead at higher levels, since each cell contains more vertices at higher levels than at lower levels. As a result, at low bitrates, the distortion will be high. In this work, we aim to reduce the coding cost at high tree levels. The method proposed here is a pure geometry coder. To encode the connectivity, the connectivity compression method in the kd-tree-based algorithm can be easily adapted to the proposed octree-based geometry coder.

Figure 2. Illustration of the octree-based geometry coding in the 2D case.

In the original kd-tree-based coder, each 3D cell is subdivided into eight child cells with three subdivisions: one along the XY plane, one along the YZ plane, and one along the ZX plane, and three vertex numbers are encoded. In our geometry coder, we subdivide each cell into eight at one time, forming an octree structure of the 3D vertices. Instead of coding the number of vertices in a child cell as in the kd-tree-based geometry coder, we use one bit for each child cell to indicate whether it is empty or not. Non-empty cells are recursively subdivided until the finest resolution is reached, whereas empty cells are not subdivided any further. The bits associated with all child cells after a cell subdivision form the bit pattern for that cell subdivision.

For instance, for the case given in Fig. 1, the cell subdivisions and the encoding process in our octree-based geometry coder are illustrated in Fig. 2, where one bit is associated with each child cell after each cell subdivision: 1 indicates a non-empty child cell and 0 indicates an empty child cell. For each cell subdivision, if we output the bits for all child cells in clockwise order starting from the top-left child cell, we get the output bitstream for all subdivisions in Fig. 2: 1011110011101010. At certain subdivision stages, some cells may contain only one vertex; those cells are still divided further until the finest resolution allowed by the quantization is reached. For the 2D cell subdivision example shown in Fig.
1, the original kd-tree-based approach uses about 20 bits, not counting the initial 32 bits (a fixed overhead that is amortized over later cell subdivisions), while the octree-based scheme uses only 16 bits to encode the subdivisions shown in Fig. 2.

As illustrated above, our algorithm achieves better coding efficiency at higher tree levels. However, at lower tree levels, where most cells contain only one vertex, the straightforward bit pattern approach can be problematic. For instance, as illustrated in Fig. 3(a), to subdivide a 2D cell containing only one vertex into four, the original kd-tree-based approach needs two subdivisions and the coding cost is $\log_2(1 + 1) + \log_2(1 + 1) = 2$ bits. However, our straightforward approach needs 4 bits, as shown in Fig. 3(b). In the 3D counterpart, to subdivide a 3D cell with only one vertex inside into eight child cells, the kd-tree-based approach needs $\log_2(1 + 1) + \log_2(1 + 1) + \log_2(1 + 1) = 3$ bits, while the straightforward octree-based approach needs 8 bits.

Figure 3. Subdivision of a 2D cell containing 1 point with (a) the kd-tree-based approach, and (b) the straightforward octree-based approach.

To overcome this problem, we use a 1-bit flag for each cell subdivision to indicate whether it has only one non-empty child cell. If it does, the 3-bit index of that non-empty child cell is encoded. Otherwise, the 8-bit bit pattern is encoded. Furthermore, to reduce the entropy of bit patterns, for each cell subdivision with more than one non-empty child cell, we use a 1-bit bias flag to indicate whether there are more 1's or more 0's in its bit pattern. If there are more 0's, the bias flag is set to 0 and the original bit pattern is encoded; otherwise, the bias flag is set to 1 and the bitwise complement of the original bit pattern is encoded.

In summary, given a 3D mesh, we first calculate its bounding box, quantize the vertex positions, and then build an octree structure through recursive 3D cell subdivision. After that, the octree-based geometry coder traverses the octree from the top down to the lowest level and, for each cell subdivision, generates and encodes four kinds of data:
• the 1-bit flag to indicate whether the subdivision contains only one non-empty child cell (which is QM coded);
• the 1-bit bias flag if the subdivision has more than one non-empty child cell (which is QM coded);
• the bit pattern or its bitwise complement if the subdivision has more than one non-empty child cell (which is Huffman coded);
• the 3-bit non-empty child cell index if the subdivision has only one non-empty child cell.
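The four kinds of data listed above can be sketched for a single cell subdivision as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `child_index`, `subdivision_symbols` and `raw_bits` are hypothetical, the ordering of the eight child cells is an arbitrary choice (the text does not specify one for 3D), and the QM and Huffman entropy coding stages are omitted, so only the raw symbols are produced.

```python
def child_index(point, center):
    """3-bit index of the child cell containing `point` (assumed ordering)."""
    x, y, z = point
    cx, cy, cz = center
    return (int(x >= cx) << 2) | (int(y >= cy) << 1) | int(z >= cz)

def subdivision_symbols(points, origin, size):
    """Raw symbols for subdividing one non-empty cubic cell into eight.

    points: non-empty list of integer (x, y, z) vertices inside the cell.
    origin: minimum corner of the cell; size: its edge length (even).
    """
    half = size // 2
    center = tuple(o + half for o in origin)
    pattern = 0
    for p in points:
        pattern |= 1 << child_index(p, center)  # bit i set: child i non-empty
    ones = bin(pattern).count("1")
    if ones == 1:
        # Only one non-empty child: send the flag and the 3-bit child index.
        return {"single": 1, "index": pattern.bit_length() - 1}
    # More 0's than 1's: bias 0, pattern sent as is; otherwise complement it.
    bias = 0 if ones < 4 else 1
    coded = pattern ^ 0xFF if bias else pattern
    return {"single": 0, "bias": bias, "pattern": coded}

def raw_bits(symbols):
    """Raw bit count before entropy coding: 1+3, or 1+1+8 (an assumption)."""
    return 4 if symbols["single"] else 10
```

With this layout, a cell holding a single vertex costs 4 raw bits per subdivision instead of the 8 bits of the straightforward bit pattern, which is the case the 1-bit flag is meant to address.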
4.2. Rate Control
Note that, at each iteration of the basic algorithm, we subdivide all cells at the same tree level at the same time. Since the subdivisions of different cells at the same tree level typically lead to different decreases in distortion, a possible improvement is to subdivide first those cells whose subdivisions lead to the biggest distortion decrease. In other words, with a given bit budget, we try to optimize the order of cell subdivisions to approximate the original mesh as well as possible. At each tree front, to signify which cells to subdivide, we associate a 1-bit flag with each cell in the tree front to indicate whether to subdivide it in the current iteration. To select the subset of cells to subdivide, we do the following.
1. For the current tree front, we calculate the distortion decrease $D$, the coding bits $B$, and their ratio $R = D/B$.
2. For each cell $i$ in the current tree front, we calculate its distortion decrease $D_i$, its coding bits $B_i$, and their ratio $R_i = D_i/B_i$, and sort the cells in decreasing order.
3. We traverse the sorted cell list and accumulate the distortion decrease $SD_j = \sum_{k=1}^{j} D_k$, the coding bits $SB_j = \sum_{k=1}^{j} B_k$, and their ratio $SR_j = SD_j/SB_j$ for each cell $j$ in the list. We also calculate two ratios, $DD_j = D_j/D$ and $RR_j = R_j/R$, and their product $PDR_j = DD_j \cdot RR_j$.
4. We choose the index $m$ such that $PDR_m \ge PDR_j$ ($j = 1, 2, \ldots, n$), $DD_m > DD_t$ and $RR_m > RR_t$, where $n$ is the number of cells in the current tree front, and $DD_t$ and $RR_t$ are two user-definable threshold values.

The distortion of a vertex is measured by its distance to the center of the cell it belongs to. The distortion of a cell is the sum of the distortions of all vertices it contains. The distortion of a tree front is the sum of the distortions of all cells it contains. The coding bits are approximately counted as the number of raw bits before entropy coding.

At high tree levels, the rate control measure described above helps. However, at low tree levels, there is a huge number of cells, so the flags signifying the cells for subdivision become a large overhead. At the same time, the distortion decrease is small. Therefore, if we fail to select the cell subset according to the method described above at a certain level, we simply expand the nodes at the highest levels first, and no signifying flags are output from that point on.
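The distortion measure and the sorting and accumulation in steps 2 and 3 can be sketched as follows. This is only an illustration under assumptions: the function names are hypothetical, the per-cell values $D_i$ and $B_i$ are taken as given inputs, the cells are assumed to be sorted by the ratio $R_i$, and the threshold-based selection of step 4 is not covered.

```python
import math
from itertools import accumulate

def cell_distortion(points, origin, size):
    """Sum of the distances from each vertex to the center of its cell."""
    center = [o + size / 2 for o in origin]
    return sum(math.dist(p, center) for p in points)

def rank_cells(gains):
    """Sort cells by R_i = D_i / B_i and accumulate SD_j, SB_j and SR_j.

    gains: list of (D_i, B_i) pairs, one per cell of the tree front, B_i > 0.
    Returns tuples (cell index, SD_j, SB_j, SR_j) in sorted order.
    """
    # Assumed sort key: the ratio of distortion decrease to coding bits.
    order = sorted(range(len(gains)),
                   key=lambda i: gains[i][0] / gains[i][1],
                   reverse=True)
    sd = accumulate(gains[i][0] for i in order)  # running sum of D_k
    sb = accumulate(gains[i][1] for i in order)  # running sum of B_k
    return [(i, d, b, d / b) for i, d, b in zip(order, sd, sb)]
```

The distortion decrease $D_i$ of a cell could then be obtained as its `cell_distortion` before subdivision minus the sum over its eight child cells, although the text does not spell out this computation.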
5. EXPERIMENTAL RESULTS
The objects used in our experiments are shown in Fig. 4, where the vertex numbers are given in parentheses in the caption. 12-bit quantization is used for each vertex coordinate in the experiments. Experimental results are shown in Table 1, where the first column shows the mesh name, the second column shows the number of vertices in each mesh, and the remaining columns show the bitrates at different distortions (D) achieved by the kd-tree-based geometry coder (KDT) and the octree-based geometry coder (OCT). In the table, the distortion D is measured as the average distance between a restored vertex and its original quantized position. Roughly speaking, D = 125.0 corresponds to 4 levels of tree expansion, D = 61.5 corresponds to 6 levels of tree expansion, and D = 30.7 corresponds to 8 levels of tree expansion in the kd-tree-based geometry coder. Since we do not have the executable for the kd-tree-based geometry coder, we estimated its bitrates at different distortions from the algorithm description in the paper [22]. The calculation also takes into account the claimed 5% improvement from prediction.

Figure 4. (a) horse (#v=19,851), (b) blob (#v=8,036), (c) feline (#v=49,864), (d) truck (#v=11,738), (e) dinosaur (#v=14,070).

Table 1. The rate-distortion comparison of the octree-based geometry coder (OCT) and the kd-tree-based geometry coder (KDT), where bitrates are given in bpv.

Mesh       #v        D = 125.0      D = 61.5       D = 30.7
                     KDT    OCT     KDT    OCT     KDT    OCT
horse      19,851    0.21   0.05    1.62   0.62    6.20   5.91
blob        8,036    0.70   0.21    4.08   3.34    9.70   10.06
feline     49,864    0.12   0.02    1.10   0.45    5.32   4.55
truck      11,738    0.20   0.06    1.50   0.83    5.36   4.84
dinosaur   14,070    0.20   0.05    1.58   0.74    5.99   5.92

From Table 1, we see that, for all the test meshes except the blob, our algorithm is better than the kd-tree-based algorithm for distortions down to 30.7, roughly corresponding to 8 levels of octree expansion or, equivalently, 8-bit quantization of the original coordinates. The restored horse mesh at different distortion levels is shown in Fig. 5. Note that the connectivity in the figure comes directly from the original mesh, since we do not compress the connectivity in our geometry coder. In Fig. 5(b), with a distortion of 61.5 and a bitrate of 0.62 bpv, the shape is already clearly recognizable. In Fig. 5(c), with a distortion of 30.7 and a bitrate of 5.91 bpv, we get a mesh of good quality.

Figure 5. Restoration of the horse mesh at distortion/bitrate (bpv) of (a) 125.0/0.05, (b) 61.5/0.62, and (c) 30.7/5.91.

We also performed the full tree expansion to the 12th level on the horse and blob meshes. These test cases were also reported in [1]. The experimental results are shown in Table 2, from which we see that our approach is worse when the octree is fully expanded.

Table 2. Experimental results when the tree is fully expanded. Again, 12-bit quantization is used and the bitrates are reported in bpv.

Mesh       #v        KDT     OCT
horse      19,851    16.4    18.17
blob        8,036    20.1    22.01
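The distortion measure used in the tables, the average distance between a restored vertex and its original quantized position, can be sketched as below. The names are hypothetical, and the restoration rule (placing each vertex at the center of its cell at a given octree level) is an assumption borrowed from the per-vertex distortion definition in Section 4.2 rather than something stated for the experiments.

```python
import math

def restore_at_level(vertex, level, bits=12):
    """Restore a quantized vertex as the center of its cell at `level`.

    vertex: integer coordinates in [0, 2**bits); each level is assumed to
    halve the cell along all three axes.
    """
    if level >= bits:
        # Fully expanded tree: cells are unit-sized, the vertex is exact.
        return tuple(float(c) for c in vertex)
    size = 1 << (bits - level)  # cell edge length at this level
    return tuple((c // size) * size + size / 2 for c in vertex)

def average_distortion(restored, original):
    """Mean Euclidean distance between restored and original vertices."""
    total = sum(math.dist(r, o) for r, o in zip(restored, original))
    return total / len(original)
```

For example, `average_distortion([restore_at_level(v, 8) for v in verts], verts)` would give the distortion of a hypothetical vertex list `verts` after 8 levels of expansion under these assumptions.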
6. CONCLUSION AND FUTURE WORK
In this work, an octree-based 3D geometry coder was presented that achieves better rate-distortion performance at low bit rates than the kd-tree-based geometry coder. Its performance is slightly worse than that of the kd-tree-based geometry coder at higher bit rates. In the future, we will use a context-based entropy coder to encode the bit patterns; the neighborhood configuration of each cell subdivision may be used as the context for the corresponding bit pattern. Another direction is to apply prediction schemes. Since low-level cell subdivisions dominate the coding cost, even a slight improvement from prediction will significantly improve the coding performance for high-resolution mesh compression. Besides geometry coding, we will also work on connectivity coding, aiming at a mesh compression scheme that compresses both the quantized geometry and the mesh connectivity losslessly. For the connectivity coding, we will use incremental and/or constrained Delaunay triangulation techniques to predict the connectivity from geometry data.
REFERENCES
1. P. M. Gandoin and O. Devillers, “Progressive lossless compression of arbitrary simplicial complexes,” ACM Trans. Graphics 21(3), pp. 372–379, 2002.
2. M. Deering, “Geometry compression,” in ACM SIGGRAPH, pp. 13–20, 1995.
3. G. Taubin and J. Rossignac, “Geometric compression through topological surgery,” ACM Trans. Graphics 17(2), pp. 84–115, 1998.
4. S. Gumhold and W. Straßer, “Real time compression of triangle mesh connectivity,” in ACM SIGGRAPH, pp. 133–140, 1998.
5. J. Rossignac, “Edgebreaker: Connectivity compression for triangle meshes,” IEEE Trans. Visualization and Computer Graphics 5(1), pp. 47–61, 1999.
6. D. King and J. Rossignac, “Guaranteed 3.67v bit encoding of planar triangle graphs,” in 11th Canadian Conference on Computational Geometry, pp. 146–149, 1999.
7. S. Gumhold, “New bounds on the encoding of planar triangulations,” Technical Report WSI–2000–1, Wilhelm-Schickard-Institut für Informatik, University of Tübingen, Germany, Jan 2000.
8. C. Touma and C. Gotsman, “Triangle mesh compression,” in Proceedings of Graphics Interface, pp. 26–34, 1998.
9. P. Alliez and M. Desbrun, “Valence-driven connectivity encoding for 3D meshes,” in EUROGRAPHICS, pp. 480–489, 2001.
10. E.-S. Lee and H.-S. Ko, “Vertex data compression for triangular meshes,” in Pacific Graphics, p. 225, 2000.
11. P. H. Chou and T. H. Meng, “Vertex data compression through vector quantization,” IEEE Trans. Visualization and Computer Graphics 8(4), pp. 373–382, 2002.
12. H. Hoppe, “Progressive meshes,” in ACM SIGGRAPH, pp. 99–108, 1996.
13. J. Popovic and H. Hoppe, “Progressive simplicial complexes,” in ACM SIGGRAPH, pp. 217–224, 1997.
14. G. Taubin, A. Gueziec, W. Horn, and F. Lazarus, “Progressive forest split compression,” in ACM SIGGRAPH, 32, pp. 123–132, 1998.
15. R. Pajarola and J. Rossignac, “Compressed progressive meshes,” IEEE Trans. Visualization and Computer Graphics 6(1), pp. 79–93, 2000.
16. J. Li and C.-C. J. Kuo, “Progressive coding of 3-D graphic models,” Proceedings of the IEEE 86, pp. 1052–1063, Jun 1998.
17. P. Alliez and M. Desbrun, “Progressive encoding for lossless transmission of triangle meshes,” in ACM SIGGRAPH, pp. 198–205, 2001.
18. Z. Karni and C. Gotsman, “Spectral compression of mesh geometry,” in ACM SIGGRAPH, pp. 279–286, 2000.
19. G. Taubin, “A signal processing approach to fair surface design,” in ACM SIGGRAPH, pp. 351–358, 1995.
20. A. Khodakovsky, P. Schröder, and W. Sweldens, “Progressive geometry compression,” in ACM SIGGRAPH, pp. 271–278, 2000.
21. A. Said and W. A. Pearlman, “A new, fast, and efficient image codec based on set partitioning in hierarchical trees,” IEEE Trans. Circuits and Systems for Video Technology 6, pp. 243–250, Jun 1996.
22. O. Devillers and P. Gandoin, “Geometric compression for interactive transmission,” in IEEE Visualization, pp. 319–326, 2000.
23. I. H. Witten, R. M. Neal, and J. G. Cleary, “Arithmetic coding for data compression,” Communications of the ACM 30(6), pp. 520–540, 1987.