Fractal Volume Compression

Wayne O. Cochran, John C. Hart, Patrick J. Flynn
School of Electrical Engineering and Computer Science
Washington State University, Pullman, WA 99164-2752
{wcochran, hart, [email protected]

October 18, 1995

Abstract

This research is the first application of fractal compression to volumetric data. The various components of the fractal image compression method extend simply and directly to the volumetric case. However, the additional dimension increases the already high time complexity of the fractal technique, requiring the application of sophisticated volumetric block classification and search schemes to operate at a feasible rate. Numerous experiments over the many parameters of fractal volume compression show it to perform competitively against other volume compression methods, surpassing vector quantization and approaching the discrete cosine transform.

Keywords: Data compression, fractal, iterated function system, volume visualization.

1 Introduction

The problem of managing extremely large data sets often arises in applications employing volumetric data. This has prompted research in new techniques for economizing both storage

space and processing time. Data compression techniques reduce the size of volumetric data, converting the array into a representation which is stored more efficiently. Fractal techniques based on iterated function systems [11, 2] have been successfully applied to the compression of one-dimensional signals [3, 31] and two-dimensional images [13, 8], by finding a fractal representation that models the original signal as closely as possible, and storing the model instead of the original data. Fractal volume compression uses analogous techniques for the encoding of three-dimensional volumetric data. This research in fractal volume compression is part of our recurrent modeling project, which focuses on the abilities of the recurrent iterated function system [3] as a geometric representation. Fractal volume compression provides a first step toward the use of linear fractals to model arbitrary 3-D shapes. Section 2 reviews the fractal image compression technique and summarizes four items of previous work relating to fractal volume compression. Section 3 extends the components of the fractal image compression method to volumetric data. Section 4 discusses optimization of the search used by fractal compression methods, allowing fractal volume compression to perform its search in a feasible amount of time. Section 5 briefly describes decompression algorithms. Section 6 lists the results of various experiments and compares them with previous results. Section 7 concludes and offers directions for further research.

2 Previous Research

This section begins by reviewing the essentials of the fractal image compression method, mentioning elements that directly contribute to the fractal volume compression method. Compression researchers have only recently applied their methods to the task of compressing volumetric data. There appear to be only two previous extensions of sophisticated image compression techniques (vector quantization and the discrete cosine transform) to volumetric data. In addition, there is previous work in fractal compression of 3-D image data, but only in the form of animations and multi-view imagery. The rest of this section summarizes these previous techniques, whereas their results appear in Section 6.7 for better comparison to fractal volume compression.

2.1 Fractal Image Compression

Fractal image compression partitions an image into contiguous, non-overlapping square range blocks, and imposes a second, coarser partition on the image, separating it into larger, possibly-overlapping square domain blocks [13]. The collection of domain blocks forms the domain pool D. Next, a class of contractive block transformations, the transformation pool T, is defined. Each transformation consists of a value¹ component, which alters the scalar values of voxels within the block, and a geometry component, which shuffles (permutes) the positions of voxels within a block. An image block transformation is contractive if and only if it brings every two pixels in a block both nearer spatially (by re-sampling it at a lower resolution) and closer in value (by reducing its contrast). For each range block R, the encoding process searches for the (D, T) ∈ D × T such that T(D) best matches R. The quality of this match is determined by a distortion measure. The L2 distortion measure, which sums the squared differences between corresponding pixels of two w × h blocks X and Y, is

L_2(X, Y) = \sum_{x=0}^{w-1} \sum_{y=0}^{h-1} (X(x, y) - Y(x, y))^2    (1)

If the search fails to find a satisfactory match, the range block can be subdivided and some or all of its children encoded. Storage of the range block is replaced by storage of the indices of the domain block and transformation. This technique yielded high-fidelity image encoding at compression rates ranging from 9:1 to 12:1 [13]. Numerous variations of the basic fractal image compression have appeared [28]. Techniques to reduce the large search space of transformed domain blocks are of particular interest

¹This component was called massic in [13], presumably due to fractal image compression's roots in iterated function systems and measure theory.


to the volumetric case due to its increased time and space complexity. A block classification scheme [26] trims the search space by a constant factor [13], segregating searches within simple edge, mixed edge, midrange or shade block classes. The most promising method reduces the time complexity of the search from linear to logarithmic by adapting existing sophisticated nearest-neighbor techniques [27].
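The L2 distortion of Equation 1 is simple to implement. The sketch below assumes blocks stored as nested lists indexed [x][y], an illustrative layout rather than the paper's:

```python
def l2_distortion(X, Y):
    """Sum of squared differences between two equal-sized w x h blocks (Eq. 1)."""
    w, h = len(X), len(X[0])
    return sum((X[x][y] - Y[x][y]) ** 2
               for x in range(w) for y in range(h))

# Two 2x2 blocks differing by 1 at every pixel -> distortion 4.
print(l2_distortion([[0, 0], [0, 0]], [[1, 1], [1, 1]]))  # 4
```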

2.2 Vector Quantization of Volumetric Data

Vector quantization (VQ) is a popular technique for compressing images as well as many other forms of data [10]. It attains high compression rates with little coding error by finding the representative blocks of pixels, and representing the input in terms of these blocks. When applied to images, vector quantization partitions the input into contiguous, non-overlapping blocks. Based on the distributions of the block contents and the desired compression rate, a codebook is constructed containing a small but diverse subset of these representative blocks. Each of the original blocks is then coded as an index to its nearest representative in the codebook. The decoding algorithm simply performs a table lookup for each stored index. Vector quantization extends directly to volume data [23]. Encoding the gradients as well as the scalar values of volume data accelerates the rendering of VQ-compressed volume data, directly from the compressed version [24]. Shading the codebook as a preprocess reduced total shading time by a factor of 1000. Rendering the codebook as a preprocess and compositing the resulting pixmaps reduced orthographic-projection rendering time by a factor of 12. High-speed VQ rendering requires a "space-filling" block arrangement, such that each block contains enough information to interpolate the values of the cells. For example, their 128³ dataset consists of (128/2)³ = 262,144 "voxel-spanning" 2³-voxel blocks, but consists of (127/1)³ = 2,048,383 space-filling blocks. Moreover, high-speed VQ requires the storage of not only the scalar value at each voxel, but also its three normal components and a color. Hence, their example "compression" for high-speed rendering of a CT dataset using 2³-voxel

blocks first expands the data by a factor of five with shading information, then compresses the result by a factor of only five due to the space-filling arrangement of blocks, resulting in no compression, just fast low-quality rendering.
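The VQ encode/decode cycle described above can be sketched as follows. The codebook and blocks are toy values; a real coder would build the codebook from the block distribution (e.g., with a Lloyd-style algorithm):

```python
def vq_encode(blocks, codebook):
    """Map each block to the index of its nearest codeword (squared L2)."""
    def d2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(range(len(codebook)), key=lambda i: d2(b, codebook[i]))
            for b in blocks]

def vq_decode(indices, codebook):
    """Decoding is a table lookup per stored index."""
    return [codebook[i] for i in indices]

codebook = [(0, 0, 0, 0), (10, 10, 10, 10)]   # toy 2x2-pixel codewords
blocks = [(1, 0, 0, 1), (9, 10, 11, 10)]
idx = vq_encode(blocks, codebook)             # [0, 1]
print(idx, vq_decode(idx, codebook))
```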

2.3 DCT Volume Compression

The standard JPEG [25] image compression scheme, based on the discrete cosine transform (DCT), also extends to volumetric data [32]. The scheme partitions the input volume into 8³-voxel blocks and the DCT converts these 512 spatial values into 512 frequency values, which quantize and entropy-encode better than the original spatial data. Discrete cosine transform algorithms have been studied exhaustively, and existing optimized algorithms for computing DCT coefficients have yielded very fast compression times (e.g., 132.1 seconds for a 12-bit 256³ CT head dataset). An overlapping (space-filling) arrangement of the blocks, as in the previous example, allowed DCT-compressed volumes to be rendered directly from the compressed version. Non-overlapping (voxel-spanning) blocks of 8³ voxels were collected into overlapping (space-filling) "macro-blocks" of 32³ voxels. Moreover, the overlap was increased from one voxel to three to support gradient computation, but the redundancy was reclaimed through careful compression of macro-block boundaries. Macro-blocks were accessed and decompressed on demand, one at a time, during the rendering process. This scheme saved time by avoiding the unnecessary decompression of blocks that were completely occluded, and saved space, allowing compressed volumes to be rendered on workstations that lack the necessary memory to store the entire uncompressed volume.
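The separable 3-D DCT at the heart of such a scheme can be sketched by applying an orthonormal 1-D DCT-II along each axis in turn. This is a naive version for illustration; production coders use fast factorizations:

```python
import math

def dct1(v):
    """Orthonormal 1-D DCT-II of a sequence v."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(s * math.sqrt((1 if k == 0 else 2) / n))
    return out

def dct3d(block):
    """Separable 3-D DCT of an n*n*n block b[x][y][z]:
    transform along z, then y, then x."""
    n = len(block)
    b = [[list(block[x][y]) for y in range(n)] for x in range(n)]
    for x in range(n):                      # along z
        for y in range(n):
            b[x][y] = dct1(b[x][y])
    for x in range(n):                      # along y
        for z in range(n):
            col = dct1([b[x][y][z] for y in range(n)])
            for y in range(n):
                b[x][y][z] = col[y]
    for y in range(n):                      # along x
        for z in range(n):
            col = dct1([b[x][y][z] for x in range(n)])
            for x in range(n):
                b[x][y][z] = col[x]
    return b
```

A constant block concentrates all of its energy in the DC coefficient, which is why the frequency representation quantizes and entropy-encodes so well.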

2.4 Fractal Encoding of Moving Pictures

The fractal encoding of moving picture sequences was demonstrated in [5] as an extension of one- and two-dimensional signal encoding. Using the temporal data as a third dimension, 4³-voxel range blocks and 12³-voxel domain blocks were defined over a series of 12 partial video sequences (termed slabs) for 3-D fractal compression. Treating time simply as a third

dimension caused annoying artifacts in the decompressed frames. Accurately-reproduced high-contrast edges tended to visually mask errors that occurred in the lower-frequency background. Once an edge passed from the scene, these previously-masked errors became more obvious, causing a visual incongruity between edge and non-edge frames. Increasing the frame rate reduced this problem, but effectively reduced the system to a still image coder. As a result, this form of volume compression was abandoned and a new system was developed that compressed the first frame using a fractal technique whereas successive frames simply drew upon 2-D domain blocks from the previous frame. Reasonable quality was reported at 80 Kb/s for 352 × 288 monochrome frames at 10 frames per second.

2.5 Fractal Compression of a Multi-View Image

A fractal compression technique was devised for another form of three-dimensional image data called a multi-view image, a collection of images obtained from viewing an object from a discrete range of positions. Given enough images the object can be re-observed in 3-D with smooth transitions between continuously changing viewpoints as motion parallax places the object in relief. The tremendous size of these 3-D datasets necessitates compression techniques that exploit the view-to-view coherence between neighboring images. The encoding technique used in [22] for multi-view images is an application of the algorithm presented in [13] with very few extensions to take advantage of the extra dimension. Their 3-D dataset contained color values along the x, y, and v (view) axes, and was partitioned into 8 × 8 × 5-voxel range blocks. Neighboring range blocks along the view axis overlapped slightly, such that when decoded, these overlapping regions averaged together to produce smoother transitions between reconstructed range blocks. Domain blocks consisted of eight range blocks, and were searched in an outward spiraling path in the xy-plane, extending from the goal range block, with the expectation that matching domain blocks are spatially near the range block [13]. Only four isometry transformations were used. As in [13], range and domain blocks were classified, though in this case based solely on

the variance of their brightness. Range blocks were also subdivided when necessary. After entropy encoding the parameters describing the fractal transformations, a 17-view color multi-view image of a toy dog was coded at a bit rate of 0.1095 bits per pixel with a PSNR of 37.52 dB. Fractal coding is appreciated for its resolution independence. Since fractal transformations simply describe a contraction from one region of the dataset to another, they can be used at any resolution. Artificial data was interpolated from the compressed representation by decoding at a higher resolution than the original dataset. This feature supported the approximation of a 51-view 3-D image with a 17-view image, which effectively improved the bit rate from 0.1095 to 0.0365 bits per pixel [22].

3 Fractal Volume Compression

Just as fractal image compression encodes images with block-restricted contractive self-transforms, fractal volume compression likewise encodes volumes. The design issues for creating a fractal block coding system for volumes are direct extensions of those used in fractal image compression. The process partitions the volume into both fine range blocks and coarse domain blocks, and then finds for each range block the transformed domain block that best matches it. This section begins with the characterization and notation of volumetric data, followed by descriptions of the volumetric range, domain and transformation pools.

3.1 Volume Datasets

A volumetric dataset is a collection of voxels which represent some measurable property of an object sampled on an integer 3-D grid. The algorithm presented here is constrained to input volumes with scalar voxels. Volumes with vector samples may be divided into separate datasets containing only scalar voxels and each resulting new dataset is encoded

independently². The functional notation V(x, y, z) ∈ ℝ denotes the voxel located in the input volume at the grid point (x, y, z) ∈ ℤ³. Each voxel outside the volume's region of support is defined to be zero (i.e., a volume with dimensions W × H × D implies V(x, y, z) ≡ 0 if (x, y, z) ∉ [0, W − 1] × [0, H − 1] × [0, D − 1]). Each voxel is normally quantized to b bits by mapping it to the integer range 0 … 2^b − 1, thus requiring W · H · D · b bits to store the entire volume directly. Volumetric data typically arises from the spatial measurement of real-world data, but also from simulated sources. Medical applications use computed-tomography (CT) or magnetic-resonance-imaging (MRI) scans. The study of aerodynamics depends on the results of wind tunnel data and computational fluid dynamics simulations. Computer graphics has found situations in which a volumetric representation performs better than a surface description [16, 15]. Volume compression makes the management of such massive amounts of volumetric data feasible for general use in existing facilities. The distortion metric

L_2(X, Y) = \sum_{x=0}^{w-1} \sum_{y=0}^{h-1} \sum_{z=0}^{d-1} (X(x, y, z) - Y(x, y, z))^2    (2)

measures the similarity or "distance" between two w × h × d blocks of voxels X and Y. The notation L_2(X, Y) will be used to represent this value.

3.2 Volumetric Range Partitioning

As with many compression algorithms, the dataset is partitioned into small spatial regions and each region is encoded separately. The simple scheme used here dices the volume into w × h × d non-overlapping cuboid cells (termed range blocks) and encodes each cell (i.e., subvolume) individually. A range block is uniquely located within the volume by a corner grid point (x_r, y_r, z_r) taken from the set

\mathcal{R} = \{ (iw, jh, kd) \mid (i, j, k) \in [0, \lceil W/w \rceil - 1] \times [0, \lceil H/h \rceil - 1] \times [0, \lceil D/d \rceil - 1] \}    (3)

²In image compression, color images are analogously divided into independently-compressed color planes.


which defines all the range blocks in this partitioning set. The voxels that comprise any range block can be enumerated by the function

R(x, y, z) = V(x_r + x, y_r + y, z_r + z), \quad (x, y, z) \in [0, w-1] \times [0, h-1] \times [0, d-1]    (4)

which conveniently allows us to reference voxels within a range block without regard to their global positions within the volume. Adaptive partitioning subdivides range blocks that fail to encode satisfactorily. Larger range blocks yield a higher compression rate, though typically at a lower fidelity. The encoder first attempts to find maps for large range blocks. Each range block is subdivided into eight children. If the coding of the range block results in a distortion above a specified threshold t_mse for any child, then that child block is coded separately (and possibly subdivided itself). If seven or eight children require separate coding, then a short stub code replaces the range block's code, and all eight children are encoded separately. The overhead associated with tracking this hierarchical coding requires that each (non-stub) parent code contain child configuration information.
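The adaptive scheme can be sketched as a recursive octree descent. Here `toy_encode` is a stand-in for the real domain search (it reports the block's value spread as "distortion"), and the threshold and minimum block size are illustrative values, not the paper's:

```python
T_MSE = 4.0      # distortion threshold (illustrative)
MIN_SIZE = 2     # smallest range-block edge considered

def split_into_octants(b, size):
    """Split a cubic block b[x][y][z] into its eight half-size children."""
    h = size // 2
    return [[[[b[ox + x][oy + y][oz + z] for z in range(h)]
              for y in range(h)] for x in range(h)]
            for ox in (0, h) for oy in (0, h) for oz in (0, h)]

def toy_encode(block, size):
    """Stand-in for the domain search: code = block minimum,
    'distortion' = value spread within the block."""
    vals = [block[x][y][z] for x in range(size)
            for y in range(size) for z in range(size)]
    return min(vals), max(vals) - min(vals)

def encode_range(block, size, encode_block=toy_encode):
    """Code a range block; subdivide into eight children when the
    distortion exceeds T_MSE and the block is still divisible."""
    code, dist = encode_block(block, size)
    if dist <= T_MSE or size <= MIN_SIZE:
        return ('leaf', code)
    return ('stub', [encode_range(c, size // 2, encode_block)
                     for c in split_into_octants(block, size)])
```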

3.3 Volumetric Domain Pool

A set of nw × nh × nd (n = 2, 3, …) subvolumes called domain blocks are now defined by the set of corner lattice points

\mathcal{D} = \left\{ (i\Delta_x, j\Delta_y, k\Delta_z) \;\middle|\; (i, j, k) \in \left[0, \left\lfloor \tfrac{W - nw}{\Delta_x} \right\rfloor\right] \times \left[0, \left\lfloor \tfrac{H - nh}{\Delta_y} \right\rfloor\right] \times \left[0, \left\lfloor \tfrac{D - nd}{\Delta_z} \right\rfloor\right] \right\}    (5)

which are spread evenly throughout the volume by the integer domain spacing parameters (Δ_x, Δ_y, Δ_z). A domain block located by the grid point (x_d, y_d, z_d) ∈ D uses the function

D(x, y, z) = V(x_d + x, y_d + y, z_d + z), \quad (x, y, z) \in [0, nw - 1] \times [0, nh - 1] \times [0, nd - 1]    (6)

to reference local constituent voxels. Note that the lattice support of these domain blocks may overlap for large domain pools (i.e., the Δ's are small) or may be widely spread apart (i.e., the Δ's are large).

The domain block search is a minimization problem, but often the true minimum is discarded in practice when other domain blocks yield higher compression with sufficient fidelity. While large domain pools are effective for providing accurate maps, they can decrease compression performance by forcing large indices to be transmitted to the decoder. Often a small localized search can provide sufficiently accurate domain maps for many range blocks. A common scheme for locating source domain blocks (see [22]) is an outward spiral search emanating from the target range block. Transmitted indices along this spiral path will usually provide low bit rates when entropy encoded [4]. Fractal volume compression utilizes a two-pass system that first checks a small number (e.g., 64) of spatially close domain blocks (using only the identity transformation). For those range blocks that encode poorly during this first pass, a larger (global) domain pool is interrogated during a second pass. The first pass is very rapid (and significantly reduces the number of searches needed during the slower second pass). The global domain pool is extremely large and Section 4 addresses its search.
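The first pass of this two-pass scheme can be sketched as follows: rank the domain-block origins by distance from the range block and keep the nearest few, emulating the outward spiral. The count of 64 comes from the text; the positions are toy values:

```python
def local_domain_candidates(range_pos, domain_positions, count=64):
    """First-pass pool: the `count` domain-block origins spatially
    closest to the range block (checked with the identity transform only)."""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    return sorted(domain_positions, key=lambda p: d2(p, range_pos))[:count]

origins = [(i, 0, 0) for i in range(10)]
# The co-located origin comes first, then its spatial neighbors.
print(local_domain_candidates((3, 0, 0), origins, count=3))
```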

3.4 Volumetric Transformation Pool

A set of transformations T are used to map (source) domain blocks to (target) range blocks. Each transformation is composed of several components. The first component re-samples domain blocks at a coarser resolution to obtain geometric contractivity. The second removes the mean value (i.e., the DC component) from each of the domain block's voxels. Another set of transformations alters the values of voxels within a block by scaling each voxel by a constant and adding an offset to the result. Constraining the magnitude of the scaling constant to be strictly less than one guarantees contractivity, but the introduction of the orthogonalization operator appears to remove any restrictions on the size of these scaling coefficients [12]. The diversity of the domain pool is increased by the last class of transformations, which permute voxels within the block. The spatial contraction operator C_n decimates an nw × nh × nd domain block so it is

re-sampled with the same spatial resolution as a w × h × d range block:

C_n \circ D(x, y, z) = \frac{1}{n^3} \sum_{i=0}^{n-1} \sum_{j=0}^{n-1} \sum_{k=0}^{n-1} D(nx + i, ny + j, nz + k).    (7)

This effectively applies an n³ averaging filter to the domain block followed by a simple subsampling operation. Here we have forced the dimensions of the domain block to be integral multiples of the range block dimensions, but the following equivalent decimation operator C'_n can be used to contract w' × h' × d' domain blocks without this restriction:

C'_n \circ D(x, y, z) = \frac{whd}{w'h'd'} \sum_{i=0}^{w'/w - 1} \sum_{j=0}^{h'/h - 1} \sum_{k=0}^{d'/d - 1} D\!\left( \left\lfloor \frac{x w'}{w} \right\rfloor + i,\; \left\lfloor \frac{y h'}{h} \right\rfloor + j,\; \left\lfloor \frac{z d'}{d} \right\rfloor + k \right).    (8)
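For the divisible case, the averaging decimation of Equation 7 is direct to implement; the nested-list block layout below is assumed for illustration:

```python
def contract(D, n):
    """C_n: decimate an (nw, nh, nd) domain block D[x][y][z] to (w, h, d)
    by averaging each n^3 cell (Eq. 7)."""
    w, h, d = len(D) // n, len(D[0]) // n, len(D[0][0]) // n
    return [[[sum(D[n*x + i][n*y + j][n*z + k]
                  for i in range(n) for j in range(n) for k in range(n)) / n**3
              for z in range(d)]
             for y in range(h)]
            for x in range(w)]

# A 2x2x2 block holding the values 0..7 averages to the single voxel 3.5.
print(contract([[[0, 1], [2, 3]], [[4, 5], [6, 7]]], 2))  # [[[3.5]]]
```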

An orthogonalization transformation O is now applied which simply removes the contracted domain block's DC component so that it has zero mean. Given the domain block mean

\bar{d} = \frac{1}{whd} \sum_{x=0}^{w-1} \sum_{y=0}^{h-1} \sum_{z=0}^{d-1} C_n \circ D(x, y, z),    (9)

this operation is described by

O \circ C_n \circ D(x, y, z) = C_n \circ D(x, y, z) - \bar{d}.    (10)

Next, the affine voxel value transformation G_{α,β} scales each voxel in a block B(·) by α and adds β to the result:

G_{\alpha,\beta} \circ B(x, y, z) = \alpha B(x, y, z) + \beta.    (11)

When α = 0 this transformation simply defines a block with a single uniform value of β. In place of the eight 2-D block isometries used in [13], we now have 48 3-D block isometries for cubic partitions (i.e., w = h = d). Exactly half of them are listed in Tables 1 and 2, and represent the rigid body rotations about thirteen axes (three face axes, six edge axes and four vertex axes). These are doubled to include all reflections by complementing all three coordinates in each case. Non-cubic partitions only consider eight isometries: the identity, the three 180° rotations about the principal axes, the reflections of the principal axes, and total reflection, shown in Table 3.

(x, y, z)                                                  Identity
(−x, −y, z)  (x, −y, −z)  (−x, y, −z)                      Rotation of π about the three face-face axes
(−y, x, z)  (y, −x, z)  (x, z, −y)  (x, −z, y)  (−z, y, x)  (z, y, −x)    Rotation of π/2 about the three face-face axes

Table 1: Ten rigid body cubic rotations, shown in terms of their result on the input vector (x, y, z).

The transformation pool T is the set of all possible transformations of the form

T = I_k \circ G_{\alpha,\beta} \circ O \circ C_n    (12)

which can be parameterized by an isometry index 0 ≤ k < 8, a "contrast scale" α and a "luminance shift" β. For effective quantization the value of α is chosen from a finite set {α_0, …, α_{n−1}} (determined a priori) and β is mapped to the nearest integer. We will use the notation T(D) to denote the net effect of the transformation T ∈ T on the domain block D ∈ D.
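One member of the pool in Equation 12, applied to an already-contracted block, can be sketched as below. Only the eight sign-flip isometries of Table 3 are included, and axis mirroring stands in for full rotation bookkeeping, so this is an illustrative reduction rather than the full 48-isometry pool:

```python
def apply_transform(block, k, alpha, beta):
    """Apply O (remove the mean), G (scale by alpha, shift by beta),
    then isometry I_k to a contracted block b[x][y][z] (cf. Eq. 12)."""
    w, h, d = len(block), len(block[0]), len(block[0][0])
    mean = sum(block[x][y][z] for x in range(w) for y in range(h)
               for z in range(d)) / (w * h * d)
    sx, sy, sz = [(1, 1, 1), (-1, -1, 1), (-1, 1, -1), (1, -1, -1),
                  (-1, 1, 1), (1, -1, 1), (1, 1, -1), (-1, -1, -1)][k]
    def src(x, y, z):  # mirror an axis when its sign is negative
        return ((w - 1 - x if sx < 0 else x),
                (h - 1 - y if sy < 0 else y),
                (d - 1 - z if sz < 0 else z))
    out = [[[0.0] * d for _ in range(h)] for _ in range(w)]
    for x in range(w):
        for y in range(h):
            for z in range(d):
                u, v, t = src(x, y, z)
                out[x][y][z] = alpha * (block[u][v][t] - mean) + beta
    return out
```

With α = 0 the output collapses to the uniform value β, exactly as noted after Equation 11.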

4 Searching

For each range block R ∈ R, a transformation T ∈ T and domain block D ∈ D are sought that yield a sufficiently small distortion measure L_2(R, T(D)). This search dominates the fractal compression process, and its extension to volumetric data causes this search time

(y, x, −z)  (−y, −x, −z)  (z, −y, x)  (−z, −y, −x)  (−x, z, y)  (−x, −z, −y)    Rotation of π about the six edge-edge axes
(y, z, x)  (y, −z, −x)  (−y, z, −x)  (−y, −z, x)  (z, x, y)  (z, −x, −y)  (−z, x, −y)  (−z, −x, y)    Rotation of ±2π/3 about the four vertex-vertex axes

Table 2: Fourteen more rigid body cubic rotations, shown in terms of their result on the input vector (x, y, z).


(x, y, z)                                      Identity
(−x, −y, z)  (−x, y, −z)  (x, −y, −z)          Rotations
(−x, y, z)  (x, −y, z)  (x, y, −z)             Reflections
(−x, −y, −z)                                   Total reflection

Table 3: Eight non-cubic transformations.

to greatly lengthen. Designing a fast encoder hinges on the efficiency of the search for matching transformed domain blocks. This problem is approached by introducing heuristics to guide the encoder's search towards regions of the dataset that share common characteristics and are therefore more likely to provide self-similar mappings. A classification scheme [26] provides such guidance [13]. This scheme used an edge-oriented classification system designed to alleviate both the time complexity and edge degradation problems inherent to block image coders. This system does not easily extend to the realm of 3-D block classification as a consequence of the complexity added by the extra dimension. One previous volumetric classification scheme [22] thresholded the sample variance of voxels within a block. This has the undesirable effect of avoiding rapidly convergent transformations that map high-contrast regions to low-contrast range blocks. Even with classification, this search for self-affine maps still remains sequential. Instead of segregating blocks into a set of predefined classes, associating a real-valued key with each block replaces the linear global block search with a logarithmic multi-dimensional nearest-neighbor search [27]. This solution is readily generalized to 3-D block encoding and allows for much larger search spaces, which improve coding fidelity but also require larger domain indices.

4.1 Brute Force Search

A brute force algorithm is effective for the first-pass search of the local domain pool but is too time consuming for the global domain pool. For each range block R ∈ R, the compression algorithm searches the domain and transformation pools for the domain block D ∈ D and transformation T ∈ T that minimize the distortion L_2(T(D), R), where T(D) is the result of applying the transform T to the domain block D. The resulting set of "self-tiling" maps produces the "fractal" code that replaces the input volume dataset. A complete search of the virtual codebook D × T involves examining all possible parameterizations (k, α, β) and their effect on every domain block in the domain pool. The algorithm in Table 4 outlines this "brute force" solution. Every range block in the partitioning set R is examined, for which the entire domain pool D and transformation pool T is scanned for the optimal transformation. One can prune the exhaustive search of D × T by solving for the optimal gray-value transformation coefficients. If we represent range blocks and contracted domain blocks with the m × 1 (m = w · h · d) column vectors [r_0 … r_{m−1}]^T and [d_0 … d_{m−1}]^T, respectively, then the optimal choice for α and β provides the least squares fit to an over-determined system of the form Ax = b:

\begin{bmatrix} r_0 \\ r_1 \\ \vdots \\ r_{m-1} \end{bmatrix} = \begin{bmatrix} d_0 & 1 \\ d_1 & 1 \\ \vdots & \vdots \\ d_{m-1} & 1 \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \end{bmatrix}.    (13)

If we consider uniform domain blocks (i.e., d_0 = d_1 = ⋯ = d_{m−1}) non-admissible, then the columns of A are linearly independent, the matrix A^T A is invertible and the unique least-squares solution x = [α β]^T is

x = (A^T A)^{-1} A^T b.    (14)

for R ∈ R do begin
    dist ← ∞;
    for D ∈ D do
        for T ∈ T do begin
            D′ ← T(D);
            if L2(R, D′) < dist then begin
                code ← {T, D};
                dist ← L2(R, D′);
            end
        end;
    replace R with quantized code;
end;

Table 4: Exhaustive search over D × T for each range block R ∈ R.
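A runnable rendering of the Table 4 loop, with flattened blocks and toy transform/domain pools (the real coder transforms 3-D blocks and quantizes the winning code):

```python
def brute_force_search(R, domains, transforms):
    """Exhaustive search over the virtual codebook D x T (Table 4):
    keep the (transform, domain) index pair minimizing the L2 distortion."""
    def l2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    best, best_dist = None, float('inf')
    for di, D in enumerate(domains):
        for ti, T in enumerate(transforms):
            dist = l2(R, T(D))
            if dist < best_dist:
                best, best_dist = (ti, di), dist
    return best, best_dist

# Toy pools: flattened 1-D "blocks" and two value transforms.
domains = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
transforms = [lambda D: D,                         # identity
              lambda D: tuple(2 * v for v in D)]   # contrast doubling
print(brute_force_search((2.0, 4.0, 6.0), domains, transforms))  # ((1, 1), 0.0)
```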


Let \bar{d} = E(d_i) = \frac{1}{m} \sum_{i=0}^{m-1} d_i be the first moment or mean of d_i (similarly for \bar{r} = E(r_i)), let \sigma_d^2 = E(d_i^2) - (E(d_i))^2 be the second central moment or variance of d_i, and let \sigma_{rd} = E(r_i d_i) - E(r_i) E(d_i) be the second central moment or covariance of r_i and d_i; then

x = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} \sigma_{rd} / \sigma_d^2 \\ \bar{r} - \alpha \bar{d} \end{bmatrix}.    (15)

The orthogonalization transformation O yields \bar{d} = 0. The offset value β simply becomes the DC component \bar{r} of the range block. The value α_i closest to α from the set {α_0, …, α_{n−1}} is selected as our contrast scaling coefficient. β is simply mapped to the nearest integer. It is well known from transform coding that much of the energy in an image represented in the frequency domain resides in its DC term. Therefore, it is critical that the decoder can faithfully reproduce β from whatever quantization scheme is used.
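Equation 15 reduces the per-candidate least-squares fit to running moments; a direct sketch over flattened blocks:

```python
def optimal_alpha_beta(r, d):
    """Least-squares gray-value coefficients from Eq. 15:
    alpha = cov(r, d) / var(d), beta = mean(r) - alpha * mean(d)."""
    m = len(r)
    rbar = sum(r) / m
    dbar = sum(d) / m
    var_d = sum(v * v for v in d) / m - dbar * dbar
    cov = sum(a * b for a, b in zip(r, d)) / m - rbar * dbar
    alpha = cov / var_d   # uniform blocks (var_d == 0) are non-admissible
    beta = rbar - alpha * dbar
    return alpha, beta

# A range block that is exactly 2*d + 1 recovers alpha = 2, beta = 1.
print(optimal_alpha_beta([1, 3, 5, 7], [0, 1, 2, 3]))
```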

4.2 Volumetric Block Classification

Block classification was used in [13] not only for the encoding speedup that was obtained by segregating the search, but also for determining the complexity of each block. This allowed more expressive transformations to be used for highly detailed blocks and simpler transformations for less interesting blocks. Devoting a higher bit rate to more complex areas of the image is worthwhile for maximum performance. Extension of the block classification techniques borrowed from classified VQ [26] into 3-D is non-trivial and would no doubt be computationally expensive. The final volume coder used in our experiments used a simple variance threshold value to separate blocks into one of two classes: boring and active. Fewer bits were devoted to contrast scaling coefficients and no isometries (other than the identity) were attempted for blocks belonging to the boring class. In [6] principal component analysis (PCA) classified blocks based on their gradients. PCA is a well known technique for finding a set of ordered orthogonal basis vectors that express the directions of progressively smaller variance in a given dataset [21, 14]. The dimensionality of highly correlated datasets can be reduced in this manner by expressing their values in terms of these new basis vectors. This is often used for finding normals to planes that best fit a set

of scattered data points {x_i}. The technique of weighted PCA assigns a relative importance or weight w_i to each point x_i in the set. A set of principal vectors was extracted from subvolume blocks in this manner by using the block's supporting lattice for each x_i and the corresponding voxel values for the weights w_i. While this reduced encoding time and bit rate for the volume coder, some complex blocks were misclassified as not containing significant gradients, thus leaving the possibility of critical blocks producing poor collage maps. Three-dimensional block classification remains a fertile area of study that could prove beneficial to many areas in volume visualization.
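The two-class split used by the final coder amounts to a variance threshold; a minimal sketch, where the threshold value is illustrative rather than the paper's:

```python
def classify_block(voxels, threshold=25.0):
    """Two-class classifier: 'boring' blocks get fewer contrast-scale bits
    and identity-only isometries; 'active' blocks get the full search."""
    m = len(voxels)
    mean = sum(voxels) / m
    var = sum((v - mean) ** 2 for v in voxels) / m
    return 'active' if var > threshold else 'boring'

print(classify_block([0.0] * 8))          # boring: uniform block
print(classify_block([0.0, 100.0] * 4))   # active: high-contrast block
```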

4.3 Nearest Neighbor Search

Even with classification, we are still faced with a demanding search through N admissible domain blocks (under the set of allowed isometries), computing least squares approximations along the way, to find the optimal encoding for each range block. This daunting task can be reduced to a multi-dimensional nearest neighbor search which can be performed in expected logarithmic time [27]. Using the notation of Section 4.1, the transformed domain block Ax is the projection of r onto the column space of A, which is spanned by the orthonormal basis vectors e = \frac{1}{\sqrt{m}} [1 \ldots 1]^T and φ(d), where

\phi(x) = \frac{x - \langle x, e \rangle e}{\| x - \langle x, e \rangle e \|}.    (16)

Thus the projection Ax is equivalent to

\mathrm{Proj}_A(r) = \langle r, e \rangle e + \langle r, \phi(d) \rangle \phi(d)    (17)

for a given domain block d. Note that φ(x) simply removes the DC term of x, which is accomplished by the operator in Equation 10. Since e and φ(x) are orthogonal, the coefficients α and β of the gray level transformation G from Equation 11 are not correlated. In [27], the search for the domain block d that yields minimal coding error L_2(r, Proj_A(r)) is shown to be equivalent to the search for the nearest neighbor of φ(r) in the set of 2N vectors {±φ(d_i)}_{i=1}^N. The problem is now reduced to finding nearest neighbors in Euclidean space, for

which there are well known algorithms [9, 30]. Given a set of m-dimensional points, [9] shows how an optimal space dividing kd-tree can be constructed with O(mN log N) preprocessing steps so that the nearest neighbors of a given query point can be found in expected O(log N) time. Since this technique suffers when the dimension m becomes large, we down-filter all blocks to 2³ (m = 8) vectors before processing. All domain blocks that are close to uniform (e.g., σ_d² ≤ 8) are discarded, while the remaining domain block addresses are stored in the kd-tree twice, once for each search key ±φ(d). Since we are down-sampling r and d to compute φ(r) and φ(d), the kd-tree search will not guarantee finding the actual nearest neighbors. Fortunately, searching the kd-tree for several (e.g., 10 or 20) neighbors can still be done in logarithmic time for a given query point. For each neighbor the least squares method for finding α and β is performed at the range block's full resolution. Also, in order to reduce the memory requirements, we perform the search for isometries of the range block {I_k^{-1}(r)}_{k=0}^{7} instead of explicitly storing all of the keys {φ(I_k(d))}_{k=0}^{7} for each domain block. A substantial speedup is possible if we relax the constraint of finding the true nearest neighbors to simply finding good matches. When searching for the n nearest neighbors, the algorithm described in [9] keeps track of the dissimilarity measure r (i.e., radius) of the nth best match found so far. It is this value r that informs us whether we need to traverse another branch of the space partitioning tree. If we scale r by some number less than 1 (e.g., 0.2) when making this decision, we can avoid inspecting large portions of the tree. This modification was added in our encoder without significantly affecting the quality of the maps found.
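The key construction of Equation 16 and the ±φ(d) storage trick can be sketched as follows. A linear scan stands in for the kd-tree of [9], and the blocks are toy 4-vectors:

```python
import math

def phi(x):
    """Project out the DC component and normalize (Eq. 16)."""
    m = len(x)
    mean = sum(x) / m
    c = [v - mean for v in x]
    norm = math.sqrt(sum(v * v for v in c))
    return [v / norm for v in c]   # near-uniform blocks are discarded earlier

def nearest_domain(r, domain_keys):
    """Stand-in for the kd-tree query: find the stored key (one entry per
    sign of phi(d)) nearest to phi(r) in Euclidean distance."""
    q = phi(r)
    def d2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(domain_keys, key=lambda item: d2(q, item[0]))[1]

# Each domain contributes two keys: (+phi(d), (index, +1)) and (-phi(d), (index, -1)).
doms = [[0, 0, 4, 4], [9, 1, 1, 9]]
keys = []
for i, d in enumerate(doms):
    p = phi(d)
    keys.append((p, (i, 1)))
    keys.append(([-v for v in p], (i, -1)))
print(nearest_domain([1, 1, 7, 7], keys))  # (0, 1)
```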

5 Fractal Volume Decompression

Complete decompression begins with an initial volume, typically initialized with zeroes. The next volume in the sequence is partitioned into range blocks, and the previous volume into domain blocks. Then the appropriate domain block in the previous volume is mapped to

each range block in the next volume. At each step of the iteration, the domain pool is refined, and as this process continues it converges to the decompressed volume. In implementation, the process actually needs only one volume, partitioned into both range and domain blocks, which is decompressed onto itself at each step of the iteration. Each range block is overwritten with the transformed contents of the appropriate domain block. If this overwrites a section of the volume later accessed as a domain block, then some or all of that domain block reflects the current iteration rather than the previous one. Some care is necessary for hierarchical (e.g., octree) representations to make sure that the larger, more lossy "parent" range blocks do not overwrite previously decoded "child" blocks from a previous iteration.

Faster techniques exist for decompressing fractal-coded images [20], and could be easily extended to the volumetric case. However, these techniques are typically designed for animation playback, and volume visualization systems are not yet fast enough to render dynamic volume datasets in real time.

The fractal volume compression algorithm can be adjusted to permit direct rendering of compressed data, avoiding a complete decompression step. Separating a volumetric dataset into a set of possibly overlapping "macro-blocks" supports on-demand decompression during rendering [32]. Such macro-blocks may be incorporated into fractal volume compression by selecting domain blocks within the macro-block containing the range block. In this fashion, each macro-block of the compressed volume may be decompressed independently.
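The iteration described above can be sketched with a one-dimensional analogue (an illustrative toy: each hypothetical code stores a range start and length, a domain start, and the gray-level scale and offset; real codes store 3-D block addresses and an isometry index, and the in-place variant uses a single buffer):

```python
def decode(codes, n, iterations=6):
    """Iteratively decode a 1-D fractal code onto a signal of length n.
    codes: list of (range_start, range_len, domain_start, scale, offset).
    Domain blocks are twice the range length and are down-sampled by
    pairwise averaging before the gray-level map is applied."""
    v = [0.0] * n                       # start from the zero signal
    for _ in range(iterations):
        nxt = v[:]                      # two-buffer variant for clarity
        for r0, rlen, d0, s, o in codes:
            for i in range(rlen):
                d = 0.5 * (v[d0 + 2 * i] + v[d0 + 2 * i + 1])  # down-sample
                nxt[r0 + i] = s * d + o                        # gray-level map
        v = nxt
    return v
```

With contractive scale factors the sequence of volumes converges geometrically to the attractor regardless of the initial signal, which is why only a handful of iterations are needed in practice.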

6 Results

The fractal volume compression algorithm was tested on a variety of popular, publicly available datasets, and its performance was measured using a variety of metrics, to promote better comparison with other existing and future volume compression methods.


6.1 Measurement

Several quantitative methods exist for indicating the fidelity of a compression algorithm. In the following, V is the original volume with dimensions W × H × D, and Ṽ is the decompressed volume. The function ε(x) represents the error between V(x) and Ṽ(x):

    ε(x) = V(x) − Ṽ(x).                                        (18)

The mean-square error (mse) is one numerical measure of the accuracy of a compressed volume, and is defined

    mse = ( Σ_x ε²(x) ) / ( W · H · D ).                       (19)

Another way to express the difference or "noise" between V and Ṽ is the signal-to-noise ratio (SNR). Several versions of SNR exist, which can cause confusion when comparing results. The signal-to-noise ratio used in [24], denoted SNR_f, measures the ratio of the signal variance to the error variance:

    SNR_f = 10 log₁₀ ( var Ṽ(x) / var ε(x) )
          = 10 log₁₀ [ ( E(Ṽ²(x)) − E²(Ṽ(x)) ) / ( E(ε²(x)) − E²(ε(x)) ) ].   (20)

If ε(x) and V(x) each have a mean of zero, then SNR_f is equivalent to the mean-squared signal-to-noise ratio

    SNR_ms = 10 log₁₀ ( Σ_x Ṽ²(x) / L²(V, Ṽ) ).                (21)

Even for datasets that are not zero-mean (as is the case here), this is often used to measure the quality of the reconstructed signal. The peak-to-peak signal-to-noise ratio, PSNR, is defined

    PSNR = 10 log₁₀ ( P² / var ε(x) ),                         (22)

where P² is the square of the dynamic range or peak signal value (the distance between the largest and smallest voxel values), and the denominator is the variance of ε(x), which is often simply approximated by the mse.
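The three measures can be computed directly from the definitions above (a sketch with hypothetical helper names; volumes are flattened to voxel lists, and PSNR uses the mse approximation noted in the text):

```python
import math

def mse(V, Vr):
    """Mean-square error, Equation (19), over flattened voxel lists."""
    return sum((v - w) ** 2 for v, w in zip(V, Vr)) / len(V)

def snr_ms(V, Vr):
    """Mean-squared SNR, Equation (21): decompressed-signal energy over
    total squared error L2(V, Vr)."""
    signal = sum(w * w for w in Vr)
    error = sum((v - w) ** 2 for v, w in zip(V, Vr))
    return 10.0 * math.log10(signal / error)

def psnr(V, Vr):
    """Peak SNR, Equation (22), with var eps approximated by the mse."""
    peak = max(V) - min(V)          # dynamic range P of the original
    return 10.0 * math.log10(peak ** 2 / mse(V, Vr))
```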

6.2 CT Medical Data

A 256² × 113 12-bit CT head dataset (containing voxel values ranging from −1117 to 2248) was used to test the fractal volume compression algorithm over several different range block sizes w × h × d, with octree subdivision down to an edge size no less than two voxels. An mse distortion threshold value t_mse (the maximum allowed average mse per voxel) determined which child blocks were recursively encoded. The domain pool blocks were spaced Δx = 4, Δy = 4, and Δz = 2 apart, yielding approximately 230,000 domain block records for each octree partition level. Encoding times were measured on a dual-processor HP-9000/J200 with 256 MB of RAM, and do not include file I/O time.

    w×h×d    t_mse    comp.      SNR_ms   PSNR      time
    2³       4,608    10.99:1    27.62    42.57       134 s
    4³       1,024    14.17:1    30.15    45.09       973 s
    4³       2,048    18.41:1    28.74    43.68       711 s
    4³       4,608    26.18:1    26.68    41.63       519 s
    8³       1,024    15.93:1    29.70    44.64     1,607 s
    8³       2,048    22.00:1    27.97    42.91     1,285 s
    8³       4,608    34.37:1    25.50    40.43     1,081 s
    3²×2     4,608    20.83:1    25.14    40.09       249 s
    6²×4     4,608    39.21:1    21.08    38.06       616 s

Table 5: Encoding results for a 12-bit 256² × 113 CT head using an mse octree partitioning threshold of 4608.

To demonstrate the fast convergence of the decoder, several reconstructed 12-bit CT heads were created using one to six iterations of the stored transformations. The fidelity of these reconstructed volumes (using the maps corresponding to the seventh row of Table 5) is given in Table 6. One advantage of introducing the orthogonalization operator O (see

Equation 10) is quick and guaranteed convergence to the attractor in a fixed number of iterations [12]. In this example, only five iterations of the simple method outlined in Section 5 were required to decode the data. Each iteration took approximately 15 seconds.

    iterations   1       2       3       4       5       6
    SNR_ms       13.24   21.09   24.99   25.47   25.50   25.50
    PSNR         28.08   36.03   39.34   40.40   40.43   40.43

Table 6: SNR for the decoded 12-bit CT head using 1 to 6 iterations of the stored transformations.

This same dataset was reduced to 8-bit voxels by mapping each original voxel v to the range {0 ... 255} by round(255 · (v + 1117)/3365). The 8-bit volume was compressed using various mse child threshold values t_mse to determine when octree subdivision would occur. Whereas images are generally compressed in their rendered state, volume data requires further processing, such as classification, shading and rendering [18, 7], before being displayed as an image. Hence errors introduced through compression yield different artifacts in the volumetric case than in the image case. An isovalue surface rendering, such as Figure 2, puts volume compression techniques to an extreme test. Moreover, faces are typically used to analyze compression fidelity, since humans have developed visual skills specifically for interpreting subtle changes in facial expression.
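The 8-bit requantization described above amounts to a one-line mapping (a sketch; `to_8bit` is a hypothetical helper name):

```python
def to_8bit(v, lo=-1117, hi=2248):
    """Map a 12-bit CT voxel in [lo, hi] to {0 ... 255}, as in the text:
    round(255 * (v + 1117) / 3365), since hi - lo = 3365."""
    return round(255 * (v - lo) / (hi - lo))
```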

6.3 MRI Medical Data

Fractal volume compression was also tested on an 8-bit 256² × 109 MRI dataset of a human head. The noise in MRI datasets challenges compression algorithms more than CT datasets.

    w×h×d    t_mse    comp.      SNR_f    PSNR      time
    2³       18       11.02:1    25.90    41.49       166 s
    3²×2     18       13.92:1    24.53    40.10       241 s
    4²×2      9       24.61:1    22.04    37.63       448 s
    4²×2     18       29.07:1    21.52    37.12       298 s
    4³       18       18.02:1    25.30    40.87       607 s
    6²×4      9       20.78:1    22.81    38.39       830 s
    6²×4     18       27.51:1    21.74    37.32       640 s
    8²×4     18       55.35:1    14.41    30.13       592 s
    8³       18       22.27:1    24.18    39.75     1,003 s
    12²×8     9       24.08:1    20.69    36.29     1,362 s

Table 7: Encoding results for an 8-bit 256² × 113 CT head using varying mse octree partitioning thresholds.


Figure 1: Slice #56 of the CT head volume dataset: Original (upper left), 18:1 with 4³-voxel range blocks (upper right), 22:1 with 6×4²-voxel range blocks (lower left), 22:1 with 8³-voxel range blocks (lower right).


Figure 2: Isovalue surface rendering of the skin (isovalue = 50) of the CT head volume dataset: Original (upper left), 18:1 with 4³-voxel range blocks (upper right), 22:1 with 6×4²-voxel range blocks (lower left), 22:1 with 8³-voxel range blocks (lower right).


For example, a simple run-length encoding of the CT head dataset reduces it to 68% of its original size whereas the RLE method failed to compress the MRI head dataset.
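The run-length comparison above can be reproduced in miniature (an illustrative sketch; real RLE coders pack counts into fixed-width fields and handle maximum run lengths):

```python
def rle(seq):
    """Run-length encode a sequence as (value, count) pairs."""
    out = []
    for v in seq:
        if out and out[-1][0] == v:
            out[-1] = (v, out[-1][1] + 1)   # extend the current run
        else:
            out.append((v, 1))              # start a new run
    return out

def rle_size(seq):
    """Encoded size at one byte per value plus one byte per count."""
    return 2 * len(rle(seq))
```

Smooth data such as the large zero background of the CT head collapses into a few long runs, while per-voxel noise like that in the MRI makes nearly every run length one, so the encoding expands, matching the observation above.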

    t_mse    comp.       mse       SNR_f    PSNR
    36        20.23:1     11.116   18.332   37.672
    64        29.82:1     17.939   16.281   35.692
    100       42.72:1     25.813   14.520   34.014
    144       60.80:1     36.058   12.977   32.563
    169       85.30:1     48.355   11.611   31.289
    256      118.23:1     61.501   10.480   30.246
    324      160.06:1     75.232    9.527   29.372
    400      211.97:1     89.530    8.696   28.617
    900      728.85:1    152.33     5.949   26.305

Table 8: Coding results for an 8-bit 256² × 109 MRI of a human head with varying octree partitioning threshold values t_mse. Range block size varied dynamically from 16³ to 2³. A domain spacing of Δx,y,z = 4 was used along each axis.

Figure 3 demonstrates the fidelity of the fractal volume compression on a slice of the MRI dataset. Even at the extremely high 729:1 rate, the skin and bone edges are reproduced, but the textured detail is obscured.

6.4 CT Engine Data

Table 12 contains results from an 8-bit 256² × 110 CT scan of an engine block. The engine contains many sharp edges, on which fractal volume compression performs particularly well. A side-by-side comparison (Figure 4) shows the compressed version to be indistinguishable from the original.

Figure 3: Slice #54 of the MRI head volume dataset: Original (upper left), 20:1 (upper center), 25:1 (upper right), 30:1 (lower left), 43:1 (lower center), 729:1 (lower right).


    block size   comp.      SNR_f (dB)   PSNR (dB)
    2³            8.59:1    21.31        40.60
    2²×3         11.09:1    20.24        39.53
    3²×2         15.35:1    18.99        38.30
    4³           12.14:1    20.98        40.26
    4²×6         16.29:1    19.08        38.39
    8³           13.59:1    20.72        39.99
    16³          13.91:1    20.47        39.76
    32³          13.96:1    20.00        39.30

Table 9: Comparison of compression rate and fidelity versus range block size and hierarchy depth for the MRI head dataset, using an mse octree partitioning threshold t_mse = 18 and domain spacing Δx,y,z = 4 along each axis.

    partition     total         local search   admissible global
    range size    range codes   codes found    domain blocks
    16³              1,792          1,109         47,703
    8³               4,302            144         42,992
    4³              26,724          1,037         36,445
    2³             104,521         75,606         29,218

    Total codes transmitted = 137,339.

Table 10: Coding statistics for an 8-bit 256² × 109 MRI of a human head at various levels in the octree, using a partitioning threshold value t_mse = 36 and domain spacing Δx,y,z = 4 along each axis.

    partition     total         local search   admissible global
    range size    range codes   codes found    domain blocks
    16³              1,792          1,443         47,703
    8³                 744             96         42,992
    4³                 879            356         36,445
    2³                 274            273         29,218

    Total codes transmitted = 3,689.

Table 11: Coding statistics for an 8-bit 256² × 109 MRI of a human head at various levels in the octree, using a partitioning threshold value t_mse = 900 and domain spacing Δx,y,z = 4 along each axis.

    direct storage                 7,208,960 bytes
    RLE compressed                 5,413,292 bytes
    FVC compressed                   216,460 bytes
    compression time               552 s
    compression rate               33.3:1
    compression rate (over RLE)    25.0:1
    mse                            4.17
    PSNR                           41.93
    SNR_ms                         29.10
    SNR_f                          28.13

Table 12: Results from compressing an 8-bit 256² × 110 CT scan of an engine block. 8³-voxel range blocks (octree partitioned down to 2³ voxels) were used with a partitioning threshold of t_mse = 18. A domain spacing of Δx,y,z = 4 was used along all three axes.

Figure 4: Original (left) and compressed (right) renderings of the engine block CT dataset.

6.5 Slicing: Comparison with Fractal Image Compression

Volumetric encoding performs better than slice-by-slice encoding and also offers other benefits, such as direct rendering from the compressed version. A fractal image coder was used to compress each cross section of some of the datasets.

Independent compression of the individual slices of the 8-bit CT head resulted in a volume compressed by 18.06:1 with a PSNR of 39.21 dB, but Table 7 shows that similar fidelity (39.75 dB) resulted from fractal volume compression at a better rate of 22.27:1. Slice-by-slice compression of the 8-bit MRI resulted in a 30.74:1 rate at a PSNR of 33.717 dB, whereas Table 8 shows the volumetric version compressed at better fidelity (34.014 dB) at a higher ratio of 42.72:1. As the compression rate increases, the benefits of volumetric compression of volumetric data appear to grow.


6.6 Dicing: Integration of Macro-Blocks

Fractal volume compression supports the rendering of a volume directly from the compressed version by first dicing the volume into macro-blocks [32] (groups of blocks), and then compressing each macro-block independently. Each macro-block may then be decompressed independently. Hence large datasets may be rendered on workstations lacking sufficient memory to contain the entire dataset, and only visible sections of the volume need be decompressed.

The 8-bit CT head was also diced into 32³-voxel macro-blocks, which were compressed individually, to measure any loss of fidelity due to the limited domain pools. Since the macro-blocks were significantly smaller than the entire volume, a tighter domain spacing was allowed, which resulted in an impressive compressed-volume fidelity PSNR of 41.28 dB (SNR_ms of 27.50 dB). The compression rate for such a volume remains under investigation, but will surely beat the 11:1 ratio in Table 7.
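A minimal sketch of the dicing step (hypothetical interface; the text does not specify how edge blocks are handled, so here they are simply clipped to the volume bounds):

```python
def dice(dims, mb=32):
    """Yield (x0, y0, z0, xs, ys, zs): the origin and clipped size of
    each mb-cubed macro-block covering a volume of size dims = (W, H, D)."""
    W, H, D = dims
    for z in range(0, D, mb):
        for y in range(0, H, mb):
            for x in range(0, W, mb):
                yield (x, y, z,
                       min(mb, W - x), min(mb, H - y), min(mb, D - z))
```

Restricting each range block's domain search to the macro-block returned here is what allows a macro-block to be decompressed with no data from outside itself.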

6.7 Comparison with Previous Results

The CT head dataset results compare with other volume compression algorithms based on vector quantization and the discrete cosine transform. The VQ algorithm was tested on an 8-bit 128³ version of the dataset [24], whereas the DCT algorithm was tested on a 12-bit 256³ version [32]. As the resolution of the original dataset is 256² × 113, both results are guilty of some form of interpolation or padding, and such added redundancy can inflate the compression ratio by as much as a factor of two.

Using voxel-spanning blocks, the VQ algorithm was capable of compressing the 128³ volume by 40:1 at an SNR_f of 17.6 dB [24]. At about 40:1, fractal volume compression yields a measurably better volume of 21.08 dB.

The performance of this fractal volume compression implementation falls short of the mark set by the DCT volume compression algorithm [32]. Some have produced fractal image compression algorithms that outperform the DCT [4], and one would expect similar performance from fractal volume compression once fractal techniques receive the same level of attention and optimization as the DCT currently has.

7 Conclusion

Fractal image compression extends simply and directly to three dimensions. As in the image case, the technique still matches domain blocks to range blocks, but the additional dimension produces many more isometries, yielding a richer transformed-domain pool, a higher compression rate and better fidelity.

The increased search time caused by the additional dimension is overcome by several sophisticated classification schemes. In particular, the nearest-neighbor method [27], which is the only one to reduce the time complexity of the search, performed best for fractal volume compression.

The performance of fractal volume compression beats VQ, and rivals DCT volume compression. There is evidence that fractal image compression can beat DCT image compression [4], and there are many common cases where fractal compression is the preferred technique.

The fractal volume compression system used for these results is available at: ftp.eecs.wsu.edu:/pub/hart/fvc.tar.gz

7.1 Applications

Contrary to its name, fractal image compression performs better on sharp edges and worse in textured areas. Hence, fractal volume compression performs better on clearly delineated regions, such as bone and skin, but worse on tissue or finely detailed areas. The DCT volume compression method blurs both edges and fine detail [32]. Hence, fractal volume compression appears well suited for the compression of classified, filtered datasets, and also of synthesized datasets.

Fractal compression techniques tend to perform better at high compression rates compared to other methods. For medical volume compression applications, any data loss due to compression is likely unacceptable. Hence, volume compression in general may find its most suitable application in medicine for the high-rate, low-fidelity indexing, previewing and presentation of volume datasets, for which fractal compression techniques are the best choice.

7.2 Future Research

Fast rendering algorithms based on the fractal representation appear attainable. As [23] preceded [24], further research on fractal volume compression will likely produce a fast rendering algorithm.

Fractal compression research is much newer than other compression techniques such as the DCT or VQ, and the technique is still not well understood nor fully developed. As new enhancements appear for fractal image compression, they will likely improve fractal volume compression as well. The Bath fractal transform [19] operates without a domain search, and could be easily extended to 3-D and applied to volumetric data. This technique compensates for the loss of block diversity by augmenting the affine block transformation with linear, quadratic and even cubic functions.

Compression of volumes sampled on an irregular grid would require a more sophisticated partitioning and block-transformation scheme.

7.3 Acknowledgments

Several volume visualization systems were used during the development of fractal volume compression [1, 17, 29]. Data for the CT head was obtained from the University of North Carolina at Chapel Hill volume rendering test dataset.

This research is part of the recurrent modeling project, which is supported in part by a gift from Intel Corp. The research was performed using the facilities of the Imaging Research Laboratory, which is supported in part under grants #CDA-9121675 and #CDA-9422044. The second author is supported in part by NSF Research Initiation Award #CCR-9309210. The third author is supported in part by the NSF under grants #IRI-9209212 and #IRI-9506414.


References

[1] Ricardo S. Avila, Lisa M. Sobierajski, and Arie E. Kaufman. Towards a comprehensive volume visualization system. In Proc. of IEEE Visualization '92, pages 13-20, Oct. 1992.

[2] Michael F. Barnsley and Stephen G. Demko. Iterated function systems and the global construction of fractals. Proceedings of the Royal Society A, 399:243-275, 1985.

[3] Michael F. Barnsley, John H. Elton, and D. P. Hardin. Recurrent iterated function systems. Constructive Approximation, 5:3-31, 1989.

[4] Kai Uwe Barthel, Thomas Voye, and Peter Noll. Improved fractal image coding. In Proc. of Picture Coding Symposium, March 1993.

[5] J. M. Beaumont. Image data compression using fractal techniques. BT Technology Journal, 9(4), October 1991.

[6] Wayne O. Cochran, John C. Hart, and Patrick J. Flynn. Principal component analysis for fractal volume compression. In Proc. of Western Computer Graphics Symposium, pages 9-18, March 1994.

[7] Robert A. Drebin, Loren Carpenter, and Pat Hanrahan. Volume rendering. Computer Graphics, 22(4):65-74, Aug. 1988.

[8] Yuval Fisher. Fractal image compression. In Przemyslaw Prusinkiewicz, editor, Fractals: From Folk Art to Hyperreality. SIGGRAPH '92 Course #12 Notes, 1992. To appear: Data Compression, R. Storer (ed.), Kluwer.

[9] Jerome H. Friedman, Jon Louis Bentley, and Raphael Ari Finkel. An algorithm for finding best matches in logarithmic expected time. ACM Transactions on Mathematical Software, 3(3), September 1977.

[10] Alan Gersho and Robert M. Gray. Vector Quantization and Signal Compression. Kluwer, Boston, 1992.

[11] J. Hutchinson. Fractals and self-similarity. Indiana University Mathematics Journal, 30(5):713-747, 1981.

[12] Geir Egil Øien. L2-Optimal Attractor Image Coding with Fast Decoder Convergence. PhD thesis, Norwegian Institute of Technology, 1993.

[13] Arnaud E. Jacquin. Image coding based on a fractal theory of iterated contractive image transformations. IEEE Transactions on Image Processing, 1(1):18-30, Jan. 1992.

[14] I. T. Jolliffe. Principal Component Analysis. Springer-Verlag, New York, 1986.

[15] James T. Kajiya and Timothy L. Kay. Rendering fur with three dimensional textures. Computer Graphics, 23(3):271-280, July 1989.

[16] Arie Kaufman. Efficient algorithms for 3D scan conversion of parametric curves, surfaces, and volumes. Computer Graphics, 21(4):171-179, July 1987.

[17] Philippe Lacroute and Marc Levoy. Fast volume rendering using a shear-warp factorization of the viewing transformation. In Computer Graphics, Annual Conference Series, pages 451-458, July 1994. Proc. of SIGGRAPH '94.

[18] Marc Levoy. Display of surfaces from volume data. IEEE Computer Graphics and Applications, 8(3):29-37, 1988.

[19] Donald M. Monro and Frank Dudbridge. Fractal approximation of image blocks. In Proc. of ICASSP, volume 3, pages 485-488, 1992.

[20] Donald M. Monro and Frank Dudbridge. Rendering algorithms for deterministic fractals. IEEE Computer Graphics and Applications, 15(1):32-41, Jan. 1995.

[21] Donald F. Morrison. Multivariate Statistical Methods. McGraw-Hill Book Company, New York, 1976.

[22] Takeshi Naemura and Hiroshi Harashima. Fractal encoding of a multi-view 3-D image. In Bob Werner, editor, Proceedings ICIP-94, volume 1. IEEE Computer Society Press, November 1994.

[23] Paul Ning and Lambertus Hesselink. Vector quantization for volume rendering. In Proc. of 1992 Workshop on Volume Visualization, pages 69-74. ACM Press, 1992.

[24] Paul Ning and Lambertus Hesselink. Fast volume rendering of compressed data. In Gregory M. Nielson and Dan Bergeron, editors, Proc. of Visualization '93, pages 11-18. IEEE Computer Society Press, 1993.

[25] William B. Pennebaker and Joan L. Mitchell. JPEG Still Image Data Compression Standard. Van Nostrand Reinhold, New York, 1993.

[26] B. Ramamurthi and A. Gersho. Classified vector quantization of images. IEEE Transactions on Communications, 34, Nov. 1986.

[27] Dietmar Saupe. Accelerated fractal image compression by multi-dimensional nearest neighbor search. In J. A. Storer and M. Cohn, editors, Proceedings DCC'95 Data Compression Conference. IEEE Computer Society Press, March 1995.

[28] Dietmar Saupe and Raouf Hamzaoui. A guided tour of the fractal image compression literature. In John C. Hart, editor, SIGGRAPH '94 Course #13 Notes: New Directions for Fractal Models in Computer Graphics, pages 5-1 to 5-21. ACM SIGGRAPH, 1994.

[29] Barton T. Stander and John C. Hart. A Lipschitz method for accelerated volume rendering. In Proc. of Volume Visualization Symposium '94, Oct. 1994.

[30] Pravin M. Vaidya. An O(n log n) algorithm for the all-nearest-neighbors problem. Discrete & Computational Geometry, 4:101-115, 1989.

[31] Greg Vines. Signal Modeling with Iterated Function Systems. PhD thesis, Georgia Institute of Technology, May 1993.

[32] Boon-Lock Yeo and Bede Liu. Volume rendering of DCT-based compressed 3D scalar data. IEEE Transactions on Visualization and Computer Graphics, 1(1), March 1995.
