
Interactive exploration of distributed 3D databases over the Internet

Jarek Rossignac
GVU Center, Georgia Institute of Technology

Interactive 3D visualization and Internet-based access to information are already common. Their combination is—or will soon be—the primary vehicle for accessing remote databases in fundamental areas of manufacturing, architecture, petroleum, urban planning, tourism, defense, medicine, electronic commerce, and entertainment. Unfortunately, whether based on precise 3D geometry or involving combinations of shapes and images, the complexity of 3D graphic models of airplanes, cities, or virtual stores significantly exceeds the limits of what can be quickly downloaded over popular connections and what can be rendered on personal workstations during interactive exploration. We review recent progress in the compression and simplification of 3D models and in the progressive transmission of these models for interactive graphic exploration, which may combine traditional 3D graphics with image-based rendering. We propose an architecture for a 3D server capable of supporting a large number of independent client-users interactively accessing various subsets of a possibly distributed database of complex 3D models.

Introduction

3D models of terrains, buildings, factories, airplanes, consumer products, human organs, and simulation results are at the heart of numerous applications that drive major segments of the manufacturing, architectural, geoscience, military, medical, and entertainment industries (see also [Rossignac94, Rossignac97]). 3D models will also enhance future computer-human interfaces and will benefit important areas of electronic commerce. They must be easily accessible and usable by all.

Interactive 3D viewing makes users more effective

Interactive 3D graphics helps viewers understand 3D models of products, buildings, biological structures, or simulation results. 3D views combine in a natural way the details being inspected and the larger perspective of their surroundings. Direct and intuitive control of the 3D viewing parameters leverages our natural ability to quickly identify areas of interest and to focus on them for closer inspection. To fully exploit these natural human capabilities, interactive graphic systems must provide users with immediate graphic feedback to their actions, even when they are interacting with highly complex, remotely located, and distributed 3D models.

Everyone should have access to 3D databases

The following examples illustrate the importance of our endeavor. Collaborative design in the manufacturing industry and the reuse of on-line catalogs of part models are made possible because a few CAD experts equipped with expensive workstations and high-bandwidth connections have instant network access to large shared 3D databases. Numerous other—but less fortunate—employees or sub-contractors would like to exploit these 3D databases for their bidding, documentation, manufacturing, or marketing activities, but cannot, because downloading these models through limited-bandwidth connections is practically impossible. Remotely located radiologists and surgeons should be able to consult each other and exchange 3D models of a patient's organs instantly. Detailed urban models should be accessible to city planners, architects, investors, regulatory agencies, builders, and security agencies for a variety of purposes. Internet shoppers should be able to navigate through 3D models of virtual shopping malls and easily inspect the shape and appearance of selected products.

The complexity of 3D models continues to increase

The internet and intranets provide a convenient medium for posting such 3D databases on-line for general or restricted access. However, most users who need to access these databases are equipped with personal computers and standard telephone connections. They do not have enough storage to locally cache all the models they wish to interact with, and they lack automatic consistency-maintenance procedures for keeping such local copies up to date. Consequently, most PC users must download the 3D models each time they wish to inspect or use them. The complexity of 3D models in CAD, architectural, GIS, and medical applications has been rising rapidly, fueled by improvements in interactive 3D design tools, in data acquisition technologies, and in the storage, processing power, and graphics capabilities of personal workstations. Early designers and scientists deliberately limited the accuracy of their data sets to what could be stored and manipulated on their workstations.

Today, the more complex models used by the automotive, aerospace, construction, petroleum, and architecture industries contain millions or even hundreds of millions of triangles. Their complexity will continue to increase rapidly in response to the need for higher accuracy during analysis, planning, and inspection.

Standard 3D representations are useless

Although many representations have been proposed for 3D models [Rossignac94b], polyhedra (or more precisely triangular meshes) are the standard for exchanging and viewing 3D data sets. This trend is reinforced by the wide availability of 3D graphic libraries (OpenGL [Neider93], VRML [Carey97]) and other 3D exchange file formats, and of 3D adapters for personal computers that have been optimized for triangles. Graphic subsystems can convert polygons and curved surfaces into an equivalent (or approximating) set of non-overlapping triangles, which may be rendered efficiently using hardware-assisted rasterizers [Rockwood89, Neider93]. But to avoid the cost of this runtime conversion, most applications precompute and store the triangle meshes. Therefore, triangle count is a suitable measure of a model's complexity. Representations of triangulated meshes usually store the coordinates of each vertex as 3 floating-point numbers per vertex and the incidence relation between triangles and vertices as 3 integer vertex references per triangle. Such simple representations require 18 bytes per triangle. Most representations also include photometric information (vertex normals, colors, and texture coordinates), which is important for photo-realistic rendering. Furthermore, adjacency relations, which help accelerate mesh traversal algorithms, may also be included. The total storage requirement often reaches a hundred bytes per triangle. Downloading 100 million triangles over a 33 Kbit/second modem would take a month. Furthermore, displaying such a model on the PC client would take hours. This brute-force scheme clearly is not suited for internet access to such complex 3D models and therefore makes these models useless for many important applications.

We must reduce the total response time

Our goal is to minimize the "total response time", i.e., the delay between a user's action (for example, a mouse movement that changes the model or the viewing parameters) and the corresponding graphic feedback (the updated image). We propose to focus on three common situations:

• The user requests access to a model and is willing to wait a few seconds for an approximating first image, which must be sufficiently accurate to enable the user to decide on the next move.



• The user is navigating through the model using the mouse or other input devices and expects a total response time of about 0.1 second to support the closed-loop graphic feedback necessary for direct manipulation. The fidelity of the images during navigation is less crucial, but must allow the user to clearly identify the details that are used as reference for navigation and to assess their relative position and orientation.



• The user has stopped navigating and is waiting for the system to progressively improve the fidelity of the image to a point where decisions for further actions may be safely reached.

Progressive download

We consider a solution based on a progressive download. The client may already have in local storage a subset of the entire 3D model acquired through prior transmissions. That subset is initially empty and its growth may be limited by the available local memory. When the subset is not sufficient to produce a new image on the client with the desired accuracy, the server would identify the missing information, compress it, and send it. The client would decode it, add it to its growing representation, and render it. This missing information may combine refinements of a multi-resolution geometric representation and also images used as substitutes for small or distant geometry [Mann97, Sillion97].
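To make the round trip concrete, here is a minimal sketch of one iteration of that loop seen from the client. The names (local_model, fetch_upgrade, render) and the (feature, level, payload) message format are illustrative placeholders, not part of any existing system.

```python
def update_client_model(view, local_model, fetch_upgrade, render):
    """One round of the progressive download, seen from the client.

    local_model maps a feature id to the resolution level held locally
    (initially empty, growth bounded by local memory). fetch_upgrade stands in
    for the server round-trip: it receives the view and the client's resolution
    log and returns decoded refinements as (feature_id, level, payload) tuples.
    render draws the image from whatever subset is available locally.
    """
    upgrades = fetch_upgrade(view, dict(local_model))   # server decides what is missing
    for feature_id, level, payload in upgrades:
        # payload would carry the geometric refinement or substitute image;
        # here we only record that the feature is now available at 'level'
        local_model[feature_id] = max(local_model.get(feature_id, 0), level)
    render(view, local_model)                           # produce the new image locally
```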

Factors influencing the total response time

This total response time in a remote client/server architecture combines delays due to:

1. the sampling rate of the input device,
2. the processing necessary to compute the new scene parameters,
3. the transfer of these parameters to the server,
4. the extraction and compression by the server of the incremental information needed by the client to produce the new image,
5. the transfer of that information over network or phone lines,
6. the decoding of that information by the client,
7. the updating of the client's representation of the scene,
8. the traversal by the client of the updated representation,
9. the generation of the appropriate graphic instructions for the rendering subsystem,
10. and finally the geometric transformations, lighting calculations, clipping, and scan-conversion tasks performed by the graphic subsystem.

The total response time is thus constrained by:

1. the bandwidth of the connection,
2. the performance of the graphic subsystem,
3. the allocated server power,
4. the processing power and memory of the client,
5. and the image accuracy imposed by the application or the current situation.
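A back-of-the-envelope calculation, using the per-triangle byte count and modem rate quoted earlier, shows why the connection bandwidth (constraint 1) dominates for a brute-force download:

```python
# Numbers quoted above: ~100 bytes per triangle once photometry and adjacency
# are stored, a 100-million-triangle model, and a 33 kbit/s telephone connection.
triangles = 100_000_000
bytes_per_triangle = 100
modem_bytes_per_second = 33_000 / 8

seconds = triangles * bytes_per_triangle / modem_bytes_per_second
print(f"{seconds / 86_400:.0f} days")   # about 28 days, i.e. roughly a month
```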

A new software architecture is needed to support acceptable compromises between the total response time and the image fidelity for the three situations described above, given the anticipated client/server hardware and bandwidth configurations.

Combining compression, simplification, and image-based rendering

Because the exact representation of the visible geometry is not always required to produce an image of sufficient accuracy for navigation or inspection, geometric simplification and lossy compression techniques may be invoked to further reduce the transmission and rendering time. A significant challenge of current research is to develop and validate a unifying theory and an efficient algorithmic technique that combines geometric compression, adaptive simplification, and progressive transmission.

1. Geometric compression reduces the number of bits used to encode a geometric model.
2. Simplification reduces the number of triangles in an object's representation.
3. Adaptive simplification produces multi-resolution models where the tolerances applied when simplifying individual features of a model depend on the viewing, and possibly other, parameters.
4. Progressive transmission schemes make it possible to first send a coarse model and then send information sufficient to refine the representation of specific features of the model.

This unified technique must also be combined with complementary techniques for efficient visibility calculations [Durand97, Zhang97] and for image-based rendering [Darsa97, Mark97].

Fundamental research issues

Progress towards more effective schemes that combine simplification, compression, and progressive refinement requires that we address some fundamental issues.

How to formulate graphic tolerances

First, we need a precise formulation of the tolerance that would ensure the desired visual accuracy for a given view, together with an efficient and reliable error-bound estimation procedure. This formulation is fundamental for further research on a unified simplification and compression technique. Such a formulation is not available today, and reported attempts focus either on view-independent measures or on image fidelity. Both approaches lead to expensive error measures or to largely sub-optimal solutions which do not take advantage of the acceptable model tolerance. Given that our strategy is focused on reducing the number of bits that must be transferred at each frame, we must understand how to measure or estimate the bit-complexity of a toleranced geometric shape. More precisely: given an original triangle set, O, and some tolerance measure, T, what is the minimum number of bits necessary for representing the simplest triangulated model that does not deviate from O by more than T? Clearly, that number depends on the modeling primitives (shapes and operations) that can be used to encode the geometry. The simplification and lossy compression techniques discussed below alter the shape of the object and result in images that differ from those produced using the original model. We suggest that measuring the error in terms of the number of pixels that have a significantly different color in both images, or in terms of the magnitude of the color difference, is not appropriate for most 3D graphics applications (except perhaps for styling, which would not significantly benefit from the proposed research, because it is typically performed using expensive rendering systems). We base our conclusion on the following observations:

1. the user does not have the capability to recognize absolute colors with precision;
2. human ability to derive shape from shading and from highlights is limited to general curvature signs and magnitudes, and changing these slightly will rarely confuse the user as to the nature of the objects being displayed;
3. the fidelity of rendering systems commonly used in 3D graphics applications varies with view changes and may itself create visual artifacts.

We expect the acceptable shading error limits to be significantly less constraining than the limits on geometric errors, which may impact the size, location, and orientation of the geometric landmarks used by the viewer to make logical decisions during navigation and during model selection, manipulation, and inspection. For example, the interior features of the model of a telephone hand-set should not become visible from the outside in the simplified model; two clearly separated pipes in the original model of a factory should not interpenetrate in the image of the simplified model; the corners of nearby buildings in a walkthrough of a virtual city should not be displayed too far away from their correct position.

Because the graphic accuracy required for distant objects is generally lower than for nearby objects, we suggest formulating the tolerance as the Hausdorff distance between the original and simplified models, both transformed by the three-dimensional perspective transformation that corresponds to the current view. If our assumptions prove correct, one should be able to use the geometrically acceptable approximations instead of the original model, regardless of the corresponding image deviations.
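As a rough illustration of this view-dependent tolerance, the following sketch estimates the Hausdorff distance between point samples of the original and simplified surfaces after both have been mapped through the current 4x4 perspective matrix. It is a brute-force estimate on samples, not an exact surface-to-surface bound.

```python
import numpy as np

def perspective_hausdorff(samples_o, samples_s, proj):
    """Symmetric Hausdorff distance between point samples of the original (samples_o)
    and simplified (samples_s) surfaces, both mapped through the 4x4 perspective
    matrix proj of the current view. Brute force, O(n*m) distances."""
    def transform(p):
        h = np.c_[np.asarray(p, float), np.ones(len(p))] @ proj.T  # homogeneous clip space
        return h[:, :3] / h[:, 3:4]                                # perspective divide

    a, b = transform(samples_o), transform(samples_s)
    d = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(-1))    # all pairwise distances
    # max over each set of the distance to the nearest point of the other set
    return max(d.min(axis=1).max(), d.min(axis=0).max())
```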

Image-based rendering reduces the client's workload

These simplification, compression, and progressive transmission techniques must be combined with efficient hardware-assisted dynamic visibility estimation and with the use of images as impostors for distant "background" shapes [Maciel95, Sillion97, Regan94].

3D simplification

In this section, we review the current state of the art in simplification techniques and propose a unifying view, which leads to increased flexibility in optimizing specific steps of the simplification process and in combining independently developed modules.

Simplifications reduce the triangle count

Simplification produces a 3D impostor, S, for an original triangulated surface O, where S has fewer triangles than O and where the error produced by replacing O with S does not exceed a prescribed tolerance. Better simplification schemes produce impostors with a lower triangle count for the same tolerance. Using impostors instead of the original models may increase graphic performance when rendering distant features that appear small on the screen during camera motions or scene animations [Crow82, Upstill90]. During navigation pauses, the system may start improving the image quality and, if not interrupted by the user, will automatically display refined images [Bergman86].

Error metrics

Most prior attempts at estimating the error that results from using approximations of nominal shapes for graphics were either limited to view-independent geometric deviations [Rossignac93, Kalvin96, Cohen96, Garland97] or to heuristic measures focused on preserving image characteristics, such as the location of silhouettes or highlights [Hoppe97, Luebke97]. For example, Cohen et al. [Cohen96] have used Varshney's tolerance envelopes [Varshney94], while Xia and Varshney [Xia96] have used the sum of the lengths of the collapsed edges as an error measure for an edge-collapse simplification process. The tolerance may be uniform or view-dependent. The Hausdorff distance, H(O,S), defined as max(d(o,S), d(s,O)) for all points o in O and s in S, provides a reasonable definition of the geometric error. The distance d(a,B) between a point a and a set B is defined as min(||a-b||) for all points b in B. A more intuitive definition of the Hausdorff distance is the radius of the largest ball that has its center on one surface and that is disjoint from the other surface.

A variety of simplification techniques

Reviews of simplification techniques may be found in [Blake87, Heckbert94, Rossignac95, Rossignac96, Puppo97, Rossignac97b, Rossignac97c, Heckbert97]. The various simplification schemes segment the vertices or triangles of O into features and simplify each feature independently without exceeding the allowed error bound. The different simplification approaches reported so far vary in the size and in the nature of the features they can handle, in the techniques used for partitioning the boundary of O into features, in the techniques used to simplify the individual features, and in the error estimates they use. A few approaches compute a new simplified triangle mesh directly; others simplify the original mesh progressively. Turk treated an entire surface as a single feature, uniformly distributing new vertices on the surface and removing old ones while preserving the original topology [Turk92]. Others compute a tolerance zone around the original surface and fit a simple mesh inside that zone [Varshney94, Mitchell95, Andujar96, Cohen96]. Although these approaches offer many advantages, they are expensive because they explore a large search space when creating a new mesh. He et al. [He96] compute an isosurface of a volumetric density model, which they derive by first rasterizing the original surface into a volume and then applying a low-pass filter to reduce details.
Kalvin and Taylor [Kalvin96] compute "superfaces", which are features made of nearly coplanar clusters of triangles, and replace each feature by a star-shaped triangulation with a single vertex at the center of the feature. They also simplify the boundaries between features by merging nearly collinear edges. Schroeder, Zarge, and Lorensen [Schroeder92] restrict features to be the sets of triangles incident upon a single vertex and remove that vertex, retriangulating the hole. The vertex-decimation process is applied to all features whose simplification does not result in excessive error, and the process is repeated, allowing the triangles of a simplified feature to become part of other features and be further simplified.

Vertex clustering simplification

The geometric simplification introduced by Rossignac and Borrel [Rossignac93] operates on arbitrary sets of triangles. The Hausdorff error introduced by the simplification is bounded by a user-controlled accuracy factor. This approach first groups vertices into clusters by coordinate quantization, which is very efficient and groups vertices that fall in the same cell of a regular space subdivision (see below, far left). The vertices of each cluster are coalesced into a single attractor positioned somewhere in the cell (center left). Zero-area triangles, easily identified because they have at least two vertices in the same cluster (center right), are either removed or replaced by an isolated point or a dangling edge. These lower-dimensional simplices and the remaining triangles are adjusted by moving their vertices to the corresponding attractors (far right).
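A minimal sketch of this clustering step follows; it uses the cluster centroid as the attractor, which is only one of the placement options mentioned above, and it omits the handling of isolated points and dangling edges.

```python
import numpy as np

def vertex_cluster_simplify(vertices, triangles, cell_size):
    """Vertex clustering in the spirit of [Rossignac93]: quantize vertices onto a
    regular grid, coalesce the vertices of each occupied cell into one attractor
    (here, the cell's centroid), remap the triangles, and drop those that collapse."""
    vertices = np.asarray(vertices, dtype=float)
    cells = np.floor(vertices / cell_size).astype(int)          # grid cell of each vertex
    cell_ids = {c: i for i, c in enumerate(sorted({tuple(c) for c in cells}))}
    cluster_of = np.array([cell_ids[tuple(c)] for c in cells])

    attractors = np.zeros((len(cell_ids), 3))                   # centroid of each cluster
    counts = np.zeros(len(cell_ids))
    for v, c in zip(vertices, cluster_of):
        attractors[c] += v
        counts[c] += 1
    attractors /= counts[:, None]

    kept = []                                                   # drop zero-area triangles
    for a, b, c in triangles:
        ta, tb, tc = cluster_of[a], cluster_of[b], cluster_of[c]
        if ta != tb and tb != tc and tc != ta:
            kept.append((ta, tb, tc))
    return attractors, kept
```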

This approach may change the genus, but preserves the connectivity of the model. It is particularly effective for removing holes and for simplifying thin objects into plates and elongated objects into line segments. It does not impose any topological restrictions on the input data, which may be an arbitrary collection of triangles, edges, and points. Because of its simplicity, robustness, and computational efficiency, this scheme is used in several commercial products, including IBM's 3D Interaction Accelerator [3DIX96] and SGI's OpenGL Optimizer [Optimizer97], and has been widely implemented in academia. It has inspired some excellent recent work in this field [Low97, Garland97, Popovic97, Luebke97]. Low and Tan [Low97] have recently improved this method with the concept of a floating cell, allowing clusters to be formed with vertices of neighboring cells, and thus making Rossignac and Borrel's method less susceptible to artifacts resulting from a particular alignment of the cell boundaries.

Edge collapsing simplification

The vertex-clustering simplification of [Rossignac93] yields sub-optimal results, because the distance between the original vertices and their attractors is limited by the cell size, which prevents the removal of vertices in triangulations of nearly planar surfaces. Hoppe [Hoppe96] and Ronfard and Rossignac [Ronfard96] have expanded upon the pioneering work of Hoppe et al. [Hoppe93] and have independently devised simplification techniques that collapse edges one by one in the order that minimizes the resulting error. (Both solutions use a priority queue to maintain a list of edges sorted by increasing error estimate.) Each edge collapse operation eliminates one edge, one vertex, and two triangles, and modifies neighboring triangles. In fact, it is equivalent to a preferred case of Schroeder's vertex decimation step [Schroeder97]. The result of such a transformation is shown below, where the horizontal edge marked by an arrow was collapsed by aligning its right vertex with its left vertex. The result is equivalent to coalescing a cluster of two vertices, using the left vertex as the attractor. The same topology, although possibly a different geometry, would be obtained by switching the roles of the two vertices. Thus each edge may be collapsed in two different ways. The triangles that degenerated during this transformation are hatched in the figure, which also shows all the triangles affected by this move.
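The greedy loop shared by these methods can be sketched as follows. The error function is a stand-in for whatever estimate is used (distances to the cluster's supporting planes in [Ronfard96], a quadric in [Garland97]); retriangulation and the re-queuing of affected edges are only indicated by comments.

```python
import heapq

def greedy_edge_collapse(vertex_ids, edges, collapse_error, max_error):
    """Collapse the cheapest edge first until the error budget is exhausted.
    collapse_error(a, b) estimates, against the ORIGINAL model, the error of
    coalescing cluster b into attractor a; edges is a list of (a, b) vertex pairs."""
    parent = {v: v for v in vertex_ids}            # union-find over coalesced clusters

    def find(v):                                   # current attractor of v's cluster
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    heap = [(collapse_error(a, b), a, b) for a, b in edges]
    heapq.heapify(heap)
    while heap:
        err, a, b = heapq.heappop(heap)
        a, b = find(a), find(b)
        if a == b:
            continue                               # edge already collapsed away
        fresh = collapse_error(a, b)               # re-check: priorities may be stale
        if fresh != err:
            heapq.heappush(heap, (fresh, a, b))
            continue
        if fresh > max_error:
            break                                  # cheapest collapse exceeds the tolerance
        parent[b] = a                              # coalesce cluster b into attractor a
        # a full implementation would retriangulate the affected region here and
        # update the queued errors of the incident edges
    return parent
```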

Ronfard and Rossignac first evaluate the error associated with each oriented edge collapse and maintain a priority queue so that the collapse operation that least affects the error is always available. They perform such simplification steps until the error tolerance is exceeded or until the desired triangle count is reached. At each simplification step, they collapse the best edge candidate, update the topology of the model, update the error estimates associated with the incident edges, and update the priority queue. The error resulting from a simplification step must be computed with respect to the original model, not the result of prior simplifications! Ronfard and Rossignac have introduced a very efficient error estimation technique that yields a guaranteed and tight error bound. A sequence of their simplification steps identifies clusters of vertices and coalesces each cluster into a single attractor. With each attractor is associated a list of all the planes that support the triangles incident upon the vertices of the cluster in the original model. This association is easily implemented by constructing the lists for the original vertices and by merging them recursively as the edge-collapsing operations coalesce individual vertices or clusters. An additional "tangent" plane is associated with each sharp edge or vertex. The absolute error of a simplification step that brings a vertex or attractor v to a new position v' is bounded by 1.73 times the maximum of the distances between v' and each plane associated with v.
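In code, that bound reads as follows; the quadric variant of [Garland97], discussed next, is included for comparison. Planes are assumed to be given as unit-normalized (a, b, c, d) coefficients.

```python
import numpy as np

def collapse_error_bound(v_new, planes):
    """Bound of [Ronfard96]: 1.73 times the largest distance from the new attractor
    position to the planes collected from the cluster's original triangles."""
    v = np.append(np.asarray(v_new, float), 1.0)       # homogeneous point (x, y, z, 1)
    return 1.73 * np.abs(planes @ v).max()

def quadric_of(planes):
    """Variant of [Garland97]: one 4x4 matrix per cluster; merging clusters adds matrices."""
    return sum(np.outer(p, p) for p in planes)

def quadric_error(v_new, Q):
    """Sum of squared plane distances, v^T Q v (not a guaranteed bound)."""
    v = np.append(np.asarray(v_new, float), 1.0)
    return v @ Q @ v
```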

The cost of this approach is dominated by computing distances between candidate points and cluster planes. An optional step finds locally optimal locations of the attractors in the simplified model by minimizing their distance to all the planes they are associated with. This early work has recently inspired Garland and Heckbert [Garland97] to develop a variation where, instead of using the maximum distance to all the planes of each cluster, the sum of the squares of the distances to all the planes is used. The drawback is of course that Garland and Heckbert's estimate is not a proper error bound. However, the use of an L2 norm makes it possible to replace an arbitrarily long list of planes with a single 4x4 matrix. Furthermore, coalescing two clusters requires simply adding their matrices.

A unifying view of many simplification techniques

Rossignac and Borrel [Rossignac93] have enabled more general vertex clustering (discussed earlier). Notice that Rossignac and Borrel's vertex-clustering approach merges into a single cluster vertices that may, but need not, be connected, thus making it possible to simplify the topology of a shape. Recently, Popovic and Hoppe [Popovic97] and Garland and Heckbert [Garland97] have independently expanded the edge-collapse approaches of Ronfard and Rossignac [Ronfard96] and of Hoppe [Hoppe96] by allowing disconnected vertex clusters to be merged, as in [Rossignac93]. We propose a new unifying view of these various simplification techniques (vertex decimation [Schroeder97], edge or vertex collapsing [Popovic97, Garland97, Ronfard96, Hoppe96], and face merging [Kalvin96]). They all perform the following four tasks, although in different sequences:

1. Identify vertex clusters.
2. Coalesce each cluster into one attractor vertex.
3. Compute the position of these attractors.
4. Remove triangles with more than one vertex in the same cluster, unless this would alter connectivity.

Indeed, connected components of edge collapses define vertex clusters, independently of the order in which the edge collapses were performed. Similarly, superfaces treat their internal vertices as a single cluster. The difference between the various approaches lies in the strategies for identifying clusters and for computing the position of attractors, which may coincide with a vertex of the original model or may be optimally positioned at a new location. Our unifying view suggests the possibility of decoupling the research and of combining independently developed techniques for cluster identification, for attractor computation, and for degenerate-triangle removal into a single simplification solution.

Hierarchical multi-resolution models

Clustering the vertices of a model according to a given non-uniform tolerance produces a unique simplification, well suited for a particular view. When the view is changed, this simplification may no longer meet the desired accuracy. A new simplification that meets the new tolerance may be obtained in several ways.

View-independent discrete levels-of-detail of the features of a model

Simplifying the original model as described above according to the new tolerance requirements may be computationally too expensive for runtime applications. Storing precomputed simplifications of the entire model for a dense set of viewing conditions involves storing prohibitively large amounts of redundant information. Therefore, most commercial implementations of simplification techniques subdivide the model into isolated objects (for example, the components of an assembly or the buildings of a city) and precompute a series of static approximations for each object [3DIX96]. Each approximation is obtained by applying a simplification process with a higher view-independent tolerance, and thus has a significantly lower triangle count than the previous one. At runtime, the rendering subsystem decides which approximation to use for each object, based on an estimate of the visible error associated with using a particular approximation [Funkhouser93]. This process works well for complex scenes composed of relatively small objects. However, it is not effective for relatively large and complex objects, because if a subset of a large object is close to the viewpoint, the entire object will have to be displayed at the highest resolution, even though the distant portions could have been displayed much faster using a much lower resolution.

Extracting view-dependent simplifications from a global multi-resolution model

Several recently proposed approaches [Xia96, Klein96, Puppo96, Hoppe97, Luebke97] precompute a single multi-resolution representation and use it to accelerate the derivation of a new simplification from the previous one. The efficiency of these adaptive techniques is based on the assumption that the tolerance varies slowly between frames. These approaches are based on a precomputed cluster-merging tree. With each node N of the tree is associated a simplification operation that produces a single cluster C(N) by merging the clusters of the children of N. Furthermore, with each node is also associated an estimator of the error resulting from this simplification. If the error is acceptable, the corresponding cluster-merging operation is executed and the degenerate triangles are removed from an active display list. If the error associated with a previously merged cluster C(N) exceeds the current tolerance, the cluster is broken back into the sub-clusters represented by the children of node N. Breaking the clusters adds new triangles and modifies the geometry of other triangles. With each cluster is also associated an attractor vertex.

A coalesced cluster whose parent is not coalesced is said to be active. The attractors of all active clusters correspond to the vertices of the current simplification. (Note that when simplification is restricted to preserve topology, the cluster hierarchy is a forest of trees. Allowing topological changes, as in [Rossignac93, Popovic97, Garland97], makes it possible to group the forest into a single tree.) Xia and Varshney [Xia96] and Hoppe [Hoppe97] use a single edge collapse as the individual cluster-merging step. Luebke and Erikson [Luebke97] claim more flexibility and, for example, propose to merge Rossignac and Borrel's cell-based clusters [Rossignac93] into an octree. Lindstrom et al. [Lindstrom96] use a similar quadtree-based scheme for an adaptive terrain model. In all four cases, the decision tree imposes a partial dependency order on the cluster-merging operations: it is not possible to merge two arbitrary clusters unless one merges their common ancestor. A similar scheme is also discussed in [Cignoni95, DeFloriani92]. Hoppe requires even more constraining preconditions for splitting a cluster into its two components. When a cluster was coalesced, an edge was collapsed and the two incident triangles T and T' (hatched in the figure below, left) were removed. The four other triangles (T1..T4) adjacent to T and T' before the merger must be present in the simplified model before the cluster can be split back into its two original constituent sub-clusters. This may require expanding other clusters and may create a chain reaction, leading to unnecessary refinements and higher transmission or rendering costs.

[Figure: left, the edge collapse removes triangles T and T' and merges their cluster, with the adjacent triangles T1-T4 shown before and after the merge; right, two thick edges illustrate the sibling-cluster restriction discussed below.]

Xia and Varshney avoid this additional dependency by requiring at construction time that sibling clusters be disconnected (i.e., that there be no edges connecting them). For example, the two thick edges in the figure above (right) cannot be simplified at the same level of Xia and Varshney's hierarchy, while they can in Hoppe's variation. Consequently, the depth of Xia and Varshney's cluster tree is in general greater than the depth of Hoppe's tree for the same model. Both schemes lead to suboptimal simplifications by restricting the set of acceptable cluster arrangements. Computing view-dependent multi-resolution models for the entire scene from a single multi-resolution model has several drawbacks [Luebke97]. Instead, we recommend using discrete levels of detail for all components that are either relatively small or geometrically simple, and using multi-resolution models only for large, complex objects.
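The selection of active clusters from such a precomputed tree can be sketched as follows; the node layout and the view-dependent tolerance callback are illustrative assumptions, and the splitting preconditions discussed above are omitted.

```python
class ClusterNode:
    """One node of a precomputed cluster-merging tree: coalescing the children's
    clusters yields this node's cluster, represented by a single attractor vertex
    and carrying a precomputed error estimate."""
    def __init__(self, attractor, error, children=()):
        self.attractor = attractor
        self.error = error                  # error of replacing all descendants by attractor
        self.children = list(children)

def active_clusters(node, tolerance_of):
    """Return the 'active' clusters for the current view: a coalesced cluster stays
    if its error is acceptable, otherwise it is broken back into its sub-clusters.
    tolerance_of(node) is a view-dependent callable, e.g. larger for distant clusters."""
    if not node.children or node.error <= tolerance_of(node):
        return [node]                       # keep this cluster coalesced
    selected = []
    for child in node.children:             # split: recurse into the sub-clusters
        selected += active_clusters(child, tolerance_of)
    return selected
```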

3D compression

Several compression techniques have been proposed; each has its own advantages.

Compression based on triangle strips

A representation based on triangle strips, supported by OpenGL [Neider93] and other graphic libraries, is used to reduce the number of times the same vertex is processed by the graphics subsystem from an average of 6 to an average of 2.4 (assuming an average length of 10 triangles per strip). Basically, in a triangle strip, a triangle is formed by combining a new vertex description with the descriptions of the two previously sent vertices, which are temporarily stored in two buffers. The first two vertices are the overhead for each strip, so it is desirable to build long strips, but the automation of this task remains a challenging problem [Evans96]. A more general use of buffers for rendering triangle meshes is studied in [Bar-Yehuda96], which concludes that processing the triangles of a mesh of n vertices while receiving each vertex only once requires local memory sufficient to store at most about 13√n vertices.

Generalized strips

Deering's generalized strips [Deering95] use a 16-register stack-buffer where previous vertices may be stored for later use. One bit per vertex is used to indicate whether the vertex should be pushed onto the stack-buffer. Two bits per triangle are used to indicate how to form the next triangle. One bit per triangle indicates whether the next vertex should be read from the input stream or retrieved from the stack. Four bits of address are used to randomly select a vertex from the stack-buffer each time an old vertex is reused. Assuming that each vertex is reused only once, the total cost for encoding the connectivity information is 1+4 bits per vertex plus 2+1 bits per triangle. Assuming 2 triangles per vertex, this amounts to 11 bits per vertex. Algorithms for systematically creating good traversals of general meshes using Deering's generalized triangle mesh syntax are not available, and a naive traversal of arbitrary meshes may result in many isolated triangles or small runs, implying that a significant portion of the vertices will be sent more than once, and hence increasing the storage cost.
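The connectivity bit budget quoted above works out as follows (the reuse assumption is the one stated in the text):

```python
# Connectivity cost of Deering-style generalized strips, assuming (as above)
# that each vertex is reused exactly once and that a mesh has ~2 triangles per vertex.
push_bit, stack_address_bits = 1, 4          # per vertex
form_bits, source_bit = 2, 1                 # per triangle
triangles_per_vertex = 2

bits_per_vertex = (push_bit + stack_address_bits
                   + triangles_per_vertex * (form_bits + source_bit))
print(bits_per_vertex)                       # 11 bits of connectivity per vertex
```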

Deering used an entropy code to compress the quantized coordinates of vertices and normals.

Topological surgery

The Topological Surgery method recently developed by Taubin and Rossignac [Taubin96] provides the best compression results to date for arbitrary triangular meshes. It builds a "cut" (a vertex spanning tree of the original mesh) using a spiraling strategy (see the thick line in the figure below, top left), which in general produces a tree with relatively few bifurcations (nodes with more than one child). The cut is the boundary of a network of corridors connected by bifurcation triangles, which have zero edges in the cut (figure below, center). The structure of the cut and of the corridors' network is efficiently encoded as a sparse tree by storing the numbers of consecutive single-child nodes (right). The triangles internal to a corridor may be encoded using one bit per triangle. Each of these "left/right" bits indicates whether the next triangle along the corridor is incident upon an edge of the left or of the right boundary of the corridor (bottom right). The vertices are stored in the depth-first traversal order of the cut tree. The entire mesh is represented by the list of vertex coordinates, an encoding of the sparse cut and corridor trees, and the concatenation of the left/right bits. The entire triangle/vertex incidence graph is usually encoded with fewer than 2 bits per triangle. Turan claims a minimal worst-case cost of 6 bits per triangle [Turan84] for storing the triangulation of a graph. Taubin and Rossignac's result of fewer than 2 bits per triangle is an expected asymptotic cost for complex, but well-behaved, meshes.

[Figure: the spiraling cut (vertex spanning tree), the network of corridors connected by bifurcation triangles, the sparse trees encoded by counts of consecutive single-child nodes, and the left/right bits along a corridor.]
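The left/right bits can be understood through the following decoding sketch for a single corridor. The boundary lists and the bit convention are illustrative; this is the marching idea only, not the exact file format of [Taubin96].

```python
def decode_corridor(left, right, bits):
    """Rebuild the triangles of one corridor from its marching bits.

    left and right are the vertex indices along the two boundaries of the corridor,
    both starting at the entry gate (left[0], right[0]); there is one bit per internal
    triangle, 0 meaning 'advance along the left boundary' and 1 'along the right',
    so len(bits) == (len(left) - 1) + (len(right) - 1)."""
    l = r = 0                                # current gate edge is (left[l], right[r])
    triangles = []
    for bit in bits:
        if bit == 0:
            triangles.append((left[l], right[r], left[l + 1]))
            l += 1
        else:
            triangles.append((left[l], right[r], right[r + 1]))
            r += 1
    return triangles
```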

As done by Deering, the vertex locations are compressed by generating predictors, quantizing the corrective vectors, and using entropy coding. Where Deering used as the predictor for each vertex its displacement from the origin of the local coordinate system, Taubin and Rossignac compute the predictor for each vertex as a linear combination of its 4 ancestors in the cut tree. They compute the optimal values of the 4 coefficients of this linear combination for the entire model and send them as part of the compressed file. The difference between the predicted and the actual location is quantized to 12 bits or less. (Using 11 bits for each coordinate would yield 0.25 mm accuracy for an entire engine block.) When the predictors are good, most corrector coefficients are small and have many repetitions, leading to drastic compression ratios. (Often 4 bits per coordinate are sufficient.) This compression technology has been implemented at IBM and interfaced to the VRML 2.0 internet standard data format for exchanging 3D models. The compression algorithms are efficient, and decompression is faster than downloading the compressed data over an ISDN line. The compression ratios depend on the nature of the model. For highly complex models, compressed files use less than a byte per triangle.

Progressive transmission

Hoppe's Progressive Meshes [Hoppe96] make it possible to transfer a 3D mesh progressively, starting from a coarse mesh and then inserting new vertices one by one. A vertex insertion identifies a vertex V and two of its incident edges. It splits the mesh at these edges and fills the hole with two triangles (see below). V is thus split into two vertices. One must specify how the new vertices are to be derived from the position of V.

Encoding each split requires the index of a vertex (log(n) bits, where n is the number of vertices) and several bits to identify the two edges amongst all edges incident upon that vertex. Because Hoppe's corrective vectors are in general shorter than Deering's relative vectors, Huffman coding should further improve the geometric compression. Recently, Hoppe has extended progressive meshes to support adaptive resolution [Hoppe97]. As the viewpoint changes, a different model is computed on the server by performing edge collapses and their inverses. Sending an encoding of these transformations to a client who has the previous model allows the client to upgrade the model to the tolerance imposed by the new viewing parameters. Unfortunately, the space requirements for the underlying data structure reported in [Hoppe97] indicate 122 bytes per triangle, and constructing it takes hours for models with fewer than a million triangles.

The best compromise between compression and progressive refinement

We must understand the intricate relations between geometric compression, simplification, and progressive download, attempting to derive a unifying approach that compromises between the number of triangles produced by simplification techniques and the representation of their vertices and incidence relations, so as to minimize the bit-complexity of progressive refinements. The challenges of this 3D problem may be illustrated through the simpler example of a map represented by a set O of edges in 2D that is viewed on a remote client through a foveated interface, where the accuracy of the map decreases as one moves away from the center of the viewing window. Thus the required accuracy, T, is not uniform and changes with the view, as the user pans and zooms over the map. For a given view, the model is clipped by the server to the desired window. A simplified map, S, may be constructed so as to approximate O while minimizing the number of edges without exceeding the non-uniform tolerance T. A simple compression may reduce the number of bits necessary to represent the vertices of S by quantizing their coordinates to fixed-length integers while ensuring that the resulting error still does not exceed T at any point. However, better results may often be obtained by combining simplification and compression (for example, using slightly more edges, but fewer bits to encode the location of their vertices). The resulting compressed and view-dependent simplification is sent to the remote user. Changing the view requires downloading new edges that were previously not visible and also refining locally available edges so as to meet the new tolerance function T'. One could discard the locally available edges and download a new compressed simplification S' computed directly from O using T'. To avoid downloading information that is already available locally, one could compute the transformation from S to S' and download its compressed description. However, S' may not have been computed with this optimization in mind. Even when T and T' are similar, the representations of S and of S' may be very different, resulting in costly descriptions of the transformation between them. A better approach would be to start with S and search for a transformation that could be encoded with a minimum number of bits and that would produce a set S' that satisfies the new tolerance.
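For the uniform-tolerance case, the simple quantization step mentioned in this 2D map example can be sketched as follows; a view-dependent T would simply vary the grid step across the window.

```python
import math

def quantize_map_vertices(vertices, tolerance):
    """Snap 2D vertex coordinates to a grid coarse enough to save bits but fine
    enough that the rounding error never exceeds the tolerance T."""
    step = tolerance * math.sqrt(2)          # per-axis error <= step/2, so the 2D
                                             # Euclidean error is at most step/sqrt(2) = T
    return [(round(x / step) * step, round(y / step) * step) for x, y in vertices]
```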
Downloading a compressed version of a fully detailed model may prove significantly more effective than first downloading a compressed coarse version of the model and then downloading progressive refinements. For example, a compressed model of a complex shape may require only 8 bits per triangle [Taubin96], while the data for a vertex split in Hoppe's progressive refinement requires more than twice this amount per triangle. Consequently, if a refined model with twice as many triangles is needed, it may be preferable to download it as a whole, rather than to download it as progressive refinements. Furthermore, if only a coarse model is needed for the current view, but we estimate that a refined model will be needed in the next view, it may be preferable to download the more refined model now, at the cost of the level of resolution of other models expected to require less resolution in the future. Downloading more complex models before they are needed may increase the rendering cost, unless the client can easily recover simplified versions of these models.

The architecture of a distributed 3D server

Consider the situation where the 3D model is distributed over several remotely located databases. For example, a shopping mall may combine several models of department stores, each residing on a different server. It is not necessary for a given view to download all of the models, which would require prohibitive amounts of bandwidth or time. Instead, it often suffices to extract from each server the information necessary to produce the new image, as defined by the user's view-editing actions. Given that the bandwidth to the client is limited and that database access must often be shared by several clients, we propose to use a global scheduler for optimal resource allocation. The scheduler must have enough information to establish what data is needed by each client and must optimally allocate network resources, so that each client may upgrade its local representation to produce the best image quality.

Feature upgrades

Each model is hierarchically decomposed into features. With each node of the hierarchy we associate:

1. A simple bound that encloses the feature. The bound may be used to estimate the feature's visibility for a given view. It may also be used to estimate the magnification factor associated with the projection of the feature on the screen for a given view. In short, it may be used to estimate the error associated with using each level of detail of the feature.
2. The transmission costs associated with upgrading the feature from each level-of-detail to the next.
3. A compressed encoding of an upgrade operation that takes a lower resolution of the feature and produces a refinement (which may optionally split the feature into sub-features).

A global scheduler

Consider that a global scheduler (which may itself be distributed) has access to the feature bounds and feature-upgrade transmission costs of the hierarchy of each model. The scheduler also keeps a log of the resolution available for each feature at each client location. As the scheduler receives the view-control and model-editing commands from each client, it decides which features are visible to which client and optimizes the allocation of feature upgrades. The optimization is constrained by the available network resources and attempts to ensure optimal image quality for all clients.
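A minimal sketch of such a scheduler is given below. The FeatureState record bundles the per-feature data listed above with the resolution each client holds; the benefit estimator, the greedy benefit-per-bit strategy, and the per-frame bit budget are illustrative assumptions, not the proposed algorithm itself.

```python
from dataclasses import dataclass, field

@dataclass
class FeatureState:
    bound: tuple                  # simple enclosing bound (visibility / error estimates)
    upgrade_costs: list           # bits needed to go from level i to level i + 1
    client_level: dict = field(default_factory=dict)   # client id -> level held

def schedule_upgrades(features, views, visible, benefit, bit_budget):
    """Greedily grant the visible feature upgrade with the best estimated
    image-quality benefit per transmitted bit until the budget is spent.
    features: id -> FeatureState; views: client id -> view parameters;
    visible(bound, view) and benefit(feature, level, view) are supplied estimators."""
    plan = []
    while True:
        candidates = []
        for client, view in views.items():
            for fid, f in features.items():
                lvl = f.client_level.get(client, 0)
                if lvl < len(f.upgrade_costs) and visible(f.bound, view):
                    cost = f.upgrade_costs[lvl]
                    if cost <= bit_budget:
                        candidates.append((benefit(f, lvl, view) / cost, cost, client, fid))
        if not candidates:
            return plan
        _, cost, client, fid = max(candidates, key=lambda c: c[0])
        features[fid].client_level[client] = features[fid].client_level.get(client, 0) + 1
        plan.append((client, fid))
        bit_budget -= cost
```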

Conclusion

The bandwidth required for accessing over the internet 3D models that support interactive graphic applications in manufacturing, scientific, commercial, and entertainment areas exceeds the capabilities of current personal computers and phone connections by several orders of magnitude. Anticipated technology improvements are offset by the growing need for more detailed models of cities, factories, aircraft, appliances, molecules, or virtual malls. We outline here the infrastructure and the supporting technologies for a 3D data server that will significantly reduce the delays experienced by internet users when first accessing complex 3D databases, when interactively navigating through the model, and when inspecting specific details. To allow a client to produce a new image, the 3D server establishes which parts of the model must be available to the client and at which resolution. Then it extracts the missing information from the various databases, compresses it, and transmits it to the client. The client decompresses the data and uses it to upgrade the previously available geometry and to produce the new image. To generate this incremental information and encode it for fast transmission, a multi-resolution model of the scene must be precomputed through an automatic simplification process. Then, the incremental "upgrades" (the information needed to refine the representation of each feature of the model to its successive levels of resolution) must be compressed, so as to balance transmission and decompression costs. The overall framework introduced here for minimizing the total response time and the solution based on progressive transmission of the necessary refinements may drive the architecture of future CAD and animation systems and of future 3D browsers for internet applications. Most commercial products that provide users with access to 3D data sets are or will soon be web-enabled. These products will benefit from the research outlined here, either through the development of proprietary 3D server and client solutions, or through the adoption of a new standard for progressive geometric transmission.

Bibliography

[Andujar96] C. Andujar, D. Ayala, P. Brunet, R. Joan-Arinyo, J. Sole, Automatic generation of multi-resolution boundary representations, Computer Graphics Forum (Proceedings of Eurographics'96), 15(3):87-96, 1996.
[Bar-Yehuda96] R. Bar-Yehuda and C. Gotsman, Time/space tradeoffs for polygon mesh rendering, ACM Transactions on Graphics, 15(2):141-152, April 1996.
[Bergman86] L. Bergman, H. Fuchs, E. Grant and S. Spach, Image Rendering by Adaptive Refinement, Computer Graphics (Proc. Siggraph'86), 20(4):29-37, Aug. 1986.
[Blake87] E. Blake, A Metric for Computing Adaptive Detail in Animated Scenes using Object-Oriented Programming, Proc. Eurographics'87, 295-307, Amsterdam, August 1987.
[Carey97] R. Carey, G. Bell, C. Martin, The Virtual Reality Modeling Language, ISO/IEC DIS 14772-1, April 1997, http://www.vrml.org/Specifications.VRML97/DIS.
[Cignoni95] P. Cignoni, E. Puppo and R. Scopigno, Representation and Visualization of Terrain Surfaces at Variable Resolution, Scientific Visualization 95, World Scientific, 50-68, 1995, http://miles.cnuce.cnr.it/cg/multiresTerrain.html#paper25.
[Cohen96] J. Cohen, A. Varshney, D. Manocha, G. Turk, H. Weber, P. Agrawal, F. Brooks, W. Wright, Simplification Envelopes, Proc. ACM Siggraph'96, pp. 119-128, August 1996.
[Crow82] F. Crow, A more flexible image generation environment, Computer Graphics, 16(3):9-18, July 1982.

[Darsa97] L. Darsa, B. Costa Silva, A. Varshney, Navigating static environments using image-space simplification and morphing, 1997 Symposium on Interactive 3D Graphics, ACM Press, pp. 7-16, April 1997.
[Deering95] M. Deering, Geometry Compression, Computer Graphics, Proceedings Siggraph'95, 13-20, August 1995.
[DeFloriani92] L. De Floriani and E. Puppo, A hierarchical triangle-based model for terrain description, in Theories and Methods of Spatio-Temporal Reasoning in Geographic Space, Ed. A. Frank, Springer-Verlag, Berlin, pp. 36-251, 1992.
[Durand97] F. Durand, G. Drettakis, and C. Puech, The Visibility Skeleton: A powerful and efficient multi-purpose global visibility tool, Computer Graphics, Proceedings Siggraph'97, pp. 89-100, August 1997.
[Evans96] F. Evans, S. Skiena, and A. Varshney, Optimizing Triangle Strips for Fast Rendering, Proceedings IEEE Visualization'96, pp. 319-326, 1996.
[Funkhouser93] T. Funkhouser, C. Sequin, Adaptive Display Algorithm for Interactive Frame Rates During Visualization of Complex Virtual Environments, Computer Graphics (Proc. SIGGRAPH'93), 247-254, August 1993.
[Garland97] M. Garland and P. Heckbert, Surface simplification using quadric error metrics, Proc. ACM SIGGRAPH'97, pp. 209-216, 1997.
[He96] T. He, A. Varshney, and S. Wang, Controlled topology simplification, IEEE Transactions on Visualization and Computer Graphics, 1996.
[Heckbert94] P. Heckbert and M. Garland, Multiresolution modeling for fast rendering, Proc. Graphics Interface'94, pp. 43-50, May 1994.
[Heckbert97] P. Heckbert and M. Garland, Survey of Polygonal Surface Simplification Algorithms, in Multiresolution Surface Modeling Course, ACM Siggraph Course Notes, 1997.
[Hoppe93] H. Hoppe, T. DeRose, T. Duchamp, J. McDonald, and W. Stuetzle, Mesh optimization, Proceedings SIGGRAPH'93, pp. 19-26, August 1993.
[Hoppe96] H. Hoppe, Progressive Meshes, Proceedings ACM SIGGRAPH'96, pp. 99-108, August 1996.
[Hoppe97] H. Hoppe, View-Dependent Refinement of Progressive Meshes, Proceedings ACM SIGGRAPH'97, August 1997.
[Kalvin96] A.D. Kalvin and R.H. Taylor, Superfaces: Polyhedral Approximation with Bounded Error, IEEE Computer Graphics & Applications, 16(3):64-77, May 1996.
[Klein96] R. Klein and W. Strasser, Generation of multiresolution models from CAD data for realtime rendering, in W. Strasser, R. Klein, R. Rau, Editors, Theory and Practice of Geometric Modeling, Springer-Verlag, 1996.
[Lee77] D.T. Lee and F.P. Preparata, Location of a point in a planar subdivision and its applications, SIAM Journal on Computing, 6:594-606, 1977.
[Lindstrom96] P. Lindstrom, D. Koller, W. Ribarsky, L. Hodges, N. Faust and G. Turner, Real-Time, Continuous Level of Detail Rendering of Height Fields, SIGGRAPH'96, 109-118, Aug. 1996.
[Low97] K-L. Low and T-S. Tan, Model Simplification using Vertex-Clustering, Proc. Symposium on Interactive 3D Graphics, pp. 75-81, Providence, April 1997.
[Luebke97] D. Luebke and C. Erikson, View-dependent simplification of arbitrary polygonal environments, Proc. Siggraph'97, pp. 199-208, Los Angeles, 1997.
[Maciel95] P. Maciel and P. Shirley, Visual Navigation of Large Environments Using Textured Clusters, 1995 Symposium on Interactive 3D Graphics, ACM Press, pp. 95-102, April 1995.
[Mark97] W. Mark, L. McMillan, and G. Bishop, Post-rendering 3D warping, 1997 Symposium on Interactive 3D Graphics, ACM Press, pp. 7-16, April 1997.
[Mann97] Y. Mann and D. Cohen-Or, Selective Pixel Transmission for Navigation in Remote Environments, Proc. Eurographics'97, Budapest, Hungary, September 1997.

[Mitchell95] J.S.B. Mitchell and S. Suri, Separation and approximation of polyhedral objects, Computational Geometry: Theory and Applications, 5(2), pp. 95-114, September 1995.
[Neider93] J. Neider, T. Davis, and M. Woo, OpenGL Programming Guide, Addison-Wesley, 1993.
[Optimizer97] SGI OpenGL Optimizer, http://www.sgi.com/Technology/openGL/optmizer_wp.html, 1997.
[Popovic97] J. Popovic and H. Hoppe, Progressive Simplicial Complexes, Proceedings ACM Siggraph'97, pp. 217-224, August 1997.
[Puppo96] E. Puppo, Variable resolution terrain surfaces, Proceedings of the Eighth Canadian Conference on Computational Geometry, Ottawa, Canada, pp. 202-210, August 12-15, 1996.
[Puppo97] E. Puppo and R. Scopigno, Simplification, LOD and multiresolution: Principles and applications, Tutorial at the Eurographics'97 conference, Budapest, Hungary, September 1997.
[Regan94] M. Regan and R. Pose, Priority rendering with a virtual reality address recalculation pipeline, Proceedings ACM SIGGRAPH'94, pp. 155-162, August 1994.

[Rockwood89] A. Rockwood, K. Heaton, and T. Davis, Real-time Rendering of Trimmed Surfaces, Computer Graphics, 23(3):107-116, 1989.
[Ronfard96] R. Ronfard and J. Rossignac, Full-range approximation of triangulated polyhedra, Proc. Eurographics'96, Computer Graphics Forum, Vol. 15, No. 3, pp. C-67, August 1996.
[Rossignac93] J. Rossignac and P. Borrel, Multi-resolution 3D approximations for rendering complex scenes, pp. 455-465, in Geometric Modeling in Computer Graphics, Springer Verlag, Eds. B. Falcidieno and T.L. Kunii, Genova, Italy, June 28-July 2, 1993.
[Rossignac94] J. Rossignac and M. Novak, Research Issues in Model-based Visualization of Complex Data Sets, IEEE Computer Graphics and Applications, 14(2):83-85, March 1994.
[Rossignac94b] J. Rossignac, Through the cracks of the solid modeling milestone, in From Object Modelling to Advanced Visualization, Eds. S. Coquillart, W. Strasser, P. Stucki, Springer Verlag, pp. 1-75, 1994.
[Rossignac95] J. Rossignac, Geometric Simplification, in Interactive Walkthrough of Large Geometric Databases, ACM Siggraph Course Notes 32, pp. D1-D11, Los Angeles, 1995.
[Rossignac96] J. Rossignac, Geometric Simplification, in Interactive Walkthrough of Large Geometric Databases, ACM Siggraph Course Notes 35, pp. D1-D37, New Orleans, 1996.
[Rossignac97] J. Rossignac, The 3D revolution: CAD access for all, International Conference on Shape Modeling and Applications, Aizu-Wakamatsu, Japan, IEEE Computer Society Press, pp. 64-70, March 3-6, 1997.
[Rossignac97b] J. Rossignac, Geometric Simplification and Compression, in Multiresolution Surface Modeling Course, ACM Siggraph Course Notes 25, Los Angeles, 1997.
[Rossignac97c] J. Rossignac, Simplification and Compression of 3D Scenes, Eurographics Tutorial, 1997.
[Schroeder92] W. Schroeder, J. Zarge, and W. Lorensen, Decimation of triangle meshes, Computer Graphics, 26(2):65-70, July 1992.
[Schroeder97] W. Schroeder, A Topology Modifying Progressive Decimation Algorithm, in Multiresolution Surface Modeling Course, ACM Siggraph Course Notes 25, Los Angeles, 1997.
[Sillion97] F. Sillion, G. Drettakis, and B. Bodelet, Efficient impostor manipulation for real-time visualization of urban scenery, Proceedings Eurographics'97, Computer Graphics Forum, Vol. 16, No. 3, pp. C207-C218, September 1997.
[Taubin96] G. Taubin and J. Rossignac, Geometric Compression through Topological Surgery, IBM Research Report RC-20340, January 1996. (http://www.watson.ibm.com:8080/PS/7990.ps.gz)
[Turk92] G. Turk, Re-tiling polygonal surfaces, Computer Graphics, 26(2):55-64, July 1992.
[Turan84] G. Turan, Succinct representations of graphs, Discrete Applied Math, 8:289-294, 1984.
[Upstill90] S. Upstill, The RenderMan Companion, Addison Wesley, Reading, MA, 1990.
[Varshney94] A. Varshney, Hierarchical Geometric Approximations, PhD Thesis, Dept. of Computer Science, University of North Carolina at Chapel Hill, 1994.
[Xia96] J. Xia and A. Varshney, Dynamic view-dependent simplification for polygonal models, Proc. IEEE Visualization'96, pp. 327-334, 1996.
[Zhang97] H. Zhang, D. Manocha, T. Hudson, K. Hoff, Visibility culling using hierarchical occlusion maps, Computer Graphics, Proceedings Siggraph'97, pp. 77-88, August 1997.
[3DIX96] IBM 3D Interaction Accelerator, Product Description, http://www.research.ibm.com/3dix, 1996.

Jarek Rossignac is a Professor in the College of Computing at the Georgia Institute of Technology and the Director of GVU, Georgia Tech's Graphics, Visualization, and Usability Center, which involves 50 faculty members and over 160 graduate students focused on technologies that make humans more effective. Prior to joining Georgia Tech, he was the Strategist for Visualization and the Senior Manager of the Visualization, Interaction, and Graphics department at IBM Research, where he managed research activities in modeling, animation, interactive graphics, scientific visualization, medical imaging, and VR. He also managed the development of IBM's 3DIX, an interactive viewer for the collaborative inspection of highly complex 3D CAD models, along with two other IBM visualization products: Data Explorer and PanoramIX. He has received numerous Best Paper and Invention awards; chaired 12 conferences, workshops, and program committees in Graphics, Solid Modeling, and Computational Geometry; guest-edited 7 special issues of professional journals; and co-authored 13 patents. He holds a PhD in EE from the University of Rochester, New York, in the area of Solid Modeling, and a Diplome d'Ingenieur from the ENSEM in Nancy, France. He may be contacted at [email protected].
