Interactive Forest Walk-through

P. Micikevicius, C. E. Hughes, J. M. Moshell

Abstract

Interactive rendering of a forest containing a large number of unique trees and other vegetation is a challenging and important problem in computer graphics and visual simulation. While methods for rendering near photo-realistic vegetation scenes have been described in the literature, they require tens of minutes or even hours of computation. In order to support interactive forest walk-throughs, we propose a hierarchical method for computing levels of detail for trees, as well as a framework for traversing scenes of arbitrary size. The proposed framework selects levels of detail based on a combination of visibility and projected size metrics, rather than projected size alone. Dynamic scene modification is possible since visibility is determined at run-time and requires no preprocessing step.

Keywords: L-systems, level-of-detail, occlusion, walk-through.

1. Introduction

While computer modeling and rendering of trees and forest scenes have been studied quite extensively, little research has been published on traversing forest scenes at interactive frame rates. Rendering photo-realistic forest scenes requires from tens of minutes to hours of computation per frame [12]. In order to be useful for visual simulation, training, and scientific applications, a forest walk-through framework must satisfy a number of requirements:

• The scenes should be rendered at interactive frame rates. While 30 Hz is usually the goal, lower rates can be acceptable in order to increase the visual quality of a scene.
• Each tree should be unique. Instantiating affinely transformed copies of the same tree is unacceptable in some applications. Of course, to increase performance such a system can always be scaled down by using only a few unique specimens.
• The exact same tree must be rendered, given the same viewpoint position and direction. Thus, tree structures cannot be generated randomly when their positions fall within the viewing frustum and discarded when they are no longer visible.
• The forest must be modifiable at run-time. Users should be able to grow, place, move, and break trees as well as other three-dimensional objects.

In this paper we propose a framework that satisfies the above conditions. The contributions lie in two areas. First, we propose a hierarchical scheme for computing and displaying levels of detail for recursively generated trees. While the method is described in the context of L-systems, it is equally applicable to any tree generation approach. Second, we propose a forest walk-through framework to achieve interactive frame rates. Trees are efficiently rendered at appropriate levels of detail based on their visibility as well as their projected size.

The proposed level of detail tree model and the relevant background are discussed in Section 2. In Section 3 we propose a visibility-based framework for traversing vegetation scenes at interactive rates. Conclusions and future work are described in Section 4.

2. Tree Model

Computer modeling of trees has been an active research area for a number of decades. Cohen [8] and Honda [19] pioneered computer models for tree generation. Prior to that, Ulam [40] and Lindenmayer [23] studied mathematical models of plant growth. Lindenmayer proposed a set of rules for generating text strings as well as methods for interpreting these strings as branching structures. His method, now referred to as an L-system, was subsequently extended to allow random variations [32], pruning [33], and to account for a growing plant's interaction with the environment [30,12]. A wide range of environmental factors, including exposure to sunlight, distribution of water in the soil, and competition for space with other plants, were incorporated into the model, leading to biologically accurate ecosystems. Ray-tracing methods were employed to produce visually stunning images of vegetation scenes [12,32,33,30] at the cost of high computational complexity. Another botanical tree model was proposed by de Reffye et al. [35].

A number of methods have been proposed for generating plants with the goal of visual quality without relying on


botanical knowledge. Oppenheimer [31] used fractals to design self-similar plant models. Bloomenthal [6] assumed the availability of a skeletal tree structure and concentrated on generating tree images using splines and highly detailed texture maps. Different procedural models were proposed by Weber and Penn [41] and by Aono and Kunii [3]. Reeves and Blau [34] and Chiba et al. [7] utilized particle systems to generate images of forest scenes.

2.1. Previous Tree Rendering Methods

Traditional surface simplification methods, such as multiresolution meshes [24], assume a large connected mesh structure. These methods are not suitable for trees, which contain a large number of disconnected polygons (leaves or leaf clusters). For example, Remolar et al. [36] showed that a general mesh simplification approach resulted in a "pruned" appearance. Thus, a number of custom methods have been proposed to speed up tree rendering. These can be roughly categorized into two approaches, geometry-based and texture-based.

Oppenheimer [31] used polygonal cylinders to render large branches, replacing the small ones with lines. Weber and Penn [41] proposed a similar simplification approach, replacing cylindrical branch segments with lines, and leaf polygons with points or not rendering them at all. Branches and leaves are grouped into "masses" and a decision on the rendering method is made for each mass. When selecting the level of detail, Weber and Penn take into consideration the field of view in addition to the distance from the viewer, which "compensates for the effect of a telephoto lens" [41].

Deussen et al. [13] sampled each object to obtain a list of points. Based on the projected size of each object, a decision is made at run-time whether to render triangles or the point approximation. The number of points used in the approximation is also determined at run-time. Each list is ordered randomly to ensure that the overall shape of the object is preserved no matter how many points are rendered. Thus, the method produces no popping artifacts. Complex plants are stored in shallow octrees, with a separate point list for each cell. Marshall [25] used a non-uniform tetrahedral subdivision of 3D space to render images in three steps: tetrahedra visibility computation, subdivision refinement, and rendering the plant geometry (or approximating it with Gouraud-shaded tetrahedra that contain it). The method was implemented in software and required minutes to render each frame of an 80-tree scene. Remolar et al. [36,37] proposed a multiresolution simplification scheme that iteratively merges the leaves of a tree. Popping artifacts occur, though the approach maintains the overall shape of a tree.

Several texture-based methods for reducing the geometric complexity of trees have been proposed. Jakulin [21] approximated a tree with slicings, each

consisting of multiple layers of textured polygons. Several slicings are precomputed from multiple views of a given tree. To reduce the popping artifacts, each tree is approximated by blending the two slicings that most closely match the viewing direction. Meyer, Neyret, and Poulin [29] render a tree as a 3-level hierarchy, each level composed of transformed instances of the lower-level elements. A set of bidirectional textures is precomputed for a representative element of each hierarchy level. Separate textures are obtained for direct and ambient lighting. A tree is approximated by selecting and blending the 5 textures most closely matching the viewing direction and lighting conditions. The method requires a large amount of storage for the bidirectional texture maps, as well as a long precomputation step.

Max et al. [26,27,28] proposed to reconstruct trees from texture maps that contain normal and depth information in addition to RGBA color for each pixel. Several texture maps are precomputed from a number of viewing directions, evenly distributed on a bounding sphere of a tree. A tree is approximated by rendering only the texture maps most closely matching the viewing direction. During rendering, each texel is transformed using the precomputed depth information. In [26,27] texture maps were precomputed for each hierarchical level of a tree, rather than for the entire tree only. The approach required hours of precomputation and seconds to render an approximation for a single tree. However, rendering performance would likely be dramatically improved using modern graphics hardware with pixel shaders. It is worth mentioning that a large amount of work has been done on texture-based impostors for general scenes (for an overview see [1]). However, these are most applicable to highly complex but relatively compact scenes.

Among the approaches discussed above, only Max [27,28] utilizes a tree hierarchy. Thus, the other approximations assume static scenes, with the associated limitations. For example, swaying in the wind or breaking off branches cannot readily be implemented. Furthermore, all the approaches described tacitly assume that complex scenes are obtained by instantiating multiple copies of a few distinct plants. For the approach proposed by Deussen et al. [13], a scene with a large number of unique trees would require a prohibitive amount of storage for all point lists. Similarly, the approaches in [21,26,27,28,29] would require large amounts of texture memory to store the images for each unique plant.

2.2. The Modified L-system

We have modified the stochastic L-system described by Prusinkiewicz et al. [33] to generate trees for the proposed forest walk-through implementation. L-systems were chosen because they provide a compact representation of a tree as well as the methods necessary for advanced biological simulation. The rules of the system, together with our modifications, are briefly described below.


ω:  F A(1)
p1: A(k) → /(φ) [ +(α) F A(k + 1) ] –(β) F A(k + 1)  :  min{1, (2k + 1)/k²}
p2: A(k) → /(φ) –(β) F A(k + 1)  :  max{0, 1 – (2k + 1)/k²}
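As a rough illustration, one derivation step of the productions above can be sketched in C++. This is a minimal sketch, not the paper's implementation: angle parameters and the graphical interpretation are omitted, and the `Module`, `rewrite`, and `p1Probability` names are our own.

```cpp
#include <algorithm>
#include <random>
#include <string>
#include <vector>

// Probability that rule p1 (branching) is applied to module A(k),
// as given in the productions: min{1, (2k + 1) / k^2}.
double p1Probability(int k) {
    return std::min(1.0, (2.0 * k + 1.0) / (double(k) * k));
}

struct Module { char symbol; int k; };  // k is meaningful only for A modules

// One derivation step: every A(k) is rewritten by p1 or p2; all other
// modules are copied unchanged.
std::vector<Module> rewrite(const std::vector<Module>& in, std::mt19937& rng) {
    std::uniform_real_distribution<double> u(0.0, 1.0);
    std::vector<Module> out;
    for (const Module& m : in) {
        if (m.symbol != 'A') { out.push_back(m); continue; }
        if (u(rng) < p1Probability(m.k)) {
            // p1: /(phi) [ +(alpha) F A(k+1) ] -(beta) F A(k+1)
            for (char c : std::string("/[+F")) out.push_back({c, 0});
            out.push_back({'A', m.k + 1});
            out.push_back({']', 0});
            for (char c : std::string("-F")) out.push_back({c, 0});
            out.push_back({'A', m.k + 1});
        } else {
            // p2: /(phi) -(beta) F A(k+1)
            for (char c : std::string("/-F")) out.push_back({c, 0});
            out.push_back({'A', m.k + 1});
        }
    }
    return out;
}
```

Note that for k = 1 the probability min{1, 3/1} is 1, so the first rewriting step always branches.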

The initial string is specified by the axiom ω, which consists of two modules. Module F is interpreted and rendered as a branch segment. Module A() is used for "growing" the tree and has no graphical interpretation (the integer in the parentheses denotes the number of times the rewriting rules have been applied). Modules + and – denote rotation around the z-axis, while / denotes rotation around the y-axis. The angles for the rotations are specified in the parentheses (α = 32°, β = 20°, φ = 90°). There are two possibilities when rewriting module A(k). The first rewriting rule, p1, produces two branches with probability min{1, (2k + 1)/k²}, while p2 produces a single branch segment. For more details on L-systems and their interpretation refer to [32,33].

In order to produce models suited for real-time rendering, our interpretation of the L-system strings has a number of minor differences from that of Prusinkiewicz et al. First, the length of a branch segment in the modified model decreases with each rewriting step. Second, leaf clusters are rendered as textured cross-polygons. A cross-polygon is made up of two quadrangles intersecting along their respective middle lines. Leaf clusters are attached to the last three levels of the tree, whereas the original model of Prusinkiewicz et al. used only the last level. Furthermore, the color of the leaf clusters depends on the branch level: interior clusters are darker to simulate light occlusion within the tree canopy. The L-system description of a tree is stored as a singly-linked list, with a list element for each module.

Times to render a single tree after a varying number of L-system productions are listed in Table 1. Results are in milliseconds and were averaged over 100 randomly generated trees. The numbers of branch segments (cylinders) and leaf clusters are also listed. Experiments were conducted on a PC equipped with a 2.4GHz Pentium 4 processor, 256MB RAM, and an NVidia GeForce FX5800 Ultra graphics card.
The application was written in ANSI C++ using OpenGL 1.2. Textures for leaf clusters were mipmapped 512x512 RGBA images.

n. prod   n. cyls   n. leaves   time (ms)       FPS
   2          3          3        0.025     40000.00
   4         15         14        0.128      7812.50
   6         62         55        0.490      2040.82
   8        177        146        1.363       733.68
  10        393        284        2.933       340.95
  12        745        475        5.219       191.61
  14       1281        735        8.660       115.47


Table 1: Time required to render a single tree generated after different numbers of rewriting steps.

The trees for the walk-through were produced by 12 productions of the modified L-system. Branch structures as well as foliated trees, produced after 6, 8, and 12 productions, are shown in Figures 1 and 2, respectively. Rendering the highest level of detail requires approximately 5.2 milliseconds on the test system. For a 12-level tree, traversing the L-system linked list and rendering only the branch segments require 0.11 and 2.3 milliseconds, respectively. While polygon complexity is modest, a large number of transformations (three for each segment and one for each leaf cluster) as well as OpenGL state switches are involved in rendering a single tree.

Figure 1: Branch structures: 6, 8, and 12 productions

Figure 2: Foliated structure: 6, 8, and 12 productions

2.3. Hierarchical Levels of Detail

A simplifying assumption is made that a tree consists of branch segments and leaf clusters. The level of a branch segment is the rewriting step in which the corresponding module was added. Any level-k branch segment is connected to a single level-(k – 1) segment, or parent, and may have multiple children, i.e., level-(k + 1) segments connected to it. We say that a tree is rooted at the level-0 branch segment, which by construction does not have a parent. Given any branch segment, we define its subtree to be the set of all successor segments and associated leaf clusters. A k-subtree is a subtree rooted at an arbitrary level-k branch segment.

2.3.1. Discrete LOD

We propose a hierarchical scheme for computing tree levels of detail, similar to Max's approach [27]. Human perception experiments [38,39] suggest that the structure of lower-level branches is critical to memorization and recognition. Thus, in the kth level of


detail, LOD-k replaces each k-subtree with a textured cross-polygon impostor. Each instance of the impostor is transformed in 3D space in the same way as the k-subtree it approximates. The two (untextured) cross-polygons are shown in Figure 3 for LOD-1 of a tree. The lowest level of detail, LOD-0, replaces the entire tree with a single cross-polygon.

The LOD-k textures are rendered from several views of an arbitrary k-subtree. In our implementation two texture maps are generated from two perpendicular views of the subtree. Each texture map is a 256x256 RGBA (8 bits/component) image, scaled appropriately at run-time by the graphics hardware. It took 0.13 seconds to prepare the textures for all impostors of a 12-level tree. Since the silhouette of a subtree is the same from any two views at an angle of 180° to each other, it is not necessary to use distinct textures for the two sides of a polygon. While this simplification does provide an incorrect view from one side, we observed minimal discrepancies, due to occlusion and the distance to trees rendered at lower levels of detail.
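The LOD-k traversal described above can be sketched as follows. This is illustrative only: the `Segment` structure, the counters, and the `binaryTree` helper are our own (the actual implementation stores the tree as a linked list of L-system modules). For a tree with strictly binary branching, LOD-2 yields 3 cylinders and 4 impostors, consistent with the LOD-2 row of Table 2.

```cpp
#include <memory>
#include <vector>

// Minimal branch-segment node; level is the rewriting step that created it.
struct Segment {
    int level = 0;
    std::vector<std::unique_ptr<Segment>> children;
};

struct RenderCounts { int cylinders = 0; int impostors = 0; };

// LOD-k traversal: segments below level k are drawn as cylinders; each
// k-subtree is replaced by a single textured cross-polygon impostor.
void renderLOD(const Segment& s, int k, RenderCounts& out) {
    if (s.level == k) { ++out.impostors; return; }  // draw cross-polygons here
    ++out.cylinders;                                // draw branch geometry here
    for (const auto& c : s.children) renderLOD(*c, k, out);
}

// Helper used only for this demonstration: a complete binary branching tree.
std::unique_ptr<Segment> binaryTree(int level, int maxLevel) {
    auto s = std::make_unique<Segment>();
    s->level = level;
    if (level < maxLevel)
        for (int i = 0; i < 2; ++i)
            s->children.push_back(binaryTree(level + 1, maxLevel));
    return s;
}
```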

Figure 3: Cross-polygons for LOD-1

Figure 4: Six levels of detail for a 12-level tree

Six sample levels of detail for a 12-level tree are shown in Figure 4. LOD-0 is the lowest level of detail (a single cross-polygon) and LOD-12 is the full level of detail. Each LOD was rendered at the same angle and distance from the viewpoint. Note that LOD-0 appears smaller than the tree it approximates because both polygons in the impostor are at a 45° angle to the viewer. The orientation of individual impostors is less significant for higher levels of detail. Nevertheless, it does result in some popping artifacts. Some methods for reducing the popping effect are discussed in Section 4.

Times to render a single tree at varying levels of detail are listed in Table 2. The listed times are in milliseconds and were averaged over 128 randomly generated trees.

LOD   n. cyls   n. leaves   time (ms)      FPS     speedup
 0        0          1        0.172     5813.95    30.343
 1        1          2        0.188     5319.15    27.761
 2        3          4        0.203     4926.11    25.709
 3        7          8        0.235     4255.32    22.209
 4       15         16        0.312     3205.13    16.728
 5       31         31        0.547     1828.15     9.541
 6       62         47        0.844     1184.83     6.184
 7      109         68        1.281      780.64     4.074
 8      177         93        1.875      533.33     2.783
 9      270        123        2.391      418.24     2.183
12      745        475        5.219      191.61     1.000

Table 2: Time to render a single tree (with 12 levels of branches) at different levels of detail

2.3.2. Continuous LOD

The discrete LOD approach described above suffers from popping artifacts, observed when switching between two levels of detail. This effect can be mitigated by extending the discrete approach to a continuous one. For example, any real number between 0 and 12 corresponds to a unique level of detail of a 12-production tree. Continuity is achieved by alpha-blending the two nearest discrete levels of detail. To distinguish a continuous level of detail from a discrete one,


we use LOD-k.f to denote a level of detail where k and f are the integer and fractional components (for example, LOD-3.2 implies k = 3 and f = 0.2). LOD-k.f of a tree is obtained by rendering both LOD-k and LOD-(k + 1). If we assume that opacity varies between 0 and 1, then LOD-k and LOD-(k + 1) are rendered at opacities (1 – f) and f, respectively. Though the concept is simple, the implementation is complicated by the fact that, because of z-buffering, modern graphics hardware correctly composites transparent objects only if they are rendered in sorted order (either back-to-front or front-to-back). Furthermore, alpha-blending two objects often results in an image that is darker than a rendering of a single opaque object [21].
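The opacity computation for a continuous level of detail can be sketched as follows (a minimal sketch; the `Blend` structure and function name are ours):

```cpp
#include <cmath>

// Splits a continuous level of detail LOD-k.f into its discrete part k and
// blend fraction f, and returns the opacities at which LOD-k and LOD-(k+1)
// are rendered.
struct Blend { int k; double f; double opacityK; double opacityK1; };

Blend continuousLOD(double lod) {
    double intPart;
    double f = std::modf(lod, &intPart);  // separates integer and fraction
    return { int(intPart), f, 1.0 - f, f };
}
```

For example, LOD-3.2 renders LOD-3 at opacity 0.8 and LOD-4 at opacity 0.2.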


In our experiments, two-pass rendering resulted in a 60% to 70% increase in rendering time.

3. Visibility-based Walk-through Framework

A large body of research exists on visibility-based object culling (see [9] for a recent comprehensive survey). The methods fall into two general categories: object-based and image-based. Object-based methods clip and cull primitives against a set of pre-selected objects or occluders [11,22], and culling takes place before any rendering. Some examples of object-based methods include Prioritized-Layered Projection algorithms [22], Binary Space Partition algorithms, and algorithms using shadow frusta [20]. Object-based algorithms perform best in the presence of a small number of portals or large occluders, which makes them unsuitable for a forest walk-through, since the objects making up trees (branch segments and leaf clusters) are many and tend to be relatively small.

Image-based visibility methods cull in window coordinates, so rendering is required. The z-buffer algorithm is the most common example of image-based culling. In order to cull primitives or even objects, depth tests are performed on the projections of their bounding boxes. Computation is accelerated by hierarchical methods, such as the Hierarchical Z-Buffer [16] and the Hierarchical Occlusion Buffer [42]. These techniques store a pyramid of buffers, where a pixel in the kth level of the pyramid corresponds to a 2x2 area in the (k – 1)th level. When computing the visibility of an object, computation is started at the level that minimizes the number of pixels to be examined. Related methods were proposed by Bartz et al. [45] and Hey et al. [17,18].

While the above methods were designed to cull polygons, we adopt a similar approach to select levels of detail for trees. We propose an image-based level of detail selection method for rendering forest scenes at interactive frame rates. Given a tree, the appropriate level of detail is chosen at run-time based on the visibility of the tree in addition to its projected size. The trees that do not fall within the view frustum are culled before any processing. The trees inside the frustum are rendered in front-to-back order, which enables iterative visibility computation.
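The buffer pyramid used by these hierarchical methods can be sketched as follows. This is illustrative only, not the paper's implementation: depth values grow with distance, and each pyramid level stores the farthest depth of the corresponding 2x2 block one level down, so a whole region can be conservatively rejected with a single comparison.

```cpp
#include <algorithm>
#include <vector>

// One coarsening step of a depth pyramid: each output pixel stores the
// farthest (maximum) depth of the corresponding 2x2 block of the input.
std::vector<std::vector<float>> coarsen(const std::vector<std::vector<float>>& z) {
    size_t h = z.size() / 2, w = z[0].size() / 2;
    std::vector<std::vector<float>> out(h, std::vector<float>(w));
    for (size_t y = 0; y < h; ++y)
        for (size_t x = 0; x < w; ++x)
            out[y][x] = std::max(std::max(z[2*y][2*x],   z[2*y][2*x+1]),
                                 std::max(z[2*y+1][2*x], z[2*y+1][2*x+1]));
    return out;
}

// An object whose nearest depth lies beyond the farthest depth stored for a
// region is certainly occluded there; otherwise it may be visible.
bool certainlyOccluded(float objectNearestZ, float regionFarthestZ) {
    return objectNearestZ > regionFarthestZ;
}
```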

Figure 5: Terrain grid and view frustum

3.1. Terrain Grid

To facilitate front-to-back sorting from any viewing position, the terrain is divided into a two-dimensional rectangular grid and trees are partitioned among the grid cells. The system could readily be extended to employ a quad-tree, which should improve efficiency. Snapshots of the grid, the viewpoint, and the viewing frustum are shown in Figure 5a. The light blue grid cell contains the viewpoint, and non-white grid cells are either intersected or contained by the viewing frustum. There are 8D cells at distance D from the viewpoint, where distance is the radius of a "square" circle. Alternatively, one could use Bresenham's or mid-point [15] algorithms to traverse proper circles. A radial coordinate system is utilized to traverse, for each distance, only the cells within the view frustum. Care must be taken when computing the boundary cells: the viewing direction is the same in Figures 5b and 5c, but the left-most cells are two indices apart due to the different viewer positions.
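The "square" circle of cells at distance D is the Chebyshev ring max(|dx|, |dy|) = D around the viewpoint cell, which indeed contains 8D cells for D ≥ 1. A sketch (function name is ours):

```cpp
#include <algorithm>
#include <cstdlib>
#include <utility>
#include <vector>

// Grid cells at "square circle" distance D from the viewpoint cell (vx, vy):
// the ring of cells with Chebyshev distance max(|dx|, |dy|) == D.
std::vector<std::pair<int,int>> ringCells(int vx, int vy, int D) {
    std::vector<std::pair<int,int>> cells;
    for (int dx = -D; dx <= D; ++dx)
        for (int dy = -D; dy <= D; ++dy)
            if (std::max(std::abs(dx), std::abs(dy)) == D)
                cells.push_back({vx + dx, vy + dy});
    return cells;
}
```

The ring at distance D has (2D + 1)² – (2D – 1)² = 8D cells, matching the count stated in the text.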

3.2. Visibility Computation

Visibility information can be computed at run-time, since trees in grid cells at distance k are rendered before any trees in cells at distance (k + 1). The visibility Vis(t) of a given tree t is:

Vis(t) = (number of pixels of t that pass the depth test) / (number of pixels in the clipped projection of t)

Visibility of a bounding box can be defined similarly. In order to simplify computations for bounding boxes, pixels are counted for the smallest window-aligned rectangle that includes the projection of the box. Most of the time this simplification is conservative: visibility of the box is seldom greater than


that of the rectangle. The same relationship holds for any tree and its bounding box. However, there are pathological cases; an example is a narrow, tall object lying along a diagonal of its bounding rectangle. In such cases the rectangle can be largely occluded while the object is 100% visible. However, these cases are extremely unlikely in a forest walk-through.

If the image rendered so far is available, the visibility of a tree can be predicted by projecting its bounding volume onto the view plane and counting the number of pixels in the bounding rectangle that have not yet been rendered. While Bartz et al. [5] utilized the OpenGL stencil buffer for this purpose, experiments on our system indicated that obtaining the frame and stencil buffers requires the same amount of time (approximately 14ms for a 1000x700 area). Rather than reading the entire buffer, one can obtain the area defined by the extremal x- and y-coordinates, in viewport space, of all trees in the next visibility query.

A number of recent graphics cards implement OpenGL extensions for occlusion queries [45]: primitives can be submitted to the graphics hardware, which then determines how many pixels would pass the depth test. While the GL_HP_occlusion_test extension returns only a boolean value indicating whether any pixel would pass the depth test, the more recently introduced GL_NV_occlusion_query returns the number of pixels that would pass. Visibility of a given object is computed by issuing two queries with different depth functions: one passes the pixels that are in front of the current z-buffer contents, the other passes the pixels that are behind. The sum of the two queries approximates the projected size of the tree after clipping (it is an approximation because some pixels are counted twice due to object self-occlusion). LOD-0 of a tree is used for the occlusion queries. Multiple queries are issued to achieve parallelism between the CPU and the GPU.

3.3. Walk-through Framework

We propose the following framework for interactive forest walk-through using the occlusion buffer:

1. Render the viewpoint cell and all cells at distance 1 at the highest level of detail.
2. For distance d = 2 to CUTOFF_1 do
       Read the occlusion buffer
       For each cell at distance d within the frustum do
           Compute visibility
           Select level of detail
           Render trees
3. For distance d = CUTOFF_1 + 1 to CUTOFF_2 do
       For each cell at distance d within the frustum do
           Compute visibility
           Select level of detail
           Render trees
4. For distance d = CUTOFF_2 + 1 to MAXD do
       For each cell at distance d within the viewing frustum do
           Select the highest LOD used for d = CUTOFF_2
           Render trees

The occlusion buffer read in the second step is either the framebuffer or the stencil buffer. The occlusion buffer is read up to distance CUTOFF_1, visibility is computed up to distance CUTOFF_2, and trees are rendered up to distance MAXD. These limits can be selected by the user or computed at run-time. For example, the system could advance from Step 2 to Step 3 if the visibility of all trees at the current distance is below a pre-selected threshold.

The framework is simplified if the OpenGL occlusion query extension is utilized (which at the time of this paper was available only on a small selection of graphics cards):

1. Render the viewpoint cell and all cells at distance 1 at the highest level of detail.
2. While the highest LOD selected in the previous step is greater than LOD-THRESHOLD do
       For each cell at distance d within the frustum do
           Compute visibility
           Select level of detail
           Render trees

The while loop in Step 2 iterates as long as the highest level of detail selected in the previous step is above a user-specified threshold, LOD-THRESHOLD.
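The distance schedule of the occlusion-buffer framework can be sketched as follows. This is illustrative only: the real steps render cells, whereas this sketch merely records which action applies at each distance, to make the role of the two cutoffs concrete.

```cpp
#include <string>
#include <vector>

// Per-distance action schedule for the occlusion-buffer framework:
// distances 2..CUTOFF_1 read the occlusion buffer before computing
// visibility, distances up to CUTOFF_2 still compute visibility, and the
// remaining distances up to MAXD reuse the last LOD chosen at CUTOFF_2.
std::vector<std::string> schedule(int cutoff1, int cutoff2, int maxd) {
    std::vector<std::string> steps;
    steps.push_back("full LOD");  // viewpoint cell and distance-1 cells
    for (int d = 2; d <= cutoff1; ++d) steps.push_back("read buffer + visibility");
    for (int d = cutoff1 + 1; d <= cutoff2; ++d) steps.push_back("visibility only");
    for (int d = cutoff2 + 1; d <= maxd; ++d) steps.push_back("reuse last LOD");
    return steps;
}
```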

Figure 6: A forest scene at full level of detail


The following conditions were used to select levels of detail (size is the projected size in pixels):

LOD-12: visibility 60% to 100% and size > 60K.
LOD-8: visibility 40% to 60% and size > 60K.
LOD-4: visibility 20% to 40% or size < 20K.
LOD-0: visibility 5% to 20% or size < 1000.
No rendering for visibility under 5%.

Forests with 100, 400, and 1600 trees were traversed at an average of 6.5, 6.25, and 5.16 frames per second, respectively.

Figure 7: A forest scene using the proposed framework

Since occlusion queries using LOD-0 of a tree provide more accurate visibility estimates, a different set of conditions was used to select levels of detail in the occlusion query experiments:

LOD-12: visibility 55% to 100% and size > 10K.
LOD-8: visibility 20% to 55% and size > 10K.
LOD-4: visibility 10% to 20% or 1K < size
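For reference, the first set of conditions above (used with the occlusion buffer) can be written as a first-match selection function. Note that the published conditions leave some (visibility, size) combinations uncovered; falling back to LOD-4 in those cases is our assumption, not the paper's.

```cpp
// First-match evaluation of the occlusion-buffer LOD conditions.
// Returns the LOD index, or -1 when the tree is not rendered at all.
int selectLOD(double vis, long sizePixels) {
    if (vis < 0.05) return -1;                        // under 5%: not rendered
    if (vis >= 0.60 && sizePixels > 60000) return 12;
    if (vis >= 0.40 && sizePixels > 60000) return 8;
    if ((vis >= 0.20 && vis < 0.40) || sizePixels < 20000) return 4;
    if ((vis >= 0.05 && vis < 0.20) || sizePixels < 1000) return 0;
    return 4;                                         // assumed fallback (ours)
}
```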