A First Step towards Occlusion Culling in OpenSG PLUS

Dirk Staneker
WSI/GRIS, University of Tübingen, Germany

Abstract

The rapidly increasing size of datasets in scientific computing, mechanical engineering, and virtual medicine is quickly exceeding the graphics capabilities of modern computers. Toolkits for large model visualization address this problem by combining efficient geometric techniques, such as occlusion and visibility culling, mesh reduction, and efficient rendering. OpenSG PLUS is such a toolkit with support for large models. In this paper, we present three techniques for occlusion culling in OpenSG PLUS. The first technique uses the z-buffer to determine the visibility of a bounding box. The second technique uses the stencil-buffer to obtain visibility information, and the third technique exploits the HP Occlusion Culling Flag. All three techniques are conservative and work on arbitrary scenes without any geometric or topological assumptions.

CR Categories: I.3.3 [Picture/Image Generation]: Viewing Algorithms, Occlusion Culling; I.3.4 [Graphics Utilities]: Application Packages, Graphics Packages; I.3.7 [Three-Dimensional Graphics and Realism]: Hidden Line/Surface Removal

Keywords: Large Model Visualization, Toolkit, Visibility and occlusion culling.

1 INTRODUCTION

The datasets for visualization are growing faster than the rendering speed of modern graphics subsystems. Several techniques exist to address this problem. Most of them reduce the number of polygons; others use sampling techniques such as ray tracing or point sampling. To reduce the number of polygons, level-of-detail [Gar99] or impostor techniques are used. Another approach is occlusion culling, in which hidden parts of a scene are detected and excluded from the rendering process. In this paper, three different occlusion culling techniques for OpenSG PLUS are presented. OpenSG [OSG00] is a portable scene graph programming toolkit, started in 1999, with a focus on real-time rendering. With the OpenSG PLUS project, OpenSG will be enhanced with Large Scene Support, High Level Primitives and High Level Shading. The presented occlusion culling techniques are part of the Large Scene Support. This paper is organized as follows. The next section briefly reviews related visualization toolkits and other occlusion culling techniques. Section 2 describes how occlusion culling can be applied to OpenSG. Sections 3, 4 and 5 present three different approaches to image-space occlusion culling. In Section 6, we introduce the OSGviewer application, which was used to implement and test the presented techniques. Finally, the results are presented in Section 7 and conclusions are drawn in Section 8.

1.1 Related Work

Scene graph programming toolkits are widely available, e.g. Open Inventor, IRIS Performer, and Cosmo3D, but most of them have no support for occlusion culling. One of the scene graph programming toolkits that offers occlusion culling is Jupiter [HP98, BSS01]. Jupiter focuses on large model visualization and provides different concepts for managing large amounts of data. For occlusion culling, Jupiter uses the HP Occlusion Culling Flag [SOG98, BS99]. Other techniques are not yet available, but could be implemented. A wide range of algorithms for occlusion culling is available. However, not all algorithms work on every scene without extensive preprocessing. One of the well-known algorithms is the Hierarchical z-Buffer [GKM93], which uses hierarchical data structures for the depth buffer and the scene. Algorithms that use OpenGL to accelerate the calculations are also available [BMH99]. In [KS01], the histogram extension is mentioned as a means of occlusion culling. Another hardware extension is the HP Occlusion Culling Flag [HP97], which is only available on the HP VISUALIZE fx graphics subsystem. Similar extensions are available on a small number of other graphics subsystems.

2 Traversal and Sorting

For conservative occlusion culling, we must ensure that no pixel of an object classified as occluded is visible in screen-space. Hence, we approximate each object by its axis-aligned bounding box (AABB) for the occlusion culling test. If no pixel of the AABB is visible in screen-space, the content of the AABB must be hidden. However, if a pixel of the AABB is visible, its content is not necessarily visible. All three approaches presented in this paper work in image-space and use the z-buffer in some way for the occlusion culling test; thus, accurate z-buffer values are needed to get correct culling results. This requires front-to-back sorted rendering of the scene. Without front-to-back sorted rendering, results vary and most of the hidden parts are not found.

Figure 1: Example of a list for occlusion culling. [Figure: a scene graph with nodes 1 to 4, their arrangement along the z axis relative to the viewpoint and viewplane, and the resulting list.]

OpenSG manages axis-aligned bounding boxes for every node in the graph. These boxes can be used as bounding volumes for the applied culling approach. At the moment, only the AABBs of the geometry nodes are used. The OpenSG classes RenderAction [RBV01] and DrawAction render a scene graph and apply NodeAction callbacks during traversal. Because OpenSG lacks a depth-sorted traversal for front-to-back rendering, a customized implementation of DrawAction is used. During the traversal of the scene graph, every GeometryNode is collected by the NodeAction in a depth-sorted list. The depth-value for sorting is determined by the nearest vertex of the bounding box. In the DrawAction::stop() method, the depth-sorted list is traversed and one of the occlusion culling approaches is applied to every saved node. If a node's AABB is visible, the geometry is rendered; otherwise it is culled.
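The collect, sort and test loop described above can be illustrated with a minimal, self-contained sketch. It does not use the OpenSG API: the types Vec3, Aabb and GeometryEntry and the functions nearestCornerDepth, sortFrontToBack and renderWithCulling are hypothetical names introduced only for illustration. The sketch further assumes that the depth of the nearest vertex is measured along a normalized view direction.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical stand-ins for scene graph data; these are not OpenSG classes.
struct Vec3 { float x, y, z; };
struct Aabb { Vec3 min, max; };

struct GeometryEntry {
    int   nodeId;  // illustrative handle to a geometry node
    Aabb  box;     // world-space axis-aligned bounding box
    float depth;   // sort key, filled in by sortFrontToBack()
};

// Smallest distance of the eight AABB corners from the eye, measured
// along the (normalized) view direction. This metric is an assumption.
inline float nearestCornerDepth(const Aabb& b, const Vec3& eye, const Vec3& viewDir)
{
    assert(b.min.x <= b.max.x && b.min.y <= b.max.y && b.min.z <= b.max.z);
    float nearest = std::numeric_limits<float>::max();
    for (int i = 0; i < 8; ++i) {
        const Vec3 c{ (i & 1) ? b.max.x : b.min.x,
                      (i & 2) ? b.max.y : b.min.y,
                      (i & 4) ? b.max.z : b.min.z };
        const float d = (c.x - eye.x) * viewDir.x
                      + (c.y - eye.y) * viewDir.y
                      + (c.z - eye.z) * viewDir.z;
        nearest = std::min(nearest, d);
    }
    return nearest;
}

// Orders the collected geometry nodes front to back.
inline void sortFrontToBack(std::vector<GeometryEntry>& nodes,
                            const Vec3& eye, const Vec3& viewDir)
{
    for (GeometryEntry& n : nodes)
        n.depth = nearestCornerDepth(n.box, eye, viewDir);
    std::stable_sort(nodes.begin(), nodes.end(),
                     [](const GeometryEntry& a, const GeometryEntry& b) {
                         return a.depth < b.depth;
                     });
}

// Walks the sorted list: isVisible stands for any of the three occlusion
// tests, draw for rendering the node. Returns the number of culled nodes.
template <class VisibleFn, class DrawFn>
int renderWithCulling(std::vector<GeometryEntry>& nodes,
                      const Vec3& eye, const Vec3& viewDir,
                      VisibleFn isVisible, DrawFn draw)
{
    sortFrontToBack(nodes, eye, viewDir);
    int culled = 0;
    for (const GeometryEntry& n : nodes) {
        if (isVisible(n)) draw(n);
        else ++culled;
    }
    return culled;
}
```

Passing the visibility test as a callable keeps the traversal independent of the chosen occlusion test, so the same loop could drive a z-buffer, stencil-buffer or hardware-flag test.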

3 Occlusion Culling with Z-Buffer

The OpenGL z-buffer can be used to get the visibility information of an AABB, because it always holds the correct depth-value for every pixel. To test occlusion, the depth-values of the AABB are computed and tested against the values of a z-buffer maintained in software. Reading the OpenGL z-buffer with glReadPixels() is quite expensive; hence, this operation is split into fragments. Each fragment has the same size, which is a multiple of the data bus width to exploit memory alignment on the graphics card. A fragment is only read if it is needed for a pixel test. The test stops as soon as one visible pixel is found. Every fragment holds two flags, an invalid flag and an unused flag. At the beginning of every frame, all unused flags are true, and a pixel tested against such a fragment is always reported as visible without reading the OpenGL z-buffer. If a pixel is visible, the invalid flags of all fragments are set, because the geometry of the bounding box will be rendered and the content of the z-buffer may change (it would be enough to invalidate only the fragments with visible pixels). For pixels inside fragments whose invalid flag is set, we read the z-buffer and clear the invalid flag.

Figure 2: Z-buffer with marked fragments. The dark (blue) fragments are used for tests.

In many scenes it is not necessary to render every detail. For this approach, a minimum number of visible pixels can be set for a bounding box. Only if at least this number of pixels is visible is the complete bounding box classified as visible. This leads to a speedup with a minor reduction in rendering quality.

4 Occlusion Culling using the Stencil-Buffer

As mentioned in [BMH98], the stencil-buffer can be used to compute visibility information. During rasterization, writing to the frame- and z-buffer is disabled. For each pixel of the AABB, the z-buffer test is applied. If the pixel would be visible, a value is written to the stencil-buffer (see Figure 3) by using glStencilOp(). After rasterizing the AABB, the stencil-buffer is read and sampled in software. An occluded AABB would not contribute to the z-buffer and hence causes no corresponding entry in the stencil-buffer.

Figure 3: Occlusion test with the stencil-buffer. [Figure: viewpoint, viewplane and z axis with zmax; visible pixels are marked 1 in the stencil-buffer, all other pixels 0.]

The current implementation reads the whole region covered by the AABB. This could be optimized with fragments as in Section 3 or with the interleaved scanning scheme from [BMH98].

5 Occlusion Culling with the HP Occlusion Culling Flag

The HP Occlusion Culling Flag [HP97] is a small hardware extension which returns information about the visibility of an object. The AABB is rendered through the pipeline with color- and z-buffer writes disabled. If the result is a visible AABB (at least one pixel of the AABB would have triggered a z-buffer write), the content of the bounding volume has to be rendered. If more general bounding volumes are available, the HP Occlusion Culling Flag provides a very easy and one of the fastest ways of doing occlusion culling [BKS01].
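The stencil-buffer test and the hardware flag answer the same question: does any pixel of the rasterized AABB pass the depth test while all buffer writes are disabled? The following sketch emulates this in software to illustrate the semantics only. It is not the implementation used in this paper and makes no OpenGL calls. The types Buffers and Footprint and all function names are hypothetical, the AABB footprint is simplified to a screen-space rectangle at constant depth, and the depth test is assumed to be "less than". The optional minVisible parameter mirrors the minimum number of visible pixels discussed for the z-buffer approach.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Software stand-ins for the depth and stencil buffers (not OpenGL objects).
struct Buffers {
    int width, height;
    std::vector<float> depth;           // smaller value = closer to the viewer
    std::vector<std::uint8_t> stencil;  // 1 = AABB pixel passed the depth test
};

inline Buffers makeBuffers(int w, int h, float clearDepth = 1.0f)
{
    const std::size_t n = static_cast<std::size_t>(w) * static_cast<std::size_t>(h);
    return Buffers{ w, h, std::vector<float>(n, clearDepth),
                    std::vector<std::uint8_t>(n, std::uint8_t{0}) };
}

// Simplified footprint of a rasterized box: a rectangle at constant depth.
// x1 and y1 are exclusive. This simplification is an assumption.
struct Footprint { int x0, y0, x1, y1; float depth; };

inline void checkBounds(const Buffers& b, const Footprint& f)
{
    assert(0 <= f.x0 && f.x0 <= f.x1 && f.x1 <= b.width);
    assert(0 <= f.y0 && f.y0 <= f.y1 && f.y1 <= b.height);
}

// Renders real geometry: depth is written wherever the depth test passes.
inline void drawOccluder(Buffers& b, const Footprint& f)
{
    checkBounds(b, f);
    for (int y = f.y0; y < f.y1; ++y)
        for (int x = f.x0; x < f.x1; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * b.width + x;
            if (f.depth < b.depth[i]) b.depth[i] = f.depth;
        }
}

// Depth-tests every footprint pixel without writing depth or color and
// marks passing pixels in the stencil array. Returns the number of marks.
inline int markVisiblePixels(Buffers& b, const Footprint& f)
{
    checkBounds(b, f);
    int visible = 0;
    for (int y = f.y0; y < f.y1; ++y)
        for (int x = f.x0; x < f.x1; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * b.width + x;
            const bool pass = f.depth < b.depth[i];
            b.stencil[i] = pass ? 1 : 0;
            visible += pass ? 1 : 0;
        }
    return visible;
}

// The box counts as visible if at least minVisible pixels passed the test.
// minVisible = 1 keeps the test conservative; larger values trade accuracy
// for speed.
inline bool isBoxVisible(Buffers& b, const Footprint& f, int minVisible = 1)
{
    return markVisiblePixels(b, f) >= minVisible;
}
```

In this emulation the stencil array corresponds to what the stencil-buffer approach reads back and samples in software, while the boolean result of isBoxVisible corresponds to what a hardware flag would report directly.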

Figure 4: Latency for the HP Occlusion Culling flag on a Pentium III, 750 MHz with a VISUALIZE fx10. [Plot: latency in microseconds (0 to 700) over the number of pixels (0 to 60000), with and without backface culling.]

The performance of the HP Occlusion Culling Flag depends on the fill rate of the z-buffer. Larger bounding volumes need more time for the test, because the whole bounding volume always passes through the z-buffer stage of the rendering pipeline. Figure 4 shows the correlation between the size of a bounding volume in screen-space and the latency of an occlusion culling request. With backface culling enabled, the test is almost twice as fast as without, because only the front faces are scanned through the z-buffer. The latency is the same whether the result is visible or hidden.

6 OSGviewer

Figure 7: Framerates for the camera path. [Plot: frames per second (3 to 7) over the frame number (0 to 350) for the z-buffer test, the stencil-buffer test, the HP flag and no culling.]

Figure 5: Mainwindow of OSGviewer

OSGviewer is a small application that uses OpenSG as its rendering back end. The graphical user interface is implemented with Qt, and the QGLviewer [Mei00, QGL] is used for camera control. OSGviewer allows browsing and editing of the OpenSG scene graph. Multiple views of the scene are allowed, and QGLviewer enables the exchange of camera positions via drag and drop. This feature also allows camera positions to be recorded. With the application CameraPathInterpolation from the QGLviewer package, a camera path can be calculated and then played back in OSGviewer for reproducible performance measurements with different configurations.

Technique       Min. fps   Max. fps   Avg. fps   Deviation fps   Avg. speedup
No culling      3.58       3.85       3.77       0.03             0.0%
Stencil test    3.46       5.13       4.28       0.23            12.0%
Z-Buffer test   3.65       5.15       4.42       0.28            14.8%
HP Flag         4.44       6.67       5.70       0.41            33.8%

Table 1: Comparison between the used culling techniques

Figure 6: Sceneview of OSGviewer with QGLviewer

7 RESULTS

For all tests, we used the FormulaOne car from the Jupiter project. The model has about 750,000 polygons in 306 geometry nodes. A camera path with 342 frames was created with the tools from the QGLviewer. In every frame, the whole model is located within the view frustum; therefore, view frustum culling by itself does not remove any geometry. Figure 7 and Table 1 show the resulting framerates for the different occlusion culling techniques.

The first benchmark (no culling) shows the performance of OpenSG without any changes. In the second benchmark, we tested the performance of the stencil-buffer test. In the third test, the z-buffer technique was applied, and in the last one, the HP Occlusion Culling Flag was used. For all benchmark tests, a dual Pentium III with 750 MHz and an HP VISUALIZE fx10 running Linux was used for rendering. Although OpenSG supports threading, no threading was used for the tests. The resulting framerates show average speedups between 12% and 34%.

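Figure 8: Percentage of occluded nodes and polygons for the camera path. [Plot: percentage (0 to 100) of hidden polygons and hidden nodes over the frame number (0 to 350).]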

Occlusion culling generally depends on the scene and its depth complexity. Figure 8 shows the percentage of hidden nodes and polygons in every frame. The limited depth complexity of the test dataset (about 60% of the polygons are detected as hidden) leads to only a limited culling performance. In scenes with a higher depth complexity, better performance can be expected, as preliminary tests indicate. The benchmarks show that the HP Occlusion Culling Flag is the fastest solution in this test. The stencil- and z-buffer tests show similar results: the stencil test performs better in frames with lower depth complexity (less setup and rasterization time in software), while the z-buffer test is faster in frames with higher depth complexity, because the software z-buffer needs fewer updates (no update is necessary for hidden nodes).
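Table 1 does not state how the average speedup is computed. The sketch below is an inference from the tabulated numbers, not a definition taken from the measurements themselves: it assumes that the speedup is the relative reduction in average frame time with respect to the run without culling, and contrasts this with the relative gain in frame rate. Both function names are hypothetical.

```cpp
#include <cassert>

// Relative reduction in average frame time, in percent.
// Frame time is 1/fps, so the reduction is 1 - baselineFps / culledFps.
// Assumed to be the definition behind the "Avg. speedup" column.
inline double frameTimeReductionPercent(double baselineFps, double culledFps)
{
    assert(baselineFps > 0.0 && culledFps > 0.0);
    return (1.0 - baselineFps / culledFps) * 100.0;
}

// Relative gain in frame rate, in percent, shown for comparison.
inline double frameRateGainPercent(double baselineFps, double culledFps)
{
    assert(baselineFps > 0.0 && culledFps > 0.0);
    return (culledFps / baselineFps - 1.0) * 100.0;
}
```

With the average framerates of Table 1 (3.77 fps without culling; 4.28, 4.42 and 5.70 fps with culling), the first function yields about 11.9%, 14.7% and 33.9%, which agrees with the tabulated 12.0%, 14.8% and 33.8% to within about 0.1 percentage points, presumably because the tabulated framerates are rounded. The second function yields about 13.5%, 17.2% and 51.2%, so the two readings of "speedup" differ noticeably.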

8 CONCLUSIONS and FUTURE WORK

In this paper, we tested three occlusion culling techniques for OpenSG to speed up the rendering of large models. The scene graph hierarchy was derived from the hierarchical part list of the MCAD model. In cooperation with the University of Braunschweig, algorithms for special hierarchies will be developed. One of the major problems for the implementation of occlusion culling is the lack of enhanced traversal techniques in OpenSG. These will be available in the next public release of OpenSG. In conjunction with the enhanced traversal subsystem and the special hierarchies, a clean API for culling will be defined, similar to [Fue01]. Furthermore, only the Pentium III PC with a VISUALIZE fx was used for the tests. Other architectures and graphics subsystems are a focus of further development. We showed only image-based techniques, which do not exploit features of special scenes or applications. Another point of development in the near future will be the implementation of portal and virtual occluder nodes. The virtual occluder nodes will use shadow frusta as mentioned in [HMC+97]. The portal nodes will work like the portals for dynamic scenes. These new nodes can help to speed up special applications and scenes, e.g. architectural walkthroughs. To get further speedups, frame-to-frame coherence could be exploited, better bounding volumes as mentioned in [BKS01] could be used, and, in conjunction with special hierarchies and enhanced traversal schemes, hierarchical approaches could be applied to cull complete subgraphs.

ACKNOWLEDGEMENTS

This work is supported by the OpenSG PLUS project of the bmb+f in Germany. The MCAD datasets are available from the Kelvin project [Jup]. We would like to thank Dirk Reiners, Gerrit Voss and Johannes Behr for their help with OpenSG programming, Manfred Weiler for his SGI support and SGbrowser code, and Dirk Bartz and Alexander Ehlert from the University of Tübingen for proofreading.

References

[BKS01] D. Bartz, J. Klosowski, D. Staneker, Tighter Bounding Primitives for Better Occlusion Performance, Siggraph Visual Proc., 2001
[BMH98] D. Bartz, M. Meißner, T. Hüttner, Extending Graphics Hardware for Occlusion Queries in OpenGL, In Proc. of Eurographics/SIGGRAPH Workshop on Graphics Hardware, pages 97-104, Lisboa, Portugal, August 1998
[BMH99] D. Bartz, M. Meißner, T. Hüttner, OpenGL-assisted Occlusion Culling of Large Polygonal Models, Computers and Graphics - Special Issue on Visibility - Techniques and Applications, 23(5): 667-679, 1999
[BS99] D. Bartz, M. Skalej, VIVENDI - A Virtual Ventricle Endoscopy System for Virtual Medicine, EG/TVCG Symposium on Visualization, pages 155-166, 1999
[BSS01] D. Bartz, D. Staneker, W. Straßer, Jupiter: A Toolkit for Interaction and Large Model Visualization, Proc. of Symposium on Parallel and Large Data Visualization and Graphics, 2001
[CCDS2000] D. Cohen-Or, Y. Chrysanthou, F. Durand, C. Silva, Visibility: Problems, Techniques, and Applications, In ACM SIGGRAPH Course 4, 2000
[Cla76] J. Clark, Hierarchical Geometric Models for Visible Surface Algorithms, Communications of the ACM, Vol. 19, No. 10, 1976
[Dur99] F. Durand, 3D Visibility: Analytical Study and Applications, PhD thesis, Universite Joseph Fourier, Grenoble, France, 1999
[Fue01] C. Fünfzig, Design of a flexible visibility library (draft), Computer Graphics, Digital Library Lab, Braunschweig Technical University, 2001
[Gar99] M. Garland, Multiresolution Modeling: Survey and Future Opportunities, In Eurographics STAR report 2, 1999
[GCS91] Z. Gigus, J. Canny, R. Seidel, Efficiently Computing and Representing Aspect Graphs of Polyhedral Objects, IEEE Transactions on Pattern Analysis and Machine Intelligence, 13(6), pages 542-551, 1991
[GKM93] N. Greene, M. Kass, G. Miller, Hierarchical z-buffer visibility, Proc. of ACM Siggraph, pages 231-238, 1993
[GL] OpenGL, Manual Pages, http://www.opengl.org
[GL99] OpenGL, Reference Manual, 3rd Edition, Longman Higher Education, 1999
[HMC+97] T. Hudson, D. Manocha, J. Cohen, M. Lin, K. Hoff, H. Zhang, Accelerated Occlusion Culling using Shadow Frusta, Proc. of ACM Symposium on Computational Geometry, 1997
[HP97] Hewlett-Packard Company, GL HP Occlusion Test, Specification Document, available at http://www.opengl.org/
[HP98] Hewlett-Packard Company, Jupiter 1.0 Specification, Draft F, 24.01.1998
[HP00] Hewlett-Packard Company, HP IA32 VISUALIZE fx5 and fx10 graphics accelerators, White Paper, HP 2000
[Jup] The Kelvin Project, http://www.gris.uni-tuebingen.de/ kelvin/
[KS01] J. Klosowski, C. Silva, Efficient Conservative Visibility Culling Using The Prioritized-Layered Projection Algorithm, IEEE Transactions on Visualization and Computer Graphics, 2001 (to appear)
[MBH+99] M. Meißner, D. Bartz, T. Hüttner, G. Müller, J. Einighammer, Generation of Subdivision Hierarchies for Efficient Occlusion Culling of Large Polygonal Models, Technical Report ISSN 0946-3852, Universität Tübingen, WSI-99-13, 1999
[Mei00] M. Meißner, Occlusion Culling and Hardware Volume Rendering, Dissertation der Fakultät für Informatik der Eberhard-Karls-Universität Tübingen, 2000
[OSG00] OpenSG Forum, OpenSG - Open Source Scenegraph, http://www.opensg.org, 2000
[QGL] QGLviewer, http://www.qglviewer.de/
[RBV01] D. Reiners, J. Behr, G. Voss, OpenSG Starter Guide Version 1.0, OpenSG 1.0 Source Release, 2001
[Sev99] K. Severson, VISUALIZE fx Graphics Accelerator Hardware, Technical Report, Hewlett-Packard Company, http://www.hp.com/workstations/support/documentation/whitepapers.html, 1999
[SGI99] SGI, Silicon Graphics Visual Workstation OpenGL Programming, Technical Report, SGI, 1999
[SOG98] N. D. Scott, D. M. Olsen, E. W. Gannett, An Overview of the VISUALIZE fx Graphics Accelerator Hardware, The Hewlett-Packard Journal, May 1998
[TS91] S. Teller, C.H. Sequin, Visibility Preprocessing for Interactive Walkthroughs, Proc. of ACM Siggraph, pages 61-69, 1991
[WND99] M. Woo, J. Neider, T. Davis, Open GL Programming Manual, 3rd Edition, Addison Wesley Longman Publishing, 1999
[ZMHH97] H. Zhang, D. Manocha, T. Hudson, K. Hoff, Visibility Culling using Hierarchical Occlusion Maps, Proc. of ACM Siggraph, pages 77-88, 1997