First Edition 2008 © DAUT DAMAN, MOHD SHAHRIZAL SUNAR & MUHAMAD NAJIB ZAMRI 2008
Hak cipta terpelihara. Tiada dibenarkan mengeluar ulang mana-mana bahagian artikel, ilustrasi, dan isi kandungan buku ini dalam apa juga bentuk dan cara apa jua sama ada dengan cara elektronik, fotokopi, mekanik, atau cara lain sebelum mendapat izin bertulis daripada Timbalan Naib Canselor (Penyelidikan dan Inovasi), Universiti Teknologi Malaysia, 81310 Skudai, Johor Darul Ta’zim, Malaysia. Perundingan tertakluk kepada perkiraan royalti atau honorarium. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical including photocopy, recording, or any information storage and retrieval system, without permission in writing from Universiti Teknologi Malaysia, 81310 Skudai, Johor Darul Ta’zim, Malaysia. Perpustakaan Negara Malaysia
Cataloguing-in-Publication Data
Advances in computer graphics & virtual environment. (Vol.1) / editors Daut Daman, Mohd. Shahrizal Sunar, Muhamad Najib Zamri. ISBN 978-983-52-0626-9 1. Computer graphics. I. Daut Daman. II. Mohd Shahrizal Sunar. III. Muhamad Najib Zamri. 006.6 Editor: Daut Daman dan Rakan-rakan Pereka Kulit: Mohd Nazir Md. Basri & Mohd Asmawidin Bidin Diatur huruf oleh / Typeset by Fakulti Sains Komputer & Sistem Maklumat
Diterbitkan di Malaysia oleh / Published in Malaysia by PENERBIT UNIVERSITI TEKNOLOGI MALAYSIA
34 – 38, Jln. Kebudayaan 1,Taman Universiti 81300 Skudai, Johor Darul Ta’zim, MALAYSIA. (PENERBIT UTM anggota PERSATUAN PENERBIT BUKU MALAYSIA/ MALAYSIAN BOOK PUBLISHERS ASSOCIATION dengan no. keahlian 9101) Dicetak di Malaysia oleh / Printed in Malaysia by UNIVISION PRESS SDN. BHD
Lot. 47 & 48, Jalan SR 1/9, Seksyen 9, Jalan Serdang Raya, Taman Serdang Raya, 43300 Seri Kembangan, Selangor Darul Ehsan, MALAYSIA.
CONTENTS
Preface  vii

CHAPTER 1  RANGE DETECTION TECHNIQUE FOR REAL-TIME VIRTUAL HERITAGE APPLICATION  1
Mohd Shahrizal Sunar, Abdullah Mohd Zin, Tengku Mohd Tengku Sembok

CHAPTER 2  OVERVIEW OF CROWD SIMULATION IN COMPUTER GRAPHICS  17
Mohamed `Adi Mohamed Azahar, Daut Daman

CHAPTER 3  NON-PHOTOREALISTIC RENDERING FOR OUTDOOR SCENE  31
Irene Liew Suet Fun, Mohd Shahrizal Sunar

CHAPTER 4  REAL-TIME TERRAIN RENDERING AND VISUALIZATION BASED ON HIERARCHICAL METHOD  45
Muhamad Najib Zamri

CHAPTER 5  3D AVATAR MOVEMENT AND NAVIGATION  57
Ahmad Hoirul Basori, Daut Daman, Abdullah Bade

CHAPTER 6  3D GRAPHIC SCENE MANAGEMENT IN DRIVING SIMULATOR  71
Mohd Khalid Mokhtar, Mohd Shahrizal Sunar

CHAPTER 7  AUGMENTED REALITY THEORY AND APPLICATIONS  87
Ajunewanis Ismail, Zakiah Noh

CHAPTER 8  AUGMENTED REALITY SYSTEM DEVELOPMENT: FROM SIMPLE TO ADVANCE  111
Ahmad Hoirul Basori, Mohd Shahrizal Sunar, Daut Daman

INDEX  129
PREFACE
Computer graphics is the most significant element in the development of virtual environments. A virtual environment is a world presented by a particular virtual reality system, and the goal of computer graphics is to produce attractive computer-generated images of that environment. Various new techniques have been proposed and developed by researchers and practitioners from around the world. Research in computer graphics is concerned with improving several aspects of image generation, especially realism, accuracy, effectiveness and efficiency. Our goal in writing this book is to provide a compilation of computer graphics and virtual environment research works, from fundamental concepts to advanced research and development (R&D). Throughout this book, we wish to share our knowledge and explore the potential of the computer graphics and virtual environment fields to contribute to many related research domains. Our studies span a wide range of aspects, including the design and implementation of the applications that enable virtual reality systems to be produced. We hope this book provides greater understanding of, and interest in, computer graphics and virtual environments. The book is not a tutorial on how to develop an entire computer graphics and virtual environment system; rather, it gives a general picture of related computer graphics research works and thereby helps ignite new ideas in researchers in associated fields. Furthermore, instead of covering programming aspects, we focus more on the applications, interaction, system integration, usability and issues of computer graphics and virtual environments. This book also does not provide a comprehensive overview of the medium elements of computer
graphics and virtual environments. The contents of this book are useful as a reference, guideline and knowledge resource for researchers and students seeking in-depth information in the related disciplines. The target readers are technically knowledgeable but unfamiliar with how to apply advanced computer graphics techniques to their particular areas of interest. We hope the information presented here will drive readers to go further in their learning and development process. In this second volume, we present the readers with various kinds of rendering techniques. A range detection optimization technique based on visibility culling is introduced. Then, crowd rendering techniques for developing simulation systems are explained in detail. Subsequently, non-photorealistic rendering (NPR) is proposed for virtual outdoor scenes in order to produce cartoon-based rendering and visualization. This volume also presents terrain rendering and optimization techniques that involve a real-time visualization and navigation system on a single personal computer platform. The use of rendering techniques in character animation is then applied in developing a 3D avatar movement and navigation system. The scope of this book also includes the development of 3D scene management for a driving simulator, which helps reduce the computational cost and handle various types of objects in a scene systematically. Finally, the theory, applications and development of augmented reality are reviewed in order to build up state-of-the-art and immersive virtual reality (VR) systems.

Mohd Shahrizal Sunar
Muhamad Najib Zamri
Department of Computer Graphics and Multimedia
Faculty of Computer Science and Information System
Universiti Teknologi Malaysia
2008
1
RANGE DETECTION TECHNIQUE FOR REAL-TIME VIRTUAL HERITAGE APPLICATION

Mohd Shahrizal Sunar
Abdullah Mohd Zin
Tengku Mohd Tengku Sembok
INTRODUCTION
Simulating the real world demands real-time performance and realistic visual appearance. In most real-time computer graphics applications, such as virtual walkthroughs, flight simulators and games, maintaining an interactive frame rate and smoothness of movement is of utmost importance. This is essential to give the user an immersive experience in which movement remains smooth during navigation of the virtual world. Realism, as defined in Chiu et al. (1994), means how closely the image the user perceives matches the experience of the real world. According to Dzwig (1988), generating a realistic 3D world requires increasing the geometric complexity of the 3D models, a point with which Ramasubramanian (1999) agrees. The need for rich textures simultaneously increases memory usage (Airey 1990). Angelidis (2001) noted that there is a trade-off between realism and real-time performance, and between visual quality and rendering speed: increasing realism reduces processing speed. Therefore, when speed is more crucial, realism may have to be sacrificed to some degree. Most virtual reality applications need real-time performance more than realism, because the user interacts heavily while navigating the application. However,
as Luebke (2001) agrees, realism can still be largely preserved without sacrificing real-time performance. This chapter discusses optimization techniques in computer graphics and why they are needed. Visibility culling is one of the methods that can reduce the high computational cost of real-time computer graphics applications. As defined in Moller (2002), view frustum culling (VFC) is the simplest visibility culling method compared with the others, i.e. back-face culling (Zhang 1997) and occlusion culling (Bartz 1999; Aila 2005; Zhang 1998): only the objects that are inside the frustum are displayed. This chapter focuses on the construction of an alternative view frustum algorithm for effective rendering and examines a new improvement on the view frustum culling approach by Placeres (2005). The approach, defined here as a range detection approach, effectively eliminates the unseen objects in the virtual environment. This chapter also describes how acceleration techniques were developed for the Ancient Malacca Virtual Walkthrough project, which focuses on the modelling and visualization of Malacca city in the 15th century. It is based on local and foreign sources such as the Sejarah Melayu (Ahmad 1979) and the account of the Portuguese writer Pires (1944), who described the city and the empire as an opulent and prosperous centre of maritime Malay civilization. The next section describes work related to the view frustum culling algorithm and some similar virtual heritage projects. View frustum culling is then defined in detail, and the six planes of the view frustum are illustrated. After that, the range detection approach is explained. Finally, statistical results are presented.
RELATED WORKS

Visibility is a wide research area in computer graphics and has been studied since the field's early era. The original idea of efficient view frustum culling was invented by Clark (1976), who used a hierarchical approach for visible-surface determination. When computer graphics became more popular in the 1990s, the algorithms also started to grow (Cohen 2003). We focus here on the evolution of view frustum culling techniques. View frustum culling algorithms can be divided into two common approaches. The first approach transforms the bounding volume into perspective space and performs the test in the perspective coordinate system, as initiated in Bishop (1998). The second approach checks the bounding volume against the view frustum volume represented by its six clipping planes, according to Hoff (1999) and in agreement with Greene (1994). The advantage of this approach is that it allows trivial acceptance and rejection; the test then continues recursively based on the intersection between a box and the frustum. Greene (1998) showed how to detect the intersection of a rectangular solid and a convex polyhedron, and also introduced a fast intersection method between a polygon and a cube. A view frustum culling method using a probabilistic caching scheme was developed by Slater (1997). The algorithm is implemented using a Binary Space Partition tree, and a statistical probability representation is used to obtain results faster than the hierarchical bounding box scheme, although the method produced unacceptable errors in some cases. Bittner (1998) improved view frustum culling based on the traversal coherency of the scene hierarchy: certain interior nodes of the hierarchy are skipped during the intersection test, so objects that are expected to remain visible are not re-tested. Optimized bounding boxes for view frustum culling were explored by Assarsson (2001), who exploited the low degrees of freedom of camera motion in user-driven walkthrough scenes. As stated in Assarsson (1999) and Slater (1997), both algorithms are similar
in that both try to minimize the work by caching information and avoiding the more expensive intersection tests. The view frustum culling algorithms were further improved by Placeres (2005), who used a slightly different way of defining the view frustum volume: the frustum is identified from the camera's reference point and its properties.
ANALYSIS OF VIEW FRUSTUM CULLING TECHNIQUE

The view frustum can be illustrated as a truncated pyramid representing the user's scope of visible view. Essentially it is the field of view of the virtual camera, which determines the region of the virtual environment that may appear on the screen during run-time. A view frustum is defined by six planes:

    π_i : n_i · x + d_i = 0,   i = 0, ..., 5                                   (1)

where n_i is the normal and d_i is the offset of plane π_i, and x is an arbitrary point on the plane. We say that a point x is outside a plane π_i if n_i · x + d_i > 0. If the point is inside all six planes, then the point is inside the view frustum. To generate a view frustum, a pyramid with six surfaces is used. Two surfaces represent the nearest and farthest viewing distances and bound the region of visible objects; the other four surfaces represent the top, bottom, left and right boundaries. In our case, the view frustum is always set in perspective view mode. View frustum culling (VFC) is important because it culls away the invisible (unseen) objects of complex scenes from the rendering pipeline. It saves memory usage and definitely increases the frame rate. View frustum culling is typically used in virtual reality software, walkthrough systems, 3D games and serious
visualization such as medical and military simulation, for efficient rendering. Each frustum plane is tested to determine whether the object is inside or outside the view frustum. Only primitives that are totally or partially inside the view frustum need to be rendered. The three possible output states of the bounding box versus view frustum test are:

OUTSIDE: All eight points of the bounding box are outside the view frustum, so the object is eliminated from further processing.

INSIDE: All eight points of the bounding box are inside the view frustum, so no further culling calculation is needed and the object is included for further processing. This state lets the application avoid unnecessary computation.

INTERSECT: Some points of the bounding box are inside the view frustum and some are outside, so a downward traversal of the hierarchy is required and part of the object geometry is included for further processing. This state incurs an expensive computational cost; for speed, we treat it as INSIDE.
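A minimal sketch of this three-state classification is given below, assuming the frustum planes are stored with outward-pointing normals so that a positive signed distance means a corner lies outside that plane; the type and function names are illustrative, not taken from the project code.

```cpp
#include <array>

struct Vec3 { float x, y, z; };
struct Plane { Vec3 n; float d; };           // plane: n·x + d = 0, n points outward

enum class CullState { Outside, Inside, Intersect };

static float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// Classify an AABB (given by its eight corners) against the six frustum planes.
CullState classifyBox(const std::array<Vec3, 8>& corners,
                      const std::array<Plane, 6>& frustum) {
    bool allInside = true;
    for (const Plane& p : frustum) {
        int outCount = 0;
        for (const Vec3& c : corners)
            if (dot(p.n, c) + p.d > 0.0f)      // positive distance: outside this plane
                ++outCount;
        if (outCount == 8) return CullState::Outside;   // trivially rejected
        if (outCount > 0) allInside = false;            // straddles at least one plane
    }
    return allInside ? CullState::Inside : CullState::Intersect;
}
```

In practice the INTERSECT result is simply treated as INSIDE, as noted above, so the object is sent on for rendering without a finer traversal.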
RANGE DETECTION TECHNIQUES

The conventional view frustum culling test discussed in the previous section is based on the idea that the view frustum volume is bounded by six planes. The test is performed against every single plane, using the plane equation, to detect the bordering objects for viewing; each object is compared six times per frame to resolve its visibility. This section introduces the alternative approach by Placeres (2005), who observed that the conventional VFC
test is performed for every point in the virtual environment. The improved approach is based on the formula given in (2):

    n(E_j) = N − Σ_{i=1}^{m} n(E_i),   E_i ∈ j^c                               (2)

where n(E_j) is the number of objects in group j, i.e. the number of objects sent to the frame buffer; N is the total number of objects in the virtual walkthrough application; j is the index of the group; i counts the objects that are in N but not in group j; m is the number of objects in N but not in group j; and j^c denotes the complement of j. This section describes the advantages of this approach, which is named the Range Detection Technique (RDT).
Figure 1.1 Six objects in the virtual environment, each bounded by a sphere and an AABB
The RDT test consists of three steps: first, the front object test; second, the sphere bounding test; and third, the AABB bounding test. The front object test eliminates objects that are not facing the camera view. The steps are ordered this way because sphere testing is faster than AABB testing. Every
village house and tree in the scene is bounded by a sphere and an AABB. The bounding sphere is generated from the centre point and radius of each object in the application. To generate an AABB, the minimum and maximum points of each object are needed; from these two points, the box's remaining six corner points are generated. Therefore, for each object, the eight points defining the AABB are tested against the range of the camera's visible boundary. In the front object test, only object A is eliminated. After the sphere test is done using RDT, objects D, E and F are kept for the frame buffer. Only objects D and F are displayed as output when the AABB test is finished.
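A small sketch of how such bounding volumes might be built from an object's vertex list follows; the centroid-based sphere fit and the helper types are assumptions made for illustration, not the project's actual code.

```cpp
#include <vector>
#include <array>
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

struct BoundingSphere { Vec3 center; float radius; };
struct AABB { Vec3 min, max; };

// Bounding sphere from the vertex centroid and the farthest vertex from it
// (assumes a non-empty vertex list).
BoundingSphere makeSphere(const std::vector<Vec3>& verts) {
    Vec3 c{0.0f, 0.0f, 0.0f};
    for (const Vec3& v : verts) { c.x += v.x; c.y += v.y; c.z += v.z; }
    const float n = static_cast<float>(verts.size());
    c.x /= n; c.y /= n; c.z /= n;
    float r2 = 0.0f;
    for (const Vec3& v : verts) {
        float dx = v.x - c.x, dy = v.y - c.y, dz = v.z - c.z;
        r2 = std::max(r2, dx * dx + dy * dy + dz * dz);
    }
    return { c, std::sqrt(r2) };
}

// AABB from per-axis minima and maxima of the vertices.
AABB makeAABB(const std::vector<Vec3>& verts) {
    AABB b{ verts[0], verts[0] };
    for (const Vec3& v : verts) {
        b.min.x = std::min(b.min.x, v.x); b.max.x = std::max(b.max.x, v.x);
        b.min.y = std::min(b.min.y, v.y); b.max.y = std::max(b.max.y, v.y);
        b.min.z = std::min(b.min.z, v.z); b.max.z = std::max(b.max.z, v.z);
    }
    return b;
}

// Expand the min/max pair into the eight corner points used in the AABB test.
std::array<Vec3, 8> corners(const AABB& b) {
    return {{ {b.min.x, b.min.y, b.min.z}, {b.max.x, b.min.y, b.min.z},
              {b.min.x, b.max.y, b.min.z}, {b.max.x, b.max.y, b.min.z},
              {b.min.x, b.min.y, b.max.z}, {b.max.x, b.min.y, b.max.z},
              {b.min.x, b.max.y, b.max.z}, {b.max.x, b.max.y, b.max.z} }};
}
```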
Figure 1.2 The range for point P in the z coordinate
The next step is to determine whether the point lies within the visibility range. As shown in Figure 1.2, P is the point being tested against the view frustum. The virtual camera properties, such as its location, look-up vector and orientation, are used to check the distance between point P and the camera. The camera reference frame is formed by three unit vectors X, Y and Z, with the camera position labelled C. First, the z-coordinate of P in this frame is checked. To find the z-coordinate of P, we need the vector that goes from C to P, and then the length of the projection of this vector onto Z. This is computed with a dot product, where Z is a unit vector.
    v = P − C                                                                   (3)

    P.z = v · Z                                                                 (4)

If the z-coordinate is within the range between the near and far distances, then P is possibly inside the view frustum and the x- and y-coordinates must be tested. Otherwise, P is definitely outside the view frustum.

    near ≤ P.z ≤ far                                                            (5)

To find the x- and y-coordinates of P in the camera orientation coordinates, the same procedure is followed:

    P.y = v · Y                                                                 (6)

    P.x = v · X                                                                 (7)
Figure 1.3 Checking the y coordinate of P
Figure 1.3 shows the y-coordinate of P being checked against the z-coordinate of P in order to find the height range of the view frustum. Therefore, as represented in Figure 1.4, the P.y value is compared to this height to test whether it is within the view frustum range.
Figure 1.4 P.y is compared to the height to test whether it is within the view frustum range
In order to get the height range of the view frustum at the depth of P, the following trigonometric formula is used, where fov is the camera's vertical field-of-view angle:

    h = 2 · P.z · tan(fov / 2)                                                  (8)

P is within the vertical range of the view frustum if P.y is larger than −h/2 and smaller than h/2:

    −h/2 ≤ P.y ≤ h/2                                                            (9)

Then, the x-coordinate of P is tested just as easily by checking P.x against the view frustum width. The width w is computed from the height and the aspect ratio, where the aspect ratio is the number of horizontal pixels divided by the number of vertical pixels displayed by the system:

    w = h · aspect,   −w/2 ≤ P.x ≤ w/2                                          (10)
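Putting equations (3) to (10) together, the point-in-range test might look like the sketch below; the structure layout and names (e.g. tanFovHalf, aspect) are assumptions made for illustration, with the camera basis vectors assumed to be kept up to date each frame.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

static float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}
static Vec3 sub(const Vec3& a, const Vec3& b) {
    return { a.x - b.x, a.y - b.y, a.z - b.z };
}

// Camera frame and projection parameters (illustrative layout).
struct Camera {
    Vec3 pos;            // C
    Vec3 X, Y, Z;        // orthonormal camera basis, Z along the view direction
    float nearDist, farDist;
    float tanFovHalf;    // tan(vertical field of view / 2)
    float aspect;        // horizontal pixels / vertical pixels
};

// Range-detection-style point test: project P onto the camera axes and compare
// against the near/far range, the height range h and the width range w.
bool pointInFrustum(const Camera& cam, const Vec3& P) {
    Vec3 v = sub(P, cam.pos);                     // v = P - C             (3)

    float pz = dot(v, cam.Z);                     // P.z = v . Z           (4)
    if (pz < cam.nearDist || pz > cam.farDist)    // near <= P.z <= far    (5)
        return false;

    float h = 2.0f * pz * cam.tanFovHalf;         // height at depth P.z   (8)
    float py = dot(v, cam.Y);                     // P.y = v . Y           (6)
    if (py < -0.5f * h || py > 0.5f * h)          // -h/2 <= P.y <= h/2    (9)
        return false;

    float w = h * cam.aspect;                     // width at depth P.z    (10)
    float px = dot(v, cam.X);                     // P.x = v . X           (7)
    return px >= -0.5f * w && px <= 0.5f * w;     // -w/2 <= P.x <= w/2
}
```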
RESULT

This section describes the implementation and testing results of the normal six-plane VFC, Placeres's approach and RDT in the Ancient Malacca Virtual Walkthrough project. The virtual heritage environment is built using C++, OpenGL, GLVU and GLUI. The 3D objects of Ancient Malacca are modelled with 3D Studio Max and are adapted from a previous project that ran on an SGI Onyx machine. An Intel Pentium 4 3.2 GHz with 512 MB RAM and an nVidia GeForce FX5950 graphics display card are used to test the VFC in the virtual walkthrough application. For testing purposes, it is essential to fix the camera movement path in the virtual environment; this ensures that the camera follows the same path in every test with each different culling method. Figure 1.5 shows the difference in the number of objects drawn on the screen with and without RDT. All objects in the virtual walkthrough appear in blue, and the red dot in the figure represents the camera position. As shown in the top figure, all objects are drawn when RDT is not used, which may incur a high computational cost. With RDT, the virtual walkthrough reduces the number of objects sent to the frame buffer by more than three quarters, as shown in Figure 1.5 (bottom). Figure 1.6 shows a screenshot of the Ancient Malacca Virtual Heritage application, which uses RDT to determine the potentially visible objects effectively.
Figure 1.5 The virtual walkthrough executed with (bottom) and without (top) RDT
Figure 1.6 Screenshot of the Ancient Malacca Virtual Heritage application that implemented the RDT
Figure 1.7 Results of frames per second over the test path (frames 0–3400) for the conventional VFC, Placeres (2005), RDT and no VFC
The frames-per-second test is performed on the Ancient Malacca virtual walkthrough. The purpose of the test is to measure the performance and interactivity of the virtual walkthrough system with and without RDT; it also shows the difference in smoothness during application run-time with RDT. Figure 1.7 shows the frames-per-second results. The frame rate rises when the camera reaches areas with few objects to render, and drops when the camera reaches areas crowded with objects. The results show that RDT gives the highest frame rates of all the methods during the run-time test (see also Figure 1.8). Without VFC, the frame rate remains at 23 to 24 frames per second, which is below the expected real-time rendering standard.
Figure 1.8 Average time (s) to finish one round of the testing path: No VFC 149.81, Conventional VFC 36.02, Placeres 32.39, RDT 30.59. An advantage of RDT is that the time taken to finish one round of the testing path is shorter than with the normal VFC
The testing procedure is run five times for each approach, including the case with no view frustum culling, and the average of the results is shown in the bar chart. As a result, RDT is the fastest of all the approaches. RDT was also tested on different categories of computer system. Figure 1.9 shows the frames-per-second (fps) results for RDT running on computer systems with different specifications. Category 1 is the highest specification, which runs the virtual heritage application at the highest fps. The results also show that RDT improves the application speed even when running on a lower specification such as Category 3. The specification categories of the computer systems used for testing in this research are as follows:

Category 1: Intel Pentium 4 3.2 GHz (HT), 2046 MB RAM, nVidia GeForce FX5950U with 256MB VRAM.
Category 2: Intel Pentium 4 3.0 GHz, 1024 MB RAM, nVidia GeForce FX5200 with 128MB VRAM.

Category 3: Intel Pentium 4 1.9 GHz, 512 MB RAM, ATI Radeon 7500 VRAM.

Figure 1.9 Results of frames per second (frames 0–1800) for RDT running on different system categories
REFERENCES
Ahmad A. S., 1979, "Sulatus Sulatin (Sejarah Melayu)". Dewan Bahasa dan Pustaka.
Aila T., 2005, "Efficient algorithms for occlusion culling and shadows", Ph.D. dissertation, Helsinki University of Technology, Helsinki, Finland, Feb.
Airey J. M., Rohlf J. H., and Brooks F. P., 1990, "Towards image realism with interactive update rates in complex virtual building environments", vol. 24, no. 2, Mar, pp. 41–50.
Angelidis A. and Fouquier G., 2001, "Visualization issues in virtual environments: from computer graphics techniques to intentional visualization," in WSCG'2001, pp. 90–98.
Assarsson U., 2001, "View Frustum Culling and Animated Ray Tracing: Improvements and Methodological Considerations". Ph.D. thesis, Chalmers University of Technology, Göteborg, Sweden.
Assarsson U. and Möller T., 1999, "Optimized view frustum culling algorithms," Chalmers University of Technology, Tech. Rep. 99-3, March.
Bartz D., Meißner M., and Hüttner T., 1999, "OpenGL-assisted occlusion culling for large polygonal models", Computers and Graphics, vol. 23, no. 5, pp. 667–679.
Bittner J., Havran V., and Slavik P., 1998, "Hierarchical visibility culling with occlusion trees". Proc. of Computer Graphics International, Jun, pp. 207–219.
Bishop L., Eberly D., Whitted T., Finch M., and Shantz M., 1998, "Designing a PC game engine," Computer Graphics in Entertainment, pp. 46–53.
Chiu K. and Shirley P., 1994, "Rendering, complexity, and perception," in Proceedings of the 5th Eurographics Rendering Workshop, Darmstadt, June, pp. 19–34.
Clark J. H., 1976, "Hierarchical geometric models for visible surface algorithms," Communications of the ACM, vol. 19, no. 10, pp. 547–554.
Cohen-Or D., Chrysanthou Y., Silva C., and Durand F., 2003, "A survey of visibility for walkthrough applications," Course 30, SIGGRAPH Course Notes.
Dzwig P., 1988, "Complex scene generation," in IEE Colloquium on Practical Applications of Parallel Signal Processing. London, UK: IEE, November, pp. 1–7.
Greene N., 1994, "Detecting intersection of a rectangular solid and a convex polyhedron". San Diego, CA, USA: Academic Press Professional, Inc., pp. 74–82.
Hoff K. E., 1996, "A 'fast' method for culling of oriented bounding boxes (OBBs) against a perspective viewing frustum in large 'walkthrough' models," May 1996. [Online].
Luebke D. P., 2001, "A developer's survey of polygonal simplification algorithms," IEEE Computer Graphics and Applications, vol. 21, no. 3, pp. 24–35.
Moller T. and Haines E., 2002, Real-Time Rendering. Natick, MA, USA: A.K. Peters, Ltd.
Placeres F. P., 2005, "Improved frustum culling," in K. Pallister (Ed.), Game Programming Gems 5, pp. 65–77. Massachusetts: Charles River Media.
Ramasubramanian M., Pattanaik S. N., and Greenberg D. P., 1999, "A perceptually based physical error metric for realistic image synthesis", in SIGGRAPH '99: Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques. New York, NY, USA: ACM Press/Addison-Wesley Publishing Co., pp. 73–82.
Slater M. and Chrysanthou Y., 1997, "View volume culling using a probabilistic caching scheme", in VRST '97: Proceedings of the ACM Symposium on Virtual Reality Software and Technology.
Zhang H., 1998, "Effective occlusion culling for the interactive display of arbitrary models". Ph.D. dissertation, University of North Carolina, Chapel Hill.
Zhang H. and Hoff K. E., 1997, "Fast backface culling using normal masks," in Symposium on Interactive 3D Graphics, pp. 103–106.
2
OVERVIEW OF CROWD SIMULATION IN COMPUTER GRAPHICS

Mohamed `Adi Mohamed Azahar
Daut Daman
INTRODUCTION
The use of computer graphics in high-powered technology for education, entertainment, games, simulation, and virtual heritage applications has led it to become an important area of research. In simulation, according to Tecchia et al. (2002), it is important to create an interactive, complex, and realistic virtual world so that the user can have an immersive experience while navigating through the world. As the size and complexity of the environments in the virtual world increase, it becomes more necessary to populate them with people, and this is why rendering crowds in real time is crucial. Generally, crowd simulation consists of three important areas: behavioural realism (Thompson and Marchant 1995), high-quality visualization (Dobbyn et al. 2005), and the convergence of both. Behavioural realism is mainly used for simple 2D visualizations, because most of the attention is concentrated on simulating the behaviours of the group. High-quality visualization is regularly used for movie productions and computer games; it concentrates on producing more convincing visuals rather than behavioural realism. The convergence of both areas is mainly used for applications such as training systems, where valid replication of behaviour and high-quality visualization are combined to make the training system more effective.
OVERVIEW OF CROWD SIMULATION

Real-time crowd simulation is the process of simulating the movement of a large number of animated characters or agents in a real-time virtual environment. In certain cases crowd movement requires the agents to coordinate among themselves, follow one another, walk in line or disperse in different directions. All of these actions contribute to the final collective behaviour of the crowd, which must be achieved in real time. Unlike non-real-time simulations, which can know the full run of the simulated scenario, real-time simulations have to react to the situation as it unfolds in the moment. As stated by Thalmann and Musse (2007), real-time rendering of a large number of 3D characters is also a challenge, because it can exhaust the system resources quickly even on a powerful system.
Figure 2.1 Previous works in crowd simulation
Figure 2.1 shows a timeline of previous work in the crowd simulation field. Behavioural animation of human crowds was built on the foundations of group simulations of much simpler entities, notably flocks of birds (Reynolds 1987) and schools of fish (Tu and Terzopoulos 1994). The earliest procedural animation of flocks of virtual birds, called Eurhythmy, was developed from a concept presented at The Electronic Theater at SIGGRAPH in 1985, and the final version was presented
at Ars Electronica in 1989 (Amkraut et al. 1985). The flock motion was achieved by a global vector force field guiding the flow of the flocks. In his early work, Reynolds described a distributed behavioural model for simulating the aggregate motion of a flock of birds. The idea was that the complex behaviour of a group of actors can be obtained from simple local rules for the members of the group. The flock was simulated as a complex particle system, using the simulated birds, called boids, as the particles. Reynolds later extended his work by including various steering behaviours such as goal seeking, obstacle avoidance, path following and fleeing (Reynolds 1999), and introduced a simple finite-state-machine behaviour controller and spatial query optimizations for real-time interaction with groups of characters (Reynolds 2000). More recent work has studied group modelling based on hierarchies. Niederberger and Gross (2003) proposed an architecture of hierarchical and heterogeneous agents for real-time applications, in which behaviours are defined through specialization of existing behaviour types and depend on multiple inheritance to create new types. An approach that has since become more common is geometry baking: by taking snapshots of vertex positions and normals, complete mesh descriptions are stored for each frame of animation, as in the work of Ulicny et al. (2004). Another approach is dynamic meshes, which use systems of caches to reuse skeletal updates. A hybrid of baked meshes and dynamic meshes is found in Yersin et al. (2005), where the graphics processing unit (GPU) is used to its fullest. A real-time crowd model based on continuum dynamics was proposed by Treuille et al. (2006); in their model, a dynamic potential field integrates global navigation with moving obstacles, efficiently solving for the motion of large crowds without the need for explicit collision avoidance. In addition, Mao et al. (2007) presented an effective and ready-to-use framework for real-time visualization of large-scale virtual crowds, in which a script describing motion state and position information is used as input; it provides a convenient interface and makes the framework applicable to almost all crowd simulation applications.
CROWD RENDERING

The complicated part of dealing with thousands of characters is the quantity of information that needs to be processed. Simple approaches, in which virtual humans are processed one after another in no specific order, produce a high computational cost for both the central processing unit (CPU) and the graphics processing unit (GPU). This is why data that flow through the same path need to be grouped for efficient use of the available computing power. Therefore, for the best simulation result, characters capable of facial and hand animation are simulated in the area near the camera to improve believability, while less expensive representations are used for the farther areas. Concerning efficiency of storage and data management, a database must be used to store all the virtual-human-related data.

Crowd Rendering Issues

Figure 2.2 shows some of the problems that arise when rendering crowds. For instance, collision avoidance for a group of people in the same place requires different strategies from the methods used to avoid collisions between individuals. Moreover, motion planning for a group that walks together requires more information than individual motion planning: the trajectories computed for agents in the same group, walking together at similar speeds, must differ even though they share the same environment and goals. In addition, other levels of behaviour can exist when treating the crowd in this hierarchical structure. Group behaviours can be used to specify the way a group moves, behaves and acts in order to fit different group structures (flocking, following, repulsion, attraction, etc.). Individual abilities are also required in order to improve the autonomy and intelligence of the crowd.
Figure 2.2 Crowd rendering issues
However, in order to render thousands of individuals, these complex behaviours cannot be provided individually because of hardware constraints and computation-time limits. Another problem is how to improve the intelligence of, and provide autonomy to, a scalable crowd in a real-time system. Furthermore, the simulation of a large crowd in real time requires many instances of similar characters, which is why an algorithm that allows each individual in the crowd to be unique is needed. There are several techniques used to speed up the rendering process in crowd simulation. Billboarded impostors are one of the methods used to speed up crowd rendering: impostors are partially transparent textured polygons that contain a snapshot of a full 3D character and always face the camera. Aubel et al. (2000) proposed dynamically generated impostors to render animated virtual humans. A different possibility for fast crowd display is to
use point-based rendering techniques; Wand and Strasser (2002) presented a multiresolution rendering approach which unifies image-based and polygonal rendering. An approach that has been given new life is geometry baking: by taking snapshots of vertex positions and normals, complete mesh descriptions are stored for each frame of animation, as in the work of Ulicny et al. (2004). Another geometry-based approach to crowd rendering is dynamic meshes, as presented in the work of de Heras et al. (2005), where systems of caches are used to reuse skeletal updates, which are typically expensive.

Types of Crowd Rendering Methods

In recent years, researchers have applied several approaches, either separately or in combination, to develop crowd simulation in various graphics applications. In this section, five of these approaches are reviewed, as shown in Figure 2.3.
Figure 2.3 Crowd rendering methods
Cellular Automata

Cellular automata approaches were described by Beuchat and Haenni (2000) and Georgoudas et al. (2007). Cellular automata are discrete dynamic systems consisting of a regular grid of cells. They evolve at each discrete time step, with the value of the variable at one cell determined by the values of the variables at the neighbouring cells. The variables at each cell are updated simultaneously based on the values of the variables in their neighbourhood at the previous time step, according to a set of local rules (Wolfram 1983). Cellular automata have been successfully applied to various complex systems, including traffic models and biological fields. In recent years, cellular automata models have been developed to study crowd evacuation under various situations. These models can be classified into two categories: the first is based on the interactions between the environment and pedestrians; the other is based on the interactions among pedestrians.

Social Force

Helbing and Molnar (1995) introduced a 'social force model for pedestrian dynamics'. They suggested that the motion of pedestrians can be described as if they were subject to social forces, which are a measure of the internal motivation of individuals to perform certain actions or movements. They described three essential forces:

Acceleration – the velocity of the pedestrian varies over time as it attempts to reach its optimum speed and as it avoids obstacles.

Repulsion – there is a repulsive force from other pedestrians and from obstacles and edges.

Attraction – pedestrians are sometimes attracted by other people or by other objects.

By putting these three forces together, Helbing and Molnar (1995) produced an equation for a pedestrian's 'total
motivation', and combining this with a term that allows for fluctuations in behaviour produces the 'social force model'. They described computer models based on these equations which have successfully demonstrated various observed phenomena, for example lane formation. In Helbing et al. (2000), the social force model is applied to the simulation of escape panic in buildings, with impressive results.

Fluid Dynamics

Helbing et al. (2002) described how, at medium and high densities, the motion of pedestrian crowds shows some striking analogies with the motion of fluids. For example, the footprints of pedestrians in snow look similar to streamlines of fluids, and the streams of pedestrians through standing crowds are analogous to riverbeds. Fluid-dynamic models describe how density and velocity change over time using partial differential equations. Colombo and Rosini (2005) presented a continuum model for pedestrian flow that describes typical features of this kind of flow, such as some effects of panic. In particular, the model describes the possible over-compression in a crowd and the fall in the outflow through a door when a panicking crowd jams. They considered the situation where a group of people needs to leave a corridor through a door: if the maximal outflow allowed by the door is low, the transition to panic in the crowd approaching the door is likely to cause a dramatic reduction in the actual outflow, decreasing it even more.

Particles

The majority of pedestrian simulations take the particulate approach, sometimes called the atomic approach. Early influential work was that of Craig Reynolds (1987), who worked on simulations of flocks of birds, herds of land animals and schools of fish. Each particle, or boid, was implemented as an individual actor which navigates according to its own perception of the
environment, the simulated laws of physics, and a simple set of behavioural patterns. Reynolds (1999) extended these concepts to the general idea of 'autonomous characters', with an emphasis on animation and games applications. Bouvier et al. (1997) described a generic particle model and applied it both to pedestrian simulation and to the apparently distinct problem of airbag deployment. They presented software allowing the statistical simulation of the dynamic behaviour of a generic particle system. In their system, the particle system was defined in terms of:

i. Particle types – mass, lifetime, diffusion properties, charge, drag, interactions with surfaces, visualisation parameters.
ii. Particle sources or generators – size, geometry, rate and direction of emission.
iii. 3D geometry, including obstacles.
iv. Evolution of particles within the system.

Agent Based

Agent-based models are computational models (Goldstone and Janssen 2005) that build social structures from the 'bottom up', by simulating individuals with virtual agents and creating emergent organizations out of the operation of rules that govern interactions among agents. Bonabeau (2002) supported the following point of view: in agent terms, collective panic behaviour is an emergent phenomenon that results from relatively complex individual-level behaviour and interactions among individuals; the agent-based approach seems ideally suited to provide valuable insights into the mechanisms and preconditions for panic and jamming by incoordination. In the last few years, the agent-based technique has been used to study crowd evacuation in various situations. Agent-based models are generally more computationally expensive than cellular automata, social force, fluid-dynamic or particle models. However, their ability to allow each pedestrian to have unique behaviours
makes it much easier to model heterogeneous humans, that is, groups containing individuals with different characteristics.
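As a rough illustration of the social force idea described earlier in this section, the sketch below advances one pedestrian using a driving force toward a desired velocity plus an exponentially decaying repulsion from neighbours; the parameter values and names are illustrative assumptions, not taken from Helbing and Molnar's formulation.

```cpp
#include <vector>
#include <cmath>

struct Vec2 { float x, y; };

static Vec2 add(Vec2 a, Vec2 b) { return { a.x + b.x, a.y + b.y }; }
static Vec2 sub(Vec2 a, Vec2 b) { return { a.x - b.x, a.y - b.y }; }
static Vec2 scale(Vec2 a, float s) { return { a.x * s, a.y * s }; }
static float length(Vec2 a) { return std::sqrt(a.x * a.x + a.y * a.y); }

struct Pedestrian {
    Vec2 pos, vel;
    Vec2 goalDir;        // unit vector toward the pedestrian's goal
    float desiredSpeed;  // optimum walking speed
};

// One social-force-style update for pedestrian `self`:
// acceleration toward the desired velocity plus repulsion from other pedestrians.
void step(Pedestrian& self, const std::vector<Pedestrian>& others, float dt) {
    const float relaxTime   = 0.5f;  // how quickly the desired velocity is reached (assumed)
    const float repStrength = 2.0f;  // repulsion magnitude (assumed)
    const float repRange    = 0.3f;  // repulsion falloff distance (assumed)

    // Driving (acceleration) force toward the desired velocity.
    Vec2 desiredVel = scale(self.goalDir, self.desiredSpeed);
    Vec2 force = scale(sub(desiredVel, self.vel), 1.0f / relaxTime);

    // Repulsive forces from other pedestrians, decaying exponentially with distance.
    for (const Pedestrian& other : others) {
        Vec2 d = sub(self.pos, other.pos);
        float dist = length(d);
        if (dist > 1e-4f)
            force = add(force, scale(d, repStrength * std::exp(-dist / repRange) / dist));
    }

    // Integrate; attraction terms and obstacle forces would be added the same way.
    self.vel = add(self.vel, scale(force, dt));
    self.pos = add(self.pos, scale(self.vel, dt));
}
```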
CONCLUSION

Crowd simulation brings great challenges to virtual reality application systems beyond those involving only a small number of interacting characters or non-real-time simulation. In order to have a realistic and immersive application, the virtual humans composing a crowd should mimic a real crowd as closely as possible. In this chapter, we have discussed an overview of crowd simulation, crowd rendering issues, and a few common types of crowd rendering methods. The main goal of this chapter is to give a general idea of what crowd simulation is and the methods involved in it.
REFERENCES
Amkraut S., Girard M. and Karl G., 1985, "Motion studies for a work in progress entitled Eurhythmy", SIGGRAPH Video Review, 21 (second item, time code 3:58 to 7:35).
Aubel A., Boulic R. and Thalmann D., 2000, "Real-time display of virtual humans: Levels of detail and impostors." IEEE Transactions on Circuits and Systems for Video Technology, 10 (2), pp. 207–217.
Beuchat J. L. and Haenni J. O., 2000, "Von Neumann's 29-State Cellular Automaton: A Hardware Implementation." In: IEEE Transactions on Education, Vol. 43, No. 3, August.
Bonabeau E., 2002, "Agent-based modeling: methods and techniques for simulating human systems." In: Proceedings of the National Academy of Sciences of the USA (PNAS) 99 (Suppl. 3): pp. 7280–7287.
Bouvier E., Cohen E. and Najman L., 1997, "From crowd simulation to airbag deployment: Particle systems, a new paradigm of simulation." Journal of Electronic Imaging, 6 (1), pp. 94–107.
Colombo R. M. and Rosini M. D., 2005, "Pedestrian flows and non-classical shocks." In: Mathematical Methods in the Applied Sciences 28: pp. 1553–1567.
De Heras P., Schertenleib S., Maïm J. and Thalmann D., 2005, "Real-time shader rendering for crowds in virtual heritage." In Proc. 6th International Symposium on Virtual Reality, Archaeology and Cultural Heritage (VAST 05).
Dobbyn S., Hamill J., O'Conor K. and O'Sullivan C., 2005, "Geopostors: A real-time geometry/impostor crowd rendering system." In SI3D '05: Proceedings of the 2005 Symposium on Interactive 3D Graphics and Games (New York, NY, USA), ACM Press, pp. 95–102.
Georgoudas I. G., Sirakoulis G. Ch. and Andreadis I. Th., 2007, "An Intelligent Cellular Automaton Model for Crowd Evacuation in Fire Spreading Conditions." In: 19th IEEE International Conference on Tools with Artificial Intelligence.
Goldstone R. L. and Janssen M. A., 2005, "Computational models of collective behavior." In: Trends in Cognitive Sciences 9(9): pp. 424–430.
Helbing D., Farkas I. and Vicsek T., 2000, "Simulating dynamical features of escape panic." In: Nature 407: pp. 487–490.
Helbing D., Farkas I. J., Molnar P. and Vicsek T., 2002, "Simulation of pedestrian crowds in normal and evacuation situations." In: Schreckenberg M., Sharma S. D. (editors), Pedestrian and Evacuation Dynamics. Berlin: Springer; pp. 21–58.
Helbing D. and Molnar P., 1995, "Social force model for pedestrian dynamics." In: Physical Review E 51(5): pp. 4282–4286.
Mao T., Shu B., Xu W., Xia S. and Wang Z., 2007, "CrowdViewer: from simple script to large-scale virtual crowd." In Proc. of the 2007 ACM Symposium on Virtual Reality Software and Technology, pp. 113–116.
Niederberger C. and Gross M., 2003, "Hierarchical and heterogeneous reactive agents for real-time applications." Computer Graphics Forum 22, 3 (Proc. Eurographics'03).
Reynolds C. W., 1987, "Flocks, herds, and schools: A distributed behavioural model." In Computer Graphics (ACM SIGGRAPH '87 Conference Proceedings) (Anaheim, CA, USA), Vol. 21, ACM, pp. 25–34.
Reynolds C. W., 1999, "Steering behaviours for autonomous characters." In Proceedings of Game Developers Conference 1999, pp. 763–782.
Reynolds C. W., 2000, "Interaction with groups of autonomous characters." In Proc. Game Developers Conference '00, pp. 449–460.
Tecchia F., Loscos C. and Chrysanthou Y., 2002, "Visualizing Crowds in Real-Time." Computer Graphics Forum.
Thalmann D. and Musse S. R., 2007, Crowd Simulation. Springer-Verlag, London.
Thompson P. and Marchant E., 1995, "Testing and application of the computer model Simulex." Fire Safety Journal 24, 2, pp. 149–166.
Treuille A., Cooper S. and Popovic Z., 2006, "Continuum crowds." ACM Transactions on Graphics 25 (3), pp. 1160–1168.
Tu X. and Terzopoulos D., 1994, "Artificial fishes: Physics, locomotion, perception, behaviour." In Computer Graphics (ACM SIGGRAPH '94 Conference Proceedings) (Orlando, FL, USA), Vol. 28, ACM, pp. 43–50.
Ulicny B., de Heras Ciechomski P. and Thalmann D., 2004, "Crowdbrush: Interactive authoring of real-time crowd scenes." In Proc. ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA'04), pp. 243–252.
Von Neumann J. and Burks A. W., 1966, Theory of Self-Reproducing Automata. Urbana: University of Illinois Press.
Wand M. and Strasser W., 2002, "Multi-resolution rendering of complex animated scenes." Computer Graphics Forum 21, 3 (Proc. Eurographics'02).
Wolfram S., 1983, "Statistical mechanics of cellular automata." In: Reviews of Modern Physics 55: pp. 601–644.
Yersin B., Maïm J., de Heras Ciechomski P., Schertenleib S. and Thalmann D., 2005, "Steering a virtual crowd based on a semantically augmented navigation graph." In First International Workshop on Crowd Simulation (VCROWD'05).
3
NON-PHOTOREALISTIC RENDERING FOR OUTDOOR SCENE

Irene Liew Suet Fun
Mohd Shahrizal Sunar
INTRODUCTION
Advanced rendering techniques in computer graphics include non-photorealistic rendering. Non-photorealistic rendering (NPR) has become an important area of research in computer graphics. One interesting aspect of this area is that many of the techniques developed over the years in computer graphics research can be used to create specific effects. Many NPR images are created from 3D models and combine many different algorithms to yield exactly the image the user wants. In computer graphics, photorealistic rendering attempts to make virtual images of simulated 3D environments that look like "the real world", so NPR is any technique that produces images of a simulated 3D world in a style other than "realism". Non-photorealistic rendering is a stylized form of computer imagery that uses non-photorealistic techniques to create artistic illustrations (sketch, pen and ink, hatching, etc.), paintings (painterly rendering), or renderings of 3D scenes in styles that match the look of traditional animated cartoons, conveying the hand-drawn quality of artistic illustration. There is an overall reason why NPR might continue to expand its influence within 3D graphics and perhaps eventually grow to rival the importance of photorealistic rendering: realism is expensive. We need a vast amount of detail to faithfully represent and animate a realistic natural scene. When a human (designer)
creates that scene by hand, the process necessarily requires great time and effort. One strategy to avoid this inherent cost of realism is to capture details from the real world. Examples include image-based rendering, 3D scanning and motion capture. But these strategies have limitations: they require that the desired data be present in the real world at a suitable scale. For many applications, including storytelling, this might not be the case. When scanning is not an option, the alternative currently is to hire a team of trained experts and let them painstakingly model and animate the needed 3D content by hand. While this strategy works, it is only feasible for high-budget industries such as games and movies. For 3D graphics to reach new applications and attract new users, something must change. We believe NPR has the potential to solve the content creation problem, because non-photorealistic images can be far simpler to create by hand than photorealistic ones. Much of the research in NPR has targeted a particular style of imagery and developed algorithms to reproduce that style when rendering appropriately annotated 3D scenes and photographs. In the context of 3D scene rendering, NPR work usually involves 3D characters, technical illustrations, virtual environments and geometrical objects; there has been little deep investigation of rendering a 3D outdoor scene non-photorealistically. An outdoor scene consists of a combination of many objects and geometrical elements, such as trees, plants, buildings, roads, sky and vehicles, which are complex in structure and size and not easy to render. The different characteristics of every element have to be considered before rendering the scene, because various types of rendering technique are applied to different types of geometrical object.
DIFFERENCE BETWEEN PHOTOREALISM AND NON-PHOTOREALISTIC RENDERING

Photorealism is a representational form of imagery that aims to depict the outside world in an "objective" way, creating images that depict realism. Non-photorealistic rendering produces a non-realistic image. Table 3.1 shows the comparison between photorealism and non-photorealistic rendering.
Table 3.1 Comparison of photorealism and non-photorealistic rendering (NPR)

Approach:               Photorealism: Simulation. NPR: Stylization.
Characteristic:         Photorealism: Objective. NPR: Subjective.
Influences:             Photorealism: Simulation of physical processes. NPR: Sympathies with artistic processes; perceptual-based.
Accuracy:               Photorealism: Precise. NPR: Approximate.
Deceptiveness:          Photorealism: Can be regarded as "dishonest". NPR: Honest.
Level of detail:        Photorealism: Constant level of detail. NPR: Can adapt level of detail across an image to focus the viewer's attention.
Good for representing:  Photorealism: Rigid surfaces. NPR: Natural and organic phenomena.
Application of NPR

In many applications, such as architectural, industrial, automotive and graphics design, non-photorealistic rendering is preferred to photorealism. Non-photorealistic rendering conveys information better by omitting extraneous details, focusing attention on relevant features, clarifying, simplifying and disambiguating shapes, and showing parts that are hidden. It also provides a more natural vehicle for conveying information at different levels of detail. Besides that, the resulting images are more attractive, as they add a sense of vitality that is difficult to capture with photorealism. In architectural design, for example, an architect presents the client with a preliminary design for a house not yet built. An imprecise pencil sketch that omits many details suggests to the client that the design remains open for revision, encouraging the exchange of ideas and keeping the client happy. Another example of the use of drawings instead of photographs is medical textbooks. Medical textbooks employ schematic line drawings to illustrate anatomical structures, because photographs tend to imply an exactness and perfection that tie the scene to one real object. Hand-drawn illustrations can better communicate 3D structure, elide unimportant details and emphasize only the important features. In paintings, a scene represents an artist's view of the world; all the information the artist wants to convey with the image has to be assembled from strokes of a brush. Therefore, to create a work of art, artists have to understand their subject matter so that they can include their interpretation of the important details in the rendering. In the film production industry, most of the non-photorealistic rendering work centres around "toon shading", the rendering of 3D objects to match the look of traditionally drawn 2D artwork. For example, in the animated movie Tarzan, the painterly 3D jungle through which the 2D ape man swings was designed in a non-photorealistic, recognizably cartoon-like style. By using "toon shading", a form of
non-photorealistic rendering, to render 3D objects, graphics designers can create more complex props, objects and scenes with thousands of characters that would be far too complex or costly to draw by hand. This allows them to produce work at a volume and scale that would otherwise be impossible.

Outdoor Scene

The term "outdoor" is the antonym of "indoor". Outdoor refers to things that exist in the open air, involving scenery views of sky and land; it does not relate to things in a closed space, such as in a building or a room. An outdoor scene exhibits geometric complexity, defining an open-space environment and conveying the atmosphere of the real world. Outdoor environments are typically flat and sparsely occluded, so the area that can be observed by humans is rather large; therefore, the sheer scale of outdoor scenes is significantly larger than that of indoor scenes. Outdoor scenes can be categorized by several characteristics:

i. Natural scenery – scenery consisting of natural resources and beauty, for example a forest, jungle, waterfall, seaside, ocean, or a garden of flowers.
ii. Buildings – scenery containing buildings, which is common in architectural design, for example the Petronas Twin Towers, offices and shop houses, shopping complexes and architectural building designs.
iii. Urban scenery – a scene that reflects real city life and its environment with skyscrapers, buildings, vehicles and other elements found in urban areas; New York City, for example, has various styles of buildings and complex structures of lifestyle.
iv. Countryside scenery – a scene that depicts countryside lifestyle and its atmospheric landscape with barns, cottages, coconut trees, mountains or hills, paddy fields, farms and many more.
OVERVIEW OF NON-PHOTOREALISTIC RENDERING

The field of non-photorealistic rendering (NPR) involves various techniques for rendering many types of application, namely cartoon animation, film production, architectural design, artistic painting and technical illustration. Emerging NPR applications have largely focused on rendering human and cartoon characters, artwork masterpieces and technical tools; there has been no clear focus on rendering non-photorealistic outdoor scenes. Most of the research has instead addressed realistic rendering of outdoor environments in virtual reality (Debevec, Taylor & Malik 1996). The traditional strategy for immersive virtual environments is to render detailed sets of 3D polygons with appropriate lighting effects as the camera moves through the model (Manocha 2000); with this approach, the primary challenge is constructing a digital representation of a complex, visually rich, real-world environment. In recent years, a few researchers have turned their attention away from photorealism and towards developing non-photorealistic rendering techniques in a variety of styles and simulated media, such as impressionist painting (Haeberli 1990; Litwinowicz 1997; Meier 1996), pen and ink (Winkenbach & Salesin 1996), technical illustration (Gooch et al. 1998), watercolor (Curtis et al. 1997) and the style of Dr. Seuss (Kowalski et al. 1999). Most of these works have focused on creating still images, either from photographs, from computer-rendered reference images, or directly from 3D models, with varying degrees of user-
One goal of such work is to make these systems operate in conjunction with any of these technologies, particularly the more automated ones, to yield virtual environments in many different styles. Several stroke-based NPR systems have explored time-changing imagery, confronting the challenge of frame-to-frame coherence with varying success. Winkenbach and Salesin (1994) and later Curtis et al. (1997) observed that applying NPR techniques designed for still images to time-changing sequences yields flickery, jittery, noisy animations because strokes appear and disappear too quickly. Meier (1996) adapted Haeberli’s “paint by numbers” scheme (1990) so that paint strokes track features in a 3D model, providing frame-to-frame coherence in painterly animation. Litwinowicz (1997) achieved a similar effect on video sequences by using optical flow methods to affix paint strokes to objects in the scene. Markosian (1997) found that silhouettes on rotating 3D objects change slowly enough to give frame-to-frame coherence for strokes drawn on the silhouette edges.

Techniques of NPR

Many techniques are available for non-photorealistic rendering (NPR). Examples include the watercolour technique (Curtis et al. 1997), silhouette rendering (Hertzmann 1999), real-time NPR (Markosian 1997), hatching, charcoal rendering, stylized sketching (Winnemoller & Bangay 2003) and pencil drawing (Sousa and Buchanan 1999). Different rendering styles produce different effects in the rendered scenes and images, so the choice of technique depends on the kind of object being rendered, its purpose, the time available and the intended use. For outdoor scenes, pen-and-ink illustration is usually used in architectural designs, where the ink outline gives a clear picture of the structure of the design and the texture tone effects resemble hand drawing. As for
the watercolour and painterly rendering, these are commonly used to depict outdoor natural scenery involving natural resources. A sketchy black-and-white artwork of an outdoor scene can be created using halftoning techniques. Cartoon rendering is preferred in cartoon film production to produce background scenes that convey a cartoon appearance.
RENDERING FRAMEWORK

Several phases must be implemented in order to achieve the intended cartoon-shaded outdoor scene: the 3D scene is taken as input and is then rendered with silhouette and cartoon rendering techniques.

Silhouette

A silhouette edge is an edge that connects a back-facing (invisible) polygon to a front-facing (visible) polygon. Therefore, an edge is marked as a silhouette edge if a front-facing and a back-facing polygon share that edge, as illustrated in Figure 3.1.
Figure 3.1 Silhouette edge detection. A silhouette edge is an edge between a front-facing and a back-facing polygon
Generally, the silhouette set of a model can be computed by two classes of methods: in object space and in screen space.
Object-space algorithms involve computations in three-dimensional (3D) space and produce a list of silhouette edges or curves for a given viewpoint. Screen-space algorithms, on the other hand, involve image processing techniques and are useful if rendering silhouettes is the only goal. There are many approaches to extracting and detecting silhouette edges in object space (Gooch 2003). The brute-force method of silhouette extraction simply tests each edge of the polygonal mesh sequentially to verify whether or not it is a silhouette. The brute-force approach is adopted here to extract the silhouette edges of the scene because it is simple and performs faster than the edge-buffer method. The probabilistic method is usually used in animation rendering where temporal coherence is considered, whereas the Gauss map arc hierarchy is only suitable for orthographic views, and the normal cone hierarchy is more complex and difficult to implement. The focus is on the brute-force method as it is the simplest way to determine silhouettes. The implementation of silhouette edge detection is based on the algorithm described by Lake et al. (2000). The test is given in Equation 1, where faceNormal1 and faceNormal2 represent the normals of the two polygon faces adjacent to an edge, and eyeVect is the viewing vector from the eye point. If the result of the computation is less than or equal to zero, the edge is a silhouette edge and is flagged for rendering:

(faceNormal1 ● eyeVect) × (faceNormal2 ● eyeVect) ≤ 0        (1)
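As an illustration, the following minimal sketch applies Equation 1 in a brute-force pass over a mesh's edges. The edge representation (each edge storing a point on the edge and the normals of its two adjacent faces) and the helper names are assumptions of the example, not part of the algorithm as published.

# Minimal sketch of brute-force silhouette edge detection (Equation 1),
# assuming each edge stores the normals of its two adjacent faces.

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def find_silhouette_edges(edges, eye_point):
    """Return the edges flagged as silhouettes for the given eye point.

    `edges` is assumed to be a list of dicts with keys:
      'midpoint'     - a point on the edge, used to form the view vector
      'face_normal1' - normal of the first adjacent polygon
      'face_normal2' - normal of the second adjacent polygon
    """
    silhouettes = []
    for edge in edges:
        # Viewing vector from the eye point towards the edge.
        eye_vect = [edge['midpoint'][i] - eye_point[i] for i in range(3)]
        # Equation 1: the edge lies between a front-facing and a
        # back-facing polygon when the two dot products differ in sign.
        if dot(edge['face_normal1'], eye_vect) * dot(edge['face_normal2'], eye_vect) <= 0:
            silhouettes.append(edge)
    return silhouettes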
Cartoon Rendering

Cartoon rendering is applied to make the scene look like a cartoon, and the brunt of the work lies in the shading algorithm. Rather than smoothly interpolating shading across an object to give it a three-dimensional appearance, as the Gouraud or Phong shading models do, cartoon rendering typically uses solid colours that do not vary across the material they represent. This helps add lighting cues, cues to the shape and context of the object in the scene, and even dramatic cues. Figure 3.2 illustrates a 3D cartoon-rendered outdoor scene produced using a hard shading technique.
Figure 3.2 A scene of City of Birmingham (UK) in 1066 (Studio Lidell 2000)
For shading, we need to calculate the directional lighting, that is, how much light each vertex receives, by taking the dot product of the light vector and the vertex normal. The dot product is related to the angle between the two vectors and yields a value with a maximum of 1; the result of L • n is in fact the cosine of the angle formed between them. Since the dot product has a maximum value of 1, and texture coordinates are stored between 0 and 1 (0 ≤ x ≤ 1), the result of L • n can be used directly as a texture coordinate into a greyscale texture map; if the result is less than 0, the texture coordinate is clamped to 0. The resulting values are used as indices into a one-dimensional (1D) texture map.
The generation of texture coordinates from the dot product is shown in Figure 3.3.
Figure 3.3 Generation of texture coordinates from L • n. In this case, the colour boundary occurs at the point where L • n equals 0.5
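A minimal sketch of this mapping is given below, assuming the 1D greyscale texture is supplied as a simple Python list; the two-tone ramp mirrors the colour boundary at 0.5 in Figure 3.3, and the helper names are illustrative only.

# Sketch of cartoon (toon) shading via a 1D texture lookup, assuming
# `ramp` is a small list of greyscale intensities (the 1D texture).

def normalize(v):
    length = (v[0]**2 + v[1]**2 + v[2]**2) ** 0.5
    return [c / length for c in v]

def toon_texcoord(light_dir, normal):
    """Map L . n to a texture coordinate in [0, 1], clamping negatives to 0."""
    l = normalize(light_dir)
    n = normalize(normal)
    d = l[0]*n[0] + l[1]*n[1] + l[2]*n[2]
    return max(0.0, d)

def toon_shade(light_dir, normal, ramp):
    """Look up the hard-shaded intensity for a vertex from the 1D ramp."""
    u = toon_texcoord(light_dir, normal)
    # Nearest-neighbour lookup keeps the hard colour boundary of Figure 3.3.
    index = min(int(u * len(ramp)), len(ramp) - 1)
    return ramp[index]

# Example: a two-tone ramp whose colour boundary falls at L . n = 0.5.
ramp = [0.4, 0.4, 1.0, 1.0]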
CONCLUSION

Many NPR techniques have been introduced over the years, namely stylized sketching, pen-and-ink illustration, watercolourization, cartoon shading, hatching, painterly rendering and silhouette rendering. These techniques are useful in many applications in industrial design, architectural drawing, medical and scientific visualization, film production, cartoon animation, games development and computer graphics design. A common setting for non-photorealistic scenes in graphics applications is the virtual environment. Virtual environments allow us to explore an ancient historical site, visit a new home with a real estate agent, or fly through the twisting corridors of a space station in pursuit of alien prey. They simulate the visual experience of immersion in a 3D environment by rendering images of a computer model as seen
from an observer viewpoint moving under interactive control by the user. If the rendered images are visually compelling, and they are refreshed quickly enough, the user feels a sense of presence in a virtual world, enabling applications in education, computer-aided design, electronic commerce, and entertainment. While research in virtual environments has traditionally striven for photorealism, for many applications there are advantages to non-photorealistic rendering (NPR). Artistic expression can often convey a specific mood (e.g. cheerful or dreary) difficult to imbue in a synthetic, photorealistic scene. Furthermore, through abstraction and careful elision of detail, NPR imagery can focus the viewer’s attention on important information while downplaying extraneous or unimportant features. An NPR scene can also suggest additional semantic information, such as a quality of “unfinishedness” that may be desirable when, for example, an architect shows a client a partially-completed design. Finally, an NPR look is often more engaging than the prototypical stark, pristine computer graphics rendering.
REFERENCES
Curtis C. J., Anderson S. E., Seims J. E., Fleischer K. W., and Salesin D. H., 1997, “Computer-generated watercolor.” Computer Graphics SIGGRAPH 97, 1(31): pp. 421–430 Debevec P. E., Taylor C. J., and Malik. J., 1996, “Modeling and rendering architecture from photographs: A hybrid geometry- and image-based approach.” Computer Graphics (SIGGRAPH 96), pp. 11–20. Gooch. B., 2003, “Theory & Practice of Non-Photorealistic Graphics: Algorithms, Methods, & Production Systems – Silhouette Extraction.” Course Notes for SIGGRAPH Gooch A., Gooch B., Shirley P., and Cohen E., 1998, “A nonphotorealistic lighting model for automatic technical
illustration.” Proceedings of Computer Graphics SIGGRAPH 98, pp. 447–452. Haeberli P. E., 1990, “Paint by numbers: Abstract image representations.” Computer Graphics SIGGRAPH 90, pp. 207–214. Hertzmann A., 1999, “Introduction to 3D Non-Photorealistic Rendering: Silhouettes and Outlines.” SIGGRAPH ’99 Kowalski M. A., Markosian L., Northrup J. D., Bourdev L., Barzel R., Holden L. S, and Hughes J., 1999, “Art-based rendering of fur, grass, and trees.” Computer Graphics SIGG’99. Lake A., Marshall C., Harris M. and Blackstein M., 2000, “Stylized Rendering Techniques For Scalable Real-Time 3D Animation.” Proceedings of NPAR 2000: First International Symposium on Non-Photorealistic Animation and Rendering. pp. 13-20. Litwinowicz P., 1997, “Processing images and video for an impressionist effect.” Computer Graphics SIGGRAPH 97, pp. 407–414. Manocha D., 2000, “Interactive walkthroughs of large geometric databases.”Course #18, SIGGRAPH 2000 Markosian L., Kowalski M. A., Trychin S. J., Bourdev L.D., Goldstein D. and Hughes J. F., 1997, “Real-time NonPhotorealistic Rendering.” Proceedings of SIGGRAPH 97. August 1997. Los Angeles, California, Computer Graphics Proceedings, Annual Conference Series, ACM SIGGRAPH /Addison Wesley: pp. 415–420. Meier B. J., 1996, “Painterly rendering for animation.” Computer Graphics SIGGRAPH 96, pp. 477–484. Sousa M. C. and Buchanan J. W., 1999, “Computer-Generated Graphite Pencil Rendering of 3D Polygonal Models.” Proceedings of Eurographics '99. 3(18): pp. 195-207. Schmalstieg D. and Gervautz M., 1997, “Modeling and Rendering of Outdoor Scenes for Distributed Virtual Environments.” Proceedings of ACM Symposium on Virtual Reality Software & Technology. pp. 209-215.
Studio Lidell. City of Birmingham in 1066. . Winkenbach G. and Salesin D. H., 1994, “Computer-generated pen-and-ink illustration.” Computer Graphics SIGGRAPH 94, pp. 91–100. Winkenbach G. and Salesin D. H., 1996, “Rendering parametric surfaces in pen and ink.” Computer Graphics SIGGRAPH 96, pp. 469–476. Winnemoller H. and Bangay S., 2003, “Rendering Optimization for Stylised Rendering.” Proceedings of ACM Symposium.
4 REAL-TIME TERRAIN RENDERING AND VISUALIZATION BASED ON HIERARCHICAL METHOD Muhamad Najib Zamri
INTRODUCTION
The history of Geographic Information Systems (GIS) goes back to the early 20th century, which saw the development of geographical data visualization. GIS has changed people's workflow from manual systems to automated ones that provide a more effective and systematic way of working. GIS is one of the important developments in the field of information technology: it is a set of computer tools for collecting, storing, retrieving at will, transforming and displaying spatial data from the real world for a particular set of purposes or application domains. GIS has contributed to many sectors and fields, including agriculture, archaeology, environmental monitoring, health, forestry, emergency services, marketing, regional planning, site evaluation, tourism and infrastructure. The complex nature of massive databases pushed the functionality of traditional (two-dimensional) GIS to its limits; such conventional systems were unable to manage all of the information successfully, and user interaction was restricted to operations such as zooming, zoom-to-extent, panning and picking for viewing and manipulating topographic maps. As an alternative, higher-dimensional GISs have been proposed to overcome the problem. Three-dimensional Virtual GIS provides a new environment that attempts to display virtual
scenarios with an appearance identical to the real world. The third dimension facilitates the display of complex spatial information for visual analysis tasks. Terrain management and visualization is a subfield of 3D Virtual GIS that has attracted a number of researchers to tackle the challenging issues in this research area. Displaying a terrain model requires computer graphics elements, and visualization of such a 3D model involves large amounts of spatial and imagery data. Thus, it is essential to employ an appropriate optimization algorithm or method to reduce the complexity of the problem and, at the same time, to improve system performance. Spatial data retrieval is one solution to the above-mentioned problem; its objective is to fetch data rapidly when dealing with massive data in real time. We have designed and developed a terrain management scheme based on this technique in order to test the performance of the real-time terrain system.
RELATED WORKS

Tile Approach

The tile approach can be used for handling large geometry and imagery data. There are two types: (i) the naive approach and (ii) the overlapping tile approach. The difference between the two is the dimension of the data used: the naive approach uses a power-of-two (2^n) dimension, while the overlapping tile approach uses a power-of-two plus one (2^n + 1). The drawback of the naive approach is that seams or cracks can occur between the boundaries of the tiles. Most researchers have therefore preferred to apply the second approach in their applications, because the seams can be reduced or prevented.
Pajarola (1998) divided the terrain data into several tiles for efficient scene management and integrated this with the restricted quadtree algorithm. Ulrich (2002) applied the tile approach to both textures and terrain elevation data in his ChunkLOD.

View Frustum Culling

In view frustum culling, two different approaches have been applied: the plane intersection test and the projection intersection test. The first approach compares the six planes of the view volume with the surfaces of the object's bounding volume. Hoff (1997) produced a rapid axis-aligned bounding box (AABB)–view frustum overlap test in his system. Assarsson and Möller (1999, 2000) presented a basic intersection test with four optimization techniques (plane-coherency test, octant test, masking and TR coherency test) for culling AABBs and oriented bounding boxes (OBBs). In the second approach, the comparison is made only between the projection of the view volume and the projection of the bounding box in screen-space coordinates. Rabinovich and Gotsman (1997) introduced the concept of the union set of projections for bounding boxes, while Youbing et al. (2001) did not take into account the near, top and bottom planes of the view volume in their terrain rendering system, for fast determination of terrain patches in each frame.

View-Dependent LOD

A TIN-based terrain rendering system was developed by Hoppe (1998) using his principle of progressive meshes; unfortunately, this method consumes a lot of memory for storing the terrain data. The regular grid is usually adopted and adapted in real-time terrain rendering and visualization due to its simplicity. Typically, a hierarchical tree is exploited to represent and manage the terrain data. Binary tree-based LOD has been used by Duchaineau et al. (1997) (the ROAM system), Pajarola (1998) (Virtual Reality GIS)
and Lindstrom and Pascucci (2001) (an out-of-core terrain visualization system). A hierarchical quadtree data structure was applied by Röttger et al. (1998) (continuous LOD for height fields) and Ulrich (2002) (ChunkLOD). A hierarchical R-tree can also be used to manage multiresolution terrain data; an LOD R-tree was implemented by Kofler (1998) in his Vienna Walkthrough and Styria Flyover systems.
METHODOLOGY

Pre-Processing

The pre-processing step is crucial for storing and loading terrain data efficiently as well as for fast data retrieval in the real-time environment. In general, the purpose of this phase is to extract the related information and split the terrain data into several patches. Two steps are required: data extraction and data tiling. Data extraction involves two sequential steps: extraction of general information and extraction of elevation profiles. The general information is obtained by reading Record A of the DEM data format, which contains header information about the related USGS DEM data (1992). Certain information is extracted to facilitate the next process, including the ground coordinates of the corners, the minimum and maximum elevation values and the column and row counts. The ground coordinates consist of four control points (NW, NE, SW and SE) that act as the boundary of the data. The minimum and maximum elevation values give the range of z coordinates representing the terrain height field and are useful for normalizing the elevation values. The column and row counts determine the terrain size in the vertical and horizontal directions respectively. All of this information is stored in an output file (*.txt).
Then, according to Record A, the elevation profiles are read column by column from left to right. The profiles and their related indices are stored in another output file (*.txt). Data tiling is the core component of the pre-processing phase. The overlapping tile approach is used due to its simplicity and its ability to eliminate seams. Therefore, the terrain data must be (2^n + 1) x (2^n + 1) in the vertical and horizontal directions (Pajarola 1998, Duchaineau et al. 1997, Lindstrom and Pascucci 2001, Röttger 1998, Kim and Wohn 2003). Firstly, the terrain size needs to be determined to ensure that it conforms to the above rule. Two steps are suggested:

• Find the maximum number of terrain data points along one side.
• Assign an appropriate power-of-two plus one (2^n + 1) value to that maximum number.

The terrain data is stored in a one-dimensional array, so the index ranges from 0 to [(2^n + 1) x (2^n + 1) - 1]. This approach is applied in order to reduce memory usage. After calculating the terrain size, it is necessary to specify manually the number of vertices per side for each patch. Choosing a proper patch size depends on data density, CPU speed, available memory and other considerations; the size must be neither too small nor too big, for efficiency. The number of vertices per side must also be of (2^n + 1) size. The terrain size and the number of vertices per side are used to calculate the number of generated patches per side:

Number of Generated Patches = Terrain Size / Number of Vertices Per Side
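The sketch below illustrates this bookkeeping under stated assumptions: the padding value and row-major array layout are inventions for the example, and because overlapping tiles share their boundary row and column the patch count is computed as (size - 1)/(vertices - 1), a slight refinement of the simple quotient above.

# Sketch of the data-tiling bookkeeping: pad the terrain to (2^n + 1) per
# side, then compute how many patches of a given size fit along one side.
# Row-major indexing of the 1D terrain array is an assumption of the example.

def padded_side(max_side):
    """Smallest (2^n + 1) value that is >= the longest side of the data."""
    n = 1
    while (2 ** n) + 1 < max_side:
        n += 1
    return (2 ** n) + 1

def patches_per_side(terrain_side, vertices_per_side):
    # Overlapping tiles share their boundary row/column, hence the -1 terms.
    return (terrain_side - 1) // (vertices_per_side - 1)

def patch_start_index(patch_row, patch_col, terrain_side, vertices_per_side):
    """Index of a patch's first vertex in the 1D (row-major) terrain array."""
    step = vertices_per_side - 1
    return (patch_row * step) * terrain_side + (patch_col * step)

# Example: 1000 samples per side -> padded to 1025, with 65-vertex patches.
side = padded_side(1000)             # 1025
count = patches_per_side(side, 65)   # 16 patches per side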
Then, the starting point of each patch is determined by computing its index into the terrain data. At the same time, a test is made to check whether each patch actually exists, because when the terrain size is expanded to (2^n + 1) x (2^n + 1) there is a high possibility of generating non-existing patches. The purpose of this step is to
reduce hard disk storage and memory utilization. Existing patches are assigned flag 1 and non-existing patches flag 0. Before that, the maximum sizes in the vertical (column) and horizontal (row) directions, following the number of vertices per side, are needed. Once all of these basic requirements are available, the elevation values are read from the previously created output file by searching for and comparing the starting point of each existing patch with the indices in the output file. Subsequently, the elevation data of each existing patch is normalized to the range 0 to 255, which simplifies the computation in the run-time process; the minimum and maximum elevation values are required for this normalization. Each patch's elevation values are stored in a separate output file (*.tile). This approach achieves fast searching and retrieval of terrain patches, rather than using a single output file with a complicated search process that would slow down system performance.

Run-Time Processing

Firstly, the view position is needed in order to determine the patches to be processed. The current patch and its eight adjacent patches are selected, rather than processing all of the terrain patches. Then, view frustum culling is applied to remove the unseen patches among the nine selected. Two steps are followed: plane extraction and the intersection test. The view frustum comprises six planes: left, right, top, bottom, near and far. Each plane has its own plane equation; the general plane equation is:

Ax + By + Cz + D = 0
The coefficients A, B, C and D need to be determined in order to complete each frustum plane equation. This involves three major steps:

i. Combine the projection and modelview matrices.
ii. Extract the planes.
iii. Normalize the planes.

Next, a plane-sphere intersection test is made between each of the view frustum's planes and each terrain patch (represented by a bounding sphere). With plane normals pointing into the frustum, only terrain patches flagged ALL-INSIDE or INTERSECT are taken to the next process. The algorithm is listed below:

result = ALL-INSIDE
For each view frustum plane do
    Distance = (Plane unit normal · Centre of the sphere) + Coefficient D
    If (Distance < -Radius of the sphere) then
        Return ALL-OUTSIDE
    If (Distance < Radius of the sphere) then
        result = INTERSECT
End loop
Return result
Figure 4.1 Pseudo code for plane-sphere intersection test
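The following sketch combines the three plane-extraction steps with the plane-sphere test of Figure 4.1. It assumes the combined projection x modelview matrix is stored row-major and used with column vectors (OpenGL-style), so the extracted normals point into the frustum and "completely outside" means the signed distance is less than minus the radius; the function names are illustrative only.

# Sketch of frustum plane extraction (from the combined clip matrix) and
# the plane-sphere classification of terrain patches.

import math

ALL_INSIDE, INTERSECT, ALL_OUTSIDE = "ALL-INSIDE", "INTERSECT", "ALL-OUTSIDE"

def extract_frustum_planes(m):
    """Return six normalized planes (a, b, c, d) from the 4x4 clip matrix m."""
    raw = [
        [m[3][i] + m[0][i] for i in range(4)],  # left
        [m[3][i] - m[0][i] for i in range(4)],  # right
        [m[3][i] + m[1][i] for i in range(4)],  # bottom
        [m[3][i] - m[1][i] for i in range(4)],  # top
        [m[3][i] + m[2][i] for i in range(4)],  # near
        [m[3][i] - m[2][i] for i in range(4)],  # far
    ]
    planes = []
    for a, b, c, d in raw:
        length = math.sqrt(a * a + b * b + c * c)
        planes.append((a / length, b / length, c / length, d / length))
    return planes

def classify_sphere(planes, centre, radius):
    """Plane-sphere test for a terrain patch's bounding sphere."""
    result = ALL_INSIDE
    for a, b, c, d in planes:
        distance = a * centre[0] + b * centre[1] + c * centre[2] + d
        if distance < -radius:
            return ALL_OUTSIDE           # fully outside this plane
        if distance < radius:
            result = INTERSECT           # straddles this plane
    return result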
After that, the tessellation of the visible terrain patches is performed using a view-dependent LOD algorithm. Röttger's algorithm (Röttger et al. 1998) was chosen due to its memory-efficient characteristics: it is based on a hierarchical quadtree structure and managed in a top-down manner. In general, flat and distant parts of the terrain are given fewer polygons than close ones. The important operations in Röttger's algorithm are the distance calculation, the subdivision test and the generation of triangle fans. All of the run-time processes are repeated, updated and rendered for each frame.
RESULTS

The prototype was implemented and run on a high-end PC under the Windows XP operating system. The main specifications are an AMD Athlon XP 1800+ (1.53 GHz), 1 GB of DDR RAM and a Gigabyte 128 MB ATI Radeon 9700. We used Arizona data, specifically the Adams Mesa area, as shown in Figure 4.2; the results can be seen in Figure 4.3. After running this prototype, the average frame rate obtained is 80.53 frames per second (fps), the average polygon count is 22370 triangles per frame and the geometric throughput is 15614 triangles per second. For data accuracy, the system preserves only 95.34 percent compared to the original full-resolution data.
Figure 4.2 USGS DEM data: elevation data and texture map
Figure 4.3 Textured and wireframe terrain model
CONCLUSION

We have presented our proposed technique for real-time terrain management and visualization. Although the prototype is able to decrease the number of polygons generated per frame and speed up rendering performance, the system's behaviour still depends on the hardware specification of the machine that runs the application. For future work, we will expand the current system to a parallel computing environment in order to achieve a higher-performance system in terms of efficiency, accuracy and realism.
REFERENCES
Assarsson U. and Möller T., 1999, “Optimized View Frustum Culling Algorithms”, Technical Report 99-3, Chalmers University of Technology, Sweden. Assarsson U. and Möller T., 2000, “Optimized View Frustum Culling Algorithms for Bounding Boxes”, Journal of Graphics Tools, vol. 5, no. 1, pp. 9-22. Duchaineau M., Wolinsky M., Sigeti D.E., Miller M.C., Aldrich C. and Mineev-Weinstein M.B., 1997, “ROAMing Terrain: Real-time Optimally Adapting Meshes”, Proceedings of IEEE Visualization, pp. 81-88. Hoff K., 1997, Fast AABB/View-Frustum Overlap Test, . Hoppe H., 1998, “Smooth View-Dependent Level-of-Detail Control and its Application to Terrain Rendering”, Proceedings of IEEE Visualization, pp. 35-42. Kim S.H. and Wohn K.Y., 2003, “TERRAN: out-of-core TErrain Rendering for ReAl-time Navigation”, EUROGRAPHICS. Kofler M., 1998, “R-trees for Visualizing and Organizing Large 3D GIS Databases”, Technische Universität Graz, Ph.D. Thesis. Lindstrom P. and Pascucci V., 2001, “Visualization of large terrains made easy”, Proceedings of IEEE Visualization, pp. 363-370. Pajarola R., 1998, “Access to Large Scale Terrain and Image Databases in Geoinformation Systems.” Swiss Federal Institute of Technology (ETH) Zürich, Ph.D. Thesis. Rabinovich B. and Gotsman C., 1997, “Visualization of Large Terrains in Resource-Limited Computing Environments”, Proceedings of IEEE Visualization, pp. 95-102.
Röttger S., Heidrich W., Slusallek P. and Seidel H.P., 1998, “Real-Time Generation of Continuous Levels of Detail for Height Fields”, Technical Report, University of Erlangen-Nürnberg. Thorsten S. and Anselmo L., 2003, “Simulation of Cloud Dynamics on Graphics Hardware.” SIGGRAPH/Eurographics Workshop on Graphics Hardware. Ulrich T., 2002, “Rendering Massive Terrains using Chunked Level of Detail Control”, DRAFT, Oddworld Inhabitants. USGS 1992, “Standards for Digital Elevation Models”. National Mapping Program Technical Instructions, U.S. Department of the Interior, U.S. Geological Survey, National Mapping Division. Youbing Z., Ji Z., Jiaoying S. and Zhigeng P., 2001, “A Fast Algorithm for Large Scale Terrain Walkthrough”, International Conference on CAD&Graphics, China.
5 3D AVATAR MOVEMENT AND NAVIGATION Ahmad Hoirul Basori Daut Daman Abdullah Bade
INTRODUCTION
Avatars are essential in virtual reality games, especially 3D avatars with natural or emotional movement. Most researchers have focused on movement driven by input devices such as joysticks, mice and keyboards; other movement concepts come from artificial intelligence as part of improving interactivity. This chapter offers a different method of controlling the movement of virtual human characters, one that involves human emotion and haptic sensation, known here as human haptic emotion. The scheme engages human haptic emotion to control character movement as a way of bringing the simulation closer to reality and making it more immersive. The movement of a 3D avatar is a feature of virtual reality games that is usually controlled by mouse or keyboard, but more recently it is controlled by artificial intelligence built from particular rules (Mocholi et al. 2006). A virtual reality game (VRG) also depends on interactive movement, because each interaction needs natural movement (Mocholi et al. 2006). In the VRG context this is useful for developing applications such as flight simulation, surgery simulation and military simulation training (Michael 2006; Aylett 2005; Marks 2007). Moreover, a serious game, which comprises tasks that embed specific knowledge, is attached
into game modules. It also offers more information through messages, and provides tutorials and experiences like those in real life (Michael 2006). From an educational perspective, a VRG needs to be more interactive in order to deliver the study material. This interactivity can come from interaction between virtual characters in the virtual environment and also from interaction between the VRG application and the players (Rossou 2004; Greitzer 2007). Human haptic emotion is an approach that applies human emotion to virtual human characters and transfers those emotions to players through haptic devices; the players sense the emotion of the virtual human characters during the playing session. This chapter describes a different approach to handling the movements of virtual human characters using human haptic emotion, to be used as a feature for handling movement in virtual reality games, augmenting the interactivity among the virtual characters themselves and with the players. Information about the 3D avatar's movements is sent to the players through a haptic device during the play session. This approach is expected to help the virtual reality game achieve smooth movements.
PREVIOUS WORK

Character movement has become an important issue in virtual reality and game development. Many advanced game engines consider it an important aspect of simulating real-life cases, especially when navigation is used to combine game entertainment with educational material such as military theory, biology, surgery or medical theory (Rossou 2004; Greitzer et al. 2007; Virvou et al. 2006; Cai et al. 2006). Other research has looked into how to solve the navigation problem using interfaces that guide players to a target place and help them trace objects in the virtual environment (Abasolo 2007). Other researchers are still concerned with navigation issues
such as tours and finding tracks or paths (Geiger et al. 2008). Character movement is very useful for supporting interactivity, especially for attracting players' attention during a game session. Moreover, researchers have focused on enabling communication between virtual characters across networked computers in order to create interactivity and form human perception in virtual reality games or real-life simulations (Herrero et al. 2005; Li et al. 2005; Miranda et al. 2001).
THE MOVEMENT OF 3D AVATAR, NAVIGATION AND WAY FINDING

Navigation in the virtual world imitates navigation in the real world, such as walking, driving, skiing, flying, skating and sailing. Navigation comprises two independent elements: travel and way finding (see Figure 5.1 for a sample from Second Life). According to Sherman (2003), navigation is a mixture of way finding (knowing where you are and how to get where you want to go) and travel (moving through virtual space).
Figure 5.1 Navigation = way finding + travel. A virtual agent communicating with another virtual object; picture from [http://blog.wired.com/photos/uncategorized/2007/06/29/shift_second_life.jpg]
Way finding is a method within a VR system that helps a user who is travelling in the virtual world to determine the destination and the path to reach it. One way finding method is the use of way finding aids: interfaces that give the traveller location information, distances and map information. Travel, by contrast, is about exploring and journeying through the virtual world over a period of time. Human behaviour has also become a consideration for researchers who bring it into virtual reality games to achieve better visualization. This includes applying complex textures and animation to 3D models, so that some 3D models are able to imitate the expression of real human emotion, such as facial mimicry, lip representation and mouth appearance, in order to convey the emotion of the virtual human characters.
On the other hand, researchers also want to improve the navigation and way finding of 3D avatars inside virtual environments. In our approach, the navigation and movement that happens between virtual human characters is based on specific emotions. This interaction can be described as follows: a teacher works together with students, and during the interaction the teacher may feel “joy” or “anger” while a student may feel “fear” or “joy” (illustrated with full lines). Second, action-reaction movement based on stimulation is generated from the emotional interaction: this movement occurs after one virtual human character triggers an action towards the others, and the other characters then react to that particular action (illustrated with dotted lines). Action-reaction in this virtual environment is derived from event-appraisal human emotion. Every action-reaction in this sociable environment is transferred to the players through visual, haptic and acoustic channels. The details of this navigation and movement are illustrated in Figure 5.2.
Figure 5.2 The illustration 3D avatar movement and navigation
3D AVATAR MOVEMENT AND NAVIGATION DEVELOPMENT

We model the movements of the virtual human characters by combining the models from Bailenson et al. (2007) and Zagalo et al. (2004) with the force-feedback capabilities of DirectX and the XNA framework. The conducted experiment showed that the virtual human character (VHC) movement can be controlled by the angle direction of each emotion. The proposed model alters the angle direction of the human characters based on the angle graphs from Bailenson et al. (2007), and sets the magnitude and duration of the vibration based on the diagrams from Zagalo et al. (2004). Each emotion is given an approximate directional bias for its path according to the angles from Bailenson et al. (2007). For example, in Bailenson et al. (2007) the anger expression has two equal possibilities, 50% to the right and 50% to the left, which means that this emotion has the same chance of taking the right route or the left route. Other emotions generate different paths of movement; surprise, for example, has a composition of 70% right and 30% left, indicating that the surprise emotion tends to move towards the right with a probability of approximately 70%. The vibration magnitude ranges from 0 to 10000, but in the real experiment with a force-feedback joystick it is very hard to perceive vibrations below 1000, and vibrations are only felt as different above 1000; successive vibration levels should therefore be separated by increments of 1000 to make the variation perceptible. Based on Basori et al. (2008), the joy emotion has a directional bias of 75% to the right and 25% to the left, with a vibration magnitude of 5000 for a duration of 1000000 µs. Sample 1 demonstrated the etiquette of good communication with a teacher. In order to convey the impression of the happy emotion through the sense of touch, the force-feedback device held by the child vibrates at a magnitude of 5000 for 1000000 µs. In this case, the children gain an experience, and if they communicate as in the given sample, both their teacher and they themselves receive the “happy” emotion.
For the “anger” emotion, by contrast, the user is shocked by the highest vibration magnitude, 10000, for 3000000 µs, so the user feels greater immersion in the virtual environment. This haptic immersion comes from the haptic-emotion generation process. We classify every emotion into a particular vibration magnitude and a specific position. First, we need the initialization of each emotion, the position of each character and the vibration magnitude for each emotion. The emotion variable covers seven basic emotions; the vibration variable consists of the magnitude power and the duration of the motor vibration, and different magnitude powers and durations express different specific emotions. The remaining variable is the position of each virtual human character, with x, y and z properties. The whole generation process is described in the algorithm below (Figure 5.3).
Algorithm 1: Classified_emotion(Emotion, x, y, z)
for Emotion[i] in {Disgust, Anger, Sadness, Joy, Fear, Interest, Surprise}
    Emotion_Vibrate[i] <- {Magn_power, Timer}
    Emotion_pos[i] <- {x, y, z}
    if Emotion[i] = "Anger"
        Emotion_Vibrate[i] = {10000, 3000000}
        Emotion_pos[i] = {x+offset, y, z+offset}
    if Emotion[i] = "Disgust"
        Emotion_Vibrate[i] = {9000, 2000000}
        Emotion_pos[i] = {x+offset, y, z+offset}
    if Emotion[i] = "Sadness"
        Emotion_Vibrate[i] = {1000, 3000000}
        Emotion_pos[i] = {x+offset, y, z+offset}
    if Emotion[i] = "Joy"
        Emotion_Vibrate[i] = {5000, 1000000}
        Emotion_pos[i] = {x+offset, y, z+offset}
    if Emotion[i] = "Fear"
        Emotion_Vibrate[i] = {8000, 1000000}
        Emotion_pos[i] = {x+offset, y, z+offset}
    if Emotion[i] = "Interest"
        Emotion_Vibrate[i] = {4000, 2000000}
        Emotion_pos[i] = {x+offset, y, z+offset}
    if Emotion[i] = "Surprise"
        Emotion_Vibrate[i] = {6000, 3000000}
        Emotion_pos[i] = {x+offset, y, z+offset}
return Classified_emotion
Figure 5.3 Algorithm for the whole generation process
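For readers who prefer an executable form, the following Python sketch mirrors Algorithm 1 as a lookup table. The vibration values come from the algorithm above, while the data structure, the offset default and the function name are illustrative assumptions only.

# Sketch of the emotion classification of Algorithm 1 as a lookup table:
# (magnitude power, duration in microseconds) per basic emotion.
EMOTION_VIBRATION = {
    "Anger":    (10000, 3000000),
    "Disgust":  (9000,  2000000),
    "Sadness":  (1000,  3000000),
    "Joy":      (5000,  1000000),
    "Fear":     (8000,  1000000),
    "Interest": (4000,  2000000),
    "Surprise": (6000,  3000000),
}

def classify_emotion(emotion, x, y, z, offset=1.0):
    """Return the vibration parameters and offset position for an emotion."""
    magnitude, duration = EMOTION_VIBRATION[emotion]
    position = (x + offset, y, z + offset)
    return magnitude, duration, position

# Example: parameters sent to the force-feedback device for "Joy".
magnitude, duration, position = classify_emotion("Joy", 0.0, 0.0, 0.0)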
Visual immersion affects the children through their visual perception. It consists of movement based on the human haptic emotion, emotion information for each character, and the magnitude and duration of the vibration. First is the information about the current emotion of each virtual human character; second, the vibration magnitude is used as an expression of emotion; and third is the angle direction of the movement. All of this information appears as text on the desktop. Figures 5.4 and 5.5 show the preliminary testing using the XNA and DirectX libraries to simulate the sociable virtual environment.
For the preliminary testing, we used DirectX as the library to communicate with the force-feedback joystick (the haptic device), while the XNA framework was used to visualize and demonstrate the methodology. The human model in this experiment (tinyanim.x) was taken from the DirectX SDK. The visualization is still simple, but visual immersion for the user can be achieved through the animation.
Figure 5.4 Visual Immersion-Camera away
Figure 5.5 Visual Immersion- Camera near
CONCLUSION

The discussion above has explained the methodology and implementation of the proposed framework. The virtual environment provides the user with sensation and immersion through three basic human senses: haptic, acoustic and visual. The results of the conducted experiments appear to give much benefit to the collaborative environment in virtual reality games. Our proposed framework focuses on how children can receive and feel knowledge about ethics. Commonly, such learning is only accessible through conventional methods, such as ethics material delivered using audio or video, but in this sociable collaborative learning environment we mix together the three
basic senses (haptic, acoustic and visual). Furthermore, we add emotion to these three senses in order to give the user more immersion. Experiences such as the uncomfortable feeling of the “anger” emotion and the good feeling of the “joy” emotion in the virtual environment are expected to encourage children to behave well, since users become more immersed when they interact with and play in the virtual environment. As further research, we are in the process of integrating human haptic emotion with more complex artificial intelligence in order to achieve more immersion and interactivity in virtual reality games.
REFERENCES
Abasolo M. J. and Della J. M., 2007, “Magallanes: 3D Navigation for Everybody,” The Proceedings of GRAPHITE 2007, ACM Conference Proceeding", Perth Western Australia”, December 2007, pp.135-142. Aylett R.S, Delgado C., Serna J.H, Stockdale R., Clarke H., Estebanez M., Goillau P. and Lynam T., 2005, “REMOTE: Desk-top Virtual reality for future command and control,” Springer-Verlag London Limited, pp.131-146. Bailenson J.N., Yee N., Brave S., Merget D. and Koslov D., 2007, “Virtual Interpersonal Touch:Expressing and Recognizing Emotions Through Haptic Devices”, Human-Computer Interaction,Volume 22,pp.325-353,Lawrence Erlbaum Associates,Inc. Cai Y., Lu B., Fan Z., Indhumathi C., Lim K. T., Chan C. W., Jiang Y. and Li L., 2006, “Bioedutainment: learning Life Science through X Gaming”, Journal of Computer & Graphics-Elsevier, Vol.30, pp.3-9.
Geiger, Fritze R., Lehmann A. and Stocklein J., 2008, “HYUI- A Visual Framework for prototyping Hybrid User Interfaces”, Proceedings of the second International Conference on tangible anda embedded Interaction (TEI'08), February, 2008.pp.63-70. Greitzer F.L., Kuchar O. A. and Huston K., 2007, “Cognitive Science Implications for Enhancing Training Effectiveness in a Serious Gaming Context, ”ACM Journal of Educational Resources in Computing”, Vol. 7, No. 3, Art.2. Herrero P., Greenhalgh C. and Antonio A. D., 2005, “Modelling the Sensory Abilties of Intelligent Virtual Agents”,Springer Science+Business media, pp. 361-385 Li T.Y., Liao M.-Y., and Tao P.-C., 2005 “IMNET:An Experimental Tesbed for extensible Multi-user Virtual Enviroenment Systems ,” the International Conference on Computational Science and its Applications (ICCSA 2005)”, May, pp.957-966. Mocholí J.A., Esteve J.M., Jaén J., Acosta R.,and Louis P. X., “An Emotional path Finding Mechanism for Augmented Reality Applications,” Proceedings of the 5th International Conference on Entertainment Computing(ICEC2006),September 2006 ,pp.13-24. Michael D. and Chen S., 2006, “Serious Game, game that educate, train, and Inform “, Thomson Johnson technology, Thomson Course Technology PTR, Boston, USA Miranda F. R., Kogler J. E., Hernandez E. M. and Netto M. L., 2001, “An Artificial Life approach for the animation of cognitive characters”, Journal of Computer and GraphicsElsivier, pp.955-964. Marks S., Windsor J. and Wunsche B., 2007, “Evaluation of Game Engines for Simulated Surgical Training,” Proceedings of the 5th International Conference on Computer Graphics and Interactive Techniques in Australia and Southeast Asia(GRAPHITE ),December. pp.273-318 Magnenat N., Bonanni U. and Boll S., 2006, “Haptics in virtual reality and Multimedia”, IEEE Journal, pp.6-11
Rossou M., 2004,“Learning by Doing and Learning through Play: An Exploration of interactivity in Virtual Environments for Children,”ACM Journal of Computers in Entertainment”, Vol. 2, No.1, pp.1-23. Takamura Y., Abe N., Tanaka K., Taki H., and He S., 2006, “A virtual Billiard game with Visual, auditory and haptic sensation,” Proceedings of International Conference on Elearning and Games(EDUTAINMENT 2006)-LCNS3942", April, pp.700-705. Virvou M., and Katsionis G., 2006, “On the Usability and likeability of virtual reality games for education: The case VR-ENGAGE” Journal of Computers and Education", Vol.50, pp.154-178. Williams II R.L., Howell J. N. and Eland D. C., 2004, “The Virtual haptic back for palpatory Training,” Proceedings of 6th IEEE International Conference on Multimodal Interfaces (ICMI)",October, pp.191-197. Zagalo N., Barker A. and Branco V., 2004, “Story Reaction Structures to Emotion Detection”, SRMC’04, ACM Conference Proceedings, pp.33-38
6 3D GRAPHIC SCENE MANAGEMENT IN DRIVING SIMULATOR Mohd Khalid Mokhtar Mohd Shahrizal Sunar
INTRODUCTION
To simulate means much the same as to fabricate, feign, pretend, copy, mimic, or imitate. The word “simulation” can be defined as a technique of substituting a fake environment for a real one, so that it is possible to work under laboratory conditions of control. Within an experimentally controlled environment, performance measures can be defined, collected, and repeatedly tested in a cost-effective manner (Olsen 1996). A virtual driving simulator is a device that allows a user to have a life-like experience of driving an actual vehicle within virtual reality. It is effectively used for studying the interaction of driver and vehicle and for developing new vehicle systems, for human factors studies, and for vehicle safety research, by enabling the reproduction of actual driving environments in a safe and tightly controlled setting (Kang et al. 2004). Most vehicle simulators include physical mock-ups such as a steering wheel, gearshift and pedals. These are essential for simulating real conditions, but they also become a drawback: the system becomes more expensive, bulkier (non-mobile), and limited in its ability to reflect changes in vehicle type, dimensions, or interior design (Kallmann et al. 2003).
The virtual driving simulator environment consists of the static universe, dynamic objects and the interior of the driver's vehicle (Kang et al. 2004). The static universe can include buildings, trees, roads and so on, while the dynamic objects can include any moving objects in the virtual scene, such as cars, people and crowds. A more complex virtual scene will contain many thousands of polygons, which require more graphics processing power and higher computational cost to render. Even on the latest graphics hardware, increased complexity in the virtual environment increases the computational load. The complex scene needs to be managed so that it can be rendered efficiently in real time and memory leaks are avoided; implementing 3D scene management techniques helps to reduce the computational burden in a complex driving simulator environment. Figures 6.1 and 6.2 show an example of a driving simulator and a complex scene in a driving simulator virtual environment.
Figure 6.1 An example of driving simulator
Figure 6.2 The complex virtual environment of driving simulator
HISTORY OF DRIVING SIMULATOR

Research into driving simulators and their applications is not new. Driving simulators have their roots in the flight simulators of the early 1900s, and they began to appear in primitive forms in the 1970s. The advent of computer technologies led Daimler-Benz to launch a high-fidelity driving simulator in the 1980s, which prompted many automotive makers and research institutions worldwide to develop their own simulators (Lee 1998). Computer-generated imagery was then introduced more extensively into driving simulation research in the early 1980s. Computer-generated image systems advanced through several stages in the 1980s, from images that were very angular and
lacking in shading and detail to the appearance of photo-driven texturing; at significant software development cost, the images came to look realistic (Decina 1996).

TYPES OF SIMULATOR

Simulators can be divided into interactive and non-interactive types. In an interactive driving simulator the simulator responds to the driver just as the driver responds to the simulator; this is also called a closed-loop driving simulator. In a non-interactive system, by contrast, the driver responds to the simulator, but the simulator does only what it was programmed to do regardless of the driver's actions; this is often referred to as an open-loop system. Interactive driving simulation represents the real world, such as actual roadways, through computer-generated imagery (CGI), auditory feedback, and realistic vehicle instruments and controls such as the brake pedal, steering wheel, turn signal indicator, speedometer, and mirrors.

ARCHITECTURE OF DRIVING SIMULATION
Figure 6.3 Architecture of driving simulator
As the picture in Figure 6.3, from Sun et al. (2007), shows, the driving simulation system architecture operates in a human-in-the-loop mode. The human is placed in the outer loop, operating the control equipment, while the GPU and CPU in the inner loops perform the work of rendering, physical simulation and AI processing. In this diagram, triangle arrows describe control flow and point arrows describe data flow. In the Operation Loop, the driver of the simulator sends control inputs to the inner loops. In the Rendering Loop, the GPU renders the virtual scene to the frame buffer and at the same time applies effects to the scene using the vertex shader and pixel shader. Lastly, in the Physics Loop, the physics engine calculates the driving car's position, velocity, acceleration, and orientation.
DRIVING SIMULATOR SYSTEM DESIGN
Figure 6.4 The framework of the major components of the software architecture
The software architecture has been developed to support scalability, extendibility, flexibility, and evolvability. It consists of six main components: visual database, scene control, scenario control, operations control (GUI), vehicle control, and audio control. The software framework of these six components is shown in Figure 6.4 and described below. The first component is Scene Control, which creates or draws objects in the scene and manages the scene contents by rendering the specified views in real time. Next, Scenario Control generates and distributes viewing parameters and sends rendering commands to the related components. The GUI component provides the human-computer interface through which the user interacts with the system. Vehicle Control includes motion control and car-cab I/O control. Lastly, Audio Control produces sound effects such as engine, tyre, wind, and tyre squeal. All of these components interact with one another to manipulate the visual model of the simulator stored in the Visual Database.
VIRTUAL ENVIRONMENT IN DRIVING SIMULATOR

Rendering the virtual environment (VE) includes full texturing, shading, fog, and lighting effects, with the aim of simulating a real
driving environment. The development of the 3D model is the most difficult and interesting component of the simulator. The visual database of the simulator can be divided into three types of 3-D models (Liao 2006):

3-D Street Models – include terrain, trees, buildings and traffic device models.
3-D Roadway Models – modelled separately to allow collision detection between the road surface and the vehicle.
3-D Mobile Models – the vehicles, modelled so as to create the feel of being inside a real driving simulator.
GRAPHICS DELAY CAUSES SIMULATOR SICKNESS

Simulator sickness was first documented by Havron and Butler in 1957 in a helicopter trainer. It is very similar to motion sickness, but can occur without actual motion of the subject. Simulator sickness is a potential threat that can cause severe discomfort to some users during or after using a driving simulator (Kolasinski 1995), and it is one of the biggest problems facing many driving simulators. Two factors that probably cause simulator sickness are transport delay (time lag) and update rate (frame rate). Transport delay is a problem whereby the response of a dynamic element lags in time relative to its input (Lee et al. 1998). There are three major sources of transport delay in the driving simulator: the vehicle simulation, visual, and motion systems. In real-time vehicle simulation, the vehicle dynamics computation must remain below the perception threshold of the driver. The delay in the visual system includes data acquisition, image processing and display time (in the range of 25 to 50 msec, depending on image quality). Lastly, the delay in the motion system is around 50 msec, including data acquisition, drive logic computation, and motion platform response time. As a result, transport delay is a critical issue in ensuring the fidelity of
the simulator; the delay degrades system performance and, further, system stability. The update rate, also known as the frame rate, can be defined as the speed of the simulation. The frame rate is different from the refresh rate in that it depends on scene complexity and the graphics processing power of the hardware used by the driving simulator. This problem causes visual lag. The frame rate can be used as a benchmark: when a highly complex scene is rendered, the frame rate drops. For these two factors, solutions are needed to minimize the problems, even if they do not lead directly to simulator sickness; what matters is ensuring high fidelity in a complex driving simulator virtual environment.
3D SCENE MANAGEMENT

The term 3D scene management can be defined as the algorithms and methods that select only the polygons needed for the viewer, depending on the location and orientation of the virtual camera in the virtual environment (Zerbst and Duvel 2004). The scene management system is responsible for efficiently rendering complex scenes. Within a complex VE, most polygons are not visible to the user and need to be removed. Three scene management elements can be found in most VEs (Young 2004):

i. The ability to load and destroy a scene properly.
ii. Management of the scene data and the objects in the scene.
iii. Display of the scene to the player.
The design of the scene management differs depending on the type of VE, and the way it is done depends on the capabilities of modern graphics adapters (Zerbst and Duvel 2004). Scene management is especially important when dealing with a complex virtual environment, to minimize the data to be processed and rendered
for each frame, with the aim of ensuring that a suitable frame rate is achieved by the renderer in the driving simulator. In the following sections, the main data structures and algorithms for scene management that can be applied in a driving simulator are explained.

Culling
Figure 6.5 Three types of visibility culling – view-frustum culling, back-face culling and occlusion culling
Culling is defined as the filtering out of faces that do not need to be rendered in the scene (Young 2004). This means that any polygon faces
that are not required will not be sent to the OpenGL or DirectX graphics pipeline. Visibility culling started with two conventional techniques, back-face culling and view-frustum culling. Back-face culling removes geometry faces that face away from the viewer, while the view-frustum culling algorithm omits from rendering the geometry that lies outside the viewing frustum. Occlusion culling, also known as output-sensitive visibility calculation, is a special case of visibility calculation: it finds the visible parts without wasting time on the occluded parts, not even to determine that they are occluded. The three types of culling can be seen in Figure 6.5. The next figure shows the relationship between culling and the frame rate in frames per second (fps); it can be concluded that a balance must be struck between the amount of culling performed by the CPU, the number of polygons rendered by the GPU, and the amount of work needed to achieve a satisfactory result (Young 2004).
Figure 6.6 The relationship between culling and frame rate
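The back-face test mentioned above can be sketched as follows; the triangle representation and the counter-clockwise winding assumption for front faces are illustrative choices for this example, not details from the chapter.

# Sketch of back-face culling: a triangle whose normal points away from the
# viewer is skipped before it is sent down the graphics pipeline.
# Counter-clockwise winding for front faces is an assumption of this example.

def sub(a, b):
    return [a[i] - b[i] for i in range(3)]

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0]]

def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))

def is_back_facing(v0, v1, v2, eye):
    """True if the triangle (v0, v1, v2) faces away from the eye point."""
    normal = cross(sub(v1, v0), sub(v2, v0))
    view_vector = sub(v0, eye)            # from the eye towards the triangle
    return dot(normal, view_vector) >= 0  # facing away (or edge-on)

def cull_back_faces(triangles, eye):
    """Keep only the triangles that face the viewer."""
    return [t for t in triangles if not is_back_facing(t[0], t[1], t[2], eye)]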
Figure 6.7 Traditional level of detail: the same model at 69,451, 2,502, 251 and 76 polygons
Level of Detail

Level of detail (LOD) involves reducing the complexity of a 3D object representation as it moves away from the viewer in the VE (Luebke et al. 2000). This technique improves the efficiency of rendering by reducing the workload in the graphics pipeline stages, usually vertex transformations. LOD is used to improve the performance and quality of the graphics system (Reddy 1999). LOD is also known as polygonal simplification, geometric simplification, mesh reduction, decimation, or multi-resolution modelling. There are two types of LOD. The first, known as discrete LOD (DLOD), creates the individual levels of detail in a pre-process. The second, known as continuous LOD (CLOD), considers the polygon mesh being rendered as a function that must be evaluated while avoiding excessive errors, which are themselves a function of some heuristic (usually distance). Figure 6.7 shows discrete LOD; a simple distance-based selection among discrete levels is sketched below.
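As a rough illustration of discrete LOD, the following sketch picks one of several pre-built meshes based on the object's distance from the viewer; the distance thresholds and mesh placeholders are assumptions of the example.

# Sketch of discrete LOD (DLOD) selection: choose a pre-built level of
# detail for an object based on its distance from the viewer.

import math

def distance(a, b):
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(3)))

def select_lod(levels, object_position, viewer_position):
    """levels: list of (max_distance, mesh) pairs sorted by max_distance."""
    d = distance(object_position, viewer_position)
    for max_distance, mesh in levels:
        if d <= max_distance:
            return mesh
    return levels[-1][1]   # farther than every threshold: use the coarsest mesh

# Example: four levels of detail, finest first (cf. Figure 6.7).
levels = [
    (50.0,   "mesh_69451_polys"),
    (150.0,  "mesh_2502_polys"),
    (400.0,  "mesh_251_polys"),
    (1000.0, "mesh_76_polys"),
]
mesh = select_lod(levels, (0.0, 0.0, 300.0), (0.0, 0.0, 0.0))  # -> "mesh_251_polys"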
Spatial Subdivision Structures
Spatial subdivision is defined as dividing up space and assigning objects to its parts based on their location. In order to accelerate queries such as whether one object is visible from another, various spatial subdivision structures for an environment have been introduced; they are important procedures in occlusion culling and ray tracing, and the method works well when combined with culling techniques. The popular spatial subdivision structures outlined here are the regular grid, octrees and bounding volume hierarchies; other structures include BSP-trees and kD-trees. Some algorithms are only suitable for static scenes, while others suit both static and dynamic scenes, because of the different amounts of computation needed to update the hierarchy in real-time (Paunoo 2006).
Bounding Volume Hierarchy
Figure 6.8 Bounding volume hierarchy
Steven Rubin and Turner Whitted (1980) introduced bounding volumes that envelope the objects in the scene to reduce intersection calculations in ray tracing. The objects are placed at the correct level of the constructed tree, based on size and semantics. For example, the bounding volume of a desk is considered a child of the volume of the room it is in, as shown in Figure 6.8.
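A bounding volume hierarchy can be sketched as a simple tree in which every node stores a volume that encloses everything below it. The structure below is an illustrative example only; the sphere-based volume and the frustum test reuse the hypothetical Vec3, Plane and isOutsideFrustum helpers from the culling sketch above.

#include <vector>

struct BvhNode {
    Vec3                   centre;     // bounding sphere of everything below this node
    float                  radius;
    std::vector<BvhNode *> children;   // empty for a leaf that holds actual geometry
};

/* Collect the leaves whose bounding spheres intersect the view frustum.
   Whole sub-trees are skipped as soon as a parent volume is outside. */
void collectVisible(const BvhNode *node, const Plane frustum[6],
                    std::vector<const BvhNode *> &visible)
{
    if (isOutsideFrustum(frustum, node->centre, node->radius))
        return;                                   // prune the entire sub-tree
    if (node->children.empty()) {
        visible.push_back(node);                  // leaf: geometry to be drawn
        return;
    }
    for (const BvhNode *child : node->children)
        collectVisible(child, frustum, visible);
}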
Figure 6.9 Regular grid
Regular Grids
One of the easiest spatial subdivisions is the regular grid. A regular grid is placed over the 3D scene and each piece of geometry is assigned to the grid voxel that encloses its centre point, as shown in Figure 6.9. Polygons can also be clipped to fit into voxels, but this can increase the number of polygons. The dimensions of the cells are decided early, so this structure lacks adaptability to the environment.
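Assigning an object to the cell that contains its centre is a simple index computation. The following sketch is illustrative only (names are hypothetical, cubic cells of uniform size are assumed, and Vec3 is reused from the earlier culling sketch).

#include <vector>
#include <algorithm>

struct RegularGrid {
    Vec3  minCorner;                        // lowest corner of the grid in world space
    float cellSize;                         // edge length of one cubic voxel
    int   nx, ny, nz;                       // number of cells along each axis
    std::vector<std::vector<int>> cells;    // object indices per voxel, size nx*ny*nz
};

/* Insert an object (by index) into the voxel that encloses its centre point. */
void insertObject(RegularGrid &grid, int objectIndex, const Vec3 &centre)
{
    int ix = std::clamp(int((centre.x - grid.minCorner.x) / grid.cellSize), 0, grid.nx - 1);
    int iy = std::clamp(int((centre.y - grid.minCorner.y) / grid.cellSize), 0, grid.ny - 1);
    int iz = std::clamp(int((centre.z - grid.minCorner.z) / grid.cellSize), 0, grid.nz - 1);
    grid.cells[(iz * grid.ny + iy) * grid.nx + ix].push_back(objectIndex);
}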
Octrees
Figure 6.10 Octree space partition
Glassner (1984) introduced a hierarchical subdivision of a box, called an octree, as a counterpart to the regular grid. Initially one voxel is used to enclose the entire scene's geometry. Each voxel is then divided recursively into eight sub-voxels of equal size, as shown in Figure 6.10. The subdivision continues until the maximum depth is reached or until no object remains in the cell. Finally, the leaves of the tree store a list of the objects they contain.
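The recursive build can be sketched as follows. This is an illustrative example with hypothetical names (and reuses the Vec3 type from the culling sketch); the stopping test and the redistribution of objects to children are deliberately simplified.

#include <vector>
#include <memory>

struct OctreeNode {
    Vec3 minCorner, maxCorner;                      // axis-aligned box of this voxel
    std::vector<int> objects;                       // object indices stored in this cell
    std::unique_ptr<OctreeNode> child[8];           // all null for a leaf
};

void build(OctreeNode &node, int depth, int maxDepth)
{
    if (depth == maxDepth || node.objects.empty())
        return;                                     // stop at maximum depth or an empty cell
    Vec3 mid{ (node.minCorner.x + node.maxCorner.x) * 0.5f,
              (node.minCorner.y + node.maxCorner.y) * 0.5f,
              (node.minCorner.z + node.maxCorner.z) * 0.5f };
    for (int i = 0; i < 8; ++i) {
        node.child[i] = std::make_unique<OctreeNode>();
        // each child takes one octant between minCorner, mid and maxCorner
        node.child[i]->minCorner = Vec3{ (i & 1) ? mid.x : node.minCorner.x,
                                         (i & 2) ? mid.y : node.minCorner.y,
                                         (i & 4) ? mid.z : node.minCorner.z };
        node.child[i]->maxCorner = Vec3{ (i & 1) ? node.maxCorner.x : mid.x,
                                         (i & 2) ? node.maxCorner.y : mid.y,
                                         (i & 4) ? node.maxCorner.z : mid.z };
        // ...distribute node.objects to the children that enclose their centres...
        build(*node.child[i], depth + 1, maxDepth);
    }
}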
CONCLUSION
A driving simulator contains a complex virtual environment that needs to be managed so that it can be rendered efficiently by the graphics processor. The first issues discussed were all related to the driving simulator: its history, the types of simulator, the architecture of the driving simulator and the system design; the part to highlight here is the presence of scene management in the driving simulator system design. The next issues discussed were the virtual environment in the driving simulator and the simulator sickness caused by graphics delay; the point to note is that increased complexity leads to a longer time lag and a lower frame rate. The last issue discussed was what 3D scene management is and which techniques can be used in the driving simulator VE. The potential of scene management for complex scenes in a driving simulator carries over to other real-time graphics systems, allowing the complexity of the 3D scene to be managed while maintaining fidelity and realism.
REFERENCES
Decina L. E., Gish K. W., Staplin L., and Kirchner A. H., 1996, "Feasibility of New Simulation Technology to Train Novice Drivers", Technical Report, U.S. Department of Transportation.
Glassner A., 1984, "Space subdivision for fast ray tracing." IEEE Computer Graphics and Applications.
Kallmann M., Lemoine P., Thalmann D., Cordier F., Thalmann N. M., Ruspa C., and Quattrocolo S., 2003, "Immersive Vehicle Simulators for Prototyping, Training and Ergonomics", Proceedings of the Computer Graphics International (CGI'03).
Kang H., Jalil A., and Mailah M., 2004, "A PC-based Driving Simulator Using Virtual Reality Technology", In Proceedings of the 2004 ACM SIGGRAPH International Conference on Virtual Reality Continuum and Its Applications in Industry, Singapore.
Kolasinski E. M., 1995, "Simulator Sickness in Virtual Environments", Technical Report 1027, U.S. Army Research Institute.
Lee W., Kim J. and Cho J., 1998, "A Driving Simulator as a Virtual Reality Tool", Proceedings of the 1998 IEEE International Conference on Robotics & Automation, Leuven, Belgium.
Levine, O. H. and Mourant, R. R., 1995, "A driving simulator based on virtual environments technology", Washington, DC: Transportation Research Board.
Liao D., 2006, "A Real-time High-fidelity Driving Simulator System Based on PC Clusters", Proceedings of the 11th IEEE International Conference on Engineering of Complex Computer Systems (ICECCS'06).
Luebke, D., Slide Show, Level of Detail & Visibility: A Brief Overview, University of Virginia.
Luebke D., Reddy M., Cohen J., Varshney A., Watson B., and Huebner R., 2000, "Level of Detail for 3D Graphics", Morgan Kaufmann, USA.
Narayanasamy V., Wong K. W., Fung C. C. and Rai S., 2006, "Distinguishing Games and Simulation Games from Simulators", ACM Computers in Entertainment, Vol. 4, No. 2, April 2006, Art. 6A.
Olsen E. C. B., 1996, "Evaluating Driver Performance on The Road and In a Simulator", Master of Science thesis, Human Factors/Ergonomics Program, San Jose State University.
Paunoo, B., 2006, "Dynamic Scene Occlusion Culling in Architectural Scenes Based on Dynamic Bounding Volume", Master of Science (Computer Science) thesis.
Reddy, M., 1999, "Specification and Evaluation of Level of Detail Selection Criteria", Virtual Reality: Research, Development and Application, 3, 2, pp. 132-143.
Rubin, S. and Whitted, T., 1980, "A 3-dimensional representation for fast rendering of complex scenes." Computer Graphics (Proceedings of SIGGRAPH 80), 14(3): pp. 110-116.
Samuels S. H., 2002, "Driving Simulator", DISCOVER Magazine, Date Access: 6/17/2008.
Sun C., Xie F., Feng X., Zhang M., and Pan Z., 2007, "A Training Oriented Driving Simulator", IFIP International Federation for Information Processing 2007.
Sutherland, I. A., 1968, "Head-Mounted Three-Dimensional Display". Fall Joint Computer Conference, AFIPS Conference Proceedings, Vol. 33, pp. 757-764.
Young V., 2004, "Programming a Multiplayer FPS in DirectX", Game Development Series, pp. 322.
Yoshimoto K. and Suetomi T., 2008, "The History of Research and Development of Driving Simulators in Japan", Journal of Mechanical Systems for Transportation and Logistics, Vol. 1, No. 2.
Zerbst S. and Duvel O., 2004, "3D Game Engine Programming", Thomson Course Technology.
7 AUGMENTED REALITY THEORY AND APPLICATIONS Ajunewanis Ismail Zakiah Noh
INTRODUCTION
This chapter covers the state of the art in the field of Augmented Reality (AR), in which 3D virtual objects are integrated into a 3D real environment in real-time. It describes the theory of Augmented Reality and explores how AR can be applied in medicine, manufacturing, visualization, path planning, entertainment, the military and so on; it should provide a starting point for anyone interested in doing research on, or using, Augmented Reality. Azuma (1997) mentioned three criteria that have to be fulfilled for a system to be classified as an AR system: it combines the real and the virtual; it is interactive in real-time, meaning that the user can interact with the system and get a response from it without delay; and it is registered and aligned in three dimensions. This chapter also discusses the technologies used to develop an AR system. In the following sections we describe AR applications, which can be found in diverse domains such as medicine, education, military applications, entertainment and infotainment, technical support, industry and so on. However, this chapter focuses on AR applications in medicine, annotation and visualization, education, manufacturing and repair, robot path planning, entertainment and games.
AUGMENTED REALITY’S THEORY
Definition
AR is an environment in which real and virtual objects are presented together in real time. Azuma (1997) described AR as "an environment that includes both virtual reality and real-world elements. For instance, an AR user might wear translucent goggles; through these, he could see the real world, as well as computer generated images projected on top of that world". The goal of an AR system is to add information to, and improve the user's view of, a real environment. AR is not limited to display technologies such as the Head Mounted Display (HMD); it can potentially be applied to senses other than sight, including hearing, smell and touch. Milgram and Kishino (1994) elaborated the virtuality continuum from the real environment to virtual reality, in which AR is one part of Mixed Reality. The environments of Augmented Virtuality and the Virtual Environment are virtual, whereas the environment of AR is real.
Figure 7.1 Virtuality Continuum (Azuma et al. 2001)
The difference between VE and AR lies in the level of immersion. AR can give the user an immersive experience while the user can still see and feel the presence of the real world. Some researchers describe AR by three features: i) it combines the real and the virtual, ii) it is interactive in real-time and iii) it is registered in 3D. Registration is the accurate alignment of real and virtual objects; without accurate registration, the illusion that the virtual objects exist in the real space is severely compromised. AR can be applied in medicine, manufacturing, visualization, path planning, entertainment, education and other fields.
Technology of AR
To develop an AR system, display technologies are required to accomplish the combination of the real and virtual environments. These technologies include optical see-through, video see-through, Virtual Retinal Systems and monitor-based AR.
Optical See-Through HMD
One of the devices used to merge the real and virtual environments is the optical see-through HMD. It allows the user to interact with the real world, using optical technology to superimpose virtual objects on it. Optical see-through uses a transparent HMD to present the virtual environment directly over the real world. An optical see-through HMD works by placing optical combiners in front of the user's eyes. These combiners are partially transmissive, so users can look directly through them to see the real world, and partially reflective, so users also see virtual images bounced off the combiners from head-mounted monitors. However, the combiners reduce the amount of light that the user sees from the real world. Figure 7.2 shows a conceptual diagram of an optical see-through HMD, and Figure 7.3 shows two optical see-through HMDs made by Hughes Electronics.
Figure 7.2 Conceptual diagram of an optical see through HMD (Azuma 1997)
Figure 7.3 Optical see-through HMDs made by Hughes Electronics (Azuma 1997)
Video See-Through HMD
Video see-through HMDs give the user a view of the real world by combining a closed-view HMD with one or two head-mounted video cameras; this mixture gives the user a view of the real world and the virtual world in real-time through the monitors in front of the user's eyes in the closed-view HMD. Figure 7.4 shows a conceptual diagram of a video see-through HMD, and Figure 7.5 shows a video see-through HMD. Video composition can be done using chroma-keying or depth mapping (Silva, Oliveira & Giraldi 2003).
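As a schematic illustration of the chroma-key approach (not taken from the chapter; the RGB pixel layout and the pure-green key colour are assumptions), the composition step simply keeps the camera pixel wherever the rendered virtual frame still shows the reserved key colour:

#include <cstdint>
#include <cstddef>

/* Compose one video see-through frame: wherever the virtual frame still
   contains the reserved key colour (here pure green), show the camera pixel;
   everywhere else the rendered virtual pixel wins. RGB, 3 bytes per pixel. */
void chromaKeyCompose(const std::uint8_t *camera, const std::uint8_t *virtualFrame,
                      std::uint8_t *output, std::size_t pixelCount)
{
    for (std::size_t i = 0; i < pixelCount; ++i) {
        const std::uint8_t *v = virtualFrame + 3 * i;
        bool isKeyColour = (v[0] == 0 && v[1] == 255 && v[2] == 0);
        const std::uint8_t *src = isKeyColour ? camera + 3 * i : v;
        output[3 * i + 0] = src[0];
        output[3 * i + 1] = src[1];
        output[3 * i + 2] = src[2];
    }
}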
Figure 7.4
A conceptual diagram of a video see-through HMD (Azuma 1997)
Figure 7.5 Video see-through HMD (Silva, Oliveira & Giraldi 2003)
Virtual Retinal Systems
Virtual Retinal Systems aim to produce a full-colour, wide field-of-view, high-resolution, high-brightness and low-cost virtual display. This technology can be used in a wide range of applications, from head-mounted displays for military or aerospace use to medical purposes. The Virtual Retinal Display (VRD) scans a modulated beam of light (from an electronic source) directly onto the retina to produce a rasterized image. The viewer has the illusion of seeing the source image as if standing two feet away in front of a 14-inch monitor; in reality, the image is formed on the retina of the viewer's eye and not on a screen. The image quality the user sees is superb, with stereo view, full colour, wide field of view and no flicker. Figure 7.6 shows the Virtual Retinal System HMD.
Figure 7.6 Virtual Retinal System HMD (Silva, Oliveira & Giraldi 2003)
Monitor-Based AR
Monitor-based AR uses one or two video cameras, which may be static or mobile, to view the environment. The video of the real world is combined with the graphic images produced by a scene generator, and the result is shown to the user on a monitor. The display device is not worn by the user; however, when the images are presented in stereo on the monitor, the user has to wear display devices such as stereo glasses. Figure 7.7 shows the layout of a monitor-based system and Figure 7.8 shows an external view of the ARGOS system, which uses a monitor-based configuration.
Figure 7.7 Build of Monitor-based system (Azuma 1997)
Figure 7.8
An external view of the ARGOS system, which uses a monitor-based configuration (Azuma 1997)
AUGMENTED REALITY APPLICATIONS
Medical
In medical applications, the computer generates 3D data that is superimposed onto the surgeon's view of the patient, providing spatial data about the organs inside the patient's body, in effect "X-ray" vision. Doctors have been using AR in surgery as a visualization and training aid. Non-invasive sensors such as Magnetic Resonance Imaging (MRI), Computed Tomography (CT) or ultrasound imaging are used to collect 3D data of the patient. These data are rendered and combined with a view of the real patient in real-time; because the virtual view of the 3D data is rendered in real-time, the doctor can interact with the AR system and get a response without delay. As a result, the doctor effectively has "X-ray vision" inside the patient, which is an efficient way to reduce the high risks of minimally-invasive surgery. A problem with minimally-invasive techniques is that the doctor cannot see inside the patient, which makes surgery more complicated; with AR technology, an internal view can be provided without the need for larger incisions. AR is also helpful for general medical visualization tasks in the operating room. Surgeons can detect certain features with the naked eye that cannot be seen in MRI or CT scans; with AR, surgeons are given the ability to access both types of data simultaneously. This also supports precision tasks, such as displaying where to bore a hole into the skull for brain surgery or where to make a needle biopsy of a tiny tumour: the information from the non-invasive sensors is displayed directly on the patient, showing the exact place to perform the operation.
Figure 7.9 Virtual pregnant patient (State et al. 1996)
AR is also very helpful for presenting virtual instructions that guide a novice surgeon through the required steps without consulting a manual. As initiated by Kancherla (1995), AR is also useful for training purposes involving virtual objects; as Durlach (1995) agreed, virtual objects can identify organs and the specific locations that must not be disturbed. Figure 7.9 shows the womb of a pregnant woman being scanned with an ultrasound sensor; a 3D representation of the fetus inside the womb is generated and displayed in a see-through HMD, according to State (1996).
Figure 7.10 Liver surgery planning using AR: (a) 3D refinement, allowing an incorrectly segmented liver to be corrected; (b) tracking of the input and output devices; (c) simulation of a non-standard tumour resection; (d) simulated calculation for quantitative analysis (Bornik et al. 2003)
AR technology is also being applied in liver surgery. Tumour resection is an effective treatment for patients suffering from liver cancer; it requires information about the liver shape, the tumour location and the arrangement of the vascular structure, so doctors have to prepare a systematic intervention plan. In the usual routine, an intervention plan is defined using the information retrieved from an imaging modality such as X-ray computed tomography, but a manual intervention plan normally causes a few problems. To deal with these problems, the virtual liver surgery planning system LiverPlanner was introduced by Bornik et al. (2003); Figure 7.10 shows images of the LiverPlanner planning process. AR technology is used to manage the clinical process of planning liver tumour resections so that it becomes simpler and easier.
Education
The potential of, and the challenges in, using collaborative AR within immersive virtual learning environments are a strong motivation for proposing AR in education. One example is the experience gained during the development of a collaborative AR application designed specifically for mathematics and geometry education, called Construct3D, produced by Kaufmann (2000). Roussos (1999) claimed that the most important purpose of an educational environment is to encourage social interaction among users in the same environment. In collaborative AR, multiple users can share the same physical space and communicate with each other for educational purposes; they use natural means of communication, which mix successfully with either immersive virtual reality or remote collaboration. However, developers have to consider a few problems and issues when developing an AR learning environment. AR is certainly not an ideal way to satisfy every need of an educational application, but it should be considered as an option. The technologies used always need to depend on the pedagogical goals, the needs of the educational application and the target audience.
Figure 7.11
A tutor assists a student while working on constructing geometry model (Kaufmann 2000)
Figure 7.12 Collaborative works of students within the Augmented Reality application in constructing 3D object (Kaufmann 2000)
Annotation and Visualization
AR can be used for annotation and visualization tasks; it is an efficient way to annotate objects and environments with information, either private or public. An AR annotation system draws the given information as a virtual overlay for the user. Annotation that uses AR combines the real and the virtual, and an interactive AR annotation system works in real-time, meaning that the user can interact with the system and get a response from it without delay. According to Fitzmaurice (1993), as the user walks around a library, information about the library shelves can be provided on a hand-held display.
Figure 7.13 Annotation represented as reminders (Feiner 1993)
Figure 7.14 Annotation on engine model (Rose 1995)
According to Rose (1995), at the European Computer-Industry Research Centre (ECRC), when a user points at a part of an engine model, the AR system displays the name of the part being pointed at. Figure 7.14 shows the user pointing at an engine model while the information appears as a label on the part. Researchers at Columbia treated AR annotations as private notes attached to specific objects. As stated in Feiner (1993), AR applications were demonstrated by attaching windows from a standard user interface onto specific locations in the real world, or windows could be represented as reminders; Figure 7.13 shows a window represented as a label attached to a student.
Manufacturing and Repair
In industry, AR can be applied in areas such as the repair and maintenance of complex engines. For example, in the repair and maintenance of a complex engine, an AR system can provide labels that help mechanics identify engine components. Additional data such as maintenance reports, schematics, manufacturer's specifications and repair procedures can be retrieved and displayed next to the relevant component as it is observed in the real environment. AR applications in manufacturing and repair can be found in diverse domains, such as the assembly, maintenance and repair of multipart machines. Such instructions can be easier to comprehend than manual instructions containing only text and pictures, which are more difficult to follow: AR presents virtual instructions drawn in 3D, showing the task to be done and how to handle it, step by step. These virtual instructions can be animated, making the directions even clearer.
Figure 7.15 AR in laser printer maintenance (Feiner 1993)
Several research projects have demonstrated prototypes in this area. As stated in Feiner (1993), a laser printer repair application was built for printer maintenance; as shown in Figure 7.15, a wireframe overlaid on the user's view guides the user in removing the paper tray. Similar applications have also proven successful in Caudell (1992), Tuceryan (1995) and Sims (1994). Moreover, according to Tuceryan (1995), AR could be used for other complex machines such as automobile engines.
Robot Path Planning
Operating a robot is a complex problem, especially when the robot is far away and communication suffers long delays. To direct the robot without delay, it may be better to operate a local virtual version of the robot instead. The user manipulates this local virtual version to plan the robot's actions in real-time, and the results are displayed directly on the real world. The user then tells the real robot to execute the plan once it has been tested and confirmed; this avoids the pilot-induced oscillations caused by lengthy delays. AR technologies are useful for predicting the effects of manipulating the environment, and they serve as a planning and previewing tool that helps the user complete the given task. As stated in Drascic (1993) and agreed by Milgram (1993), the ARGOS system has demonstrated that stereoscopic AR is an easier and more precise way of doing robot path planning than traditional monoscopic interfaces. Figure 7.16 illustrates how a virtual outline can represent a future location of a robot arm.
Figure 7.16
Virtual lines illustrate a designed motion of a robot arm (Milgram 1993)
Entertainment and Games
Interactive gaming has become one of the dominant application areas for computer graphics, and playing games is one of the most popular forms of entertainment. Collaborative gaming in AR is closely associated with Mobile Augmented Reality (MAR), which has emerged as a variety of portable devices with flexible computing capabilities became available: handheld computers, mobile phones and personal digital assistants all have the potential to host AR. Under these circumstances, AR can be widely applied to games. AR can also physically complement mobile computing on wearable devices by providing an intuitive interface to a 3D environment embedded within the real world.
Figure 7.17 Invisible Train PDA Games (Wagner et al. 2005)
Wagner et al. (2005) introduced, as a successfully delivered project, The Invisible Train, the first real multi-user AR application for PDA devices. Figure 7.17 shows this AR mobile game being played using PDA technology. The Invisible Train is a mobile, collaborative multi-user AR game in which players control virtual trains on a real miniature wooden rail track, as shown in Figure 7.18. The virtual trains are visible to players only through their PDA's display, as they do not actually exist in the real world. According to Wagner et al. (2005), this type of user interface is usually called the "magic lens metaphor".
Figure 7.18 Mini platform shows Invisible Train rail track (Wagner et al. 2005)
The social communication aspect can be clearly observed in non-computer-based multi-player board games such as Mah-Jongg and Trivial Pursuit, as stated in Gervautz et al. (1998) in their research on collaborative gaming in AR. As shown in Figure 7.19, another collaborative AR application in games lets multiple users interact with each other to play a game in a virtual world mixed with the real one.
Figure 7.19 Collaborative AR in Mah-jong game (Gervautz et al. 1998)
Thalmann et al. (1998), in their research on Networked Collaborative Virtual Environments, strive to incorporate natural communication into a virtual environment. Their efforts focus on collaborative AR based largely on realistically modelled and animated virtual humans; there are many ways to use virtual human bodies for facial and gesture communication within a virtual environment.
Figure 7.20 Nonverbal communication application (Thalmann et al. 1998)
CONCLUSION
AR is a field of computer research that deals with the combination of real-world and computer-generated data. AR research evolved from the areas of virtual reality, wearable and ubiquitous computing, and human-computer interfaces. Human factors have been studied in order to understand how information should be represented so that users can distinguish between what is real and what is virtual. Recently, the field of AR has been evolving into its own discipline, with strong ties to these related research areas. This chapter discussed the theory of AR and the field of AR in which 3D virtual objects are integrated into a 3D real environment in real-time, and it described the AR technologies applied in medical, manufacturing, visualization, robot path planning, entertainment and education applications.
REFERENCES
Azuma, R., 1997, "A Survey of Augmented Reality." Presence: Teleoperators and Virtual Environments 6, no. 4, pp. 355-385.
Azuma, R., Baillot, Y., Behringer, R., Feiner, S., Julier, S. and MacIntyre, B., 2001, "Recent Advances in Augmented Reality." IEEE Computer Graphics and Applications (6), pp. 34-47.
Bornik, A., Beichel, R., Reitinger, B., Gotschuli, G., Sorantin, E., Leberl, F. and Sonka, M., 2003, "Computer aided liver surgery planning: An augmented reality approach." Medical Imaging 2003: Visualization and Display, Proceedings of SPIE, volume 5029.
Caudell, T. P. and Mizell, D. W., 1992, "Augmented Reality: An Application of Heads-Up Display Technology to Manual Manufacturing Processes," Proceedings of the Hawaii International Conference on System Sciences, pp. 659-669.
Chinthammit, W., Burstein, R., Seibel, E. and Furness, T., 2001, "Head tracking using the Virtual Retinal Display," Second IEEE and ACM International Symposium on Augmented Reality, October 29-30, New York, NY.
Drascic, D., Grodski, J. J., Milgram, P., Ruffo, K., Wong, P. and Zhai, S., 1993, "ARGOS: A Display System for Augmenting Reality", Video Proceedings of INTERCHI '93: Human Factors in Computing Systems (Amsterdam, the Netherlands, 24-29 April 1993). ACM SIGGRAPH Technical Video Review, Volume 88. Extended abstract in Proceedings of INTERCHI '93, pp. 521.
Durlach, N. I. and Mavor, A. S., 1995, "Virtual Reality: Scientific and Technological Challenges." Report of the Committee on Virtual Reality Research and Development to the National Research Council, National Academy Press. ISBN 0-309-05135-5.
Fitzmaurice, G., 1993, "Situated Information Spaces: Spatially Aware Palmtop Computers." CACM 36, 7 (July 1993), pp. 38-49.
Gervautz, M., Szalavári, Z. and Eckstein, E., 1998, "Collaborative gaming in augmented reality." ACM Symposium on Virtual Reality Software and Technology, Taipei, Taiwan.
Kancherla, A. R., Rolland, J. P., Wright, D. L. and Burdea, G., 1995, "A Novel Virtual Reality Tool for Teaching Dynamic 3D Anatomy." Proceedings of Computer Vision, Virtual Reality, and Robotics in Medicine '95 (CVRMed '95) (Nice, France, 3-6 April 1995), pp. 163-169.
Kaufmann, H., Schmalstieg, D. and Wagner, M., 2000, "Construct3D: A Virtual Reality Application for Mathematics and Geometry Education," Education and Information Technologies 5:4 (December 2000), pp. 263-276.
Milgram, P. and Kishino, F., 1994, "A Taxonomy of Mixed Reality Visual Displays." IEICE Trans. Information Systems, vol. E77-D, no. 12, pp. 1321-1329.
Milgram, P., Zhai, S., Drascic, D. and Grodski, J. J., 1993, "Applications of Augmented Reality for Human-Robot Communication." Proceedings of the International Conference on Intelligent Robotics and Systems (Yokohama, Japan, July 1993), pp. 1467-1472.
Roussos, M., Johnson, A., Moher, T., Leigh, J., Vasilakis, C. and Barnes, C., 1999, "Learning and Building Together in an Immersive Virtual World", PRESENCE 8(3), pp. 247-263.
Rose, E., Breen, D., Ahlers, K., Crampton, C., Tuceryan, M., Whitaker, R. and Greer, D., 1995, "Annotating Real-World Objects Using Augmented Reality." Proceedings of Computer Graphics International '95 (Leeds, UK, 25-30 June 1995), pp. 357-370.
Sims, D., 1994, "New Realities in Aircraft Design and Manufacture", IEEE Computer Graphics and Applications 14.
State, A., Livingston, M. A., Garrett, W. F., Hirota, G., Whitton, M. C., Pisano, E. D. and Fuchs, H., 1996, "Techniques for Augmented-Reality Systems: Realizing Ultrasound-Guided Needle Biopsies." Proceedings of SIGGRAPH '96, pp. 439-446.
Silva, R., Oliveira, J. C. and Giraldi, G. A., 2003, "Introduction to Augmented Reality." Viewed 12 August 2007, www.lncc.br/~jauvane/papers/RelatorioTecnicoLNCC2503.pdf
Thalmann, N., Thalmann, D., Tolga, K. and Pandzic, I. S., 1998, "Nonverbal Communication Interface for Collaborative Virtual Environments." Proc. Computer Graphics International 1998.
Tuceryan, M., Greer, D. S., Whitaker, R. T., Breen, D., Crampton, C., Rose, E. and Ahlers, K. H., 1995, "Calibration Requirements and Procedures for Augmented Reality", IEEE Transactions on Visualization and Computer Graphics 1, 3 (September 1995), pp. 255-273.
Wagner, D., Pintaric, T., Ledermann, F. and Schmalstieg, D., 2005, "Towards Massively Multi-User Augmented Reality on Handheld Devices," Vienna University of Technology.
Wursthorn, S., Coelho, A. H. and Staub, G., 2004, "Applications for Mixed Reality." XXth ISPRS Congress, Istanbul, Turkey.
Yamamoto, H., 1999, "Case Studies of Producing Mixed Reality Worlds", IEEE SMC '99 Conference Proceedings, Volume 6, pp. 42-47.
8 AUGMENTED REALITY SYSTEM DEVELOPMENT: FROM SIMPLE TO ADVANCE Ahmad Hoirul Basori Mohd Shahrizal Sunar Daut Daman
INTRODUCTION
Augmented Reality (AR) is an extended version of virtual reality: it conveys a stronger impression of real-world presence than a purely virtual environment does. Azuma (1997) stated that AR gives additional information to the user, so that the user sees the real world combined with virtual objects. AR is also characterised by the combination of the virtual world with the real world, real-time interaction and 3D registration (Azuma 1997). An example of an AR application is shown in Figure 8.1. In augmented reality applications, the quality and position of the camera play an important role in giving a realistic sense of presence in the augmented environment; Figure 8.1 shows a proper combination of virtual objects and real objects, in which it is hard to tell which objects are real and which are virtual. Other essential elements are the texture and the model of the virtual 3D object: the 3D object should be realistic and should represent the real object in the real world.
Figure 8.1
A real scene in AR consisting of a real desk and a real phone, to which two virtual chairs and one virtual lamp have been added. Courtesy ECRC, in Azuma (1997)
PREVIOUS WORK
Why is AR considered a hot topic today? Because AR gives the user great flexibility, bridging the real world and the virtual world and increasing the user's interaction and perception. AR has contributed to many fields in real life, such as medicine, manufacturing and repair, annotation and visualization, robot path planning, entertainment and military aircraft (Azuma 1997; Barakonyi et al. 2008; Cheok et al. 2006; Comport et al. 2003; Newman et al. 2004; Bianchi et al. 2006; Azuma 1999). In addition, AR can be combined with game applications such as racing games or Ping-Pong (Oda 2008; Knoerlein 2007).
AUGMENTED REALITY DEVELOPMENT USING ARTOOLKIT
Nowadays, AR applications have become more reliable and mature. Applications for an AR system can be developed using the ARToolkit library and simple equipment such as a webcam and a PC, or a laptop equipped with a webcam. ARToolkit is a reliable library that is very helpful for building an AR application. It is built on computer vision methods that detect markers and calculate the real camera position and orientation relative to those markers in the real world in real-time. ARToolkit offers several features:
i. Single camera position/orientation tracking
ii. Tracking code that uses simple black squares
iii. The ability to use any square marker patterns
iv. Easy camera calibration code
v. Fast enough for real-time AR applications
vi. SGI IRIX, Linux, MacOS and Windows OS distributions
vii. Distributed with complete source code
ARToolkit was originally developed by Dr. Hirokazu Kato, and is supported by the Human Interface Technology Laboratory (HIT Lab) at the University of Washington, the HIT Lab NZ at the University of Canterbury, New Zealand, and ARToolworks, Inc., Seattle.
a. ARToolkit Marker
The marker is a core element in an augmented reality application: the marker position is used as the orientation for the matrix transformation of the 3D virtual object. However, each marker can be used for only one object; thus, for multiple objects, we must be able to recognize multiple markers and separate each marker into a different category. ARToolkit provides several sample markers such as pattsample1, pattsample2, pattHiro, pattMulti and pattKanji. See Figure 8.2 for the details.
Figure 8.2 Sample marker patterns provided by ARToolkit
Basically, each pattern has a different characteristic, because the pattern file of each marker contains different coordinate values. For example, the marker data of the hiro pattern, patt.hiro, can be seen as follows:
234 235 240 233 240 234 240 235 240 237 240 238 240 240 240 232 229 240 240 240 240 240 240 240 240 240 240 240 240 240 240 228 227 240 240 240 240 240 240 240 240 240 240 240 240 240 240 239 240 240 240 240 240 240 240 240 240 240 240 240 240 240 240 240 236 240 240 240 240 240 240 240 240 240 240 240 240 240 240 240 234 240 240 240 240 240 240 240 240 240 240 240 240 240 240 240 236 240 240 240 240 240 240 240 240 240 240 240 240 240 240 240 231 240 240 240 240 240 240 240 240 240 240 240 240 240 240 240 229 240 240 240 240 240 240 240 240 240 240 240 240 240 240 240 225 149 240 240 186 216 225 174 240 240 240 237 238 240 240 240 150 107 238 231 75 208 115 147 238 228 223 226 237 180 226 240 150 62 181 213 62 187 113 169 197 72 29 237 120 50 53 207 149 63 47 78 53 184 113 101 142 5 150 150 45 217 186 83 121 84 220 222 58 180 121 92 128 109 237 124 155 232 161 64
b. ARToolkit Development Sample
ARToolkit provides several samples such as simple, multitest and exview, all built in the C++ language. The simple sample recognizes one marker and displays one 3D object, using the matrix transformation obtained from the marker's orientation in real-time.
i. Initialization Step
/* The header names were lost in typesetting; these are the headers the
   standard ARToolkit simple example normally includes. */
#include <stdio.h>
#include <stdlib.h>
#include <GL/glut.h>
#include <AR/gsub.h>
#include <AR/video.h>
#include <AR/param.h>
#include <AR/ar.h>

#ifdef _WIN32
char *vconf = "showDlg,deviceName=Microsoft DV Camera,"
              "deinterlaceState=on,deinterlaceMethod=blend,"
              "pixelFormat=PIXELFORMAT_RGB32,videoWidth=320,videoHeight=240";
#else
char *vconf = "";
#endif

int      xsize, ysize;
int      thresh = 100;
int      count  = 0;
int      mode   = 1;

char    *cparam_name = "Data/camera_para.dat";
ARParam  cparam;

char    *patt_name      = "Data/patt.hiro";
int      patt_id;
double   patt_width     = 80.0;
double   patt_center[2] = {0.0, 0.0};
double   patt_trans[3][4];

static void init(void);
static void cleanup(void);
static void keyEvent(unsigned char key, int x, int y);
static void mainLoop(void);
static void draw(double marker_trans[3][4]);
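The sample's main loop, which produces the marker_info array used in the transformation step below, is not reproduced in the chapter. A minimal sketch of how it typically looks, assuming the standard ARToolkit 2.x API, is:

static void mainLoop(void)
{
    ARUint8      *dataPtr;
    ARMarkerInfo *marker_info;
    int           marker_num, j, k;

    /* grab a video frame; skip this iteration if none is ready yet */
    if ((dataPtr = arVideoGetImage()) == NULL) { arUtilSleep(2); return; }

    argDrawMode2D();
    argDispImage(dataPtr, 0, 0);          /* draw the camera image as background */

    /* detect the square markers in the frame */
    if (arDetectMarker(dataPtr, thresh, &marker_info, &marker_num) < 0) {
        cleanup();
        exit(0);
    }
    arVideoCapNext();

    /* keep the detected marker that matches patt_id with the highest confidence */
    k = -1;
    for (j = 0; j < marker_num; j++) {
        if (marker_info[j].id == patt_id
            && (k == -1 || marker_info[k].cf < marker_info[j].cf)) k = j;
    }
    if (k == -1) { argSwapBuffers(); return; }

    /* ...the transformation and drawing steps shown next use marker_info[k]... */
}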
ii. Transformation Step
The transformation process is critical in AR applications because it transforms the real position coordinates (real world) into virtual coordinates (virtual environment). The transformation matrix is computed from the marker reading and is then converted into the matrix used to draw the 3D model with OpenGL.

    /* get the transformation between the marker and the real camera */
    if (mode == 0) {
        arGetTransMat(&marker_info[k], patt_center, patt_width, patt_trans);
    } else {
        arGetTransMatCont(&marker_info[k], patt_trans, patt_center, patt_width, patt_trans);
    }

    draw(patt_trans);

    argSwapBuffers();
}
iii. Drawing Step
The drawing step creates the 3D model using the OpenGL library. The drawing function takes the transformation matrix, a two-dimensional array (3 x 4), as its input.
static void draw(double patt_trans[3][4])
{
    double  gl_para[16];
    GLfloat mat_ambient[]     = {0.0, 0.0, 1.0, 1.0};
    GLfloat mat_flash[]       = {0.0, 0.0, 1.0, 1.0};
    GLfloat mat_flash_shiny[] = {50.0};
    GLfloat light_position[]  = {100.0, 200.0, 200.0, 0.0};
    GLfloat ambi[]            = {0.1, 0.1, 0.1, 0.1};
    GLfloat lightZeroColor[]  = {0.9, 0.9, 0.9, 0.1};

    argDrawMode3D();
    argDraw3dCamera(0, 0);
    glClearDepth(1.0);
    glClear(GL_DEPTH_BUFFER_BIT);
    glEnable(GL_DEPTH_TEST);
    glDepthFunc(GL_LEQUAL);

    /* load the camera transformation matrix */
    argConvGlpara(patt_trans, gl_para);
    glMatrixMode(GL_MODELVIEW);
    glLoadMatrixd(gl_para);

    glEnable(GL_LIGHTING);
    glEnable(GL_LIGHT0);
    glLightfv(GL_LIGHT0, GL_POSITION, light_position);
    glLightfv(GL_LIGHT0, GL_AMBIENT, ambi);
    glLightfv(GL_LIGHT0, GL_DIFFUSE, lightZeroColor);
    glMaterialfv(GL_FRONT, GL_SPECULAR, mat_flash);
    glMaterialfv(GL_FRONT, GL_SHININESS, mat_flash_shiny);
    glMaterialfv(GL_FRONT, GL_AMBIENT, mat_ambient);

    glMatrixMode(GL_MODELVIEW);
    glTranslatef(0.0, 0.0, 25.0);
    glTranslatef(0.0, 0.0, 5.0);

    /* creation of the 3D object (cube or teapot) using OpenGL calls */
    //glutSolidCube(10.0);
    glutSolidTeapot(30);

    glDisable(GL_LIGHTING);
    glDisable(GL_DEPTH_TEST);
}
The code above creates a virtual 3D teapot using the OpenGL call glutSolidTeapot(30). See Figure 8.3 for details.
Figure 8.3 Simple view example using a 3D virtual solid cube and teapot
AUGMENTED REALITY DEVELOPMENT USING C# ARTOOLKIT
ARToolkit has also been used by other developers. Casey Chesnut (http://www.brains-N-brawn.com/wpfAugReal) combined it with the C# language and the .NET library. Developing AR this way is somewhat easier than building an AR application from the original ARToolkit, although, as mentioned before, Chesnut still used ARToolkitPlus as the foundation for creating the AR application in the C# environment. Besides ARToolkitPlus, Chesnut (2007) also used several other components, DirectShow and WPF 3D; the model coordinates are saved with the .XAML extension and then redrawn through WPF 3D. Chesnut (2007) used the marker shown in Figure 8.4 for the simple view and the markers shown in Figure 8.5 for the complex view.
Figure 8.4 Single Marker for simple view
Figure 8.5 Multiple markers for multiple objects
An AR application in C# is divided into several steps, as shown below:
i. Importing the ARToolkit DLL
The DllImport attribute is used to import the ARToolkitPlus DLL.

[DllImport("ARToolKitPlus.dll")]
public static extern int fnARTKPWrapper();

/// ARToolKitPlus simple sample
[DllImport("ARToolKitPlus.dll")]
public static extern int fnARTKPWrapperSingle(float[] matrix, out int markerId, out float conf);

/// ARToolKitPlus multi sample
[DllImport("ARToolKitPlus.dll")]
public static extern int fnARTKPWrapperMulti();
ii. Importing .NET DirectShow, Direct3D and XML libraries

using System.Windows.Media.Media3D;
using System.Runtime.InteropServices;
using DirectShowLib;
using WPFUtil;
using ARTKPManagedWrapper;
iii. Camera Calibration
Every camera has different characteristics, which is why, in AR development, each camera must be calibrated according to the performance of the camera device.

string cameraCalibrationPath = "data/LogitechPro4000.dat";
// alternative calibration file used in the sample:
// string cameraCalibrationPath = "data/no_distortion.cal";
// ARToolKitPlus_CamCal_Rev02
// 640 480 320 240 1500.0 1500.0 0.0 0.0 0.0 0.0 0.0 0.0 0
string multiPath = "data/markerboard 480-499.cfg";
iv. XAML Model Loading
A XAML file stores the properties of each 3D virtual object in this AR application: its shape, size, colour and other properties.
xdModels.Load("models.xml");
foreach (XmlNode xnNode in xdModels.DocumentElement.ChildNodes)
{
    if (xnNode.NodeType != XmlNodeType.Element) { continue; }
    XmlElement xeNode = (XmlElement)xnNode;

    MyModel mm = new MyModel();
    mm.id    = Int32.Parse(xeNode.Attributes["id"].Value);
    mm.trans = bool.Parse(xeNode.Attributes["trans"].Value);
    mm.path  = xeNode.Attributes["path"].Value;
    mm.code  = xeNode.Attributes["code"].Value;
    mm.rotX  = Int32.Parse(xeNode.Attributes["rotX"].Value);
    mm.rotY  = Int32.Parse(xeNode.Attributes["rotY"].Value);
    mm.rotZ  = Int32.Parse(xeNode.Attributes["rotZ"].Value);
    mm.sizeX = Int32.Parse(xeNode.Attributes["sizeX"].Value);
    mm.sizeY = Int32.Parse(xeNode.Attributes["sizeY"].Value);
    mm.sizeZ = Int32.Parse(xeNode.Attributes["sizeZ"].Value);
    mm.offX  = Int32.Parse(xeNode.Attributes["offX"].Value);
    mm.offY  = Int32.Parse(xeNode.Attributes["offY"].Value);
    mm.offZ  = Int32.Parse(xeNode.Attributes["offZ"].Value);

    // load the XAML model referenced by this entry
    FileStream fs = new FileStream(mm.path, FileMode.Open, FileAccess.Read);
    Viewbox vb = (Viewbox)XamlReader.Load(fs);
    fs.Close();
    mm.root = vb; //for INameScope

    Viewport3D v3d = (Viewport3D)vb.Child;
    ModelVisual3D mv3d = (ModelVisual3D)v3d.Children[0];
    Model3DGroup m3dgScene = (Model3DGroup)mv3d.Content;
    Model3DGroup m3dg =
        (Model3DGroup)m3dgScene.Children[m3dgScene.Children.Count - 1];
    mm.m3dg = m3dg;
    // ... (the rest of the loop, which stores mm, is omitted in the chapter)
}
v. 3D Drawing
The 3D drawing process places 3D content at each detected marker, using the model-view and projection matrices converted from their OpenGL form into WPF matrices; the XAML models loaded earlier are drawn in the same way.

Matrix3D wpfModelViewMatrix = ArManWrap.GetWpfMatrixFromOpenGl(modelViewMatrix);
Matrix3D wpfProjMatrix = ArManWrap.GetWpfMatrixFromOpenGl(projMatrix);

ModelVisual3D mv3d = new ModelVisual3D();
Model3DGroup m3dg = new Model3DGroup();
GeometryModel3D gm3d = new GeometryModel3D();
gm3d.Material = new DiffuseMaterial(brush);
gm3d.BackMaterial = new DiffuseMaterial(Brushes.Orange);

// a simple textured quad is built as the marker geometry
MeshGeometry3D mg3d = new MeshGeometry3D();
mg3d.Positions.Add(new Point3D(-10, 10, 0));
mg3d.Positions.Add(new Point3D(-10, -10, 0));
mg3d.Positions.Add(new Point3D(10, 10, 0));
mg3d.Positions.Add(new Point3D(10, -10, 0));
mg3d.TextureCoordinates.Add(new Point(0, 0));
mg3d.TextureCoordinates.Add(new Point(0, 1));
mg3d.TextureCoordinates.Add(new Point(1, 0));
mg3d.TextureCoordinates.Add(new Point(1, 1));
mg3d.TriangleIndices.Add(0);
mg3d.TriangleIndices.Add(1);
mg3d.TriangleIndices.Add(2);
mg3d.TriangleIndices.Add(1);
mg3d.TriangleIndices.Add(3);
mg3d.TriangleIndices.Add(2);
gm3d.Geometry = mg3d;

m3dg.Children.Add(gm3d);
mv3d.Content = m3dg;
modelMarkers.Children.Add(mv3d);
The code above draws 3D content at each detected marker based on the matrix transformation; the 3D objects saved in XAML format and loaded earlier are rendered in the same way. The result for a multi-marker AR application can be seen in Figure 8.6.
Figure 8.6 Sample of interaction with an AR application using multiple markers
CONCLUSION
An AR application is a challenging and interesting application that can be built with simple programming, in C++ or C#, on top of ARToolkit or ARToolkitPlus. ARToolkitPlus is able to support the .NET environment and works together with the WPF 3D and DirectShow DLLs from the .NET library. AR provides a realistic interaction between virtual objects and the real world.
REFERENCES
Azuma, R. T., 1999, "The Challenge of Making Augmented Reality Work Outdoors." In Mixed Reality: Merging Real and Virtual Worlds, pp. 379-390.
Azuma, R. T., 1997, "A Survey of Augmented Reality," Presence: Teleoperators and Virtual Environments, vol. 6, no. 4, pp. 355-385.
ARToolkit
Barakonyi, I., Weilguny, M., Psik, T., and Schmalstieg, D., 2005, "MonkeyBridge: autonomous agents in augmented reality games," In Proc. ACM SIGCHI International Conference on Advances in Computer Entertainment Technology, Valencia, Spain, pp. 172-175.
Bianchi, G., Jung, C., Knoerlein, B., Harders, M., and Székely, G., 2006, "High-fidelity visuo-haptic interaction with virtual objects in multi-modal AR systems". In ISMAR 2006, October 2006.
Casey Chesnut, 2007, http://www.brains-N-brawn.com/wpfAugReal, 10/2/2007.
Cheok, A. D., Sreekumar, A., Lei, C., and Thang, L. N., 2006, "Capture the Flag: Mixed-Reality Social Gaming with Smart Phones," vol. 5, no. 2, pp. 62-69.
Comport, A. I., Marchand, É. and Chaumette, F., 2003, "A real-time tracker for markerless augmented reality", Proceedings of the Second IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR '03).
Knoerlein, B., Székely, G., and Harders, M., 2007, "Visuo-Haptic Collaborative Augmented Reality Ping-Pong", ACE '07, Salzburg, Austria, pp. 91-94.
Newman, J., Wagner, M., Bauer, M., Williams, A. M., Pintaric, T., Beyer, D., Pustka, D., Strasser, F., Schmalstieg, D. and Klinker, G., 2004, "Ubiquitous Tracking for Augmented Reality", Proceedings of the Third IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR 2004).
Oda, O., Lister, L. J., White, S., and Feiner, S., 2008, "Developing an Augmented Reality Racing Game", The Second International Conference on Intelligent Technologies for Interactive Entertainment (ICST INTETAIN '08), pp. 1-8.
INDEX
Acoustic, 62, 68 Action-reaction, 62 aerospace, 96 Agents, 27, 70 Animated, 15 annotate, 103 Applications, 16, 70, 87, 111, 113, 114 autonomous, 26, 29, 30, 130
Data, 49, 50 developing, 37, 58, 73, 102, 117, 124 dynamic, 20, 23, 24, 25, 26, 27, 74, 79, 84
Cellular automata, 24 Characters, 59, 60 Characters movements, 59 collaborative, 68, 101, 109, 110 complex, 5, 15, 18, 20, 22, 24, 27, 30, 33, 36, 37, 40, 46, 62, 68, 74, 75, 80, 81, 86, 88, 105, 106, 107, 124
Flocks, 29 Forced feedback, 63 Frustum, 15, 48, 55
educational, 59, 101 Emotion, 71
Haptic, 62, 68, 69, 131 human haptic, 58, 59, 66, 68
Immersion, 67, 68
immersive, 1, 18, 27, 37, 58, 92, 101, 102
laser printer repairs, 106
medical, 5, 35, 43, 59, 90, 92, 96, 98, 99, 111, 116 military, 5, 59, 90, 96, 116 model, 1, 20, 25, 26, 27, 29, 30, 33, 37, 38, 40, 43, 44, 47, 52, 54, 63, 66, 78, 79, 102, 104, 105, 121, 124 multimarker, 128
Non-invasive sensors, 98 Non-photorealistic, 32, 34, 35 Nonverbal communication application, 110
Objects, 113 Occlusion culling, 82 Outdoor scene, 36 outdoor scenes, 36, 37
Particles, 26 Photorealism, 34
real multi-user, 108
real-time, 1, 2, 13, 18, 19, 20, 23, 27, 28, 29, 30, 38, 47, 48, 49, 54, 84, 90, 92, 94, 98, 103, 107, 111, 117, 120, 130 Real-time crowd, 19 row, 49, 51
scene, 3, 4, 7, 16, 32, 33, 35, 36, 37, 38, 39, 40, 41, 43, 48, 73, 74, 77, 78, 80, 81, 82, 85, 86, 97 Silhouette edge detection, 40 Simulator sickness, 79 sociable, 62, 66, 68 spatial, 20, 46, 47, 84, 85, 98 Spatial subdivision, 84
Terrain, 46, 47, 55, 56 Tile approach, 47
virtual, 1, 2, 4, 5, 6, 7, 10, 11, 12, 14, 15, 18, 19, 20, 21, 23, 27, 28, 29, 30, 32, 33, 37, 38, 43, 46, 58, 59, 60, 61, 62, 63, 64, 66, 68, 70, 71, 73, 74, 75, 77, 79, 80, 81, 86, 88, 90, 91, 92, 94, 95, 98, 99, 101, 102, 103, 106, 107, 109, 110, 111, 115, 116, 118, 121, 123, 126, 129, 130
Virtual environment, 43 virtual heritage, 2, 10, 14, 18, 28
Visibility culling, 2, 82 walkthrough, 1, 4, 5, 6, 10, 11, 12, 16