A Video Game-Based Mobile Robot Simulation Environment

Josh Faust, Cheryl Simon, and William D. Smart

Department of Computer Science and Engineering
Washington University in St. Louis
One Brookings Drive, St. Louis, MO 63130, United States
{jef1,cls1,wds}@cse.wustl.edu

Abstract— Simulation is becoming an increasingly important aspect of mobile robotics. As we are better able to simulate the real world, we can usefully perform more research in simulated environments. The key aspects of a good simulator, an accurate physics simulation and a realistic graphical rendering system, are also central to modern computer games. In this paper, we describe a robot simulation environment built from technologies typically used in computer video games. The simulator is capable of simulating multiple robots, with realistic physics and rendering. It can also support human-controlled avatars using a traditional first-person interface. This allows us to perform robot-human interaction and collaboration studies in the simulated environment. The distributed nature of the simulation allows us to perform large-scale experiments, with users participating from geographically remote locations.

I. INTRODUCTION

Simulation is becoming an increasingly important aspect of mobile robotics. As we become better able to realistically simulate the world, more and more development can usefully be done in these environments. High-quality simulators allow us to develop software for mobile robots without having to contend with the ever-present hardware failures, and without having to worry about battery life. They also allow us to perform research with large numbers of robots. Simulators also allow us to change the environmental parameters of the world (such as the lighting level) quickly and easily, without inconveniencing other members of the laboratory.

Modern computer games share much in common with modern mobile robot simulators. They include high-quality physics simulations. They are capable of rendering highly realistic views of the simulated environment. They are capable of supporting many interacting objects and players in a large world.

We have developed a robot simulation environment based on technologies commonly used to develop distributed computer games. The simulator allows us to run several mobile robots in a shared environment, with realistic physics and graphics. Additionally, unlike other existing simulators, we can also support human-controlled characters (avatars) in the world. This allows us to perform robot-human interaction and collaboration studies in the simulation environment.

In this paper, we begin by describing the component technologies used to build the simulator. We describe the overall architecture, and discuss how simulated robots are controlled through the well-known Player Application Programmer Interface [1]. We give examples of the simulator in use, and discuss our plans for future experiments in the simulated environments.

II. COMPONENT TECHNOLOGIES

The simulation environment was designed to take advantage of existing technologies used for distributed interactive computer games. Gaming applications have much in common with robotics simulations, and require many of the same features, such as high-quality computer graphics and an accurate physics simulation. In particular, we use computer game technology in three areas of our simulator: 3D graphics, physics simulation, and networking. All of the technologies that we use in the simulator are cross-platform, allowing both clients and servers to run on a variety of operating systems.

A. 3D Graphics

We use the open source Ogre3D [2] as our graphics engine. Ogre3D provides many of the advanced features seen in current commercial and open-source games. In particular, it provides support for using the vertex and pixel shaders on the graphics card, which gives high-quality graphics while reducing the load on the CPU. It also provides advanced features such as level-of-detail rendering and scripted animations, which are not present in lower-level APIs such as OpenGL. The engine supports both indoor and outdoor environments, and provides a flexible scene management system (for specifying the environment). It also gives us easy access to “special effects” such as particle systems (for smoke and similar effects), transparency, realistic shadows, and realistic material properties. This is important to us, since it allows us to generate more realistic synthetic camera images. The more realistic these images are, the more likely it is that the same computer vision algorithms will work both in the simulator and in the real world.
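To give a flavour of how the engine is used, the following fragment is a minimal sketch of loading a robot model into an Ogre3D scene. It assumes the Ogre 1.x-era C++ API; the mesh name, resource location, and window parameters are illustrative rather than taken from our actual code.

#include <Ogre.h>

int main()
{
    // Create the engine root and let the user pick a render system.
    Ogre::Root* root = new Ogre::Root("plugins.cfg");
    if (!root->showConfigDialog())
        return 1;
    root->initialise(false);
    Ogre::RenderWindow* window = root->createRenderWindow("Simulator", 800, 600, false);

    // Tell Ogre where to find meshes, textures, and materials.
    Ogre::ResourceGroupManager::getSingleton().addResourceLocation("media", "FileSystem");
    Ogre::ResourceGroupManager::getSingleton().initialiseAllResourceGroups();

    // Build a trivial scene: one camera and one robot mesh on a scene node.
    Ogre::SceneManager* scene = root->createSceneManager(Ogre::ST_GENERIC);
    Ogre::Camera* camera = scene->createCamera("MainCamera");
    window->addViewport(camera);
    scene->setAmbientLight(Ogre::ColourValue(0.5f, 0.5f, 0.5f));

    Ogre::Entity* robot = scene->createEntity("Robot", "pioneer3.mesh");  // placeholder mesh name
    Ogre::SceneNode* node = scene->getRootSceneNode()->createChildSceneNode();
    node->attachObject(robot);

    root->startRendering();   // enter the render loop
    return 0;
}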

B. Physics Simulation

For our physics simulation, we use Ageia’s PhysX SDK [3]. This is an advanced commercial physics engine, designed for the computer games industry, and is free for non-commercial use. At the time of writing, PhysX is the fastest freely available physics engine of its type. In addition, Ageia also produces a hardware-accelerated physics engine, which will allow us to increase the number of robots that we can support in the simulator by one or two orders of magnitude.

The PhysX engine is an appealing choice because it supports advanced material properties, such as regular and anisotropic friction. This allows us to model objects in the world accurately, and provide a more realistic simulation than would be possible with other freely-available physics engines. These material properties allow us to support accurate robot-ground interactions for a wide variety of terrain types, such as concrete, carpet, and ice.

PhysX is quite similar to the Open Dynamics Engine (ODE) [4], a commonly-used physics engine in robot simulators. However, PhysX uses a faster physics integration algorithm, allowing it to simulate many more colliding objects, has more exact collision detection than ODE’s heuristic system, and is widely regarded as being more stable. These features, coupled with possible hardware acceleration, make PhysX a better choice, in our opinion.

C. Networking

High-performance networking is provided by TNL, the Torque Networking Library [5]. This is a networking subsystem designed for multi-player distributed games, and has been used in a number of professional products. It is open source, and provides a secure, efficient, and robust networking layer.

D. Interoperability

All of the technologies that we use are multi-platform and are not tied to a specific operating system. We have adopted the Player [1] API as our interface to robots in our simulator. This allows us to take advantage of a well-developed API and a large existing user community.

We have also built in support for COLLADA [6], an emerging XML standard that describes the graphics, physics, and materials of 3D artwork. By exploiting the COLLADA format, we can use models created in most major 3D modeling programs, including Maya [7], 3D Studio Max [8], Blender [9], and Softimage/XSI [10]. These COLLADA descriptions can be loaded directly into our simulator. This allows designers to create complete 3D scenes in their preferred modeling software (including structural, physics, texture, and material property information), and to easily import them into the simulator. Again, this allows us to take advantage of a huge body of pre-existing models created for other purposes.

Using a professional modeling package makes it easier to develop accurate models of the robots and other objects in the environment. This is important since, without such accurate models, any simulated camera images will be unrealistic, and any software that uses them will not transfer to the real world. Although it is possible to develop similarly-detailed models using low-level graphics APIs, such as OpenGL, it requires much more effort to do so.
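As a concrete illustration of the material model described in section II-B, the fragment below sketches how a ground material with anisotropic friction (for example, carpet pile) might be defined. It is written against the PhysX 2.x-era SDK that was current at the time of writing; the descriptor field names follow that SDK, and the numerical values are invented purely for illustration.

#include <NxPhysics.h>

// Sketch: create a PhysX scene and a ground material with anisotropic
// friction.  The friction values are illustrative, not measured.
NxScene* createSceneWithCarpetMaterial(NxPhysicsSDK*& sdk)
{
    sdk = NxCreatePhysicsSDK(NX_PHYSICS_SDK_VERSION);

    NxSceneDesc sceneDesc;
    sceneDesc.gravity = NxVec3(0.0f, -9.81f, 0.0f);
    NxScene* scene = sdk->createScene(sceneDesc);

    NxMaterialDesc carpet;
    carpet.restitution      = 0.1f;
    carpet.staticFriction   = 0.9f;                 // along the pile direction
    carpet.dynamicFriction  = 0.7f;
    carpet.staticFrictionV  = 0.5f;                 // across the pile direction
    carpet.dynamicFrictionV = 0.4f;
    carpet.dirOfAnisotropy  = NxVec3(1.0f, 0.0f, 0.0f);
    carpet.flags           |= NX_MF_ANISOTROPIC;
    scene->createMaterial(carpet);                  // shapes reference the material by index

    return scene;
}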

III. SYSTEM ARCHITECTURE

The simulator environment is organized as a client-server system. Robot and human clients connect to the server, which performs the processing necessary to simulate the world. The number of clients that a single server can accommodate is limited only by computational resources. The overall system architecture is shown in figure 1.

A. The Server

The server keeps track of the properties of all objects in the world, runs the physics simulation, calculates the simulated sensor data, and manages the connections to all the clients. All communications between the server and the clients are handled by TNL. Commands from clients are translated by the server into sequences of actuator motions. The results of these motions are determined by the physics simulation, and the resulting new world state is communicated back to the clients.

Simulated sensor data are generated by directly querying the underlying representations of the world. These data are then modified to reflect realistic measurement error, and sent back to the robot clients. Currently, we have simple (Gaussian noise) error models for distance sensors, but it would be straightforward to add more realistic models that take the surface properties of the detected object into account.
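The Gaussian error model amounts to adding zero-mean noise to each ideal ray-cast range and clamping the result to the sensor’s limits. A small self-contained sketch of this idea is shown below; the noise parameters are invented for illustration, and the modern C++ <random> facilities are used here purely for brevity.

#include <random>
#include <vector>
#include <algorithm>

// Illustrative Gaussian error model for simulated range sensors.
class GaussianRangeNoise
{
public:
    GaussianRangeNoise(double stddev, unsigned seed = 42)
        : gen_(seed), noise_(0.0, stddev) {}

    // Corrupt one ideal range with additive zero-mean noise, clamped to
    // the sensor's valid interval.
    double corrupt(double trueRange, double minRange, double maxRange)
    {
        double r = trueRange + noise_(gen_);
        return std::max(minRange, std::min(maxRange, r));
    }

    // Apply the model to a whole laser scan (e.g. 360 beams).
    void corruptScan(std::vector<double>& ranges, double minRange, double maxRange)
    {
        for (double& r : ranges)
            r = corrupt(r, minRange, maxRange);
    }

private:
    std::mt19937 gen_;
    std::normal_distribution<double> noise_;
};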

Fig. 1. Overall system architecture.

Fig. 2. Pioneer 3 model and actual robot [11].

B. The Robot Client

The robot client allows the robots in the simulated environment to be controlled. It has no graphical interface, since all interactions are performed through the sensor-actuator Application Programmer Interface (API). A single client can accept connections from many robot controllers, each controlling a different robot in the environment. Information about the robots is loaded from an XML configuration file that specifies the robot type, sensors, initial position in the world, and connection settings to communicate with the controllers. When a robot client connects to the server, a new robot is created (“spawned”) in the environment, and the appropriate TNL communication mechanisms are enabled to allow data to be passed back and forth between the client and the server. Clients request new simulated sensor data from the server, in response to API calls by the robot controllers.

One or more robot controllers can communicate with a robot client using the Player protocol [1] over a standard TCP socket. We assume that the controller and the client communicate over a low-latency link, such as a Local Area Network (LAN). We chose to use the Player protocol for two main reasons: the Player API is well-established and many researchers are familiar with it, and API bindings already exist in a number of major languages. Users already familiar with the Player API can use their robot controllers in our simulation with no modification.

Although we have chosen to use the Player protocol initially, we have designed the robot clients to make it straightforward to add other APIs. Since the API-specific code is limited to the robot client, no changes need to be made to the server, assuming the models for the robots and sensors already exist. All that must be changed on the client side is how information is repackaged from the replicated objects for the new API. However, if a new robot or sensor is required, a model of it must be created, and support for generating simulated measurements from it added to the server.

Currently, our robot client supports differential drive robots, such as the ActivMedia Pioneer 3 [11] (see figure 2), and a subset of the Player API (position, laser range-finder, and camera proxies). Since there is no graphical interface for the robot client, it has very low computational requirements, and can be run on inexpensive, low-end systems. The robot controller, however, will have its own requirements. To ensure good performance, the client machine should be connected to the server by a high-speed (10 Mbps or better) connection.
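As an example, the controller below drives a differential drive robot forward and stops when the laser sees an obstacle within half a metre; it runs unchanged against a real robot or our robot client. It is written against the Player 2.x C++ client library; the host name, port, and device indices are illustrative.

#include <libplayerc++/playerc++.h>
#include <algorithm>

int main()
{
    // Connect to whatever is serving the Player protocol.
    PlayerCc::PlayerClient robot("localhost", 6665);
    PlayerCc::Position2dProxy position(&robot, 0);
    PlayerCc::LaserProxy laser(&robot, 0);

    for (;;)
    {
        robot.Read();   // fetch the latest sensor data

        // Find the nearest laser return.
        double nearest = 1e9;
        for (unsigned int i = 0; i < laser.GetCount(); ++i)
            nearest = std::min(nearest, laser[i]);

        if (nearest < 0.5)
            position.SetSpeed(0.0, 0.0);   // obstacle ahead: stop
        else
            position.SetSpeed(0.3, 0.0);   // otherwise drive at 0.3 m/s
    }
    return 0;
}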

C. The Human Client

In addition to robot clients that interact with the world through an API, we also provide for human clients using a standard first-person game interface (figure 3(b)). This client allows an operator to control a human-shaped avatar in the simulated environment using standard input devices (mouse, keyboard, and joystick). The avatar can interact with objects in the world, and is subject to the laws of physics. Interaction with other human clients is currently possible using a text-based chat interface.

The avatar appears to other clients, including robots, as a human-shaped entity with appropriate body motions (figure 3(a)). The human operator provides high-level commands to the avatar, such as movement and gaze direction, just as in a first-person game. An automatic low-level controller then supplies appropriate limb motions to make the movement of the avatar look realistic to any observers.

D. The Observer Client

A third type of client allows human operators to observe the activity in the simulated environment, without interacting with it (figure 3(c)). Observer clients have no simulated physical presence in the world, and are not bound by the laws of physics. They can move arbitrarily about in all three dimensions and cannot be seen by other clients. This allows a third-person view of the activity in the world while running experiments. As with human clients, observers are controlled using standard input devices, such as keyboard, mouse, and joystick.

E. Client-Server Interactions

All client-server interactions are performed through TNL, and take one of two forms: event messages and replicated objects. Event messages are implemented using TNL NetEvents. These are atomic messages used to inform a client or server of an event, such as user input or a request for sensor data. Replicated objects are implemented using TNL GhostConnections, which allow data structures on the server to be automatically replicated on each of the clients. Optimizations are performed to minimize the bandwidth needed by these updates. For example, only data elements that have changed are actually sent, and only objects that are currently relevant to a client are actively updated; in particular, only the world objects that are currently in the camera’s view frustum are updated. Updates are performed automatically whenever the data held in the object change, and rely on serialize/deserialize functions being available for the object.
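To illustrate the replicated-object mechanism, the sketch below shows the shape of a TNL ghosted object that replicates only a position. It is based on TNL’s NetObject interface (packUpdate, unpackUpdate, and dirty-mask bits); the class, its fields, and the mask layout are invented for illustration and are not taken from the simulator’s source.

#include "tnl.h"
#include "tnlNetObject.h"
#include "tnlBitStream.h"
#include "tnlGhostConnection.h"

// Sketch of a replicated object: the server calls setPosition(), TNL notices
// the dirty mask, and only the changed fields are sent to interested clients.
class SimBody : public TNL::NetObject
{
public:
    enum MaskBits { PositionMask = BIT(0) };

    void setPosition(float x, float y, float z)
    {
        mX = x; mY = y; mZ = z;
        setMaskBits(PositionMask);        // mark the position dirty on the server
    }

    // Server side: write only the fields flagged in updateMask.
    TNL::U32 packUpdate(TNL::GhostConnection* conn, TNL::U32 updateMask,
                        TNL::BitStream* stream)
    {
        if (stream->writeFlag(updateMask & PositionMask))
        {
            stream->write(mX);
            stream->write(mY);
            stream->write(mZ);
        }
        return 0;
    }

    // Client side: read the fields back in the same order.
    void unpackUpdate(TNL::GhostConnection* conn, TNL::BitStream* stream)
    {
        if (stream->readFlag())
        {
            stream->read(&mX);
            stream->read(&mY);
            stream->read(&mZ);
        }
    }

private:
    float mX, mY, mZ;

    TNL_DECLARE_CLASS(SimBody);
};

TNL_IMPLEMENT_NETOBJECT(SimBody);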

Fig. 3. Robot, human, and observer points-of-view in the simulation environment: (a) robot’s view, (b) human’s view, (c) observer’s view. The human avatar is a model supplied with Ogre3D.

All network communications use a novel network protocol unique to TNL, called the “Notify” protocol. This is a connection-oriented, unreliable communication protocol with packet delivery notification: processes receive receipt notifications for each packet that is successfully delivered. The receipt notifications are packed into the headers of other data packets, to conserve bandwidth. This protocol has been shown to be more efficient than either UDP or TCP in the distributed game setting. A number of mechanisms are available in TNL to provide guaranteed ordered delivery (like TCP), guaranteed delivery, unguaranteed delivery (like UDP), and quickest possible delivery (similar to TCP out-of-band data). Encryption mechanisms are also available for all delivery options, although they are not currently being used in our simulation environment.

Input from a human or observer client is handled in two ways. For the most common interactions (such as movement, rotation, and simple world interactions), a TNL UserAction message is sent to the server periodically. This message contains information about attempted movement, attempted rotation, and the state of a set of “action keys” on the keyboard and mouse that control more complex interactions. This abstraction allows a human to locally map keys and mouse motions to different actions, and makes the addition of new input devices straightforward. To handle other, less common actions, a special TNL NetEvent message containing all of the details of the action is dispatched.
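The exact layout of the UserAction message is less important than the data it carries. A plausible sketch of such a structure, with all field names invented here for illustration, is:

#include <cstdint>

// Hypothetical layout of the periodic UserAction message described above.
// Field names and sizes are illustrative only.
struct UserAction
{
    // Attempted movement and rotation, in the avatar's local frame.
    float moveForward;      // -1.0 .. 1.0
    float moveStrafe;       // -1.0 .. 1.0
    float yawDelta;         // radians since the last message
    float pitchDelta;       // radians since the last message

    // One bit per abstract "action key".  The client maps physical keys,
    // mouse buttons, or joystick buttons onto these bits, so new input
    // devices require no changes on the server.
    uint32_t actionKeyBits;

    enum ActionKey : uint32_t
    {
        kUse    = 1u << 0,
        kJump   = 1u << 1,
        kCrouch = 1u << 2,
        kChat   = 1u << 3
        // further actions as the simulator grows
    };
};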

IV. IMPLEMENTATION AND SCALING ISSUES

The most computationally expensive part of the simulation is the physics calculation. In particular, the ray-casting necessary to calculate simulated sensor readings is a severe bottleneck. As we add robots with laser range-finders to the simulation, the slow-down is very noticeable. This can be addressed by reducing the data rate or the fidelity of the sensors, but this is undesirable, since it makes the simulated sensors different from their real counterparts. However, special-purpose hardware support for the PhysX API will allow us to perform ray-casting in hardware. This will significantly increase the number of high-fidelity sensors that we can support in our environment. Currently, using a Pentium-IV class computer with 512MB of RAM, running Windows XP, we can support approximately 10 laser range-finder sensors, each requiring 360 ray-casts, at a frequency of 10Hz.

While we can accurately simulate cameras, based on a 3D rendering of the world, this is also an expensive operation. Each camera in the simulation requires the world to be rendered from a new viewpoint. If there are many cameras, this again increases the computational load on the server. It is possible to address this by adding more, faster graphics cards to the server (since the rendering can be performed on the GPU). However, just like ray-casting, there will be a fundamental limit to how many cameras we can effectively support. Currently, on our test development machine, with a single graphics card, we can support three cameras operating at 30Hz without noticing a slow-down in the simulation.

In the clients, the rendering of the scene is the major computational expense. We can improve performance by reducing the quality of the rendering for human operators. We can disable advanced features such as realistic shadows, or reduce the resolution of the rendered scene.

In order to provide for better scalability, we are currently looking at ways to distribute the server computations over multiple computers, and across multiple threads in a single computer. We are also investigating how the clients can bear more of the computational load. This will allow us to scale the system to much larger environments, with many more connected clients.

V. USING THE SIMULATOR

Our simulator can be used similarly to the other systems described in section VI. Robots can operate in the simulated environment, receiving simulated sensor data and interacting with the world using a realistic physics model (see figures 4, 5, and 6). However, the ability to integrate distributed human players into the simulation allows us to perform novel experiments that are unsupported by any other simulator. In particular, we can perform robot-human collaboration experiments. Such experiments can involve either collaborative teams or antagonistic teams.

Fig. 4. Several robots in a simulated environment, with laser sensor visualizations enabled.

Collaborative teams are composed of human and robot clients, who work together to perform some task, such as searching a building. There are two possible variations of this task. In the first, the robots are autonomous, and all interaction with them must happen at a high level. In the second variation, the robots are teleoperated, or have mixed-initiative controllers [12]. The humans controlling the robots work with the humans controlling the avatars in the world to perform a given task. Search and rescue operations are a good example of a collaborative task.

In antagonistic experiments, one team tries to accomplish a goal while another tries to stop them. Both teams can have humans and robots on them. Games such as “capture the flag” are good examples of antagonistic tasks.

Some tasks, such as perimeter guarding, are critically dependent on human behavior. Our simulator allows us to verify these algorithms without venturing out into the real world, and without committing to an almost certainly flawed model of human behavior. For example, in the perimeter guarding case, we can deploy our robots on the perimeter, and have a team of human avatars try to penetrate it. We do not have to model the human avatars’ behavior, since they are being directly controlled by humans. Most likely, these humans have a lot of experience with first-person computer games, and will behave in an appropriately realistic manner in the simulated environment.

VI. RELATED WORK

Fig. 5. A robot in a simulated environment, with physics debugging information and sensor visualizations enabled.

Fig. 6. A close-up of the robot in figure 5.

There are a number of robot simulation environments currently available. In this section we briefly survey the most well-known, and discuss how they differ from our own environment. All of the environments have common elements: they all use a physics simulation, a graphics system, and a modeling language. While the choices of physics simulation and modeling language vary, most of the systems described here use (the low-level) OpenGL as their underlying graphics system.

The closest existing system to the one described in this paper is USARsim [13], a high-fidelity simulator based on Epic Games’ UnrealEngine2 engine. USARsim was explicitly developed to serve as a simulator for urban search and rescue (USAR) robots, and includes models of the NIST reference test arenas for mobile robots [14]. Robot sensors and actuators can be accessed through either the Player [1] or Pyro [15] APIs. Six robot types, including the popular ActivMedia Pioneer 2 series, are supported. Since the system is based on a game engine, human-controlled avatars should be available, but these are not mentioned in the published descriptions of the system.

Webots [16] is a commercial simulation package produced by Cyberbotics. It uses the open source Open Dynamics Engine (ODE) [4] to simulate realistic physics, and includes an extensive library of sensors and actuators from which complete robots can be built. Users can program their simulated robots in C, C++, or Java, and this code can then be transferred to a real robot (for a subset of the supported robots). The package includes tools for editing robot configurations and environments, and can inter-operate with other modeling systems using the VRML97 standard. Webots runs on Windows, Linux, and Mac operating systems.

Darwin2K is an open source simulation built in support of evolutionary robotics research [17]. It models the robot system at a very low level, since the purpose of the evolutionary algorithms is to discover novel and useful configurations of primitive components, such as gears and bushings. Darwin2K runs on Linux or Irix under the X Window System.

OpenSim [18] is another open source simulation package under active development. It uses ODE for physics simulation, and is focused on supporting research into the inverse kinematics of redundant manipulators.

Perhaps the most widely-used open source simulation environment is the Player/Stage project [1], [19]. Player provides device abstractions for real and simulated robots. Stage provides a traditional 2D simulated environment, while Gazebo [20] is a full 3D simulation with realistic physics. Gazebo uses ODE for its physics simulation, and can run on Linux and Mac OS X machines.

Our simulation environment differs from those described in this section in several ways. We use a different physics simulation package, described in section II-B. In itself, this is not particularly significant, since all physics simulations perform the same basic job. However, as we discussed earlier, our system can potentially use hardware acceleration and simulate much more complex systems.

We also use a fully-featured 3D graphics engine for our graphics system. This gives us a level of abstraction from the low-level graphics primitives that most other environments do not have. This means that it is easier for us to add realistic visual effects to our environments, such as smoke, transparency, and dynamic lighting conditions. It also allows us to adapt to different operating systems more easily, since the engine can use either OpenGL or DirectX primitives, depending on which is more efficient.

The simulator is built on open standards, allowing robot, world, and object development to be carried out in a wide variety of commercial modeling tools. This allows professionally-designed objects to be used easily in the simulation.

Finally, we have the ability to have human-controlled agents in our worlds, using techniques common in video games. While it is possible, in principle, to also do this in the other environments, we have never seen it reported. Having mixed teams of robots and humans allows for a much greater range of possible experiments using the simulation environment.

VII. CONCLUSIONS AND FUTURE WORK

In this paper, we have described a mobile robot simulation environment built from technologies commonly used in computer games. The simulator includes a realistic physics simulation and a high-quality graphics rendering system. In addition to supporting simulated robots controlled using the Player API, we also support human-controlled avatars. This greatly increases the range of experiments that can be usefully carried out in the simulator. In particular, it allows us to conduct human-robot collaboration experiments.

The simulator is still under active development at the time of writing. In particular, we are focusing on adding more objects (robots, sensors, avatars, and “props”) to the world, adding more complete support for the Player API, and adding more realistic, empirically-determined sensor measurement errors.

In the longer term, we would like to add a number of new features to the system. We currently have a keyboard-based chat interface to allow humans to communicate with each other. We will replace this with an audio channel that humans can speak into directly. Any humans or robots close to the speaker would be able to “hear” what was said, at an appropriate volume. We also plan on adding realistic robot and environmental noises, so that humans (and other robots) can hear robots approaching. Avatars currently have only a simple movement animation. We plan to implement a richer motion vocabulary for them, based on recent work in computer games. This will also allow an avatar to have a richer interaction with the world, and to manipulate objects directly.

ACKNOWLEDGMENTS

We would like to thank Joyce Santos for building many of the initial models for the simulation environment, including the Pioneer 3. We would also like to thank the Ogre3D developers and forum members, the TNL developers, and the Ageia PhysX developers, for providing the core technologies for this work.

REFERENCES

[1] B. P. Gerkey, R. T. Vaughan, K. Stoy, A. Howard, G. S. Sukhatme, and M. J. Matarić, “Most valuable player: A robot device server for distributed control,” in IEEE/RSJ International Conference on Intelligent Robots and Systems, 2001, pp. 1226–1231.
[2] “Ogre3D open source graphics engine,” http://www.ogre3d.org/.
[3] “PhysX physics engine,” http://www.ageia.com/products/physx.html.
[4] “Open Dynamics Engine,” http://www.ode.org/.
[5] “TNL: The Torque Networking Library,” http://www.opentnl.org/.
[6] “COLLADA: an open digital asset exchange schema for the interactive 3D industry,” http://www.collada.org/.
[7] “Maya modeling, animation, and rendering system,” http://www.alias.com/.
[8] “3D Studio Max,” http://www.autodesk.com.
[9] “Blender open source 3D graphics system,” http://www.blender.org/.
[10] “Softimage/XSI digital character software,” http://www.softimage.com/Products/Xsi/v5/.
[11] “ActivMedia Pioneer 3 robot,” http://www.activrobots.com/.
[12] D. J. Bruemmer and M. Anderson, “Intelligent autonomy for remote characterization of hazardous environments,” in Proceedings of the IEEE International Symposium on Intelligent Control, Houston, TX, October 2003.
[13] J. Wang, M. Lewis, S. Hughes, M. Koes, and S. Carpin, “Validating USARsim for use in HRI research,” in Proceedings of the Human Factors and Ergonomics Society 49th Annual Meeting, 2005, pp. 457–461.
[14] A. Jacoff, E. Messina, and J. Evans, “Experiences in deploying test arenas for autonomous mobile robots,” in Proceedings of the 2001 Performance Metrics for Intelligent Systems (PerMIS) Workshop, Mexico City, Mexico, 2001.
[15] D. Blank, D. Kumar, L. Meeden, and H. Yanco, “The Pyro toolkit for AI and robotics,” AI Magazine, vol. 27, no. 1, pp. 39–50, Spring 2006.
[16] “Webots mobile robot simulation,” http://www.cyberbotics.com/.
[17] C. Leger, Darwin2K: An Evolutionary Approach to Automated Design for Robotics. Boston, MA: Kluwer Academic Publishers, 2000.
[18] “OpenSim: an open 3D robotics simulator,” http://opensimulator.sourceforge.net/.
[19] “The Player/Stage project,” http://playerstage.sourceforge.net/.
[20] N. Koenig and A. Howard, “Design and use paradigms for Gazebo, an open-source multi-robot simulator,” in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2004), 2004, pp. 2149–2154.
