Mirror World Navigation for Mobile Users Based on Augmented Reality

Patricia P. Wang, Tao Wang, Dayong Ding, Yimin Zhang
Intel China Research Center
[email protected]

Wenyuan Bi, Yingze Bao
Tsinghua University
[email protected]
ABSTRACT
Finding a destination in an unfamiliar environment such as a complex office building or shopping mall has always been a bother in daily life. Fine-scale directional guidance, a combination of location-based service and context-aware service, is emerging along with developments in mobile computing, wireless communication, and augmented reality. This paper addresses the core problem of how to align what we see with where we are. To achieve robust and accurate navigation by 2D/3D alignment, we developed a hybrid solution that fuses GPS, inertial measurement units, and computer vision methods. We have implemented a prototype of Mobile Augmented Reality and 3D navigation on a manually created virtual environment of the Intel China Research Center office and its surroundings.

[Figure 1 workflow: geo-tagged photos → reconstruction → geometry models → refinement → photo-realistic models → registration (landmarks aligned to terrain / aerial imagery) → augmentation (panorama, city navigation).]
Figure 1: Workflow of the fine-scale directional guidance using the concept of augmented reality.
Categories and Subject Descriptors: I.4.9 [Computing Methodologies]: Image Processing and Computer Vision—Applications; H.4.m [Information Systems Applications]: Miscellaneous

General Terms: Algorithms, Design, Human Factors, Performance

Keywords: Mobile Augmented Reality, mirror world navigation, parallel tracking and mapping, camera pose estimation, GPS, INS

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. MM'09, October 19–24, 2009, Beijing, China. Copyright 2009 ACM 978-1-60558-608-3/09/10 ...$10.00.

1. INTRODUCTION
From plain text and 2D images to 3D graphics, the evolution of information representation enables a more immersive experience when browsing and searching between virtual space and the real world. The development of mobility and communication technologies makes it possible to access information anywhere and at any time. Aware of the great demand for information delivery on virtual representations of the real world, many 3D cities have been created, either by companies such as Google and Microsoft or by governments such as those of Dublin and Berlin. Maintaining and updating a mirror world is incredibly laborious and challenging, so making use of user-generated geo-tagging data is an attractive alternative. Therefore, developing automatic or semi-automatic techniques that help amateurs create and navigate mirror worlds is of great interest to both academia and industry [4]. This paper proposes an integrated solution for mirror world navigation using the concept of augmented reality. There are four major stages, reconstruction, refinement, registration, and augmentation, as illustrated in Figure 1.

• Reconstruction stage is to recover structure from images, either using multi-view stereo techniques or manual work. As the relevant images may be captured by various cameras under different lighting conditions, with cluttering cars and pedestrians, it is necessary to calibrate camera parameters, remove clutter, and detect 3D point correspondences.

• Refinement stage is to enhance the geometry structure, extract facades from 2D images, and attach them to 3D surfaces. As buildings vary in roof, floor, and stylistic design, it is hard to apply general architectural grammars to characterize a specific mesh, and there is a trade-off between realistic effect and model size.
• Registration stage is to put 3D models into a common coordinate system. Geo-tagged photos contain GPS information, but GPS coverage is unavailable indoors. Motion sensors such as accelerometers, magnetometers, and gyroscopes can provide the complementary information needed for fine-scale positioning.

• Augmentation stage is to enhance users' perception by fusing directional guidance into their view of the real world. A scene rendering engine overlays text, images, or graphics onto the live video stream. 3D tracking and camera pose estimation are required to support real-time user interaction.

We manually created hundreds of 3D models of the indoor office and outdoor surroundings of Intel China Research Center; some examples are shown in Figure 2. Automated reconstruction and refinement techniques for large-scale cityscapes remain our ongoing work. In the current prototype, we focus our efforts on the registration and augmentation techniques.

Figure 2: Top: manually created mirror world in our navigation and augmentation prototype. Bottom left: prototype running on a SAMSUNG Q1 UMPC. Bottom right: snapshot of indoor augmented reality.

[Figure 3 block diagram: rates from 3 gyros (ωx, ωy, ωz), together with readings from 3 magnetometers and 3 accelerometers (ax, ay, az), feed quaternion-based attitude computation; accelerations are axis-transformed to the navigation frame (an, ae, av) and numerically integrated to velocity and translation.]
Figure 3: Block diagram of estimating position, velocity, and attitude by Inertial Measurement Units.

2. GPS/IMU SENSOR FUSION
Integrating GPS (Global Positioning System) with INS (Inertial Navigation System) has become indispensable for providing precise and continuous positioning information. In our system, we employ MotionNode [3], a unit with a total of nine high-quality sensors, to compute miniature pose parameters: one accelerometer, one gyroscope, and one magnetometer contribute data for each of the three axes. A block diagram of estimating position, velocity, and attitude by Inertial Measurement Units is illustrated in Figure 3, where ω denotes angular velocity and a denotes acceleration. A quaternion is a parameterization of pose in 3D space: a rotation about the unit vector ω by an angle θ is represented by the unit quaternion q = (cos(θ/2), ω sin(θ/2)). This representation avoids gimbal lock and therefore does not lead to an ill-conditioned optimization problem. The accuracy of the integrated system depends on both subsystems; the positioning error is expressed as σ²_RT = σ²_GPS + σ²_INS(t). An Extended Kalman Filter (EKF) can combine noisy measurements from different cues in a statistically well-grounded way [2]. GPS data is used to obtain the reference trajectory, and the EKF estimates the error states and performs a "measurement update" to correct the reference trajectory.

3. 3D TRACKING AND MAPPING
3D tracking techniques aim at continuously recovering the camera pose (position and orientation) relative to the scene or, equivalently, the 3D displacement of an object relative to the camera. It is a critical component of Augmented Reality applications: the objects in the real and virtual worlds must be properly aligned with respect to each other [2]. A 3D point M = [x, y, z, 1]^T in world coordinates is projected into the 2D point m = [u, v, 1]^T in image coordinates under camera pose P as m = Cam(PM), where P is a 3 × 4 rigid transformation matrix consisting of rotation R and translation t. The function Cam(·) models the projection from the camera frame to image coordinates as

Cam([x, y, z]^T) = [u0, v0]^T + (r′/r) [fu · x/z, fv · y/z]^T,

where r = sqrt((x² + y²)/z²) and r′ = ω⁻¹ arctan(2r tan(ω/2)). The camera parameters, focal length (fu, fv), principal point (u0, v0), and distortion ω, are assumed to be known from camera calibration. A planar, square, black-on-white marker is enough to estimate the camera pose P. Marker-based approaches have become popular because they yield a robust, low-cost solution for real-time 3D tracking [1]. Scene engineering consists of placing a unique marker in every 2 × 2 m² area. Because of its low CPU requirements, marker-based tracking can run in real time on mobile devices.
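As an illustrative sketch, not the paper's implementation, the camera model of Section 3 (pinhole projection with the arctan radial distortion r′ = ω⁻¹ arctan(2r tan(ω/2))) can be written out directly; the pose and calibration values below are hypothetical:

```python
import numpy as np

def cam_project(M, R, t, fu, fv, u0, v0, omega):
    """Project 3D world point M into image coordinates under pose P = [R | t],
    applying the arctan (FOV) radial distortion model with parameter omega."""
    x, y, z = R @ M + t               # world point expressed in the camera frame
    r = np.sqrt(x**2 + y**2) / z      # undistorted radius in normalized image coordinates
    # Distorted radius r' = omega^-1 * arctan(2 r tan(omega / 2))
    r_d = np.arctan(2.0 * r * np.tan(omega / 2.0)) / omega
    factor = r_d / r if r > 1e-12 else 1.0   # avoid division by zero on the optical axis
    u = u0 + fu * factor * (x / z)
    v = v0 + fv * factor * (y / z)
    return np.array([u, v])

# Hypothetical calibration and pose, for illustration only.
R = np.eye(3)
t = np.zeros(3)
p = cam_project(np.array([0.5, -0.2, 2.0]), R, t,
                fu=500.0, fv=500.0, u0=320.0, v0=240.0, omega=0.9)
```

A point on the optical axis projects exactly to the principal point (u0, v0), which is a quick sanity check for any implementation of this model.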
4. CONCLUSIONS
This paper presents a hybrid solution to mirror world navigation. Integration of GPS and INS provides accurate location information in the world coordinate system, and marker-based tracking reinforces the robustness of visual directional guidance on mobile devices. Automated techniques for mirror world creation will be our future work.
5. REFERENCES
[1] ARToolKit. http://www.hitl.washington.edu/artoolkit/.
[2] V. Lepetit and P. Fua. Monocular model-based 3D tracking of rigid objects: A survey. Foundations and Trends in Computer Graphics and Vision, 1(1):1-89, 2005.
[3] MotionNode. http://www.motionnode.com/.
[4] A. Zakhor. Automatic 3D modeling of cities with multimodal air and ground sensors. Multimodal Surveillance: Sensors, Algorithms and Systems, 15:339-362, 2007.