Examining User Experiences in a Mobile Augmented Reality Tourist Guide
David Střelák, Filip Škola, Fotis Liarokapis
Masaryk University, Faculty of Informatics, HCI Lab, Brno, Czech Republic
[email protected], [email protected], [email protected]
ABSTRACT
This paper presents a mobile augmented reality (AR) guide for cultural heritage. The main focus of this research was to examine user experiences in a mobile AR tourist guide. Real-time tracking was performed using either computer vision techniques or sensor technologies (i.e. GPS and gyroscopes). The main features of the AR tourist application were evaluated with 30 healthy volunteers (15 males and 15 females). Results showed that users found the sensor approach easy to use and intuitive, and the majority reported fast adaptation to the AR application. As far as gender differences are concerned, females were more satisfied with the AR experience than males and also reported higher temporal demand. Overall, the feedback showed that AR technology has the potential to be used for tourist guides, since it is easy to use and intuitive.
Categories and Subject Descriptors • Human-centered computing ➝ Ubiquitous and mobile computing • Human-centered computing ➝ Interaction paradigms ➝ Mixed / augmented reality
Keywords augmented reality; cultural heritage; urban environments.
1. INTRODUCTION
Augmented reality (AR) is a collection of interactive technologies that merge real and virtual content and provide accurate registration in three dimensions [1]. In 1994, the virtuality continuum was introduced to differentiate between virtual, real and augmented environments [2]. An enriched taxonomy was proposed in 2002 [3], which for example included video warping to compensate for distortion caused by lenses, or the removal of parts of the scene (i.e. advertising billboards). More recently, further taxonomies were introduced, including one based on AR subsystems [4], another based on functional purpose [5], and a four-dimensional one (tracking degrees of freedom, augmentation type, temporal base and non-visual rendering modalities) to describe applications [6].
In the past years, AR has been applied experimentally to a number of cultural heritage applications, covering both indoor environments (i.e. museums, galleries) and outdoor environments (i.e. tourist guides). Mobile AR seems able to provide all the technology necessary for guiding tourists through urban environments. An account of the user's cognitive environment is required to ensure that representations are delivered not just on technical but also on usability criteria. A key concept for all location-based mobile applications is the 'cognitive map' of the environment held in mental image form by the user [7].
The aim of this paper was to design, implement and evaluate an AR tourist guide for mobile devices (smartphones or tablets) that is able to present interactive historical information to tourists. The historic city centre of Brno (Czech Republic) was selected as a case study. Real-time tracking was performed using either computer vision techniques or sensor technologies (i.e. GPS and gyroscopes). Additionally, focus was given to examining user experiences in the mobile AR tourist guide. The main features of the AR tourist application were evaluated with 30 healthy volunteers (15 male and 15 female), measuring the cognitive workload of the participants as well as their presence in the AR environment. Results showed that AR technology has the potential to be used for tourist guides, since it is easy to use and intuitive.
The rest of the paper is structured as follows. Section 2 presents similar AR systems. Section 3 presents how an old church (which no longer exists) was modeled in three dimensions. Sections 4 and 5 present the computer vision approach and the sensor solution for the AR application. Section 6 presents the evaluation results, and Section 7 the conclusions and future work.
2. BACKGROUND
During the past 20 years, a number of AR guides have been proposed for tourist purposes. In 1997, a prototype system explored how AR and mobile computing might together make possible wearable computer systems that can support users in their everyday interactions with the world [8]. The researchers used a head-tracked, see-through, head-worn 3D display and an untracked, opaque, handheld 2D display with stylus and track-pad. A few years later, another system, called MARS, was developed to aid navigation and to deliver location-based information to tourists in a city [9]. The system made use of a computer, a GPS system, a see-through head-worn display and a stylus-operated computer.
Interaction was performed via a stylus, and display was via a tracked see-through head-worn display. MARS, like most mobile AR systems of its time, had significant ergonomic restrictions which stretch the definition of mobile and wearable computing beyond what is acceptable for most users (the system was driven by a computer carried in a backpack). In 2002, a unified interface was designed to support outdoor mobile AR applications and indoor VR applications [10]. The AR solution incorporated 3D interaction techniques, modeling techniques, tracked input gloves and a menu control system to build VR/AR applications that can construct complex models of objects in both indoor and outdoor environments. In the same year, ARLib was proposed [11], a mobile location-based AR application aiming to assist users in typical tasks performed within a library environment. Another system showed how AR can be used as part of a historical site by obtaining location via DGPS together with comparison of live-view images (using calibrated pictures from viewpoints) [12]. An interesting detail is that the Fourier shift theorem was used to estimate the translation between images. Another approach to AR in the context of historical heritage is shown in [13], where the authors superimpose drawings and paintings onto the live camera view and filter the scene images to keep the same presentation style as the included objects.
TimeWarp [14] was an outdoor mobile AR game set in the city of Cologne (Germany), which combined artificial static and animated objects with sound. The authors presented a universal mechanism to define and set up multi-modal user interfaces for the game challenges. MapLens [15] was a location-based game combining paper maps with a digital information overlay containing photos and Points-of-Interest (POIs). In 2008, a highly customisable mobile framework illustrated how personalized visits to open-air heritage sites can be performed [16]. Navigation was based on sensors (GPS and digital compass), and presentation was delivered in a two-dimensional domain (digital map), a three-dimensional domain (VR map) or an AR domain (textual information). These tools can provide an intelligent mobile guide, allowing users to define routes through sites that best satisfy their information needs, and take account of their declared interests to ensure that they do not miss any particular exhibit. Users can perform advanced searches that take into account spatial location and personal interest. More recently, in 2014, the Dublin AR project [17] investigated 26 international and domestic tourists' requirements for the development of a mobile AR tourism application in urban heritage. The findings suggested that although AR has passed the hype stage, the technology is just on the verge of being implemented in a meaningful way in the tourism industry. In 2015, the CorfuAR project [18] supported personalized recommendations in a mobile AR environment. A field study on Corfu visitors showed that the functional properties of CorfuAR evoke feelings of pleasure and arousal, which, in turn, influence the behavioral intention of using it.
3. MODELING THE PAST
The origins of St. Nicolas church date back to the end of the 13th century, when it was built in early Gothic style in the 'lower market', today's Náměstí Svobody (Freedom Square) in Brno. In the 16th century, a flèche (timber spire) and a west frontage with a tower (in Baroque style) were added. Next to the church, an administrative building (městská váha) was later built, where people could weigh and measure goods and pay the associated taxes. Both buildings were rebuilt in the 18th century in high Baroque style. During the rule of Joseph II, the church was deconsecrated and subsequently used as a military warehouse. Between 1869 and 1870, the church and the attached building were pulled down. The foundations were further damaged during the Second World War and by utility networks. During the reconstruction of the square in 2006, a brass line and a plaque were put into the pavement in the place of the original Gothic building (Figure 1 - a).
Figure 1. Original information: (a) church outline in the pavement, (b) perspective grid to estimate sizes, (c) church blueprint
In addition to historical paintings, several photos of the church and the administrative building have been preserved, capturing most of the two buildings shortly before their demolition. Using the scaled blueprint, a grid was overlaid over the pictures with a perspective grid tool to estimate the sizes (see Figure 1 - b and Figure 1 - c). Based on these pictures, a first, simple version of the church was created in 3ds Max 2015. Due to errors in measurement and a wrong scale in the blueprint, however, this model was disproportionate. As additional documents from the time just before the demolition were found in archives, a new version was created by Archaia using Google SketchUp. To be usable in ARGuide, the model had to be converted to the .obj format. The export, however, created a number of unwanted vertices and duplicate faces. These problems were corrected manually, reducing the vertex count from 220,000 to 7,500 (and the polygon count from 65,500 to 7,400).
4. COMPUTER VISION APPROACH
4.1 Camera Calibration
Using homogeneous coordinates and the pinhole camera model, the projection of a 3D point M to an image point m can be described as $s\,\tilde{m} = A\,[R \mid t]\,\tilde{M}$, where s is an arbitrary scale factor, (R, t) represents the rotation and translation of the camera (extrinsic parameters), and A is the camera intrinsic matrix with five degrees of freedom (DOF). There are several algorithms available for calibration. One can use photogrammetric calibration with a known geometry (and optionally its translation) [19], or self-calibration, where images of a static scene are taken and the camera parameters are estimated [20]. The method used in this paper relies on a known planar pattern shown in different orientations and positions [21]; neither the translation nor the orientation has to be known a priori. Zhang's algorithm for distortion estimation and camera calibration works as follows. First, several images of the pattern are taken from different angles. Then, the classic chessboard calibration pattern is detected in the images. Since the pattern is known, this creates pairs of 3D-2D coordinates. The camera intrinsic parameters are estimated, assuming the distortion parameters are all zero. The camera extrinsic parameters are estimated using perspective-n-point (PnP). The PnP problem tries, for n correspondences between 3D points and their 2D projections (n > 3), to determine the extrinsics (i.e. position and rotation) of a perspective camera with known intrinsic parameters. Next, the distortion parameters are estimated, assuming the intrinsics and extrinsics are known. Finally, by running Levenberg-Marquardt optimization [22], [23], the minimal re-projection error is computed; hence all parameter values are estimated. At the end of the calculation, the calibration matrix is stored in the device's internal memory and an undistortion transformation is applied to the video stream frames, so the user can visually verify that the process was successful.
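As a minimal sketch of this Zhang-style procedure, the following Python/OpenCV snippet calibrates from chessboard images; the pattern size, square size and the calib_*.png file names are illustrative assumptions, not values from the paper.

```python
import glob
import cv2
import numpy as np

PATTERN = (9, 6)   # inner corners per chessboard row/column (assumed layout)
SQUARE = 0.025     # chessboard square size in metres (assumed)

# 3D corner coordinates in the pattern's own plane (Z = 0), scaled by square size
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE

paths = sorted(glob.glob("calib_*.png"))           # several views, different angles
obj_points, img_points, size = [], [], None
for path in paths:
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if found:                                      # keep only images where the
        obj_points.append(objp)                    # pattern was actually detected
        img_points.append(corners)
        size = gray.shape[::-1]                    # (width, height)

# calibrateCamera runs the Levenberg-Marquardt refinement internally; it returns
# the intrinsic matrix A, the distortion coefficients and per-view extrinsics
rms, A, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, size, None, None)
print("re-projection error:", rms)

# undistort one input image so the result can be verified visually
undistorted = cv2.undistort(cv2.imread(paths[0]), A, dist)
```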
4.2 Natural Feature Tracking
The algorithm processes only the newest frame available, ignoring frames that arrive while processing is already running. The implementation is a modification of [24]: at start-up, the 'Oriented FAST and Rotated BRIEF' (ORB) detector and descriptor are initialized, followed by loading of the camera calibration matrix. The ORB detector and descriptor were proposed in [25] as an alternative to the SIFT and SURF algorithms, which are of good quality but impose a large computational burden. ORB's advantages are low noise sensitivity and CPU-oriented calculation, so no GPU acceleration is necessary. While processing an incoming frame, the detected features (or rather their descriptors) are compared to the descriptors in the files found during the initialization phase, and the file containing the most (or enough) similar matches is identified. To speed up matching for the next frame, this file is put at the beginning of the feature file list, so that it will be checked first next time. The best estimate of the camera pose is calculated from the matched features and their respective 3D positions. The camera pose is relative to the origin of the model used for data generation, so the position of the origin in the real world is added after converting to world space. For verification purposes, it is possible to visualize the pose computed by the algorithm (Figure 2). In particular, the last used position and the file used to calculate it are sent to the native side. This position is transformed back to camera space, i.e. the axes are rotated back and the location of the origin in equirectangular projection is subtracted. Each file, in addition to the feature descriptors, also contains the list of vertices (triangles) of the model used for calculating the feature positions. Together with the camera intrinsics obtained from the calibration file, this is enough information to project these vertices from 3D to 2D and draw lines between them. The lines are drawn directly into the image, so that they appear to be behind all 3D models.
Figure 2. Feature tracking: (left) without verification lines, (right) with verification lines behind the 3D cube
However, for natural feature tracking to work, it is necessary to have a database of features (or rather descriptors) and their 3D positions. As the application was tested at Náměstí Svobody in Brno, it was essential to have a 3D model of the square with location data. Using photogrammetry software, a 3D model was generated from a series of photographs. There are no special requirements for camera quality or for the positions from which the images are taken. The output is a textured, triangulated mesh of reasonable quality that can be further processed (see Figure 3).
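A sketch of the matching loop described above follows, assuming the per-viewpoint feature files have been loaded into FeatureFile records; the class, the function name and the distance threshold are illustrative assumptions, not the application's actual identifiers.

```python
from dataclasses import dataclass
import cv2
import numpy as np

@dataclass
class FeatureFile:
    """Pre-computed descriptors and matching 3D points for one viewpoint."""
    descriptors: np.ndarray   # uint8, one 32-byte ORB descriptor per row
    points3d: np.ndarray      # corresponding 3D model positions

orb = cv2.ORB_create(nfeatures=1000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)  # Hamming: binary descriptors

def match_against_files(frame_gray, stored):
    """Find the stored feature file that matches the current frame best."""
    keypoints, desc = orb.detectAndCompute(frame_gray, None)
    if desc is None:
        return None, [], []
    best, best_matches = None, []
    for entry in stored:
        matches = matcher.match(entry.descriptors, desc)
        good = [m for m in matches if m.distance < 40]   # assumed threshold
        if len(good) > len(best_matches):
            best, best_matches = entry, good
    if best is not None:       # move the winner to the front of the list so
        stored.remove(best)    # the next frame checks the likeliest file first
        stored.insert(0, best)
    return best, best_matches, keypoints
```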
Figure 3. Photogrammetric model: (left) textured model of the square, (right) its mesh
4.3 Backend Application
The 3D models of the buildings were simplified to box-like shapes to reduce the number of vertices. The input to the 'backend' application consists of the image to be processed, the camera calibration file, the name of the file to be created (containing the feature descriptors), the simplified 3D model of the object, the origin coordinates of the model in equirectangular projection (which maps meridians to vertical straight lines of constant spacing and circles of latitude to horizontal straight lines of constant spacing) and the maximum number of features to detect.
The 3D model is loaded and shown from two different views by projecting the model triangles using virtual cameras. The user is asked to click in the undistorted image on the position of the highlighted vertex, thereby creating a 3D-2D tuple (see Figure 4). After a sufficient number of pairs is obtained, the camera position is estimated.
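The pose estimation from such clicked 3D-2D pairs can be sketched with OpenCV's PnP solver as below; the intrinsics, the model vertices and the ground-truth pose are synthetic values used only so the snippet runs stand-alone (in the real application, the pixel positions come from the user's clicks).

```python
import cv2
import numpy as np

# intrinsics and distortion would come from the calibration step (assumed values)
A = np.array([[800.0, 0.0, 400.0],
              [0.0, 800.0, 300.0],
              [0.0,   0.0,   1.0]])
dist = np.zeros(5)

# six model vertices (the clicked 3D points) and a known ground-truth pose,
# used here only to synthesize the 'clicked' pixel positions
object_pts = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0],
                       [0, 1, 0], [0, 0, 1], [1, 0, 1]], np.float32)
rvec_true = np.array([0.1, -0.2, 0.05])
tvec_true = np.array([0.3, -0.1, 5.0])
image_pts, _ = cv2.projectPoints(object_pts, rvec_true, tvec_true, A, dist)

# recover the pose from the 3D-2D correspondences, as the backend does
ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, A, dist)
R, _ = cv2.Rodrigues(rvec)                  # rotation vector -> 3x3 matrix
camera_position = -R.T @ tvec.reshape(3)    # camera centre in model coordinates
print(camera_position)
```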
Figure 4. Backend application, auto-aligning phase. The user is asked to click on the location of the vertex highlighted in red (i.e. to click on the purple circle)
To further refine the estimate, the user can manually change the position and orientation of the virtual camera. Its coordinates (relative to the origin of the 3D model) are shown for verification (see Figure 5).
Figure 5. Backend application, manual pose adjustment
When the user is satisfied with the alignment, features are detected and their descriptors created. Since both the camera intrinsics and extrinsics are known, it is possible to test whether each feature lies on the 3D model or not (and where). Here, the Möller-Trumbore ray/triangle intersection algorithm [26] was used; the results are shown in Figure 6.
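For reference, a straightforward NumPy rendition of the Möller-Trumbore test [26] is given below; it is a sketch of the published algorithm, not the application's actual code, used here to decide whether a feature's back-projected ray hits a model triangle.

```python
import numpy as np

def moller_trumbore(orig, direc, v0, v1, v2, eps=1e-9):
    """Return distance t along the ray to the triangle, or None if no hit."""
    e1, e2 = v1 - v0, v2 - v0
    p = np.cross(direc, e2)
    det = np.dot(e1, p)
    if abs(det) < eps:              # ray parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    s = orig - v0
    u = np.dot(s, p) * inv_det      # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    q = np.cross(s, e1)
    v = np.dot(direc, q) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = np.dot(e2, q) * inv_det     # distance along the ray
    return t if t > eps else None

# ray shot along +Z through a unit triangle in the Z=0 plane: hit at t = 1
print(moller_trumbore(np.array([0.25, 0.25, -1.0]), np.array([0.0, 0.0, 1.0]),
                      np.zeros(3), np.array([1.0, 0.0, 0.0]),
                      np.array([0.0, 1.0, 0.0])))
```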
Figure 6. Backend application, found features and descriptors
5. SENSOR SOLUTION
The sensor-based AR solution uses two kinds of orientation input: the rotation vector sensor, and a combination of the accelerometer and the geomagnetic sensor. Events generated by the latter two sensors contain raw information without any pre-processing or filtering; as such, linear interpolation between the previous and the new value is used to smooth the data. Since the geomagnetic sensor gives the orientation towards north and the accelerometer towards the sky (or the centre of the Earth), the east direction can be computed as the cross product of these vectors. After normalization, an orthonormal basis in the world coordinate system is obtained. Instead of fusing raw sensor data, which is discouraged, the preferred solution is to use data from the rotation vector sensor. This sensor is either software based (most often) or hardware based. The software version computes its output from several sensors, most commonly the magnetometer, gyroscope and accelerometer. Some implementations also smooth the output with filtering, typically a Kalman filter, but this is done at the native level, without any means of adjustment. The outputs of this sensor are the elements of a quaternion describing the device's orientation in the world coordinate system.
For position tracking, GPS was used. Since Android 2.2, Google discourages using the Android framework location APIs to obtain the device's location. The preferred solution is the Google Play services location API, as it provides access to the fused location provider, which uses GPS, GSM tracking, Wi-Fi networks and their combination, depending on the availability of the sensors and the power and accuracy demands of the application. To obtain the location, the application connects to Google Play Services and creates a location request. The request specifies how frequently the application requires updates and the power consumption priority (GPS is considered the most accurate and most power demanding). After the request is ready, the application receives events with the newly obtained location, including latitude and longitude, altitude, bearing, an accuracy estimate and other information that can be used within the application.
Figure 7. AR view of the church
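A minimal sketch of the basis construction and smoothing described above is shown below; the smoothing factor and the sample sensor readings are assumptions for illustration, and the implementation mirrors the cross-product construction in spirit rather than reproducing the application's native code.

```python
import numpy as np

def smooth(prev, new, alpha=0.25):
    """Linear interpolation between the previous and the new raw reading."""
    return (1.0 - alpha) * prev + alpha * new

def world_basis(gravity, geomagnetic):
    """Orthonormal (east, north, up) basis from the two sensor vectors."""
    up = gravity / np.linalg.norm(gravity)   # accelerometer points to the sky
    east = np.cross(geomagnetic, up)         # magnetic field x up = east
    east /= np.linalg.norm(east)
    north = np.cross(up, east)               # completes the orthonormal basis
    return east, north, up

# device lying flat, northern hemisphere: gravity up, field pointing north/down
east, north, up = world_basis(np.array([0.0, 0.0, 9.81]),
                              np.array([0.0, 22.0, -40.0]))
print(east, north, up)   # -> x axis, y axis, z axis of the world frame
```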
To test the application, participants were asked to perform the following three tasks: (a) setting up the application, (b) calibrating the internal camera of the device, and (c) running the main activity. In the first task, the user is shown the application as if it had just been installed on the device; the task is to set the preferred language and the orientation and location providers to be used when the main activity runs. In the second task, the user calibrates the camera of the device using the provided chessboard pattern, following the monitor's instructions. In the final task, the user runs the main activity (Guide), selects his/her current location from the menu, and then uses the device to look around and locate the church. Finally, if the user clicks on the 3D object, its name is shown on the screen; there is also an option to load web pages with additional information (either offline or online).
6. EVALUATION
6.1 Tracking Testing
As one of the major requirements was real-time performance, the speed of the developed solution was tested both on artificial data and on real input. Performance was measured by logging to the device memory, which should have minimal influence on the results. For testing purposes, a tablet and a smartphone were used. The tablet was an NVidia Shield Tablet (Tegra K1 quad-core 2.2 GHz 32-bit, IPS 8" 1920x1200, 2 GB RAM, Android 5.1.1, Wi-Fi, GPS, accelerometer, gyroscope, geomagnetic sensor, rotation vector sensor); for computer vision, a resolution of 800x600 was used. The smartphone was a Motorola Moto G (Qualcomm Snapdragon 400 quad-core 1.2 GHz, IPS 4.5" 1280x720, 1 GB RAM, Android 4.4.4, GSM, Wi-Fi, GPS, accelerometer, gyroscope, geomagnetic sensor, rotation vector sensor); for computer vision, a resolution of 864x480 was used. The worst-case scenario was tested using a highly cluttered scene. Since a sufficient number of matches could not be obtained, roughly 500x3000 descriptor comparisons were calculated every time; this corresponds to the red and orange lines. As expected, the tablet is roughly twice as fast as the mobile phone, processing approximately 2.27 FPS. The large scatter measured on the mobile phone is caused by the slow focusing of its camera.
Figure 8. Computer vision detection time
With the files optically well aligned, the error was measured at the corner of the memorial on the square. The results are shown in Figure 9. Except for one outlier, computer vision can produce reasonably stable output under suitable weather conditions. As for GPS, accuracy in open space is very good; the initial error is corrected quickly, as more satellite fixes are obtained.
Figure 9. GPS vs. computer vision location accuracy
Based on these findings, it was decided to perform the user testing with the sensor solution only, since it produced much more reliable results.
6.2 User Evaluation
6.2.1 Data Collection
Questionnaires were used for collecting feedback from 30 healthy participants (15 males and 15 females). The sample consisted of adults ranging from students to senior citizens, but the majority were students (between 23 and 27 years old). The questionnaire comprises three parts: the first asks for demographic details, the second deals with the AR application, and the third consists of NASA-TLX questions regarding the cognitive workload of the task. The majority of the participants had little or no previous experience with AR and VR, with a slight advantage for VR; sometimes they were not able to distinguish between the two and an explanation was needed. Feedback for the questions in the AR application questionnaire was gathered in the form of ratings, a higher score meaning a subjectively better rating for a given feature of the application. Conversely, a higher rating on the NASA-TLX scales means higher cognitive workload and demand. A screenshot of a user experiencing the AR visualization is shown in Figure 10.
Figure 10. User evaluation
The rated features of the AR application are ease of use, ease of learning, usability, satisfaction, adjustment to the AR experience ("How quickly did you adjust to the augmented reality experience?"), interaction ("How would you rate the interaction?"), intuitiveness ("How intuitive was the interface to use?"), and delay ("How much delay did you experience between your actions and expected outcomes?"). The scale has five points (rating 1-5). The NASA-TLX questionnaire is composed of six questions rated on a twenty-one-point scale; its purpose is to measure the cognitive workload of a task. Workload is rated on these scales: Mental Demand ("How mentally demanding was the task?"), Physical Demand ("How physically demanding was the task?"), Temporal Demand ("How hurried or rushed was the pace of the task?"), Performance ("How successful were you in accomplishing what you were asked to do?"), Effort ("How hard did you have to work to accomplish your level of performance?") and Frustration ("How insecure, discouraged, irritated, stressed, and annoyed were you?").
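The paper does not state how the NASA-TLX subscales were aggregated; as an illustrative assumption only, the snippet below computes the common raw ("unweighted") TLX score by averaging the six subscale ratings, using made-up sample data.

```python
# sample ratings for one participant on the 21-point NASA-TLX scales (invented)
ratings = {"mental": 12, "physical": 4, "temporal": 9,
           "performance": 6, "effort": 10, "frustration": 5}

raw_tlx = sum(ratings.values()) / len(ratings)   # unweighted mean workload
print(f"raw TLX workload: {raw_tlx:.1f} / 21")
```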
6.2.2 Qualitative Evaluation
Qualitative feedback was collected from all participants. In terms of the user experience in AR, several interesting ideas came out of the testing. Users would like to have more control over the scene, e.g. to be able to click on just a part of a model and see some information about it (i.e. the height of the bell tower). Another interesting idea is to implement zooming in/out of the digital objects, either to see them in greater detail or to get a better overview of the scene; this would be useful both in confined and in large open areas. In terms of positive feedback, subjects agreed that the AR sensor solution has potential, as this field is promising. One subject mentioned that "overall, it was nice experience and the result was quite impressive". Another subject stated that "It was a nice experience and I'd wish this would be available also for other locations and monuments". Another welcome feature might be taking pictures in high definition, possibly with some (user-driven) post-processing. Adding spatial sound support might also enhance the experience.
As far as suggestions are concerned, participants noted that the design of the application needs to be changed to be more intuitive. Many agreed that some form of interactive tutorial explaining the controls should be embedded directly into the application. From the monitor's point of view, people owning non-Android devices had the biggest problems orienting themselves within the application. The calibration activity should also be made faster and easier for non-technical users. Finally, as described above, users would like to have more control over the scene.
6.2.3 Quantitative Evaluation
Participants in this study gave very strongly correlated answers to the questions regarding ease of use and ease of learning (r=0.749, p=0). This makes sense: they were using the software for the first time, so their ability to use the application was indeed determined by how easily they could learn to control it. These two categories were also strongly correlated with the usability question (r=0.509, p=0.004 for ease of use; r=0.585, p=0.001 for ease of learning). An interesting trend can be spotted in the correlations with adjustment to the AR scene: most of the subjects who reported faster adaptation to the AR scene also rated the software as easy to use (r=0.668, p=0) and as intuitive (r=0.709, p=0). Overall satisfaction was related to usability (r=0.424, p=0.019) and intuitiveness (r=0.368, p=0.045), but the most significant relation with satisfaction was held by the interaction ratings (r=0.733, p=0). The other factors correlating with intuitiveness are ease of use (r=0.538, p=0.002), ease of learning (r=0.672, p=0), usability (r=0.442, p=0.014), and satisfaction (r=0.368, p=0.045). The most significant correlations are shown in Figure 11 and Figure 12.
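The paper does not name its statistics tool, so as an assumption, correlations of this kind can be reproduced with SciPy's Pearson test; the rating arrays below are placeholder data, not the study's responses.

```python
import numpy as np
from scipy.stats import pearsonr

# placeholder 1-5 ratings for two questionnaire items (invented sample data)
ease_of_use = np.array([4, 5, 3, 4, 5, 2, 4, 3, 5, 4])
ease_of_learning = np.array([4, 5, 3, 5, 5, 2, 4, 3, 4, 4])

r, p = pearsonr(ease_of_use, ease_of_learning)   # Pearson r and its p-value
print(f"r = {r:.3f}, p = {p:.3f}")
```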
Figure 12. Overview of significant correlations
Similar trends can be observed for the questions about ease of use and overall usability, as these categories correlate with one another. Participants reported higher mental demand when the application was not intuitive (r=-0.709, p=0) and when they had a hard time adjusting to the AR scene (r=-0.718, p=0); these correlations are very strong. Lack of intuitiveness also led participants to rate the overall task as more hurried (r=-0.637, p=0), harder (r=-0.649, p=0), and more frustrating (r=-0.719, p=0). A weaker but more unusual correlation is between delays in the application and the reported physical demand (r=0.384, p=0.036).
Figure 13. Gender differences in terms of user experience
Figure 11. Significant correlations (performance/frustration, frustration/ease of learning, mental demand/ease of use and intuitiveness/frustration)
Examining the relations between the questions about the AR experience and the NASA-TLX questions about difficulty and workload confirms the rather predictable fact that the easier the work with the application was for a participant, the less demanding the task was for him/her. Ease of use negatively correlates with mental demand (r=-0.564, p=0.001), the subjective rating of one's performance (r=-0.482, p=0.007), the effort of the task (r=-0.621, p=0), and the feeling of insecurity, discouragement, and stress (frustration) (r=-0.574, p=0.001).
In terms of gender, there is some difference in previous experience with AR between males (mean 1.80) and females (mean 1.27), with a similar trend in experience with VR and in daily computer use. Females, however, rated the experience with the AR application as more satisfying (mean 4.20 vs. 3.87 among males), the interaction as better (mean 4.00 vs. 3.07), and the interface as more intuitive (mean 3.40 vs. 3.27). The mean value for ease of use is 3.67 for both males and females. Adjustment to the AR scene was also rated quite similarly (mean rating 3.73 for males and 3.67 for females).
Figure 14. Gender comparison results
Gender differences are also present in the NASA-TLX questions (Figure 15). The mean values for mental and physical demand, effort, and frustration do not differ much. Males tended to rate their performance as better (mean 6.20) than females did (mean 9.00; on this scale, lower ratings indicate better perceived performance). Female participants reported higher temporal demand (mean 9.07 for females vs. 6.80 for males).
Figure 15. Gender comparison of cognitive workload
7. CONCLUSIONS
This paper presented a mobile AR guide for examining user experiences in tourist guides. Real-time tracking was performed using either computer vision techniques or sensor technologies (i.e. GPS and gyroscopes). The sensor technologies worked more robustly and were therefore used for the evaluation, in which the main features of the AR tourist application were assessed with 30 healthy volunteers (15 male and 15 female). Results showed that users found the sensor approach easy to use and intuitive, and the majority reported fast adaptation to the AR application. As far as gender differences are concerned, females were more satisfied with the AR experience than males and also reported higher temporal demand. Overall, the feedback showed that AR technology has the potential to be used for tourist guides, since it is easy to use and intuitive.
In terms of the received feedback, the intuitiveness, ease of use and ease of learning were rated positively by the majority, regardless of gender, occupation and age. The biggest troubles were, as expected, reported by elderly people and users of non-Android systems. Even though the interaction and the delay could be improved (especially during camera calibration), participants were rather satisfied with the application and got used to the AR experience quickly. Quite a lot of people found the application usable and would like to have something like it while traveling. Interesting results were collected from the questionnaires: strong correlations were found between ease of use and ease of learning, and most of the subjects who reported faster adaptation to the AR scene also rated the software as easy to use and intuitive. Overall satisfaction was also related to usability and intuitiveness. In terms of gender differences, females rated the experience with the AR application as more satisfying, the interaction as better, and the interface as more intuitive.
In the future, the computer vision approach will be improved based on [27], so that a hybrid solution can be used and tested. In addition, the visual quality of the 3D information will be improved using better-quality textures. Finally, more user studies will be performed, collecting more data such as real-time biofeedback. This will help to reach a better understanding of user experiences in mobile AR guides.
8. ACKNOWLEDGMENTS
The authors would like to thank the Human-Computer Interaction Lab members for their support and inspiration.
9. REFERENCES
[1] Azuma, R.T. 1997. A Survey of Augmented Reality. Presence: Teleoperators and Virtual Environments, 6(4): 355-385. DOI=10.1162/pres.1997.6.4.355
[2] Milgram, P. and Kishino, F. 1994. A taxonomy of mixed reality visual displays. IEICE Transactions on Information Systems, E77-D(12): 1321-1329.
[3] Mann, S. 2002. Mediated Reality with implementations for everyday life. http://wearcam.org/presence_connect/
[4] Braz, J.M. and Pereira, J.M. 2008. TARCAST: Taxonomy for augmented reality CASTing with web support. The International Journal of Virtual Reality, IPI Press, 7(4): 47-56.
[5] Hugues, O., Fuchs, P. and Nannipieri, O. 2011. New augmented reality taxonomy: Technologies and features of augmented environment. In Handbook of Augmented Reality. Springer, 47-63.
[6] Normand, J.-M., Servières, M. and Moreau, G. 2012. A new typology of augmented reality applications. In Proceedings of the 3rd Augmented Human International Conference (AH'12). ACM, Article 18. DOI=10.1145/2160125.2160143
[7] Liarokapis, F., Brujic-Okretic, V. and Papakonstantinou, S. 2006. Exploring Urban Environments using Virtual and Augmented Reality. Journal of Virtual Reality and Broadcasting, GRAPP 2006 Special Issue, Digital Peer Publishing, 3(5): 1-13.
[8] Feiner, S., MacIntyre, B., Höllerer, T. and Webster, A. 1997. A touring machine: prototyping 3D mobile augmented reality systems for exploring the urban environment. In Proceedings of the 1st International Symposium on Wearable Computers. IEEE Computer Society, 74-81. DOI=10.1109/ISWC.1997.629922
[9] Höllerer, T., Feiner, S.K., Terauchi, T., Rashid, G. and Hallaway, D. 1999. Exploring MARS: developing indoor and outdoor user interfaces to a mobile augmented reality system. Computers and Graphics, Elsevier, 23(6): 779-785. DOI=10.1016/S0097-8493(99)00103-X
[10] Piekarski, W. and Thomas, B.H. 2002. Unifying Augmented Reality and Virtual Reality User Interfaces. Technical Report, University of South Australia, Adelaide.
[11] Umlauf, E., Piringer, H., Reitmayr, G. and Schmalstieg, D. 2002. ARLib: The Augmented Library. In Proceedings of the First IEEE International Augmented Reality ToolKit Workshop. Darmstadt, Germany.
[12] Vlahakis, V., Ioannidis, N. et al. 2002. Archeoguide: An Augmented Reality Guide for Archaeological Sites. IEEE Computer Graphics and Applications, 22(5): 52-60. DOI=10.1109/MCG.2002.1028726
[13] Zoellner, M., Pagani, A. et al. 2008. Reality Filtering: A Visual Time Machine in Augmented Reality. In Proceedings of the 9th International Symposium on Virtual Reality, Archaeology and Cultural Heritage (VAST 2008). Eurographics, Braga, Portugal, 71-77. DOI=10.2312/VAST/VAST08/071-077
[14] Herbst, I., Braun, A.K., McCall, R. and Broll, W. 2008. TimeWarp: interactive time travel with a mobile mixed reality game. In Proceedings of the 10th International Conference on Human Computer Interaction with Mobile Devices and Services (MobileHCI 2008). ACM Press, 235-244. DOI=10.1145/1409240.1409266
[15] Morrison, A., Jacucci, G., Peltonen, P., Juustila, A. and Reitmayr, G. 2008. Using locative games to evaluate hybrid technology. In Evaluating Player Experiences in Location Aware Games Workshop, British HCI 2008, Liverpool, September.
[16] Liarokapis, F., Sylaiou, S. and Mountain, D. 2008. Personalizing Virtual and Augmented Reality for Cultural Heritage Indoor and Outdoor Experiences. In Proceedings of the 9th International Symposium on Virtual Reality, Archaeology and Cultural Heritage (VAST 2008). Eurographics, Braga, Portugal, 55-62.
[17] Han, D.I., Jung, T. and Gibson, A. 2014. Dublin AR: implementing augmented reality in tourism. In Information and Communication Technologies in Tourism 2014. Springer, 511-523. DOI=10.1007/978-3-319-03973-2_37
[18] Kourouthanassis, P., Boletsis, C., Bardaki, C. and Chasanidou, D. 2015. Tourists responses to mobile augmented reality travel guides: The role of emotions on adoption behavior. Pervasive and Mobile Computing, Elsevier, 18: 71-87. DOI=10.1016/j.pmcj.2014.08.009
[19] Tsai, R. 1987. A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses. IEEE Journal of Robotics and Automation, 3(4): 323-344.
[20] Pollefeys, M., Koch, R. and Van Gool, L. 1999. Self-Calibration and Metric Reconstruction in spite of Varying and Unknown Intrinsic Camera Parameters. Kluwer Academic Publishers, Hingham, MA, USA, 7-25.
[21] Zhang, Z. 2000. A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(11): 1330-1334.
[22] Levenberg, K. 1944. A Method for the Solution of Certain Non-Linear Problems in Least Squares. Quarterly of Applied Mathematics, 2: 164-168.
[23] Marquardt, D. 1963. An Algorithm for Least-Squares Estimation of Nonlinear Parameters. SIAM Journal on Applied Mathematics, 11(2): 431-441.
[24] Riba, E. 2014. Real Time pose estimation of a textured object. OpenCV 3.0.0-dev documentation, OpenCV Tutorials. http://docs.opencv.org/3.0-beta/doc/tutorials/calib3d/real_time_pose/real_time_pose.html
[25] Rublee, E., Rabaud, V., Konolige, K. and Bradski, G. 2011. ORB: an efficient alternative to SIFT or SURF. In Proceedings of the 2011 International Conference on Computer Vision (ICCV). IEEE Computer Society, Barcelona, Spain, 2564-2571. DOI=10.1109/ICCV.2011.6126544
[26] Möller, T. and Trumbore, B. 1997. Fast, Minimum Storage Ray/Triangle Intersection. Journal of Graphics Tools, Taylor & Francis, 2(1): 21-28.
[27] Kähler, O., Prisacariu, V.A., Ren, C.Y., Sun, X., Torr, P.H.S. and Murray, D.W. 2015. Very High Frame Rate Volumetric Integration of Depth Images on Mobile Devices. IEEE Transactions on Visualization and Computer Graphics, 21(11): 1241-1250. DOI=10.1109/TVCG.2015.2459891