Physical Interaction by Pointing with a Mobile Device

JOHAN PERSSON

Master of Science Thesis
Stockholm, Sweden 2010

Master's Thesis in Computer Science (30 ECTS credits)
at the School of Engineering Physics, Royal Institute of Technology, 2010
Supervisor at CSC: Alex Olwal
Examiner: Lars Kjelldahl
TRITA-CSC-E 2010:068
ISRN-KTH/CSC/E--10/068--SE
ISSN-1653-5715

Royal Institute of Technology
School of Computer Science and Communication (KTH CSC)
SE-100 44 Stockholm, Sweden
URL: www.kth.se/csc

Abstract

Today almost all objects and persons around us have some representation on the Internet. This creates a digital world parallel to the physical world we live in. Recent development of mobile phone hardware has made possible new ways of interaction that bridge these two worlds. The main goal of this degree project was to investigate how this bridging could be done, especially when using a GPS-receiver, accelerometer and magnetometer. Other goals were to determine what applications would benefit from such a way of interaction and whether users would prefer it over more traditional alternatives. During the project, the idea of pointing at objects of interest to interact with them has been taken from idea to part of a fully working application. The application enables users to see timetables for public transportation by pointing with the mobile phone towards a stop point. A novel, patent-pending method for determining what object the mobile phone is pointing at was created. This method compensates for errors in position and direction estimations. It also allows points of interest to have geometric shapes other than points. Two additional prototypes implementing alternative interaction techniques were developed. In user evaluations, these prototypes were compared to the main application to determine what interaction technique users prefer. The conclusion of this degree project is that it is possible to use interaction by pointing to bridge the physical and digital worlds. There are problems and limitations which need to be handled, but there are also possibilities to create a better user experience.

Referat

Physical interaction by pointing with a handheld device

Today almost all objects and persons around us are represented on the Internet, which creates a digital world parallel to the physical world we live in. Recent developments in mobile phone hardware have enabled new ways of interacting that bridge these two worlds. The main goal of this degree project was to investigate how such a bridging can be done, especially with the help of a GPS-receiver, accelerometer and magnetometer. Other goals were to determine what type of applications would benefit from such a way of interacting and whether users would prefer it over more traditional alternatives. During the degree project, the idea of interacting with objects by pointing at them has been taken from idea to being part of a fully working application. The application lets users access timetables for public transportation by pointing the mobile phone at a stop point. A new and patent-pending method for calculating what the mobile phone is pointing at was developed. The method compensates for errors in the position and direction estimations. It also allows points of interest to be represented by geometric shapes and not only by points. Two additional prototypes using alternative interaction techniques were developed. These prototypes were compared to the main application in user tests to determine which interaction technique users prefer. The conclusion of this degree project is that it is possible, with the help of interaction by pointing, to bridge the gap between the physical and digital worlds. There are problems and limitations that need to be handled, but also opportunities to create a better user experience.

Contents

1 Introduction
  1.1 Background
  1.2 Problem Definition
  1.3 Goals
  1.4 Limitations
  1.5 Report Summary
  1.6 Chapter Overview

2 Research Overview
  2.1 Interaction Paradigms
    2.1.1 Touching
    2.1.2 Pointing
    2.1.3 Scanning
  2.2 Sensors
    2.2.1 Accelerometer
    2.2.2 Computer Vision
    2.2.3 GPS
    2.2.4 IR-light
    2.2.5 Magnetometer
    2.2.6 RFID
  2.3 Related Work
    2.3.1 Spatially Aware Handhelds
    2.3.2 Mobile Augmented Reality
    2.3.3 Geo-Wands

3 Interaction Prototype
  3.1 Interaction Choices
  3.2 Prototype Requirements
  3.3 Intersection Calculations
    3.3.1 Ray Tracing
    3.3.2 Barycentric Coordinates
    3.3.3 Ray-Triangle Combination
    3.3.4 Monte Carlo Sampling - an Alternative Method
  3.4 Implementation
  3.5 Prototype Results

4 Application Prototype
  4.1 Concept Development
    4.1.1 Hypothesis
    4.1.2 Four Concept Candidates
    4.1.3 Final Concept
  4.2 Resulting Application - Time2Go
    4.2.1 Welcome Screen
    4.2.2 Bus Data
    4.2.3 Interacting
    4.2.4 Result View
    4.2.5 Detailed View
  4.3 Design Decisions
    4.3.1 The Map
    4.3.2 Precision
    4.3.3 Interaction Trigger
    4.3.4 List of Results
    4.3.5 Detailed View
  4.4 User Evaluation
    4.4.1 Alternative Applications
    4.4.2 The Uppsala Test
    4.4.3 Heuristic Expert Evaluation
    4.4.4 The KTH Test
    4.4.5 Results of the User Evaluations

5 Conclusions, Discussion and Future Work
  5.1 Conclusions
    5.1.1 Interaction Conclusions
    5.1.2 Conclusions About the Hypothesis
    5.1.3 Concept Conclusions
  5.2 Discussion
    5.2.1 Interaction Prototype
    5.2.2 Concept Development
    5.2.3 User Evaluation
  5.3 Future Work
    5.3.1 Error Prediction for Sensors
    5.3.2 Evaluation of the Hypothesis
    5.3.3 Other Concepts
    5.3.4 Radar2Go and Touch2Go Combined
    5.3.5 Extend Time2Go
    5.3.6 Interaction Paradigms in an Outdoor Environment
    5.3.7 Pointing as a Complementary Interaction Technique

Bibliography

Appendices

A Business Models

Chapter 1

Introduction

This chapter starts by introducing the reader to the problem and its background and limitations. Thereafter a résumé of the whole report can be found in the Report Summary section. This section can be read to gain an overview of the work performed in this degree project and its most important results and conclusions, without reading the whole report. The Chapter Overview section outlines the rest of the chapters in the report.

1.1 Background

The development of the Internet has introduced a new and digital world, parallel to our own physical one. People publish personal homepages, create Facebook profiles, enter information about famous places in Wikipedia, and almost every company and product has its own website. Since much of the content on the Internet has a connection to real world objects, researchers started to investigate how to bridge these two worlds, to let users access digital information about physical objects in their vicinity by interacting directly with them, instead of having to enter text into a web browser. Once limited to purpose-built research prototypes, this sort of bridging has been made possible on mobile phones by recent hardware developments, making it possible to introduce it to consumers.

Ericsson Research is, within its service layer research area, doing research on technologies, solutions and enablers for the next generation of end-user services. One example of research within this area is location based map services, which are services aimed at helping users find information about specific geographic locations. This set of services was born when it became possible to geographically locate mobile phones through the use of cell tower triangulation, and let applications provide users with information and services relevant to their geographical position.


Today, the latest generations of mobile phones include GPS-receivers, magnetometers and accelerometers. These additional sensors enable applications to recover both the position of the user and the direction in which he or she is holding the phone, making it possible to create so-called spatially aware mobile services. This degree project sprang from the introduction of these new sensors, as Ericsson Research wanted to know what could be achieved by combining them and what possible new applications such a combination could enable.

Parallel to the work described in this report, another degree project on the same subject was performed by Josefin Löfström. The two projects were performed in conjunction at Ericsson Research. To distinguish the two degree projects, the one described by this report was more focused on technical aspects, while the project performed by Löfström was more focused on human-computer interaction aspects. Löfström's work can be found in [13]. Both projects were supervised by Richard Carlsson and Hjalmar Olsson at Ericsson Research.

1.2 Problem Definition

The mobile phone is one of the most ubiquitous devices available today, with almost every one of us carrying one around in our pocket. With the integration of new sensors, the mobile phone also becomes increasingly aware of its surroundings, making it a superb platform for bridging the physical and digital worlds. While many of these sensors have been available for a long time to both users and researchers, it is only recently that they have become an integrated part of mobile phones and thus started to become widespread. The introduction of these new sensors has opened up new areas of possible applications and ways of interaction. The problem that companies like Ericsson now face is how to utilize them in order to bring new experiences to their consumers. This includes looking at

• what conclusions, design guidelines and results researchers have reached
• how technical problems can be solved
• what the limitations are
• what applications benefit from the new ways of interaction
• what the users' preferences are.


1.3 Goals

The goal of this master's degree project is to investigate how digital and physical reality can be bridged by using a mobile phone equipped with different sensors. In particular, the combination of a GPS-receiver, magnetometer and accelerometer will be focused upon, as these are sensors found in the latest generation of mobile phones and can in conjunction provide spatially aware mobile services. Ericsson Research is not only looking to gain insight into how this bridging can be done, but also what opportunities these new sensors present in terms of possible new services. To be specific, this degree project will try to answer the following questions:

• What research has been done in the field of mobile spatial interaction?
• How can this interaction be implemented on an off-the-shelf mobile phone?
• What applications benefit from this type of interaction?
• Is it useful to interact in this manner, or do users prefer more traditional ways of interacting?

1.4 Limitations

In order to prevent the project from becoming too broad to be viable, restrictions were needed. Focusing on mobile phones restricts it in some ways, but the number of possible scenarios where a mobile phone could be used to bridge the physical and digital worlds is still large. Pointing your phone in the direction of your music player could for example present you with a list of your digital music archive, and after selecting a track you could point towards the speakers in the kitchen to enjoy some music while cooking. Or a factory worker could look at a machine through the camera feed of his phone to see operational information such as current temperature, forces, etc. in connection to certain parts of the machine. These are two very different scenarios that have different requirements in terms of sensors and user requirements, among other things.

In order to narrow down the number of possible scenarios, Ericsson added more restrictions to the project. The end result was to contain some sort of map, and we were to specifically investigate the combination of GPS, magnetometer and accelerometer. The reason for the first restriction was that Ericsson Research was developing a map API for mobile phones and wanted an application that demonstrates their API in use. The second restriction was due to the fact that Ericsson had specific interests in researching this pointing type of interaction using the above-mentioned sensors and wanted to investigate if they could be used to enable new end-user services.

1.5 Report Summary

This section describes the main work performed during the degree project and provides a summary of the most important results and conclusions. It can be read on its own to gain insight into the key parts of the project without reading the whole report, or as an introductory overview before reading the rest of the report. For more details, discussions and motivations, please read the whole report.

Work Performed

In this degree project, we have taken an interaction concept from idea through prototypes to a working mobile application. An interaction prototype was created in order to test how well interaction by pointing with the mobile phone in hand worked. In order to determine what the user was pointing at, three different techniques for calculating intersections were created and tested in the prototype. This prototype formed the basis for later applications and concepts. A brainstorming session resulted in a hypothesis of what situations would benefit from interaction by pointing. Through focus groups, more information was collected on when potential users would find themselves in situations such as the one described in the hypothesis. Four concepts for a final application were developed from this information, and one of the concepts was chosen to be developed further. From this final concept, a fully functional application was built, which enables users to point at a bus stop to see timetables. The application works with any mobile phone that runs the Android mobile phone operating system and has a built-in GPS-receiver, magnetometer and accelerometer. Timetables and bus stop data were provided by Upplands Länstrafik, UL, and the application downloaded this information interactively when needed. This currently limits the use of the application to areas where there is public transportation operated by UL. Finally, user evaluations were performed using the final application and two additional applications. The two additional applications were created for comparison and featured different ways of interacting. Users described the final application as fun, quick and intuitive to use when in sight of the bus stop they wanted to interact with, but seemed to prefer the other two applications when this was not the case.


Result Summary

Our work mainly resulted in three things: three different ways of calculating intersections; a hypothesis of scenarios where interaction by pointing with a mobile device in hand to select an object is useful; and a fully functional mobile phone application. The three ways of calculating intersections are simple, yet seemed to perform sufficiently well if the requirements on precision were not too high. The user evaluations conducted did not contradict the hypothesis, but more testing is needed before more certain conclusions can be drawn. Participants of the user evaluation also liked the developed mobile phone application when used in the intended scenario, but since this scenario is narrow, many participants requested additional functionality to support more use cases.

Conclusion Summary

The conclusion which can be drawn from the work performed is that it is possible to bridge the physical and virtual worlds using an off-the-shelf mobile phone with a GPS-receiver, magnetometer and accelerometer. These sensors are not very accurate, which is something that developers need to account for when designing applications. Applications that seem to benefit from the pointing type of interaction explored during this work are applications used in scenarios where users find themselves in need of information they associate with a certain object and at the same time find themselves within sight of this object. Note, however, that more research is needed before this can be concluded with certainty. For scenarios other than the one described, users seemed to prefer other ways of interacting.

1.6 Chapter Overview

This section describes the disposition of the rest of this report.

2. Research Overview

This chapter provides an overview of relevant research, in order to help the reader understand and judge the work performed during the degree project. The reader will be introduced to interaction paradigms, interesting sensors and related work. This will form the theoretical foundation of the rest of the work.


3. Interaction Prototype

This chapter describes the prototype that was created in order to test the technical aspects of interacting by pointing. First some design decisions and requirements are presented, followed by a description of four methods to calculate what the user was pointing at. At the end of the chapter, the results of the prototype can be found.

4. Application Prototype

After the interaction prototype was created, the work continued with the creation of an application prototype. Chapter 4 provides the details of the process to create this application prototype. The concept creation phase is described, followed by a walkthrough of the resulting application. Some of the design decisions are discussed, giving insight into why the application ended up looking as it does. Finally, the user evaluations performed are presented along with their results.

5. Conclusions, Discussion and Future Work

In the final chapter, conclusions are drawn about the work performed and the results obtained. Some aspects of the project are discussed in greater depth. Possible topics for other researchers to look into are presented in the Future Work section of the chapter.


Chapter 2

Research Overview

This chapter introduces the reader to research related to the work performed during the degree project. Common interaction paradigms are introduced and an overview of relevant sensors used in mobile phones today is presented. Finally, related work performed by other researchers is reviewed.

2.1 Interaction Paradigms

In [33], Välkkynen et al. present three different paradigms for physical mobile interaction: touching, pointing and scanning. This section presents these paradigms along with the findings about them by Rukzio et al. in [24].

2.1.1 Touching

When using the touching paradigm, users physically touch their mobile device to objects they wish to interact with. Some sensor is used to register when the user is touching the device to, or holding it close to, an object that supports interaction. According to Rukzio et al. [24] it is seen as "an error resistant, very secure, very quick, intuitive and non-ambiguous selection process which can require physical effort", but it requires the user to be close enough to touch the object he or she is interested in interacting with. If this is not the case, Rukzio et al. state that the benefits of using touch interaction need to be large enough to motivate the user to walk over to the object of interest.


2.1.2 Pointing

Pointing is a natural human gesture, often used in everyday life to indicate things and point out what we mean. The idea of the pointing paradigm is to enable users to do the same with their mobile devices. When the user points the device towards the object that he or she wants to interact with, the device determines what object the user meant and executes a certain task or displays the intended information. Rukzio et al. in [24] describe the pointing paradigm as "an intuitive and quick technique" which "makes most sense because it combines intuitive interaction with less physical effort" but "requires some cognitive effort to point at the smart device and needs line of sight".

2.1.3 Scanning

When using the scanning paradigm, the user is presented with a list of all objects in the vicinity that he or she can interact with. The user's mobile device scans the surroundings for objects, and the user interacts with an object by selecting it in a list on the device itself. Rukzio et al. state that "scanning is seen as a very technical interaction technique which is more complex to use because of its indirectness" [24]. The conclusions of their user studies were also that users try to avoid scanning if possible and use the other two paradigms if there is a line of sight to the object of interest.

2.2 Sensors

This section provides a technical overview of different sensors commonly used in research to create a link between the digital and physical worlds using mobile phones. The idea is to give the reader insight into what sort of information they can provide, how they are being used in research applications and what their strengths and weaknesses are.

2.2.1 Accelerometer

Perhaps the most popular use of accelerometers is in gesture recognition, where the accelerometer measures the relative motion of a device. An example of this sort of use is Nintendo's Wii gaming platform, where the controllers have integrated accelerometers which, in combination with other techniques, are used to recognize users' movements. Another use of the accelerometer is to find out the orientation of a device. The accelerometer will, in addition to sensing relative movements, also sense the gravity field of the earth and can thus provide a cue to the current orientation of the device relative to the earth's surface. Cameras and mobile phones also increasingly contain accelerometers, where they are used to determine whether a user is holding the device in landscape, portrait or horizontal mode. The device can then adapt the content of the screen to the way the user is holding it. The ability to measure the gravity field of the earth can also be used to improve the bearing reported by a magnetometer [5]. Another use is to sense the tilt of the device. In [23], for example, users could tilt a mobile device to indicate a point of interest; the amount of tilt corresponded to how far away from the user the point was.

One of the problems with using an accelerometer is that it measures relative motion. This might lead to drift problems if one performs consecutive measurements to keep track of movement. Also, if the user for example is travelling in a car or is walking while trying to perform some gesture that an application is normally able to understand, the added relative motion of the car or the walking might make the application unable to recognize the gesture.
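As an illustration of the orientation use described above, the following sketch derives two tilt angles from a single accelerometer sample, assuming the device is held still enough that the measured vector is dominated by gravity. The class name, method names and axis convention are illustrative assumptions, not taken from any particular phone API.

```java
/** Minimal sketch: estimating device tilt from a single accelerometer sample.
 *  Axes follow a common phone convention: x to the right of the screen,
 *  y toward the top edge, z out of the screen. The device is assumed to be
 *  held still, so the measured vector is dominated by gravity. */
public final class TiltEstimator {

    /** Elevation of the device's y-axis (its "pointing" axis) above the
     *  horizontal plane, in degrees: 0 when lying flat, 90 when held straight up. */
    public static double elevationDegrees(double ax, double ay, double az) {
        double norm = Math.sqrt(ax * ax + ay * ay + az * az);
        return Math.toDegrees(Math.asin(ay / norm));
    }

    /** Roll around the y-axis, in degrees: 0 when the screen faces up. */
    public static double rollDegrees(double ax, double ay, double az) {
        return Math.toDegrees(Math.atan2(ax, az));
    }

    public static void main(String[] args) {
        // Device lying flat on a table, screen up (readings in m/s^2).
        System.out.println(elevationDegrees(0.0, 0.0, 9.81)); // ~0
        // Device held upright, top edge pointing at the sky.
        System.out.println(elevationDegrees(0.0, 9.81, 0.0)); // ~90
    }
}
```

An application like the one in [23] could, for example, map such an elevation angle to a distance from the user; that mapping itself is not shown here.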

2.2.2 Computer Vision

Today most mobile phones include digital cameras, and their quality is ever increasing. As the processing power of mobile phones increases, new uses of the integrated camera emerge, and nowadays the camera is no longer used only for taking pictures.

Barcodes and Fiducial Markers

This technique is used in augmented reality and sensor-fusion applications to recognize objects by putting some sort of identifying marker on them. Image recognition algorithms are used to recognize the markers and their location and orientation relative to the camera [21]. Applications can then react to this information, for example by displaying a 3D-model hovering over the marker or by displaying informative text about the object that hosts the marker. Several different types of markers and codes exist, with different appearances and attributes. Figure 2.1 shows examples of such codes. For a more thorough review of different 2D barcodes and fiducial markers, see for example Kato and Tan's work in [12].

Figure 2.1. Example of different 2D barcodes (original barcode pictures from Kato and Tan 2007, [12]).

The main advantages of these markers are that they are relatively easy to recognize using algorithms that do not require a lot of processing power, they are cheap to produce, and they can be put on almost any object that is big enough to house the marker. Another advantage is that the user knows which objects he or she can interact with, as they are marked. This could however also count as a drawback, since sticking a paper marker to an object might spoil some of its aesthetics. Also, the user needs to make sure that the whole marker is visible to the camera while seeing to it that the resolution is sufficient [10]. This in general means that the user needs to be fairly close to the object he or she wants to interact with. Alternatively, the marker could be made larger, with the drawback of perhaps further decreasing the aesthetics of the marked object.

Natural Feature Tracking

Recent development of phone hardware has made mobile phones powerful enough to run complex computer vision and image search algorithms [6]. This enables the mobile phone to recognize objects such as houses and movie posters without the use of special markers and without the need to send the whole picture to a server for analysis [10]. The lack of markers might sometimes be a drawback. In [6], the researchers developed a tourist application where users could take a picture of a building to receive more information about it. In a field study the researchers noted that participants tended to rotate on the spot while photographing every building around them, just to find out which buildings contained additional information.

Using a version of the SURF computer vision algorithm, highly optimized for mobile phones, [32] succeeded in recognizing objects almost in real time on a mobile phone. To make their system more scalable, the authors included the use of a GPS-receiver integrated in the mobile phone. Using the position reported by the GPS-receiver, the phone only needs to match the current camera image feed against images taken of objects close to the user's geographical position. Another example of an application working in real time using the combination of GPS and camera is the Point & Find application (http://pointandfind.nokia.com/) made by Nokia. This application allows you to point the camera at a movie poster, a restaurant, etc. and receive additional information about it [15]. One advantage here over other mobile spatial interaction techniques is that the uncertainty of the GPS does not affect the performance of the application much, as the GPS-position is only used to narrow down the set of possible matches.

Computer vision can also be used to simulate other types of sensors. Wang et al. in [36] and Adams et al. in [1] used the camera of a mobile phone together with a computer vision algorithm to achieve functionality similar to that of an accelerometer. Another possibility could perhaps be to match a photo from a camera phone against a database of position-tagged photos to simulate GPS functionality. If the image from the phone matches some image in the database, the position of the camera could be calculated.
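The scalability idea mentioned above, where the GPS-position is used only to narrow down the candidate set before the expensive visual matching, can be sketched as follows. The haversine distance, the ReferenceImage type and the search radius are assumptions made for illustration; they are not taken from the cited systems.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: keep only reference images tagged near the current GPS fix,
 *  so that the (expensive) visual matching runs on a small candidate set. */
public final class CandidateFilter {

    /** A database entry: an image tagged with the position it was taken at. */
    public record ReferenceImage(String id, double lat, double lon) {}

    private static final double EARTH_RADIUS_M = 6_371_000.0;

    /** Great-circle (haversine) distance between two WGS84 coordinates, in meters. */
    static double distanceMeters(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                 + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                 * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
    }

    /** Returns the images tagged within radiusMeters of the reported position. */
    static List<ReferenceImage> nearby(List<ReferenceImage> all,
                                       double lat, double lon, double radiusMeters) {
        List<ReferenceImage> result = new ArrayList<>();
        for (ReferenceImage img : all) {
            if (distanceMeters(lat, lon, img.lat(), img.lon()) <= radiusMeters) {
                result.add(img);
            }
        }
        return result;
    }
}
```

A radius of a few hundred meters is typically enough to cover GPS uncertainty in this kind of use, since a miss only means that slightly more candidates are matched visually.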

2.2.3 GPS

GPS, or Global Positioning System, is a system commonly used to estimate the geographical position of a device equipped with a GPS-receiver. This sensor is used to provide an application with awareness of the geographical location where it is being used. Increasingly popular are so-called location based services, LBS, which provide the user with data or services relevant to the current location. Possible applications could be to provide the user with information about sales in the immediate surroundings, or to let the user view Twitter status updates her friends made from the surroundings. The main advantage of GPS is that it in general provides higher accuracy than other positioning methods available to mobile phones. Cell tower positioning can provide a general estimation of the user's position, but the accuracy depends on the number of cell towers that the mobile phone is connected to and the distance between these cell towers. Other positioning methods exist, for example through the use of wifi access points [25], but these methods are not generally available and are not commonly used. Even though the accuracy of the GPS position is usually better than the one obtained through for example cell tower triangulation, it depends a lot on the surroundings. In [27], the authors test the accuracy of the GPS in a few different types of urban outdoor environments. According to these tests, the position estimation error varied from under 10 meters in low-density urban environments (areas with a large percentage of clear sky, such as city areas with low buildings of 2-3 floors, or with higher buildings but broad streets in between them) to at least 30 to 40 meters in urban environments (areas with a low percentage of clear sky, such as city areas with higher buildings of up to 6 floors and narrow streets or alleyways). Such errors can prove problematic to applications that need good accuracy, as they are hard to predict and to measure in real time. Also, the HDOP-value commonly reported by GPS-receivers, which is supposed to indicate the quality of the GPS signal, proved not to be a good measure of the error in the estimated position [27].


Another big drawback of GPS is that it does not work indoors. Research effort is being put into developing positioning systems for use indoors as well. Nokia, for example, is trialling an indoor positioning system [16] using wifi access points [14] to estimate users' locations in a shopping centre in Helsinki.

2.2.4 IR-light

IR-light, or infrared light, is electromagnetic radiation with a wavelength between approximately 750 nm and 100 µm and is invisible to the human eye. Perhaps the most well-known use of IR is the remote control, where an IR LED is used to send instructions to some device. IR-light has also been used as a way of sending information to and from computers and mobile phones. One of the problems with IR-light is that a clear line of sight is needed between the two devices that are to be connected. Perhaps it is due to this limitation that techniques like wifi and Bluetooth seem more popular today when it comes to communication between devices. Nevertheless, IR-light has been used in some interesting sensor-fusion applications. [31], [2] and [34] all used IR-light to implement applications where users could point at an object with an IR-equipped mobile phone or handheld computer to establish a connection. The IR-light hits an IR-sensitive marker of some sort and a connection is established through IR, wifi or Bluetooth. Apart from needing a clear line of sight, another downside of using IR-light to target objects is the need to attach an IR-beacon to all objects that should be possible to target. This beacon might affect the looks of a device in the same manner as the fiducial markers mentioned in the section about computer vision. In addition, the beacons also need to be fed with power, which might not always be convenient.

2.2.5 Magnetometer

Magnetometers measure the magnetic field of the earth and can be used to calculate in which direction relative to the North Pole a device is pointing. In general, just knowing in what direction a user is looking, without knowing the location, might not be that useful. But coupled with a GPS-receiver, the magnetometer can enable devices to display information relevant to where the user is looking, or enable users to point out an object in front of them to interact with it in some manner. Early visions of applications that use such a combination were provided by Egenhofer in [7]. For example, he envisioned a Geo-Wand, which could be used to receive more information about objects in the surrounding environment simply by pointing. Another of Egenhofer's visions was Smart Horizons: applications that would allow users to see enhanced versions of their current horizon, letting them for example see additional information such as approaching weather fronts, or enabling sailors to find a landing place with a road close by.

As mentioned in section 2.2.1, the addition of an accelerometer can enhance the performance of the magnetometer [5]. In [29], the authors test the performance of a magnetometer-accelerometer combination. When the sensor is stationary and undisturbed (i.e. the device was placed on a stationary item), a standard deviation of 0.67° was observed, and when users were asked to, while stationary, point at targets at different distances, the standard deviation was a little more than 2°. The explanation for why this last value is higher is probably that users are not able to hold the device perfectly still, thus introducing errors in the accelerometer's estimation of the orientation of the device relative to the earth. While 2° might not seem too grave, the deviation quickly becomes much larger if the user is moving, reaching almost 10° when users walk slowly and more than 27° when users walk fast. The authors of [29] propose that it might be possible to reduce the deviation when walking by using appropriate filtering of the accelerometer data to compensate for the disturbance introduced by the steps.

There are not many alternatives to magnetometers available to mobile phones as of today. In fact, magnetometers seem to have only just begun appearing as integrated parts of mobile phones, opening up new possibilities of interaction. Today, the only alternative available for finding out where a user is looking is the use of computer vision algorithms, as mentioned in the section about computer vision. These algorithms could provide additional information, such as exactly what object the user is looking at, but in turn they are harder to implement and require more computational power.
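The following sketch illustrates one way the magnetometer-accelerometer combination can yield a tilt-compensated bearing: the magnetic and gravity vectors are combined with cross products to recover the east and north directions expressed in the device frame. This is a generic formulation shown for illustration, not the method used in [29] or [5], and it returns a heading relative to magnetic north (declination is not handled).

```java
/** Sketch: tilt-compensated compass heading from one accelerometer sample
 *  (gravity, device frame) and one magnetometer sample (device frame).
 *  Returns the heading of the device's y-axis, clockwise from magnetic north. */
public final class TiltCompensatedCompass {

    public static double headingDegrees(double[] gravity, double[] magnetic) {
        // East direction in device coordinates: E = magnetic x gravity.
        double[] east = cross(magnetic, gravity);
        // North direction in device coordinates: N = gravity x East.
        double[] north = cross(gravity, east);

        // The device's y-axis is (0,1,0), so its east and north components
        // are simply the y-components of the two vectors.
        double heading = Math.toDegrees(Math.atan2(east[1], north[1]));
        return (heading + 360.0) % 360.0;
    }

    private static double[] cross(double[] a, double[] b) {
        return new double[] {
            a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]
        };
    }

    public static void main(String[] args) {
        // Device lying flat, top edge pointing toward magnetic north
        // (northern hemisphere): gravity along +z, field along +y with a -z dip.
        double[] g = {0, 0, 9.81};      // m/s^2
        double[] m = {0, 30, -40};      // microtesla, illustrative values
        System.out.println(headingDegrees(g, m)); // ~0 degrees
    }
}
```

Note that the result degrades in exactly the way reported above: if the device is moving, the accelerometer no longer measures pure gravity, so the recovered east/north directions, and thus the bearing, become noisy.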

2.2.6 RFID

Radio-Frequency Identification, RFID, is another tag-based technique to identify objects. An RFID-reader is used to read the contents of RFID-tags, which can be passive or active [2]. Active tags incorporate some sort of power source and constantly emit the content of the tag, which can then be read by the RFID-reader. Passive tags on the other hand require no power source; instead, power is transmitted to the tag from the reader using induction. When enough power has been transmitted, the tag in turn transmits its content. The RFID-reader can typically read the contents of the tag from several tens of centimeters away down to a distance of a few millimeters [2]. Even though techniques exist to extend the operating range of RFID [4], when it comes to mobile phone interaction RFID is usually used in situations where the user is close enough to touch what he or she wants to interact with. One advantage of RFID-tags is that they do not need to be visible and can thus be embedded into different objects. In [37] and [19], for example, the authors demonstrated prototypes which could be used to read RFID-tags hidden in the binding of a book or in a business card. Linked to the tags was additional information, such as the latest version of a technical manual or a webpage.

2.3 Related Work

Section 2.2 provided a per-sensor overview of technical aspects and previous work related to each sensor. This section will instead give an insight into what is being done in the area of mobile spatial interaction, where the GPS-position is coupled with some technique to find out the orientation of the user. The combined information is then used to provide the user with direction- and position-specific information. The most common combination of sensors is GPS, accelerometer and magnetometer, although there are solutions that utilize a combination of GPS, digital camera and computer vision to achieve similar effects.

2.3.1 Spatially Aware Handhelds

Spatially aware handhelds is a term used for mobile devices that are able to sense their position and orientation relative to the environment in which the device is being used; the device is "aware" of its surroundings. The area of spatially aware handhelds was pioneered by Fitzmaurice [8] and Rekimoto et al. [22] in the early 1990s. Fitzmaurice developed a proof-of-concept prototype called Chameleon. This prototype used a small handheld display which was attached to a camera filming a computer screen. The handheld display was tracked using a six degree of freedom tracking device and the content of the screen changed according to the orientation and position of the screen itself. Fitzmaurice also envisioned various applications, such as a computer-augmented paper map. The spatially aware handheld would let users view more detailed digital maps of an area by directing the handheld to the area of interest on the paper map. Rekimoto et al. developed the NaviCam prototype, which was based on a handheld screen with a camera attached to its back. The screen displayed the feed of the camera, annotated with additional information. Computer vision algorithms continuously checked the camera feed for special markers which were attached to real world objects. The markers were associated with information in a database and when a marker was recognized, the related information was displayed on the image in connection to the object. By this recognition, the handheld device became aware of its surroundings. Rekimoto et al. demonstrated applications that displayed additional information in connection to famous paintings, or a video message recorded by the occupier of an office, when the camera was pointed at the painting or office door.


Today the area of spatially aware handhelds has grown and researchers do not only look at how additional information can be displayed about real world objects. New research includes, for example, how spatially aware handhelds can be used to manipulate and interact with information spaces [38] or how they can be used in collaborative environments [17]. Researchers have demonstrated a variety of applications, ranging from handheld devices that are aware of where above an interactive table they are being held, to handheld devices that are aware of where they are relative to the surface of the earth. The next two sections will describe in more detail two areas of spatially aware handhelds that are related to the work performed during this degree project.

2.3.2 Mobile Augmented Reality

One of the areas of research that sprang from Fitzmaurice's and Rekimoto et al.'s work is mobile Augmented Reality, mobile AR. Augmented reality refers to augmenting our own reality with digital information or artefacts. By Azuma's definition of augmented reality in [3], an AR-system needs to combine real and virtual interactively in real time, with registration in three dimensions. Registration in three dimensions means that the information or artefact needs to be connected to some point in the real world and that this connection is three-dimensional. In other words, the information or artefact should not only move left/right or up/down when the AR-system is moved correspondingly, but should also increase or decrease in size when the system is moved back and forth (it could also be that the world, and thus the registration point, is moving relative to the AR-system). By mobile AR, one usually means an AR-system that is portable and can be moved around. Examples range from portable computers where the user sees the world through a head-mounted display (see for example Sutherland's pioneering head-mounted AR-system from 1968 [30] or Piekarski and Thomas' ARQuake [18]) to systems where a mobile phone or some other handheld device is used and the user sees the world through the screen of the device. For the purpose of this project, the latter is the most interesting category of mobile AR.

While Fitzmaurice and Rekimoto et al. pioneered the area of mobile AR, their prototypes were not usable outside of the laboratory environment. Since then, much research effort has been spent on realising their visions outside of these controlled environments. As mentioned in section 2.2.2, it has become possible to use natural feature tracking to recognize objects and to calculate from what relative angle they are being viewed. Before this was possible, markers and barcodes were used for the same purpose. With the introduction of cell tower triangulation and the GPS, it became possible to position a mobile device anywhere on the surface of the earth, as long as there were sufficient cell towers or a strong enough GPS-signal. Combining the position with awareness of direction through the use of digital compasses has become yet another way of estimating what the user is looking at.

Using the techniques mentioned above, mobile AR has gradually become more and more of a reality. In 2003, Wagner et al. presented a system that guides users through an unknown building by using a handheld device that recognizes markers and overlays guiding arrows and a wire frame of the building on the camera feed [35]. Kähäri and Murphy in 2006 used a GPS-receiver and digital compass in combination to annotate the image from the camera of a mobile phone with information about buildings and persons [11]. In 2008, Takacs et al. successfully implemented natural feature tracking on a mobile phone and used it to enable mobile AR on a consumer mobile phone [32]. These advances in research have led to the introduction of commercial systems as well. Layar (http://layar.eu/) and Wikitude (http://www.wikitude.org/) are two examples of commercial applications that have received a lot of attention and publicity. They take publicly available and geographically tagged information from sources such as Wikipedia, Twitter and Qype and use it to allow users to view Wikipedia articles about objects around them, view Twitter updates from people in their surroundings or find reviews of restaurants near them. The information is overlaid on the video feed in real time, letting users rotate to explore their surroundings.

2.3.3 Geo-Wands

Geo-wands are a group of applications first envisioned by Egenhofer in [7]. By his definition, a Geo-Wand is "an intelligent geographic pointer, which allows users to identify remote geographic objects by pointing to them". As an example, he mentions a hiker who points with his Geo-Wand at a mountain top to see its name and to receive information about its altitude and the distance to it. Building on Egenhofer's definition, it is easy to envision a number of applications. Once a user has identified an object with the Geo-Wand, he or she could for example receive or leave information about it, mark it so that a friend could find it, or send a command to it.

Since Egenhofer presented his vision in 1999, parts of it have been realized in different research and commercial projects. In [28], Simon et al. presented their version of a Geo-Wand, fittingly named GeoWand. Through the use of a standard, off-the-shelf mobile phone enhanced with an external GPS-receiver, accelerometer and magnetometer, they demonstrated a prototype Geo-Wand application which let users point at restaurants to receive more information about them. In [23], another implementation of a Geo-Wand was demonstrated. Users were equipped with a mobile device connected to a GPS-receiver and a sensor box containing accelerometers and magnetometers. This device could then be used to mark points the user found interesting when out walking in the city. The points could later be retrieved and viewed on a map using a computer. Common to these two applications is that they both use computers to process the data captured by the sensors. In the first case, the sensor data registered by the mobile phone was sent to a server that computes which points of interest can be viewed, and the result is then sent back to the phone [26]. In the second case, all points of interest that the user marked were saved on the mobile device and retrieved later for processing and visualization on a computer. An additional example of a similar Geo-Wand can be found in [29].

Nokia has with its Point & Find application [15] demonstrated an application that in part works as a Geo-Wand. Instead of using magnetometers and accelerometers, their application is based on computer vision algorithms that analyze, in real time, the video taken by the camera of the mobile phone. The GPS-position of the user is used to retrieve images of points of interest in the user's surroundings, and this set of images is then matched against the video. If a point of interest is recognized in the video, the user can click on it to receive additional information. Google Goggles is a similar application that features both the use of computer vision and the combination of GPS-receiver and digital compass [9].


Chapter 3

Interaction Prototype

This chapter contains a description of the first, basic prototype that was created. The prototype was created as a means of exploring different techniques of doing intersection calculations and to get a feel for the limitations that the sensors impose. The goal was to create a fully functional implementation of all the technical aspects of the intersection calculations, which could then be used in more advanced applications. This way, we would not have to worry about this part when designing our final application prototype, and could focus entirely on finding areas where this type of interaction felt natural. This chapter also describes the technique of determining what the user is pointing at, which is patent pending.

3.1 Interaction Choices

As seen in section 2.1, there are three popular interaction paradigms that are commonly used in mobile spatial interaction: touching, pointing and scanning. The restrictions dictated by Ericsson said that the GPS, magnetometer and accelerometer were to be used. Since the functionality of a GPS-receiver is limited indoors, the interaction had to take place outdoors. The number of possible points of interest was therefore limited to things one normally finds interesting to interact with outdoors. In most situations outdoors you find yourself quite some distance from the object that you are interested in interacting with. This ruled out the touching paradigm, since Rukzio et al. in [24] suggest that users are more prone to use pointing instead of touching when they are out of reach of the object they wish to interact with. The scanning paradigm, on the other hand, would be plausible or maybe even preferred in an outdoor environment in some situations, but since Ericsson had restricted the project to use a pointing type of interaction, we did not investigate this paradigm further. However, scanning was later used as a means of comparison; see section 4.4 for more details. Due to these circumstances, the pointing paradigm was chosen as the basis for the interaction.

3.2 Prototype Requirements

A few requirements were established for the prototype. The first was that the calculations should be possible to perform in real time. This would allow for applications where the view is updated in real time, which in turn meant that the phone would have to do the calculations itself, since the bandwidth of a mobile phone is limited. This would also be an advantage, as less server infrastructure and hardware would be needed. As seen in section 2.2, the sensors report far from perfect values, so another requirement was that the prototype should let us examine the effects of the sensor uncertainty. The goal was to come up with some way to address these errors or to compensate for their effect. The last requirement was that the final code for the intersection calculations should be as standalone as possible, since Ericsson wanted to be able to include the intersection calculation functionality in an API they were working on, and since we wanted to be able to reuse the code in our later development.

3.3 Intersection Calculations

When a user points at some point of interest in the real world, calculations are needed in order to determine which point the user meant. The data from the sensors needs to be combined and matched against a collection of points of interest, so that the information or action the user requested can be displayed or performed. In general, researchers seldom seem to describe their methods for calculating intersections when using a combination of GPS, magnetometer and accelerometer. In this section, we will therefore first describe three methods we have developed and then, for comparison, present a fourth method described in another paper. The three methods developed by us during this project have not been described elsewhere, and a patent application for them has been filed.

3.3.1 Ray Tracing

The first method we developed is based on a computer graphics method called ray tracing. The principle of ray tracing is to create a ray that starts at some point and travels in some direction. This ray is traced through the world, and the objects it intersects on its way are calculated. The GPS-position provides a starting point, and the accelerometer and magnetometer in conjunction provide a direction. Together they form a ray, which can then be traced through some model of the world in which points of interest have been marked.

In the computer graphics version of ray tracing, it is most common to use triangles to represent the model or scene that is to be rendered, but other geometrical figures are possible as well. As long as it is possible to calculate the intersection between the figure and a ray, any geometrical figure can be used. The dimensionality of the algorithm can also be adapted: the original algorithm operates in three dimensions but can easily be converted to two dimensions as well. For example, to find out whether a ray intersects a circle in two dimensions, start by finding the point on the ray that is closest to the centre of the circle. This point is defined by the fact that a line passing through it and the centre of the circle is perpendicular to the ray itself. The point can thus be found by solving the equation system formed by the parametric forms of the two lines and the fact that the scalar product between two perpendicular lines is zero. If the distance between the ray and the circle centre is less than the radius, the ray intersects the circle.

This method is straightforward, easy to implement and, depending on the way points of interest are represented, does not require a lot of computational power. The downside of this method, however, is that it is very sensitive to errors in the sensor data. If the GPS-receiver reports a position that is some ten meters off, or if the direction is a few degrees off, the target representation needs to be relatively large for the ray to hit it. One solution to this problem is to estimate the errors and then simply send more rays to cover all possible combinations of position and pointing direction. This will of course require more computational power.
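A minimal sketch of the two-dimensional ray-circle test described above is given below: the point on the ray closest to the circle centre is found and its distance to the centre is compared with the radius. Coordinates are assumed to be in a local planar, metric frame rather than raw latitude/longitude, and all names are illustrative.

```java
/** Sketch: does a 2D ray (origin + direction) hit a circular point of interest? */
public final class RayCircle {

    /** ox,oy   ray origin (e.g. the GPS position, in local metric coordinates)
     *  dx,dy   ray direction (e.g. derived from the compass bearing)
     *  cx,cy,r circle centre and radius of the point of interest */
    public static boolean intersects(double ox, double oy, double dx, double dy,
                                     double cx, double cy, double r) {
        double len = Math.hypot(dx, dy);
        if (len == 0) return false;
        // Parameter t of the point on the ray closest to the centre: the vector
        // from that point to the centre is perpendicular to the ray direction.
        double t = ((cx - ox) * dx + (cy - oy) * dy) / (len * len);
        if (t < 0) t = 0; // do not consider points behind the ray origin
        double px = ox + t * dx;
        double py = oy + t * dy;
        return Math.hypot(cx - px, cy - py) <= r;
    }

    public static void main(String[] args) {
        // Pointing roughly north-east at a stop 50 m away with a 10 m radius.
        System.out.println(intersects(0, 0, 1, 1, 30, 40, 10)); // true
    }
}
```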

3.3.2 Barycentric Coordinates

Another method we developed to calculate what point of interest a user might have pointed at is to use triangles. The idea is to compensate for the uncertainty in position and bearing by first estimating their magnitude. The estimation is then used to construct a triangle that represents all the possible directions a user could have been pointing in, from every position the user could have had. After this triangle has been calculated, the points of interest are tested against the triangle to determine whether they lie inside it or not. The test is performed using Barycentric coordinates. In the Barycentric coordinate system, the two coordinate axes coincide with two of the sides of the triangle. The normal coordinates of a point of interest are converted into Barycentric coordinates, and depending on the values of these coordinates, it is possible to determine where the point is relative to the triangle.
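The point-in-triangle test with Barycentric coordinates can be sketched as follows: the point is expressed in the basis formed by two triangle edges, and it lies inside the triangle if both coordinates are non-negative and their sum is at most one. This is the textbook formulation, shown for illustration; it is not the project's patented implementation.

```java
/** Sketch: testing whether a 2D point lies inside a triangle
 *  using Barycentric coordinates. */
public final class BarycentricTest {

    /** Triangle vertices (ax,ay), (bx,by), (cx,cy); point (px,py). */
    public static boolean contains(double ax, double ay, double bx, double by,
                                   double cx, double cy, double px, double py) {
        // Edge vectors from A, and the vector from A to the point.
        double v0x = bx - ax, v0y = by - ay;
        double v1x = cx - ax, v1y = cy - ay;
        double v2x = px - ax, v2y = py - ay;

        // Solve P - A = u * (B - A) + v * (C - A) with Cramer's rule.
        double denom = v0x * v1y - v1x * v0y;
        if (denom == 0) return false; // degenerate triangle
        double u = (v2x * v1y - v1x * v2y) / denom;
        double v = (v0x * v2y - v2x * v0y) / denom;

        return u >= 0 && v >= 0 && (u + v) <= 1;
    }

    public static void main(String[] args) {
        System.out.println(contains(0, 0, 4, 0, 0, 4, 1, 1)); // true
        System.out.println(contains(0, 0, 4, 0, 0, 4, 3, 3)); // false
    }
}
```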


Figure 3.1. The worst-case triangle. (x0, y0) is the estimated position of the user and (xp, yp) is the tangent point of a line that is pointing in one of the two extremes of the possible pointing directions.

A “worst-case triangle” is constructed from the estimated error in the sensor data (figure 3.1). The error in GPS-position defines the radius of a circle around the reported position. If the estimation of the error is correct, this circle represents all the positions where the user might actually be standing. Next step is to calculate all the possible direction the user could have been pointing in. This is a simple matter, since the accuracy of the direction sensor is usually estimated in how many degrees off to either side of the reported direction that the actual direction can be. Thus all the directions the user could have been pointing in are reported direction ± estimated error of the direction. Next, the two points where a line, pointing in either of the two extremes of the possible pointing directions, tangents the circle around the reported position is calculated. For every line there are actually two points where the line will tangent the circle, but the sought point is the one that is “behind” the user. By starting in one of the two points behind the user, and going in the direction that is pointing away from the reported direction by the estimated direction error, two lines can be created that will intersect somewhere behind the user. These two lines mark two of the sides of the worst-case triangle. The third side can be determined by defining some maximum distance away from the user that a point of interest is allowed to be. This triangle can then be used to calculate what the user was pointing at. This method provides a simple way of compensating for the uncertainties of the sensors, if these can be determined or estimated. It is also computationally efficient. On the down, this method is limited to finding points within a triangle, preventing the points of interest from taking any shape other than a point. In some scenarios this might not be a problem, but most objects which humans are used to interact with have at least some spatial distribution (as opposed to the zero-dimensional 22


Figure 3.2. The principle of the Ray-Triangle Combination method. B is the bearing, ∆B is the error in the bearing, P0 is the user’s position, P1 is the compensated position using the worst-case triangle, T is the position of some point of interest and P3 is the point on the line formed by P1 and P2 where the distance to T is the smallest. P5 is the same as P3 but for the other side of the worst case triangle. The grey circle demonstrates the uncertainty of the position.

Another downside is that this method cannot be extended to three dimensions, since Barycentric coordinates are by their very nature two-dimensional, the triangle being a two-dimensional figure.

3.3.3 Ray-Triangle Combination

The third and final method we developed can be seen as a combination of the first two. It has the strength of allowing any shape of the intersection targets, but is still able to compensate for errors in the sensor data. First, the same worst-case triangle as in the Barycentric method is calculated. Instead of using Barycentric coordinates, however, we trace rays along the first two calculated sides of the triangle. These rays hit objects in the same manner as in the ray tracing method, allowing points of interest to have any intersectable geometric form. We then calculate whether there are any points of interest in between the two rays. This can be done either by using Barycentric coordinates, by comparing angles, or by using a third way that we have developed.

In this third way, we start by choosing one of the rays. On this ray, the point P3 where the distance to the point of interest T is as small as possible is calculated. P3 can be found using the fact that the shortest distance between the ray and T will be where a line through T and P3 is perpendicular to the ray itself. When P3 has been found, we construct a new line using the line’s parametric form, L = P3 + a · (T − P3), where a is a scalar value. That is, a line that starts in P3 and has positive values of a for points that lie in the direction of T.


Figure 3.3. The three different cases for T’s position in the Ray-Triangle Combination method. The arrow shows where a = 1 is on the line formed by P3 and T.

After this line has been found, we calculate the value of a for the point where L intersects the second ray. The value of a gives us information about where T is relative to the two rays. If a is larger than one, T lies between the rays; otherwise T lies outside the triangle. A picture showing the method can be found in figure 3.2, and figure 3.3 shows the three different cases of values that a can take. This method has the advantages of both previous methods, while still having moderate complexity. As with the Barycentric method, this algorithm is bound to two dimensions. A patent application has been made for this method in combination with the “worst-case triangle” mentioned in the section about Barycentric coordinates.
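The sketch below illustrates the a-value test in Java, following the description above and figure 3.2 loosely. Since the exact patented formulation is not reproduced in the text, the variable names and the closest-point and intersection steps are only an illustrative reconstruction.

/** Illustrative a-value test: is the point of interest T between the two rays? */
public final class RayTriangleTest {

    /** 2D scalar cross product. */
    private static double cross(double ax, double ay, double bx, double by) {
        return ax * by - ay * bx;
    }

    public static boolean between(double p1x, double p1y, double d1x, double d1y,  // ray 1
                                  double p2x, double p2y, double d2x, double d2y,  // ray 2
                                  double tx, double ty) {                          // point T
        // P3: the point on ray 1 closest to T (foot of the perpendicular from T).
        double len1 = Math.hypot(d1x, d1y);
        double ux = d1x / len1, uy = d1y / len1;
        double s = (tx - p1x) * ux + (ty - p1y) * uy;
        double p3x = p1x + s * ux, p3y = p1y + s * uy;

        // The line L(a) = P3 + a * (T - P3); a = 0 at P3 and a = 1 at T.
        double lx = tx - p3x, ly = ty - p3y;
        double wx = p2x - p3x, wy = p2y - p3y;

        // Solve P3 + a*L = P2 + b*D2 for a by eliminating b with a cross product.
        double denom = cross(lx, ly, d2x, d2y);
        if (denom == 0) {
            return false;                       // L is parallel to ray 2 (or T lies on ray 1)
        }
        double a = cross(wx, wy, d2x, d2y) / denom;

        // a > 1 means L passes T before reaching ray 2, i.e. T lies between the rays.
        return a > 1;
    }
}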

3.3.4 Monte Carlo Sampling - an Alternative Method

In [29], Strachan and Murray-Smith present an alternative method to estimate what the user wants to interact with. They take a probabilistic approach, using Monte Carlo sampling to evaluate a probability distribution of possible future positions. Monte Carlo sampling is a method for estimating a distribution: instead of evaluating the distribution explicitly, a number of random samples are drawn from it as a means of approximating it. It can be shown that if enough samples are drawn, the Monte Carlo solution converges to the real solution.

In the method proposed by Strachan and Murray-Smith, a number of samples are drawn from the distribution that describes the sensor uncertainty around an initial position. Each of these samples represents a possible real location of the user, since the location reported by the GPS-receiver might not be entirely correct. Each sample is then propagated forward in the direction reported by the digital compass.


The propagated samples represent possible locations of the user in the future. By sending this “cloud” of possible positions forward, the user can walk forward virtually and obtain digital information from places he or she might physically visit in the future. The propagation of the samples is based on the direction in which the user is pointing the mobile device, but also on a precalculated noise map and a precalculated likelihood map. The noise map is an estimate of the sensor noise for every possible position a sample may visit, and the likelihood map gives an estimate of how likely it is that a sample is in a certain position at a certain time. It is, for example, not very likely for a sample to be inside a wall, since this is not a very likely future position for the user. The result is that the samples seem to flow forward along the pointing direction of the user, avoiding buildings and other positions that the user is unlikely to visit in the future.

The strength of this method is that it compensates for uncertainties in the sensor readings, albeit at the price of the increased complexity of generating the likelihood map and the noise map. This method is also bound to two dimensions and could not be used to detect a user pointing at a shop on the second floor. It is also worth noting that this method does not try to determine what the user is pointing at; instead it calculates likely positions of the user in the future.
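A very rough sketch of this sampling-and-propagation idea is given below. It assumes a Gaussian GPS uncertainty and leaves the likelihood map as an abstract lookup; it is meant only to convey the principle and is not Strachan and Murray-Smith's actual implementation (which also uses a precalculated noise map).

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Rough Monte Carlo sketch: sample possible user positions and push them forward. */
public final class MonteCarloPointing {

    /** Placeholder for the precalculated likelihood map of plausible positions. */
    public interface LikelihoodMap {
        double likelihood(double x, double y);   // near zero inside walls, etc.
    }

    public static List<double[]> propagate(double gpsX, double gpsY, double gpsSigma,
                                           double bearingRad, double stepLength, int steps,
                                           int nSamples, LikelihoodMap map, Random rnd) {
        List<double[]> samples = new ArrayList<>();
        for (int i = 0; i < nSamples; i++) {
            // Each sample is one possible true position, drawn from the GPS uncertainty.
            samples.add(new double[] {
                    gpsX + rnd.nextGaussian() * gpsSigma,
                    gpsY + rnd.nextGaussian() * gpsSigma });
        }
        for (int step = 0; step < steps; step++) {
            List<double[]> next = new ArrayList<>();
            for (double[] s : samples) {
                // Step forward along the pointing direction, with a little extra noise.
                double x = s[0] + Math.sin(bearingRad) * stepLength + rnd.nextGaussian();
                double y = s[1] + Math.cos(bearingRad) * stepLength + rnd.nextGaussian();
                // Keep the sample only if this is a plausible position for the user.
                if (rnd.nextDouble() < map.likelihood(x, y)) {
                    next.add(new double[] { x, y });
                }
            }
            samples = next;
        }
        return samples;   // the "cloud" of possible future positions
    }
}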

3.4 Implementation

Of the four methods presented in the previous section, the first three were implemented. The reason for not implementing the fourth method was that we did not have the necessary information to construct the likelihood and noise maps. The methods were implemented using the Java programming language version that is used by the Android mobile phone operating system, and were tested on the developer phone released by Google, called G1 (manufactured by HTC Corporation). This phone contained all the necessary hardware and was chosen because it was the only programmable phone available to us at the time.

All three methods were implemented in a small static class, coupled with an interface to define the points of interest. During ray tracing, the points of interest have the geometric form of a circle, defined by a point in space and a radius. To compensate for the fact that the coordinate system used by GPS devices has axes of different scale, a compensation factor was used to convert units from one of the axes to the units of the other axis. A conversion factor was also used to convert from meters to units of the chosen axis of the GPS coordinate system, to let users specify measurements in the more intuitive meter system instead of in GPS units.
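For illustration, conversion helpers of the kind described above might look as follows. The constant 111320 meters per degree of latitude is a common approximation; the exact factors used in the prototype are not stated here, so this is only a sketch.

/** Approximate unit conversions between meters and GPS degrees (illustrative only). */
public final class GeoUnits {

    private static final double METERS_PER_DEGREE_LAT = 111320.0;

    /** Meters corresponding to one degree of longitude at the given latitude. */
    public static double metersPerDegreeLon(double latitudeDeg) {
        // Longitude degrees shrink towards the poles by a factor cos(latitude).
        return METERS_PER_DEGREE_LAT * Math.cos(Math.toRadians(latitudeDeg));
    }

    /** Convert a distance in meters (e.g. a circle radius) to degrees of latitude. */
    public static double metersToDegreesLat(double meters) {
        return meters / METERS_PER_DEGREE_LAT;
    }

    /** Factor that rescales longitude degrees so both axes have the same local scale. */
    public static double lonToLatScale(double latitudeDeg) {
        return Math.cos(Math.toRadians(latitudeDeg));
    }
}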


Figure 3.4. To the left: an early stage of the development; the circle represents the error in position reported by the hardware and the red line shows where the user is pointing the phone. In the middle: using ray-triangle intersection to hit a street intersection; the green circle segment represents a user-configured width of search. To the right: notice how the GPS hardware reports a small position error (the circle around the reported position is small) even though the real position is far from the reported one.

The Android system provides classes that report the readings of the GPS-receiver, the magnetometer and the accelerometer, and these were configured to report values as frequently as possible. Along with the GPS-position, an estimate in meters of the accuracy of the GPS-position could be obtained and was used to draw a circle around the user representing possible real locations. The magnetometer and accelerometer readings were combined, using methods provided by the Android system, to give a bearing in radians relative to the magnetic north pole. A simple smoothing filter was constructed to smooth out the varying bearings reported by the sensors. The filter converts the provided bearing from polar coordinates to normal Cartesian coordinates x and y, which are averaged over the past 30 values before being converted back to a bearing again.
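A sketch of such a filter is shown below. It averages unit vectors rather than raw angles, which avoids the wrap-around problem at 0/2π; the window size of 30 follows the text, while the class and method names are illustrative.

/** Smooths noisy compass bearings by averaging them as unit vectors over a sliding window. */
public final class BearingFilter {

    private static final int WINDOW = 30;
    private final double[] xs = new double[WINDOW];
    private final double[] ys = new double[WINDOW];
    private int count = 0;

    /** Feed a new bearing in radians and get the smoothed bearing back. */
    public double smooth(double bearingRad) {
        int i = count % WINDOW;
        xs[i] = Math.cos(bearingRad);   // bearing as a point on the unit circle
        ys[i] = Math.sin(bearingRad);
        count++;

        int n = Math.min(count, WINDOW);
        double sx = 0, sy = 0;
        for (int j = 0; j < n; j++) {   // average the Cartesian components
            sx += xs[j];
            sy += ys[j];
        }
        return Math.atan2(sy / n, sx / n);   // convert the averaged vector back to an angle
    }
}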

3.5 Prototype Results

Three screenshots of the prototype at different stages of development can be seen in figure 3.4. No formal tests were conducted on the performance of the sensors or of the different methods. Informal testing, however, suggested that it was hard to hit targets using the ray tracing method, because the GPS-position almost always contained errors. On the other hand, the estimate of the error in GPS-position reported by the hardware seemed to be very optimistic in most cases, reporting errors of a few meters when the position was in reality several tens of meters off. As a result, the worst-case triangles of the Barycentric method and the ray-triangle method did not really make a difference.


We tried to come up with other ways of estimating the error in the GPS-position, but found none. To compensate for the difficulties in estimating the error in GPS-position, we introduced a user-defined angle around the reported bearing, which was used to create the triangle needed by the Barycentric and ray-triangle methods. Instead of constructing the worst-case triangle, we simply constructed a triangle with one of its corners in the reported position. The two legs meeting in this corner pointed in the bearing plus or minus the user-defined angle. The motivation for this change was that if a broad enough triangle was used, the user would at least hit the target he or she intended, at the price of possibly having to select it from a list if other targets were hit as well.

Using this new user-defined angle to construct the needed triangle, we soon discovered a design problem with the Barycentric approach. To mark points of interest in the map API, an icon provided by the Android system was used. The icon was centered at the GPS-coordinates of the point of interest, which in turn meant that, using the Barycentric method, the selection triangle had to contain the centre of the icon to register the point of interest as hit. This led to situations where it looked like you had hit the target, as the triangle intersected a part of the icon, but since the triangle did not contain the centre of the icon, no hit was reported. Using the ray-triangle method instead, we could set an appropriate radius around the centre of the icon, making all parts of the icon possible to hit. Because of this, the ray-triangle method was selected as the method to be used in the rest of our work when working with two-dimensional data; otherwise we would use the ray tracing method.

A simple test was conducted to see how well the ray tracing and ray-triangle methods performed in terms of computation time. Using a thousand randomly generated points of interest in the vicinity, both methods took around 40 milliseconds on average to test all points for intersection, without the use of any special data structure. This performance was deemed sufficient, as it would enable an application with 1000 available points of interest to run at 25 frames per second. If more performance is needed, data structures such as kd-trees, quad-trees or a binary tree could be used to decrease the number of points of interest that need to be tested for intersection.


Chapter 4

Application Prototype

After the initial prototype had been completed, work began on creating a real application that would demonstrate the pointing interaction in a realistic usage scenario. This chapter provides insight into the design process of this application, from concept development and design decisions, via the resulting application, to user evaluation.

4.1 Concept Development

The goal of the concept development was to come up with a few alternative concepts for applications that would utilize the type of interaction implemented in the interaction prototype (see chapter 3 for details), and finally to choose one of them to be implemented. This application would serve the purpose of demonstrating a scenario where it would feel natural to interact in this way. In parallel with the work on the interaction prototype, the application concept development started with different brainstorming sessions between me and my co-worker Josefin Löfström. We studied different available applications such as Layar, NRU, Nokia Point&Find and Wikitude.

4.1.1 Hypothesis

After a few brainstorming sessions and after analysing and evaluating other companies’ applications, a hypothesis began to take shape about when the interaction supported by the interaction prototype would be useful. Our hypothesis was:


“The pointing interaction of the interaction prototype is useful in situations where you are within viewing distance of an object that you associate with a certain type of information, which you frequently find yourself in need of.”

This hypothesis formed the basis for the concept development, where we sought to develop concepts which included such situations, objects and information.

4.1.2 Four Concept Candidates

To investigate when people find themselves in situations similar to the one in the hypothesis, two focus groups were held at the Royal Institute of Technology, KTH, in Stockholm. The focus groups were based on discussions around a set of slides, and the participants got to try the interaction prototype as well as Layar and Wikitude. The focus groups, together with some brainstorming, resulted in the development of four different application concepts.

Cinema & Restaurant

Many people mentioned during the focus groups that they sometimes found themselves out walking in town, looking for a restaurant or cafe, and that they sometimes used a regular computer to look for new restaurants to go to or to see if a particular restaurant had received good reviews. This resulted in a concept for a mobile application that could help people answer questions such as “what’s on the menu?” and “is this really a good place?”, without the need for a desktop computer. Later, cinemas were added to the concept, to let users easily find out which movies they could see in a nearby movie theatre.

Nightclub

Another concept was built around the world of clubbing. Many of the participants of the focus groups regularly visited the night life of Stockholm, and the need for a club guide was identified. In Stockholm there are nightclubs with different dress codes, music themes and age limits. To add to the confusion, some premises host different nightclubs on different weekdays. When out looking for a good club, it would be handy to have a mobile application that could tell you all these things, without you having to spend 20 minutes in a queue to find out.


Public Transportation

Most of the participants of the focus groups travelled by public transportation. Being frequent travellers, they often found themselves in need of consulting the timetables for buses, the subway and commuter trains. The only real alternative if they were not at a computer or at the bus stop itself was to use the web browser of their mobile phone, a slow process that often leaves a lot to be desired. A mobile application to solve this problem therefore became a concept.

Photo Exploration

The last concept we developed was based on photos. GPS-receivers have not only started to appear in mobile phones, but also in cameras. Since almost every mobile phone also includes a camera, a new possibility has opened up to tag photos with the exact location where they were taken. The ability to explore this growing pool of images on site could, for example, make it possible to travel in time to another year or season, or to view a famous building from the inside, even during hours when the building is closed.

4.1.3 Final Concept

After consulting with our supervisors at Ericsson, two of the concepts were kept as final candidates: the nightclub and the bus timetable concepts. Both the photo exploration and the cinema & restaurant concepts were deemed too similar to already available mobile applications; there are plenty of mobile applications that let you view photos taken at a certain position or that will guide you to a nearby restaurant with good reviews. The other two, however, felt unexplored, and both could present usage situations where pointing interaction would feel natural. In the end, the decision was to develop the public transportation concept. The main reason for this was that it was far easier to get hold of real data for public transportation, since Ericsson was already cooperating with Upplands Länstrafik, UL, which operates the public transportation in the city of Uppsala and its surroundings. For the nightclub concept, we would have needed to spend time pursuing a similar cooperation with some company that owned data about nightclubs in Stockholm or some other larger city in Sweden, and this was time we judged we did not have. After improving on the public transportation concept, and narrowing it down a bit, the final goal became to develop a small but fully functional application supporting the following use case:


“You have just finished for the day and are heading home. As usual you want to take the bus from the nearby bus stop. The weather outside is foul, so you do not want to miss the bus. Unfortunately, since you have not learned the timetable by heart, you do not know if you need to run to catch the bus. Therefore you take out your mobile phone and point it towards the bus stop. By pushing a button you immediately see that the bus arrives in 2 minutes, so you will need to hurry.”

4.2 Resulting Application - Time2Go

The development finally resulted in an application called Time2Go. The application covers the scenario mentioned in 4.1.3 and can be divided into three views. First, a welcome screen is shown. After the user has pointed towards a bus stop and pressed the menu button, a second screen appears, displaying a list of what the user hit. If nothing was hit, a suggestion is shown instead. The third and last screen is a detailed view of the different departures for a certain bus, which appears if the user selects a bus from the list in the result view.

4.2.1 Welcome Screen

When the user starts Time2Go, the application begins by searching for a GPS-signal and then proceeds to download data about nearby bus stops from UL’s servers. Without this initial data, the application is not able to perform any work, so these first steps are mandatory. In figure 4.1, the screenshots to the left and in the middle show the popup screens displayed while waiting for a GPS-signal and while downloading data, respectively. On the right is the welcome screen shown after the popup screens have disappeared, indicating that the application is ready for use. The text of the welcome screen instructs users to aim at a bus stop and press the menu button to see departures for that bus stop, or to press the “Jag vill ha instruktioner!” button for more detailed instructions. Pressing this button shows a small animation (the key frames of the animation can be found in figure 4.2). Once the animation completes, the welcome screen is shown again.

4.2.2 Bus Data

The data is downloaded from a system provided by UL. A query is sent to this system about bus stops in the area surrounding the user, and departure data is then requested for each of these bus stops. The data is requested over a mobile or wifi network connection, depending on what is available to the phone.


Figure 4.1. The welcome screen of Time2Go. From left to right: waiting for GPS-signal, downloading bus stop data and the application ready for use.

Figure 4.2. Animation with detailed instructions on the use of Time2Go.

New data is requested if the old data becomes outdated or if the user moves to a new area.

4.2.3 Interacting

Once a GPS-position and bus stop data have been obtained, the application is ready for use. Figure 4.3 demonstrates how a user uses Time2Go to catch a bus. In the picture on the left, he points the phone towards the bus stop and presses the menu button. Immediately, the current timetable for the bus stop appears on the screen (picture in the middle), where he can see that the bus will soon arrive. Walking over to the bus stop, he arrives just in time to catch the bus (picture on the right).


Figure 4.3. A user using Time2Go to catch a bus.

Figure 4.4. First picture: list of hit bus stops. Second picture: same as the first picture, but with the map shown. The left circle segment points where the user is currently pointing (i.e. updated in real time), while the other circle segment shows where the user was pointing when pressing menu. Third picture: no hit; the application suggests the closest bus stop. Fourth picture: the view after the user agreed to the application’s suggestion.

4.2.4 Result View

After the user has pressed the menu button, a result view is shown. The appearance of this view depends on whether the user actually hit something or not. The first picture of figure 4.4 shows what the result view looks like when the user hit something. For each bus stop that was hit, a list item is displayed with its name and a color coded icon, followed by several list items showing the number, destination and time to departure of buses that are soon to depart. The black arrow at the bottom of the screen is the handle of a drawer containing a map. Users drag this handle upwards to open the drawer. The second picture in figure 4.4 shows the result view when this drawer is open.


The map shows icons displaying the locations of nearby bus stops. Bus stops which were hit have a colored icon matching the icon in the result list to facilitate navigation, while bus stops that were not hit are displayed using a grey icon. The map updates its orientation when the user presses the menu button, to reflect the direction in which the user was pointing. The circle segment in the middle and the small circle around the icon representing the user show the area in which the search for bus stops was performed. The circle segment on the left is updated in real time and shows where the user is currently pointing. If the user did not hit any bus stop, the view will look like the third picture¹ in figure 4.4. The text tells the user that there was no direct hit, and asks if the user meant the closest bus stop. The user can now press the button with the name of the bus stop to select it, or point and press menu again to try to hit a bus stop directly. The fourth picture in figure 4.4 shows the view after the user chose to use the closest bus stop. Notice how the color of the bus stop’s icon changes from grey to indicate that this bus stop is now selected.

4.2.5 Detailed View

The result view provides a quick way to see the next departure of several buses, but sometimes a little more information is needed. What if I miss the next bus? How long do I have to wait? To answer such questions, a detailed view was added. Clicking on a bus using the touch screen opens a detailed view. This detailed view, which can be seen in figure 4.5, provides information about the next four departures, as well as the last time the bus departed. An icon displays what sort of bus it is², and together with the number and destination of the bus, the name of the selected bus stop is displayed as well. If applicable, the stop point within the bus stop is also displayed together with the name of the bus stop. Large bus stops, for example, might have many different points where a bus may stop.

¹ However, the map drawer will be closed. It was opened in this picture to help show what happens when the user selects the suggested bus stop.
² For example, UL has three different types of buses (city buses, long distance buses and express buses) and one type of train.

4.3 Design Decisions

In this section, a few of the major design decisions will be mentioned and explained. For more details, see the report by my project colleague Josefin Löfström [13].



Figure 4.5. The detailed view of a subway train, showing the last departure as well as the next four departures.

Not all of these decisions were made before the application development started, as some of them were the result of the user evaluations (see section 4.4).

4.3.1 The Map

One of the requirements imposed on this degree project was that we were to use a map provided by an Ericsson API in the application. Because of the uncertainty of the GPS-position, it was decided that the map would be used to communicate this uncertainty. In cases where users did not hit what they intended, they could consult the map in order to understand what had happened. If the GPS-position was wrong, the user would see that the reported position was some way off and could compensate for this, or at the very least understand why something went wrong.

The map API provided by Ericsson includes the capability to rotate the map in any direction. This was used to give users a properly oriented map in the result view. One idea was to let the map itself rotate in real time to reflect where the user is currently pointing (currently done by only having a circle segment rotate, see 4.2.4), and to pin the circle segment indicating the last intersected bus stops to the map itself. This would be almost the opposite of how it works today, and might have been a more natural way of displaying the map. However, limitations in the performance of the map API prevented the map from rotating in real time, so it was decided that the current solution was the best possible.


4.3.2 Precision

To prevent users from losing faith in the application, a certain success rate of the pointing interaction is needed. Because of this, a goal was set for the application: if possible, the bus stop the user pointed at should be included in the list of possibly hit bus stops at the user’s first interaction attempt. If the application fails to do this, a second try should result in the proper bus stop being included in the list. The user should never have to try three times to succeed. In terms of design decisions, this led to the circle segment that captures bus stops being made larger, in order to increase the chance of capturing the point of interest that the user intended.

4.3.3 Interaction Trigger

As mentioned in 4.2.3, the user presses the menu button to trigger the interaction. This button was chosen simply because it was comfortable to press with the thumb on our G1 mobile phone while pointing towards a bus stop. It can be debated whether it was wise to use a button so clearly intended for a special purpose (to display a menu in this case), especially since this button is placed in a different location on other Android mobile phones, a location where it is a lot harder to reach comfortably while pointing. Better alternatives might have been to use the trackball that all Android phones feature, or to place a touch screen button at the bottom of the screen. The downside of the latter approach is that it would require precious screen space.

4.3.4 List of Results

Using a larger circle segment when selecting bus stops meant that more bus stops could be included in the result list. The results were presented in a list where bus stops appear sorted according to distance from the user, with the closest one at the top. For every bus stop hit by the user, a list entry was made with the name of the bus stop and an icon. After each bus stop entry followed one list entry for every bus departing from that bus stop, including number, destination and time to next departure. When all buses departing from the bus stop had been listed, a new list entry with a new bus stop was made. It was deemed most likely that the closest of the hit bus stops would be the one the user intended, since situations where a user points over one visible bus stop towards another would probably not be as common as situations where the user can only see one bus stop³.

³ Note that the system used by UL treated two bus stops on opposite sides of a road as one and the same bus stop, and not as two separate stops.



This way a user would hopefully find the bus stop he or she was looking for among the first few in the list. The next step for the user would then be to locate the correct bus in the list of buses departing from that bus stop. To aid the user in doing this, the buses were sorted by their number, since we noted that people tend to refer to buses by their number rather than by their destination. It would have been possible to sort the list according to departure time, but this would mean that the list would suddenly change every time a bus departed. According to Ailisto et al. in [2], the interaction with the mobile phone might not be the primary action of the user. For example, the primary action could be to walk down the street while avoiding colliding with someone, and the attention the user pays to the phone is therefore fragmented. It might be confusing to the user if the screen of the phone is not the same when he or she returns to it after focusing on the primary action. Technically it would have been possible to support user-selectable sorting, but to avoid introducing elements into the interface that were not of interest during the user evaluation, we decided not to include such a feature.

4.3.5 Detailed View

A detailed view for each bus was created to allow users to see more than just the next departure. To access this view, a user would simply click the bus of interest, and the view would appear, showing when the bus last departed and the next four departures. It would also show the destination, the bus stop for which the departure times were valid and, if applicable, from what part of the bus stop the bus would depart. Using this view, users could see the next few departures, should they miss the current one.

Five departures were included in total. The reason for this was that the most frequently departing buses in Uppsala departed at five-minute intervals. With the list showing one previous departure and four coming departures, the user is able to see at least 15 minutes into the future, even if the next departure is only a few seconds away. Because the intended use case was that the user is already on his or her way to the bus stop and can see it, we believe seeing 15 minutes of future departures should be sufficient for the user to plan ahead. To support more use cases, however, the option to see later departures should be added.

4.4 User Evaluation

During the development of the Time2Go application, user evaluations were performed regularly. The first few tests were informal, conducted with Ericsson employees in Kista using mocked-up data. The result was mainly small changes to the GUI and to the instructions on how to use the application.


Figure 4.6. The first version of the Touch2Go application, used only in the user evaluation in Uppsala. The interaction flows from left to right when the user selects an item.

Later in the development phase, two formal qualitative user evaluations were conducted, one in the centre of Uppsala and one at the Royal Institute of Technology, KTH, in Stockholm. In addition to this, my supervisor Alex Olwal did a heuristic expert evaluation of the interface in between the two user studies.

4.4.1 Alternative Applications

In addition to Time2Go, two more applications were created to be presented in the user evaluation as alternatives. All three applications provide the same information; the difference lies in how users interact to access it.

Touch2Go

Touch2Go incorporates a more traditional interface compared to Time2Go. Instead of pointing towards a bus stop and hitting a button to select it, Touch2Go lets users select the bus stop they are interested in by clicking on an icon on a map, using the touch screen. The map was fixed to the GPS-position of the user, with north always being up on the screen. We chose to include this interface in the user evaluations to see how our application fared against a more traditional one. Most users are used to interacting directly with their phone, either through a keypad or a touch screen (or a combination thereof), and the traditional map is something almost everyone is familiar with. Touch2Go can also be said to use the scanning paradigm, whereas Time2Go uses the pointing paradigm (see section 2.1).


Figure 4.7. The second and final version of the Touch2Go application. The visual appearance and the information provided are the same as in the Time2Go application.

As can be seen in figure 4.6, the first version of Touch2Go differed in appearance from Time2Go. This version was only used during the user evaluation performed in Uppsala. After this, Olwal pointed out that the results of the user evaluations would be fairer if only the aspects to be compared differed between the two applications. The updated version of Touch2Go that was used during the user evaluations at KTH can be seen in figure 4.7. In this version, the only thing that differs between Touch2Go and the main application Time2Go is the first screen. After this initial screen, both applications look the same.

Radar2Go

In Radar2Go, seen in figure 4.8, users are presented with a radar-like egocentric view of the world around them. A small icon represents the user, and when the user turns, so does the view on the phone, in real time. Points of interest are represented by icons on the screen. Icons in front of the user on the screen represent points of interest in front of the user in the real world. If the user turns left, points of interest that were previously in front of the user will now be to the right of the user, and accordingly the icons representing those points of interest now lie to the right of the icon representing the user. Users select the points of interest they are interested in by clicking on their icons on the screen.

The purpose of including Radar2Go in the user evaluation was to provide participants with another non-traditional way of interacting for comparison. This application, like Time2Go, takes advantage of the GPS, magnetometer and accelerometer to figure out what the user is viewing. However, instead of using this information to guess which point of interest the user wants to interact with, as is done in Time2Go, the information is simply displayed on the screen, where the user can indicate exactly which point of interest to interact with.


Figure 4.8. The Radar2Go application. To the left: the radar view showing there are two bus stops to the right of the user. In the middle: the screen shown after the user has selected one of the bus stops. To the right: detailed view of one of the buses.

This way we hoped to be able to draw conclusions about the added value of not interacting on the screen.

4.4.2 The Uppsala Test

A pilot study was performed in Uppsala, Sweden, with 6 participants. The participants were divided into groups that were tested separately. The first group consisted of three persons, the second of two and the last of one. The groups were gathered some distance away from the bus stop Stora torget in the centre of Uppsala and the usage scenario was presented (see 4.1.3 for details). The task given was to find out the next few departures of a certain bus, to see if there would be time for a short stop in a shop without having to wait too long for the next departure. Each group first got to try out the Time2Go application, followed by Touch2Go. At the time, Radar2Go was not available for testing. After testing each application, the participants were asked what they thought of it, what aspects were good and what were bad. In the end, the participants got to compare the two applications to each other and had to choose which one they liked best.

This was our first study, and there were several things that we could have done better. One of the problems was that the bus stop the participants were told they were heading towards was not visible from the spot where the evaluation was performed. Since Stora torget was a bus stop that none of the participants used frequently, it was hard for the participants to stay with the scenario. Another problem was that the order in which the applications were tested was always the same, which might have biased the results.


Also, the Touch2Go application looked far less developed than the Time2Go application (see 4.4.1), something that may have affected the results as well. Taking all this into account, we chose to treat this user evaluation as a pilot study. Some small changes were made to the interface of Time2Go, but no conclusions of a more general kind were drawn from comparing the two applications. Instead we tried to learn from our mistakes and conducted a second user evaluation to obtain more reliable results.

4.4.3 Heuristic Expert Evaluation

Between the first and second user evaluation, my supervisor Alex Olwal performed a heuristic expert evaluation of the interfaces of Time2Go and Touch2Go. As is usual when doing a heuristic expert evaluation, Olwal walked through the interface screen by screen and his opinions and suggestions were recorded. This evaluation led to a list of changes to the applications, and he also suggested adding an application with a radar-like first screen, which became the Radar2Go application. Examples of changes made were: to include a welcome screen where the instruction animation did not start automatically, as was the case before; to give a suggestion of a bus stop when the user did not hit anything; to give some sort of live feedback of where the user is pointing (now included in the map view); and to make Time2Go and Touch2Go look as similar as possible, to make sure people react to the difference in interaction and not to interface differences.

4.4.4 The KTH Test

A second user evaluation was performed at the Royal Institute of Technology, KTH, in Stockholm. This test was performed with the pilot study in Uppsala in mind, trying to fix the problems that had been identified.

Method

This test was performed as a field study. According to Preece et al. in [20], a field study is typically used when studying a product or prototype being used in its intended environment. The point is to find out how the product works in its intended environment, rather than in a controlled environment, the latter being the case when doing usability testing. We were more interested in what users thought about the different applications when used in their intended environment, since this would be a major factor for Ericsson when deciding what interaction technique to use in future products. Pure performance statistics on how users perform when using the different types of interaction would not be of use if users do not enjoy the ways of interacting in the first place.


Interviews and observations were used to gather data, together with a qualitative approach to data analysis. In [20], Preece et al. state that interviews and observations are the primary methods used in field studies, and that qualitative analysis of data “focuses on the nature of something”. By focusing on the nature of the participants’ thoughts and opinions, we hoped to uncover the underlying reasons for why the participants thought as they did. These reasons could point to areas where further and more thorough studies could be performed. One could argue that a combination of qualitative and quantitative data analysis would have given a more solid result and that other data gathering methods could have been used. Given the small number of participants in the evaluation, we judged that the additional information a quantitative analysis could provide would not be of interest, since it could not be generalized to the whole population of users. Preece et al. also suggest that interviews and direct observations in the field are good ways to explore issues with a product and to gain insight into the context of user activity. Therefore we chose to use these two methods.

Test Design

The test performed at KTH was designed to answer the following question: Which selection paradigm do users prefer in the scenario presented in section 4.1.3, and why? The three different selection paradigms of Time2Go, Touch2Go and Radar2Go were compared using a comparison test. The only thing that differed between the applications was the selection paradigm; thus this was the independent test variable. The reactions and comments of the participants constitute the dependent test variable. Several qualitative usability factors were identified from the users’ reactions, what they said and the discussions around their opinions of the different applications.

Participants

A total of eight persons participated in the user evaluation. Two of the participants did the evaluation alone and the rest in pairs. By doing the test in pairs, we hoped to inspire discussions between the two participants about the tested applications and their strengths and weaknesses. The average age of the participants was 23, with the oldest being 26 and the youngest 20. Two of the participants were female, the rest were male. All participants were master students at KTH.


Two users had experience using smartphones, while the rest mainly used their phones to schedule things in the calendar, to make calls and occasionally to surf the Internet through WAP.

Test Procedure

The evaluation was performed close to Infocenter at KTH, from a spot about 100 meters away from the entrance to a subway station and a bus stop. The subway entrance and bus stops were clearly visible and the participants knew their locations well. Each participant began by writing down some information about himself or herself (age, occupation and how they were using their mobile phone). One of the test leaders then introduced the participants to the usage scenario:

“You have just finished for the day and are heading home. As usual you want to take the bus from the nearby bus stop. The weather outside is foul, so you do not want to miss the bus. Unfortunately, since you have not learned the timetable by heart, you do not know if you need to run to catch the bus. Therefore you take out your mobile phone and point it towards the bus stop. By pushing a button you immediately see that the bus arrives in 2 minutes, so you will need to hurry if you do not want to miss it.”

Each of the participants got to try out all three applications: Time2Go, Touch2Go and Radar2Go. The order of testing was shifted between groups, so that no group tested the applications in the same order as another group. After testing each application, the participants were asked about their thoughts and opinions on the application they had just tested, and whether there was something they thought was bad, strange, good or that they would like to change. Opinions were recorded using pen and paper. When all the applications had been tested, participants were asked to compare the applications and to motivate which application they thought was the easiest, fastest and most fun to use, respectively, and finally which application they preferred and why.

4.4.5 Results of the User Evaluations

Since the test in Uppsala had some flaws, the results from that user evaluation will not be included here, because their correctness cannot be guaranteed. Due to these flaws, the results were mostly small changes to the user interface, and no conclusions about the interaction or the hypothesis were drawn from this evaluation. The rest of the results presented in this section are therefore based on the second user evaluation.


Because the KTH user evaluation was a qualitative study, the results will not be expressed in numbers or graphs, but rather in text and perceived thoughts and opinions. Due to the low number of participants in the user studies, we chose to focus on the interesting opinions that surfaced during the evaluation. These opinions could form a base for future and more rigorous user evaluations.

During the evaluation, the order in which participants tested the applications was shifted to make sure no bias was introduced by judging one application by another. The order seemed to have little or no effect on the result, as the participants tended to have the same opinions of the applications even though they tested them in different orders.

Most participants felt that the Radar2Go application was missing a map, but that it was easy to use, especially if they did not already know where they were going. Some participants requested a zoom function, to be able to see more of what lay around them. Many participants liked the map of Touch2Go, but would have liked it to rotate according to where they were looking, instead of being fixed like a traditional paper map. Some felt that it also gave a clearer picture of where bus stops were in relation to each other, and suggested that merging Radar2Go and Touch2Go would be a good way to improve both applications. However, one participant noted that “if I know where I am going, I have no need for a map”.

When introduced to the Time2Go application, many participants reacted positively to the experience of being able to point to access the timetable. One participant even thought it was a joke that she was supposed to point towards the bus stop and press the menu button, but was very pleased when it worked. “Easy” and “intuitive” were words used to describe the interaction. Many participants commented that this was a good way of interacting when they could see the bus stop or knew exactly where it was, but that it would not be as useful when the location of the bus stop was not known beforehand. Time2Go was also thought to be the most fun and the quickest application to use. Although the number of clicks needed to get the next departure of a bus was the same for all applications, some participants perceived it as if they needed fewer clicks to get the same information when using Time2Go.


Chapter 5

Conclusions, Discussion and Future Work

In this final chapter, the conclusions drawn from the results will be presented, along with a discussion of different aspects of the work as a whole. The chapter ends with suggestions for future work and aspects of our work that would be interesting to investigate further.

5.1 Conclusions

Since our user evaluations were qualitative studies, the conclusions drawn here are based on our interpretations of comments and thoughts expressed by the participants of the user evaluations. As with any qualitative study, it is important to remember that the conclusions have been filtered by us as researchers and that our presence and questions might have affected the result. These conclusions are meant as a basis for discussion and further research, and should not be seen as proven facts.

5.1.1 Interaction Conclusions

No direct evaluation of the accuracy of the interaction technique was made, but during the user evaluations no user complained that they did not hit what they wanted. Nor have we experienced any difficulties while using or testing the various prototypes created during the course of our work. While this is positive and might suggest that the performance is not entirely bad, it is not possible to say whether the goal of always including the proper bus stop on the first or second try has been met.


To do this, a formal evaluation of the performance has to be made.

Another goal was to see if it was possible to implement an interaction technique that would allow users to bridge the virtual and physical worlds using an off-the-shelf mobile phone equipped with a GPS-receiver, accelerometer and magnetometer. The conclusion is that this is entirely plausible, but that the success of such an interaction depends on the accuracy needed to make the interaction work.

During the user evaluation, many participants reacted with surprised and very satisfied expressions the first time they tested Time2Go. When asked about the scenario given in the evaluation, most responded that they found Time2Go to be intuitive, fun and quick to use. This suggests that users prefer the interaction of Time2Go in the given scenario over the more traditional interaction of Touch2Go and over the alternative interaction of Radar2Go. There are several factors that could have affected this result, however. Participants only got to test the applications for a few minutes, and only on one occasion in one scenario. When testing a new and novel way of interacting, such as by pointing, there is a possibility that the “coolness” of the new interaction technique might overshadow its negative aspects and thus bias the results. It is also possible that the users’ opinions might have changed if they had been allowed more practice in using the three applications, taking away some of the effect the learning curve might have had. Many participants of the user evaluations also commented on the scenario, saying that if they could not see what they wanted to interact with, they would prefer other ways of interacting than that of Time2Go. Since these are all factors that could have affected the result, it is hard to say if users would really prefer the interaction of Time2Go if it were part of an application that they used on a day-to-day basis.

One of the results presented in 4.4.5 was that participants of the user evaluation perceived that they needed fewer clicks to access timetables when using the Time2Go application, despite the fact that the number of clicks needed to access the information was the same for all tested applications. The difference between Time2Go and the other two applications in this respect was that in Time2Go users pressed a physical button while pointing with the device, while in the other two applications users pressed an icon representing the bus stop they wanted timetables for. One theory on why users perceived Time2Go to require fewer clicks is that the cognitive load of choosing a bus stop with Time2Go is smaller, since no mapping between the desired physical bus stop and its representation on the screen is needed. With Time2Go users can basically say “that one there” and press a button to access the wanted information, while when using the other two applications an additional connection is needed, as in “that one there which is this one here”, before they can press the button.



5.1.2 Conclusions About the Hypothesis

During the concept development phase, a hypothesis (see 4.1.1) was created. This hypothesis describes situations where the pointing interaction implemented in the interaction prototype can be useful. The user evaluation at KTH did not provide any indications that this hypothesis would be wrong. At times, participants of the evaluation discussed scenarios other than the one that was presented, and when discussing scenarios where the bus stop was not visible from the point where the user was standing, or where the user did not know the location of the bus stop at all, most seemed to request a combination of the Touch2Go and Radar2Go applications. On the other hand, many participants described Time2Go as fun, intuitive and quick to use in the scenario that was presented. As mentioned in 5.1.1 Interaction Conclusions, there are several factors that could have affected this result, and the factors discussed in that section apply here as well.

While the findings on the pointing paradigm presented in [24] support part of the hypothesis, and while some thoughts and opinions that surfaced during the user evaluations seemed to be in favour of it, our studies are not enough to conclude that the hypothesis is valid. For example, we cannot say whether an association between the information the user is interested in and the object he or she is pointing at is necessary, nor can we conclude that a line of sight is needed, since we only tested the interaction in one scenario with one type of object. We did not have the time to perform more studies and therefore have to leave it to other researchers to determine whether this hypothesis is correct or not.

5.1.3 Concept Conclusions

Looking at the results of the user evaluation (see 4.4.5), participants seemed to think that Time2Go was a fun and natural way of interacting when they found themselves in the intended usage scenario. As soon as the scenario was changed, however, the participants seemed to prefer other ways of interacting. The purpose of the application concept was to demonstrate a situation where the pointing interaction was found to be natural, something which we seem to have succeeded at. It should also be said that the usefulness of this concept as an application is limited, since the users’ preferred interaction technique changes when the usage scenario is slightly changed. Sections 5.3.3 and 5.3.7 of the Future Work section of this chapter contain ideas for improvements that could solve this problem.


5.2 Discussion

In this section, different aspects of the work performed during this degree project will be discussed in more detail.

5.2.1 Interaction Prototype

One of the challenges of this degree project was to handle the performance of the sensors. After having tested and used the prototypes made during our work, it seems to us that what was said in [27] about the accuracy of the different sensors is correct. There are situations where the sensors report values that are far off. What makes it hard to predict and handle these errors is that the factors affecting the sensors are so many and so varied. For example, large metallic objects might have magnetic fields of their own, making the magnetometer always point in their direction when the user stands in their vicinity. Handling situations where the compass bearing is as much as 180° off poses a great challenge, both technically and in terms of how to communicate the error and its cause to the users. The future success of this sort of interaction lies in how well such errors can be handled. If the errors manifest themselves too often and are not communicated to the users in a suitable manner, users will eventually turn to other ways of interacting. For example, a clever workaround for the technical limitations was presented in [32]. The use of computer vision algorithms in combination with position eliminates some of the dependency on the accuracy of the GPS-receiver, since the position is only used to fetch the correct set of image features used by the computer vision algorithms. Until a better positioning system for use in urban environments has been developed, this sort of hybrid solution will probably be the most successful.

In section 3.1 Interaction Choices, the reasons for choosing the pointing paradigm as the interaction paradigm for this project were explained. The main reason was that Ericsson was most interested in exploring this way of interacting. Without this restriction, however, it would have been possible to create a mobile augmented reality application while still honouring the other limitations of the project. This would have led to completely different concepts and other technical solutions as well. Whether this would have been a better path is hard to say, as the usage scenarios of the two different types of applications (mobile AR and Geo-Wands) are different. Geo-Wands are mostly used as a way of selecting some object to interact with, while mobile AR applications are used to provide information about an object in the context of the object itself. Our application allows the user to select a bus stop that is of interest, while an AR version of the same concept would provide the information of interest in connection with the image of the bus stop in the camera feed on the phone’s screen.


In this scenario, the difference lies in where the user keeps his or her focus. With the Geo-Wand, the focus is on the real world, while with a mobile AR application the focus would be on the screen of the phone¹. A problem with a mobile AR application would also be how to show the necessary information in the camera feed in connection with the bus stop. Larger bus stops in Uppsala have several tens of buses departing from them, and showing them all on the screen would not be feasible.

5.2.2 Concept Development

Seen from a Human-Computer Interaction (HCI) point of view, the process of concept development in this project was in some ways backwards. In HCI, it is much more common to start with users, usage scenarios or a product to improve, and to create a product or improvement from there by observing or talking to users. In this project, we were given a technique and were asked to find situations and users that would benefit from it. To me, this way of working when developing products for the end-user market seems a bit wrong. It is great to have another type of interaction to use as a tool for solving HCI problems, but it seems better to start from the users’ point of view and find the right tool for them, rather than trying to find the right users or situation for the tool. By starting with the solution, you in some sense have to work to find a problem that you believe, but cannot know for sure, users have. I believe that when working in this manner, it is much harder to find a good match between users and technique, a match that makes the interaction flow naturally. When working with the user in focus, it is possible to identify problems users really have and then work to solve these in the best possible way.

Partially due to this, the final concept and application that became the result of the concept development would most likely not in itself be a useful application. The problem solved by the application is a small one, one that users might not experience that often. If this is the case, it could partially explain why some participants had a hard time staying with the scenarios presented in the user evaluations. They could not identify themselves with the problem the application solved, since it was a problem that they had seldom or never encountered themselves. The whole application is built to solve a very specific problem, to give users information about departure times when they are within viewing distance of the bus stop, and thus users naturally suggest improvements to the application that would make it more useful to them. In this sense, maybe the nightclub concept would have been more successful, as its usage scenario might be one that users are more familiar with and experience more often.




5.2.3 User Evaluation

Even though some conclusions and improvements could be drawn from the user evaluations, there is plenty of room for improvement in this part of our work. Our inexperience in designing and executing user evaluations resulted in evaluations where it was not entirely clear what had been tested, especially during the evaluation in Uppsala. The results obtained mostly consisted of our interpretations of what the participants said, and from this it is hard to draw any valid conclusions on how the interaction really performed. Of course, time constraints were also a factor here, as the user evaluations were only a small portion of the total work performed. Despite this, I think this part of the project could have been improved with better planning and more carefully thought-through user evaluations.

5.3 Future Work

In this section we will present a few different directions that we would have liked to explore, or that we would pursue if we were to continue working on this project.

5.3.1 Error Prediction for Sensors

Two of the methods for intersection calculations developed during this project included a mechanism to compensate for errors in the sensor values. This mechanism was not tested in this project, and it would be interesting to evaluate how well it works. The problem, as it turned out, is that it is hard to estimate the errors in the sensor values in a good way. Future work could be to find better ways of estimating these errors, especially the error in the GPS position, which seemed to have the largest negative impact on the application’s performance.
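To make the idea concrete, the following is a minimal sketch, in Java, of the kind of compensation discussed here: the angular tolerance used when deciding whether an object lies along the pointing direction is widened according to the reported GPS accuracy and the distance to the object. The class, method and constant are invented for this illustration, the base tolerance is an assumed value, and this is not the patent-pending intersection method developed in the project.

    // Illustrative sketch only. Assumes the GPS receiver reports an estimated
    // horizontal accuracy in metres (as android.location.Location.getAccuracy()
    // does on Android). A position error of r metres, seen from a target d
    // metres away (d > r), can shift the true bearing by up to asin(r/d), so
    // the tolerance grows when the position is poor or the target is close.
    public final class PointingTolerance {

        // Assumed fixed tolerance covering compass noise, in degrees.
        private static final double BASE_COMPASS_TOLERANCE_DEG = 10.0;

        static double toleranceDegrees(double gpsAccuracyMeters,
                                       double targetDistanceMeters) {
            if (targetDistanceMeters <= gpsAccuracyMeters) {
                // The target may lie in any direction from the true position.
                return 180.0;
            }
            double positionTermDeg = Math.toDegrees(
                    Math.asin(gpsAccuracyMeters / targetDistanceMeters));
            return BASE_COMPASS_TOLERANCE_DEG + positionTermDeg;
        }
    }

An object would then be considered hit if the difference between the compass bearing and the bearing from the estimated position to the object is smaller than this tolerance.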

5.3.2 Evaluation of the Hypothesis

In section 5.1.2, conclusions about the hypothesis were presented. The main finding was that the evaluations performed during the project were not enough to conclude anything about the hypothesis. Nothing was found that contradicted the hypothesis, but the evaluations were not planned carefully enough to test it extensively. Performing a carefully planned evaluation of this hypothesis could provide valuable insight into its validity. If proven correct, it could be of great help when deciding whether or not to use the pointing interaction in a service or application.


5.3.3 Other Concepts

In the concept development phase of the project, four different concepts were developed, but only one made it into a real application. After developing and evaluating the public transportation concept, we found that this might not be the best application area for this type of interaction. Because of the lack of time we could not explore the other concept ideas further, but we believe that especially the nightclub idea might have the potential to be a concept that users find more useful. By not restricting the concept development process to concepts that are technically possible to implement today, or to applications that should be ready to use right after they have been downloaded, new and perhaps more innovative uses of this interaction technique might be found.

5.3.4 Radar2Go and Touch2Go Combined

Many participants in the user evaluation at KTH suggested that they would have liked to combine Radar2Go and Touch2Go. They liked that the orientation of Radar2Go always showed what they were looking at, but missed the map of the Touch2Go application. A future study on how Time2Go fares against such a combined system could be an interesting comparison. Researching this might also provide useful guidelines for developing applications that implement a scanning-like way of interacting with objects connected to a geographical position. Due to technical limitations in the map used in this project, it was not possible to test such a combination.

5.3.5 Extend Time2Go

As mentioned in the discussion about concept development (see section 5.2.2), the Time2Go application solves a small problem that users might not experience that often. Extending Time2Go and turning it into a complete travel planner could be an interesting way to continue the work performed in this degree project. In such an extension, the current functionality of Time2Go could be one of several alternatives for planning a journey, where the user could select the start or end point of a journey by pointing towards the intended bus stop. Extended functionality could include the ability to plan a whole journey, to save destinations or departures as favourites, and to see information about traffic disturbances and cancelled departures. Combining Time2Go, Radar2Go and Touch2Go might even be an alternative, letting users choose the way of interacting that they prefer and supporting more usage scenarios. Minor improvements to Time2Go could be to allow users to sort the list of buses and bus stops to their liking, or to see more departures in the detailed view.


5.3.6 Interaction Paradigms in an Outdoor Environment

In [24], Rukzio et al. tested the touching, pointing and scanning paradigms in an indoor smart environment and suggested that the pointing paradigm would be most successful when users can see the object they want to interact with but are not close enough to touch it. They also concluded that users try to avoid the scanning paradigm if possible in this context, but the results from our user evaluations indicated that for an outdoor environment, this might not always be the case. Users seemed to request a scanning-like interface when they did not know beforehand where they were going, or when they did not have a line of sight to the object. It would be interesting to see further studies evaluating the three paradigms in an outdoor environment.

5.3.7 Pointing as a Complementary Interaction Technique

During the concept development phase of the project we tried to find concepts where pointing interaction would be the main way of interacting. Although we found a few concepts where this could be true, we also believe that these concepts would benefit even more from a mix of interaction techniques. With more than one interaction technique available, users could choose the technique they feel is the most natural and best suited to the situation at hand. Such a mix of interaction techniques could prove to be more powerful than the individual techniques on their own, and we think pointing interaction would constitute a good alternative technique to be used in special situations. Possible future work could be to evaluate pointing as a complementary interaction technique, instead of trying to use it as the main interaction technique, as was the case in this project.


Bibliography

[1] Adams, A., Gelfand, N., and Pulli, K. 2007. Viewfinder Alignment. Computer Graphics Forum, 27 (2), pp. 597-606. DOI= http://dx.doi.org/10.1111/j.1467-8659.2008.01157.x

[2] Ailisto, H., Pohjanheimo, L., Välkkynen, P., Strömmer, E., Tuomisto, T., and Korhonen, I. 2006. Bridging the physical and virtual worlds by local connectivity-based physical selection. Personal and Ubiquitous Computing 10 (6), pp. 333-344. DOI= http://dx.doi.org/10.1007/s00779-005-0057-0

[3] Azuma, R. T. 1997. A survey of augmented reality. In Presence: Teleoperators and Virtual Environments, 6 (4), pp. 355-385.

[4] Buettner, M. and Wetherall, D. 2008. An empirical study of UHF RFID performance. In Proceedings of the 14th ACM International Conference on Mobile Computing and Networking (San Francisco, California, USA, September 14-19, 2008). MobiCom ’08. ACM, New York, NY, 223-234. DOI= http://doi.acm.org/10.1145/1409944.1409970

[5] Caruso, M. J. 2000. Applications of Magnetic Sensors for Low Cost Compass Systems. In Proceedings of the IEEE Positioning, Location, and Navigation Symposium (San Diego, USA, March 13-16, 2000). PLANS. pp. 177-184.

[6] Cuellar, G., Eckles, D., and Spasojevic, M. 2008. Photos for information: a field study of cameraphone computer vision interactions in tourism. In CHI ’08 Extended Abstracts on Human Factors in Computing Systems (Florence, Italy, April 05-10, 2008). CHI ’08. ACM, New York, NY, 3243-3248. DOI= http://doi.acm.org/10.1145/1358628.1358838

[7] Egenhofer, M. J. 1999. Spatial Information Appliances: A Next Generation of Geographic Information Systems. In 1st Brazilian Workshop on GeoInformatics.

[8] Fitzmaurice, G. W. 1993. Situated information spaces and spatially aware palmtop computers. Commun. ACM 36, 7 (Jul. 1993), 39-49. DOI= http://doi.acm.org/10.1145/159544.159566

[9] Google Corporation, 2010. Google Goggles for Android. [Online] Available at: http://www.google.com/mobile/goggles/ [Accessed 22 March 2010].

[10] Henze, N., Reiners, R., Righetti, X., Rukzio, E., and Boll, S. 2008. Services surround you: Physical-virtual linkage with contextual bookmarks. The Visual Computer: International Journal of Computer Graphics 24 (7), pp. 847-855. DOI= http://dx.doi.org/10.1007/s00371-008-0266-4

[11] Kähäri, M. and Murphy, D. J. 2006. MARA - Sensor Based Augmented Reality System for Mobile Imaging Device. Demonstrated at the International Symposium on Mixed and Augmented Reality (Santa Barbara, USA, October 22-25, 2006). ISMAR ’06.

[12] Kato, H. and Tan, K. T. 2007. Pervasive 2D Barcodes for Camera Phone Applications. IEEE Pervasive Computing 6, 4 (Oct. 2007), 76-85. DOI= http://dx.doi.org/10.1109/MPRV.2007.80

[13] Löfström, J. (In press). Physical Mobile Interaction Design for Spatially-Aware Pointing. Master’s Thesis in Human Computer Interaction at the School of Computer Science and Engineering, Royal Institute of Technology, Stockholm, Sweden.

[14] Nokia Corporation, 2009. Nokia Indoor Positioning. [Online] Available at: http://www.nokia.com/technology/upcominginnovations/indoor-positioning [Accessed 18 September 2009].

[15] Nokia Corporation, 2010. Nokia Point & Find. [Online] Available at: http://pointandfind.nokia.com/ [Accessed 22 March 2010].

[16] Nokia Corporation, 2009. Nokia Press Bulletin Board - Mobile Indoor Positioning trial launched at Kamppi Shopping Centre in Helsinki. [Online] (Published June 4, 2009) Available at: http://pressbulletinboard.nokia.com/2009/06/04/mobileindoor-positioning-trial-launched-at-kamppi-shopping-center-inhelsinki/ [Accessed 18 September 2009].

[17] Olwal, A. 2009. Augmenting Surface Interaction through Context-Sensitive Mobile Devices. In Proceedings of the 12th IFIP TC 13 International Conference on Human-Computer Interaction: Part II (INTERACT ’09), Tom Gross, Jan Gulliksen, Paula Kotzé, Lars Oestreicher, Philippe Palanque, Raquel Oliveira Prates, and Marco Winckler (Eds.). Springer-Verlag, Berlin, Heidelberg, 336-339. DOI= http://dx.doi.org/10.1007/978-3-642-03658-3_39

[18] Piekarski, W. and Thomas, B. 2002. ARQuake: the outdoor augmented reality gaming system. Commun. ACM 45, 1 (Jan. 2002), 36-38. DOI= http://doi.acm.org/10.1145/502269.502291

[19] Pohjanheimo, L., Keränen, H., and Ailisto, H. 2005. Implementing touchme paradigm with a mobile phone. In Proceedings of the 2005 Joint Conference on Smart Objects and Ambient Intelligence: Innovative Context-Aware Services: Usages and Technologies (Grenoble, France, October 12-14, 2005). sOc-EUSAI ’05, vol. 121. ACM, New York, NY, 87-92. DOI= http://doi.acm.org/10.1145/1107548.1107576

[20] Sharp, H., Rogers, Y., and Preece, J. 2007. Interaction Design: Beyond Human-Computer Interaction. John Wiley & Sons.

[21] Rekimoto, J. 1998. Matrix: A Realtime Object Identification and Registration Method for Augmented Reality. In Proceedings of the Third Asian Pacific Computer and Human Interaction (July 15-17, 1998). APCHI. IEEE Computer Society, Washington, DC, 63. DOI= http://doi.ieeecomputersociety.org/10.1109/APCHI.1998.704151

[22] Rekimoto, J. and Nagao, K. 1995. The world through the computer: computer augmented interaction with real world environments. In Proceedings of the 8th Annual ACM Symposium on User Interface and Software Technology (Pittsburgh, Pennsylvania, United States, November 15-17, 1995). UIST ’95. ACM, New York, NY, 29-36. DOI= http://doi.acm.org/10.1145/215585.215639

[23] Robinson, S., Eslambolchilar, P., and Jones, M. 2008. Point-to-GeoBlog: gestures and sensors to support user generated content creation. In Proceedings of the 10th International Conference on Human Computer Interaction with Mobile Devices and Services (Amsterdam, The Netherlands, September 02-05, 2008). MobileHCI ’08. ACM, New York, NY, 197-206. DOI= http://doi.acm.org/10.1145/1409240.1409262

[24] Rukzio, E., Leichtenstern, K., Callaghan, V., Holleis, P., Schmidt, A., and Chin, J. 2006. An Experimental Comparison of Physical Mobile Interaction Techniques: Touching, Pointing and Scanning. In Proceedings of the 8th International Conference on Ubiquitous Computing (Orange County, CA, USA, September 17-21, 2006). I. Smith, M. Y. Chen, F. Vahid, A. LaMarca, S. Benford, S. N. Patel, S. Consolvo, G. D. Abowd, J. Hightower and T. Sohn, Eds. UbiComp 2006, vol. 4206. Springer Berlin / Heidelberg, 87-104. DOI= http://dx.doi.org/10.1007/11853565_6

[25] Schuler, R. P., Laws, N., Bajaj, S., Grandhi, S. A., and Jones, Q. 2007. Finding your way with CampusWiki: a location-aware wiki. In CHI ’07 Extended Abstracts on Human Factors in Computing Systems (San Jose, CA, USA, April 28 - May 03, 2007). CHI ’07. ACM, New York, NY, 2639-2644. DOI= http://doi.acm.org/10.1145/1240866.1241055

[26] Simon, R. and Fröhlich, P. 2007. A mobile application framework for the geospatial web. In Proceedings of the 16th International Conference on World Wide Web (Banff, Alberta, Canada, May 08-12, 2007). WWW ’07. ACM, New York, NY, 381-390. DOI= http://doi.acm.org/10.1145/1242572.1242624

[27] Simon, R. and Fröhlich, P. 2008. GeoPointing: Evaluating the Performance of an Orientation Aware Location Based Service under Real-World Conditions. Journal of Location Based Services, 2 (1), pp. 24-40. DOI= http://dx.doi.org/10.1080/17489720802347986

[28] Simon, R., Fröhlich, P., Obernberger, G., and Wittowetz, E. 2007. The Point to Discover GeoWand. In Proceedings of the 9th International Conference on Ubiquitous Computing (Innsbruck, Austria, September 16-19, 2007). UbiComp ’07. Springer LNCS.

[29] Strachan, S. and Murray-Smith, R. 2008. Bearing-Based Selection in Mobile Spatial Interaction. Personal and Ubiquitous Computing, 13 (4), pp. 265-280. DOI= http://dx.doi.org/10.1007/s00779-008-0205-4

[30] Sutherland, I. E. 1968. A head-mounted three dimensional display. In Proceedings of the December 9-11, 1968, Fall Joint Computer Conference, Part I (San Francisco, California, December 09-11, 1968). AFIPS ’68 (Fall, part I). ACM, New York, NY, 757-764. DOI= http://doi.acm.org/10.1145/1476589.1476686

[31] Swindells, C., Inkpen, K. M., Dill, J. C., and Tory, M. 2002. That one there! Pointing to establish device identity. In Proceedings of the 15th Annual ACM Symposium on User Interface Software and Technology (Paris, France, October 27-30, 2002). UIST ’02. ACM, New York, NY, 151-160. DOI= http://doi.acm.org/10.1145/571985.572007

[32] Takacs, G., Chandrasekhar, V., Gelfand, N., Xiong, Y., Chen, W., Bismpigiannis, T., Grzeszczuk, R., Pulli, K., and Girod, B. 2008. Outdoors augmented reality on mobile phone using loxel-based visual feature organization. In Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval (Vancouver, British Columbia, Canada, October 30-31, 2008). MIR ’08. ACM, New York, NY, 427-434. DOI= http://doi.acm.org/10.1145/1460096.1460165

[33] Välkkynen, P., Korhonen, I., Plomp, J., Tuomisto, T., Cluitmans, L., Ailisto, H., and Seppä, H. 2003. A User Interaction Paradigm for Physical Browsing and Near-Object Control Based on Tags. In Proceedings of the Physical Interaction Workshop on Real-world User Interfaces (Udine, Italy, September 2003).

[34] Välkkynen, P., Niemelä, M., and Tuomisto, T. 2006. Evaluating touching and pointing with a mobile terminal for physical browsing. In Proceedings of the 4th Nordic Conference on Human-Computer Interaction: Changing Roles (Oslo, Norway, October 14-18, 2006). A. Mørch, K. Morgan, T. Bratteteig, G. Ghosh, and D. Svanaes, Eds. NordiCHI ’06, vol. 189. ACM, New York, NY, 28-37. DOI= http://doi.acm.org/10.1145/1182475.1182479

[35] Wagner, D. and Schmalstieg, D. 2003. First Steps Towards Handheld Augmented Reality. In Proceedings of the 7th IEEE International Symposium on Wearable Computers (October 21-23, 2003). ISWC. IEEE Computer Society, Washington, DC, 127. DOI= http://doi.ieeecomputersociety.org/10.1109/ISWC.2003.1241402

[36] Wang, J., Zhai, S., and Canny, J. 2006. Camera phone based motion sensing: interaction techniques, applications and performance study. In Proceedings of the 19th Annual ACM Symposium on User Interface Software and Technology (Montreux, Switzerland, October 15-18, 2006). UIST ’06. ACM, New York, NY, 101-110. DOI= http://doi.acm.org/10.1145/1166253.1166270

[37] Want, R., Fishkin, K. P., Gujar, A., and Harrison, B. L. 1999. Bridging physical and virtual worlds with electronic tags. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: the CHI Is the Limit (Pittsburgh, Pennsylvania, United States, May 15-20, 1999). CHI ’99. ACM, New York, NY, 370-377. DOI= http://doi.acm.org/10.1145/302979.303111

[38] Yee, K.-P. 2003. Peephole displays: pen interaction on spatially aware handheld computers. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’03). ACM, New York, NY, USA, 1-8. DOI= http://doi.acm.org/10.1145/642611.642613

Appendix A

Business Models

This appendix provides the interested reader with a short and entirely unscientific look at the business models currently used by available commercial applications. It does not contain any information critical to understanding the rest of the report and can easily be skipped.

None of the companies developing AR browsers or Geo-Wands today seem to want to charge users directly. Instead, they will charge content providers for letting them use their platforms to reach customers, or for appearing in beneficial spots in the applications themselves. One of the founders of Layar, Maarten Lens-FitzGerald, suggested in an interview (http://www.fastcompany.com/blog/kit-eaton/technomix/layar-web-browser-reality-comingsoon-iphone) that Layar could charge developers an administration fee, and that developers eventually would pay to get a spot in the “featured layer” section of Layar once the number of layers available to users is large enough. Building on the idea of administration fees, Nokia has in their Point & Find application created a management portal to which content providers can subscribe for a fee. Using this portal, they can add and edit content that users of the program can then access.

Another possible business model is to offer other companies tailor-made applications that utilize the powers of Geo-Wands and AR browsers. These applications could be offered as a way of exploring the companies’ products, or as part of a broader advertising campaign, for example for a new movie or product.
