CONTEXT BASED MOBILE INTERFACES
Christian Matthias Kohl Supervised by Andry Rakotonirainy
Faculty of Information Technology and Electrical Engineering
Submitted in partial fulfilment of the requirements for the Bachelor of Information Technology with Honours.
October, 2002
Declaration

I certify that this is my original work and that any content which is not my own has been quoted and attributed appropriately. I also declare that I am the student whose name appears on the cover page and that this work has not been submitted previously for assessment.
Christian Kohl
Acknowledgements

I would like to thank my supervisor, Andry Rakotonirainy, for his advice and guidance throughout the year. Furthermore, I would like to thank Douglas Kosovic from DSTC for his help with the MPEG-2 content, the generation and provision of MPEG-2 files with teletext included, and his code for teletext extraction. Many thanks also go to my parents for their patience with me and for being guinea pigs for my seminar practice runs. I also have to thank my sister, Kathrin, for proof-reading my thesis and making sense of my drafts. Finally, I would like to thank my best friend Aaron and my friends from university, Jon, Dee, Damo and Marc, who kept me sane over the last year.
Abstract

The honours project "Context Based Mobile Interfaces" examines the technology and usability of streaming MPEG-2 media on a mobile device. The unique features of this project are the extraction of teletext content contained in a satellite feed, as well as the design of a suitable user interface for displaying MPEG-2 media on a mobile device. The user interface has the purpose of conveying the information contained in the satellite feed to the user in a simple and easy-to-understand fashion. Additional methods for interacting multimodally with the mobile device have been considered. A significant aspect of the user interface is that, depending on the location a user is at or the tasks the user has to fulfil, the interface will adapt and only display certain parts of the media content, such as video, audio or teletext. This means that the user interface will alter dynamically with respect to the context in which users of such a device find themselves. The research done for this project will assist users in receiving TV-like media content on the go, without being restricted to a certain place or time, while being able to choose their own way of representing this content on their mobile device.
Contents

Declaration
Acknowledgements
Abstract

1 Introduction
  1.1 The Scope of this project
  1.2 Identification of issues
  1.3 Background material
    1.3.1 The concept of Multimodality
    1.3.2 The concept of Focus+Context Visualisation
    1.3.3 The concept of Mobile devices
  1.4 Summary

2 Literature Review
  2.1 Multimodal review
  2.2 Flip Zooming Review
  2.3 Summary

3 MpegView Model
  3.1 Specification
    3.1.1 General details
    3.1.2 MPEG-2 details
  3.2 Information Visualisation
    3.2.1 General details
    3.2.2 MPEG-2 details
      Amendment to MPEG-2 usability requirements
  3.3 Context Management
  3.4 Example
  3.5 Summary

4 Implementation
  4.1 Technology
    4.1.1 Linux Familiar
    4.1.2 OPIE
    4.1.3 MPEG-2
      Transport Stream
      xine library
  4.2 User Interface Design
    4.2.1 Architecture
      Existing Architecture
      Amended Architecture
    4.2.2 Mobile device constraints
    4.2.3 User Interface
      User input
      Teletext
      Video output
      Context Manager
  4.3 Process of Development
  4.4 Summary

5 Summary
  5.1 Summary of work
  5.2 Future Development

A xine Code listing
B Installation procedure for Familiar linux and OPIE
List of Figures

2.1 Item in focus [11]
2.2 Flip Zooming representation - no page in focus [11]
2.3 Flip Zooming representation - page in focus [12]
2.4 Hierarchical Flip Zooming - page in focus [12]
3.1 General Representation states
3.2 Page-based approach
3.3 MPEG-2 Stream Representational views
3.4 Tile representation
3.5 Display Representation
3.6 Visual representation of an example of the model
4.1 xine-lib architecture
4.2 Existing opieplayer2 architecture
4.3 MpegView architecture
4.4 MpegView module screenshot
4.5 VideoSkeleton module screenshot
4.6 Video, Audio and Teletext content user interface
4.7 Video, Audio and Teletext content user interface (showing controls)
4.8 Video content user interface
4.9 Audio content user interface
4.10 Teletext content user interface
4.11 Audio and Teletext content user interface
4.12 No content user interface
Chapter 1 Introduction

This document contains the findings of my Honours project on "Context Based Mobile Interfaces" and gives the reader a detailed presentation of the concepts involved. This thesis is part of my Bachelor of Information Technology degree and is the core assessment item for the Honours year. The following paragraphs present the scope of the Honours project as well as the background information required to understand the subsequent findings.
1.1 The Scope of this project
The project concentrates on the development of an effective way of representing information on a mobile computing device for a user whose focus and context change regularly. So-called Focus+Context Visualisation techniques summarise methods of representing extensive and detailed information so that the visual load on the user is decreased. The concept of this technique is further explained in the following sections. The objective of this thesis project is therefore to design a concept model describing rules and guidelines for a user's change of focus and the corresponding alteration of the user interface of a mobile device depending on the different contexts. The main focus is to find a way to represent multiple MPEG-2 streams on a mobile device, depending on the context in which users find themselves. Various context sources, such as a calendar containing meetings and tasks, the physical location or the current time, will determine the change in the user interface.

The change will reflect the current needs, constraints and preferences of the user so that the mobile device can be used efficiently. An implementation of the model will produce a user interface for mobile devices that is extremely space efficient and usable in the user's ever-changing environment. The model will also take into consideration multimodal interactions between user and mobile device, which enable greater use and flexibility. The output from the mobile device is represented multimodally, with video, audio and teletext output, and input methods include pen and voice. The outcome of this project is a working proof-of-concept of the developed model, showing the MPEG-2 streams according to the required rules and guidelines. These guidelines are examined in chapter 3 and are part of the model description for the change in the user interface according to the context of the user of a mobile device.
1.2 Identification of issues
At the moment, I am not aware of any technique that changes a user's interface dynamically depending on the user's context. I believe such a dynamic change will greatly increase the usability of mobile devices, because the user spends less effort adjusting the display to the information that is required. Furthermore, there is the issue of integrating streaming data, the MPEG-2 streams, into a dynamically changing interface and finding the best way to represent and control it. Another issue is the extraction of teletext content from an MPEG-2 stream and how best to integrate and represent it on the mobile device.
1.3 Background material
In the ever-changing world of multimedia it has become more and more important to be able to access media on the go. Media such as audio or video is accessed in a mobile environment by means of small devices that can feature graphical displays, speakers and different input methods. The inherent problem with mobile devices is their size and the difficulty of representing the media on small screens in an efficient and easy-to-use manner. Technology nowadays allows for different ways of input, through voice, pen or physical location, and of output, such as sound, video, data and vibration. When used in conjunction, these ways of input and output can be summarised as multimodal input/output techniques. Furthermore, it is often important to represent certain data or media in a specific way, depending on the location a person is at, the current time, or the tasks a person has to do. So far, there has been extensive research on the different ways of input and output and also on the context in which users may find themselves. However, for multimedia applications the user interface seems to be left in a static state and does not change dynamically depending on the context of the user. The idea of a dynamically changing user interface can greatly increase the usability of mobile devices without distracting the user or the user's environment. This relates to interpersonal interactions: the device and its changing interface should not affect the normal correspondence between the user and other individuals, and these other individuals should not be affected or distracted by the operation of the device. In order to better understand the subject matter of this thesis, the relevant terms are first defined before going into more detail about the entire project.
1.3.1 The concept of Multimodality
In the context of user interfaces and interacting with devices, a modality can be understood as a way of providing input to a device or receiving output from a device. Using various modalities on the one device is known as multimodal interaction. The interactions with a device can be of one of three types. [1, 2] The first type is sequential multimodal input or output, where only one modality can be used at a time. The second type is uncoordinated, simultaneous multimodal input or output, which allows more than one modality to be used at the same time, but the system treats them in isolation and processes the information separately. The third type can be described as coordinated multimodal input or output, which takes advantage of the multiple modalities by integrating them into one single input or output that delivers information in a way more comprehensible to humans. [1] The idea of "multimodality" originated from human face-to-face communication, where speech, gestures and facial expressions are used to convey information. [2] Ideally, users should be able to choose a modality, or a combination thereof, that enables them to express information to devices more easily and efficiently. Multimodality is relevant in this project for its use of input and output techniques on mobile devices with MPEG-2 streams. As further explained in the model description in chapter 3, the input to the mobile device can be either touch or voice, and the output contains video, audio and teletext simultaneously. The interactions with the device can therefore be classified as type two for input and type three for output. The purpose of using multimodal techniques is to improve the use of a device in a way that the user is comfortable with and willing to utilise.
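To make this classification concrete, the following minimal C++ sketch models the three interaction types as an enumeration and records how they apply to this project; the type and constant names are hypothetical and serve only as an illustration, not as part of any API used here.

// Hypothetical classification of multimodal interaction types,
// following the three categories described above.
enum class MultimodalType {
    Sequential,                 // one modality at a time
    SimultaneousUncoordinated,  // several modalities, processed in isolation
    Coordinated                 // several modalities fused into one input/output
};

// In this project: pen and voice input are treated independently (type two),
// while video, audio and teletext form one coordinated output (type three).
constexpr MultimodalType kInputType  = MultimodalType::SimultaneousUncoordinated;
constexpr MultimodalType kOutputType = MultimodalType::Coordinated;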
1.3.2 The concept of Focus+Context Visualisation
The idea behind Focus+Context Visualisation was developed by Furnas and by Spence & Apperley [3]. It describes a way of representing extensive and detailed information so that the visual load is decreased for the user. The concept combines a focus, which is the information the user currently concentrates on, and a context, containing a general representation of information relevant to the user, in one view on the screen. The focus displays a detailed representation whereas the context only displays an overview of some data. The information contained in the focus and the context can be quite different and does not have to be related. Furthermore, there can be different visual representations for the focus and the context, giving the user a clear separation of the two. [3, 4] Techniques such as this are used when more information is required than can be adequately presented on a display. Information overload is a concern when faced with the prospect of displaying multiple media streams, as is the case with the possibility of several MPEG-2 streams. These techniques help to display the media accurately and concisely; chapter 2 reviews some of them and finds Flip Zooming to be the most suitable. More details about this technique follow in chapter 2, as well as in the model description in chapter 3.
1.3.3 The concept of Mobile devices
One of the interesting aspects of this thesis is the development of the model for mobile devices and the issues involved in that. The manufacturers of such devices have developed different types for mobile computing, and each falls into a different category depending on the level of mobility, the device's relation to other devices or the environment, and the binding to individuals or groups. The relevant devices for this project fall mainly into the mobile (free or embedded) and personal levels of the respective categories. The classification of devices into categories is taken from "Exploiting context in HCI design for mobile systems" by T. Rodden, K. Cheverst and N. Davies [5], and is an adequate measure to consider. The level of mobility is free, which indicates that users can move the device. The relation of a device to other devices or the environment depends on the desired application: the device is either independent of them (free), or part of another device (embedded). In this project the interest regarding mobile devices lies with single-user scenarios; the devices are therefore for personal use only, and any sharing of the device between users is not taken into account. Examples of such mobile devices are wearable computers, handhelds, mobile phones, laptops, palmtops and Personal Digital Assistants (PDAs). [5]
1.4 Summary
This introductory chapter covers the scope of the project and identifies the issues the project tries to solve. These include the development of a context-aware user interface using a Focus+Context Visualisation technique, and the use of MPEG-2 media while extracting its teletext content. Furthermore, the chapter explains some background material, including the concepts of multimodality, Focus+Context Visualisation and mobile devices.
The following chapters will give a clearer understanding of the research topic. Chapter 2 reviews the current research into multimodal and Focus+Context Visualisation techniques and discusses previous work in the same topic areas. The concept model is described in chapter 3, and chapter 4 details an implementation thereof. The final chapter concludes with the findings and outcomes of the project and proposes possible extensions.
Chapter 2 Literature Review

The research aspect of this project is two-fold. On the one hand there is the multimodal aspect, which concerns itself with the different ways of input to and output from devices. On the other hand there is the information visualisation aspect, which tries to find ways of representing large amounts of information in a confined display space. The information gathered from these two areas together determined the ideas and development process of the project and is formalised in the model description of chapter 3. Firstly, the literature about multimodal interaction techniques is reviewed, followed by the review of Focus+Context Visualisation techniques.
2.1 Multimodal review
The first part of the research concentrated on past work in the area of multimodal interactions with computing devices. This term comprises all the methods that can be used to input information to a device or to generate output from a device. The traditional ways of input to a computing device are, for example, the mouse and the keyboard, with the monitor screen used for output. This traditional form of interacting with a device is unimodal, since the user has only one method to input data, by touch, and receives the output also by a single method, the visual output on the monitor screen. One might argue that sound input via a microphone and sound output from songs, combined with the traditional methods, is a form of multimodality, but this is not the case. The concept manifests itself in the control of the actual device by different input methods and in multimodal output, such that users can enhance their ability for decision-making.

The output aspect of multimodality has already been exploited to a certain extent in operating systems and games alike. On receiving an error in an operating system, the user is not only warned with an error message but also with a corresponding sound indicating the error. Nevertheless, the biggest uptake of multimodal output has been with entertainment consoles that integrate sound, visual effects and motion, in the form of vibrating controllers. Although this has been around for a few years, multimodal input and output techniques have not integrated themselves into the mainstream computer market. Only now, with the evolution of mobile devices, where constraints are imposed on the traditional input and output techniques, does the concept of multimodality have a chance to increase usability and be accepted.

Because of the nature of mobile devices, people are required to interact with them on a regular basis. Hence, they find it preferable for this interaction with a device to resemble human face-to-face communication. Actual human face-to-face communication cannot be accurately modelled, but in order to convey the ease-of-use of the mobile device to the user, additional ways of input and output have to be considered. Only three of the five human senses or modes can be mimicked by a mobile device so far: audio, visual and touch (motion). The existing methods for multimodal interaction comprise voice (e.g. speech), user movement (e.g. rotating the head) and touch (e.g. pen) input, and sound (e.g. music, beeps), visual (e.g. graphics) and motion (e.g. vibration) output. [1, 2] For instance, the use of MPEG-2 streams is a form of multimodal output, since there are audio, video (motion) and visual (teletext) stimuli. These existing methods are addressed in models designed for Human Computer Interaction (HCI) on mobile devices. The design for mobile devices is unlike traditional HCI design due to the changing context, compared to the fixed context in the traditional design, and the resulting implications for usability. [6]
The inherent difficulty in designing an adequate HCI model for mobile devices is the changing demands of the user with respect to tasks and the context the user, as well as the device, is in. [5] This implies that the development of applications wanting to use these HCI models has to "explicitly consider the impact of context" [5] concerning the different interaction techniques. None of the literature that I have found and reviewed has made any attempt to actually develop such guidelines or models, but has only stated the design issues without solving them. In my opinion, the influence of such models on the usability and functionality of mobile devices will greatly determine the rate at which this new technology is accepted. The only guideline, if one can call it that, is the categorisation of types of multimodal input and output. There are three types, the first being sequential multimodal input or output, where only a single modality is examined at a time. The second type is uncoordinated, simultaneous multimodal input or output, which allows more than one modality to be used at the same time, but the system treats them in isolation and processes the information separately. The third type is coordinated multimodal input or output, which takes advantage of the multiple modalities by integrating them into one single input or output that delivers information in a way more comprehensible to humans. The three types basically specify how the multimodal data is processed. This gives developers at least some indication of how to set up an architecture for the handling of multimodal input and output. Such examples can be observed in [1, 7] and [8].

Returning to the existing multimodal methods, there are some studies about their usability and acceptance by users. Most of them focus on the input side, since multimodal output is already quite common due to the nature of multimedia, which users make the most of. Overall, it was found that users do not employ multimodal techniques all the time but also interact with devices unimodally, depending on the situation they are in. [7] It was discovered that users interacted multimodally if there was a spatial component in the task they were doing; otherwise unimodal interaction was observed. This presumably comes from the human tendency to explain more complex information using all means available. Furthermore, when examining the actual combinations of techniques used, it was found that the pen and speech combination was the most favoured among all input techniques, be it unimodal or trimodal (see [9] for more detail). Other documents also concentrate mainly on the combination of speech and touch input as the favoured input technique [2, 10, 7], due to the technology that is currently feasible and affordable for consumers. It is important to evaluate the cases where users interact multimodally, such that an adequate model can be constructed to maximise the usability of a mobile device. The essence of the information I have gathered regarding multimodal interactions on mobile devices is the importance of being aware of the context a user and the mobile device are in, as well as the increased usability of mobile devices when multimodal input and output techniques are applied.
2.2 Flip Zooming Review
The second part of the research focused on past work in the area of Focus+Context Visualisation and has been examined in order to adequately conduct this research project and gain experience with the subject matter. From the outset of the project I was aware that I would have to represent information on a mobile device in a manner such that its users can effectively make use of the device and of the information accessible through it. Focus and Context Visualisation has been examined because it concerns itself with the presentation of large amounts of information on a confined display. The purpose of Focus and Context Visualisation techniques is to decrease the visual load on the user while presenting detailed and extensive amounts of information on the display. Such techniques utilise the notion of a focus, representing detailed information that the user currently pays attention to, and that of a context, which represents an overview of information related to the information in the focus. One such technique that has emerged over the last few years, and is based on the principles of Focus and Context Visualisation, is Flip Zooming. [11] This technique captures the ideas I consider interesting for my work. It offers the user an overview of the entire data set, and the user can instantly access any part thereof. The developers' intended purpose for the technique was use with documents, where it utilises the sequential ordering of pages, but it can be applied to any data set. [11] The ordering exemplifies a relation from one data item to the next and preserves their spatial information.
Figure 2.1: Item in focus [11]
The data item that is in the focus obtains a larger quantity of display space, with the remaining data items, the contexts, laid out around the focus, as seen in figure 2.1 [11]. It can be seen in this figure that item 13 is in the focus and items 1 to 12 and 14 to 30 are context items.

The following review of several other Focus and Context Visualisation techniques, compared to Flip Zooming, gives insight into the design decisions I made and the advantages of Flip Zooming. Existing techniques, in general, had to deal with preconceptions about Focus+Context Visualisation when developing their technique, with the intention of minimising the limitations on usability. [4] Staffan Björk and Johan Redström describe in their paper [4] the major issues developers have to face and conclude that these have been limiting the progress of Focus+Context Visualisation techniques. These factors were the inconclusive use of either a single focus or multiple foci, the implications of Human Computer Interaction with respect to the environment, and whether the information being represented is homogeneous or heterogeneous. [4] Generally, most techniques use only one focus, but in some cases, like comparing data, multiple foci would be preferable for better usability. Flip Zooming originally did not provide for several foci, but in one of the implementations of the technique this was enabled [4]. The number of foci used for a particular presentation or implementation is therefore dependent upon the actual information that the model is built for. Furthermore, the focus on the mobile device might not be the actual centre of attention of the user, and other factors such as road traffic or interactions with other people can play a role in how a user perceives the information represented. Adaptations to the Flip Zooming technique have to be made for use with mobile devices to recognise the influences of the environment on usability. [4] In my opinion these influences only marginally alter the layout of the presented information, although it is very important that users can quickly find information again after looking up to orientate themselves in their surroundings. This has to be taken into consideration from the start of the design process for a model.

An issue that Flip Zooming handles well is the reduction of the context through its hierarchical framework, helping to remove clutter on the display. In some cases users only require the focus, the context or part of the context to complete a given task, and the remaining information would only hinder the efficient use of the actually relevant information. This is where the hierarchical form of Flip Zooming narrows down the information from a global context to a more localised one by "zooming" in. [4, 12] Generally, a user focuses on just one task, but the context relating to the focus does not always have to be the same and may depend on a certain point of view. Therefore, multiple contexts have to be considered, as well as the type of information in the contexts. This information can be homogeneous or heterogeneous and consequently may need different ways of representation. These two issues are an integral part of the use of a mobile device, because mobile devices should be helpful in any situation, for each of which the type of information might be different. Some existing implementations of Flip Zooming for the mobile device market, PowerView and PowerCom [4, 13, 14], have solved these issues and support multiple, heterogeneous contexts. It can be seen from these concerns that the Flip Zooming technique does not have "hard" guidelines or boundaries for its use, but that it has to be adjusted for the type of model and implementation that is desired.

Having discussed the preconceptions about Focus and Context Visualisation techniques, the following two sub-classifications can be laid out. One is the distortion-based views, which deform the context of the displayed information in order to separate it from the focus and place more information on the display. The other is the non-distortion-based views, which present the context and the focus in a similar way, but shrink the context for space efficiency. [11, 15, 16] Up until now the distortion-based views have been researched more extensively than the non-distortion-based views. This does not mean that they are better; indeed, it is shown here that they are not, but they are simply a natural step in the development of suitable techniques for Focus and Context Visualisation. The most popular of these techniques are the "Fisheye views" and "Polyfocal display views". The first simply hides the information that is not in the focus, whereas the latter uses magnification in either one or two dimensions. This magnification actually works in reverse and reduces the context in such a way that the information is still there but cannot be adequately viewed. The major disadvantage of the distortion-based views is that they deform the context, which makes it hard to obtain information from it even if only spatial information is required, for instance the placement of related objects. [11, 16] In contrast, Flip Zooming is a non-distortion-based view and has tried to solve the problems that one can encounter when using other Focus and Context Visualisation techniques. These problems are the distortion of information outside the focus, as mentioned previously, the less than adequate utilisation of screen space, and performance issues for some techniques. [11] Lars Erik Holmquist claims in his master's thesis that Flip Zooming has addressed these problems by providing a clear view of information outside the focus and more space-efficient display usage. [11, 17] Furthermore, there is an apparent improvement in real-time performance of the technique compared to other ones, according to Leung & Apperley 1994 [16]. The way Flip Zooming has solved these problems and gained an advantage over other techniques lies in the arrangement of the desired information.
Figure 2.2: Flip Zooming representation - no page in focus [11]
The technique requires the information being visualised to be put into an order that facilitates the creation of "pages". These pages contain certain information that is relatively closely related. The content on consecutive pages is related, however not as closely as the content on a single page. [11, 12] The representation of pages in the context has to be more compact than that of the focus, and Flip Zooming uses "distortion" in such a way that a context page looks like a focus page viewed from a distance. The arrangement of pages on the display is determined by the page in focus and by whether there is a page in focus at all. Initially all pages obtain an equal amount of display space and are ordered from left to right and top to bottom (refer to figure 2.2 [11]). As soon as a page goes into focus, its display space increases and all other pages "lose" an equal amount. The ordering of pages is still preserved, even when a page is in focus. The pages around the focus that have a lower order are situated to the left and top of the focus page, and the pages of higher order are to the right and bottom. The technique tries to place the focus page in the centre of the display, representing the current focal point of the information (refer to figure 2.3 [4]).
Figure 2.3: Flip Zooming representation - page in focus [12]
When the focus page changes to a new one, the current focus page is transformed into a context page and all the pages are thereafter rearranged to fit the above description of page placement. The focus page can be chosen either sequentially or randomly, and this can be initiated by a user or automatically, depending upon the desired implementation. Another feature of Flip Zooming is the possibility to apply a so-called "full focus view", where the focus page covers the entire display area. This can be used in situations where the context is not relevant for a user. Generally this view is applied only on limited occasions. Nevertheless, Flip Zooming still has issues with representing information in a space-efficient way, which lie in the rearrangement of information pages when a particular page is brought into focus. This also results in an unpredictable placement of pages on the display, leading to the loss of information about the relative placement of pages to each other. Additionally, it can be quite cumbersome to split the information into interrelated pages, which is required for the ordering. [11, 17] It should, however, not be too difficult to find some sort of categorisation for the information, since all information has some interrelation, be it spatial or chronological.
Figure 2.4: Hierarchical Flip Zooming - page in focus [12]
Furthermore, the developers of Flip Zooming have not investigated the possibility of pages appearing and disappearing depending on the user's context. This relates to context-aware applications and their input to the visualisation. Hence the arrangement of pages is left in a static configuration. On the one hand this makes it fairly easy to control the layout, but it can be a problem in the initial set-up of pages when using the technique. A good extension to the Flip Zooming technique would incorporate the dynamic addition and removal of pages depending on context-aware stimuli.

A more specialised form of Flip Zooming, Hierarchical Flip Zooming, is a small extension to the general form. It provides Flip Zooming visualisations within other Flip Zooming visualisations, virtually partitioning the information into further sub-categories, as shown in figure 2.4. [12] Basically, this means that a page in the traditional Flip Zooming technique is now its own Flip Zooming visualisation. Therefore, each page in itself behaves like a Flip Zooming visualisation regarding page placement and navigation, and the overall visualisation also behaves that way. It is not very likely that the hierarchical form has more than two levels, since the information in the lower levels becomes too small to be displayed efficiently.

Overall, Flip Zooming enhances existing Focus and Context Visualisation techniques by trying to combine the positives of the other techniques while solving their issues. One study [18] gives a formative evaluation comparing Flip Zooming to two other methods of viewing images on the web. Inconsistent and confined sampling weakens the validity of its results, but these were nevertheless used during the development of the Flip Zooming technique in order to improve it. Although there are some disadvantages, the advantages of Flip Zooming outweigh them, and it is a promising concept for visualising large and complex information sets. It is therefore a suitable technique for the work at hand.
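To make the page placement described above concrete, the following minimal C++ sketch distributes display space over an ordered list of pages with at most one focus; the type and function names are hypothetical and are not taken from any published Flip Zooming implementation.

#include <vector>

struct Page { int id; };

struct PageLayout {
    int id;
    double relativeSize;  // fraction of the display area given to this page
};

// Distribute display space over the ordered pages: the focus page (if any)
// receives 'focusShare' of the area and the context pages share the rest
// equally, preserving the original left-to-right, top-to-bottom order.
std::vector<PageLayout> layoutPages(const std::vector<Page>& pages,
                                    int focusIndex,        // -1 means no focus
                                    double focusShare = 0.5)
{
    std::vector<PageLayout> layout;
    if (pages.empty()) return layout;

    const bool hasFocus = focusIndex >= 0 &&
                          focusIndex < static_cast<int>(pages.size());
    const double contextShare = hasFocus
        ? (pages.size() > 1 ? (1.0 - focusShare) / (pages.size() - 1) : 0.0)
        : 1.0 / pages.size();

    for (int i = 0; i < static_cast<int>(pages.size()); ++i) {
        const double share = (hasFocus && i == focusIndex) ? focusShare
                                                           : contextShare;
        layout.push_back({pages[i].id, share});
    }
    return layout;
}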
2.3 Summary
This chapter covers the research aspect of this project and examines the literature about multimodal techniques, which concern themselves with the different ways of input to and output from devices. Possible techniques for touch and voice input as well as output have been looked at. Furthermore, techniques which try to find ways of representing large amounts of information in a confined display space were analysed. The latter are part of Focus+Context Visualisation techniques, and the analysis found Flip Zooming to be the most promising technique for use in this project. This becomes apparent in the development process of the model for this project, which is described in chapter 3.
Chapter 3 MpegView Model
The model detailed here is the core of this thesis project. It describes an extension to existing research into Focus+Context Visualisation techniques and provides a source for the implementation of a useful real-world product described in chapter 4. This description of the model gives an outline of the important details making up the model and contains general as well as MPEG-2 specific details. The general section summarises the facts that are integral to the model and can be applied to any concept, whereas the MPEG-2 specific section identifies the factors that only apply to a model concerned with MPEG-2 streams. For both sections the requirements that have to be met are made clear. Furthermore, a justification of the applicability of "Flip Zooming", the technique the model is based on, is given. The description includes an illustration of the way the user's context is used to navigate through information automatically and manually, as well as the way information is visualised and controlled. The first point can be summarised as Information Visualisation and the second as Context Management. These two concepts are further explained in their own sections along with the requirements, and together they should make it clear how the model works. The Information Visualisation and Context Management sections are detailed for both the general and the MPEG-2 stream case. For the MPEG-2 stream in particular, the information content is taken into account and therefore strongly determines the essence of the separate requirements and of the visualisation and context management techniques. The description of the model keeps the context of mobile devices in mind, and all the ideas are therefore specifically designed for it. Nevertheless, the model could be applied to other devices, such as laptops and desktops, where the constraints of the small device do not have an impact. Concluding this chapter is a depiction of how the model can be used in a real-world scenario, which then leads to one implementation of the model in the following chapter.
3.1 Specification
This section depicts the specification, or requirements, for the general and MPEG-2 details and forms part of the model. These requirements should be taken into consideration when applying the model to a particular idea or concept, and they are by no means all-inclusive or complete in the sense that there could not be other, additional requirements. They include the requirements that I regard as important and that need to be considered. Especially for the MPEG-2 specific details, they serve as a guideline for the implementation and describe general constraints and necessities for the control and use of an MPEG-2 stream. The description of these requirements is not only a listing of facts but also a place where these facts are explained and suitable solutions are found.
3.1.1 General details
This section details the facts forming the basis of the model and defines the requirements common to any concept. Additionally, most concepts will have their own specific requirements, as can be seen for the MPEG-2 stream concept in the next section.

• Functionality requirements: The functionality requirements describe the features that the model has to aid the use of a dynamically changing user interface.

-- There is a mechanism to adjust or change the way a user interface displays information, controlled according to the Context Management. This is detailed in section 3.3 and encompasses the triggers that dynamically change a user interface.

• Usability requirements: The usability requirements detail methods that are designed to help the user confer effectively with the mobile device and retrieve information in a way that the user understands and feels comfortable with. Users of a mobile device can gain effectiveness and ease of use from the application of a combination of visual and audio input and output techniques, and this model takes both into consideration. [3]

-- The input to the mobile device is optimised for small handheld devices and can be provided through the following techniques. Where these techniques are used concurrently, the user is interacting multimodally with the mobile device.

∗ Touch: Most handheld devices come with a stylus pen for input to a touch screen, as well as additional buttons on the device for quick look-ups of day-to-day activities. These features can be utilised for easy navigation and information retrieval.

∗ Voice: Lately there has been innovative development of technologies that use voice to navigate and retrieve information, but due to time constraints and the complexity involved, this feature is not included in the proof-of-concept in chapter 4. [11]

-- Continuing from the above-mentioned input techniques, it is also sometimes important to use the mobile or handheld device with only one hand. This can be achieved by utilising the buttons on such a device. It largely depends on the manufacturer of the device and the number, size and placement of such buttons. In order to cater for the different styles of products, device-specific input "plugins" have to be written for each supported device that maximise the usage of that particular button configuration (a sketch of such a plugin interface is given at the end of this section).

-- The usability of mobile devices can be increased by using output techniques multimodally. This serves a more realistic computer-to-human interaction by giving feedback to the user. Such possibilities have not been examined in a general sense but in the more specific form of MPEG-2 output, as detailed in section 3.1.2.

-- Nowadays certain features are provided to cater for persons with disabilities, and accessibility is one important issue. When creating a product this has to be considered, but it is not yet included in the proof-of-concept implementation. The following are only two disabilities that I have considered and that I think can prohibit the full use of an information technology product.

∗ Visual impairment: In order to minimise the effect of this impairment, several techniques have already been developed; the most promising from my point of view are "Earcons" combined with voice output. Earcons were developed by Stephen Brewster et al. [19]. They are structured audio messages that sonically enhance widgets in order to convey more information to the user.

∗ Hearing impairment: In this case visual controls (pen and device buttons) are the only choice, and audio output as well as audio-controlled input have to be replaced with equivalent visual alternatives.

• User requirements: The user requirements specify features that allow the user to change preferences for the model and the device.

-- The most important factor here is the possibility to assign preferences to the Context Management settings for the dynamic changes of the user interface and controls. This will become clearer in the section about Context Management. The user can choose from a set number of options for this and may be able to create arbitrary preferences, as long as they fall within the allowable options for Information Visualisation.

-- Additionally, the user can change environment-type settings such as the size of displayable items, for example buttons or fonts.
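As a minimal sketch of the device-specific input "plugin" idea mentioned above, the following C++ interface maps hardware button codes to abstract navigation actions; the class, enumeration and example mapping are hypothetical and do not correspond to any existing OPIE or Qt API.

#include <map>

// Abstract navigation actions understood by the user interface.
enum class NavAction { NextItem, PreviousItem, NextView, PreviousView, Select, Back };

// Hypothetical per-device input plugin: each supported handheld provides its
// own mapping from hardware button codes to navigation actions.
class InputPlugin {
public:
    virtual ~InputPlugin() = default;
    // Returns true and sets 'action' if the button is mapped on this device.
    virtual bool mapButton(int buttonCode, NavAction& action) const = 0;
};

// Example mapping for a device with four application buttons and a rocker.
class ExampleHandheldPlugin : public InputPlugin {
public:
    bool mapButton(int buttonCode, NavAction& action) const override {
        static const std::map<int, NavAction> mapping = {
            {1, NavAction::PreviousItem}, {2, NavAction::NextItem},
            {3, NavAction::PreviousView}, {4, NavAction::NextView},
            {5, NavAction::Select},       {6, NavAction::Back},
        };
        const auto it = mapping.find(buttonCode);
        if (it == mapping.end()) return false;
        action = it->second;
        return true;
    }
};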
3.1.2 MPEG-2 details
This section details the additional facts for the model and defines them. In this case there are specific requirements that only apply to the concept of MPEG-2 streams.

• Functionality requirements: The functionality requirements describe the features that the model provides with regard to the control of the MPEG-2 stream and the way an MPEG-2 stream can be visualised.

-- The following control types are present for the MPEG-2 stream (a sketch of a corresponding control interface is given at the end of this section):

∗ Start: starts the playing of the stream.
∗ Stop: stops the playing of the stream.
∗ Pause: temporarily stops the playing of the stream.
∗ Rewind: rewinds the stream.
∗ Forward: fast-forwards the stream.
∗ The volume of the audio part of the stream can be controlled by increasing, decreasing or muting the sound output.
∗ The brightness of the video picture can be increased for a lighter picture and decreased for a darker picture with a percentage controller common to most CRT monitors. This is greatly device-specific, and most devices also have a backlight, which greatly increases the brightness when switched on.

-- There is a mechanism that makes it possible to switch between different MPEG-2 streams manually and automatically, which is further described in the Information Visualisation and Context Management sections.

• Usability requirements: The usability requirements detail methods that are designed to help the user interact effectively with the mobile device and easily identify the controls for the MPEG-2 streams.

-- The graphical user interface for the manipulation of the MPEG-2 stream uses large and distinguishable buttons, as well as colour and icons for intuitive recognition of the underlying functionality (green for start, red for stop, a light-to-dark colour range for volume control indication).

-- The multimodal input techniques covered in the general details (3.1.1) are the only relevant ones for the MPEG-2 details. In regard to output, however, MPEG-2 streams deliver video, audio and teletext. This multimodal output gives the user adequate feedback for controlling the represented information.

-- The controls and layout of the user interface are designed to be self-explanatory to the user. This includes the above-mentioned colouration of buttons as well as the use of icons on such buttons. Furthermore, the user interface can be made to resemble existing and well-known user interfaces for quicker recognition of the available functionality.

-- Making the user aware that the same MPEG-2 stream is still being viewed when changing between several views is another important feature. This can be done with a progress bar and/or a time counter. In order to show this continuous flow of a stream, a program guide is utilised to extract the start and end times of a program (like a TV program). The difference between these times can then be used to show the relative playing time. This information can be extracted either from an online TV guide or from some MPEG-2 streams that include the start time of the current program and of the next program. Of course, this information is only available for TV-like streaming, but other streams might have a length descriptor included.

• Extendibility requirements: The extendibility requirements give details about possible extensions, depending mainly on the implementation.

-- The model handles different data types, not only MPEG-2 streams but also other types like AVI files. This is largely implementation-specific and depends on the media library/plugin that is used for the decoding of streams (xine-lib).

This concludes the section on requirements, as well as the solutions for these requirements. It also lays down the basic conditions of the model. Two further areas detailed in the model span these: one is the display of information in a specific way, the Information Visualisation, and the other is the set of conditions for the dynamic and manual change of the user interface depending on the context, summarised as Context Management.
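The following C++ sketch illustrates the control types listed above and the relative playing time calculation; it is a hypothetical wrapper interface rather than the xine-lib API used in the implementation, and all names and the time units are assumptions made for illustration.

#include <algorithm>

// Hypothetical control interface for one MPEG-2 stream; an implementation
// would delegate these operations to a media library such as xine-lib.
class StreamController {
public:
    virtual ~StreamController() = default;
    virtual void start() = 0;
    virtual void stop() = 0;
    virtual void pause() = 0;
    virtual void rewind() = 0;
    virtual void forward() = 0;
    virtual void setVolume(int percent) = 0;      // 0 mutes the audio output
    virtual void setBrightness(int percent) = 0;  // percentage controller
};

// Relative playing time of a TV-like program, derived from the start and end
// times taken from a program guide (all times in seconds since midnight).
double relativePlayingTime(long startTime, long endTime, long now) {
    if (endTime <= startTime) return 0.0;          // no usable guide data
    const double fraction =
        static_cast<double>(now - startTime) / (endTime - startTime);
    return std::max(0.0, std::min(1.0, fraction)); // 0.0 = start, 1.0 = end
}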
3.2 Information Visualisation
This section deals with the way information is displayed on a mobile device. It briefly illustrates how any information with different representational views can be displayed and then focuses on the MPEG-2 specifics. The following paragraph gives an illustration of, and justification for, Flip Zooming as the chosen visualisation technique for this model.

In order to represent information effectively and consistently, a set technique has to be followed. The model developed here applies the Focus+Context Visualisation technique "Hierarchical Flip Zooming". Hierarchical Flip Zooming is based on Flip Zooming, which describes a way of viewing large amounts of information on a single display. The technique offers the user an overview of the entire data set and instant access to any part of it. Originally the technique was proposed mainly for use with documents, to utilise the sequential ordering of pages, but it can be applied to any data set. The ordering exemplifies a relation from one data item to the next and preserves their spatial information. The data item that is in the focus obtains a larger quantity of display space, with the remaining data items, the contexts, laid out around the focus. The main advantage of Flip Zooming compared to other Focus+Context Visualisation techniques is that the context is not distorted: the context items are just miniaturised versions of the same items as shown in the focus. [11] The Flip Zooming technique has already been applied to mobile devices (PDAs) through the implementations of PowerView and PowerCom. These implementations were done to utilise the display space and give access to the most common information on PDAs in a static way. [4] My model, on the other hand, does this in a dynamic way, thereby utilising the context of the user. The technique was chosen for its proven applicability to mobile devices and its efficient display of information through the use of foci and contexts. Due to the nature of the information that has to be displayed for this project, an MPEG-2 stream, the specialised form of the above-described technique, Hierarchical Flip Zooming, is used. The only difference between normal Flip Zooming and the hierarchical form is that in the hierarchical technique Flip Zooming visualisations are used within Flip Zooming visualisations. [12] This enables the handling of the MPEG-2 streams and of the eight different views an MPEG-2 stream can be represented as. The purpose and application of these eight representations are further explained in the following sections.
3.2.1 General details
This section describes the general view of context information with the corresponding focus on one of the representational views that such context information has. Figure 3.1 illustrates this; a context can have from one up to an arbitrary finite number of representational views. A representational view signifies a way in which the same information, or part thereof, can be displayed differently to give the user a specialisation of, or easier access to, the information. Using the Hierarchical Flip Zooming technique gives two levels of abstraction, where the first level contains the information "item" and the second level gives a choice of the N representational views. Theoretically it is possible to have multiple foci, and therefore to display more than one representational view at the same time, but the model stays with a single focus due to the limited screen area of the device. Otherwise the screen would be cluttered and the benefit of the efficient display of information would be counteracted. A small disadvantage of Hierarchical Flip Zooming is that it is not space filling.
Figure 3.1: General Representation states
Figure 3.2: Page-based approach
This is especially disadvantageous for a mobile device with its confined screen area, and this model therefore applies a page-based approach to be more space efficient. This page-based approach is basically the Full Focus View discussed in chapter 2 and is used quite extensively in this model. Figures 2.4 and 3.2 compare this space issue: it can be seen that space is lost to the right, to the left and below item 6, as well as inside each item. A page consists of one information item and all its representational views, but only a single representational view is displayed in the foreground of the screen. Mechanisms are provided to switch between the different representational views of an information item, as well as between different information items. This is similar to browsing the web, where one has a "back" and a "forward" button to retrieve previously accessed pages, which lie chronologically before or after the current page. The use of Hierarchical Flip Zooming has to be seen as a conceptual foundation for the display of information and not as an actual graphical layout.

As a simple example scenario one can imagine a desktop environment with application programs and files. A user can apply the information visualisation technique and create virtual environments where only a subset of all the available application programs and files is accessible. This can be employed for security or authentication purposes in a company, where users input their user name and password on a desktop PC and only have access to the application programs and files they require to do their job. The users are not aware of other application programs or files not relevant to their tasks. This can also be employed on a single-user PC where the user only wants certain information (application programs and files) displayed while other information is hidden. An example where this can be useful is an academic who is giving a lecture and only wants information relevant to the lecture to be displayed, with potentially confidential information hidden. This concludes the section on the general details for Information Visualisation; this general view of context can be applied to any information set, as can be seen from the following application to MPEG-2 streams.
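As a minimal sketch of the two-level structure just described, the following C++ data types model information items with their representational views and a single focus on each level; the type and member names are hypothetical and serve only to illustrate the concept.

#include <string>
#include <vector>

// Second hierarchy level: one way of presenting (part of) an item's information.
struct RepresentationalView {
    std::string name;   // e.g. "full detail", "overview", "text only"
};

// First hierarchy level: an information item together with its N views.
struct InformationItem {
    std::string name;
    std::vector<RepresentationalView> views;
    int viewInFocus = 0;   // single focus: index of the view shown in front
};

// The whole visualisation: ordered items, one of which is in focus.
struct HierarchicalVisualisation {
    std::vector<InformationItem> items;
    int itemInFocus = 0;   // single focus on the first level as well
};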
3.2.2 MPEG-2 details
This section deals with the view of context information relevant to MPEG-2 streams, covering the different representational views that MPEG-2 streams exhibit and that are used for this project. A single MPEG-2 stream has the eight different stream representational views described in figure 3.3. Each MPEG-2 stream representation has its own requirements, depending on the functionality that is needed and the usability that is to be achieved.
Figure 3.3: MPEG-2 Stream Representational views
The views are numbered from 1 to 8 and represent the views from left to right. View 1, for example, represents the full MPEG-2 stream with video content, audio content and teletext content. Each of the following views removes one or more of these contents until there are none left in view 8, which represents a stream being switched off. The default view for the focus is view 1. These views are changed according to the conditions for Context Management described in section 3.3. The purpose of the different views is to remove distractions from the user according to the situation users find themselves in. This will become clearer in the example that follows the Context Management section. The first level of the hierarchy for the MPEG-2 streams shows all available MPEG-2 streams in a string/picture representation.
Figure 3.4: Tile representation
Figure 3.5: Display Representation
become available or unavailable. A possible representation is shown in figure 3.4. The Flip Zooming technique uses a clear view of data outside the focus to represent context. This means that all the MPEG-2 streams would have to be displayed at the same time: a large display of the stream in the focus and a large number of small ones for the context. Such a configuration is unfortunately too computationally intensive for multiple MPEG-2 streams on a mobile device, as each stream would have to be decoded. Therefore, only the focus has the full MPEG-2 display and the context is represented by a string and picture combination. The second level of the hierarchy for an MPEG-2 stream always has the eight static views for that one stream. The view that is currently selected is in the focus and is the only one that can be seen. Furthermore, there is a mechanism to switch between the representational views, as indicated in figure 3.5. Also, there is a way to return to the general overview of all the streams (first level), as well as a facility that allows sequential switching of channels while in the second level.
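The eight views can be thought of as combinations of the three content types. The sketch below is illustrative only: View 1 and View 8 follow the description above, while the ordering of the intermediate views is an assumption based on one reading of Figure 3.3 and of the example in section 3.4, not a definitive mapping.

#include <cstdint>

// Illustrative sketch of the eight representational views as combinations
// of the three content types. The intermediate ordering (Views 2-7) is an
// assumption; the thesis defines the exact order in Figure 3.3.
enum ContentFlags : std::uint8_t {
    NONE     = 0,
    VIDEO    = 1 << 0,
    AUDIO    = 1 << 1,
    TELETEXT = 1 << 2
};

// view number (1..8) -> active content
const std::uint8_t VIEW_CONTENT[9] = {
    NONE,                      // index 0 unused, views are 1-based
    VIDEO | AUDIO | TELETEXT,  // View 1: full stream (default focus)
    VIDEO | AUDIO,             // View 2 (assumed)
    VIDEO | TELETEXT,          // View 3 (assumed): e.g. commuting without sound
    AUDIO | TELETEXT,          // View 4 (assumed)
    VIDEO,                     // View 5 (assumed)
    AUDIO,                     // View 6: audio only, e.g. office work
    TELETEXT,                  // View 7: teletext only, e.g. low battery
    NONE                       // View 8: stream switched off
};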
Amendment to MPEG-2 usability requirements
To signify the current representational view to the user, a few visual additions are made to the user interface. In the case of the audio-only view, an image containing an audio icon is displayed. The same applies to the teletext-only option, where an image of text is displayed. The option of audio and teletext results in a combination of the two previous images. Furthermore, if no content is viewed, the display is left blank except for the view selection area. This section has shown how general and MPEG-2 specific information can be visualised using Flip Zooming.
3.3
Context Management
This section gives details about the conditions for the dynamic and manual change of the user interface that are triggered by context sources. Context sources are objects that can change state and, by doing so, cause an action to happen, which in this case is an interface change. The following context sources are defined for this model. Each context source includes specific elements that form part of the context and are instances thereof. I summarise these elements as tasks, since one can imagine that such a task is the user doing some work.
• Context sources:
-- Calendar: The calendar provides the majority of tasks and can be thought of as a diary where the user inputs day-to-day activities. These activities generally have a start and finish time, needed by the calendar, and include instances of the task types described below. The first task type that comes to mind when talking about a calendar is that of a meeting, which forms part of normal working life. Meetings are classified on a scale from 1st to 3rd level, signifying the status of a meeting: 1st level meetings are the most important and formal, whereas 3rd level meetings are more or less informal. The breakdown into the three levels should reflect the different
types of meetings one would find in the real world. This also leads to general appointments, for example with a doctor or hair stylist, and get-togethers with friends. Furthermore, the user also has to specify the jobs to be done for the day, for example writing documentation, coding or reviewing code if one were a software developer. The next task type is summarised as chores and includes all the activities that have to be done around the home, ranging from ironing to garden work. Another type is phone calls, which can be classified as private or business. Also, the daily task of getting to work or getting around using a method of transport is a factor to be considered in the calendar. Each instance of the task types described above has to be manually input by the user into the calendar to take effect on the user interface. By this I mean writing down that the user is doing a job from approximately time x to time y, which is not much more work than using a normal calendar or diary.
-- Global Positioning System (GPS): The global positioning system keeps track of the user's physical location and can be used to determine a change in the user interface. It can also detect the user's method of transport by comparing the route taken with pre-defined routes.
-- Clock: The clock keeps track of the time and can trigger events through alarms. It is used in conjunction with the calendar context source.
-- Device: The actual device and its physical state give rise to alterations in the way the device can be used and in the appearance of the user interface. A major factor is the stress put on the battery by the retrieval of information, in this case the decoding of the MPEG-2 stream. This has to be taken into account: in the case of low battery levels, first the video, then the audio and lastly the teletext decoding have to be dropped so that an empty battery does not render the device entirely useless. Additionally, the CPU and memory usage have to be taken into account, and under extremely high
load the quality of the information is reduced in order to still be able to display it at all. For the MPEG-2 stream this would mean reducing the size of the video output or dropping video frames or audio samples. Besides, the presence of earplugs in the audio-out jack would indicate that the user is especially interested in the sound output, and therefore the sound would stay on the whole time even though other factors might indicate a view without sound. Similar factors can be attributed to pre-defined user interface changes through triggers from the "environment", for example when the device volume is set to zero and audio should therefore not be available for selection. Also, the buttons on the mobile device can be utilised as indicated in the requirements section; the presence of certain buttons can alter the behaviour of the device depending on their configuration. The following task type is only partially related to the device and is more a factor of the surrounding infrastructure: in the case of streaming information, the bandwidth of a connection is a concern and determines the quality of the supplied service. The connection bandwidth can also change dynamically, and therefore precautions have to be taken in order to optimise usability in the case of bandwidth degradation. Alternative views are assigned for low quality and partial delivery of the information.
-- User: The users themselves can take action and modify the way the device is used manually, and this is a very common occurrence. There are thus two possibilities for an alteration to the user interface and the control of the information: the automated alteration through a change in a context source's state, which is the case for the first four context sources, and the manual alteration of state through the intervention of the user. The user initiates the manual alteration by selecting specific MPEG-2 streams and the representational view thereof. Furthermore, the users can switch the context themselves; task types here can be phone calls, unexpected visitors (appointments),
for privacy or security reasons, and simply switching the device off.
The task types described above trigger the automated alteration, and the user can only intervene with user-defined context changes. The user of the mobile device has to assign one of the eight representational views of MPEG-2 streams to every task type defined above, or accept the default ones provided by the system. Additionally, the user has the option, but not the obligation, of assigning a specific stream or type of stream to a task type. The combination of task type, representational view and possibly stream type determines the visual focus of the user/device and is formally expressed by the function below.
FUNCTION (TASK TYPE, REPRESENTATIONAL VIEW, STREAM) = FOCUS
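A minimal sketch of how this mapping could be realised in code is given below. The class, method and task type names are hypothetical and not taken from the implementation; the sketch merely shows a task type bound to a representational view and an optional stream, which together constitute the focus.

#include <map>
#include <string>

// Hypothetical sketch of the focus function above: each task type is bound
// to a representational view (1..8) and optionally to a preferred stream.
struct Focus {
    int view;            // representational view 1..8
    std::string stream;  // optional stream/channel; empty = keep current
};

class ContextMapping {
public:
    // user-defined or default assignment of a task type to a focus
    void assign(const std::string& taskType, int view,
                const std::string& stream = "")
    {
        table[taskType] = Focus{view, stream};
    }

    // FUNCTION(task type) -> (representational view, stream) = focus
    Focus focusFor(const std::string& taskType) const
    {
        std::map<std::string, Focus>::const_iterator it = table.find(taskType);
        if (it != table.end())
            return it->second;
        return Focus{1, ""}; // default: View 1, full stream
    }

private:
    std::map<std::string, Focus> table;
};

A user-level configuration step would then simply call, for example, assign("1st level meeting", 8) or assign("commuting", 3), matching the assignments described in the example that follows.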
The following section shows an example of how this model is meant to work and will give the reader a better understanding thereof.
3.4
Example
In order to better understand the model, an example based on the MPEG-2 stream concept is given. The following description assumes a calendar with entries that correspond to certain tasks and shows the resulting user interface for a "default" configuration of the model. The device itself serves as another context source in this example. At the start of the day the calendar shows that the user gets up at 6 am. Before that time the mobile device is off and therefore uses representational view 8. View 1 is used once the device is switched on automatically. On the way to work the user has representational view 3, because he does not want to disturb others with the audio. In the case where the user has brought earplugs, this is detected by the device and view 1 is used. Once the user arrives at the office, only audio (view 6) is used because the user's visual sense is preoccupied with reading email and doing general paperwork. At 10 am the device automatically uses representational
Figure 3.6: Visual representation of an example of the model
view 8 since the user has a meeting with the boss. After this meeting view 1 is used again, and another meeting, this time an informal one, is held with the device in view 3. Then the device unexpectedly runs low on battery and the video is turned off to conserve and extend the battery life; therefore view 7 is used. This is also taken into account when the view is changed again at 12 o'clock, where one could use view 1 but, due to the low battery, view 4 has to be used. Since the battery keeps decreasing through continuous use, the view is again automatically changed to view 7 and video and audio are turned off. Later, the user puts the mobile device in the cradle and all possible views become available again, but due to the work in the office view 6 is present. On the way home the view is again 3, or 1 depending on whether the user has earplugs. At the end of the day the user does some housework and view 1 is used until the device is switched off for private time. This is a very simple description but it should give an idea of how the model is supposed to work. A visual representation of the example is shown in figure 3.6.
3.5
Summary
The MpegView model chapter has detailed the specification of the model as well as the visualisation aspects with respect to general and MPEG-2 specific requirements. Furthermore, it has covered context management and explained each context source in detail, including appropriate examples. The model includes the multimodal aspects in the specification and the Hierarchical Flip Zooming technique for the visualisation of the information content.
The chapter is rounded off with the
application of the model in an example to enhance the reader’s understanding of it.
The foundations of the existing technique of "Hierarchical Flip Zooming" will be used to develop MpegView, a visualisation application for several MPEG-2 streams. A point to consider when applying the model is how to integrate it into the existing platform. As stated above, the MPEG-2 use of the model creates a separate stand-alone application that works just like a normal program on an existing system. The process of developing and integrating the model into an existing platform is covered in the next chapter.
Chapter 4 Implementation The practical part of this thesis project comprises the development of a visualisation application for multiple MPEG-2 streams on a mobile device.
The resultant
application, MpegView, is a stand-alone piece of software running on Familiar Linux together with OPIE on a Compaq iPAQ 3870. The following description covers the technology and setup involved in running the software, and the design of the user interface for displaying and manipulating MPEG-2 streams according to the model described in chapter 3.
4.1 Technology
4.1.1 Linux Familiar
The chosen operating system for the Compaq iPAQ is Familiar Linux, an open-source project porting the well-known Linux from the desktop to the mobile device. This OS is specialised for use with the Compaq iPAQ family and has proven to be a stable environment to build on. The version of Familiar present on the iPAQ is 0.6-pre1 with kernel version 2.4.18-rmk3. An image file (jffs2) with Familiar and OPIE pre-installed is provided on the Familiar web page.
The
installation process of Familiar Linux is well documented and can be viewed online at http://familiar.handhelds.org. Once the image file has been uploaded, the user is presented with a working environment. For more details about my installation
please refer to appendix B.
4.1.2
OPIE
Additionally, on top of Familiar Linux a windowing manager had to be installed that facilitated the development of MpegView. There were several choices, including Familiar's version of X, Trolltech's Qtopia and OPIE. The product of choice was OPIE, an open-source graphical platform which is itself based on Qtopia. OPIE was preferred due to its better compatibility with the Familiar platform and the binary compatibility of applications. Another reason was the development of a new media player, opieplayer2, for OPIE that supports streaming and multiple media formats. Furthermore, the media library used with opieplayer2, xine, was fairly easy to alter to include the code that extracts the teletext from the MPEG-2 stream and forwards it to the user interface. The current CVS version of opieplayer2 has been used due to its continuing development (last checkout 1/10/2002). The version of OPIE installed on the iPAQ is 0.9.1-snapshot. MpegView is an adaptation of opieplayer2; it primarily changes the user interface for video playback and adds the display of teletext information. Further details of this are explained in section 4.2.
4.1.3
MPEG-2
The intention when selecting a media type for this project was, first of all, to extract teletext data that might be included, and also to use a type that is widely accepted and used. MPEG-2 seemed to be the logical choice since it is used in digital video broadcasting and I had access to research regarding the inclusion and extraction of teletext information.
Transport Stream
MPEG-2 is a standard method of transmitting digital video and sound in a compressed format using less bandwidth than the traditional analog method. It is becoming the de-facto standard in the digital TV world. The way MPEG-2 is used for transmission
is in the form of a transport stream, which contains transport stream packets. Each transport stream packet has a payload which can contain video, audio or teletext information in the form of packetised elementary stream (PES) packets. A PES packet is an elementary stream cut up into suitably sized packets for transmission. Each elementary stream contains information relevant only to video, audio or teletext respectively. Furthermore, control data such as the program association table (PAT), which is like an index to all the available programs in a digital video broadcast stream, can also be carried in transport stream packets. The PAT has an ID entry for every program, and every program has a program map table (PMT) identifying the PES packet IDs of the video, audio and teletext packets making up a particular program. This allows for the easy separation of information and extraction of the correct content. [20]
xine library
The media library that is used with OPIE's opieplayer2 is xine, allowing for the playback of various media types including MPEG-2 transport streams. The architectural structuring of xine allows for the pin-point alteration of code without affecting any other part of the media library. The library is structured into five main parts: input plugins, demuxer plugins, decoder plugins, output plugins and the xine engine, which combines and utilises the four types of plugins. Each plugin type has several plugins associated with specific input methods, media types and output methods respectively. Version 0.9.13 of xine-lib was used for this implementation and can be acquired from http://xine.sourceforge.net. Since I am dealing with MPEG-2 transport streams, I have looked at the transport stream demuxer (demux_ts.c) and added code that extracts the teletext. Code for the extraction of the teletext from a digital video broadcast (DVB) stream was provided by Douglas Kosovic, DSTC; I had to integrate the relevant parts of this code into the transport stream demuxer. About 30 lines of code were added. In addition, all the methods of Douglas's code except its main method have been appended to the end of the demux_ts.c file (another 200 lines of code). The code segment has been added
Figure 4.1: xine-lib architecture
to the demux_ts_parse_packet method, which checks each packet, determines what type of packet it is and then sends it on to the relevant decoder. My code identifies the teletext packet, extracts the teletext using the methods provided by Douglas Kosovic's code, and then sends the teletext to the user interface using xine events. The changed code segment can be found in appendix A.
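To make the packet routing concrete, the following is a simplified, hypothetical sketch of how a demuxer can recognise a teletext packet inside a transport stream; it is not the actual xine code, which is listed in appendix A. The 188-byte packet size, the location of the 13-bit PID in bytes 1 and 2, and PID 0 for the PAT follow the MPEG-2 systems layer; the teletext PIDs 802 and 770 are the ones hard-coded in the appendix A listing for the test streams.

#include <cstdint>
#include <cstddef>

const std::size_t TS_PACKET_SIZE = 188;

// The 13-bit packet identifier (PID) sits in bytes 1 and 2 of a TS packet.
inline std::uint16_t packetPid(const std::uint8_t* pkt)
{
    return static_cast<std::uint16_t>(((pkt[1] & 0x1F) << 8) | pkt[2]);
}

enum PacketKind { VIDEO_PKT, AUDIO_PKT, TELETEXT_PKT, TABLE_PKT, OTHER_PKT };

// Dispatch a packet according to its PID (videoPid and audioPid come from
// the PMT of the currently selected program).
PacketKind classify(const std::uint8_t* pkt,
                    std::uint16_t videoPid, std::uint16_t audioPid)
{
    const std::uint16_t pid = packetPid(pkt);
    if (pid == 0)                 return TABLE_PKT;    // PAT
    if (pid == videoPid)          return VIDEO_PKT;
    if (pid == audioPid)          return AUDIO_PKT;
    if (pid == 802 || pid == 770) return TELETEXT_PKT; // as in appendix A
    return OTHER_PKT;
}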
4.2 User Interface Design
4.2.1 Architecture
This section explains the important aspects of the existing architecture of OPIE's opieplayer2 and extends it slightly. The description of the architecture is given so that it is easier to understand where changes had to be made for the incorporation of the teletext and for the alteration of the user interface to make it more usable and to realise the ideas of the MpegView model.
Figure 4.2: Existing opieplayer2 architecture
Existing Architecture
A simplified view of all the modules of opieplayer2 and the interactions between them can be seen in figure 4.2; their purposes are explained briefly below.
PlayListWidget: This module handles the input from the user, who selects the media files to be viewed. It not only provides a graphical user interface for this input but also keeps track of the currently selected media file and passes it on to the MediaPlayer module.
MediaPlayer: The MediaPlayer module controls the overall operation of opieplayer2. It receives input from the PlayListWidget module regarding the media file to be played (forwarding this to the xineControl module) and initiates a change of state of opieplayer2, handled by the MediaPlayerState module.
MediaPlayerState: This module keeps track of the state that opieplayer2 is in, for example whether a file is currently being played, streamed over a network or displayed in fullscreen mode. It relays this information to the VideoWidget and xineControl modules respectively.
xineControl: The controlling module of the xine-related processing in opieplayer2 that handles the playing of the current media file. It takes a media file as input and passes it on to the Lib module; in return it receives the frames to be displayed one at a time and updates the xineVideoWidget module with a single frame.
Lib: Another module concerned with the handling of a media file. This module acts as an intermediary between the xine output plugin, which is the NullVideo module, and the xineControl module. It essentially wraps the standard xine API for ease of use and conformity to the OPIE standard.
NullVideo: This is the output plugin specific to OPIE. The video output plugin is a thin abstraction layer for the Compaq iPAQ video output platform. It provides functions such as frame allocation and drawing and handles processes such as hardware acceleration, scaling and colorspace conversion. There is a generic output plugin API, which is implemented here for the Compaq iPAQ and OPIE.
VideoWidget: The module that is responsible for the display of video and for the controls with which the user manipulates the playback of a media file. It uses the xineVideoWidget module for the display of the video.
xineVideoWidget: This module displays a single frame of the video; the frame gets updated by the xineControl module.
xine-lib: This is the media library external to opieplayer2 and handles the demuxing, decoding and synchronisation of audio and video. The audio part of a video media file is handled by one of the supplied audio output plugins. The architecture of the xine library can be viewed in figure 4.1.
Figure 4.3: MpegView architecture
Amended Architecture
An addition to the existing module architecture of opieplayer2 was the inclusion of two new input modules: one structures the application a little better logically, and the other is a visualisation of the playlist. The purpose and details of the new modules are covered in section 4.2.3. Also, another control module was added that takes care of the management of context as described in section 3.3; its details are elaborated on in section 4.2.3. Only these minor changes to the opieplayer2 architecture were made, together with additional minor modifications to the VideoWidget module (refer to section 4.2.3).
4.2.2
Mobile device constraints
The maximum resolution for video output on a Compaq iPAQ is 320x240 pixels when used in landscape mode; in the vertical (portrait) orientation the maximum resolution that is practical to view is 240x160 pixels.
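For illustration only, the following sketch shows how a decoded frame could be fitted into such a display area while preserving its aspect ratio; the function name and structure are hypothetical and not part of the implementation.

#include <algorithm>

struct Size { int w; int h; };

// Fit a decoded frame into the usable display area (320x240 in landscape,
// as stated above) while preserving the frame's aspect ratio.
Size fitToDisplay(Size frame, Size display = Size{320, 240})
{
    const double scale = std::min(static_cast<double>(display.w) / frame.w,
                                  static_cast<double>(display.h) / frame.h);
    return Size{static_cast<int>(frame.w * scale),
                static_cast<int>(frame.h * scale)};
}
// e.g. a 720x576 PAL frame maps to 300x240 on a 320x240 display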
4.2.3
User Interface
The modifications to the existing modules of opieplayer2 are covered in this section. They concentrate on the new input modules for the user, the changes made for the incorporation of teletext data, the visual representation and control of an MPEG-2 stream, and the dynamic control of the user interface through changes in the context.
User input
The usability of the MpegView application was increased through a simple menu configuration by the addition of the MpegView module. The user finds this as the start-up screen and can then navigate to the other relevant modules for input, the PlayList and VideoSkeleton modules. The VideoSkeleton module is a visual representation of the current list of available media, i.e. the playlist. Its purpose is to show a visualisation in the form of Flip Zooming, as explained in section 3.2.2 and shown in figure 3.4. The final design of these two input modules can be seen in figures 4.4 and 4.5, two screenshots of the application running on the Compaq iPAQ.
Teletext
The inclusion of teletext required not only the xine library modifications (refer to section 4.1.3) but also changes to the xine-related code in opieplayer2. The modification included catching events relating to new teletext arriving from the demuxer and subsequently passing the text on to the VideoWidget module. The event originates directly from the xine library, and the only way of accessing xine library related events is through the Lib module. The xineControl module creates an instance of Lib and then "listens" for events related to teletext updates. It then forwards the teletext to the VideoWidget module, which decides how to display the text as output.
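The following is a hypothetical, much simplified sketch of this teletext path on the opieplayer2 side: a GUI event from xine (as sent by the demuxer change shown in appendix A) carries the extracted page as a string, and xineControl forwards it to the VideoWidget. The class and method names are illustrative, not the actual opieplayer2 API.

#include <string>

class VideoWidget {
public:
    void setTeletext(const std::string& text)
    {
        teletext = text; // in the real application this updates the text field
    }
private:
    std::string teletext;
};

class XineControl {
public:
    explicit XineControl(VideoWidget* w) : widget(w) {}

    // Called by the Lib module whenever a teletext GUI event arrives from
    // the xine engine.
    void onTeletextEvent(const char* data, int length)
    {
        widget->setTeletext(std::string(data, length));
    }
private:
    VideoWidget* widget;
};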
Figure 4.4: MpegView module screenshot
Video output
The VideoWidget module is where most of the changes and modifications were made. First of all, a teletext field has been added for the display of the teletext information included in the MPEG-2 stream, which was one of the initial goals. Furthermore, the visual representations of an MPEG-2 stream have to be made accessible to the user for selection. This was done by adding an area in the top part of the screen where the eight representational views can be selected, corresponding to the visualisation using Flip Zooming as explained in section 3.2.2 and shown in figure 3.5. The implementation of the selection choices can be viewed in figures 4.6 to 4.12. Additionally, cluttering the display with too many selection choices was avoided: the video controls are hidden initially and only surface once the user wants to alter the state of the MPEG-2 stream. This can be seen in figures 4.6 and 4.7, where the latter shows the controls. The following figures, 4.8, 4.9 and 4.10, show
Figure 4.5: VideoSkeleton module screenshot
the video, audio and teletext content individually. Figures 4.11 and 4.12 show two further screenshots. It can be seen here that the details discussed in section 3.2.2 have been followed and visual icons have been added to signify the current representational view.
Context Manager
The user interface can also be changed automatically, and this is handled by the program control module. It implements the context management, although so far only the calendar is used as a context source; using the device itself or a GPS device would require just a bit more effort. The context source used here is the Opie calendar, which uses an XML parser to store and retrieve the calendar information. The program control module uses this information in the same way as the calendar does, by triggering alarms once a calendar item reaches the appropriate time. This then causes an event which changes the interface to a predefined view. The definition of the calendar
Figure 4.6: Video, Audio and Teletext content user interface
Figure 4.7: Video, Audio and Teletext content user interface (showing controls)
Figure 4.8: Video content user interface
Figure 4.9: Audio content user interface
Figure 4.10: Teletext content user interface
Figure 4.11: Audio and Teletext content user interface
Figure 4.12: No content user interface
item and representational view relation is set in a separate menu in the MpegView application. The calendar from Opie can be utilised by reusing the category attribute of a calendar item and adding further choices to this category that correspond to the defined tasks, as covered under Calendar in section 3.3. A user inputs all the tasks into the calendar while using the category attribute to specify the task type, and on this basis the MpegView application takes these triggers to change the user interface automatically.
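A hypothetical sketch of this trigger flow is given below: when the alarm for a calendar item fires, the item's category is looked up in the user-defined mapping and the corresponding representational view is applied. The names are illustrative and not the actual Opie or MpegView API; the default assignments mirror the example in section 3.4.

#include <map>
#include <string>

class ContextManager {
public:
    ContextManager()
    {
        // defaults corresponding to the example in section 3.4
        categoryToView["1st level meeting"] = 8; // device off during formal meetings
        categoryToView["3rd level meeting"] = 3; // informal meeting: no audio
        categoryToView["office work"]       = 6; // audio only
    }

    // Invoked when an alarm for a calendar item fires; 'category' is the
    // reused category attribute of that item.
    void onCalendarAlarm(const std::string& category)
    {
        std::map<std::string, int>::const_iterator it =
            categoryToView.find(category);
        applyView(it != categoryToView.end() ? it->second : 1);
    }

private:
    void applyView(int view)
    {
        currentView = view; // the real application would notify the VideoWidget
    }

    std::map<std::string, int> categoryToView;
    int currentView = 1; // default: View 1 (full stream)
};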
4.3
Process of Development
This section details the difficulties encountered and the implementation choices made throughout the development process. Early in the development phase I experimented with several windowing managers and media players. The windowing managers that I initially tried before arriving at Opie were Qtopia and X. Qtopia was the first choice since it had been previously installed and provided
a media player. But extending this media player would have required extensive work due to the lack of streaming support and the complexity of the MPEG-2 decoder. Furthermore, I trialled X because of vlc, a media player from VideoLAN that supported streaming and easy modification of the MPEG-2 decoder, but I would have had to write a user interface from scratch. As I was more experienced with the Qt library, with which one can write applications for Qtopia and Opie, these were the obvious choices. I made several attempts to modify the MPEG-2 decoders of these two media players but was then informed by the Opie developers that a new version of their media player, supporting streaming and using xine as a backbone, was about to be released. This made things much simpler and development could go ahead. The modification of xine was relatively simple once I understood the way the different modules interact. The coding of the user interface, with regard to its appearance, could have been made a bit easier with later versions of the Qt library, but at this point in time version 3 of Qt is not yet properly supported by Opie. I hope to modify the code to make the user interface more aesthetic once it can be ported to Qt version 3. However, for the proof-of-concept this slightly less aesthetic user interface suffices.
4.4
Summary
This chapter described the proof-of-concept implementation of the model detailed in chapter 3. It covered the basics of the operating system and windowing manager that are used, as well as the details of the MPEG-2 data and of xine, the media library used for decoding the MPEG-2 stream. The existing and amended architecture of the modified application was explained, together with how the user interface has been changed to reflect the model. Furthermore, the context management part of the implementation was briefly detailed, and the difficulties encountered and choices made during development were listed. The entire code of the application is available as an archive ready to compile and is kept by myself and my supervisor.
Chapter 5 Summary In this final chapter of my thesis I sum up the work that I have done, including the outcomes I have obtained and the conclusions I have reached through my research and development work. Additionally, I cover the contributions of new knowledge that my thesis makes, as well as possible extensions and future research opportunities, so that other researchers have the benefit of the ideas covered in my work.
5.1
Summary of work
In summary, this project investigated an effective way of representing extensive and detailed information on a mobile device for a user with a changing context. Flip Zooming, a Focus+Context Visualisation technique, solved the information display issue by using a non-distortion approach that represents information in the context and detailed information in the focus. Furthermore, advances have been made regarding usability by utilising multimodal techniques to interact with the mobile device.
Chapter 3 covers a model that defines the way information
is represented and accessed, as well as how the user interface can be dynamically altered due to changes in the user's context. The context is determined by various sources such as a calendar, the global positioning system or the device itself. A proof-of-concept implementation was developed in chapter 4 that uses MPEG-2 streams as its information content. The two main aspects here were the reflection of the visualisation
design from chapter 3 using Flip Zooming, and the extension of the multimodal output by extracting teletext from the MPEG-2 streams. The goal of developing a model for a context-aware user interface that changes dynamically depending on the user's context has been achieved. The findings are detailed in section 3.3 and the proof-of-concept implementation is covered in section 4.2.3. Flip Zooming was successfully used as a Focus+Context Visualisation technique to represent large amounts of information on a mobile device. Also, a mechanism to extract the teletext content from MPEG-2 streams and display it on a mobile device has been developed. This is believed to be the first display of MPEG-2 teletext on a mobile device (certainly under Linux and OPIE).
5.2
Future Development
I have also thought about the uses and future research opportunities of my work and hope that researchers picking up this work in future will have the benefit of the ideas that I generated while working on the project. My first idea is the investigation of the possibilities of a context-aware operating system; the most likely accelerant for further research in this area will be the security and privacy concerns of users and companies. Furthermore, and as part of the above idea, my implementation could be extended with regard to context awareness, the additional devices required, such as GPS, and driver plugins, such as plugins that handle GPS data or diary entries. I am sure someone will find this subject matter as interesting as I have and will further enhance the developments in this area.
In concluding the work for this project, I am pleased with what I have achieved and with the directions this project has led me in. I have found a new interest in developing applications for mobile devices and in the possibilities of streaming media to convey information.
Bibliography
[1] L. Boves and E. d. Os, "Multimodal, multilingual information services for small mobile terminals (MUST)," Project 1104, EURESCOM, 2002.
[2] R. F. G. Niklfeld and M. Pucher, "Multimodal interface architecture for mobile data services," Proceedings of TCMC2001 Workshop on Wearable Computing, 2001.
[3] D. K. McGookin and S. A. Brewster, "Fishears - the design of a multimodal focus and context system," in Vol. II of Proceedings of BCS IHM-HCI 2001 (Lille, France), pp. 1–4, 2001.
[4] S. Björk and J. Redström, "Redefining the focus and context of focus+context visualization," Proceedings of the IEEE Symposium on Information Visualization 2000, pp. 85–90, 2000.
[5] K. C. T. Rodden and N. Davies, "Exploiting context in HCI design for mobile systems," Proceedings of the First Workshop on Human Computer Interaction with Mobile Devices, pp. 8–22, 1998.
[6] P. Johnson, "Usability and mobility; interactions on the move," Proceedings of the First Workshop on Human Computer Interaction with Mobile Devices, pp. 4–8, 1998.
[7] S. Oviatt, "Ten myths of multimodal interaction," Communications of the ACM, vol. 42, no. 11, pp. 74–81, 1999.
[8] K. Nesbitt, "Designing a multi-sensory model for finding patterns in stock market data," Advances in Multimodal Interfaces - ICMI 2000: Third International Conference, pp. 24–31, 2000.
[9] G. Z. X. Ren and G. Dai, "An experimental study of input modes for multimodal human-computer interaction," Advances in Multimodal Interfaces - ICMI 2000: Third International Conference, pp. 49–56, 2000.
[10] C. Bisdikian, J. Christensen, J. Davis, M. R. Ebling, G. Hunt, W. Jerome, H. L. S. Maes, and D. Sow, "Enabling location-based applications," Proceedings of the First International Workshop on Mobile Commerce, 2001.
[11] L. E. Holmquist, "Flip zooming: An alternative to distortion-based focus+context views," Master's thesis, Göteborg University, Dept. of Computing Science, 1996.
[12] S. Björk, "Hierarchical flip zooming: Enabling parallel exploration of hierarchical visualizations," Proceedings of Advanced Visual Interfaces (AVI), 2000.
[13] S. Björk, "Activity-based mobile interfaces - towards a user model for hybrids between mobile phones and PDAs," workshop paper at the Mobile Communications: Understanding Users, Adoption, and Design workshop at Computer-Human Interaction (CHI), ACM Press, 2001.
[14] S. Björk, J. Redström, P. Ljungstrand, and L. E. Holmquist, "PowerView: Using information links and information views to navigate and visualize information on small displays," Handheld and Ubiquitous Computing 2000 (HUC2k), 2000.
[15] L. E. Holmquist, S. Björk, and J. Redström, "A framework for focus+context visualisation," Proceedings of IEEE Information Visualisation 99, pp. 53–57, 1999.
[16] Y. Leung and M. Apperly, "A review and taxonomy of distortion-oriented presentation techniques," ACM Transactions on Computer-Human Interaction, vol. 1, no. 2, pp. 126–160, 1994.
[17] L. E. Holmquist, "Focus+context visualization with flip zooming and the zoom browser," Extended Abstracts of ACM Computer-Human Interaction (CHI) '97, 1997.
[18] S. Björk and L. E. Holmquist, "Formative evaluation of a focus and context visualisation technique," Poster at HCI'98, The British HCI Society, 1998.
[19] S. A. Brewster, G. Leplâtre, and M. Crease, "Using non-speech sounds in mobile computing devices," in C. Johnson (Ed.), Proceedings of the First Workshop on Human Computer Interaction with Mobile Devices, pp. 26–29, 1998.
[20] P. Sarginson, "MPEG-2: Overview of the systems layer," Research and Development Report, The British Broadcasting Corporation, 1996.
Appendix A xine Code listing
This code segment is part of the demux_ts_parse_packet method in the file demux_ts.c.

...
    demux_ts_buffer_pes (this, originalPkt + data_offset,
                         this->audioMedia, payload_unit_start_indicator,
                         continuity_counter, data_len);
    return;
  }
  /*
   * added by Christian Kohl 30/8/2002
   *
   * teletext packet with pid 802 or 770 for the moment
   */
  else if (pid == 802 || pid == 770) {
    /* one transport packet arrives here */
    int pesPacketLength, i, n;
    u_int8_t *buf;
    struct vt_page cvtp = {0}; /* current teletext page (previous & temp
                                  pages pvtp, tvtp are not used) */

    i = 0;
    pesPacketLength = 0;
    n = data_len;                 /* size of the original packet */
    buf = (u_int8_t *) originalPkt + data_offset;

    while (i < n) {
      char *the_text;
      xine_ui_event_t uevent;

      /* PES_packet_length from bytes 4-5 of the PES header plus the 6-byte
         header itself; this expression and the bounds check were garbled in
         the printed listing and are reconstructed here (assumed). */
      pesPacketLength = ((buf[i + 4] << 8) | buf[i + 5]) + 6;
      if (i + pesPacketLength > n) {
        break;
      }

      /* decode the teletext payload of this PES packet */
      dvb_handle_pes_payload(&cvtp, buf + i + 45, pesPacketLength - 45);
      vtpage_print(&cvtp);
      the_text = vtpage_char(&cvtp);

      if (strlen(the_text) > 0) {
        /* forward the extracted page to the user interface via a xine event */
        uevent.event.type = XINE_EVENT_GUI_OPIE;
        uevent.data       = the_text;
        uevent.data_len   = strlen(the_text);
        xine_send_event(this->xine, &uevent.event);
      }
      i = i + pesPacketLength;
    }
  }
  else if (pid == 0) {
...
Appendix B Installation procedure for Familiar Linux and OPIE
This appendix describes the installation procedure for Familiar Linux and OPIE.
Obtaining the required files:
Go to http://www.handhelds.org/projects/h3800.html
Follow the download link
Follow the Familiar distribution link to the ftp site
Then go to releases -> v0.5.3 -> install -> H3800
(ftp://ftp.handhelds.org/pub/linux/dists/familiar/releases/v0.5.3/install/H3800/)
The *.jffs2 file used here is task-opie-2.4.18-*.jffs2

Tera Term Pro setup:
This program is used to transfer the jffs2 image to the iPAQ. Other hyperterminal programs can be used as well.
Install Tera Term and then go to Setup -> Serial Port
Use the following settings:
Port: COM1 (or the port the iPAQ is connected to)
Baud rate: 115200
Data: 8 bit
Parity: none
Stop: 1 bit
Flow control: none
NOTE: When copying using the terminal and wanting to send or receive from the device, use sz to send and rz to
receive.

Installing Familiar:
The bootloader is already installed. (For information on installing the bootloader see the H3800 homepage and follow the links.)
Start up Tera Term and try to connect to the iPAQ.
Reboot the iPAQ; to get to the boot prompt press the "Serial bootloader console" equivalent button on the iPAQ (for me it was the calendar button).
At the boot prompt issue the following command:
boot> load root
It then asks you to do an xmodem upload of the *.jffs2 file.
boot> load
The important part of this process is that the bootloader successfully erases, writes, and verifies the filesystem image.

Boot the installation for the first time:
At the "boot>" prompt, type:
boot
You should see Linux start up and numerous daemons execute. It will pause for 10 or 20 seconds while it generates an ssh server key for your iPAQ. If all goes well, you should be presented with a "login:" prompt. At the login prompt, log in as root using rootme as the password.

PPP and SSH setup:
iPAQ:
The following is only a reference in case something does not work with PPP or SSH. In version 5.3 this worked straight away, whereas in version 5.2 I had trouble connecting.
Make sure that /usr/sbin/pppd exists (for Familiar version 5.3).
Make sure that /etc/passwd contains a line like
ppp::99:99:PPP User:/home/ppp:/sbin/pppd
Create /etc/ppp/options. (in 5.3 it is already there and has some other options set)
>mkdir /etc/ppp
>echo "-detach defaultroute noauth nocrtscts lock lcp-echo-interval 5 lcp-echo-failure 3 /dev/ttySA0 115200" > /etc/ppp/options
Add the PPP modules to /etc/modules so that they are loaded whenever you boot.
>echo "slhc
ppp_generic
ppp_async" >> /etc/modules
If you are making these changes on a running iPAQ you will want to load these modules now, since they were not in /etc/modules at boot time:
>insmod slhc
>insmod ppp_generic
>insmod ppp_async
Make sure /etc/modules.conf has the appropriate aliases.
>echo "alias char-major-108 ppp_generic
alias /dev/ppp ppp_generic
alias tty-ldisc-3 ppp_async" >> /etc/modules.conf
Make sure /usr/sbin/pppd is executable by user ppp:
>chmod 4755 /usr/sbin/pppd
Generate SSH keys so you can log in remotely:
>ssh-keygen -b 512 -f /etc/ssh/ssh_host_key -N ''
>ssh-keygen -d -b 512 -f /etc/ssh/ssh_host_dsa_key -N ''
Now start the sshd service:
>sshd
The sshd service will start automatically when you reboot; this
gets it up and running now.

Host computer: (Description of PC)
Create an /etc/ppp/peers/ipaq file containing the following:
-detach
noauth
nocrtscts
lock
local
user ppp
connect '/usr/sbin/chat -v -t3 ogin--login: ppp'
/dev/ttyS0
115200
192.168.0.100:192.168.0.101
Note: you may need to adjust the /dev/ttyS0 line to whatever tty you actually have the serial cable plugged into. It is likely either /dev/ttyS0 or /dev/ttyS1 (corresponding to DOS COM1 and COM2 respectively); /dev/ttyS3 in my case.

Making the connection:
Connect the serial cable between the host and the iPAQ.
Make sure that you are not logged in at the serial console (you should see the login: prompt in minicom).
Close minicom or any other terminal program that may have a connection to your serial port.
Add the IP addresses to the host's /etc/hosts file so that you can simply type ipaq rather than 192.168.0.101.
Finally, execute pppd on the host (as root):
>/usr/sbin/pppd call ipaq
A successful connection will report local and remote IP addresses. Now that you have a PPP connection, you have full TCP/IP networking between the host and the iPAQ. You can access either one from the other using the IP addresses reported when the PPP connection was established.

Internet setup:
iPAQ:
Copy the file /etc/resolv.conf from the host computer to the iPAQ:
scp /etc/resolv.conf root@ipaq:/etc/resolv.conf
On the host computer:
Follow the "PPP Howto" instructions linked above to get your host machine connected to your handheld. Then, as root on your host machine, issue the following statements:
cd /sbin
ipchains -F
ipchains -A forward -s 192.168.0.101 -j MASQ
ipchains -A forward -l -j DENY
echo 1 > /proc/sys/net/ipv4/ip_forward
This tells your host machine "masquerade all traffic coming from the handheld to the rest of the world, route that traffic back and forth between the handheld and your default route, and log headers for any packets that don’t match the forwarding rules." (Packet logging is very useful for finding out why it might not be working.) When you are done, the commands: ipchains -F
echo 0 > /proc/sys/net/ipv4/ip_forward
will flush your masquerade tables and disable IP forwarding.