FP7–216529
PinView Deliverable D1.4
Deliverable D1.4 Browser extension for pointer track feedback
Contract number: FP7–216529 PinView Personal Information Navigator Adapting Through Viewing
The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/2007–2013) under grant agreement n° 216529.
Revision: 1.42
Page 1 of 16
Identification sheet
Project ref. no.: FP7–216529
Project acronym: PinView
Status and version: Final, Revision: 1.42
Contractual date of delivery: 31.12.2008
Actual date of delivery: 31.12.2008
Deliverable number: D1.4
Deliverable title: Browser extension for pointer track feedback
Nature: Prototype
Dissemination level: PU – Public
WP contributing to the deliverable: WP1 Forms and protocols for enriched relevance feedback
Task contributing to the deliverable: Task 1.4 Implementation of pointer track feedback
WP responsible: Teknillinen korkeakoulu
Task responsible: Teknillinen korkeakoulu
Editor: He Zhang
Editor address: P.O.BOX 5400, FI-02015 TKK, Finland
Authors in alphabetical order: Jorma Laaksonen, Mats Sjöberg, He Zhang
EC Project Officer: Pierre-Paul Sondag
Keywords: browser add-on, extension, enriched relevance feedback, AJAX, XMLHttpRequest
Abstract
This report describes the implementation of a web browser extension for gathering enriched relevance feedback in the PinView project. The extension, written in JavaScript, tracks and collects mouse pointer movements, clicks and keyboard events on a document displayed in the browser. The recorded data are transferred to a web server that collects and analyses them. The communication follows the AJAX approach, in which the client and the server asynchronously exchange XML-formatted content through XMLHttpRequest. The same extension framework will also be used for implementing speech and gaze direction feedback.
List of annexes
pinview-1.00.xpi – XPI package containing the software of the PinView extension, v. 1.00, available at http://www.pinview.eu/extension/
Contents
1 Overview
2 Introduction
3 Implementation principles
3.1 Content of the extension
3.2 Class structure
3.3 Interaction between extension and web page
3.4 Passing information on displayed images
3.5 Passing the coordinate values
4 Installing the extension
4.1 Downloading the XPI package
4.2 Setting up preference values
4.3 Accessing preferences through about:config
4.4 Updating the extension
5 Conclusions
1 Overview
This document accompanies Deliverable D1.4 of the Personal Information Navigator Adapting Through Viewing (PinView) project, funded by the European Community’s Seventh Framework Programme under Grant Agreement n° 216529. The deliverable constitutes the output of Task 1.4 Implementation of pointer track feedback. In the PinView Description of Work, Deliverable D1.4 was tentatively named Browser applet or plug-in for pointer track feedback, but owing to the actual nature of the implemented software it has been renamed Browser extension for pointer track feedback.
This document aims to describe in sufficient detail the implementation of pointer track feedback in a web browser. The pointer track is one of the four user interaction modalities planned for providing explicit or implicit enriched relevance feedback in the PinView project. The other three modalities are eye movements, keyboard events and audio, including speech. The forms of data transport for all these modalities were specified in PinView’s earlier Task 1.3 Definition of transport protocol for enriched feedback [1], on which the current implementation work is based. In its current form, the developed browser extension framework already includes the functionality for collecting and processing keyboard events as well.
The pointer track and keyboard event feedback have been implemented as a browser extension for Mozilla Firefox, written in the JavaScript programming language. The programming environment and implementation principles of the extension follow the guidelines given on the Mozilla Developer Center (MDC) website. The extension works on Windows, Linux and Mac OS X operating systems running Mozilla Firefox version 3.0 or newer. This report also presents detailed and illustrated instructions for installing and configuring the extension.
The implementation work described in this report has been coordinated with the concurrent Task 1.5 Implementation of point-and-speak feedback. The future work in Task 8.4 Study of proactive eye-movement-based user interface will include incorporation of eye movement measurements in forthcoming versions of the browser extension.
2 Introduction
The main goal of the current work has been to implement a software mechanism for collecting pointer track data in a web browser and transmitting them to a content-based image retrieval (CBIR) server, where they are used as a form of enriched relevance feedback. Based on the general transport framework of the previous Task 1.3 Definition of transport protocol for enriched feedback [1], a web browser extension has been programmed to track the user’s pointer movements on a displayed still image (and possibly also on video).
Web browser extensions can in general be used to add new features that extend the functionality of a web browser such as Mozilla Firefox¹, a free and open source web browser that is very popular today. The implementation work detailed in this report follows the instructions on building extensions in the Mozilla Developer Center (MDC)², the official Mozilla Foundation website for developing Mozilla applications. JavaScript is the primary implementation language of Mozilla Firefox extensions. It is a dynamic scripting language widely used for client-side web development. JavaScript is standardised under the name ECMAScript³, and it supports prototype-based object construction and object-oriented programming, including inheritance.
The implemented browser extension records data on the user’s pointer movements as well as keyboard events. The data are stored in XML-formatted⁴ packets. The extension then transfers the data to the search server asynchronously through the XMLHttpRequest interface⁵ specified by the World Wide Web Consortium (W3C)⁶. XMLHttpRequest is widely used for implementing Asynchronous JavaScript and XML (AJAX) style content updates in web applications.
The rest of this report is organised as follows. In Section 3, we begin by describing the content and implementation principles of the extension. In Section 4, a step-by-step installation and configuration guide for the extension is presented. Finally, in Section 5, we present conclusions and discuss the related future work in the PinView project.
3 Implementation principles
As stated in PinView’s Task 1.3 Definition of transport protocol for enriched feedback [1], the user’s web browser will in general need to be equipped with a special add-on or extension that is capable of interpreting the tags in the server’s messages. The browser extension then needs to record data from the specified user interaction modalities and store them in XML-formatted packets. Finally, it has to communicate the recorded data to the search server through the XMLHttpRequest interface, which is already available in all modern web browsers, including Internet Explorer, Firefox, Opera and Safari.
In this section we describe how the browser extension has been implemented. First, in Section 3.1, we introduce the content of the extension, i.e. the directories and files contained in the extension’s installation bundle. Then, in Section 3.2, we describe the object-oriented class structure used in implementing the data collection in the different interaction modalities. Section 3.3 briefly discusses how the extension and the displayed web page can make use of the AJAX-type XMLHttpRequest responses from the server. Finally, Sections 3.4 and 3.5 describe how the extension passes information on the displayed images and the different pointer coordinate values, respectively, to the server.
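The final step described above, sending an XML packet asynchronously through XMLHttpRequest, might look roughly like the following. This is a minimal sketch under stated assumptions, not the extension's actual code: the function name postXML appears in the class diagram of Section 3.2, but its signature here, the text/xml content type and the optional XHR parameter (added only so that the sketch can be exercised outside a browser) are illustrative assumptions.

```javascript
// Sketch: post an XML string asynchronously and hand the reply to a callback.
// The fourth parameter is a hypothetical injection hook; in a browser it can
// be omitted and the built-in XMLHttpRequest constructor is used.
function postXML(url, xml, onResponse, XHR) {
  var Ctor = XHR || XMLHttpRequest;
  var req = new Ctor();
  req.open("POST", url, true); // true selects asynchronous operation
  req.setRequestHeader("Content-Type", "text/xml"); // assumed content type
  req.onreadystatechange = function () {
    // readyState 4 means the request has completed.
    if (req.readyState === 4 && req.status === 200) {
      onResponse(req.responseXML || req.responseText);
    }
  };
  req.send(xml);
  return req;
}
```

Because the call returns immediately and the reply arrives through the callback, the browser stays responsive while feedback data are being transferred.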
¹ http://www.mozilla.com/en-US/firefox/
² https://developer.mozilla.org/En
³ http://www.ecma-international.org/publications/standards/Ecma-262.htm
⁴ http://www.w3.org/XML/
⁵ http://www.w3.org/TR/XMLHttpRequest/
⁶ http://www.w3.org/
3.1 Content of the extension
Extensions add new functionality to Mozilla applications such as Firefox. In practice, extensions are packaged and distributed as a Cross-Platform Installer Module (XPI)⁷, which is a ZIP file utilising the XPInstall technology. An XPI file contains installation instructions (install.rdf) as well as the actual software to install, which is often itself packaged as a JAR file. The user interface of the extension, such as the preferences window (prefs-window.xul), is written in XUL⁸, Mozilla’s XML-based language for building feature-rich cross-platform applications that can run connected to or disconnected from the Internet. User actions, such as pointer movements and keyboard events, are bound to the respective functionality using the JavaScript⁹ programming language.
pinview-1.00.xpi:
    /install.rdf
    /chrome.manifest
    /chrome/pinview.jar
    /defaults/preferences/prefs.js
    /platform/Linux/components/
    /platform/WINNT/components/
    /platform/Darwin/components/
Figure 1: The content of the pinview-1.00.xpi file.
Figure 1 shows the directory and file structure of the extension packaged in a ZIP file named pinview-1.00.xpi. The contents of this XPI file are as follows:
install.rdf¹⁰ is an XML file from which Mozilla’s Add-on Manager reads information about the add-on or extension as it is being installed. It contains metadata such as the extension’s unique ID, version, author, compatibility and updating information.
chrome.manifest is a text file read by the chrome registry¹¹, which maps chrome package names to the physical locations of the chrome packages on the disk. The manifest file of the extension is shown below:
    content pinview jar:chrome/pinview.jar!/chrome/content/
    skin pinview classic jar:chrome/pinview.jar!/chrome/skin/
    overlay chrome://browser/content/browser.xul chrome://pinview/content/pinview.xul
pinview.jar contains the actual software to install, packaged as a JAR file. Its content is described below.
prefs.js is a default preference JavaScript file stored in defaults/preferences/. When placed there, it is automatically loaded by Firefox’s preferences system at start-up. Named string, number and boolean values can be saved and retrieved through Mozilla’s Preferences API. The preference file of the extension is shown below:
    pref("pinview.general.collectorurl", "");
    pref("pinview.pointer.collectorurl", "");
    pref("pinview.keyboard.collectorurl", "");
    pref("pinview.audio.collectorurl", "");
    pref("pinview.gaze.collectorurl", "");
    pref("pinview.general.opts", 1);
    pref("pinview.pointer.opts", 0);
    pref("pinview.keyboard.opts", 0);
    pref("pinview.audio.opts", 0);
    pref("pinview.gaze.opts", 0);
platform is a subdirectory that can contain platform-specific extension components. These components are stored in separate subdirectories for computers running Linux, Windows (WINNT) and Mac OS X (Darwin) operating systems. The current version of the extension does not require operating-system-specific components for processing pointer and keyboard data, but the audio and gaze modalities will most likely require them.
Figure 2 shows the directory structure of pinview.jar. Chrome is Mozilla’s name for the user interface elements of the browser window that are outside the window’s content area, such as toolbars, status bars and the like. Chrome has three basic types of providers: content, locale and skin. The main source file for a window description comes from the content provider. Typically, it is a XUL file, since XUL is designed for describing the contents of windows and dialogs. For example, the contents of the PinView extension’s preferences window are described in the prefs-window.xul file. The JavaScript files that implement the behaviour of the user interface are also contained within the content package. For example, the event handling for pointer movements and keyboard events is defined in the pointer.js and keyboard.js files, respectively. In addition, the skin provider is responsible for providing a complete set of files that describe the visual appearance of the chrome. For example, the icons of our extension are provided by the image files pinview.png and pinview16.png.
pinview.jar:
    /chrome/content/about.xul
    /chrome/content/pinview.js
    /chrome/content/pinview.xul
    /chrome/content/prefs-window.js
    /chrome/content/prefs-window.xul
    /chrome/content/pointer.js
    /chrome/content/gaze.js
    /chrome/content/keyboard.js
    /chrome/content/audio.js
    /chrome/skin/pinview.png
    /chrome/skin/pinview16.png
Figure 2: The content of the pinview.jar file.
⁷ https://developer.mozilla.org/en/XPI
⁸ https://developer.mozilla.org/en/XUL
⁹ https://developer.mozilla.org/en/JavaScript
¹⁰ https://developer.mozilla.org/en/Install.rdf
¹¹ https://developer.mozilla.org/en/Chrome
3.2 Class structure
The modalities of user interaction considered in PinView include eye movements, pointer movements and events, keyboard events and audio, including speech. This report concerns the two modalities that have been implemented: pointer movements and events, and keyboard events. Figure 3 shows a class inheritance diagram of the JavaScript implementation. The roles of the defined classes are as follows:
Modality is the super-class of the classes for pointer movements, keyboard events and other modalities. These all collect their input and store it internally in the xmlDoc object, which is then periodically transmitted to the server URL.
Pointer is a sub-class of Modality. It inherits all the functions defined in Modality and adds its own methods for pointer-specific functionality.
Keyboard is a sub-class of Modality. It inherits all the functions defined in Modality and adds its own methods for keyboard-specific functionality.
The method handleModality() is overridden in each sub-class for a specific modality. It generates the XML data structure for that particular interaction modality. For example, in the class Pointer this method calls the method sampleXML() to generate XML for pointer movement samples and clickXML() to generate XML for pointer clicks. The general method handleEvents() in the super-class periodically sends the collected data to the specified server URL.
Modality
    + xmlDoc: string
    + timedelay: int
    + eventlimit: int
    + sendXML()
    + getURL()
    + postXML()
    + processResponse(req)
    + createXML()
    + handleEvents(event)
Pointer (sub-class of Modality)
    + type: string = "pointer"
    + handleModality(event)
    + sampleXML(event)
    + clickXML(event)
    + btnClick(event)
Keyboard (sub-class of Modality)
    + type: string = "keyboard"
    + handleModality(event)
Figure 3: A class diagram showing the relationships between the Modality super-class and its two sub-classes, Pointer and Keyboard.
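The inheritance scheme described in this section can be sketched with plain constructor functions and prototypes. The class and method names below follow Figure 3, but everything else is an assumption made for illustration: the inherit() helper, the pending counter, the outbox array that stands in for the network, the reading of eventlimit as "flush after this many events", and the XML element names (sample, click, key) are all hypothetical and are not taken from the extension's source.

```javascript
// Helper for prototype-based inheritance (hypothetical, pre-ES5 style).
function inherit(Sub, Super) {
  function Bridge() {}
  Bridge.prototype = Super.prototype;
  Sub.prototype = new Bridge();
  Sub.prototype.constructor = Sub;
}

// Super-class: buffers XML fragments and flushes them in batches.
function Modality() {
  this.xmlDoc = "";     // collected XML fragments
  this.eventlimit = 2;  // assumed meaning: flush after this many events
  this.pending = 0;     // hypothetical counter, not shown in Figure 3
  this.outbox = [];     // hypothetical stand-in for the server connection
}
Modality.prototype.type = "modality";
Modality.prototype.handleModality = function (event) {
  throw new Error("handleModality() must be overridden in a sub-class");
};
Modality.prototype.handleEvents = function (event) {
  this.xmlDoc += this.handleModality(event);
  this.pending += 1;
  if (this.pending >= this.eventlimit) {
    this.sendXML();
  }
};
Modality.prototype.sendXML = function () {
  this.postXML("<" + this.type + ">" + this.xmlDoc + "</" + this.type + ">");
  this.xmlDoc = "";
  this.pending = 0;
};
Modality.prototype.postXML = function (xml) {
  this.outbox.push(xml); // a real implementation would post to the server
};

// Sub-class for pointer movements and clicks.
function Pointer() { Modality.call(this); }
inherit(Pointer, Modality);
Pointer.prototype.type = "pointer";
Pointer.prototype.sampleXML = function (event) {
  return '<sample x="' + event.screenX + '" y="' + event.screenY + '"/>';
};
Pointer.prototype.clickXML = function (event) {
  return '<click x="' + event.screenX + '" y="' + event.screenY +
         '" button="' + event.button + '"/>';
};
Pointer.prototype.handleModality = function (event) {
  return event.type === "click" ? this.clickXML(event) : this.sampleXML(event);
};

// Sub-class for keyboard events.
function Keyboard() { Modality.call(this); }
inherit(Keyboard, Modality);
Keyboard.prototype.type = "keyboard";
Keyboard.prototype.handleModality = function (event) {
  return '<key code="' + event.keyCode + '"/>';
};
```

With this layout, the batching logic lives once in Modality, and a new modality such as gaze would only need to supply its own type and handleModality().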
3.3 Interaction between extension and web page
We have implemented a mechanism for launching web-page-specific actions when the extension receives data from the collecting server. This is implemented in the method processResponse() of the class Modality. If the current web page defines a function named erfcallback(), it is called with the XML document provided by the search server as its argument. The web page can then change its contents or otherwise update its state to reflect the response from the server. If the function does not exist, the extension continues silently as normal.
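The callback mechanism just described can be sketched as follows. The names processResponse and erfcallback come from the text above; the second parameter, pageWindow, is a hypothetical way of handing the page's window object to the function, since the report does not describe how the extension obtains it.

```javascript
// Sketch: forward the server's XML reply to the page if it opted in.
// Returns true when the page-level callback was invoked, false otherwise.
function processResponse(req, pageWindow) {
  var callback = pageWindow ? pageWindow.erfcallback : undefined;
  if (typeof callback !== "function") {
    return false; // the page defines no erfcallback(): continue silently
  }
  callback.call(pageWindow, req.responseXML);
  return true;
}
```

A page that wants to react to the server's reply would then only need to define a global function, for example function erfcallback(xml) { /* update the page */ }, and pages that do not define it are unaffected.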
3.4 Passing information on displayed images
When the extension detects that the browser has started to display a new web page, it sends the server a special element named . This list contains one element for each displayed image. The attributes src, top, left, width and height specify the content, location and size of the image. Figure 4 illustrates the element and its contents. Based on that information, the server process learns the layout and proximity of the images in their actual rendering positions.
Figure 4: A list of images on a web page.
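As a rough illustration of the image list, the sketch below builds one XML element per image with the src, top, left, width and height attributes named above. The element names imagelist and image are placeholders chosen for this sketch, not names confirmed by this report, and measuring top and left in page coordinates is likewise an assumption.

```javascript
// Escape a value for use inside a double-quoted XML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Describe one image element; win is the window that displays it.
// Assumption: top and left are measured from the top-left corner of the page.
function describeImage(img, win) {
  var rect = img.getBoundingClientRect(); // position relative to the viewport
  return {
    src: img.src,
    top: Math.round(rect.top + win.pageYOffset),
    left: Math.round(rect.left + win.pageXOffset),
    width: img.width,
    height: img.height
  };
}

// Serialise the descriptions; "imagelist" and "image" are placeholder names.
function imageListXML(images) {
  var lines = ["<imagelist>"];
  for (var i = 0; i < images.length; i++) {
    var im = images[i];
    lines.push('  <image src="' + escapeAttr(im.src) +
               '" top="' + im.top + '" left="' + im.left +
               '" width="' + im.width + '" height="' + im.height + '"/>');
  }
  lines.push("</imagelist>");
  return lines.join("\n");
}
```

In a browser, the descriptions could be gathered by applying describeImage to each entry of document.images before calling imageListXML.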
3.5 Passing the coordinate values
In specification [1] it was defined that the pointer coordinates will be passed in and elements within the and elements. However, in real applications the extension will most probably not be able to decide relative to which coordinate system the spatial coordinates should be measured. Therefore we have chosen to record the coordinates in all available coordinate systems and pass all of them to the server for storage, analysis and later use. In particular, there exist four coordinate systems, illustrated with a element in Figure 5:
• screen coordinates, whose origin is in the top-left corner of the monitor screen;
• absolute or page coordinates, which are relative to the top-left corner of the displayed HTML page;
• relative or window coordinates, which are relative to the currently visible part of the HTML page and thus depend on the horizontal and vertical scrolling positions of the page;
• image coordinates, which exist if the pointer resides inside an image and are relative to the top-left corner of that image.
Tue, 16 Dec 2008 14:50:21.801 GMT 135 92 636 92 1173 361 848 left
Figure 5: Coordinate values are recorded and sent relative to four coordinate systems.
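The four coordinate systems can be read off a mouse event roughly as follows. This is a sketch under assumptions: screenX/screenY, pageX/pageY and clientX/clientY are standard mouse event properties, but the result keys (screen, page, window, image) and the imageRect argument, a rectangle given in page coordinates, are names invented here and are not the element names used by the extension.

```javascript
// Sketch: collect the pointer position in all four coordinate systems.
// imageRect (optional) is {left, top, width, height} in page coordinates.
function pointerCoordinates(event, imageRect) {
  var coords = {
    screen: { x: event.screenX, y: event.screenY }, // origin: monitor corner
    page:   { x: event.pageX,   y: event.pageY },   // origin: page corner
    window: { x: event.clientX, y: event.clientY }, // origin: visible area
    image:  null                                    // set only inside an image
  };
  if (imageRect) {
    var x = event.pageX - imageRect.left;
    var y = event.pageY - imageRect.top;
    if (x >= 0 && y >= 0 && x < imageRect.width && y < imageRect.height) {
      coords.image = { x: x, y: y };
    }
  }
  return coords;
}
```

The page and window values differ exactly by the current scroll offsets, which is why the window coordinates depend on the scrolling position while the page coordinates do not.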
4 Installing the extension
In this section we provide an installation and configuration guide for the PinView extension described in this report. We first explain how to download the package and then how to personalise the extension by modifying the preference variables. These steps are illustrated in Figures 6 to 15.
4.1 Downloading the XPI package
Extensions can be downloaded from a web page or from the direct URL of the XPI file. The PinView extension and its future releases will be placed under the directory URL http://www.pinview.eu/extension/. The direct URL of the extension will be http://www.pinview.eu/extension/pinview-X.YY.xpi, where X.YY specifies the extension’s version number. Before the installation starts, a pop-up window asks the user to accept it, as shown in Figure 6.
Figure 6: Starting the installation of the pinview-1.00.xpi extension. After a wait of a few seconds, the installation can be accepted.
On accepting the install, the XPInstall system automatically follows the installation instructions contained in the XPI file and installs the software, as shown in Figure 7. After Firefox has been restarted, as suggested by the system, the installation is complete and a miniature PinView logo appears at the bottom right of the browser’s status bar. This is illustrated in Figure 8.
Figure 7: The installation of the pinview-1.00.xpi extension has succeeded.
Figure 8: After a successful installation, a miniature PinView logo has appeared on the browser’s bottom-right status bar.
4.2 Setting up preference values
Next, one can manually set the options concerning the different user interaction modalities in the preferences window. The PinView extension’s preferences window is launched by selecting first “Tools” and then “Add-ons” from Firefox’s menu bar. This starts the Add-on Manager, which shows the miniature PinView logo in the extension list, as shown in Figure 9. Clicking the “Preferences” button pops up the preferences window showing the current settings. The same window can also be reached with a shortcut, namely by clicking on the PinView logo on the browser’s status bar (see Figure 8).
The preferences window has five option tabs named “General”, “Pointer”, “Keyboard”, “Audio” and “Gaze”. The sixth tab is named “About”, as seen in Figure 10. The options in the “General” tab are as follows:
Recording policy contains four mutually exclusive options:
– Do not collect at all (default)
– Send to url specified in
– Send to collector if no
– Always send to collector
Collector address specifies the collector URL for the last two recording policies. When manually changed, the new collector address becomes valid as soon as the “OK” button is pressed. If the collector address is empty, as it is initially, or invalid, the collector-mode operation of the extension is silently disabled.
Figure 9: The preferences window can be launched from the Add-on Manager’s list of installed extensions.
Figure 10: Option settings in the “General” tab.
A similar recording policy selection also applies to the specific user interaction modalities, such as “Pointer” and “Keyboard”, except that each of them has an additional option of obeying the “General” recording policy setting, which is the default. The “Pointer” and “Keyboard” preference tabs are shown in Figures 11 and 12, respectively. Each of the modalities can be manually configured such that the respective feedback data are sent to its own URL specified in the tag or to its own collector URL.
Figure 11: Option settings in the “Pointer” tab.
Figure 12: Option settings in the “Keyboard” tab.
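The way the per-modality and “General” recording policies combine can be sketched as a small decision function. The numeric codes below are a guess that is merely consistent with the default values in prefs.js (1 for the general policy, 0 for each modality); the report does not document the encoding. The function names, the shape of the prefs object and the simple URL check are also assumptions made for illustration.

```javascript
// Hypothetical numeric encoding of the recording policies (an assumption).
var POLICY = {
  OBEY_GENERAL: 0,             // modality tabs only
  DO_NOT_COLLECT: 1,
  SEND_TO_PAGE_URL: 2,         // URL specified by the page
  COLLECTOR_IF_NO_PAGE_URL: 3,
  ALWAYS_COLLECTOR: 4
};

// Very rough validity check for a collector address (assumed rule).
function isValidCollector(url) {
  return typeof url === "string" && /^https?:\/\/[^\s\/]+/i.test(url);
}

// Decide where the data of one modality should be sent.
// prefs maps branch names to {opts, collectorurl}; pageUrl is the URL
// specified by the page, or null. Returns a URL, or null for "do not send".
function resolveTarget(prefs, modality, pageUrl) {
  var branch = modality;
  if (prefs[modality].opts === POLICY.OBEY_GENERAL) {
    branch = "general"; // fall back to the "General" tab settings
  }
  var opts = prefs[branch].opts;
  var collector = isValidCollector(prefs[branch].collectorurl)
    ? prefs[branch].collectorurl
    : null; // empty or invalid: collector mode is silently disabled
  switch (opts) {
    case POLICY.SEND_TO_PAGE_URL:
      return pageUrl || null;
    case POLICY.COLLECTOR_IF_NO_PAGE_URL:
      return pageUrl || collector;
    case POLICY.ALWAYS_COLLECTOR:
      return collector;
    default:
      return null; // "Do not collect at all" or an unknown value
  }
}
```

With the default preference values, every modality obeys the general policy and the general policy is not to collect, so resolveTarget returns null until the user changes a setting.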
The “About” tab displays acknowledgement information for the European Commission’s research funding as shown in Figure 13.
Figure 13: The “About” tab shows acknowledgements.
4.3 Accessing preferences through about:config
In addition to using the preferences window, one can also access and modify the preferences through the special about:config URL, which lists the preference variables that were originally read from the default preference file prefs.js. Typing about:config in the browser’s Location Bar displays the list of preferences. Figure 14 shows how Firefox presents the list of existing preferences through about:config, filtered by the expression “pinview”. To modify a preference variable, e.g. pinview.general.opts, one double-clicks it and edits its value in the subsequent dialog, as shown in Figure 15. Note that some preferences require a restart for the change to take effect.
Figure 14: The about:config’s preferences list.
Figure 15: A pop-up window for modifying the value of a preference variable.
4.4 Updating the extension
Updating the extension is straightforward with Firefox’s Add-on Manager. One first starts the Add-on Manager by clicking “Tools” in Firefox’s menu bar and then choosing “Add-ons” in the pop-up menu. When one right-clicks the miniature PinView icon shown in the extension list and chooses “Find Update”, as shown in Figure 16, the Add-on Manager checks automatically whether an update of the extension is available. If one is, the user can then choose to download it. By default, Firefox should also check for new updates automatically, without user interaction.
Figure 16: Finding updates of the PinView extension.
5 Conclusions
In this report, we have described the implementation of a Firefox web browser extension for obtaining true on-line asynchronous enriched relevance feedback from the versatile user interaction modalities planned in the PinView project. The extension is now capable of tracking the user’s pointer movements and keyboard events. In the future, it will also capture eye movement data and audio, including but not limited to speech. The extension is the first implementation based on the XMLHttpRequest communication principles and XML data formats specified in PinView’s previous Task 1.3, and it sends the recorded feedback data from the browser client to the search server. The extension allows the user to personalise the option settings of each individual interaction modality by modifying the preference variables either in the preferences window of the extension or in Firefox’s about:config list. Once a new version is released, it can easily be downloaded and installed through the Add-on Manager of the Mozilla Firefox browser.
Acknowledgements
Collaborators in the PinView project, especially Bernhard Lackner and Michael Kumar of celum, are acknowledged for their valuable comments and contributions concerning the content of this report.
References
[1] Jorma Laaksonen. Definition of transport protocol for enriched feedback. PinView FP7–216529 Project Deliverable Report D1.3, September 2008. Available online at http://www.pinview.eu/deliverables.php.