Published in Proceedings of AVI '98: Advanced Visual Interfaces, L'Aquila, Italy, May 25-27, 1998, pp. 76-82, ACM Press. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes, or for creating new collective works for resale or redistribution, or to reuse any copyrighted component of this work in other works, must be obtained from the ACM.
Rapid-Fire Image Previews for Information Navigation

Kent Wittenburg, Wissam Ali-Ahmad, Daniel LaLiberte, Tom Lanning
GTE Laboratories Incorporated
40 Sylvan Rd.
Waltham, MA 02254 USA
kentw/wali-ahmad/dlaliberte/[email protected]
ABSTRACT
In this paper we consider the role of rapid-fire presentation of images in the service of navigation in information spaces. We presume a model of information navigation in which the user performs a cycle of (pre)viewing, selecting, and moving. Our hypothesis is that images presented to the user in rapid succession can significantly enhance the previewing step, thus optimizing the selection step and improving navigability. We discuss two prototypes for navigation tools in Web information spaces in which images are used as the primary means for presenting meta-information about "upcoming" Web pages. The presentation is modeled as a flow of information streaming to the user, and orientation is visualized through positions in ordered sequences.

KEYWORDS: Information navigation, visualization, images, previewing

INTRODUCTION
George Furnas [4] has developed a theory of navigation in information spaces that serves as a useful springboard for positioning research on this subject. The overall process for the user is described as a sequence of steps composed of viewing, selecting, and moving. Furnas distinguishes a viewgraph from an underlying information graph. The viewgraph is layered over the information graph, serving the purpose of presenting out-links from successive views so that the user can proceed with a next move to other parts of the information graph. Cognitive limits and screen real estate typically constrain any single view to relatively few out-links. Thus the navigator, at each step of the way, is faced with the task of choosing among the moves offered such that the overall paths to any targets in the information space as a whole are minimized. Designing for strong navigability, according to Furnas, involves making the presentation of out-links predictive of those parts of the information graph that subsequently become accessible. From the point of view of the nodes of the graph, the goal is to include scent [11] or residue [4] in as many viewgraph windows as possible.
The work we discuss here addresses the issue of optimizing the user's decision-making process at key steps by making the presentation of out-links not a static listing of some kind, but a dynamic interactive presentation supporting rapid scanning. In so doing, the residue of nodes in the information graph is increased in the viewgraph. We use images extracted from Web pages as the primary vehicle for conveying residue. As one should expect, radically increasing the number of out-links conveyed in a single view from a single location in the information space leads to an independent set of usability issues that may themselves include navigation aspects. As an example, consider the fly-through style of interface found in PAD++* [2] or a commercial cousin, PerspectaView* [1]. Used as maps of an information space, such interfaces require that users master panning and zooming navigational moves in order to view the information map. Flying-style interfaces must deal with the problem of users getting lost inside the space (consisting here of a map); signposts or landmarks may be needed to guide users to parts of the space that may not be apparent at the scale at which the user is currently viewing the environment [9]. A different slant on this problem was suggested to us by certain aspects of navigation practiced historically by peoples in Micronesia and Polynesia [8]. The mental model of locomotion used by these master navigators inspired us to think of a very simple set of controls for users of preview material. As reported by Hutchins [8] and his sources, the notion of locomotion in these navigators' mental model of travel is that the traveler remains stationary while the terrain moves relative to the traveler.

* Trademarks, tradenames, and service marks denoted by "*" are the property of their respective holders.
While we do not presume to apply the model of Micronesian navigation as such (the spatial orientation process in particular relies on complex region-specific knowledge of star patterns, currents, and wind patterns), this alternative notion of locomotion did suggest to us the idea of shifting the initiative for movement away from the user and onto the system while in a preview mode. The user could, minimally, set a direction and let the world roll by, so to speak, on predefined paths. Although this alternative sense of movement can be a subtle distinction in physical environments, the effect in an abstract information space might be to eliminate any sensation of getting lost inside the space itself, since the user never has the sensation of going anywhere. In fact, we may be able to avoid the problem of spatial orientation altogether. As pointed out in [3], there is more than one sense in which persons can arrive at a feeling of knowing where they are. A sense of "whereness" may also be characterized as a process description, i.e., a sequence of steps necessary to arrive at a certain state. Spatial, in contrast to sequential, orientation methods for navigating the physical world can in fact be quite difficult to impose on information spaces such as the World Wide Web, which is inherently multidimensional and continuously changing.

A second aspect of our work is the use of imagery in the presentation of preview information (out-links). Among the motivations for using images are to utilize more fully the human capacity for rapid visual information processing [5] and to make the experience of previewing a Web space more engaging and perhaps even entertaining. The style of rapidly presenting a series of images is common in the entertainment industry today, particularly among those seeking to appeal to younger viewers. Jonathan Helfman [6][7] and Andruid Kerne [10] have previously explored applications that use images extracted from Web pages. It is apparent from their work that collages of such images can present compelling gestalt impressions of Web spaces. Helfman has proposed a number of interesting applications, including using images cached in a proxy server as a method for sharing information within a community. He continues to explore the use of images in the larger context of associative organizations of Web information (see http://cs.unm.edu/~jon/). Kerne's main interest has been in the artistic possibilities of Web image collages. Our focus is on using images for the purposes of previewing and scanning in a previously structured space. Our work is directed towards providers of Web information who are able to select, structure, and cache images in advance in order to support a previewing function over indexed material. The primary issues we have addressed are in making image presentations dynamic and controllable to serve the needs of previewing and rapid scanning.
In the remainder of this paper, we discuss two prototypes and some of the underlying component architecture for PolyNav*, the name for our system to support image-based previewing in information spaces. These prototypes are not intended to replace current commercial Web browsers, but rather to augment them. We use images as the primary means for presenting meta-information about accessible locales (Web pages), and movement within a potentially large space of out-links is modeled as a stream of images coming to the user. Orientation within the streams is done through positioning in simple linear sequences. The sequences themselves can be defined in various ways, say, through sorting operations on any metadata that might be available or by categorizing the pages in an off-line authoring process.

* PolyNav is a trademark of GTE Laboratories Incorporated.

A JAVA CLIENT PROTOTYPE
Figure 1 shows a screen shot of our Java-based PolyNav applet. This applet appears when an appropriately labeled preview link is followed from a Web page. At any point a user can click on an image in the central viewing area and the Web browser will bring up an associated URL link. The most obvious association is that the image appears directly on a static HTML-based Web page, a property that can be discovered through parsing the HTML documents in advance through Web crawling. However, the associations between images and URLs can be derived in other ways; for example, in a site whose pages are dynamically generated, images and URLs may be associated with fields in a database. In this example, the choice menu labeled "Page Set" contains a list of the primary nodes in a hierarchical organization of the site's Web space. The applet serves to preview the main sections of the site; the one currently being previewed is the "Shop Online Service." This preview (a sequence of pages) contains twenty entries. The current entry is "Gifts and Flowers." The images visible at 100% opacity are associated with this page. Partially transparent images are those that have been viewed previously in the sequence. When a user selects a preview target, the PolyNav applet begins "playing" the sequence. The user sees images appearing on top of the viewing area as the sequence is followed from the beginning. The effect we are trying to convey is that the sequences represent flows of objects coming into the user's space, and as they arrive, the associated images are placed on top of the current view. The visual metaphor is of photographs being thrown onto the top of an ever-increasing pile. The user can pause or continue the sequence, as well as reverse its direction, through the controls at the bottom. A slider conveys the position of the current page in the overall sequence, and users can jump to other positions by manipulating the slider.
Figure 1: A screen shot of the Java version of the PolyNav previewer. (See color plate.)
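The stream-playing behavior can be summarized with a small sketch. The following Java fragment is a minimal, hypothetical illustration rather than the actual PolyNav applet code: a Swing timer advances an index into a sequence of preloaded page images, newly arrived images are painted on top of the pile at full opacity, and previously viewed images are drawn partially transparent. The class and field names are ours, invented for illustration.

    import java.awt.*;
    import java.awt.image.BufferedImage;
    import java.util.List;
    import javax.swing.*;

    // Illustrative sketch only: plays a sequence of page images as a growing pile.
    class PreviewPanel extends JPanel {
        private final List<BufferedImage> sequence;   // one image per page in the track
        private int position = 0;                     // index of the current page
        private int direction = +1;                   // +1 forward, -1 reverse
        private final Timer timer;

        PreviewPanel(List<BufferedImage> sequence, int millisPerImage) {
            this.sequence = sequence;
            this.timer = new Timer(millisPerImage, e -> step());
        }

        void play()    { timer.start(); }
        void pause()   { timer.stop(); }
        void reverse() { direction = -direction; }
        void jumpTo(int p) { position = Math.max(0, Math.min(p, sequence.size() - 1)); repaint(); }

        private void step() {
            jumpTo(position + direction);             // advance (or back up) one page
        }

        @Override
        protected void paintComponent(Graphics g) {
            super.paintComponent(g);
            Graphics2D g2 = (Graphics2D) g;
            // Earlier images form the pile and are drawn partially transparent;
            // the current image is drawn last, on top, at full opacity.
            for (int i = 0; i <= position; i++) {
                float alpha = (i == position) ? 1.0f : 0.35f;
                g2.setComposite(AlphaComposite.getInstance(AlphaComposite.SRC_OVER, alpha));
                int x = (i * 37) % Math.max(1, getWidth() - 120);   // simple scatter layout
                int y = (i * 53) % Math.max(1, getHeight() - 90);
                g2.drawImage(sequence.get(i), x, y, null);
            }
        }
    }

A slider in the actual applet would simply call a method like jumpTo() with the selected position.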
A VRML CLIENT PROTOTYPE
Figure 2 shows a screen shot of our VRML-based PolyNav prototype. This prototype conveys a stronger sense of immersion and depth than the Java-based prototype. Panels textured with images representing pages appear on the horizon through a light fog and float by the user while faint background music helps complete the setting. Unlike the Java prototype, the VRML prototype can present animated GIF images, as well as AVI, QuickTime, and MPEG movies, on the floating panels. Spatialized MIDI and WAV audio files from each page can even be played when the image panels are in a certain proximity to the viewer. As for the controls, a slider (on the right side of the figure) provides continuous forward and reverse control of the pace of motion. The pitch of an engine sound varies directly with the pace selected by the slider. Currently, there is no control for absolute position in the space, other than a drop-down menu that allows the user to select a page sequence (track). The control in the lower-right corner allows the user to slide the images along a planar surface perpendicular to the viewing axis. This allows the user to inspect images that are not directly aligned with the current viewing axis.
The VRML-based prototype provides more continuous and precise control of image pacing than the Java-based client. We suspect that there are significant individual and task differences in the pace at which users would like to view the images. For example, if users want to get an overall impression of a Web space, they may want to let images stream by very quickly as long as they can "back up" to inspect any images that passed by before they could react. On the other hand, if the user wants to extract more detailed information from each image, for example in a browsing mode typical of window-shopping behavior, the pace will need to be slower.

POLYNAV SPACE PREPROCESSING
Figure 2: A screen shot of the VRML version of the PolyNav previewer. (See color plate.)

Preparation for previewing an information space through one of the PolyNav clients involves a process of acquiring, structuring, and storing meta-information for the objects of navigation. So far we have concentrated on acquiring images from HTML Web pages; however, we have also extracted associations of images with URLs from databases
to handle dynamically generated information and have experimented with other kinds of metadata. Besides the association of named URLs to images, the clients expect a set of one or more sequences (tracks) of URLs. The criteria for deciding what constitutes a sequence are up to the application. We have experimented with the following types of sequences:

(1) links out (with a specified depth limit) from one or more start URLs
(2) ranges of results from a query (first 30, second 30, ...)
(3) different sortings of a page set (e.g., by name, by last-modified date, by frequency of visit, by profile match, etc.)

While it may be that in the future all of the necessary image metadata will be available in real time on the Web, that is not the case today. Automatically acquiring the metadata necessitates parsing the HTML for each URL (unless the URL/image association is available through a database query), filtering the images contained therein, and then downloading and preprocessing the images to address computation and bandwidth constraints. We offer a set of tools and services, summarized in Figure 3, that prepare a client for viewing a PolyNav space. The main format for communication is a data file in what we call PNF format. (We expect to use XML for this purpose in the future.) The Java PolyNav client reads this format directly. The VRML client goes through a process of generating a WRL file from the PNF file at startup.

Specification of a PolyNav space

Our space specification tools have both server and client components. The server does the work of processing URLs and Web sites to build both the PNF file and an image archive. The Launcher is the client tool used to specify the Web space to the server.

As with the collage services found on Kerne's Web page [10], users can specify Web spaces through queries or Web walks. Our query service makes use of two external search engines: one a popular search engine available on the Web, the second an experimental metasearch service at GTE Laboratories. The PN_Server passes the query on to a search engine, parses the query results, and creates a PNF file.
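However a space is specified, the structure the preview clients consume is simple: one or more named tracks, each an ordered sequence of entries pairing a page URL with the images (or thumbnail URLs) that represent it. The following Java sketch models that structure with plain data classes; the class and method names are ours, invented for illustration, and the concrete PNF file syntax is not reproduced here. The sorting method corresponds to sequence type (3) above.

    import java.util.*;

    // Illustrative data model for what a PolyNav client expects from a PNF file.
    // Names are hypothetical; the real PNF syntax is not shown here.
    class PageEntry {
        final String title;                 // named URL (e.g., "Gifts and Flowers")
        final String pageUrl;               // the Web page to open on selection
        final List<String> thumbnailUrls;   // images extracted from (or assigned to) the page
        final Date lastModified;

        PageEntry(String title, String pageUrl, List<String> thumbnailUrls, Date lastModified) {
            this.title = title;
            this.pageUrl = pageUrl;
            this.thumbnailUrls = thumbnailUrls;
            this.lastModified = lastModified;
        }
    }

    class Track {
        final String name;                  // e.g., "Shop Online Service"
        final List<PageEntry> entries = new ArrayList<>();

        Track(String name) { this.name = name; }

        // Sequence type (3): a different sorting of the same page set.
        Track sortedByLastModified() {
            Track t = new Track(name + " (by date)");
            t.entries.addAll(entries);
            t.entries.sort(Comparator.comparing((PageEntry e) -> e.lastModified));
            return t;
        }
    }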
Figure 3: Architecture of the PolyNav system. (Diagram labels: PolyNav Previewer, Servers, Launcher, PN_Server, Crawler Service, Query Service, Thumbnailer, WWW; PNF file spec., Site URL / Query, PNF, Image requests, Thumbnails / Collages, HTTP.)
The PNF file is then processed by a service that creates individual image thumbnails as well as collage thumbnails (all images from each page URL combined into one image). The PNF file is now ready for use by one of the two PolyNav preview clients, or it may be processed further for the purposes of individual applications. The Web crawler service follows links originating from one or more start URLs and generates a PNF file to a specifiable depth. Unfortunately, all this preprocessing takes time. Our specification tools are used at present to set up application prototypes that access a predefined Web space at runtime; however, we also envision a batch-mode service in which a PolyNav space is created on demand for users who may be willing to wait for its completion.
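As a rough illustration of the crawler service's job (following links from a start URL to a bounded depth and collecting the page/image associations that go into a PNF file), the sketch below uses the jsoup HTML parser, a library we assume here for convenience. It is a simplified, hypothetical rendering rather than the PN_Server implementation, and it omits the image filtering and thumbnail steps described below.

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import java.util.*;

    // Simplified bounded-depth crawl: for each visited page, record its image URLs.
    // Illustrative sketch only, not the actual PN_Server crawler.
    class CrawlSketch {
        static Map<String, List<String>> crawl(String startUrl, int maxDepth) {
            Map<String, List<String>> pageImages = new LinkedHashMap<>();
            Deque<String[]> queue = new ArrayDeque<>();
            queue.add(new String[] { startUrl, "0" });
            while (!queue.isEmpty()) {
                String[] item = queue.poll();
                String url = item[0];
                int depth = Integer.parseInt(item[1]);
                if (pageImages.containsKey(url) || depth > maxDepth) continue;
                try {
                    Document doc = Jsoup.connect(url).get();
                    List<String> images = new ArrayList<>();
                    for (Element img : doc.select("img[src]")) {
                        images.add(img.absUrl("src"));   // absolute image URL
                    }
                    pageImages.put(url, images);
                    for (Element a : doc.select("a[href]")) {
                        queue.add(new String[] { a.absUrl("href"), Integer.toString(depth + 1) });
                    }
                } catch (Exception e) {
                    // skip pages that fail to load or parse
                }
            }
            return pageImages;
        }
    }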
Image processing

If the image-based navigator is to be successful for the previewing task, it must perform at a rate that can take advantage of the very rapid rate at which humans are able to process images.
Unfortunately, while an image may have a small display area, the data size of the image may be comparable to that of the proverbial thousand words, and this large data size is the main contributor to a slow display. This motivated us to explore techniques for drastically reducing the latency between the time a user selects a sequence and the display of the relevant images. In the current prototype, we pregenerate thumbnail images that are used in place of the originals and store these thumbnails within our server. Thus we eliminate the need to download an image from remote servers more than once and reduce the size of the data that needs to be transferred to the navigator. We have also experimented with pregenerating a single collage of all thumbnails for each page, essentially doing the same work that would be done within the navigator client, but here we can afford to take the time necessary to lay out the images more optimally. Furthermore, we reduce the time it takes a navigator to retrieve one condensed collage versus several independent thumbnail images. When the PNF file is generated, in place of the URL of each full-sized image we substitute a modified form of the URL, which we call the thumbnail URL. The thumbnails for all images in a PNF file are produced using our thumbnail generation service.
This service fetches each original full-sized image and determines whether and how much to reduce the image. The reduction factor is computed differently for each image, but the resulting thumbnail is restricted to a maximum size. Images that are already small, based on square area, are not reduced at all, and the larger an image is, the more it is reduced. The generator then applies the reduction, if any, and stores the result in a file, where it will remain archived until manually removed. The thumbnail URL for an image resolves directly to the file that holds the thumbnail. In some applications of our navigator tools, particularly those involving a user-specified query, we must dynamically generate the PNF file and possibly request images from remote servers. We have begun to explore how to better support this kind of application by using a cache together with the thumbnail generation service. When an image is requested by a navigator via the corresponding thumbnail URL, the server that gets the request is really a caching server that acts as a front for the real thumbnail server. If the requested thumbnail is still in the cache from a previous request, it is returned immediately. Otherwise, the caching server forwards the request on to the real thumbnail server, which generates the thumbnail at that time just as described above. The initial generation of each thumbnail does take significant time, but thumbnails are then cached for future access. Thumbnails that are not used much may be flushed from the cache to make room for others.
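The reduction step itself can be sketched in a few lines of Java. The following is a hypothetical illustration of the policy described above (small images pass through unchanged; larger images are scaled down more aggressively, subject to a maximum thumbnail dimension); the threshold and size constants are ours, not values used by our service.

    import java.awt.Graphics2D;
    import java.awt.RenderingHints;
    import java.awt.image.BufferedImage;

    // Illustrative thumbnail reduction policy; constants are invented for the sketch.
    class ThumbnailSketch {
        static final int SMALL_AREA = 96 * 96;   // images at or below this area pass through
        static final int MAX_SIDE   = 128;       // maximum width/height of a thumbnail

        static BufferedImage reduce(BufferedImage original) {
            int w = original.getWidth(), h = original.getHeight();
            if (w * h <= SMALL_AREA) {
                return original;                  // already small: no reduction
            }
            // Larger images are reduced more: scale so the longer side fits MAX_SIDE.
            double scale = (double) MAX_SIDE / Math.max(w, h);
            int tw = Math.max(1, (int) Math.round(w * scale));
            int th = Math.max(1, (int) Math.round(h * scale));
            BufferedImage thumb = new BufferedImage(tw, th, BufferedImage.TYPE_INT_RGB);
            Graphics2D g = thumb.createGraphics();
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                               RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(original, 0, 0, tw, th, null);
            g.dispose();
            return thumb;
        }
    }

A caching front end of the kind described above would simply look up the thumbnail URL in its store before invoking a reduction routine like this one.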
DISCUSSION

Applications using the PolyNav tools are prototypes whose designs are still evolving. Nevertheless, we have reached the stage where we could run some preliminary testing of the main concept. We have conducted initial informal user tests of the Java client for the task of previewing a large Web site. Five out of six subjects (none of them GTE employees, all with some previous Web experience) commented positively on the service. All were intrigued with the presentation itself, although there was some initial confusion about what they were witnessing. It was suggested that the controls are in need of revision. We were able to conclude from these initial experiments that rapid-fire image previews can be an intriguing and engaging experience for users, but that further refinement of the interface is necessary. In the VRML client, we still face design trade-offs associated with the "heads-up display", navigation in a virtual 3D space, and the integration of multimedia types. The implementation also suffers from the fact that current VRML browsers do not afford the programmer the hooks to control incremental loading of images and other media objects. Such control is a requirement to meet our goal of rapidly previewing many hundreds of such objects in a session lasting a minute or less.
Among the outstanding application issues is the question of exactly when images are most useful for previewing purposes. It is clear that images that happen to appear on the Web at large do not always convey useful information about their host pages. Obviously, there is a good deal of variation in the amount and use of images, as well as in the professionalism of the graphic design when there are images. Our tools currently employ some simple heuristics for filtering out some of the images on Web pages, for instance navigational bars, ads, and small decorative images (bullets, lines, etc.); a sketch of such a filter appears below. We also expect that in many cases there will be a human in the loop to verify choices off-line. But Web page layout as a whole may be as important in conveying a visual message as individual images extracted from the pages and displayed in rapid succession. If, on the other hand, Web providers included a manually crafted thumbnail as part of each page's metadata, there might be no need for automated techniques for image selection.

Other issues remain in creating the higher-level structures for previewing. So far, our navigator clients expect one or more sequences of URLs, each of which is paired with one or more images. But how should such sequences be defined? What relationship do they bear, say, to hyperlink structures or other kinds of information clustering operations? We have only begun to address these questions, but our intention is to investigate sequences that can change dynamically under the control of the user.
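The kind of heuristics we have in mind can be illustrated with a short Java predicate. The rules and thresholds below are hypothetical examples of such filtering, not the exact rules used by our tools: very small or very elongated images (bullets, rules, navigation bars) and images whose URLs suggest advertising are dropped.

    // Illustrative image-filtering heuristics; thresholds and keywords are examples only.
    class ImageFilterSketch {
        static boolean keep(String imageUrl, int width, int height) {
            if (width < 32 || height < 32) return false;                   // bullets, lines, icons
            double aspect = (double) Math.max(width, height) / Math.min(width, height);
            if (aspect > 6.0) return false;                                // banners, navigation bars, rules
            String u = imageUrl.toLowerCase();
            if (u.contains("/ads/") || u.contains("banner")) return false; // likely advertising
            return true;                                                   // plausible content image
        }
    }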
CONCLUSION

Problems to be solved in designing navigable interfaces for abstract information spaces include the problem of revealing information about the objects reachable in a single navigational move. With conventional HTML methods, only a relatively small number of next moves can be reasonably accommodated. The work reported here offers some ideas for how users might see farther and faster into an information space to which they may potentially jump in a single move. We have presented an overview of a collection of tools for image-based previewing of an information space. Our hypothesis is that users will find a rapidly presented sequence of images both enjoyable and useful for previewing their next move. Our prototypes include two image-previewing clients, one written in Java and the second in VRML, that allow users control over the image stream. Both client programs run with today's commercial Web browsers. We also discussed the preprocessing services needed to prepare a Web space for previewing with these tools. Many issues remain, but preliminary user tests indicate that the concept is engaging and appealing.
ACKNOWLEDGMENTS
We thank John Vittal for his support of this work under the Advanced Interactive Internet Technologies project of the Advanced Systems Lab of GTE Laboratories. The comments of an anonymous reviewer were instrumental in revising our thinking about the relationship of this work to Micronesian navigation practices. Demetrios Karis and John Huitema organized and conducted the user testing.
REFERENCES
1. Andrews, W. Alternatives to Hit Lists Include Ability to 'Fly' Through Data. Web Week, September 15, 1997.
2. Bederson, B. B., and J. Hollan. Pad++: A zooming graphical interface for exploring alternate interface physics. In Proceedings of ACM UIST '94 (Marina del Rey, CA, 1994), ACM Press, pp. 17-26.
3. Downs, R. M., and D. Stea. Maps in Minds: Reflections on Cognitive Mapping. Harper and Row, 1977.
4. Furnas, G. W. Effective view navigation. In CHI '97: Human Factors in Computing Systems (Atlanta, GA, USA, March 1997), ACM Press, pp. 367-374.
5. Haber, R. N., and M. Hershenson. The Psychology of Visual Perception. Holt, Rinehart and Winston, 1980.
6. Helfman, J. Passive Surfing in the Communal Cache. Talk presented at the Human Computer Interaction Consortium Workshop, February 1996. [See http://cs.unm.edu/~jon/montage/.]
7. Hollan, J. D., B. B. Bederson, and J. I. Helfman. Information Visualization. In M. G. Helander, T. K. Landauer, and P. V. Prabhu (eds.), Handbook of Human-Computer Interaction, North Holland, pp. 33-48, 1997.
8. Hutchins, E. Understanding Micronesian Navigation. In D. Gentner and A. L. Stevens (eds.), Mental Models. Erlbaum, 1983.
9. Jul, S., and G. W. Furnas. Navigation in Electronic Worlds: Workshop Report. SIGCHI Bulletin, 29:4, October 1997, pp. 44-49.
10. Kerne, A. Collage Machine: temporality and indeterminacy in media browsing via interface ecology. In CHI '97: Human Factors in Computing Systems: Extended Abstracts, ACM Press, pp. 297-298.
11. Pirolli, P. Computational models of information scent-following in a very large browsable text collection. In CHI '97: Human Factors in Computing Systems (Atlanta, GA, USA, March 1997), ACM Press, pp. 3-10.