Interface Issues for Interactive Navigation and Browsing of Recorded Lectures and Presentations

Wolfgang Hürst, Georg Götz
Institut für Informatik, Albert-Ludwigs-Universität Freiburg, D-79110 Freiburg, Germany
[email protected],
[email protected]
Abstract: Being able to visually scan a document that contains a recorded lecture is essential for the usability of such files. In addition, flexible and easy interaction and navigation functionality should be provided to students in order to improve the overall learning experience. In this paper we discuss the issues involved in these tasks and present a new interaction design that considerably improves usability. First, we describe common approaches for browsing, navigating, and reviewing recorded lectures. Then we introduce a new interface design which supports higher interactivity and solves some of the most common problems that regular approaches have with these tasks.
Motivation – Why Navigation and Browsing of Recorded Lectures is Important

Automatic lecture recording and presentation capturing have gained increasing popularity over the last couple of years. Today, many commercial tools exist that make automatic lecture recording an easy, straightforward task. Lecturers at many universities have begun to record their courses on a routine basis, thus producing a tremendous number of multimedia documents that students can use for further learning. Many of the commercial systems available today have their roots in earlier research projects related to automatic lecture recording. Most of these projects started in the mid-1990s, for example the Cornell/Berkeley Lecture Browser (Mukhopadhyay and Smith (1999)), the Classroom 2000/eClass project (Brotherton and Abowd (2004)), or the Authoring on the Fly system (AOF, Müller and Ottmann (2000)), which was developed by our research group. Over the years, those research projects have not only come up with new solutions for the automatic capturing of presentations but have also gained a lot of experience with how students use the produced files for further learning and studying. For example, at our university many students use them to repeat selected parts of a lecture when doing their homework or when preparing for an exam. One important aspect of this usage is that students usually do not continuously replay the whole recording of a presentation but instead use it very selectively, for example, by reviewing only the parts they have not understood completely or that are necessary to solve a particular exercise (Zupancic and Horz (2002)). In fact, using digital multimedia documents for learning appears to be similar to working with traditional media, for example, books. When reading a novel, one usually starts with the first page and continues reading page by page until the end.
In the same way, users watch, for example, a movie on a DVD linearly from beginning to end without much interaction during replay. On the other hand, when a textbook is used for learning, its content is usually processed more selectively: Chapters of particular interest are accessed directly, parts of minor interest are scanned quickly or skipped completely, while more complicated topics are re-read over and over again, sometimes sentence by sentence, and so on. Similarly, students want to “work” with multimedia data, such as lecture recordings, when they use it for learning. They want to quickly skim the file in order to locate parts of particular relevance and to skip portions of minor interest. They need to be able to go back and replay a part they just listened to but did not understand. They want to examine complicated parts of the content in more detail than other portions of the file, and so on. Consequently, system designers have to offer comfortable, flexible, and easy browsing, navigation, and interaction support in the player software in order to provide the highest possible usability for the students.
Common Approaches for Visual Navigation and Browsing of Lecture Recordings

Most systems for replay of recorded presentations offer a thumbnail overview of the slides used in the lecture. Each of these thumbnails is usually linked to the corresponding position in the multimedia file that contains the recording (compare Figure 1a). Much as static keyframes are often used to facilitate browsing and navigation of video data (compare Girgensohn et al. (2001), for example), thumbnails of slides can be used to quickly scan the recorded lecture’s content, to identify relevant parts, and to access them directly by clicking on the respective thumbnail icon. However, such approaches have significant disadvantages. The reduced size of the slides impairs readability, sometimes making it hard to identify their content. In addition, thumbnails can only provide a static representation of the continuous media signal. This is no problem if the visual data stream of the lecture recording contains only slides whose content stays static between two slide transitions. However, the full power of modern systems for presentation recording lies in their ability to record continuously changing information as well, such as handwritten annotations made on a slide. A static representation of these annotations removes information contained in the original signal and makes browsing of the content much harder. For example, in the case illustrated in Figure 1, it is impossible to decide in which order the annotations on the graphics were made by just looking at the static representation, because the temporal information has been removed. However, this order might be important, if not essential, for the learning process of the students. Perhaps the most critical disadvantage of thumbnail representations is that they restrict navigation to specific entry points of the file, namely the transitions from one slide to another. Consider a situation where a student wants to go back and repeat the last one or two sentences that have just been replayed. If navigation is only supported via thumbnails, one has to go back to the position where the corresponding slide first appeared on the screen, which in the worst case might be as much as several minutes earlier. This is certainly not a comfortable and flexible way to interact with the system or the corresponding data.
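The restriction to slide transitions as entry points can be made concrete with a small sketch. The slide start times and the function name `thumbnail_entry_point` below are hypothetical and chosen only for illustration; they are not taken from the player software. The sketch shows how far back a thumbnail click lands when a student only wants to repeat the last ten seconds.

```python
from bisect import bisect_right


def thumbnail_entry_point(slide_starts, current_time):
    """Return the start time of the slide that is visible at current_time.

    slide_starts must be sorted in ascending order (seconds).
    """
    index = bisect_right(slide_starts, current_time) - 1
    return slide_starts[max(index, 0)]


# Hypothetical slide transition times in seconds (assumed values).
slide_starts = [0.0, 95.0, 410.0, 780.0]

current_time = 700.0                  # replay position when the student stops
wanted_time = current_time - 10.0     # student wants to repeat the last 10 s
entry_time = thumbnail_entry_point(slide_starts, current_time)

# Extra material that has to be replayed (or skipped manually) because the
# only available entry point is the start of the current slide.
extra_replay = wanted_time - entry_time

print(f"thumbnail jumps to {entry_time:.0f} s, "
      f"{extra_replay:.0f} s before the wanted position")
```

With these assumed values the only reachable entry point lies several minutes before the position the student actually wants.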
(a) Thumbnails of the slides: They enable quick scanning of the document’s content and navigation between single slides by clicking on the corresponding thumbnail icons.
(b) Time-based slider interface: Moving the slider knob immediately displays any change of the visual data stream due to the real-time random accessibility provided by the synchronization model.
Figure 1: Original interface design of our player software which offers different possibilities to browse and navigate through a lecture recording, for example, (a) by using the thumbnail representation of the slides or (b) by using the time-based slider that offers real-time random access to any position within the document.
(a) Scaling problem: Moving the slider one pixel results in a large jump in the document due to their different lengths.
(b) Zooming of the slider’s scale: Solves the scaling problem but pushes the slider scale over the window borders and requires additional interaction from the user.
Figure 2: Illustration of the scaling problem which appears when a small slider is mapped to a long document.

In order to provide a more flexible and interactive interface, we introduced a special synchronization model for replay of the lecture recordings (see Hürst and Müller (1999)) whose main features include real-time random access to any position within the data streams. Being able to access any position in real time, we can provide a time-based slider in the player software that allows a user to visually browse through the document’s content in a similar way to using a scrollbar for navigation in text documents. That is, every visual change in the data stream is displayed immediately in the replay window once a user moves the slider knob along the time line that represents the document’s length (compare Figure 1b). Due to the real-time random access functionality provided by the underlying synchronization model, this offers a very convenient and flexible way of browsing and navigating the recorded lecture’s content. We used this design in various classes, and students highly appreciated the easy and intuitive possibility to work with the data in a more interactive way. However, there is one significant problem when using sliders for visual data browsing and navigation (not only with recorded lectures but with any kind of document): While documents can be arbitrarily long, sliders are restricted by the size of the corresponding window as well as by the screen resolution. As a result, the smallest unit by which the slider knob can be moved on the screen, i.e. one pixel, can correspond to a relatively large step in the document (compare the illustration in Figure 2a). If the number of pixels on the slider’s scale is less than the number of units in the document, scrolling produces a jerky rather than smooth visual representation of the content. Users generally dislike such jerky behavior and consider it very disturbing.
In the worst case, the resulting jumps in the document may even be so large that particular positions of interest cannot be accessed with the slider interface at all. For example, the slide presented in Figure 1 contains a lot of very small annotations. Accessing any of them directly by moving the slider knob might be impossible due to the document’s length, which in this case is more than 60 minutes. This problem is well known in the context of scrollbars and static, textual data, and different approaches have been proposed to solve it (see, for example, Ahlberg and Shneiderman (1994) or Ayatsuka et al. (1998)). However, only a few attempts have been made so far to solve this problem in the context of continuous, time-dependent media streams (see, for example, Richter et al. (1999) or Ramos and Balakrishnan (2003)). Most of them rely on some kind of zooming of the slider’s scale (compare Figure 2b). While this allows one to navigate slowly within a small range, it is sometimes considered less intuitive since the overview of the whole document gets lost. For example, if the slider’s scale is enlarged (and thus extends beyond the borders of the window), it is much harder to tell whether the currently displayed part of the document is in the first, second, third, or fourth quarter of the file. Most importantly, such approaches make interaction more complex since they usually require operations in addition to the actual navigation, namely the zooming that modifies the slider’s scale. Moreover, given a document and a target position, users have to decide which level of granularity of the slider’s scale is the most appropriate in order to access this position in the best possible way, which is not an easy or straightforward task.
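The scaling problem and the cost of the zooming approach can be quantified with a short calculation. The following sketch uses the 60-minute document length mentioned above; the slider length of 600 pixels and the target step of half a second are assumptions made only for this example.

```python
import math


def seconds_per_pixel(duration_s, slider_px):
    """Document time covered by moving the slider knob by one pixel."""
    return duration_s / slider_px


def zoomed_scale_px(duration_s, target_step_s):
    """Scale length in pixels needed so that one pixel equals target_step_s."""
    return math.ceil(duration_s / target_step_s)


duration_s = 60 * 60      # 60-minute recording
slider_px = 600           # assumed slider length on screen
target_step_s = 0.5       # assumed step size for smooth browsing

step = seconds_per_pixel(duration_s, slider_px)
needed_px = zoomed_scale_px(duration_s, target_step_s)

print(f"one pixel corresponds to {step:.1f} s of the recording")
print(f"a zoomed scale would need {needed_px} px, "
      f"{needed_px / slider_px:.0f} times the slider length")
```

Under these assumptions, any annotation that is drawn in less time than one pixel step cannot be targeted reliably with the plain slider, and a zoomed scale fine enough to reach it extends far beyond the window borders.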
(a) Elastic interfaces: 1.) User clicks on the scale instead of grabbing the knob. 2.) Distance between knob and pointer is large => faster scrolling. 3.) Distance between knob and pointer is small => slower scrolling.
(b) Mapping of pointer/knob distance to browsing speed (axes: distance between knob and pointer, speed).
Figure 3: Illustration of the basic idea of the concept of elastic interfaces.
(a) The initial clicking position is associated with the current position of the original slider knob.
(b) Moving the pointer to the left or right initiates backward or forward browsing, respectively.
(c) The mapping of virtual to actual slider scale can result in a rescaling on both sides of the knob.
Figure 4: Illustration of the elastic panning approach.

For this reason, we introduce a new approach for visual data browsing called elastic panning that offers an easier, more interactive and flexible way to navigate and browse the data. It builds on the concept of elastic interfaces. This concept was originally introduced by Masui et al. (1995) in order to solve the scaling problem illustrated in Figure 2a for discrete, time-independent data such as text, images, or graphics. Its basic idea is that one does not move the slider knob directly but clicks next to it on the slider’s scale, and the knob starts moving automatically with a speed that depends on the distance between the knob and the mouse pointer: if this distance is larger, scrolling becomes faster; if it gets smaller, scrolling slows down, as illustrated in Figure 3a. The distance is usually mapped linearly to the corresponding scrolling speed (compare Figure 3b). The approach is called elastic because the connection between slider knob and mouse pointer can be interpreted as a rubber band: If the rubber band is stretched, the force that pulls the slider knob towards the mouse pointer gets stronger, thus resulting in a faster movement of the knob. If the mouse pointer is moved towards the slider knob (or when the knob moves towards the mouse pointer, which is held at a fixed position by the user), the tension on the rubber band decreases, which results in a slower movement of the slider knob. In Hürst et al. (2004) we showed that this idea is not only feasible for discrete, time-independent data, but can be applied to continuous, time-dependent media streams as well. In the following, we describe how this concept can be used for navigation and browsing of lecture recordings in order to provide an easy and comfortable interaction mode that circumvents the problems occurring with standard slider interfaces.
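The linear mapping from pointer/knob distance to scrolling speed described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in our player software: the function names, the gain factor, the update interval, and the slider length are assumptions made for the example.

```python
def knob_position_px(time_s, duration_s, slider_px):
    """Pixel position of the knob that represents document time time_s."""
    return time_s / duration_s * slider_px


def elastic_step(time_s, pointer_px, duration_s, slider_px, gain=1.0, dt=0.04):
    """Advance the document position by one update of an elastic slider.

    The scrolling speed (document seconds per wall-clock second) is linear
    in the signed distance between pointer and knob. gain and dt are
    assumed tuning values.
    """
    distance_px = pointer_px - knob_position_px(time_s, duration_s, slider_px)
    speed = gain * distance_px
    new_time = time_s + speed * dt
    return min(max(new_time, 0.0), duration_s)


duration_s = 3600.0   # 60-minute recording
slider_px = 600       # assumed slider length; the knob at 1800 s sits at 300 px

# Pointer one pixel to the right of the knob: slow, fine-grained scrolling.
slow = elastic_step(1800.0, 301, duration_s, slider_px)
# Pointer 100 pixels to the right of the knob: fast scrolling.
fast = elastic_step(1800.0, 400, duration_s, slider_px)

print(f"small distance: advanced by {slow - 1800.0:.2f} s per update")
print(f"large distance: advanced by {fast - 1800.0:.2f} s per update")
```

Because the pointer distance controls a speed rather than a position, the step per update can be far smaller than the document time that a single slider pixel covers, which is why the approach sidesteps the scaling problem.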
Elastic Panning for Flexible and Interactive Lecture Browsing

In order to start navigation in the lecture recordings with our approach, called elastic panning, users initially click anywhere on the screen. This initial clicking position is associated with the current position of the slider knob on the corresponding time line (and thus with the currently displayed content of the document), as illustrated in Figure 4a. Moving the mouse pointer to the left or right starts scrolling along a virtual slider scale which extends to both sides of the slider knob’s representation in the replay window (compare Figure 4b). The scrolling behavior is similar to that of the elastic interfaces illustrated in Figure 3, i.e., if the mouse pointer moves further away from the slider knob, the knob follows with a speed that is proportional to the distance between the pointer and the knob. That way, a user is able to browse through a document at a very high speed (for example, to skip a larger part of minor interest), as well as to navigate very slowly (for example, to go back just a few seconds in order to replay the last one or two sentences). In contrast to a regular slider, this slow scrolling behavior can be achieved independent of the document’s length
because of the elastic interface concept. Moving the mouse pointer up or down moves the visualization of the virtual slider on the screen but does not influence the overall scrolling behavior. The two ends of this virtual slider scale are mapped to the left and right window borders, respectively. It should be noted that mapping the initial clicking position to the position of the original slider knob at that time and mapping the document borders to the window borders can result in a modification of the slider’s scale, as illustrated in Figure 4c. However, such a rescaling is uncritical because the concept of elastic interfaces makes scrolling independent of the document’s length and thus of the slider’s scale. Hence, this rescaling does not influence the overall browsing behavior. However, we realized that its visualization confuses users, especially when the scrolling direction changes. For this reason, we only visualize one side of the virtual slider scale in the final implementation in our player software, i.e., the one in the current scrolling direction, as can be seen in Figure 5, which contains a snapshot of the actual implementation. In addition, we do not show the scale explicitly but use relative position labels instead, which indicate the initial clicking position, the current position, and the end of the document (or its beginning, depending on the scrolling direction). First feedback from users about both the interaction experience and the visualization was very positive. They highly appreciated the smooth scrolling behavior compared to the original slider interface. Being able to navigate through the document at any speed, independent of its length, is a particular advantage when the corresponding lecture contains a lot of handwritten annotations such as the examples shown in Figures 1 and 5.
While using a standard slider in such a situation results in a jerky, unpleasant visualization, elastic panning improves the overall visualization and thus the interaction experience by realizing a much smoother display of the content during scrolling. With elastic panning, users can accurately access any small part of the annotations and replay the corresponding part of the lecture immediately. While the small annotations made on the graphics on the slide shown in Figure 1 are hard, if not impossible, to access directly with the original slider (due to the scaling problem discussed above), they can be accessed very easily using elastic panning. In addition, users liked the idea of being able to click directly on the document’s surface for navigation instead of being forced to continuously switch between the replay window and the corresponding slider widget when browsing the document’s content.

Figure 5: New interface design of our player software which offers the possibility of elastic panning in addition to browsing and navigating the content by using the slides’ thumbnails or the time-based slider, respectively. The labels in the snapshot mark the initial clicking position, the current position (associated with the original slider knob), and the end of the slider or document, respectively.
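The mapping between the initial clicking position, the window borders, and the document borders described in this section can be sketched as follows. This is one possible reading under stated assumptions, not the code of our player software: the piecewise linear virtual scale, the function names, the gain factor, the update interval, and the window width are all chosen for illustration. Only the horizontal pointer coordinate is used, since vertical movement does not influence scrolling.

```python
def time_to_x(time_s, click_x, click_time_s, duration_s, window_px):
    """Map a document time to an x position on the virtual slider scale.

    The clicking position click_x corresponds to click_time_s, the left
    window border to time 0 and the right border to duration_s.
    """
    if time_s <= click_time_s:
        if click_time_s == 0:
            return float(click_x)
        return click_x * time_s / click_time_s
    right_px = window_px - click_x
    return click_x + right_px * (time_s - click_time_s) / (duration_s - click_time_s)


def pan_step(time_s, pointer_x, click_x, click_time_s, duration_s, window_px,
             gain=1.0, dt=0.04):
    """One update of elastic panning: speed is linear in pointer/knob distance."""
    knob_x = time_to_x(time_s, click_x, click_time_s, duration_s, window_px)
    speed = gain * (pointer_x - knob_x)   # document seconds per second (assumed unit)
    return min(max(time_s + speed * dt, 0.0), duration_s)


duration_s = 3600.0      # 60-minute recording
window_px = 800          # assumed width of the replay window
click_x = 200            # user clicks here ...
click_time_s = 2700.0    # ... while the replay is at 45 minutes

# The two sides of the virtual scale get different resolutions (rescaling).
left_s_per_px = click_time_s / click_x
right_s_per_px = (duration_s - click_time_s) / (window_px - click_x)

# Pointer moved 10 px to the left of the click: slow backward browsing.
after_one_update = pan_step(click_time_s, 190, click_x, click_time_s,
                            duration_s, window_px)

print(f"scale left of click: {left_s_per_px:.1f} s/px, "
      f"right of click: {right_s_per_px:.1f} s/px")
print(f"position after one update: {after_one_update:.1f} s")
```

The differing resolutions on the two sides correspond to the rescaling illustrated in Figure 4c. In this sketch the rescaling changes only where the knob is drawn, while the step per update still depends on the pointer distance alone.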
Summary

At our university, we have been using automatic lecture recording for a couple of years now. Based on this experience, we identified a strong need on the students’ side for comfortable and flexible interaction functionality during replay of the corresponding files. Students need to be able to browse and scan the recorded lectures in an easy and intuitive way in order to gain the highest possible benefit from them during their learning experience. In this paper, we reviewed common approaches for navigation in recorded lectures, identified their weaknesses, and introduced a new, alternative interface design which improves the interaction with the document’s content, thus enabling users to really “work” with the files in a similar way to using textbooks for learning. The final interface design realized in our player software (compare Figure 5) offers students various ways to browse and interact with the data: Thumbnails of the slides can be used to get a quick, rough overview of the content and to access a recording at a coarser level, i.e. at the slide transitions. Using a regular time-based slider widget, users can visually navigate and browse through the document thanks to the real-time random access feature realized by the underlying synchronization model. In addition, we introduced a new way of interaction called elastic panning, which is especially useful if access to the data with a regular slider interface fails or if the movements become too jerky (both due to the length of the document and the resulting scaling problem between slider and document, as illustrated in Figure 2a). While first user feedback has been very positive, a detailed empirical evaluation of its performance is part of our future research.

References

Ahlberg, C. and Shneiderman, B. (1994). The Alphaslider: A compact and rapid selector. In Proceedings of the SIGCHI conference on Human factors in computing systems, ACM Press, pp. 365-371.
Ayatsuka, Y., Rekimoto, J., and Matsuoka, S. (1998). Popup Vernier: A tool for sub-pixel-pitch dragging with smooth mode transition. In Proceedings of the 11th annual ACM symposium on User interface software and technology, ACM Press, pp. 39-48.
Brotherton, J. and Abowd, G. (2004). Lessons learned from eClass: Assessing automated capture and access in the classroom. In ACM Transactions on Computer-Human Interaction (ToCHI). To appear Spring/Summer 2004.
Girgensohn, A., Boreczky, J., and Wilcox, L. (2001). Keyframe-Based User Interfaces for Digital Video. In IEEE Computer, Vol. 34(9), pp. 61-67.
Hürst, W. and Müller, R. (1999). A synchronization model for recorded presentations and its relevance for information retrieval. In Proceedings of ACM Multimedia ’99, ACM Press, pp. 333-342.
Hürst, W., Götz, G., and Lauer, T. (2004). New methods for visual information seeking through video browsing. In Proceedings of the 8th International Conference on Information Visualisation, IV04, London, UK (to appear, July 2004).
Masui, T., Kashiwagi, K., and Borden IV, G.R. (1995). Elastic graphical interfaces for precise data manipulation. In Conference companion on Human factors in computing systems, ACM Press, pp. 143-144.
Mukhopadhyay, S. and Smith, B. (1999). Passive capture and structuring of lectures. In Proceedings of ACM Multimedia ’99, ACM Press, pp. 477-487.
Müller, R. and Ottmann, T. (2000). The “Authoring on the fly” system for automated recording and replay of (tele)presentations. In Special Issue on “Multimedia Authoring and Presentation Techniques” of ACM/Springer Multimedia Systems Journal, Vol. 8(3), pp. 158-176.
Ramos, G. and Balakrishnan, R. (2003). Fluid interaction techniques for the control and annotation of digital video. In Proceedings of the 16th annual ACM symposium on User interface software and technology, ACM Press, pp. 105-114.
Richter, H., Brotherton, J., Abowd, G., and Truong, T. (1999). A Multi-Scale Timeline Slider for Stream Visualization and Control. GVU Center, Georgia Institute of Technology, Technical Report GIT-GVU-99-30, June 1999.
Zupancic, B. and Horz, H. (2002). Lecture Recordings and its use in a traditional university course. In Proceedings of the 7th International Conference on Innovation and technology in computer science education (ITiCSE), Aarhus, Denmark, ACM Press, pp. 24-28.
Acknowledgments. The work presented in this paper was supported by the German Research Foundation (DFG) as part of the research initiative V3D2 ("Distributed Processing and Delivery of Digital Documents").