FlipPresentations.com - Lecture Capture with Portable ("Flip") Camera, Intelligent Automated Processing, and a Customizable Web 2.0 Presentation (Draft)
Eric M. Ssebanakitta, W. Richards Adrion, and Paul E. Dickson
University of Massachusetts Amherst and Hampshire College
Abstract
Educators and business presenters frequently rely on lecture-style interaction with their listeners to get an important point across. The tools they use are often rudimentary, such as PowerPoint and whiteboards. The apparent lack of efficient, sophisticated technology for facilitating this communication motivates a new system. FlipPresentations.com is designed to meet the following objectives: presentations are delivered with an ease and flow that hold the attention of the audience, and they are formatted automatically, quickly, and intelligently, in a form that is easily accessible to the viewer. These qualities are valuable both to those delivering presentations and to those listening to them, but they are not the system's only virtues. The system is also very affordable, requiring only a low-cost digital video camera, and it provides fully indexed visual aids. In summary, FlipPresentations offers capabilities otherwise found only in the most sophisticated systems, at an affordable price, which gives presenters a compelling reason to adopt it in place of current presentation aids. As researchers, we believe that our system will benefit the educational community.

I. Introduction
The most distinguishing feature of FlipPresentations is that it is not a simple ‘record-and-playback’ system: it stores a wealth of presentation-relevant material that both the presenter and the audience can easily access.
To give the reader a clear idea of what FlipPresentations accomplishes, we list its salient attributes. The system captures the information relevant to a presentation in a more visually appealing and succinct form than a PowerPoint or whiteboard presentation alone. It works with a portable digital camera (we prefer the ~$200 Flip camera (c) by Cisco Systems, Inc.), which makes capture simpler and faster than with the tools conventional lecturers typically use. In addition, the system processes and compresses the captured material quickly and precisely. Its indexing features let the viewer retrieve the desired information quickly and with very few errors. A Web 2.0 user interface allows the presenter and the audience to interact with an ease and fluidity comparable to that of face-to-face discourse. Additionally, we have developed an iPhone application that streams these presentations to iPhones and iPod touches, allowing more flexible use. The sections below address the research problems that arise in building such a system.

II. Data Accumulation
Since the common classroom and lecture presentation tools are PowerPoint (PPT) or PDF slides, overhead projections, and blackboards or whiteboards, virtually all presentations can be captured quickly and effectively with a hand-held digital camera by a single operator of any skill level. The operator is not required to perform tedious logistical and set-up tasks. The presentation video and the presenter's slides can be uploaded to our server with relative ease, so presentations are always available to a user. Our system works with a slides-only presentation, a whiteboard-only presentation, an audio/video-only presentation, or any combination of them. These properties support the claim that FlipPresentations excels at storing information and presenting it to viewers in an easily accessible manner.

III. Media Processing
We employ several computer vision and machine learning techniques in processing the captured video, many of which were developed by other researchers such as the UMass RIPPLES group [7, 16], as well as several open source tools such as FFmpeg. The lecture video is digitally enhanced to improve the audio, remove noise, correct radial and perspective distortion in the video frames, and improve video quality before it is converted to Flash format. We convert to Flash because most web browsers can play Flash video more easily than other formats. Crucial points in the lecture are also identified and extracted to form an index for navigating the presentation. If the lecture slides (which can be in any format, including PPT and PDF) are uploaded, they are also processed into an image slide show for use in the web application.
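As a rough illustration of how index points might be extracted from the captured video, the sketch below marks a new index entry whenever consecutive frames differ sharply, as happens when a slide changes or a board is erased. This is not the method used in our system, which is not detailed here. The frame representation (a flat list of grayscale values), the function names, and the threshold and minimum-gap defaults are all assumptions made for illustration.

```python
from typing import List, Sequence

# Hypothetical representation: one grayscale frame as a flat list of 0-255 values.
Frame = Sequence[int]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    if len(a) != len(b) or not a:
        raise ValueError("frames must be non-empty and the same size")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def find_index_points(frames: List[Frame], fps: float,
                      threshold: float = 20.0,
                      min_gap_s: float = 5.0) -> List[float]:
    """Return timestamps (in seconds) where the picture changes sharply.

    threshold and min_gap_s are illustrative defaults, not values from the paper.
    """
    if not frames:
        return []
    index = [0.0]  # the start of the lecture is always an index point
    for i in range(1, len(frames)):
        t = i / fps
        changed = mean_abs_diff(frames[i - 1], frames[i]) > threshold
        # Ignore changes that follow the previous index point too closely,
        # so brief motion (e.g. the presenter walking past) adds no entries.
        if changed and t - index[-1] >= min_gap_s:
            index.append(t)
    return index


if __name__ == "__main__":
    # Ten dark frames followed by ten bright frames, sampled at 1 frame/second.
    demo = [[0] * 4] * 10 + [[200] * 4] * 10
    print(find_index_points(demo, fps=1.0))  # prints [0.0, 10.0]
```

Each returned timestamp would become one entry in the navigation index. A real pipeline would work on decoded, distortion-corrected frames and would likely need a more robust change measure than a raw pixel difference.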
The index is one of the most useful features of our application because, as Microsoft Research showed, lectures are often viewed piecemeal [10], so an index allows viewers to get the most out of a presentation. Because the entire presentation is captured with a single camera, there is no need for sophisticated media synchronization of presentation slides, whiteboard/blackboard content, and audio/video streams. As part of our optimization process, we employ a multi-pass processing technique: an uploaded presentation is ready for viewing within minutes after minimal processing and is progressively improved during subsequent passes. We are still studying ways of tracking student interaction other than as background audio, as well as methods of improving video captured with a hand-held camera.

IV. Presentation
Our current implementation of the presentation system is a Web application built from several movable and resizable panels: a video panel, a note-taking panel, and an index panel, which contributes greatly to the organization and flow of the presentation. Below is a screenshot of the setup.
Our system takes into account the fact that people viewing a presentation tend to have different learning preferences [8]. Accordingly, to maximize viewers' attention and retention, panels can be repositioned to suit each viewer's preferences. For example, the video panel may be closed if a viewer wishes to listen to the audio while viewing the lecture slides. The note-taking panel allows the user to take notes while watching the presentation. Perhaps even more importantly, FlipPresentations has a social interaction component that allows members of the audience to share presentations and exchange online messages with one another. We aim to incorporate speech-to-text and search so that users can query video, audio, and slide content. Additionally, we have developed an iPhone/iPod touch application that lets users access the FlipPresentations server to take notes, view indexed presentations, and listen to audio recordings of presentations while taking notes or reviewing instructor slides. This allows users to view presentations at their own pace, wherever and whenever they want. Below are some screenshots of the iPhone application.
V. Related Work
Projects focused on automatic lecture capture include PAOL [7], STREAMS [6], the Cornell Lecture Browser [18], Microsoft's lecture capture system [17, 19, 20], CARMUL [15], Syeda-Mahmood's lecture capture system [22, 23], and Mediasite [21]. These systems in general require specialized lecture rooms and/or installation of software. There are also commercial systems such as Camtasia Studio that run on the lecturer's computer [24]. Blackboard Academic Suite [4], Echo360 [1], eClass [2, 3, 5], Authoring on the Fly [13, 14, 25], and Panopto also in general require specialized hardware and software. Although all of these systems have merit in their own right, our system offers advantages that they lack: the setup costs are much lower, the interface is versatile and user friendly, and the lecture material is presented in a way that is clear and easily accessible to a wide range of viewers.

VI. Future Work
Although we have so far maintained that FlipPresentations is superior to conventional presentation technologies, we intend to substantiate this claim further through a careful comparison with other instructional technologies. We will also test our software in contexts where it has not yet been used, in settings ranging from the fine arts to the exact sciences and mathematics. We will collect user feedback and adjust the existing features of FlipPresentations to make the presented material even more accessible and enjoyable to all audiences. We will also explore the use of OCR on captured content for indexing and search, since perfect OCR is not required for effective indexing [11, 12].
We will also explore ways of digitally enhancing video captured with a hand-held camera and add more features to the UI, such as printing.

VII. Acknowledgments
Special thanks to Wendy Cooper for her support. Our thanks also go to the entire RIPPLES/PAOL team at UMass Amherst for casual support and informative feedback during this project.

VIII. References
[1] Echo360. Echo360. http://www.apreso.com/, Aug 2009.
[2] Abowd, Gregory D. Classroom 2000: An experiment with the instrumentation of a living educational environment. IBM Systems Journal 38, 4 (1999), 508-530.
[3] Abowd, Gregory D., Atkeson, Christopher G., Feinstein, Ami, Hmelo, Cindy E., Kooper, Rob, Long, Sue, Sawhney, Nitin "Nick", and Tani, Mikiya. Teaching and learning as multimedia authoring: The classroom 2000 project. In ACM Multimedia (1996), pp. 187-198.
[4] Blackboard. Blackboard academic suite. http://www.blackboard.com/products/Academic_Suite, Dec 2007.
[5] Brotherton, Jason A., and Abowd, Gregory D. Lessons learned from eclass: Assessing automated capture and access in the classroom. ACM Trans. Comput. Hum. Interact. 11, 2 (2004), 121-155.
[6] Cruz, G., and Hill, R. Capturing and playing multimedia events with streams. In MULTIMEDIA '94: Proceedings of the second ACM international conference on Multimedia (1994), ACM Press, pp. 193-200.
[7] Dickson, P., Adrion, W. R., and Hanson. Automatic capture and presentation creation from multimedia lectures. In Frontiers in Education 2008, Vol. 38 (Oct 2008), pp. 14-19.
[8] Dickson, P., Adrion, W. R., Hanson, and Arbour, D. T. First experiences with a classroom recording system. In Proceedings of the 14th annual ACM SIGCSE conference on Innovation and technology in computer science education (2009).
[9] Flip. http://www.theflip.com/, Jul 2009.
[10] He, Liwei, Gupta, Anoop, White, Stephen A., and Grudin, Jonathan. Design lessons from deployment of on-demand video. In CHI '99: CHI '99 extended abstracts on Human factors in computing systems (New York, NY, USA, 1999), ACM Press, pp. 276-277.
[11] Hurst, Wolfgang. Indexing, searching, and skimming of multimedia documents containing recorded lectures and live presentations. In MULTIMEDIA '03: Proceedings of the eleventh ACM international conference on Multimedia (2003), ACM Press, pp. 450-451.
[12] Hurst, Wolfgang, Kreuzer, Thorsten, and Wiesenhutter, Marc. A qualitative study towards using large vocabulary automatic speech recognition to index recorded presentations for search and access over the web. In ICWI (2002), IADIS, pp. 135-143.
[13] Hurst, Wolfgang, Maass, Gabriela, Muller, Rainer, and Ottmann, Thomas. The "authoring on the fly" system for automatic presentation recording. In CHI '01: CHI '01 extended abstracts on Human factors in computing systems (New York, NY, USA, 2001), ACM, pp. 5-6.
[14] Hurst, Wolfgang, Muller, Rainer, and Mayer, Christoph. Multimedia information retrieval from recorded presentations (poster session). In SIGIR '00: Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval (New York, NY, USA, 2000), ACM, pp. 339-341.
[15] Kameda, Y., Nishiguchi, S., and Minoh, M. Carmul: concurrent automatic recording for multimedia lecture. In ICME '03: Proceedings of the 2003 International Conference on Multimedia and Expo (Washington, DC, USA, 2003), IEEE Computer Society, pp. 677-680.
[16] Li, Weihong, Tang, Hao, and Zhu, Zhigang. Automated registration of high resolution images from slide presentation and whiteboard handwriting via a video camera. In CVPRW '04: Proceedings of the 2004 Conference on Computer Vision and Pattern Recognition Workshop (CVPRW'04) Volume 11 (2004), IEEE Computer Society, p. 168.
[17] Liu, Qiong, Rui, Yong, Gupta, Anoop, and Cadiz, J. J. Automating camera management for lecture room environments. In CHI '01: Proceedings of the SIGCHI conference on Human factors in computing systems (2001), ACM Press, pp. 442-449.
[18] Mukhopadhyay, Sugata, and Smith, Brian. Passive capture and structuring of lectures. In MULTIMEDIA '99: Proceedings of the seventh ACM international conference on Multimedia (Part 1) (1999), ACM Press, pp. 477-487.
[19] Rui, Yong, Gupta, Anoop, and Grudin, Jonathan. Videography for telepresentations. In CHI '03: Proceedings of the conference on Human factors in computing systems (2003), ACM Press, pp. 457-464.
[20] Rui, Yong, He, Liwei, Gupta, Anoop, and Liu, Qiong. Building an intelligent camera management system. In MULTIMEDIA '01: Proceedings of the ninth ACM international conference on Multimedia (2001), ACM Press, pp. 2-11.
[21] SonicFoundry. Mediasite.com. http://www.Mediasite.com/, Apr 2006.
[22] Syeda-Mahmood, Tanveer Fathima. Indexing for topics in videos using foils. In CVPR (2000), IEEE Computer Society, pp. 2312-2319.
[23] Syeda-Mahmood, Tanveer Fathima, and Srinivasan, Savitha. Detecting topical events in digital video. In ACM Multimedia (2000), pp. 85-94.
[24] TechSmith. Camtasia studio screen recorder for demos, presentations and training. http://www.techsmith.com/camtasia.asp, Jul 2009.
[25] Zupancic, Bernd, and Horz, Holger. Lecture recording and its use in a traditional university course. In ITiCSE '02: Proceedings of the 7th annual conference on Innovation and technology in computer science education (New York, NY, USA, 2002), ACM, pp. 24-28.
[26] Zhu, Zhigang, McKittrick, Chad, and Li, Weihong. Virtualized Classroom – Automated Production, Media Integration and User-Customized Presentation.