Neva: A Conversational Agent Based Interface for Library Information Systems
by
ABDUL AHAD
Presented to the Faculty of the ISNM International School of New Media in Partial Fulfilment of the Requirements for the Degree of MASTER OF SCIENCE IN DIGITAL MEDIA
UNIVERSITY OF LÜBECK June 2005
Supervisors:
Prof. Dr. Bernhard Jung (1st advisor) Prof. Dr. Andreas Schrader (2nd advisor)
Table of Contents
1 Introduction
1.1 Conversational Interfaces and Embodied Conversational Agents
1.2 Neva, a Conversational Agent Based Interface for Library Information Systems
1.3 Overview
2 State of the Art
2.1 Embodied Agents for Human Computer Interaction
2.2 Avatar as Helper, Hosts and Tour Guide
2.2.1 Helper Agent
2.2.2 PPP Persona
2.2.3 Agneta and Frida
2.2.4 AiA Persona
2.2.5 Digital City Kyoto
2.2.6 PEACH
2.2.7 Interactive agents for kiosks
2.3 Summary
3 Technology Setup for the McLuhan Documentation Center Kiosk
3.1 Character Modeling
3.2 RFID
3.3 Kiosk Configuration
3.4 Database
3.5 Amazon Web Services (AWS)
4 Library Services Overview
4.1 Services
4.2 General Services
4.2.1 Library Information
4.2.1.1 Staff
4.2.1.2 Rules and Regulations
4.2.1.3 Books and Journals
4.2.2 Technology
4.2.2.1 Technology Overview
4.2.2.2 PDA overview
4.2.3 Searching and Locating Media Items
4.2.4 Rating Books
4.3 Personal Services
4.3.1 Alerts
4.3.2 New Arrivals
4.3.3 Preference
4.3.3.1 Avatar Selection
4.3.3.2 Personal Data
5 Realization
5.1 System Architecture
5.1.1 Software Logic
5.1.2 Presentation Logic
5.1.2.1 Overview
5.1.2.2 Media Controller
5.2 System Walkthrough
6 Conclusion and Future Extensions
6.1 Lessons Learned
6.1.1 Museums
6.1.2 Exhibitions
6.2 Directions for Future Work
6.2.1 Learning Capability
6.2.2 Technology Realization
6.3 Conclusion
7 References
8 Index
8.1 Configuration File
8.2 Character Preferences
8.3 Character Sound
8.4 Media XML File
8.5 Tables Structure and Relationship
8.6 Class Structure
8.7 Installation Guide for the Neva
8.7.1 Hardware Requirements
8.7.2 Software Requirements
Abstract
Over the last decade, human computer interface technology has been undergoing a paradigm shift. New gadgets and technologies are paving the way for friendly and easy-to-use interfaces. Virtual human characters are becoming an integral part of interfaces because of the services and utilities they offer, and human computer conversations are becoming more interactive and user friendly.
Libraries contain a huge amount of information relating to almost every aspect of life. Thanks to new technologies, libraries have evolved from card-based catalogs to online accessible media. Many of them also help their readers with information systems installed in the library. These systems mainly help users search for books, and they employ conventional user interfaces to fulfill these tasks.
Going a step further, this thesis introduces the concept of using virtual human characters to support user personalization in the library environment. The new aspects are the user personalization and the improved, tangible user interface supported by intelligent virtual humans and books. Not only do these virtual characters know the library in detail, they also learn what their users are looking for.
As a realization of this concept, a kiosk-based information system named Neva was developed for the McLuhan Documentation Center at the ISNM International School of New Media at the University of Lübeck, Germany, as a test case for libraries. By employing state of the art technologies such as RFID to identify users and Haptek character animation to realize the character support, the system offers a simple and user friendly interface. Neva offers various services to users based on their interests and fields. For example, when a user logs in, it alerts the user if any book needs to be returned before its deadline passes. It also lets the user know if any newly arrived books in the library match his or her interests. It not only covers the contents available in the library but also retrieves information from the WWW by accessing Amazon book web services. A combination of text, graphics, spoken language, facial expressions and gaze is used to present the results to users.
The Neva experience and the users’ feedback motivate us not to limit its capabilities to libraries but to extend them to other areas where information is difficult to manage, such as museums and exhibitions. Visits to museums and exhibitions can be more fruitful and less time consuming with the help of such information systems based on virtual humans and tangible interfaces.
Keywords: Conversational Interface, Avatar, Virtual Character, Kiosk, RFID, Embodied Conversational Agent (ECA), Socially Intelligent Agent (SIA).
1 Introduction
This chapter introduces the concept and elaborates the objectives of this thesis. It first explains the fundamental terms involved and then states the objectives and characteristics of the developed system.
1.1 Conversational Interfaces and Embodied Conversational Agents
Information and communication technologies have changed dramatically in the last 25 years. Computers are becoming a ubiquitous part of our daily lives. Whether we are traveling, enjoying our free time at home, studying or conducting business, we are constantly interacting with computers. This availability of computers has dramatically increased our appetite for information. To get the most out of this information, there is a natural need for interfaces that are easy to use, robust, and offer their services in ways that feel natural. This brings many new challenges, since human beings have a quite complex system of communication. We make complex representational gestures with our hands, use eye gaze, and vary the tone and pitch of our voices to achieve different effects in conversation. This motivates the metaphor of face-to-face conversation in human computer interface design. It includes mixed initiative, non-verbal communication, presence or movement detection, spoken dialogue and so forth [1]. Therefore a true conversational interface should promise more initiative to learn, be more open to verbal communication and be more functional in high noise environments [2].
Life-like characters are becoming a part of user interfaces these days. When users use an interface, what do they see? They see only the interface as designed, there in front of their eyes. The challenge is to build an interface that is more like a human being and that can offer the same services as well. How can this level of interaction be achieved? Welcome to the world of the avatar¹ (in this report the terms avatar and agent are used interchangeably).
¹ Definition 1: The incarnation of a Hindu deity, especially Vishnu, in human or animal form. (The American Heritage® Dictionary of the English Language) Definition 2: The software representation of a person as the person appears to others in a shared virtual universe. The avatar may or may not resemble an actual person. (Sun Microsystems, Inc. 1994 - 2005)
Beginning with chatbots² and infobots³, continuing with cartoon-like images and now entering the domain of 3D presentation, an avatar should be a capable, useful and friendly personality that can stand in for a real person. Figure 1 shows some example avatars.
1.1: Image avatar⁴
1.2: Microsoft “Clippy” [43]
1.3: Seonaid (“Shona”) a flexible presentation avatar [11]
Figure 1: Example avatars
In the early stages, simple images were used to present an avatar. Later, cartoon-like animation became part of avatar technology. These animations were composed of small video clips and were not yet capable of interaction. Their role was quite simple: characters read out the frequently asked questions or the contents of a website. Over time, these cartoon avatars were modified, for example by equipping them with interaction facilities. One example is Microsoft Clippy, which helps the user in many ways. It is interactive, animated and full of information. Now, in the third phase of avatar generation, realistic 3D models are being used. In figure 1.3, Shona is an example of a flexible presentation avatar. She presents the junior section of the Swedish Executive and is responsible for reading news and spreading information, especially for children and young students [11]. Her capabilities have now been extended from reading to teaching.
² Bot: Any type of autonomous software that operates as an agent for a user or a program or simulates a human activity [31]. A chatbot is a bot that can interact with humans in conversation. It might be a complete Artificial Intelligence (AI) implementation or an interface [32].
³ Infobot: A bot that serves as a common database of information (often noteworthy URLs) for users on a chat system. Infobots often have a simple chatbot interface, responding to key-phrases, as well as to direct queries [33].
⁴ Image source: http://www.celshader.com/images/bboards/avatar-jen-small.gif
The motivation for this embodiment [8] is to transform human machine interaction into face-to-face dialogue, so that information is exchanged via an agent. This motivation becomes quite clear when we consider the role of our own body in everyday conversation: the way we use hand movements, non-verbal utterances, and eye gestures [6].
There is always the question of whether to model these characters as realistically and human-like as possible or to model them as abstract representations. There has already been a lot of research on modeling characters more realistically, as shown by the success of the Jack system and the work within the European humanoid project [9]. Besides this, the systems should be able to convey more information than their counterparts.
1.2 Neva, a Conversational Agent Based Interface for Library Information Systems
Libraries store masses of information and require information systems to serve the needs of their users. To help users find relevant information, many libraries offer services such as book search and online content. Information systems are very common in libraries, where they mainly provide search features or online catalogues. Their interfaces typically comprise form-based queries and results (see figure 2) that do not appear friendly to many users.
Figure 2: A screenshot of the existing web-based interface of McLuhan Documentation Center
Figure 2 shows the result of a search query for the book “Handbook of Virtual Humans”, which is presented to the user as textual output.
As modern technologies become part of everyday life, there is an opportunity to use them in the library as well. Conversational interfaces, Radio Frequency Identification (RFID), kiosk-based information systems and Internet web services are all available (for more details consult section 3). But all of this is useless when typical users do not know how to get hold of these technologies or make them useful for their quests. This scenario leads us to develop an environment or interface that users feel drawn towards, one that takes care of their queries and fulfills them in no time.
To address these challenges, Neva was developed on the basis of embodied conversational agent technology [19]. Neva supports multi-modal virtual characters whose verbal and non-verbal behaviors are designed to support conversations. Its sole purpose is to handle requests and provide help to library users. It can be installed on computers or on kiosk systems in libraries. To use the system, users have to identify themselves with an RFID tag (see section 3.2 for more details on RFID). Once the system recognizes the user, a personalized avatar pops up on the screen and greets the user. Because it recognizes its users and knows their preferences, it starts serving them according to their needs (see figure 3).
Figure 3: Library users experiencing the Neva on a kiosk system
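The login step described above (an RFID tag identifies the user, then a personalized avatar greets them) can be pictured as a simple lookup. The following is only a minimal sketch and not Neva's actual implementation: the tag IDs, the `USERS` table, its field names and the fallback for unknown tags are all hypothetical.

```python
# Hypothetical user table keyed by RFID tag ID (illustrative data only).
USERS = {
    "TAG-0001": {"name": "Maria", "avatar": "human_female"},
}


def greet(tag_id, users=USERS):
    """Return (avatar, greeting) for the scanned tag.

    Unknown tags fall back to a default avatar. This fallback is an
    assumption; the text does not describe that case.
    """
    profile = users.get(tag_id)
    if profile is None:
        return "default", "Welcome! This tag is not registered yet."
    return profile["avatar"], f"Hello {profile['name']}, welcome back."
```

Keeping the avatar choice in the user profile mirrors the idea that each user is greeted by his or her own character.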
The system offers a number of services. For example, the avatar installed at the McLuhan Documentation Center can teach users how to use the automated machines in the library, which rules and regulations must be followed in the library, how to find particular books, etc.
Following are the prime characteristics of Neva:
• Neva has characters with both human and non-human appearance. Their interaction with users involves emotions, various voice tones and eye gaze. Because of their realistic looks, they present an affective and user friendly interface.
• Neva’s interaction with the user is supported by recorded human voices and text-to-speech conversion. Voices are lip synchronized, which enhances the impact of the conversation between the two parties.
• Neva is an automated and personalized system whose main focus is to provide personalization to users. Each character knows the interests and behavior of its users and responds accordingly whenever a user initiates a request. This makes it more convenient and comfortable for users, who realize that the system knows their attitudes and interests in books and magazines.
• Besides personalization, Neva offers guidance to users. When users look for a particular book in the library, Neva highlights the target area by showing either a video fragment or an animation clip. This feature is quite helpful in big libraries, where finding a book has always been a laborious task.
• Neva can also access book information on the WWW. A built-in Amazon web service is used for this purpose.
• Neva has a tangible interface. When users simply place a book on the kiosk system, Neva offers book-related services such as book searching and book rating.
• One exciting feature of Neva is that it alerts and reminds users of particular things on time. For example, the system alerts its user when the deadline for a borrowed book is approaching, or when a book that was on loan to another user during their last visit has become available. This facility extends the personalization concept and thus makes the system friendlier.
• Neva makes use of multiple kinds of digital material to respond to user requests. Small video clips, still images, voice dialogues and Flash animation that enhance the interaction are a few examples.
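The alert and new-arrival characteristics listed above can be illustrated with a short sketch. It assumes hypothetical data structures (a `Loan` record with a due date, and new books given as title and subject pairs) and a hypothetical three-day warning window; none of these details are taken from Neva itself.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Loan:
    title: str
    due: date


def due_soon(loans, today, window_days=3):
    """Return loans that are overdue or due within `window_days` days."""
    limit = today + timedelta(days=window_days)
    return [loan for loan in loans if loan.due <= limit]


def matching_arrivals(new_books, interests):
    """Return titles of new books sharing at least one subject with the user.

    `new_books` is a list of (title, subjects) pairs; matching ignores case.
    """
    wanted = {subject.lower() for subject in interests}
    return [title for title, subjects in new_books
            if wanted & {subject.lower() for subject in subjects}]
```

Both checks would run once at login, so that the avatar can mention due books and matching new arrivals right after the greeting.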
1.3 Overview
Section 2, State of the Art, describes the underlying technology used in the system. It further explains the use of avatars in other scenarios such as museums or exhibitions.
Section 3 elaborates on the use of radio frequency identification and the other technologies that support the interface design. It also describes the technology requirements and the setup needed to install the system.
Section 4 clarifies the concept of the system and answers questions such as why the system is required and which services it offers.
Then, in the realization part, section 5, the system implementation is elaborated with the aid of interface screenshots. It also explains which challenges were faced during the implementation and how they were addressed. This part focuses on the technology implementation and the system walkthrough.
Finally, future prospects and user experiences are described in the conclusion.
2 State of the Art
This chapter reviews work related to Neva. Starting from the basic concepts, it goes into the details underlying them. Section 2.1 explores the field of embodied agents and states the related references. Section 2.2 gives details of numerous examples of helper, guide and tour avatars and explains how they relate to Neva. The last section, 2.3, summarizes the chapter.
2.1 Embodied Agents for Human Computer Interaction
What makes the human species so distinguishable from others? One may point to their power to explore space, dive into the deep seas, climb the highest mountains in the world and so on. In my mind, the most critical aspect of their distinctiveness is the way they communicate with others. They have a variety of responses that include the use of multiple languages, gestures, eye gaze and so on. Mehrabian claims that such non-verbal communication plays a crucial role in conversations [12]. Bearing this in mind, there is an emphasis on creating interfaces and interactions that can serve this purpose to some extent. These requirements are giving birth to the field of Embodied Conversational Agents (ECA), which is steadily becoming an active and interdisciplinary research area [10]. This field is characterized by agent systems that show human-style social intelligence. This brings a new breed of challenges for designers and creators.
The following are some examples of agents in which non-verbal techniques have been applied to support social interaction:
• Bickmore and Cassell’s virtual real estate agent uses gestures to support the content [15].
• Rickel and Johnson make use of deixis, eye gaze and gestures to support the teaching capabilities of their agent, which teaches tasks within a shared virtual context [16].
• Lester et al. equip their avatar with deictic and emotional gestures and facial expressions for pedagogical purposes [17].
Human-human social interaction provides guidance for these challenges. Humans are able to exhibit and recognize variations in behavior while communicating with others, because we have different affective states and personalities. Simulating such behavior and using emotions has already been realized by the Affective Virtual Patient (AVP) [4]. AVP is an e-learning tool for social interaction training in the medical field. It trains medical students not only in problem solving but also in social interaction. Doctors face a screen that presents a simulation of an easily agitated mother and her injured child (see figure 4).
4-a: Both child and mother look calm and relaxed
4-b: When the wrong option is clicked, the faces of mother and child show anger
Figure 4: Affective virtual patient: characters making use of emotions to depict the intensity of the situation.
After the scene description, the doctors have to choose among a number of displayed options. A correct decision leads them to another scene and relaxes the mother and her child to some extent (see figure 4-a), while a wrong decision makes the situation more out of control (see figure 4-b). Upon successful completion of the procedure, doctors can assess the situation from the emotions of their patients. Researchers are developing cognitive architectures that will be capable of modeling a variety of individual differences, emotional states, affective states, personality traits, etc. [13], [14].
Having established the importance and concept of Socially Intelligent Agents (SIA), let us look at various application and research areas where these avatars have proven their importance. The following section explains the use of SIA in tour guiding, their role on kiosk systems and their participation in library environments.
2.2 Avatar as Helper, Hosts and Tour Guide
Interface agents have the capability of transforming information in a productive way. They can be designed for situations where the information flow is too high or complex. Busy environments like exhibitions, conferences or celebration parties are example use cases where they can really prove their usefulness. Agents can react to situations, answer questions posed by the audience or help people navigate.
Katherine Isbister argues that the following issues must be kept in mind when designing interface agents for host, guide, and helper applications [18]:
• The agent should have up-to-date knowledge. It must know what to do at what time, and it must know the prerequisites required for certain actions.
• During a conversation, the agent should take proper turns whenever required. It should also facilitate the social flow rather than focusing on one person. In multi-user environments, the agent should face the person it is communicating with.
• The agent’s role, such as host or guide, should be clear and appropriate to the situation it is in.
The following examples give an idea of how agents perform their roles in conveying information and in helping and guiding users.
2.2.1 Helper Agent
The Helper Agent system is an example of an interface agent that supports human-human conversations in a video chat environment [18]. Each user has his or her own avatar that can move freely around the space provided in the interface. The Helper Agent is an animated dog-faced avatar whose sole purpose is to listen to two-person conversations among the users and to detect silences (see figure 5). Whenever there is an awkward silence in the conversation, the agent approaches the users’ avatars and directs a series of text-based or yes/no questions to both people, thereby suggesting new topics to talk about.
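The silence detection described for the Helper Agent can be sketched as a check for long gaps between utterance timestamps, followed by the choice of a fresh topic. This is a hedged illustration only: the ten-second threshold, the function names and the topic list are assumptions and are not taken from the Helper Agent system.

```python
def find_silences(utterance_times, threshold=10.0):
    """Return (start, end) gaps between consecutive utterances that
    last longer than `threshold` seconds. Times are in seconds."""
    gaps = []
    for previous, following in zip(utterance_times, utterance_times[1:]):
        if following - previous > threshold:
            gaps.append((previous, following))
    return gaps


def next_topic(topics, already_suggested):
    """Return the first topic not suggested yet, or None if all were used."""
    for topic in topics:
        if topic not in already_suggested:
            return topic
    return None
```

For example, `find_silences([0, 3, 20, 22])` reports the gap between second 3 and second 20, at which point an agent could step in with `next_topic(...)`.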
Neva also initiates contact with users by giving them hints about the tasks available. This proactivity adds a nice feature to Neva’s services.
Figure 5: Helper Agent: helping in two-people conversations. (1-3) directing questions (4) facing towards the user
The Helper Agent makes use of social cues by performing the following actions:
• He turns his face to the user when he poses questions.
• He physically approaches and departs from the conversation.
• Numerous animations support his social attitude, such as non-verbal cues for asking questions, reacting to affirmative or negative responses and making suggestions.
2.2.2 PPP Persona
PPP, the Personalized Plan-based Presenter, is an example of an interactive avatar application [29, 30]. It was designed to train people in technical tasks. To aid its operation it makes use of animations, videos, images and audio. For example, to explain how to turn a switch on or off, it gives verbal instructions and points to the location (see figure 6a). Figure 6b describes its architecture. In the very first step, it decides how to describe an object, either by showing an image or a text label. It then decides between audio output and gestures.
Figure 6a: PPP Persona screenshot
Figure 6b: Persona architecture
Like PPP Persona, Neva avatars make use of gestures and head movements, since Neva knows where the relevant multimedia information is presented on the screen (see chapter 4, Services overview, for more details).
2.2.3 Agneta and Frida
Navigation in information systems like the Internet is becoming harder as huge amounts of information arrive every day, and processing this newly arrived information is a tough task. Besides this, there are large individual differences in how well people navigate in information spaces [20]. Agneta and Frida⁵ are two female characters, acting in the roles of mother and daughter [23]. They watch the user’s browser, comment on the contents and on computer technology in general, and tell users their own stories. They pretend not to be technology or computer users, so their responses and comments are often ironic.
Like Agneta and Frida, the Neva system also provides the facility of commenting on contents. When the user places a book on the kiosk system to find out its rating, the avatar comments on the contents as well. Because the avatar knows the user’s interests, these comments can be very effective (see chapter 4, Services overview, for more details).
2.2.4 AiA Persona
AiA (Adaptive InfoBahn Access) is based on the concept of a personalized information assistant [22]. Figure 7 shows a screenshot of the AiA travel agent, whose prime goal is to provide travel information to its users. His inputs come from specific servers and databases. After retrieving all required information from these resources, he answers the user query.
Figure 7: AiA travel agent screenshot
Footnote 5: Downloadable files: http://www.sics.se/humle/projects/persona/web/aandf_instructions.html
Like AiA Persona, Neva also employs the concept of grabbing information from the Internet. It connects itself to the Amazon web site with the aid of its web services to retrieve the additional information it needs.
2.2.5 Digital City Kyoto
Agents can be quite helpful as guides for a tour around a city. Digital City Kyoto is an excellent example of this [24]. Based on emerging technologies like GIS, social avatars and 3D animation, this project is about exploring the city virtually. The agent takes the role of the guide here. He tracks the quantity of conversation and determines whether the conversation is positive or negative based on his internal vocabulary system. Then he selects the stories accordingly. The quantity of conversation and the helping words let him shorten or broaden his speech. For example, if the user response is positive and the quantity of conversation is long, he aims for a story of middle length. Otherwise, if the response is not positive, he shortens the story and tells only a part of it (see Figure 8). At the bottom of the figure, one can see the user's comments on the current scene, while the agent (the parrot at the top left of the image) explains the current site.
Figure 8: A screenshot from the Digital City Kyoto tour guide application.
Like the parrot agent in the Digital City Kyoto project, Neva’s agents also have the capabilities of commenting on books.
2.2.6 PEACH
Museums offer a lot of information and history in one place, so visitors can benefit from avatars that present it in more concise ways. In addition, adaptivity to user interests and locations can play a crucial role in this scenario. In museums, avatars can act as accompanying agents and provide continuous assistance. Michael et al. [7] convey information through user-adaptive and context-sensitive multimedia presentations. Their approach uses several life-like characters to convey different kinds of information (see figure 9). Content is selected based on various parameters, such as the users' position and orientation, their visit history and their interests. Thus the offered services are personalized.
Figure 9: Avatars in museum, a screen shot of PEACH project [25].
Museums and libraries are similar in that they provide plenty of information to their users, and their users differ from each other in their interests. While visiting museums, users might be interested in certain parts of the museum or of history. In the same way, library users might be interested in certain types of books. That is why Neva also works with users' preferences and provides them with personalized services.
2.2.7 Interactive agents for kiosks
Kiosks are computer terminals that provide information about the environment where they are installed or offer electronic services to their users. Holfelder et al. define a kiosk system as a "... computer-based information system in a publicly accessible place, offering access to information or transactions for an anonymous, constantly varying group of users, with typically short dialogue times and a simple user interface" [27].
Kiosks offer a convenient way to provide useful information, on the premise that people with limited knowledge of computers can interact with them. From automatic teller machines to information desks, there is a variety of kiosks around. Companies put kiosks in lobbies to advertise their products. Kiosks in museums are gaining popularity because of the comprehensive information they offer in one place. Kiosks are also becoming an important factor in the flow of information at libraries. Whether one needs to search for books or requires other assistance, kiosk-based user interfaces can be a great help.
Borchers et al. [26] categorize kiosks into the following four types:
• Information kiosks
• Advertising kiosks
• Service kiosks
• Entertainment kiosks
Information kiosks are primarily used to provide information on a limited subject field. An example would be the kiosk systems at railway stations where users can find the train connections for chosen destinations. Advertising kiosks are used by companies or institutes to present either themselves or their products in public spaces. Service kiosks are an extension of information kiosks in that they require more data from users in order to process their requests. An example would be hotel reservation systems.
Entertainment kiosks, as clear from their name, have the purpose of entertaining users in spaces like waiting rooms or exhibitions.
Besides touch screen interaction, interactive avatars can further increase the interaction between users and kiosk systems. Mäkinen et al. [30] implement a talking head as an interactive agent that provides users with information. The system shows content in a larger area while the talking head (agent) explains the content; the user's image is shown in the part labeled machine vision (see figure 10).
Figure 10: Kiosk interface for TAUCHI information kiosk project
Christian and Avery describe an information kiosk that is equipped with an avatar and a vision component [28]. Their kiosk detects users who come near it. Similarly, Lamel et al. [29] describe a service kiosk that makes use of speech recognition technology to assist user interaction.
2.3 Summary
This chapter gave an overview of work related to Neva and built the concepts required to understand how Neva works. It discussed various applications within the field of human computer interaction and explained how systems can make use of various resources to enhance their impact. The design of Neva integrates several features of the discussed systems by combining them in a novel way. These features include:
• Speech and gaze: Neva is an affective and conversational interface that makes use of speech, emotions and gaze.
• RFID detection: Neva provides a tangible interface with the help of RFID technology instead of mouse and keyboard.
• WWW access: Neva uses not only the local library database for book information but also has access to the WWW.
The next chapter, Technology setup for the McLuhan Documentation Center kiosk, describes the required technology and the installation setup in detail.
3 Technology Setup for the McLuhan Documentation Center Kiosk
This chapter briefly discusses a few of the core components involved in the technology setup: the modeling of the life-like characters, RFID technology, the kiosk configuration, Internet web services and the database.
3.1 Character Modeling
With the arrival of several new technologies, character modeling has become much easier and better. Facial expressions and emotions are becoming standard. Nowadays plenty of tools are available for these tasks, e.g. Haptek and Cal3d. Cal3d is an open source library for body animation. Similarly, Haptek (see footnote 6) allows the creation of realistic 3D characters. With the help of PeoplePutty, a modeling program for Haptek characters, designers can create various types of 3D characters. The characters used in Neva are built in PeoplePutty. PeoplePutty not only supports creating life-like characters but also realizes lip synchronization. On the development level, Haptek provides API support for integrating these characters on various languages and platforms. The API exposes its content-building functionality through a rich set of “Haptek Hypertext” commands, which can be used for authoring both web-based and non-web-based control.
3.2 RFID
With the steady advance of technology, computers are becoming a part of daily life. Devices are becoming smarter and smarter, and there have recently been efforts to make them invisible. Radio Frequency Identification (RFID) is an example of these emerging technologies. The McLuhan library at the ISNM International School of New Media is equipped with RFID technology that allows automated borrowing and returning of books.
RFID is a form of automated data collection. On its own it might have limited use, but coupling it with software applications makes it powerful, and many different solutions can be built with it. It serves the purpose of automatic identification and can be used in many areas depending on the needs. The automotive industry, rental services, waste management, livestock maintenance and supply chain management already benefit from this technology.
Footnote 6: © 1999-2005 Haptek Inc. www.haptek.com
In the library information system developed in this thesis, RFID technology is used to identify the users. An RFID card containing a unique identification number is given to every user of the library. This number is issued by the librarian, who then stores it in the database (see section 3.4 for more details). As books are also equipped with RFID tags, Neva uses their identification to look up the book and to provide users with various book-based services such as user ratings and book recommendations. In this way Neva provides a tangible interface to the users. The idea itself builds on "Tangible Bits: Towards Seamless Interfaces between People, Bits and Atoms" [3], an article from the MIT Media Lab.
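The identification step described above can be sketched as a simple lookup from a tag number to either a user or a book. This is only an illustrative Python sketch, not the thesis's C# code: the tag values, the class names and the in-memory dictionaries standing in for the database tables are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class User:
    tag_id: str
    name: str


@dataclass(frozen=True)
class Book:
    tag_id: str
    title: str


# Hypothetical in-memory stand-ins for the user and book tables.
USERS = {"0102a4f3c1": User("0102a4f3c1", "Example User")}
BOOKS = {"0107b2e9d4": Book("0107b2e9d4", "Example Book")}


def resolve_tag(tag_id: str) -> Optional[Union[User, Book]]:
    """Return the user or book registered for a tag, or None if unknown."""
    key = tag_id.strip().lower()  # normalise whatever the reader reports
    if key in USERS:
        return USERS[key]
    return BOOKS.get(key)
```

A user card would then trigger a login, a book tag would trigger the book-based services, and an unknown tag would simply be ignored.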
3.3 Kiosk Configuration
Neva's interface is designed for a kiosk-based information system installed in the McLuhan Documentation Center at the International School of New Media (see figure 11).
Besides the usual requirements for a kiosk, such as processing speed, memory and graphics quality, an RFID reader is attached to the kiosk that lets Neva interpret the RFID tags of the books and the users. These RFID tags and readers are manufactured by Phidgets Inc. [5] and operate at 125 kHz. The kiosk is equipped with a touch screen that lets users select library services easily. It also has a keypad and a mouse that help users enter certain information when required.
Figure 11: An image of the kiosk installed at McLuhan Documentation Center (the RFID reader is labeled in the image)
In order to prepare the kiosk to support Neva's software, certain plug-ins were installed, e.g. the Haptek player, Windows Media Player ver. 9.0 or higher, Macromedia Flash Player 7.0, Microsoft .Net framework ver. 1.1 or higher, the Phidget library and ODBC for MySql (see index 8.7 for a complete guide on how to install Neva).
3.4 Database
The database plays a crucial role in supporting personalization and in keeping track of users' visits to the library. Plenty of databases, both commercial and open source, are available, e.g. MS Access, Oracle and MySql.
In this case, a MySql database stores the data. Neva has its own database, which holds all the information required to operate Neva. This covers the users' information such as their ids, names and avatar ids (see index 8.5 for table structure and relationships). As described in section 3.3 above, each user carries an RFID card that holds a unique identification number; this number serves as the user id in the database. In the same way, the books carry unique identification numbers on their tags, which allow the database to identify them. The book table consists of their ids, names, author names, year of publication, arrival date etc. Besides the user and book information, the database also contains information about the avatars, e.g. their names and ids.
This way Neva makes use of the database services when retrieving, storing or updating the records.
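The tables described in this section can be sketched roughly as follows. This is a hedged illustration only: the actual table structure is given in index 8.5 and is not reproduced here, so the column names are assumptions, and SQLite (from the Python standard library) stands in for the MySql database that Neva actually uses.

```python
import sqlite3

# Hypothetical schema based on the fields named in the text.
SCHEMA = """
CREATE TABLE avatars (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE users (
    id TEXT PRIMARY KEY,            -- the RFID card number
    name TEXT NOT NULL,
    avatar_id INTEGER REFERENCES avatars(id)
);
CREATE TABLE books (
    id TEXT PRIMARY KEY,            -- the RFID tag number
    name TEXT NOT NULL,
    author TEXT,
    year INTEGER,
    arrival_date TEXT
);
"""


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one example avatar and user."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO avatars VALUES (1, 'Nina')")
    conn.execute("INSERT INTO users VALUES ('tag-001', 'Example User', 1)")
    return conn


def avatar_for_user(conn: sqlite3.Connection, user_id: str):
    """Return the name of the user's personal avatar, or None."""
    row = conn.execute(
        "SELECT a.name FROM users u JOIN avatars a ON a.id = u.avatar_id "
        "WHERE u.id = ?",
        (user_id,),
    ).fetchone()
    return row[0] if row else None
```

Using the RFID number directly as the primary key mirrors the statement that the card number serves as the user id.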
3.5 Amazon Web Services (AWS)
One of the promising features of Neva is its capability to extract book information not only from its own database but also from remote data servers like Amazon [35]. To accomplish this, access to the Amazon web services is coded into Neva's software.
Amazon Web Services is a platform which enables the creation of websites and applications that perform various functions, such as enabling and completing transactions, retrieving information about Amazon products or adding a product to an Amazon shopping cart, wish list, or registry. Amazon Web Services can be accessed through two interfaces: XML over HTTP or SOAP. Both of these methods return "structured data" (product name, manufacturer, price, etc.) about products available at the Amazon servers based on parameters such as keyword search terms and browse tree nodes.
Neva uses the SOAP interface to extract the user ratings from Amazon. Before sending requests to the Amazon servers, a developer has to register once with the Amazon web services. Once the developer is registered, the Amazon web services team issues a subscription id that must be included in every request made to Amazon.
The next chapter, Library services overview, describes the services that Neva offers to the library users. It relates these services to Neva and states their objectives and usefulness in the library environment.
4 Library Services Overview
This chapter gives an overview of the services that Neva provides. It also explains the objectives and motivation of these services in detail. Figure 12 shows the kiosk screen and the services that Neva offers when the user logs in. The services can then be explored with a touch.
Figure 12: A student engaging with Neva services
4.1 Services
When building Neva, the idea was to develop a system that supports not only personalization but also user-independent services. The system offers the following services:
• Library information
• Technology overview
• Book search and user ratings
• Alerts
• New arrivals
• Preferences
There is a distinction between the services as they fall into two different categories, general and personal services. Among these services, some further offer a sub level of services in order to make them more concise and understandable.
4.2 General Services
The main focus of the general services is to provide information related to the library infrastructure and to make users aware of the surrounding environment.
4.2.1 Library Information
This service deals with information related to the library infrastructure. It answers the questions that new library users raise most often. For example, one might be interested in the number of books the library contains, what types of books are likely to be found there, and whether the library offers literature only in specific areas or covers almost all subjects. This is important because many institutional libraries only hold books that are related to their degree programs. One might also be interested in what sorts of magazines or journals the library regularly subscribes to, and so on.
Library information itself is divided into three sub services.
• Staff
• Rules and regulations
• Books and Journals
4.2.1.1 Staff
From the user's point of view, the library staff is the key entity when it comes to handling requests. That is why it is often important to know who offers which services, or who the appropriate person to ask would be. This sub service serves this purpose. When the user selects this service, an avatar introduces the staff and explains their responsibilities. During this introduction, still images of the staff members and text are used to enhance the impression.
4.2.1.2 Rules and Regulations
Libraries might be at the top of the list of places where rules and regulations are critical to uphold. This service is most helpful to new users, who often need to be made aware of the rules and obligations in the library. In general it explains what sorts of actions are allowed or prohibited in the library. When users select this service, the avatar reads the rules and regulations to them.
4.2.1.3 Books and Journals
This service describes the types of books and journals in the library. As described earlier, most non-public libraries do not offer a wide variety of subjects but focus on their areas of interest. This service therefore informs the user about the content of the library. Besides this, it also lets the user view the list of all subscribed journals and magazines at a glance.
4.2.2 Technology
New technologies and smart devices often make life difficult for users whose knowledge of technology is not up to date. Even though we live in a world full of machines, every new machine requires a bit of additional knowledge to handle. With this in mind, Neva offers a comprehensive view of the technology equipment installed in the library, which helps newcomers understand how it operates and how to handle it. Neva implements the technology service to educate the library users in this regard and makes full use of digital media to enrich the experience. This includes a number of still images and pre-recorded video sequences.
The technology service is further categorized into two sub services.
4.2.2.1 Technology Overview
This service offers an overview of technology installed in the library. This is important because of the fact that without this, users might not be able to interact properly with the automated machines. This service explains the procedure from start to the end. This includes the automated borrowing and returning of books. Because of these complex procedures, this service employs the visual aid in a form of short video sequences. When showing a video, the avatar moves to the left of the screen and comments on the video when required (see figure 13). Further it explains the involvement of RFID technology in the library.
Figure 13: Avatar explaining technology overview with the aid of video sequences
4.2.2.2 PDA overview
This service explains the role and usage of PDAs (Personal Digital Assistants) in the library infrastructure. It also explains the procedure for borrowing PDAs from the library staff. Further, it provides a tutorial on how to operate them. Again it makes use of videos and images for explanation.
4.2.3 Searching and Locating Media Items
The search facility is one of the most important services in the library infrastructure. Among thousands of media items, it is not always easy to find the desired ones. Different libraries employ different approaches to overcome this problem. Some keep catalogues of books in special drawers where readers can look up their desired books. This approach has drawbacks, as it requires a lot of time and extensive searching. Nowadays most libraries use computer technology to improve searching, which can save users a great deal of energy and time.
Neva offers much more than a conventional book search. It provides not only the names of the books searched for but also guides the user to the section where those books should be located. To realize this, the library is divided into multiple sections, e.g. Computer Graphics, E-Commerce, Art & Culture, Game Design and Ubiquitous Computing, and each section knows what sort of books it contains (for how this is realized in Neva, see chapter 5). This is very important because in a library with thousands of books, knowing the name of a book matters less than knowing where to find it. That is why Neva uses animation to show users where to find the books they searched for. The library is shown from the top and the system highlights the target area with an animation (see figure 14).
Figure 14: A screenshot of Neva when guiding the user to the “Game Design” section by highlighting the area. The books related to the Game Design can be found on shelf number 3.
Figure 14 shows the library divided into multiple sections. Each section in this case is a part of a shelf, and the shelves are numbered from 1 to 6 in order to avoid any confusion. For example, when a user looks for the book “The Art of the Game Design”, shelf 3 is highlighted and starts to light up, so the user knows that the book should be in the section “Game Design”. The text “Game Design” also indicates the side of the shelf. However, this does not guarantee that the book will be found there, as it might have been mistakenly placed in another section.
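The lookup behind this guidance can be sketched as two small mappings, from book title to section and from section to shelf. This is an illustrative Python sketch rather than Neva's actual implementation: only the "Game Design" to shelf 3 pairing comes from the text, and the remaining shelf numbers and the function name are assumptions.

```python
from typing import Optional, Tuple

# Section -> shelf number (shelves are numbered 1 to 6).
SECTION_TO_SHELF = {
    "Game Design": 3,            # stated in the text
    "Computer Graphics": 1,      # hypothetical assignment
    "E-Commerce": 2,             # hypothetical assignment
    "Art & Culture": 4,          # hypothetical assignment
    "Ubiquitous Computing": 5,   # hypothetical assignment
}

# Book title -> section, standing in for the book table.
BOOK_TO_SECTION = {"The Art of the Game Design": "Game Design"}


def locate(title: str) -> Optional[Tuple[str, int]]:
    """Return (section, shelf) to highlight, or None if the book is unknown."""
    section = BOOK_TO_SECTION.get(title)
    if section is None:
        return None
    return section, SECTION_TO_SHELF[section]
```

The result tells the front-end which shelf to animate; as noted above, it indicates where the book should be, not where it actually is.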
4.2.4 Rating Books
As stated in section 3.3 (kiosk configuration), the kiosk installed in the McLuhan Documentation Center is equipped with an RFID reader that reads a book's tag when the book is placed on it (like the users' cards, the books also have RFID tags attached). Once a book is placed, Neva can display its rating, giving users an idea of the quality of its contents. Here again Neva introduces a novel user interface concept. Instead of typing the name of the book on the Amazon web site to view its rating, the user simply places the book on the reader and the system does the rest (see section 3.5 (Amazon Web Services) for the technical details). To do this, Neva connects to Amazon via web services and provides the necessary information. It retrieves the rating and displays it to the user. This service makes Neva a lot more powerful than a conventional information kiosk. Besides the rating, users can also view other useful information about the books from Amazon.
4.3 Personal Services
4.3.1 Alerts
Because Neva logs the visits of its users to the library, the alert service can support library users when certain deadlines are approaching. This service reminds the users about certain dates. After logging into the system, when the user selects this service, the personal avatar checks the user database, retrieves the last visit information and matches it against the deadlines. If there is a notification to be made, it alerts the user. In academic libraries, where students are often charged for returning books late, this service certainly has the potential to save them money.
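The deadline check can be sketched as a comparison of each due date against a warning window. This is a minimal Python sketch under stated assumptions: the loan records, the function name and the choice to also report overdue items are hypothetical, while the 1 to 5 day alert duration is the range offered by the Personal Data service described in section 4.3.3.2.

```python
from datetime import date, timedelta
from typing import List, Tuple


def due_alerts(loans: List[Tuple[str, date]], today: date,
               alert_days: int) -> List[str]:
    """Return titles whose due date falls within the alert window.

    Overdue items are included as well (an assumption of this sketch).
    """
    if not 1 <= alert_days <= 5:
        raise ValueError("alert duration must be between 1 and 5 days")
    horizon = today + timedelta(days=alert_days)
    return [title for title, due in loans if due <= horizon]
```

For example, with an alert duration of 3 days, a book due in two days would be reported while a book due in nine days would not.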
4.3.2 New Arrivals
Almost every library maintains a section where newly arrived books are placed. As a reader, however, chances are high that you will not see them in time: before your visit to the library, some of them may already be old enough to have been moved to their regular shelves. As a further step of personalization, Neva offers this service to notify its users at the right time. Not only is the timing crucial; most importantly, the service remembers the user's reading taste (see section 4.3.3.2 (Personal Data) for more details). As soon as the user logs into the system and selects this service, it retrieves the information on newly arrived books from the book database and checks the log of the user's visits.
If there are new books arrived after the user’s last visit and it also matches with the user’s interest, the avatar announces the books information and shows their cover pages and necessary information (see figure 15).
Figure 15: Screenshot of the application where the avatar announces the newly arrived books. The book's cover page is shown in the top right section, with the navigational buttons labeled beneath it.
Figure 15 shows the cover page of the first book which newly arrived at the library. In case there is more than one book to display, the list can be browsed with the navigational buttons under the cover page.
Further, the service can email the list of newly arrived books to the user's email address. This gives users the opportunity to review the books in detail.
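The selection rule of this service, books that arrived after the user's last visit and that match the user's interests, can be sketched as a simple filter. This is an illustrative Python sketch; the record layout (dictionaries with "name", "section" and "arrival_date" keys) and the function name are assumptions, not the thesis's data model.

```python
from datetime import date
from typing import Dict, Iterable, List


def new_arrivals(books: Iterable[Dict], last_visit: date,
                 interests: Iterable[str]) -> List[str]:
    """Names of books that arrived after last_visit in an interesting section."""
    wanted = set(interests)
    return [
        book["name"]
        for book in books
        if book["arrival_date"] > last_visit and book["section"] in wanted
    ]
```

The resulting list is what the avatar would announce and what could be emailed to the user.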
4.3.3 Preference
Neva employs this service to keep track of the user's interests. It is a place for gathering personal data rather than a service that gives feedback from Neva. It is divided into the following two sub services.
4.3.3.1 Avatar Selection
Avatar selection provides the facility of selecting one of the six avatars available to play the role of librarian (see figure 16). Every avatar comes with different voice, shape and expressions. This brings new life and excitement to the interface every time a different avatar is selected.
Figure 16: A screen shot of Neva when the user is asked to select his or her avatar
When designing avatars for selection, the following three requirements have been considered:
• At least one avatar should be familiar to the general library users, to avoid any confusion. For this reason, Nina was selected to be the default character (see section 6.1 for details).
• There should be an equal distribution of gender among the avatars.
• With this in mind, two male, two female and two non-human characters were designed.
When one of them is selected by the user, this remains his or her personal avatar unless changed.
4.3.3.2 Personal Data
This sub service holds the personal information entered by the user. It is very helpful for supporting personalization and for learning about the user's likes and dislikes. The user is asked to provide the following information (see figure 17):
• Email (optional): The user's email address is needed because the “New Arrivals” service uses it to send information about newly arrived books.
• User's interest: For the sake of simplicity, books are categorized into different sections, e.g. Computer Graphics, Networking, Media and Society, Culture, E-commerce etc. A combo box shows the list of available sections and users are asked to select their preferred ones. This makes it possible to notify users when new books matching their interests arrive.
• Alert duration: Here users can specify the alert duration in days, so that the system starts alerting them before the actual deadline.
Figure 17: A screenshot of the personal data service while entering the data into the system
Figure 17 shows the interface when the user is asked to enter personal data into the system. In the alert duration combo box, the user can select a duration ranging from 1 to 5 days before the deadline. Similarly, in the interested area combo box, the user can select his or her areas of choice. The list under the interested areas displays the sections already selected. The user can remove sections from the list by pressing the delete button.
The next chapter discusses the technology in detail and explains how the above services are realized. Further it summarizes the compositions of multiple software components which played an integral role in building the Neva architecture.
5 Realization
This chapter explains how Neva is realized. Section 5.1 (system architecture) describes the core components involved in building Neva as a complete information system for the library users. Section 5.2 (system walkthrough) describes the flow of the system when the user interacts with Neva.
5.1 System Architecture
In order to engineer the personalized and general services for the library system, one must first identify the core notions and software components involved. Neva makes use of various technologies to realize the whole scenario. Figure 18 explains this in detail.
Figure 18: System architecture and component overview
From the software point of view, the system is divided into three sections: front-end, logic layer and back-end. The front-end is responsible for the user interface, which is designed in Macromedia Flash MX [36] and can therefore make extensive use of animation and design. It also interfaces with the RFID reader and sends its signals to the logic layer. Communication between the front-end and the logic layer is two-way. Whenever the user touches the touch screen, the front-end generates an event and informs the logic layer. XML files provide much of the information that helps the front-end decide what to display in each situation. The use of XML files makes it easy to change this information, and even a person with little computer knowledge can update these files.
XML files contain the following information:
• Configuration
• Media controller
• Character preferences
• Character sounds
Configuration: This file carries all the configurations needed to launch the application. This contains the background information, default avatar for the library and the list of user's RFID tag numbers (see index 8.1). The background information holds the name of the image file which will be displayed on the back of the avatar (see figure 19).
Figure 19: An image of the library is shown on the back of the avatar
In figure 19, an image of the book shelf is shown behind the avatar. If the library has more than one kiosk and they are located at different places, then different backgrounds can create the illusion that the avatar is following the user through the library. This enhances the impact and the realism.
Media controller holds the media information, i.e. the names of the related images and video sequences (see index 8.4). It also categorizes them into various sections, which lets the system know which video should be played at what time. Further, it contains the path information for the media, which helps the system locate the files on the hard drive. Character preferences identify the avatar information: their names, positions and paths (see index 8.2).
Character sound is the most populated XML file because of the reason that it contains the voice information for the avatar and various services (see index 8.3). The system is operable on two types of voices i.e., pre recorded human voices or TTS (Text To Speech). The first step is to record the voices in .wav format in such a way that for every service there exists an audio file. This can either be done by recording in the chunks and saving it in separate files or recording in one file and then creating the separate files with an editing tool. Once these files are edited, they are ready to be processed for the lip synchronization information. This is done with the help of Haptek Peopleputty software that allows the developers to create the lip synchronized audio files from .wav files and saves them in .ogg format.
During the system start, it configures itself on the above information and eases the further tasks for the logic layer.
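Reading such a configuration file at startup can be sketched as follows. The real file layout is given in index 8.1 and is not reproduced here, so the element and attribute names in this sample are purely hypothetical; the sketch uses Python's standard `xml.etree.ElementTree` module rather than the C# code of the thesis.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration file: background image, default avatar
# and the list of known user RFID tag numbers.
SAMPLE = """<configuration>
  <background image="bookshelf.jpg"/>
  <defaultAvatar name="Nina"/>
  <users>
    <tag>tag-001</tag>
    <tag>tag-002</tag>
  </users>
</configuration>"""


def load_configuration(xml_text: str) -> dict:
    """Parse the configuration XML into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "background": root.find("background").get("image"),
        "default_avatar": root.find("defaultAvatar").get("name"),
        "user_tags": [tag.text for tag in root.findall("users/tag")],
    }
```

Keeping these settings in a small XML file is what allows a person with little computer knowledge to change, for instance, the background image without touching the program.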
The logic layer is the most important part of this architecture. It controls all the activities in the system and instructs the front-end or back-end when required. The logic layer is also responsible for initiating requests to the Amazon web services and for responding to the front-end. It consists of two parts.
5.1.1 Software Logic
The software logic performs multiple tasks. It is responsible for taking appropriate actions on various inputs. It receives the events from the front-end and issues commands accordingly, reacting to the current situation. The logic is developed in C# (a programming language which comes with the Microsoft Visual Studio .Net suite, see footnote 7), which gives the developer control over the logical flow of the system. The software logic is built on various classes which work closely together to return results to the front-end or to retrieve records from the back-end (for the class structure see index 8.6). At startup it creates the connection to the database and initializes the connection to the RFID hardware. Once the connections are established, it creates an instance of the presentation class and transfers control to it, providing the required information from the database and the RFID hardware.
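The event handling described here can be sketched as a small dispatcher that maps front-end events to handlers. This Python sketch is only an illustration of the idea; the class, the event names and the handlers are hypothetical and do not reflect the C# class structure listed in index 8.6.

```python
from typing import Any, Callable, Dict, List, Optional


class SoftwareLogic:
    """Minimal event dispatcher standing in for the logic layer."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[Any], Any]] = {}
        self.log: List[str] = []

    def on(self, event: str, handler: Callable[[Any], Any]) -> None:
        """Register the handler to run when the front-end raises `event`."""
        self._handlers[event] = handler

    def dispatch(self, event: str, payload: Any = None) -> Optional[Any]:
        """Run the handler for `event`; unknown events are logged and ignored."""
        handler = self._handlers.get(event)
        if handler is None:
            self.log.append(f"ignored:{event}")
            return None
        return handler(payload)
```

A card swipe or a touch on a service button would each arrive as an event, and the registered handler would decide what the presentation logic should show next.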
5.1.2 Presentation Logic
The presentation logic is concerned with the view part of the system, i.e., what is displayed on the front-end when the user selects certain options. The following section describes how the presentation logic works.
5.1.2.1 Overview
When the software logic identifies view events, it notifies the presentation logic. The presentation logic is built on Windows application forms created in Visual Studio .Net and hosts the Component Object Model (COM) objects of the Windows Media Player, Shockwave Flash and Haptek. These COM objects are responsible for playing their respective media: the Windows Media Player COM object plays the videos, still images and sounds, the Haptek COM object controls the character's animation, and the Shockwave Flash object supports all the Flash animations. To create the effect of a full-screen application, the parameters of the Windows application form are set accordingly.
Figure 20-a shows the composition of the COM objects on the Windows application form. The form contains 2 COM objects of each type. The front-end design, supported by the Shockwave Flash object (1), remains on top of this form, and the remaining COM objects are aligned so that they fit into the front-end design. In the center of the Windows application form, there are 3 COM objects, Haptek (1), Shockwave (2) and Windows Media Player (1), lying over each other (see figure 20-b).
⁷ http://msdn.microsoft.com/vstudio/ [34]
Figure 20-a: Composition of the COM objects on the form
Figure 20-b: Layers of the COM objects in the center of the form
Figure 20: The formation of the Haptek, Shockwave Flash and Windows Media Player COM objects on the Windows application form.
Depending on the situation, one of them is brought to the front while the others are kept in the background. For example, if no user is logged in, the presentation logic brings the Shockwave Flash (1) object to the front of the screen and plays the screen saver animation (see figure 21). When the user swipes his or her card on the RFID reader, it brings the Haptek (1) COM object to the front of the screen and sends the Shockwave Flash (1) object to the background. The presentation logic consults the XML files and the database and loads the user’s character on the screen (see figure 22). If the user selects the service “Technology overview”, the presentation logic brings the Windows Media Player (1) object to the front of the screen and sends Haptek (1) to the background. Meanwhile it also brings Haptek (2) to the front of the screen and loads the avatar there while Windows Media Player (1) loads the video (see figure 13).
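The layer switching described in this paragraph can be pictured as a small state table. The following Python sketch is illustrative only: Neva itself is implemented in C# on a Windows form, and the state names, the `FRONT_BY_STATE` table and the `PresentationLayers` class are hypothetical. Only the COM object names and the three situations are taken from the text.

```python
# Illustrative sketch, not the thesis code (which is written in C#).
# The COM object names follow the text; the state names are hypothetical.
FRONT_BY_STATE = {
    "screensaver": ["Shockwave Flash (1)"],
    "logged_in": ["Haptek (1)"],
    "technology_overview": ["Windows Media Player (1)", "Haptek (2)"],
}


class PresentationLayers:
    """Tracks which COM objects are currently in front of the screen."""

    def __init__(self):
        self.front = []

    def switch_to(self, state):
        """Bring the objects for `state` to the front.

        Returns the objects that were in front before and are now
        sent to the background.
        """
        new_front = FRONT_BY_STATE[state]
        sent_back = [name for name in self.front if name not in new_front]
        self.front = list(new_front)
        return sent_back
```

Calling `switch_to("screensaver")` and then `switch_to("logged_in")` reports that Shockwave Flash (1) has moved to the background, which mirrors the card-swipe example in the text.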
5.1.2.2 Media Controller
To control the flow of the media (animations, audio, videos, images and the avatar’s presentations), a few controller classes are designed in C#, and each type of COM object is controlled by one of these classes. All these controllers work for the presentation logic and support the operations it demands. The Windows application form class, RegularUser, serves as a base class for the following controllers (see index 8.6).
The FlashController class is responsible for rendering the Flash animation media type. This class receives a Shockwave Flash controller object from the base class and prepares this controller to play the animations. The whole process of playing an animation is done in three steps.
1. In the first step the media URL is passed to the Flash controller. The animation media is prepared so that a single file contains all the front-end design animations, and each frame in the file specifies an animation sequence. This approach lets the system load the animation file once at the start. If there were a separate animation file for each operation, the system would have to load the files at runtime, and users might not experience a smooth transition between the animation sequences.
2. In the second step the controller is set to the start frame. This lets the controller jump to a specific frame.
3. The third step brings the controller to the front of the screen and starts playing the animation.
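The three steps above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the C# FlashController from the thesis: `FakeFlashPlayer` is a hypothetical stand-in for the Shockwave Flash COM object, and its attributes, the `play_sequence` method and the file name `frontend.swf` are invented for the example.

```python
class FakeFlashPlayer:
    """Hypothetical stand-in for the Shockwave Flash COM object."""

    def __init__(self):
        self.movie = None
        self.frame = 0
        self.in_front = False
        self.playing = False
        self.load_count = 0

    def load(self, url):
        self.movie = url
        self.load_count += 1


class FlashController:
    """Illustrative controller following the three steps in the text."""

    def __init__(self, player):
        self.player = player

    def play_sequence(self, url, start_frame):
        # Step 1: pass the media URL. The single animation file is
        # loaded only once, on first use.
        if self.player.movie != url:
            self.player.load(url)
        # Step 2: set the controller to the start frame of the sequence.
        self.player.frame = start_frame
        # Step 3: bring the controller to the front and start playing.
        self.player.in_front = True
        self.player.playing = True
```

Playing two sequences from the same file, for example `play_sequence("frontend.swf", 1)` followed by `play_sequence("frontend.swf", 40)`, triggers only one load, which is the benefit the single-file design is meant to provide.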
The WindowsMediaController class is responsible for rendering the audio, video and image media types. Like the FlashController, this class receives a WindowsMedia controller object. Because this controller is responsible for playing three different media types, in the first step it determines which media type has been passed to it. Once this is determined, the controller starts playing the specified media while bringing itself to the front of the screen.
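The thesis does not state how the controller decides which media type it has received. One plausible approach, shown here purely as an assumption, is to classify the file by its extension. The `MEDIA_TYPES` table and the `classify_media` function are hypothetical names, and the list of extensions is illustrative rather than taken from Neva.

```python
import os

# Hypothetical mapping from file extension to media type.
MEDIA_TYPES = {
    ".wav": "audio",
    ".mp3": "audio",
    ".avi": "video",
    ".wmv": "video",
    ".mpg": "video",
    ".jpg": "image",
    ".png": "image",
    ".bmp": "image",
}


def classify_media(path):
    """Return 'audio', 'video' or 'image' for a media file path."""
    extension = os.path.splitext(path)[1].lower()
    if extension not in MEDIA_TYPES:
        raise ValueError("unsupported media file: " + path)
    return MEDIA_TYPES[extension]
```

A controller could then branch on the returned type before starting playback and bringing itself to the front.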
The HaptekController class specifies the parameters for loading the character into one of the Haptek COM objects. Besides loading the character, it sets the background information and the mood value of the avatar.
The software logic and the presentation logic work closely together to serve any request from the front-end. The back-end of Neva consists of a database which holds the records for the users, avatars and books. Again, the communication between the logic layer and the back-end is bidirectional. For the table structure and the relationships between the tables, consult index 8.5.
5.2 System Walkthrough
Let’s consider a scenario in which a user approaches the system, and see how Neva responds. When nobody is logged in, the system displays an introduction to the library and invites users to log in. This acts as a screen saver for the system (see figure 21). The screen saver animation is designed in Macromedia Flash MX, and the media configuration XML file contains the name of this file. The software logic loads the animation file and passes control to the presentation logic, which starts playing the animation.
Figure 21: The introductory screen of Neva’s interface
As discussed in chapter 3, each user carries an RFID card which acts as a security or password key (in the future, the system could also be used without logging in, as one might place a book on the RFID reader to use the book services only). Each card is programmed with a unique key, and the kiosk in the library is equipped with an RFID card reader. When the user swipes the card, the reader extracts the required information from it and passes it to the logic layer. There are two pools of cards, which distinguish between users and books. In this case, the user pool responds to the request and identifies the user. Once the user is identified, Neva queries the user database to extract the user settings, which hold the following information:
• Avatar information
• Interested reading areas
• Visit history
Avatar information holds the name and the properties of the user's avatar, so that every time the user logs in, the system loads the avatar which the user has selected. Interested reading areas keep track of the user's reading interests: here the user specifies which areas of the library he or she would like to explore. This is helpful for notifying the user about new arrivals related to those interests. Visit history tracks the user's visits to the library. This information is crucial for announcing new arrivals, because it tells the system whether a book arrived before or after the user's last visit, and hence whether the user has already been notified about it.
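The identification step and the three kinds of user settings can be sketched as a simple lookup. The Python code below is an illustration under stated assumptions, not the Neva implementation: the tag IDs, the in-memory pools (which stand in for the database), and the names `UserSettings` and `identify` are all hypothetical. The default avatar name Nina and the book title are taken from elsewhere in the thesis.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class UserSettings:
    """The three kinds of user information listed in the text."""
    avatar: str = "Nina"  # default character named in the thesis
    interests: List[str] = field(default_factory=list)
    last_visit: Optional[date] = None


# Hypothetical in-memory pools standing in for the Neva database.
USER_POOL = {
    "U-001": UserSettings(avatar="Nina", interests=["HCI"],
                          last_visit=date(2005, 6, 1)),
}
BOOK_POOL = {
    "B-100": "Learn PHP in 24h",
}


def identify(tag_id):
    """Decide which pool an RFID tag belongs to and return its record."""
    if tag_id in USER_POOL:
        return ("user", USER_POOL[tag_id])
    if tag_id in BOOK_POOL:
        return ("book", BOOK_POOL[tag_id])
    return ("unknown", None)
```

With this structure, a user card yields the settings needed to load the avatar, while a book tag yields the title used by the book services.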
Once this login procedure is complete, the avatar welcomes the user and waits for the user to select one of the services it offers (see figure 22). The services are explained in detail in the services section.
Figure 22: Neva interface screen where all services are displayed
When the user clicks on "Library info", this service expands into three sub services and invites the user to select one of them (see figure 23).
Figure 23: Library info service expanding itself into its sub services
As one can see, there are three sub services for "Library info". If the user selects the “Staff” service, the avatar introduces the library staff, and to support this, the system displays images of the staff in the right corner during the introduction (see figure 24).
Figure 24: Staff service, introducing the staff to the user while staff images are shown in the right corner
If there is more than one staff member in the library, the system adds navigation buttons below the images to let the user browse the list.
When the user clicks on the "Rules and Regulation" service, the avatar starts explaining the regulations of the library. When the “Books and Articles” button is clicked, the avatar tells the user about the types of books and journals that can be read in the library.
When clicked, the Technology service expands into two sub services (see figure 25).
Figure 25: Screenshot of Neva's technology service, expanding into sub services
“Technology Overview” shows short video sequences to educate users about the technology installed in the library. While the video is shown, the avatar moves to the left corner and faces the video screen (see figure 13). “Lending PDA” works the same way, but its video contains material specific to the use of PDAs in the library.
The “Alert” service alerts the user to approaching deadlines, depending on the alarming value set in the Personal Data service (see section 4.3.3.2 for more details). It also tells the user the names of the books that need to be returned. When the user selects this service, the software logic queries the Neva database to retrieve the alarming value from the user table and the issuing information from the user_book_issue table, which records the books issued to users (see index 8.5 for the table structure). The results of this query are returned to the software logic, which compares the issue date with the system date. If the difference is less than or equal to the alarming value, it alerts the user by announcing the book names.
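The date comparison behind the Alert service can be illustrated with a short Python sketch. The thesis does not say how the return deadline is derived from the issue date, so the sketch assumes a fixed loan period: the constant `LOAN_DAYS` and its value are hypothetical, as is the function name `books_to_announce`. The input list stands in for rows of the user_book_issue table.

```python
from datetime import date, timedelta

# Hypothetical loan period in days; the thesis does not state one.
LOAN_DAYS = 14


def books_to_announce(issues, alarm_days, today):
    """Return the titles whose deadline is within `alarm_days` of `today`.

    `issues` is a list of (title, issue_date) pairs. Overdue books have
    a negative number of days left and are therefore announced as well.
    """
    titles = []
    for title, issued_on in issues:
        due_on = issued_on + timedelta(days=LOAN_DAYS)
        days_left = (due_on - today).days
        if days_left <= alarm_days:
            titles.append(title)
    return titles
```

With an alarming value of 5 days, a book issued ten days ago is announced under the assumed 14-day loan period, while a book issued yesterday is not.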
Then comes the "Search / Rating" service, which helps the user to find a required book or to view its user ratings from Amazon to get an idea of the contents of the book (see section 3.5 or 4.2.4 for more details). The system provides a tangible interface for the book ratings. All the books at the McLuhan Documentation Center have RFID tags attached to them. When users place a book on the RFID reader attached to the kiosk, the system extracts the information from its tag and uses the Amazon web service to get the user reviews or ratings.
When the user clicks on this service, the system asks the user to enter the name of the book in a text field (see figure 26).
Figure 26: System asking the user to enter the name of the book to search the book in the library or to check the user rating on the Amazon web site
Let’s consider a scenario in which the user types “Learn PHP in 24h” into the search text box to see whether the library has this book and, if so, where it can be found. In response to the user's request, the system searches its database for information about the book and the avatar announces the results. If the library has the book, the system shows its location to the user with the help of a small animation that highlights the book shelf (see figure 14).
When the "New Arrival" service is clicked, Neva scans the newly arrived books in the database, compares their arrival with the user's last visit and announces the titles of the new books. During this announcement, it displays the cover page of the book in the right corner (see figure 15). If there is more than one book to show, it provides a navigation bar under the title for viewing the others.
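The comparison with the last visit can be sketched as a simple filter. The Python code below is an illustration, not the Neva query: the book record layout (`title`, `area`, `arrived`) and the function name `new_arrivals` are hypothetical, and treating a book that arrived exactly on the day of the last visit as already announced is an assumption. The optional interest filter reflects the interested reading areas stored in the user settings.

```python
from datetime import date


def new_arrivals(books, last_visit, interests=None):
    """Return the titles of books that arrived after the last visit.

    `books` is a list of dicts with 'title', 'area' and 'arrived' keys
    (a hypothetical layout). If `last_visit` is None, the user has never
    visited and every book counts as new. If `interests` is given, only
    books in those areas are returned.
    """
    titles = []
    for book in books:
        if last_visit is not None and book["arrived"] <= last_visit:
            continue  # arrived on or before the last visit
        if interests and book["area"] not in interests:
            continue  # outside the user's reading interests
        titles.append(book["title"])
    return titles
```

The returned list preserves database order, so the navigation bar can step through the titles one by one.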
When the user clicks on the "Preferences" service, it expands itself into sub services: "Avatar Selection" and "Personal Data" (see figure 27).
Figure 27: Screenshot of Preferences service, expanding into “Avatar Selection” and “Personal Data” sub services
“Personal Data” asks the user to enter the data required to support personalization (see figure 17). For more details on the personal data, consult chapter 4, services.
“Avatar Selection” lets the user select one of the available avatars (see figure 16). If the user has not yet selected an avatar, the system loads the default character (Nina) for the user. Once an avatar has been selected, the system updates the user database and loads the user’s avatar the next time the user logs in.
The next chapter (Conclusion and Future Extensions) concludes the thesis report by summarizing the ideas and the work surrounding Neva. It also presents a preliminary user review of Neva and, finally, describes possible future extensions to Neva.
6 Conclusion and Future Extensions
6.1 Lessons Learned
The development of the Neva system is part of a research area in HCI that aims at producing more realistic and personalized life-like characters, here applied to the library environment. The prototype was implemented and demonstrated in the McLuhan Documentation Center at the ISNM International School of New Media at the University of Lübeck, Germany. Several dummy user profiles were defined, and students were given login chips to evaluate the system's performance and express their impressions (see figure 3).
Although it was not a formal evaluation, the preliminary review of users' reactions and feedback was helpful enough to flesh out Neva’s potential as a helpful, friendly and interesting information system. The design of Neva encouraged the students to engage intuitively in appropriate interaction with the system.
The idea of using multiple characters with different emotions and expressions was well received by the students. Among the characters, Nina was the most popular and attractive figure; for this reason, she was selected as the default character for Neva (see figure 28).
Figure 28: Nina; the most popular avatar in the library
One might object that a system with talking agents disturbs the silence of a library. Two suggestions were made to address this. The first is to install the system in a place where its volume does not distract the readers. The second is that, even where it would create some distraction, the user can use the headphones attached to the kiosk.
Although the initial prototype was developed and displayed at the university library, I believe that the results are applicable to a broad variety of application areas, as outlined below.
6.1.1 Museums
Like libraries, museums offer a great deal of information to their visitors. Because our lives are increasingly busy, time is becoming a critical factor. In such a situation Neva can prove effective by taking advantage of its personalization and friendly interface. Visitors can be asked to provide some details about their interests, and once they approach the kiosk, it offers them the services and information matching those interests. This way visitors can be guided to the areas of the museum that interest them most.
6.1.2 Exhibitions
Exhibition spaces need to offer their visitors a great deal of guidance. Sign boards and floor maps are the most traditional ways of explaining the territory. They work well in a small area, but in a bigger area these traditional means lose their effectiveness. Here again Neva can be helpful by guiding visitors in the desired direction. With its personalization capability it can guide them to the places they most want to see and suggest stalls according to their interests and preferences.
6.2 Directions for Future Work
The Neva prototype experience showed that, in order to provide the user with an engaging experience, the design and the role of the services play a central role. There is ample room for future work in the field of personalized life-like characters, because of its practical approach to real-life problems. The same holds for Neva: its future expansion relies on the following aspects.
6.2.1 Learning Capability
Right now, Neva’s learning is bound to its users' input. On first use, users provide the information the system needs to react to their areas of interest. Future versions of Neva could bypass this step and learn about their users by examining their visit history and the kind of browsing they do in the library. Neva could then build a learning model that helps her recognise her users' needs and interests.
6.2.2 Technology Realization
Adding new and improved technology to Neva could boost her performance and her capabilities to serve her users. Transferring the avatar capabilities from the kiosk to the PDA would be a step forward in this process. The avatar would then not only be available at certain points in the library, but users would have access to it all the time. Location tracking could then be realized in Neva as well. This would also strengthen Neva's guidance, as the avatar could guide users along the path rather than showing the book shelf from one perspective. Evaluating books would also become easier, as users would no longer have to pick up the books and place them on the kiosk every time; simply holding the PDA close to the book and selecting the user rating service would be enough.
To add more capabilities to the existing kiosk system, one might think of installing cameras. This would make Neva’s characters more proactive, as the cameras could detect users’ position and movement. The avatars could then greet and face users even when no user is logged in.
Neva has its own database, which at this point is not connected with the real library database of McLuhan Documentation Center. In future, it would be the next logical step to combine the features of both databases.
6.3 Conclusion
I built a personalized library assistant prototype whose purpose is to facilitate the interaction between library users and computers. I feel that supporting human-computer interaction in libraries is an exciting and useful new domain for life-like characters. Given the proliferation of libraries and the interest in reading specialized materials, this kind of interface design may become a familiar part of future libraries.
Human performance in the use of computer and information systems will remain an area of intense research and development. Design for information systems refocuses the designer’s effort to meet the needs of an ordinary user and to create an interesting and novel set of challenges.
Neva gives the designer these opportunities to think about and adapt such a system. It involves life-like characters, which belong to the emerging technologies. Their areas of application are expanding, and they are useful in many fields of life. The results obtained with Neva’s prototype encourage us to continue in these directions.
7 References
1: Andre, E., Rist, T., Mueller, J., Integrating Reactive and Scripted Behaviors in a Life-Like Presentation Agent. In Proceedings of Autonomous Agents 98, (Minneapolis/St. Paul, May 1998), ACM Press.
2: Cassell, J., Bickmore, T., Billinghurst, M., Campbell, L., Chang, K., Vilhjálmsson, H., Yan, H., Embodiment in conversational interfaces: Rea. In Proceedings of the Conference on Human Factors in Computing Systems (CHI 99), pp. 520-527.
3: Ishii, H., Ullmer, B., Tangible bits: towards seamless interfaces between people, bits and atoms. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, March 22-27, 1997, pp. 234-241.
4: Jung, B., Ahad, A., Weber, M., The Affective Virtual Patient: An E-Learning Tool for Social Interaction Training within the Medical Field. In Proceedings of TESI 2005 - Training, Education & Simulation International Conference. Nexus Media, 2005.
5: Phidgets USA.com, retrieved on (13 July 2005) http://www.phidgetsusa.com/
6: Franklin, S., Graesser, A., Is it an agent, or just a program? A taxonomy for autonomous agent. In Proceedings of the Third International Workshop on Agent Theories, Architectures, and Languages, published as Intelligent Agents III, Springer-Verlag, 1997, pp. 21–35.
7: Michael K., Dominik H., Antonio K., Adaptive multimodal presentation of multimedia content in museum scenarios, Künstliche Intelligenz, 2005, pp. 56-59.
8: Benford, S., Bowers, J., Fahlen, L. E., Greenhalgh, C., Snowdon, D., Embodiments, avatars, clones and agents for multi-user, multi-sensory virtual worlds, Multimedia Systems, Springer-Verlag, 1997.
9: Badler, N., Phillips, C., Webber, L., Simulating humans – computer graphics animation and control, Oxford University Press, Oxford. 1993.
10: Dautenhahn, K., The art of designing socially intelligent agents: science, fiction and the human in the loop. Applied Artificial Intelligence Journal, Special Issue on Socially Intelligent Agents, 1998, pp. 573-617.
11: Junior Exec, retrieved on (1 March 2005), http://www.juniorexec.gov.uk/juniorexec/jx_display_home.jsp?p_applic=CCC&p_service=Content.show&pContentID=419&
12: Mehrabian, A., Communication without Words, Psychology Today, vol. 2, no. 4, 1968, pp. 53-56.
13: Hudlicka, E., Billingsley, J., Representing behaviour moderators in military human performance models. In Proceedings of the Eighth Conference on Computer Generated Forces and Behavioural Representation, 1999.
14: Cañamero, L., Gaussier, P., Emotion Understanding: Robots As Tools and Models. Emotional Development: Recent research advances, Oxford University Press.
15: Bickmore, T., Cassell, J., Relational Agents: A Model and Implementation of Building User Trust. In Proceedings of ACM CHI 2001 Conference, Seattle, Washington, 2001.
16: Rickel, J., Johnson, W. L., Task-oriented collaboration with embodied agents in virtual worlds, Embodied Conversational Agents. MIT Press, Cambridge, MA, 2000.
17: Johnson, W. L., Rickel, J. W., Lester, J. C., Animated Pedagogical Agents: Face-to-Face Interaction in Interactive Learning Environments. International Journal of Artificial Intelligence in Education, 2000, pp. 47-78.
18: Isbister, K., Nakanishi, H., Ishida, T., Nass, C., Helper agent: Designing an assistant for human-human interaction in a virtual meeting space. In Proceedings of the International Conference on Human Factors in Computing Systems, CHI 00 (The Hague, The Netherlands, Apr. 2000). ACM, New York, pp. 57-64.
19: Cassell, J., et al., Embodied Conversational Agents. MIT Press, Cambridge, MA, 2000.
20: Dahlbäck, N., Höök, K., Sjölinder, M., Spatial Cognition in the Mind and in the World - the Case of Hypermedia Navigation, The Eighteenth Annual Meeting of the Cognitive Science Society, University of California, San Diego, 1996.
21: Andre, E., Rist, T., Müller, J., Employing AI methods to control the behavior of animated interface agents. Applied Artificial Intelligence, 1999, pp. 415-448.
22: Cañamero, L., Gaussier, P., Emotion Understanding: Robots As Tools and Models. Emotional Development: Recent research advances, Oxford University Press.
23: Persson P., AGNETA & FRIDA: A Narrative Experience of the Web? In Proceeding of the AAAI, Fall Symposium on Narrative Intelligence, North Falmouth, Massachusetts, 1999.
24: Ishida. T., Digital City Kyoto: Social Information Infrastructure for Everyday Life. Communications of the ACM (CACM), Vol. 45, 2002, pp. 76-81.
25: André, E., Rist, T., Presenting Through Performing. On the Use of Multiple Lifelike Characters in Knowledge-Based Presentation. In Proceedings of the Second International Conference on Intelligent User Interfaces (IUI 2000), pp. 1-8.
26: Borchers, J., Deussen, O., Knörzer, C., Getting It Across: Layout Issues for Kiosk Systems. In Proceedings of the Workshop on W3-Based Online Kiosk Systems, Third International World-Wide Web Conference, Darmstadt 1995. Reprinted in: SIGCHI Bulletin 27, pp. 68-74.
27: Holfelder, W., Hehmann, D., A Networked Multimedia Retrieval Management System for Distributed Kiosk Applications, In Proceedings of the 1994 IEEE International Conference on Multimedia Computing and Systems, May 1994.
28: Christian A. D., Avery B. L., Digital Smart Kiosk Project. Proceedings of CHI ’98 (1998), ACM Press, pp.155–162.
29: Lamel, L., Bennacef, S., Gauvain, J. L., Dartiguest, H., Temem J. N., User Evaluation of the Mask Kiosk, ICSLP '98, Sydney, Australia (1998).
30: Mäkinen, E., Patomäki, S., Raisamo, R., Experiences on a Multimodal Information Kiosk with an Interactive Agent. In Proceedings of NordiCHI 2002, ACM Press, 2002, pp. 273-276.
31: Free On-line Dictionary of Computing, retrieved on (28 February 2005), http://www.dictionary.net/bot
32: A.L.I.C.E. Artificial Intelligence Foundation, retrieved on (28 February 2005), http://www.alicebot.org/
33: Computing Dictionary, retrieved on (28 February 2005), http://www.hyperdictionary.com/computing/infobot
34: Microsoft Corporation © 2005, retrieved on (11 June 2005), http://www.microsoft.com/
35: Amazon.com, Inc. © 1996-2005, retrieved on (11 June 2005), http://www.amazon.com/gp/browse.html/104-44649080451105?%5Fencoding=UTF8&node=3435361
36: Macromedia Inc, © 1995-2005, retrieved on (11 June 2005), http://www.macromedia.com/software/flash/
8 Index
8.1 Configuration File
8.2 Character Preferences
8.3 Character Sound
8.4 Media XML File
8.5 Tables Structure and Relationship
8.6 Class Structure
8.7 Installation Guide for the Neva
8.7.1 Hardware Requirements:
• Memory: Minimum 512 MB
• CPU: 2 GHz or higher
• Disk space: 1 GB
• RFID reader
• Internet connection
8.7.2 Software Requirements:
Before installing Neva, the following components should be installed first:
1: Windows XP
2: Haptek Player. http://www.haptek.com/products/player/autoinstall/
3: Windows Media Player ver. 9.0 or higher. http://www.microsoft.com/windows/windowsmedia/default.aspx
4: Microsoft .Net Framework 1.1 or higher. http://www.microsoft.com/downloads/details.aspx?FamilyID=262d25e3-f589-4842-8157-034d1e7cf3a3&displaylang=en
(Usually components 3 and 4 are already installed on a computer with Windows XP.)
5: Phidget Library ver. 2.05 B
6: Macromedia Flash Player ver. 7.0
7: ODBC plug-in for MySQL
8: After installing the above components, run the Setup.exe file of Neva from the installation CD.
9: Once the application has been successfully installed, always run it from the installed folder by double-clicking the Neva icon, and not from the desktop shortcut or the program menu, as doing so might cause an audio problem.