BUILDING DIGITAL COMMUNITIES AND SUPPORTING COLLABORATION IN A HUMANITIES DIGITAL LIBRARY
Hamed Alhoori*^, Omar Alvarez X.*^, Richard Furuta*^, Du Li*, and Eduardo Urbina#
*Center for the Study of Digital Libraries, ^Department of Computer Science, and # Department of Hispanic Studies, Texas A&M University College Station / United States
Abstract
New Web technologies could enhance collaboration within humanities digital libraries. We have investigated possible extensions to the Cervantes Project’s digital library that support collaboration across its collection of materials, which centers on the life and works of the author of Don Quixote. We describe the basis for our extensions, drawn from the experiences of users of the collection, and the implementation of an integrated collaborative environment that improves collaboration between users of the library through well-known Web 2.0 technologies. The resulting system provides an initial step towards building a humanities-centered Digital Community.
Keywords Digital Communities, Digital Libraries, Geographical Maps, Awareness, Geotagging, Folksonomy, CSCW, Wikis, Web 2.0.
1. Background
Collaboration in the humanities has been identified as a critical issue, and one that may be addressable through digital tools. Bruce Cole mentions three goals for the Digital Humanities initiative: 1) use technology to make the humanities more accessible to everyone; 2) use technology to foster increased collaboration in the humanities; 3) explore how digital technology will change humanities scholarship and teaching [1]. Further, digital libraries (DLs) now play an increasingly important role in the humanities, although with a primary focus on supporting individual scholars working alone. Some researchers have recognized the collaborative value of digital libraries [2, 3]. A principal collaborative focus in digital libraries has been on users’ potential to contribute to the collection contents through annotations and feedback [4, 5]. More recent work has highlighted the importance of Web-based services that support collaboration among users, such as workspace sharing, end-user interaction, and user awareness [6, 7, 8]. However, those proposals in general do not emphasize integrating the collaboration services with the main functionality of DLs, which means that users must switch between different services when using DLs collaboratively.
The collaboration gap in the humanities can, in part, be bridged through networked applications, particularly Web-based services. However, conventional Web technologies are mostly suited to centralized applications with no strong requirements for notification and tightly coupled collaboration [9]. The so-called Web 2.0 revolution is now transforming the Web into a platform for sharing and collaboration by overcoming these limitations, which provides unprecedented opportunities for building digital communities and enabling collaboration in DLs [7].
We describe extensions to the Cervantes Project’s digital library (http://cervantes.tamu.edu/), based on users’ experiences with the library, that integrate the following Web 2.0-based collaboration services: a geotagging service, which lets users learn about other users and become aware of their specific interests and geographical locations, in order to support the building and growth of digital communities; a folksonomy tagging service, which allows users to classify and appraise the DL contents in familiar terms; and a wiki environment, which allows users to share content and communicate ideas freely. These services are integrated with standard services such as information search and retrieval.
2. MOTIVATION AND RELATED WORK
2.1. Users’ Experiences with a Humanities DL System
The Cervantes Project is a joint effort between Texas A&M University (TAMU) in the United States and the Universidad de Castilla-La Mancha in Spain. It started over ten years ago to build a digital library around Cervantes, one of the greatest writers in the Hispanic world. The main users of the Cervantes Project include students attending courses in the Department of Hispanic Studies at TAMU, and scholarly users with different languages and cultural backgrounds from many humanities courses around the world. To investigate the collaboration attitudes, experiences, and expectations of users of the Cervantes Project, we performed a user study to determine how they collaborate, which mechanisms they use to collaborate, and with whom they collaborate. We conducted short semi-structured interviews with 8 experienced users of the Cervantes DL web site, so the information we obtained is mainly qualitative. Most users say that they can browse the DL site and retrieve important information. However, they miss the kind of collaboration found in a classroom. One user pointed out: “The Cervantes Project data collection is amazing work, and it allows us to find out in just one site a bunch of important information”. Adequate information is what draws users to DLs, but collaboration features would enrich their experience. When we asked about features that would improve the Cervantes DL learning environment, the same user said: “frequently, while I search or use some content, I need some kind of classmate or professor feedbacks or supports, an instantaneous way to interact with them could be helpful”. Today, even users without a technical background know and successfully use synchronous and asynchronous collaboration tools such as email, wikis, instant messengers, and blogs.
One user explained how they collaborate when the need to interact arises while using the Cervantes Project: “sometimes I send an email to some friend but preferably I use Instant Messenger or phone, you know, for a faster communication”. When we asked him about common problems with these means of communication, he said, “while we discussed over the Internet about some Cervantes assignment, we need to use more than one application at the same time, for instance, some Cervantes web page, instant messenger, and our personal annotations. Most of the time coordination is a problem”. Finally, when asked with whom they collaborate, one user commented: “I have been interacting with my classmates and two other users from Spain” and another one said: “I have shared Cervantes information with some friends in other countries”. Overall, most of the interviewed users reported that they interacted only with their classmates, using email or instant messenger, while stressing the importance of getting to know and interacting with users in other countries. Based on the user study, we identified three key gaps in supporting collaboration: 1) Users know that other users around the world are using the same DL, but they do not know who they are; 2) They know that some other users are interested in the same DL, but they do not know what exactly they are interested in; and 3) When they know someone with similar interests, they do not have a customized environment in which to collaborate over the DL.
2.2. Related work
Marshall and Bly [10] carried out twenty contextual interviews with DL users and investigated how people encounter, save, and share published materials. In particular, their study highlights the time and effort users spend when they have to interrupt one activity to enter another because current DL systems lack integrated collaboration features. They suggest using more informal Web-based information spaces, such as bulletin boards, as a natural facilitator of interaction. However, how these technologies should be integrated with each other, and especially with current intentional collections such as digital libraries, remains an open question. Bentley et al. [9] presented the World Wide Web (Web 1.0) as a promising enabling technology for CSCW, based on their experience with the BSCW (Basic Support for Cooperative Work) project. They saw the Web’s potential in the CSCW context because it allows users to work in a heterogeneous environment and provides easy and well-known methods to search, browse, retrieve, and publish information. However, they also emphasized that the Web at that time did not offer genuine features for more collaborative forms of information sharing [8].
Nowadays, more and more collaborative applications in a variety of domains use Web 2.0 technologies. While they use formal and informal collaborative services, in most cases these new technologies are not fully integrated. Moreover, current use of Web 2.0 technology in DL applications has been more about system feedback, that is, what users think about the available services, than about supporting interaction among users. Donoso et al. present a comparative study of the Web 2.0 technologies used in DL systems. In 2006 they surveyed ten digital libraries for available Web 2.0 services and found that this technology is not yet commonplace in these systems [18]. In the following, we present a set of working projects and proposals whose collaborative features are intended principally to exploit current Web 2.0 technologies in order to improve the user experience in a variety of domains.
The CYCLADES project [6] provides an integrated environment for scholars and scholarly communities to access multidisciplinary Web-based files. It includes a set of collaboration features that allow a user to create a personal information space, maintain awareness of other users, and build communities by exchanging information and knowledge with other users. However, this tailored application only integrates heterogeneous information spaces based on archives adhering to the Open Archive Initiative (OAI) [11]. It currently allows community members to interact only by using “free tagging” and sharing folder-based contents. The University of Houston Library recently started a project to reshape its Web services based on the pillars of Web 2.0 [7].
They have sought ways to make their Web site more interactive by combining different technologies, principally blogs and wikis used as shared content spaces. This allows staff members to interact and collaborate, including personalizing and customizing their workspaces. They have also considered some collaborative services for end users, such as ways to interact with staff members and to add content. The idea of adding geographical identification metadata to information resources has recently appeared in many non-DL Web 2.0 systems. For example, Flickr allows users to add and search pictures organized by location [12]. The Web-a-Where system [13] uses an automatic tagging process to associate geographic metadata with Web pages. This was implemented using WebFountain, an IBM data mining system, for automatic geotagging of Russian websites [14]. Michael Habib described his focus on developing virtual communities, including a set of skills that may prove most useful in the future library environments described by the Library 2.0 model [8]. He plans to develop a model of online communities and services that promotes the idea of the digital library as a collaborative place. Like physical libraries, digital libraries need to be community centers, collaborative study spaces, meeting spaces, and so on. DL users need integrated services that support a genuinely collaborative environment, and DL designers and developers have a responsibility to provide them. In addition to information retrieval services, robust and easy-to-use collaborative services may strengthen users’ learning and research experiences.
In this sense, Web-based digital library communities in the humanities, built on well-known collaborative services (such as the Web 2.0 technologies proposed in Habib’s model) and used unobtrusively, should allow geographically distributed scholars and scholarly communities to communicate and share digital content, ideas, and thoughts productively.
3. WEB 2.0 TECHNOLOGIES
Web 2.0 technologies have been widely used in many web services for supporting collaboration and sharing in a range of application domains. To address the above-identified collaboration gaps in DL systems, we focus on the following three well-known Web 2.0 technologies: tagging, geotagging, and wikis.
3.1. Geotagging
Geotagging is the process of adding geographical identification metadata to various media such as websites. This data usually consists of geographic coordinates that can be used to help users find a wide variety of location-related information. We were interested in creating a set of nodes (users) around the world, setting up a location map using Google Maps technology, and grouping nodes by city or country. We would also like to give users the capability of adding points of interest related to Cervantes’ work onto the maps.
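The grouping of user nodes by city or country described above can be sketched as follows. This is a minimal illustration, not the Cervantes Project implementation: the `UserNode` class, its fields, the `group_nodes` helper, and the sample users are all assumptions made for this example, and the map rendering itself is left out.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UserNode:
    """One library user placed on the map (hypothetical structure)."""
    name: str
    city: str
    country: str
    lat: float  # latitude in decimal degrees
    lng: float  # longitude in decimal degrees


def group_nodes(nodes, key="country"):
    """Group user nodes by 'city' or 'country'."""
    if key not in ("city", "country"):
        raise ValueError("key must be 'city' or 'country'")
    groups = defaultdict(list)
    for node in nodes:
        groups[getattr(node, key)].append(node)
    return dict(groups)


# Made-up sample users; coordinates are approximate.
nodes = [
    UserNode("ana", "Ciudad Real", "Spain", 38.99, -3.93),
    UserNode("luis", "Madrid", "Spain", 40.42, -3.70),
    UserNode("kim", "College Station", "United States", 30.63, -96.33),
]
by_country = group_nodes(nodes, key="country")
for country, members in sorted(by_country.items()):
    print(country, [m.name for m in members])
```

Each group could then be drawn as a cluster of markers, using the stored latitude and longitude of its members.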
3.2. Tagging
Tagging is a common way of classifying and filtering information from data collections. A taxonomy uses “predetermined tagging” (a controlled vocabulary) to classify content, and largely incorporates some form of semantic and hierarchical structure [8]. Although controlled vocabularies provide many benefits for searching and browsing, it has long been recognized that they are not always adequate for online resource discovery [15, 16, 17]. The folksonomy method, on the other hand, organizes content by using “free tagging” (an open vocabulary), which is now widely used in many Web 2.0 services. Our target in this project was to implement a folksonomy approach, in order to build a content classification database based on the vocabulary of an international user community.
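The difference between the two approaches can be illustrated with a short sketch. The vocabulary below reuses the element categories named later in this paper; the function names and the normalization rule (trimming whitespace and lowercasing) are assumptions made for this example rather than a description of the project’s code.

```python
# Controlled vocabulary: only these predetermined terms are accepted.
CONTROLLED_VOCABULARY = {
    "Events", "Landmarks", "Iconography",
    "Bibliography", "Texts", "Biographies",
}


def taxonomy_tag(term):
    """Accept a term only if it belongs to the controlled vocabulary."""
    if term not in CONTROLLED_VOCABULARY:
        raise ValueError(f"{term!r} is not in the controlled vocabulary")
    return term


def folksonomy_tag(term):
    """Accept any non-empty term; collapse whitespace and lowercase it
    (an assumed normalization so that near-identical tags coincide)."""
    normalized = " ".join(term.split()).lower()
    if not normalized:
        raise ValueError("a tag must contain at least one character")
    return normalized


print(taxonomy_tag("Landmarks"))
print(folksonomy_tag("  Molinos de  Viento "))
```

The open vocabulary lets users tag in their own language and terms, at the cost of synonyms and spelling variants that a controlled vocabulary would rule out.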
3.3. Shared Workspaces (Wiki)
Wikis allow users to freely create and edit Web page content using any Web browser. Current use of wikis in digital libraries has focused more on collecting user feedback about available services than on supporting interaction among users [17]. Our goal is to use a wiki as an integrating element: the core part of our system, which links the DL content with the other collaborative services (the tagging and geotagging elements). We nevertheless continue to use its conventional capability as a communication medium and a shared content space in the DL context.
4. Supporting Collaboration
Our choice of Web 2.0 collaborative tools was guided by the analysis of how current Web 2.0 technology is used and by the identified gaps in supporting collaboration: 1) Users do not know who else is using the same DL; 2) They do not know what exactly other users are interested in; and 3) They do not have a customized environment in which to collaborate with other users over the DL. We addressed these gaps by focusing on three techniques: geotagging, folksonomy tagging, and wikis.
4.1. Geotagging
To address the first two gaps, we implemented an environment based on Google Maps. It allows Cervantes users to quickly identify other users with common interests, along with their physical location. Each set of users is shown in a different color, which reduces the time users need to find a new collaborator who matches their needs.
Figure 1. Cervantes Project users’ map
Figure 1 shows an example of an online user. The node color indicates the user’s capabilities according to the user’s role in the online community. Roles include visitors, interested users, students, scholars, and professors. A mouse-over or click event triggers a popup menu showing detailed user information, such as the user’s background, interests, image, links to preferred sites, and the local web space where other users can hold discussions with that user. In addition, the system allows users to create and maintain the whole range of elements related to Cervantes, such as Events, Landmarks, Iconography, Bibliography, Texts, and Biographies (taxonomy tagging) (see Figure 2). Furthermore, users can tag and filter content according to the element they are interested in. For example, all the locations that are related to Cervantes would be represented by yellow nodes. A user can add a point of interest, along with various information and links related to that point. Other users can discuss the point in the web space with the user who added it. A high-ranked user, such as a professor or scholar, can edit the point by adding or removing information. This helps maintain an up-to-date scholarly web space where users can share their information while scholars can clarify it and keep the space accurate for scholarly research.
Figure 2. Cervantes Project Map with points of Interest
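The role-based coloring, category filtering, and editing rule described above can be sketched as follows. The concrete colors, the `PointOfInterest` structure, and the function names are hypothetical; the paper states only that node colors distinguish roles and that high-ranked users such as scholars and professors can edit a point. Allowing the user who added a point to edit it as well is a further assumption of this sketch.

```python
from dataclasses import dataclass, field

# Hypothetical role-to-color mapping; the actual colors are not specified.
ROLE_COLORS = {
    "visitor": "gray",
    "interested user": "blue",
    "student": "green",
    "scholar": "orange",
    "professor": "red",
}
# Roles treated as "high-ranked" for editing other users' points.
EDITOR_ROLES = {"scholar", "professor"}


@dataclass
class PointOfInterest:
    """A user-contributed map point (hypothetical structure)."""
    title: str
    category: str  # e.g. "Landmarks", "Events", "Texts"
    added_by: str
    links: list = field(default_factory=list)


def marker_color(role):
    """Return the marker color used for a user with the given role."""
    return ROLE_COLORS[role]


def can_edit(point, user_name, user_role):
    """High-ranked users may edit any point; the creator may edit
    their own point (the second rule is an assumption)."""
    return user_role in EDITOR_ROLES or user_name == point.added_by


def filter_points(points, category):
    """Keep only the points tagged with the requested category."""
    return [p for p in points if p.category == category]


point = PointOfInterest("Alcalá de Henares", "Landmarks", added_by="ana")
print(marker_color("scholar"), can_edit(point, "luis", "student"))
```

Keeping the permission check in one function makes it easy to tighten or relax the rule without touching the map code.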
4.2. Folksonomy Tagging
To further address the second gap, we implemented a folksonomy-based tagging environment within an online Internet forum. We provided users with various sub-forums in which they can discuss Cervantes topics online (see Figure 3). When members create a new Cervantes thread in the forum, they can enter a group of tags that describe the topic of the thread. The entered tags are shown in a “tag cloud” (see Figure 4) that displays the most used and searched tags within the forum. The tag cloud gives other users a way to browse the forums.
Figure 3. The Cervantes Project Tag Cloud
The more often a tag is used relative to the others, the larger its text becomes, as can be seen in Figure 4. Users can click on any tag to list all the threads associated with that tag. When a user selects a thread, the thread is displayed with a list of all its tags above it. Moreover, tags can be added automatically, with an option to exclude certain keywords (stop words). Manual tagging helps users learn what other users want to know about, while automatic tagging identifies the most used keywords in the online environment.
Figure 4. The Cervantes Project Tag Cloud
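The two mechanisms described above, text size that grows with tag usage and automatic tagging that skips stop words, can be sketched as follows. The linear scaling rule, the pixel range, the stop-word list, and the function names are assumptions chosen for this example; the paper does not specify how the forum computes either.

```python
import re
from collections import Counter

# Illustrative stop-word list; a real deployment would use a fuller one.
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}


def auto_tags(text, limit=5):
    """Suggest the most frequent non-stop-words of a post as tags."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return [word for word, _ in counts.most_common(limit)]


def tag_cloud_sizes(tag_counts, min_px=12, max_px=32):
    """Map each tag's usage count to a font size by linear scaling."""
    if not tag_counts:
        return {}
    lowest = min(tag_counts.values())
    span = max(tag_counts.values()) - lowest
    sizes = {}
    for tag, count in tag_counts.items():
        if span == 0:
            sizes[tag] = min_px  # all tags equally used
        else:
            scaled = (count - lowest) * (max_px - min_px) / span
            sizes[tag] = round(min_px + scaled)
    return sizes


usage = {"quixote": 10, "sancho": 5, "windmills": 1}
print(tag_cloud_sizes(usage))
print(auto_tags("The windmills of La Mancha and the windmills of Don Quixote"))
```

A logarithmic scale is a common alternative when a few tags dominate, since linear scaling then leaves most tags at the minimum size.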
4.3. Integrated Shared Workspaces with wikis
To address the third gap, we implemented a customized wiki. We created a wiki space to organize all of the project’s functionality. The wiki is used as a portal to the main sections of our project. It can also be used to edit and organize the existing sections of the CP web site. Users are free to edit text, include and share multimedia content, and access controlled Cervantes Project content. Using a wiki as a textual editing environment helps keep an edited text up to date by allowing users to share their information and knowledge (see Figure 5). At the same time, the wiki stores all the changes that were made to a specific text. Hence, it allows users to contribute to the editing process and gives scholars an easy mechanism for quickly reverting any inaccurate changes.
Figure 5. Cervantes Project Wiki.
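The revision mechanism described above, in which every change is stored and a scholar can undo an inaccurate one, can be sketched as follows. The `WikiPage` class and its methods are hypothetical and do not reflect the API of the wiki software used in the project; the sketch only shows the underlying idea of an append-only history.

```python
from dataclasses import dataclass, field


@dataclass
class WikiPage:
    """A page that keeps every revision (hypothetical structure)."""
    title: str
    revisions: list = field(default_factory=list)  # (author, text) pairs

    def edit(self, author, text):
        """Store a new revision and return its index in the history."""
        self.revisions.append((author, text))
        return len(self.revisions) - 1

    @property
    def current(self):
        """Text of the latest revision, or '' for an empty page."""
        return self.revisions[-1][1] if self.revisions else ""

    def restore(self, index, author):
        """Bring back an earlier revision by re-saving its text as a new
        revision, so the inaccurate edit stays visible in the history."""
        if not 0 <= index < len(self.revisions):
            raise IndexError("no such revision")
        _, old_text = self.revisions[index]
        return self.edit(author, old_text)


page = WikiPage("Chapter 1")
good = page.edit("scholar", "En un lugar de la Mancha")
page.edit("student", "En un lugar de la Mancho")  # inaccurate change
page.restore(good, "scholar")
print(page.current, len(page.revisions))
```

Restoring by appending, rather than deleting, keeps a complete record of who changed what, which matters when the text is itself an object of scholarly editing.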
5. Conclusion
Nowadays, more and more collaborative applications in DLs and other domains use Web 2.0 technologies. While they use collaborative services, in most cases these services are not fully integrated with DL content. There are working projects and proposals whose collaborative features are intended principally to exploit current Web 2.0 technologies in order to improve the user experience [6, 7]. But the integration of collaborative services with heterogeneous DL content is limited [6], and the main interest is not in collaboration among end users [7]. To the best of our knowledge, no DL system has addressed the three elements identified above, which together make up the users’ collaboration gap. Our approach is based on integrating geotagging as a graphical means for users to learn about each other, tagging as a way of identifying common user interests, and wiki spaces as a communication and shared work space, all of them as collaboration-enabling technologies. We propose these as an integrated collaborative environment, embedded in DL sites, that avoids interrupting users’ main activities. We expect that it will encourage and allow geographically distributed users to communicate and share digital content and thoughts more quickly, easily, and productively. Humanists will thus have the tools to create communities with similar interests, as well as to share and classify their analyses. These Web 2.0 services also allow researchers to better understand the collaborative behavior of these developing online communities. Although we were able to create a collaborative environment that provides users with many well-known collaborative features, the relationships among users could be improved by connecting their similar interests automatically and by providing synchronous collaborative tools. We are also aware that the system’s functionality and usability still need to be evaluated by real users.
In this sense, some of the issues that we are going to analyze relate to these questions: “How do humanists see the fusion of formal content and informal collaborative tools (Web 2.0)?”, “What benefits does this fusion bring to them?”, “How can they be encouraged to feel comfortable in using this technology?”.
6. References
[1] Cole, B., Digital Humanities Centers Summit Meeting, National Endowment for the Humanities, Washington, DC, April 2007.
[2] Levy, D. M., Marshall, C. C., Going digital: a look at assumptions underlying digital libraries, Communications of the ACM, 1995, pp. 77-84.
[3] Twidale, M. B., Nichols, D. M., A Survey of Applications of CSCW for Digital Libraries, Technical Report CSEG, Computing Department, Lancaster University, 1998.
[4] Nichols, D. M., Pemberton, D., Dalhoumi, S., Larouk, O., Belisle, C., Twidale, M. B., DEBORA: Developing an Interface to Support Collaboration in a Digital Library, ECDL, Lisbon, Portugal, Springer-Verlag, 2000, pp. 239-248.
[5] Torres, R., McNee, S. M., Abel, M., Konstan, J. A., Riedl, J., Enhancing Digital Libraries with TechLens+, ACM JCDL, Tucson, Arizona, USA, 2004.
[6] Candela, L., Straccia, U., The Personalized, Collaborative Digital Library Environment CYCLADES and Its Collections Management, SIGIR 2003 Ws Distributed IR, LNCS 2924, pp. 156-172, 2003.
[7] Coombs, K. A., Building a Library Web Site on the Pillars of Web 2.0, Computers in Libraries, Vol. 27 No. 1, January 2007. http://www.infotoday.com/cilmag/jan07/Coombs.shtml
[8] Stephens, M., The Academic Library 2.0 Model: An ALA TS Blog Interview with Michael C. Habib, ALA TechSource, January 2007. http://www.techsource.ala.org/blog/2007/01/the-academic-library-20-model-an-ala-ts-bloginterview-with-michael-c-habib.html
[9] Bentley, R., Horstmann, T., Trevor, J., The World Wide Web as enabling technology for CSCW: The case of BSCW, Journal of Computer-Supported Cooperative Work: Special issue on CSCW and the Web, Vol. 6, 1997.
[10] Marshall, C. C., Bly, S., Sharing Encountered Information: Digital Libraries Get a Social Life, ACM JCDL’04, June 7–11, 2004, Tucson, Arizona, USA.
[11] Open Archive Initiative, Accessed on May 2007. http://www.openarchives.org/
[12] Geotagging Flickr, Accessed on April 2007. http://www.flickr.com/groups/geotagging/pool/map?mode=group
[13] Amitay, E., Har’El, N., Sivan, R., Soffer, A., Web-a-Where: Geotagging Web Content, SIGIR’04, ACM, July 2004.
[14] Pyalling, A., Maslov, M., Braslavski, P., Automatic Geotagging of Russian Web Sites, WWW’06, ACM, May 2006.
[15] Lancaster, F. W., Indexing and Abstracting in Theory and Practice (3rd Ed.), Thomson-Shore, Inc., Michigan, USA, 2003.
[16] Macgregor, G., McCulloch, E., Collaborative Tagging as a Knowledge Organization and Resource Discovery Tool, Centre for Digital Library Research, Department of Computer & Information Sciences, University of Strathclyde, 2005.
[17] Mai, J. E., Classification in Context: Relativity, Reality, and Representation, Knowledge Organization, Vol. 31 No. 1, pp. 39-48, 2004.
[18] Donoso, R., Ramirez, J., Services Diversification for Digital Libraries Libraries 2.0: Wikis, Blogs, Social Book Mark, RSS (Diversificación de Servicios para Bibliotecas Digitales), Seminario de Infotecnologia en Accion, 2007.