Log Data in Digital Libraries
Maristella Agosti
Department of Information Engineering, University of Padua, Italy
[email protected]
Abstract— This study presents relevant characteristics of library automation systems and digital library systems that need to be taken into account in addressing the study of log data in digital libraries. For all the different categories of users of a digital library system, the quality of services and documents the digital library supplies are very important. Log data constitute a relevant aspect in the evaluation process of the quality of a digital library system and of the quality of interoperability of digital library services. A general approach for the analysis of log data generated in the use of services of digital library systems is introduced.
I. INTRODUCTION
Log is a concept commonly used in computer science; in fact, log data are collected by an operating system to make a permanent record of events during the usage of the operating system itself. This is done to better support its operations, and in particular its recovery procedures. Drawing on the experience gained in the management of operating systems and of many application systems that manage permanent data, log procedures are commonly put in place to collect and store data on the usage of an application system by its users. Initially, these data were mainly used to manage recovery procedures of the application system, but over time it became apparent that they could also be used to study the usage of the application by its users, and to better adapt the system to the objectives the users were expecting to reach.

The paper addresses the study of log data in digital libraries. Although this type of study is still at an early stage of development, it needs to be addressed comprehensively and future directions need to be proposed. To achieve this, the paper is organized as follows. Section II introduces the relevant characteristics of library automation systems, which can be considered the precursors of digital library systems and whose use by final users has been extensively studied through log data. Section III presents the relevant characteristics of digital library systems. Section IV introduces the proposed approach to study information access services in digital library systems.

II. LIBRARY AUTOMATION SYSTEMS
Initial application systems able to manage the permanent data of interest to libraries were named library automation systems and they first appeared in the 1970s. Those application systems were only able to manage the catalogue data representing physical library objects – such as books, journals,
and reports – that were held in a real and physical library, i.e. a physical place external to the application system, and were only able to refer to the physical library objects through the managed catalogue data, as, for example, in the library automation project that was conceived in the late 1970s by CINECA1 [1], [2]. Catalogue data are now usually referred to as metadata, since they are data that represent other data. Catalogue data only represented physical objects held in real and physical libraries, so objects held in archives and museums were not represented and managed at that time in software application systems.

In the 1980s the most advanced library automation systems were designed so as to include procedures able to collect log data. Log data were collected to manage the system itself, and especially to monitor the usage of system search facilities by users. The search facility designed for user search and access to catalogue data was named Online Public Access Catalogue (OPAC), a name still in use today. An OPAC is a sophisticated piece of software designed to provide final users with direct access to the catalogue data without the intervention of a professional user and to make available to them all the data in the catalogue database managed by the software system. The catalogue database is constructed by professional librarians who use authority control rules in describing authors, place names and other relevant catalogue data [3]; over time the librarians construct many authority files where the software system stores all lists of preferred or accepted forms of names and other relevant headings [4]. The complex database managed by a library automation system is a coherent collection of catalogue data and authority files which can be searched by the OPAC system to give a more professional and reliable answer to the final user.
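To make the role of authority files in log analysis more concrete, the following minimal sketch maps logged OPAC query terms to preferred headings before counting them. The contents of `AUTHORITY_FILE` and the function names are hypothetical illustrations, not taken from any of the systems discussed here; a real authority file would be far larger and governed by cataloguing rules.

```python
from collections import Counter
from typing import Iterable, Optional

# Hypothetical authority file: variant forms (lower-cased) -> preferred heading.
AUTHORITY_FILE = {
    "twain, mark": "Twain, Mark",
    "clemens, samuel langhorne": "Twain, Mark",
    "padova": "Padua",
    "padua": "Padua",
}


def preferred_heading(term: str) -> Optional[str]:
    """Return the preferred heading for a query term, or None if unknown."""
    # Normalize case and collapse repeated whitespace before the lookup.
    key = " ".join(term.lower().split())
    return AUTHORITY_FILE.get(key)


def count_by_heading(logged_terms: Iterable[str]) -> Counter:
    """Count logged query terms grouped by their preferred heading."""
    counts: Counter = Counter()
    for term in logged_terms:
        heading = preferred_heading(term)
        label = heading if heading is not None else "(no authority match)"
        counts[label] += 1
    return counts
```

Grouping by preferred heading, rather than by the raw string the user typed, is one way an analysis could exploit the authority data that distinguishes an OPAC from a free-text search service.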
The log data on the usage of an OPAC system store information on the specific queries which have been made by final users, referring to the specific authority files from which the data were extracted. This means that the analysis of the OPAC queries can be used to better understand the effective use the final user makes of the data stored by the library automation system.

Traditional OPAC systems were accessible by registered users and through public login procedures. In both cases it was possible to trace each user/system interaction and each user session was identifiable; in this phase each application system, even in a distributed environment, was reached using a system-dependent interface. We call this type of access, which is depicted in Figure 1, pre-Web access, since it is only from the introduction of Web browsers that an application has become reachable in a distributed environment through a standard software interface.

[Figure 1 diagram labels: Library Automation Search Software, Management Software, Internet, Digital Contents]
Fig. 1. Architecture of a Library Automation Search Service before the Web.

A. Log Data in Library Automation Systems
As with an operating system, log data can be collected during the use of a library automation system to monitor the functioning of the system and the usage of the system itself by final users. Log data concerning the interaction between final user and system can be collected in a systematic manner to monitor the use of the system by final users, which means recording log data of the OPAC system to study its use and to consolidate it into a tool that meets end-user search requirements. Two significant examples of the automatic and systematic collection of log data for monitoring and studying the interaction between user and search facilities of library automation systems are the logging procedures designed and implemented in Okapi [5], and in DUO [6].

B. Usage of Log Data in Evaluating OPAC Systems
In [5] a detailed record of the features which can be evaluated in an OPAC directly accessible by its final user is given as follows: technical performance, information retrieval performance, and user behavior, which includes studies of users and of use, user profiles, user search patterns, and user interaction success. As can be appreciated, most of those aspects are still considered of interest in studying the user-system interaction in present search systems.

C. Differences among Pre-Web and Web Accessible Systems
It is important to note that there are some dramatic differences between systems that are directly accessible through dedicated software, such as the OPAC systems of the time before the Web, and the systems now available through the Web; for example, OPACs accessible through the Web make the interaction available through a standard and generic interface. If we were to overlook those differences, we would risk committing errors in the design of our analysis of log data which could jeopardize the interpretation of the results and the comparison between systems. In the following we analyze such differences and underline aspects that have to be taken into account in analyzing log data with the purpose of improving available systems or services, and of comparing the search systems at hand.

1 http://www.cineca.it/en/index.htm
III. DIGITAL LIBRARY SYSTEMS
Towards the end of the 1980s it became apparent that a library automation system could not only manage catalogue data or metadata describing physical objects, but also digital representations of some types of physical objects. Later on some objects started to appear in a digital form, so the collection of types of descriptions of physical objects and of digital objects themselves was becoming increasingly diversified and complex. Former library automation systems appeared to be limited in managing data related to such a diversified situation, so the need to envisage and design a new generation of systems able to face the new reality of interest was evident. This new type of system was named digital library system to highlight that the objects comprising the collection of interest were mostly full digital objects. Maintaining the term library was later considered misleading, since the new collections of interest were no longer only the objects managed in real physical libraries, but also the objects managed in real archives or museums, and also in film or music libraries.

Present digital library systems are complex software systems, often based on a service-oriented architecture, able to manage complex and diversified collections of digital objects. One significant aspect that still relates present systems to the old ones is that the representation of the content of the digital objects that constitute the collection of interest is still done by professionals. This means that the management of metadata can still be based on the use of authority control rules in describing authors, place names and other relevant catalogue data. A digital library system can exploit authority data that keep lists of preferred or accepted forms of names and all other relevant headings. This is a dramatic difference between digital library systems and search engines, and it is usually overlooked in the analysis of log data.
In fact, a search engine often becomes a specific component of a digital library system, when the digital library system faces the management and search of digital objects by content in the same manner as information retrieval systems and search engines [7]. In all other types of searches, either the digital library system makes use of authority data to respond to final users in a more consistent and coherent way through a search system that is a sort of new-generation OPAC system, or the system supports the full content search with a service that gives the final users the facilities of a search engine.

Finally, it is worth underlining that the access to each service a digital library system provides is usually supplied through a Web browser, and not through a specifically designed interface. This means that the analysis of user interaction with systems that have a Web-based interface requires devising ways to support the reconstruction of sessions in a setting, like the Web, where sessions are not naturally identified and kept.
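As an illustration of what reconstructing sessions on the Web can involve, the sketch below groups logged requests into sessions heuristically. It rests on two assumptions that are not prescribed by this paper: a client is approximated by its IP address and user-agent string, and a gap of more than 30 minutes between consecutive requests from the same client starts a new session. The function name and the tuple layout are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed inactivity threshold; a common heuristic, not a value from the paper.
SESSION_TIMEOUT = timedelta(minutes=30)


def reconstruct_sessions(requests, timeout=SESSION_TIMEOUT):
    """Group requests into sessions.

    requests: iterable of (timestamp, ip, user_agent, url) tuples.
    Returns a list of sessions, each a time-ordered list of requests.
    """
    # Approximate each client by the (IP address, user-agent) pair.
    by_client = defaultdict(list)
    for req in sorted(requests, key=lambda r: r[0]):
        by_client[(req[1], req[2])].append(req)

    sessions = []
    for client_requests in by_client.values():
        current = [client_requests[0]]
        for req in client_requests[1:]:
            # A long pause closes the current session and opens a new one.
            if req[0] - current[-1][0] > timeout:
                sessions.append(current)
                current = [req]
            else:
                current.append(req)
        sessions.append(current)
    return sessions


if __name__ == "__main__":
    t0 = datetime(2007, 6, 1, 10, 0)
    demo = [
        (t0, "192.0.2.10", "BrowserA", "/portal/"),
        (t0 + timedelta(minutes=10), "192.0.2.10", "BrowserA", "/portal/search"),
        (t0 + timedelta(minutes=50), "192.0.2.10", "BrowserA", "/portal/"),
        (t0, "192.0.2.20", "BrowserB", "/portal/"),
    ]
    print(len(reconstruct_sessions(demo)))
```

Such a heuristic can merge distinct users behind a shared address or split one user across sessions, which is why the choice of client key and timeout should be stated whenever results are compared between systems.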
[Figure 2 diagram labels: Digital Library System, Digital Library Services, Multilinguality, Index, Visualization, Annotation, Personalization, Service-oriented Digital Library Framework, Documents and Collections]
Fig. 2. An Example of a Service-oriented Digital Library System.
A. Two Significant Examples of Digital Library Systems
The previous discussion shows that a digital library system is a complex system able to support diversified functions and services. To better clarify how complex and powerful a digital library system can be, two recently available significant and distinct examples of digital library systems are briefly presented. They are DelosDLMS and The European Library. DelosDLMS is a prototype for the next generation of digital library management systems, which is the result of the joint effort of partners in the DELOS Network of Excellence2 and represents the state of the art in the conception and design of digital library management systems. The European Library3 represents a state-of-the-art effective service providing access to the catalogues and digital collections of most European national libraries via one central multilingual Web interface, and constitutes a good starting point and forerunner for a future European Digital Library.

1) DelosDLMS: DelosDLMS, the prototype Digital Library Management System developed in the context of the DELOS Network of Excellence4, is a relevant example of the new generation of service-oriented digital library systems. DelosDLMS [8] combines a rich set of features in a combination unavailable in any existing system. It combines text and audio-visual searching, offers personalized browsing using new information visualization and relevance feedback tools, provides novel interfaces, allows retrieved information to be annotated and processed, and integrates and processes sensor data streams. The system is built over OSIRIS, a middleware environment initially developed at ETH Zurich and then expanded and maintained at the University of Basel. OSIRIS allows the building of process-based digital library applications starting from services (and already existing processes alike), and executes them in a distributed P2P fashion.

The philosophy behind DelosDLMS is that digital library applications can be easily built starting from specialized services produced independently from each other. The basic architecture of DelosDLMS and, in general, of a service-oriented digital library system is depicted in Figure 2. DelosDLMS has been developed in two different integration phases, with the results of the first integration phase being reported in [9], and the result of the second phase being reported in [10]; the overview of all the services which have been integrated so far is given in Figure 3 (see footnote 5).

2) The European Library: The European Library is a non-commercial organization which provides the services of a physical library and offers search facilities for the resources of many of the European national libraries; available resources can be either digital or bibliographical, e.g. books, posters, maps, sound recordings, and videos. The European Library is a service of the “Conference of European National Librarians” (CENL)6 and it is hosted by the Koninklijke Bibliotheek, The Netherlands7. The European Library initiative aims at providing a “low barrier of entry” for the national libraries, which should be able to join the federation with only minimal changes to their systems [11]. This means that The European Library exists to open up the universe of knowledge, information and culture of all European national libraries, where a national library is the library specifically established by a country to store its information database. National libraries usually host the legal deposit and the bibliographic control centre of a nation. Currently The European Library gives access to 150 million entries across Europe, but the amount of referenced digital collections is constantly increasing8. The European Library portal9 is an evolving service; at present it is in its version 1.6, and its home page is reported in Figure 4.

The European Library Portal is constituted by the three components represented in Figure 5:
• a Web server - which provides users with access to the services;
• a central index - which harvests catalogue records from national libraries, supports the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)10, and provides integrated access to them via Search/Retrieve via URL (SRU);
• a gateway between SRU and Z39.50 - which makes accessible through SRU also those national libraries which would otherwise be accessible only through Z39.50 (see footnote 11).
In addition, the interaction between the portal, the federated libraries, and the user mainly happens on the client side by means of an extensive use of JavaScript and Asynchronous JavaScript Technology and XML (AJAX)12.

2 http://www.delos.info/
3 http://www.theeuropeanlibrary.org/
4 DELOS Network of Excellence on Digital Libraries, Information Society Technologies (IST) Program of the European Commission (Contract G038507618).
5 An overall presentation of the DelosDLMS prototype is available at the URL: http://dbis.cs.unibas.ch/delos_website/delosdlms.html
6 http://www.cenl.org/; since 2006 the Conference of European National Librarians has been on the list of International Non-Governmental Organisations (INGO) enjoying participatory status with the Council of Europe.
7 http://www.kb.nl/index-en.html
8 At present 48 national libraries of Europe are collaborating with The European Library; an updated map of the participating national libraries and information on them are available at the URL: http://www.theeuropeanlibrary.org/portal/libraries/map_en.html
9 http://www.theeuropeanlibrary.org/portal/
10 http://www.openarchives.org/OAI/openarchivesprotocol.html
11 http://www.loc.gov/z3950/agency/
12 http://www.w3.org/TR/XMLHttpRequest/
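The two access paths named for the portal components, SRU for searching and OAI-PMH for harvesting, are both plain HTTP requests whose parameters end up in Web logs. The sketch below only builds such request URLs; it is an illustration under stated assumptions, not the portal's actual code. The base URLs are placeholders, the function names are hypothetical, and only standard request parameters of the two protocols (`operation`, `version`, `query`, `startRecord`, `maximumRecords` for SRU; `verb`, `metadataPrefix`, `from` for OAI-PMH) are used.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, start=1, maximum=10, version="1.1"):
    """Build an SRU searchRetrieve URL for a CQL query (base_url is a placeholder)."""
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,
        "startRecord": start,
        "maximumRecords": maximum,
    }
    return f"{base_url}?{urlencode(params)}"


def oai_list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords URL, optionally harvesting from a given date."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # example.org endpoints are illustrative, not real library services.
    print(sru_search_url("http://example.org/sru", 'dc.title = "log data"'))
    print(oai_list_records_url("http://example.org/oai", from_date="2008-01-01"))
```

Because the query string carries the CQL expression and the paging parameters, a log of such requests records what was searched and how far a user paged through the results.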
Fig. 3. DelosDLMS: Overview of the Service-oriented Digital Library Management System Architecture.

Fig. 4. The European Library Portal Home Page.

[Figure 5 diagram labels: National Library SRU; National Library Z39.50 via SRU; National Library OAI-PMH; National Library Z39.50; TEL Central Index OAI-PMH/SRU; TEL Gateway SRU/Z39.50; TEL System; TEL Web Server HTTP; Browser (javascript)]
Fig. 5. Architecture of The European Library Portal.

Once the client, which is a standard Web browser, accesses the service and downloads all the necessary information from the Web server, all the subsequent requests are managed locally by the client. The client interacts directly with each federated library and the central index according to the SRU protocol: it makes separate AJAX calls towards each federated library or the central index, and manages the responses to such calls in order to present the results to the user and to organize user interaction.

B. Varied Production of Log Data in Digital Libraries
The differences between the two previously introduced digital library initiatives demonstrate how diverse digital library systems can be. It becomes evident how complex a digital library system can be, and how complex the study of the log data produced by its different and diversified services, collected during their actual use, can be. Each specific service of a digital library system can be analyzed using the log data collected during its use by final users. The analysis of the log data that are produced by the diverse services of a digital library system must therefore be organized around, and aimed at, the specific aspects under examination. If log data are used to analyze digital library systems, the specific service or set of coherent services to be analyzed needs to be decided in advance, and the collection of log data and the subsequent analysis need to be designed accordingly.
C. DELOS Digital Library Reference Model and Log Data
Because of the complexity and diversity of different digital library systems, DELOS started to invest in a Reference Model for Digital Libraries as an abstract framework that can be used to understand significant relationships among the entities of the digital library environment. The model is based on a small number of unifying concepts, as depicted in Figure 6, and provides a common semantics that can be used unambiguously across and between different digital library system implementations [12]. Such a reference model can be useful as a basis for the design of a log data harvesting tool, since log data are produced both by the basic core and additional services.
Fig. 6. The Main Concepts of the Digital Library Universe.

IV. LOG DATA TO STUDY INFORMATION ACCESS SERVICES IN DIGITAL LIBRARIES
One set of coherent services that a digital library system supplies comprises those which provide final users with access to information and documents. These usually include: an OPAC-like service, a search engine-like service and a navigation service among the different collections of documents to which the digital library gives access. Since this coherent set of services is of great interest to final users, it can be used to start the log data collection and analysis. The relevant characteristics of digital library systems suggest addressing the issue in a wide and comprehensive way, otherwise many aspects of the interaction between the final user and the digital library system with its diversified services may be missed.

A. Background Efforts
The study of log data in digital libraries is still in an early stage of development, but some initial results have been produced in a joint effort between DELOS and The European Library. [13] presents a systematic methodology for the analysis of logs from The European Library portal; this methodology provides preliminary insights into user navigation patterns within the portal, as well as first ideas for an improvement of the user search experience. [14] presents an analysis of eleven months of The European Library Web log data; the work reports initial findings of the study of user sessions reconstructed by means of heuristic methods, since no personal data was available to track each user. This issue has been previously addressed also in [15] where, among all the other aspects that need to be taken into account in the evaluation of digital libraries, some initial recommendations for the study of log data are given for: 1) the establishment of primary data repositories, and 2) the adoption of standard logging formats.

For the first recommendation, the experience gained in the context of building the CLEF evaluation infrastructure indicates that the provision of open access to primary log data is needed, keeping in mind that the primary data must be rendered anonymous, as privacy is an important issue. A common repository and infrastructure for storing primary and preprocessed data needs to be put in place with the collaborative formation of evaluation best practices, and modular building blocks need to be used in evaluation activities [16], [17]. On the basis of the experience gained keeping and managing the data of interest of a series of evaluation campaigns of an international evaluation forum, as reported in [18], it is possible to develop a corresponding approach that can be valid for the management of log data.

The second recommendation underlines the usefulness of adopting standard logging formats. In the preliminary analysis conducted on The European Library portal Web log data [14], Web log standards were adopted, making future comparison of the results produced possible.

B. Information Access Services: Proposed Approach of Analysis
For all the different categories of users of a digital library system, the quality of the services and documents the digital library supplies is very important [19]. Log data constitute a relevant aspect in the evaluation process of the quality of a digital library system and of the quality of interoperability of digital library services [20]. Work is under way to put in place a framework to be used in handling log data in the evaluation of services that give access to information and documents to final users of a digital library system, bearing in mind that in a study of this sort the central factors are the final users. In fact, the final user needs to be considered the guide of the system designers, prompting them to conceive and invent solutions of real use for the user.

The services under study are those of The European Library portal that give access to information, including the OPAC-like service, the search engine and the Web service, because The European Library is one of the most relevant effective digital library initiatives that can be studied and constitutes a significant building block towards the common European Digital Library that the European Commission is promoting13. In the view of the European Commission the European Digital Library needs to be a common multilingual access point to Europe’s cultural heritage. The “European Commission Working Group on Digital Library Interoperability”, active from January to June 2007, had the objective of providing recommendations for both a short-term and a long-term strategy towards the setting up of the European Digital Library as a common multilingual access point to Europe’s distributed digital cultural heritage including all types of cultural heritage institutions [20].
13 Answers to frequently asked questions on the European Digital Library are reported at the Web page: http://europa.eu/rapid/pressReleasesAction.do?reference=MEMO/07/523&format=HTML&aged=0&language=EN&guiLanguage=en. Other useful information is available at the URL: http://ec.europa.eu/information_society/activities/digital_libraries/index_en.htm and at the URL: http://europa.eu/rapid/pressReleasesAction.do?reference=IP/06/253
In particular, the recipient of these recommendations is the Europeana thematic network14, which is a project launched in July 2007 with the aim of addressing the interoperability issues among European museums, archives, audio-visual archives and libraries towards the creation of the “European Digital Library”.

The framework that has to be designed and put in place is going to be a coherent infrastructure for the collection, storage, curation and management of relevant data which are derived from sources of different nature; among those sources two are most relevant: the data collected through log systems, and the data which are generated and collected through user studies.

1) Data to Log: The logging requirements of a relevant digital library initiative, such as The European Library service, suggest logging data throughout the whole portal, which means collecting data on user navigation of both static and dynamic Web pages. Among those log data there are:
• HyperText Transfer Protocol (HTTP) or Web logs,
• action logs, and
• static content logs.
In particular the structure of HTTP logs often conforms to the W3C Extended Log File Format [21]. This kind of log contains, among other things, the following useful information:
1) the Internet Protocol (IP) address and the user-agent, which allow the identification of single users [14]; and
2) the referrer field, a Uniform Resource Locator (URL) address which communicates the last page viewed by the user, and which can be used to know how visitors get to The European Library service.
The European Library service HTTP logs also contain the cookie15 saved on the client, which reports extra information:
1) the language selected by the user during the navigation of the service;
2) the collections of documents selected during the query or query refinement;
3) the identifier of the session assigned by the server to a specific user.
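The fields listed above can be read from a log in the W3C Extended Log File Format with very little code. The following is a minimal sketch, not the parser used in the cited studies: it relies on the `#Fields:` directive defined by the format to name the columns and on quoted strings for values containing spaces. The sample line and the cookie keys `lang`, `collections` and `sid` are hypothetical, since the actual cookie layout of The European Library is not given here.

```python
import shlex


def parse_extended_log(lines):
    """Yield one dict per log entry, keyed by the names in the #Fields directive."""
    fields = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Directives start with '#'; only #Fields is needed to name columns.
            if line.lower().startswith("#fields:"):
                fields = line.split(":", 1)[1].split()
            continue
        # shlex keeps double-quoted strings (user-agent, cookie) as one value.
        # Note: values containing apostrophes would need extra handling.
        values = shlex.split(line)
        if len(values) == len(fields):
            yield dict(zip(fields, values))


def parse_cookie(header):
    """Split a 'name=value; name=value' cookie string into a dict."""
    pairs = (item.split("=", 1) for item in header.split(";") if "=" in item)
    return {name.strip(): value.strip() for name, value in pairs}


# Hypothetical sample; addresses and cookie keys are illustrative only.
SAMPLE = [
    "#Version: 1.0",
    "#Fields: date time c-ip cs-uri cs(User-Agent) cs(Referer) cs(Cookie)",
    '2007-06-01 10:15:00 192.0.2.10 /portal/index.html '
    '"Mozilla/5.0 (X11; Linux)" "http://www.example.org/" '
    '"lang=en; collections=a0000,a0067; sid=abc123"',
]

if __name__ == "__main__":
    for entry in parse_extended_log(SAMPLE):
        print(entry["c-ip"], entry["cs(Referer)"], parse_cookie(entry["cs(Cookie)"]))
```

Reading the column names from the `#Fields:` directive, rather than hard-coding positions, keeps the same parser usable when different services log different sets of fields.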
2) User Studies: Together with log data analysis, it is envisaged that data generated by controlled studies will need to be collected. These studies are to be performed on groups of users who freely browse and navigate The European Library portal and then fill in specifically designed questionnaires to report and describe their impressions. The goal of the controlled studies is to combine the data of the sessions of the people who have completed the questionnaires, data which are present in the log data, with those that have been reported in the questionnaires. The final aim is to gain insights from data on user sessions and judgments in the questionnaires to generalize the results obtained. The insights gained by analyzing log data together with data from controlled studies are more informative than the results that can be derived by separately analyzing the two groups of data. Previous studies on logs and observations in naturalistic settings, combined with interviews, have shown that the results are more scientifically informative than those obtained when the two types of studies are conducted alone [22].

14 http://www.europeana.eu/
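The combination of session data with questionnaire data described for the controlled studies amounts to a join on some shared key. The sketch below assumes, hypothetically, that each study participant's questionnaire can be linked to a logged session through a session identifier; the function name, the record layout and the derived counts are illustrative and not part of the framework under design.

```python
def combine_sessions_with_questionnaires(sessions, questionnaires):
    """Join logged sessions with questionnaire answers on a session identifier.

    sessions: dict mapping session_id -> list of logged action names.
    questionnaires: dict mapping session_id -> dict of questionnaire answers.
    Returns one combined record per session present in both sources.
    """
    combined = []
    # Only sessions that have both log data and a questionnaire are kept.
    for session_id in sorted(sessions.keys() & questionnaires.keys()):
        actions = sessions[session_id]
        record = {
            "session_id": session_id,
            "n_actions": len(actions),
            "n_searches": sum(1 for action in actions if action == "search"),
        }
        record.update(questionnaires[session_id])
        combined.append(record)
    return combined


if __name__ == "__main__":
    logged = {"s1": ["search", "view", "search"], "s2": ["view"]}
    answers = {"s1": {"satisfaction": 4}, "s3": {"satisfaction": 2}}
    print(combine_sessions_with_questionnaires(logged, answers))
```

Keeping observed behaviour and reported judgment in the same record is what would allow the two kinds of evidence to be analyzed together rather than separately.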
ACKNOWLEDGEMENTS
The paper reports on work which originated in the context of the DELOS Network of Excellence on Digital Libraries. The author thanks Nicola Ferro for the useful discussions and suggestions on the topics addressed by the work. The work reported has been partially supported by the TELplus Targeted Project for digital libraries, as part of the eContentplus Program of the European Commission (Contract ECP-2006-DILI-510003)16 and by the TrebleCLEF Coordination Action, as part of the 7th Framework Programme of the European Commission, Theme ICT-1-4-1 Digital libraries and technology-enhanced learning (Grant agreement: 215231)17.

15 Cookies are plain text information stored locally by the client. The stored data are initially sent by a Web server to a Web client and then sent back to the server on subsequent requests.
16 http://www.theeuropeanlibrary.org/telplus/
17 http://www.trebleclef.eu/

REFERENCES
[1] M. Agosti and M. E. Ronchi, “Progetto B.A.C.: La Biblioteca Automatica del CINECA,” in Atti del Congresso AICA 79, Bari, Italy, 1979, pp. 367–370.
[2] M. Agosti and M. E. Ronchi, “DOC-5 - The Bibliographic Information Retrieval System in CINECA Library Automation Project,” in Proc. of ECODU-29 Conference, Berlin, Germany, 1980, pp. 20–31.
[3] M. Guerrini and L. Sardo, Authority control. Roma, Associazione italiana biblioteche, 2003.
[4] B. Baldacci and R. Sprugnoli, Informatica e biblioteche: automazione dei sistemi informativi bibliotecari. Roma, NIS, 1983.
[5] N. N. Mitev, G. Venner, and S. Walker, “Designing an online public access catalogue: Okapi, a catalogue on a local area network,” British Library, Library and Information Research Report 39, 1985.
[6] M. Agosti and M. Masotti, “Design of an OPAC Database to Permit Different Subject Searching Accesses in a Multi-disciplines Universities Library Catalogue Database,” in Proc. 15th Annual Int. ACM SIGIR Conf. on Research and Development in Information Retrieval (SIGIR 1992), N. J. Belkin, P. Ingwersen, A. Mark Pejtersen, and E. A. Fox, Eds. ACM Press, New York, USA, 1992, pp. 245–255.
[7] M. Agosti, Ed., Information access through search engines and digital libraries. Berlin, Germany: Springer, 2008.
[8] H.-J. Schek and H. Schuldt, “DelosDLMS – Infrastructure for the Next Generation of Digital Management Systems,” ERCIM News, Special Issue on the European Digital Library, vol. 66, pp. 22–24, July 2006.
[9] M. Agosti, S. Berretti, G. Brettlecker, A. del Bimbo, N. Ferro, N. Fuhr, D. Keim, C.-P. Klas, T. Lidy, D. Milano, M. Norrie, P. Ranaldi, A. Rauber, H.-J. Schek, T. Schreck, H. Schuldt, B. Signer, and M. Springmann, “DelosDLMS – the Integrated DELOS Digital Library Management System,” in Digital Libraries: Research and Development. First Int. DELOS Conference. Revised Selected Papers, C. Thanos, F. Borri, and L. Candela, Eds. Lecture Notes in Computer Science (LNCS) 4877, Springer, Heidelberg, Germany, 2007, pp. 36–45.
[10] C. Binding, G. Brettlecker, T. Catarci, S. Christodoulakis, T. Crecelius, N. Gioldasis, H.-C. Jetter, M. Kacimi, D. Milano, P. Ranaldi, H. Reiterer, G. Santucci, H.-J. Schek, H. Schuldt, D. Tudhope, and G. G. Weikum, “DelosDLMS: infrastructure and services for future digital library systems,” in Second DELOS Conference on Digital Libraries Working Notes, C. Thanos and F. Borri, Eds. ISTI-CNR, Gruppo ALI, Pisa, Italy, December 2007.
[11] T. van Veen and B. Oldroyd, “Search and Retrieval in The European Library. A New Approach,” D-Lib Magazine, vol. 10, no. 2, February 2004.
[12] L. Candela, D. Castelli, N. Ferro, Y. Ioannidis, G. Koutrika, C. Meghini, P. Pagano, S. Ross, D. Soergel, M. Agosti, M. Dobreva, V. Katifori, and H. Schuldt, The DELOS Digital Library Reference Model. Foundations for Digital Libraries. Version 0.98. ISTI-CNR at Gruppo ALI, Pisa, Italy, December 2007.
[13] J. Luxenburger, E. van der Meulen, and G. Weikum, “User Interactions with The European Library Portal,” in Proc. 10th DELOS Thematic Workshop on Personalized Access, Profile Management, and Context Awareness in Digital Libraries (PersDL 2007), Corfu, Greece, Jun 2007, pp. 18–22.
[14] M. Agosti and G. M. Di Nunzio, “Web Log Mining: A study of user sessions,” in Proc. 10th DELOS Thematic Workshop on Personalized Access, Profile Management, and Context Awareness in Digital Libraries (PersDL 2007), Corfu, Greece, Jun 2007, pp. 70–74.
[15] N. Fuhr, G. Tsakonas, T. Aalberg, M. Agosti, P. Hansen, S. Kapidakis, C.-P. Klas, L. Kovács, M. Landoni, A. Micsik, C. Papatheodorou, C. Peters, and S. Sølvberg, “Evaluation of digital libraries,” Int. Jour. on Digital Libraries, 2007.
[16] M. Agosti, G. M. Di Nunzio, and N. Ferro, “Scientific Data of an Evaluation Campaign: Do We Properly Deal With Them?” in Working Notes for the CLEF 2006 Workshop, A. Nardi, C. Peters, and J. L. Vicedo, Eds. http://www.clef-campaign.org/2006/working_notes/workingnotes2006/agostiCLEF2006.pdf, 2006.
[17] M. Agosti, G. M. Di Nunzio, and N. Ferro, “The Importance of Scientific Data Curation for Evaluation Campaigns,” in DELOS Conference 2007 Working Notes, C. Thanos and F. Borri, Eds. ISTI-CNR, Gruppo ALI, Pisa, Italy, February 2007, pp. 185–193.
[18] M. Agosti, G. M. Di Nunzio, and N. Ferro, “A Data Curation Approach to Support In-depth Evaluation Studies,” in Proc. Int. Workshop on New Directions in Multilingual Information Access (MLIA 2006), F. C. Gey, N. Kando, C. Peters, and C.-Y. Lin, Eds., 2006, pp. 65–68.
[19] M. Agosti, N. Ferro, E. A. Fox, and M. A. Gonçalves, “Modelling DL Quality - a Comparison between Approaches: the DELOS Reference Model and the 5S Model,” in Second DELOS Conference on Digital Libraries Working Notes, C. Thanos and F. Borri, Eds. ISTI-CNR, Gruppo ALI, Pisa, Italy, December 2007.
[20] S. Gradmann, “Interoperability of Digital Libraries: Report on the work of the EC working group on DL interoperability,” in Seminar on Disclosure and Preservation: Fostering European Culture in The Digital Landscape. National Library of Portugal, Directorate-General of the Portuguese Archives, Lisbon, Portugal, http://bnd.bn.pt/seminario-conhecer-preservar/doc/Stefan%20Gradmann.pdf, September 2007.
[21] P. M. Hallam-Baker and B. Behlendorf, “Extended Log File Format – W3C Working Draft WD-logfile-960323,” http://www.w3.org/TR/WD-logfile.html, March 1996.
[22] P. Ingwersen and K. Järvelin, The Turn. Springer, The Netherlands, 2005.