Information Processing and Management 42 (2006) 1345–1365 www.elsevier.com/locate/infoproman
The information seeking behaviour of the users of digital scholarly journals

David Nicholas, Paul Huntington, Hamid R. Jamali, Anthony Watkinson

School of Library, Archive and Information Studies, CIBER, University College London, Henry Morley Building, Gower Street, London WC1E 6BT, United Kingdom

Received 23 November 2005; received in revised form 1 February 2006; accepted 1 February 2006. Available online 20 March 2006.
Abstract

The article employs deep log analysis (DLA) techniques, a more sophisticated form of transaction log analysis, to demonstrate what usage data can disclose about the information seeking behaviour of virtual scholars – academics and researchers. DLA works with the raw server log data, not the processed, pre-defined and selective data provided by journal publishers. It can generate types of analysis that are not generally available via proprietary web logging software, because such software filters out relevant data and makes unhelpful assumptions about the meaning of the data. DLA also enables usage data to be associated with search/navigational and/or user demographic data, hence the name 'deep'. In this connection the usage of two digital journal libraries, those of EmeraldInsight and Blackwell Synergy, is investigated. The information seeking behaviour of nearly three million users is analyzed with respect to the extent to which they penetrate the site, the number of visits made, and the type of items and content they view. The users are broken down by occupation, place of work, type of subscriber ("Big Deal", non-subscriber, etc.), geographical location, type of university (old and new), referrer link used, and number of items viewed in a session.
© 2006 Elsevier Ltd. All rights reserved.

Keywords: Transaction log analysis; Electronic periodicals; Information-seeking behaviour; Usage statistics
* Corresponding author. Tel.: +44 20 7679 7205; fax: +44 20 7383 0557. E-mail addresses: [email protected] (D. Nicholas), [email protected] (P. Huntington), [email protected] (H.R. Jamali), [email protected] (A. Watkinson).
1 Tel.: +44 20 7679 2477.
2 Tel.: +44 20 7679 7205.
3 Centre for Information Behaviour and the Evaluation of Research.
4 http://www.ucl.ac.uk/ciber/ciber.php.
0306-4573/$ - see front matter © 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.ipm.2006.02.001

1. Introduction

In this article we present and collate the findings of a number of recent investigations that have been conducted under the Virtual Scholar Research Program at University College London, a program that seeks to bring robust evaluation to the digital scholar environment.4 The robust analysis is the product of a
methodology we called deep log analysis (DLA) that takes its lead from, but goes much further than, transaction log analysis. Together the results of these studies provide a comprehensive, detailed – and sometimes surprising – picture of the information seeking behaviour of the digital scholar (academic and researcher) in regard to two major digital journal libraries, those of EmeraldInsight5 (Emerald Group Publishing Limited, Bradford, England), a business and information studies publisher, and Blackwell Synergy6 (Blackwell Publishing, Oxford, England), a learned journal publisher. The investigation is probably one of the largest ever undertaken, covering as it does the online transactions of nearly three million virtual scholars. From the individual investigations, we have selected analyses which provide a good overview of the research and which we believe to be particularly pertinent.

2. Aims, objectives and scope

The major aim of the paper is to demonstrate what deep log analysis can disclose about the kinds of people that search scholarly digital journal libraries and about their information seeking behaviour, in the belief that the methodology provides a bigger, more accurate and fuller picture than is possible with standard survey techniques, and that it provides some very powerful types of analyses not obtainable from standard commercial log analyzing software. Deep log analysis refers not simply, as the name suggests, to mining the raw log data more deeply and accurately than proprietary software, but also to relating usage data to user data to provide that all-important triangulation. It also generates the questions that interviewers, focus groups, and questionnaire originators should be asking, but seldom do.
To demonstrate this we have taken server log transaction data from two digital libraries (publisher platforms) containing large numbers of full-text scholarly journals: those of the publishers Emerald, which features around 150 business and library studies journals, and Blackwell, which contains some 700 journals, with a strong presence in the sciences and medicine. Both publisher platforms were subject to a range of enhanced or deep log analyses. We have already conducted and published some other kinds of analyses on the logs of these two digital libraries (Nicholas, Huntington, & Watkinson, 2003, 2005). Here we shall concentrate on, arguably, the two most powerful deep log metrics, which we believe provide especially illuminating data:

• the number of items viewed per online session (something we call 'site penetration');
• the number of visits made (returnee analysis).

These two use metrics were enhanced with user details to provide deeper, more meaningful data. In the case of Blackwell, this was obtained by relating the logs to a database containing demographic data relating to subscribers, and, in the case of Emerald, by means of desk research (obtaining background information on using institutions from reference works and websites). While we describe the technical procedures and problems associated with deep log analysis in this paper, this is not our main purpose, which is not so much to explain "how" it was done as to show what can be produced – really, to demonstrate the utility and significance of the data. (For those wanting more details of the techniques please refer to Nicholas, Huntington, Lievesley, & Wasti, 2000; Nicholas, Huntington, Rowlands, Russell, & Cousins, 2004.) Many of the ideas and methods presented in this article were developed as part of work we have conducted in helping the UK government map and evaluate the roll-out of digital health services to the consumer (Nicholas, Huntington, & Williams, 2004).
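As a rough illustration of these two metrics, the sketch below computes site penetration (items viewed per session) and return visits from a handful of already-parsed session records. The record layout is hypothetical – it is not the schema of either publisher's logs, which would first require parsing and session reconstruction.

```python
from collections import Counter

# Hypothetical, already-sessionized records: (user_id, session_id, items_viewed).
# Real raw server logs would first need parsing and sessionizing.
sessions = [
    ("u1", "s1", 1), ("u1", "s2", 4),
    ("u2", "s3", 1),
    ("u3", "s4", 12),
]

# Site penetration: how many sessions involved 1 item, 4 items, and so on.
penetration = Counter(items for _, _, items in sessions)

# Returnee analysis: number of visits (sessions) made by each user over the period.
visits = Counter(user for user, _, _ in sessions)

# penetration -> {1: 2, 4: 1, 12: 1}; visits -> {"u1": 2, "u2": 1, "u3": 1}
```

Breaking either distribution down further only requires grouping the same records by a user attribute, which is where the enrichment of usage data with user details comes in.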
The goal of the Virtual Scholar Research Program is to do the same kind of thing in the scholarly journal field.

3. Literature review

A whole range of different methods with different approaches and objectives have been employed to study the use of digital journals. Questionnaire surveys (Finholt & Brooks, 1999; Nelson, 2001; Rusch-Feja & Siebeky,
5 http://www.emeraldinsight.com.
6 http://www.blackwell-synergy.com.
1999; Salisbury & Noguera, 2003; Tenopir & King, 2001; Teskey & Urquhart, 2001) and interviews/focus groups (Bonthron et al., 2003; Talja & Maula, 2003; Teskey & Urquhart, 2001) are favourite methodologies. Transactional log studies are not so common, but they are becoming more popular (Davis, 2004a; Davis & Solla, 2003; Gargiulo, 2003; Ke, Kwakkelaar, Tai, & Chen, 2002; Tulip, 1996; Yu & Apps, 2000; Zhang, 1999).

3.1. Questionnaire survey and interview studies

Generally, social survey methods tell us that the reading of scholarly articles has increased during the last decade and has been boosted by the advent of electronic journals. A series of survey studies conducted by Tenopir and King (2001) over the last two decades shows that scientists not only read more articles but also read from a broader range of journals. On average, nearly one third of journal articles currently being read come from digital databases, and almost half of all scientists now use electronic journals at least part of the time, with considerable variations among disciplines. Tenopir (2002, 2003) reports on findings from a number of research studies, including those by the Council on Library and Information Resources (CLIR) and the Online Computer Library Center (OCLC), on the use of electronic sources. She notes, "Although the use of electronic versions still varies from discipline to discipline, almost everyone will adopt peer-reviewed electronic journals that make their work easier and for which the cost is free or subsidized by the library". The use of electronic journals was highest among physicists. Other surveys (Rusch-Feja & Siebeky, 1999) verified the finding that use of electronic journals is high among physicists, biologists, and biomedical scientists, and this fits with transaction statistics obtained from publishers. Smith (2003) also found that science faculty members make more use of e-journals than those from the social science faculty.
Tomney and Burton (1998) found the highest e-journal use among the business, science, and engineering faculties at a British university, while history faculty members made no use of e-journals. Another survey, conducted by Nelson (2001) at another British university, shows the highest use of e-journals among academics in the business school, while the lowest use occurred in the art, media, and design faculties. Scholars from all disciplines pointed out that a major factor in the non-use of electronic resources was the lack of archival and retrospective material. Lack of archival material has been mentioned as a disadvantage of e-journals by respondents in some other studies (Institute for the Future, 2002b; Pullinger & Baldwin, 2002). Although lack of awareness was once mentioned as one of the contributing factors for non-use of e-journals (Nelson, 2001; Tenner & Ye, 1999; Teskey & Urquhart, 2001; Tomney & Burton, 1998), it now appears that awareness and adoption of e-journals are increasing rapidly, while convenience of use has remained the most important concern for users (Tenopir, 2003). Both Borghuis et al. (1996) and Entlich et al. (1996), as cited in Bishop et al. (2000), report that, in academia, digital journals tend to be used more by students than by faculty. The findings of the Tulip project (Tulip, 1996) and a survey by Tomney and Burton (1998) show the same result. A survey by Liew, Foo, and Chennupati (2000) also shows high acceptance of e-journals by graduate students. Different interfaces among databases make comparison difficult, but some basic search functions are common to all commercial databases. These include search by journal title, author, publication date, and table of contents. It has been found that users from different subject disciplines search differently for both electronic and print material (Bonthron et al., 2003; Tenopir, 2003).
For example, Finholt and Brooks (1999) surveyed economics and history faculties at the University of Michigan and found that historians use abstracts of e-journal articles less than economists do. Users of Internet-based subject gateways prefer to browse rather than search for a specific article, and when they do search they tend to use keyword searching (Monopoli & Nicholas, 2001). Use of the online help facility is not widespread. Browsing and chaining (following bibliographic references already known) is also a popular method (Talja & Maula, 2003). Recent questionnaire surveys illustrate a tendency among online journals' users to search rather than browse (Boyce, King, Montgomery, & Tenopir, 2004; Sathe, Grady, & Giuse, 2002). As is clear from the results mentioned above, a considerable part of our knowledge of the use and users of digital journals is based on the results of questionnaire survey and interview studies. However, both interview and questionnaire survey studies are based on self-reported data. They tell us what users say they might or would do, or what they think they do. They are also open to bias, as the researcher may prompt respondents towards a particular response.
3.2. Log and usage data studies

The number of studies based on the analysis of log or usage data has been increasing. Log analysis has been applied for different purposes, such as assessing system performance, studying users' searching and browsing behaviours, investigating the effectiveness of Big Deal subscriptions, studying literature decay, and so on. Log studies have been particularly helpful in understanding the searching and browsing behaviour of e-journal users. Using log analysis, eJUSt project researchers (Institute for the Future, 2002a) found three common seeking patterns: (a) journal homepage – TOC – HTML full text – PDF full text; (b) PubMed – HTML full text – PDF full text; (c) journal homepage – search – HTML full text – PDF full text. The findings showed that most requests were for the full text in HTML, which were then followed by requests for the full text in PDF, as if the final goal of most visits was to take away a PDF version of an article. Log analysis of ScienceDirect OnSite (SDOS) in Taiwan shed some light on the searching behaviour of users. The analysis revealed that roughly 32% of all recorded page accesses related to full-text accesses, 34% to browsing, 13% to searching, and 9% to abstract page views. In terms of search queries, 42% of all users made 1 to 20 queries. A total of 91% of the queries were of the Simple Search type, while only 9% were of the Expanded Search type. "Any Field" was the default query field, matching any of the searchable fields, and was used in 84% of simple searches. On the other hand, about half (49%) of Expanded Search usage included fields other than the default field. Article Title, Author's Name, and Abstract were the three query fields most frequently used in Expanded Search mode (Ke et al., 2002). The SuperJournal project showed that researchers were not very good at searching (Eason, Richardson, & Yu, 2000).
But things have changed in the ten years that have elapsed since the SuperJournal project. Analysis of the referral logs of chemical journals showed that library catalogues and bibliographic databases, both searching mechanisms, were the top two sources leading users to journals (Davis, 2004b). This supports the findings of some recent questionnaire surveys indicating a tendency among online journals' users to search rather than browse (Boyce et al., 2004; Sathe et al., 2002). On the other hand, some other studies indicate that browsing seems to be the favoured method when using electronic journals (Eason et al., 2000; Eason, Yu, & Harker, 2000; Monopoli, Nicholas, Georgiou, & Korfiati, 2002; Tenopir, 2003). These discrepancies between the findings of different studies may be due to the fact that users behave differently when they have different goals or tasks: they may prefer browsing to keep up to date, while they may search when they have a specific task or are looking for information on a specific subject. This is another area where log analysis fails to deliver, as it is carried out without taking into account the intentions of the users. Nor is log analysis all that helpful at discovering the value and use of the articles retrieved, or what lies behind expressed information seeking behaviour. Essentially, one of the limitations of basic log analysis is that there is little possibility of linking use data with user data, hence the vague and general picture of users' information seeking behaviour that results. This technical restriction makes it difficult to use demographic data to find out about differences in the information seeking activities of users with different tasks, statuses, genders, and so on. However, studies such as the SuperJournal project and eJUSt, which have applied triangulation, have been able to paint a fuller picture of users' information seeking behaviour.
The SuperJournal study revealed that task, discipline, and relevance of the collection are major factors in determining patterns of use. The study showed that social scientists are more task-driven than scientists are: they search for relevant articles when prompted by tasks, while scientists browse journals on a regular basis to keep up to date (Pullinger & Baldwin, 2002). It should be mentioned that the range of subjects was very limited in the SuperJournal project: its data on scientists refer only to some areas of genetics and chemistry, and its social scientists were drawn from political studies, communications, and cultural studies. The eJUSt study showed that users' status is a significant factor in how they search for information (Institute for the Future, 2002b). As mentioned earlier, this is probably because people in different positions have different tasks to do or, more precisely, because their different goals require different information seeking behaviours. Undergraduate students tend to search the Internet first, and then go to library-based services, unless they have been provided with and instructed in how to use a specific resource. It turned out in the SuperJournal project that undergraduate students used electronic journals in a "binge" way – making great use of them in a short time – while those whose primary task was research (postgraduates and researchers) used the e-journals the most, with undergraduates and academics a little less (Pullinger & Baldwin, 2002).
The primary goal of many log or usage data studies is to find out about use rather than users. In terms of usage studies, previous log studies have led to different conclusions about the success or otherwise of Big Deal and consortium subscriptions to journals. Davis (2002) challenged the composition of geographically based consortia and recommended that libraries create consortia based on homogeneous membership. Obst (2003) also saw "no future" for package deals on the basis of the results of his comparative study. On the other hand, Gargiulo (2003) analysed the logs of an Italian consortium and strongly recommended Big Deal subscriptions. Most of the studies that employ log analysis do not provide details about the log analysis process or the software involved, and further investigation is required to determine this. However, in the case of Gargiulo (2003), an "intelligent parser" and commercial statistical software were used to extract and analyze download statistics from the log files; SAS (Statistical Analysis Software) was used to deal with the raw log files, extract statistics, and create reports. Davis (2002) was provided with data by Academic Press (San Diego, CA) as summary statistics – by journal, by institution, by month – and used SPSS for the statistical analysis. In his most recent research, Davis (2004a) used Microsoft Excel and SPSS to analyze referral URLs in the transaction logs of the American Chemical Society journals, to find out about the information seeking behaviour of chemists at Cornell University (Ithaca, New York). Ke et al. (2002) studied Elsevier ScienceDirect logs in Taiwan and used the C programming language to process the log files. They paid most attention to searching behaviour (e.g., use of search facilities, browsing, keywords used, and operators). Problems with floating IP addresses and proxies meant that they could not investigate as deeply as they would have liked.
In the SuperJournal project, researchers used a program written in C++ to transform the original log files into SPSS format. They emphasized that "for most SuperJournal tasks SPSS was efficient and adequate" (Yu & Apps, 2000).

4. Methods

All digital information platforms have a facility by which logs are generated, providing an automatic and real-time record of use by everyone who accesses information services on these platforms. Logs represent the digital information footprints of the users, and by analyzing them you can track and map their information seeking behaviour; when enhanced, they can tell us something about the kinds of people that use the services. The attraction of logs is that they provide abundant and robust evidence of use. With log analysis, it is possible to monitor the use of a system by millions of people, around the country or the world. Logs record use by everyone who happens to engage with the system, so there is no need to take a sample. The great advantages of the logs are not simply their size and reach, although the dividend here is indeed a rich and unparalleled one. Most important, they are a direct and immediately available record of what people have done: not what they say they might or would do; not what they were prompted to say; not what they thought they did. The data are unfiltered, speak for themselves, and provide a reality check that both represents the users and complements important contextual data obtained by engaging with real users and exploring their experiences and concerns. Publishers usually contract out a lot of their log analysis to third parties (e.g., CatchWord/Ingenta, Atypon) or rely on proprietary software, like WebTrends, NetTracker, etc.
Notwithstanding the undoubted technical expertise of the third parties and the software suppliers, the analyses performed are very limited, and the danger inherent in this is that publishers (and their clients, the libraries, to whom they provide data) are "once removed", and typically find themselves in an information dust storm kicked up by the log data. Clearly, to obtain rich and accurate data from log files that really inform, it is necessary to go beyond proprietary logging software, mine the raw data with greater sophistication, and triangulate the data with other datasets or data collection methods – in other words, to adopt deep log analysis (DLA) techniques. Deep log analysis is best viewed as a four-step process. First, the assumptions about how the data are defined and recorded (e.g., who is a user, what is a hit, what represents success or satisfaction?) are questioned and realigned, and their statistical significance assessed. This is important, as skewed data are a real problem; it ensures that incorrect, overinflated readings that give a false sense of achievement and progress are avoided. Second, the raw data are re-engineered to provide more metrics, and more powerful combined metrics, to ensure that data gathering is better aligned with organizational goals and policies. The third step is to enrich the usage data by adding user demographic data (e.g., occupation, subject specialty), either with data obtained from a subscriber database (ideal) or from online questionnaires (less ideal, as user data cannot then be mapped so closely onto usage data). Of course, logs and user databases enable us to map the digital environment more accurately but provide only a little in the way of
explanation, satisfaction, and impacts. They do, however, raise the questions that really need to be asked in questionnaires, interviews, or by observation, i.e., to explain information seeking behaviour – the fourth step in our analyses. The research reported here has progressed only to the third step; the fourth step needs further planning and the results may be published later. The main advantage of DLA over the usual log analysis undertaken by proprietary software is that use data are enriched with data about users, and this leads to better knowledge of user behaviour. Moreover, DLA is powerful in generating kinds of metrics that are not achievable with proprietary software; returnees and site penetration are the two most powerful metrics of DLA, and both are demonstrated here.

4.1. Data collection and definitions

Our analyses are based mainly upon two sets of raw server transaction logs obtained from the Emerald and Blackwell journal libraries. The datasets were:

1. One year (January–December 2002) of Emerald's digital library logs. A year was required to pick up on return visits, a key deep log metric. Raw logs are enormous in size, and the fact that the Emerald database is relatively small enabled us to take such a long period.
2. Two months' worth of logs for Blackwell Synergy (February–March 2003), in which, among other things, usage data was related to user data. Site penetration and type of item viewed analyses were undertaken only on February's data. In addition, one day's logs (September 17, 2003) were analyzed – a little over half a million user transactions in all – which constituted a test-bed for analyses.

Of course, the fact that the two datasets cover different periods means that any comparisons between the two publisher platforms have to be treated with caution. In all cases the raw logs were obtained and subjected to standard deep log techniques, parsed, and then processed by SPSS.
Standard usage (e.g., type of items viewed) and deep log (site penetration and returnee) analyses were generated. For full details of the methods used, see Nicholas et al. (2000). The size of the datasets was enormous; nevertheless, in the case of Blackwell, we are only commenting on a month or two of data and our results should be read in this light. The working definitions for the metrics employed by the project are as follows:

• User. In the case of Emerald, user identification was based on the "Urn" number – the unique identification number used by the server to write and read cookies. A user is effectively a computer; sometimes that computer represents an individual (e.g., a professor in his office), in other cases a number of people (e.g., students in the library). For Blackwell, user identification was based on a combination of IP number and browser details. Again, a user was effectively a computer; sometimes that computer represents an individual and in other cases a number of people.
• Sessions. Sessions are identified in the logs by a session identification number. Both Emerald and Blackwell had session identification numbers. Logs include a session-beginning tag and a session-ending tag, which enables us to make time calculations as well.
• Items viewed/requests made. A "complete" item returned by the server to the client in response to a user action. Typically, this might be an abstract, an article, or a table of contents. A complete item might comprise all the pages, charts, etc. from an article, and this is recorded as a single item; hence, the digital library logs are quite different from traditional server log files, which record pictures and text documents separately. The Blackwell logs also recorded views to the home page and a returned search screen.

For both digital libraries, we embellished and supplemented the usage data with data about the user and/or their organization.
In the case of Emerald, this was limited to data on the country and type of organization they belonged to. In Blackwell's case, however, the data collection was much more extensive. User background data (on occupation, organizational affiliation, and geographical location) held on a registered-user database was related, via an identification number, to the usage logs generated in February 2003 by registered users. The user database contained records of over 500,000 registered users. The database was not a complete record of subscribers entering the site. This was because there were a number of ways that subscribers
entering the site were recognized: for example, they could be identified as coming via a trusted proxy server, as a society member, by a location such as a university, or as users at a given IP address, and so on. The number of subscribers entering the site via their user name and password was relatively small – about 10%. The log files of these subscribers were extracted and supplemented with information extracted from the form that users fill in; this gives information on the users' occupation, place of work, and how they first heard about the Blackwell online library. Much of the form is free-text entry; hence, it was not possible to place all users within the user categories we employ in our analysis.

4.2. Websites' interfaces7

The interfaces of EmeraldInsight and Blackwell Synergy have some common features. Both provide users with the options of simple and advanced searches for articles. Users can browse the list of journals by subject or in alphabetical order of titles, and they can also limit the journals to just those that they subscribe to. The main difference between the interfaces is the way users can access the full text of articles. On a table of contents page of a journal issue on EmeraldInsight, users can opt to view the article. By clicking on the option "View" they are taken to another page which includes the abstract of the article as well as options to view the full-text PDF or full-text HTML (if available), or to download the PDF file. This means users have to visit the abstract before viewing the full text. On a table of contents page of a journal issue on Blackwell Synergy, by contrast, users have the option to view either the full text of the article or the abstract.

5. Results

For each of the key usage metrics – number of items viewed in a session and return visits – the data are broken down by a range of user characteristics. These two metrics offer solid platforms for characterizing and comparing the information seeking behaviour of subgroups of users.
We need to do this because generalizations based upon millions of users, while sounding impressive, can prove very misleading indeed, possibly camouflaging big differences between individual user groups, such as that between students and professors. We will demonstrate this by defining users by:

• occupation (academic status);
• place of work;
• type of subscriber (Big Deal, non-subscriber, etc.);
• geographical location of user;
• type of university (old and new);
• referrer link used;
• number of items viewed in a session.
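Breakdowns of this kind rest on the enrichment step described in the methods section – relating usage records to user records via an identification number. The sketch below is a minimal illustration; the field names and registration IDs are invented and stand in for the kind of registered-user database described for Blackwell.

```python
# Invented subscriber records keyed on a registration ID (hypothetical schema).
users = {
    "r100": {"occupation": "academic", "place_of_work": "university"},
    "r200": {"occupation": "student",  "place_of_work": "university"},
}

# Usage records carrying the same ID, so usage can be related to demographics.
sessions = [
    {"reg_id": "r100", "items_viewed": 7},
    {"reg_id": "r200", "items_viewed": 1},
    {"reg_id": "r999", "items_viewed": 3},  # user absent from the database
]

# Join each session to its user record; unmatched sessions stay analysable
# but fall into an 'unknown' category, much as free-text registration forms
# left some users unclassifiable in the study.
unknown = {"occupation": "unknown", "place_of_work": "unknown"}
enriched = [{**s, **users.get(s["reg_id"], unknown)} for s in sessions]
```

With usage and demographics in one record, any of the breakdowns listed above reduces to grouping `enriched` by the relevant attribute.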
We have not provided the same user analyses for both digital libraries because of the essential differences between the content and logs of the two publishers, and because of the particular emphases of the individual investigations that have been combined for the purpose of this paper. This should not prove a problem, as the aim of the paper is to show what deep logging can offer; the comparison between the two libraries is of secondary interest only. Table 1 provides a summary of the usage data collected for the two digital libraries. Nearly 3 million users, viewing over 34 million items, are represented in our analyses.

5.1. Type of item viewed

When online to a digital library, users can view different kinds of pages or perform a number of different transactions, and by mapping these we can obtain an idea of what they obtain from the site and how they
7 This description is based on the situation of the sites in 2002–2003, when the log data were collected.
Table 1
Keynote statistics

Database                     Number of items viewed   Number of sessions conducted   Number of users
Blackwell two month study    10,573,353               2,783,727                      820,230
Emerald one year study       23,564,578               4,789,140                      2,013,827
use it. We identified the following types of views as being particularly significant: views to the list of journal issues, individual journal tables of contents (ToCs), abstracts, and full-text articles (Table 2). On Synergy, full-text articles proved to be the most viewed item, which suggests three possibilities: users (1) wanted to go directly to the source itself to form their own opinion as to its relevance – we call such people "end-user checkers" (Nicholas, Huntington, Williams, & Dobrowolski, 2004); (2) simply did not understand the navigational qualities of abstracts – a supposition partly supported by a separate study micro-tracking two users, which showed that of the 16 sessions undertaken only one featured an abstract view; (3) used A&I services, like PubMed, for the first trawl, which then took them directly to the article they required – and we shall see evidence later showing that this also provides part of the explanation. Articles accounted for nearly one third (31%) of all views. About two thirds of these views were in PDF format and one third in HTML. A quarter of views concerned journal title content lists, 23% individual journal issue tables of contents (ToCs), and 20% abstracts. The picture for Emerald is quite different, with abstracts being viewed most: nearly half (49%) of all views were to abstracts, 17% of views were to content pages, 8% to issues and 26% to articles. Clearly, site structure plays a role here. Thus in the case of Emerald, when you choose your article in the ToC, you have to see the abstract in order to choose whether you want the full text and in which format, but in the case of the Blackwell ToC you have options to go directly to the PDF, HTML, references or abstract. The distribution of item views is likely to be biased, as the logs only record documents sent and will not record repeated views of locally cached table of contents or issue pages stored on the user's machine.
Therefore, the relatively low use of content and issue documents may reflect the caching of these pages to the users local machine. However this does not explain the big differences between the two digital libraries. The following user analyses by type of subscriber and referrer link are just examples to show how deep log analysis can burrow deeper into the type of item viewed data to seek further explanation and clarification. Further analyses of type of item viewed have published before in Nicholas, Huntington, and Watkinson (2005). 5.1.1. Users as defined by type of subscriber (Emerald) In the case of the Emerald logs users were classified according to whether they were subscribers or not. Subscribers can be categorised in to two types – Big Deal or non-Big Deal. The difference between the two is essentially their download rights – the former can download full-text articles from virtually any journal on the database, the latter typically a half dozen or so. Non-subscribers have to use their credit card if they want a full-text article, unless it is one of the journals featured as Journal of the Week, in which case they can download this for free. Trialists are a sub-group of non-subscribers who have signed up to a free one month’s trial
Table 2
Type of item viewed – comparison between Blackwell and Emerald

Type of item viewed      Blackwell Synergy (%)   Emerald Insight (%)
Issue lists              25                      8
Table of contents        23                      17
Abstracts                20                      49
Full-text articles       31                      26
  a. % in PDF            (66)                    (56)
  b. % in HTML           (34)                    (44)
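The item-type breakdown above is derived by classifying each requested URL in the raw server logs and tallying the results. A minimal sketch of that classification step follows; the URL patterns are hypothetical (the real Synergy and Emerald paths would have to be taken from inspection of the logs themselves):

```python
import re
from collections import Counter

# Hypothetical URL patterns for each type of item viewed; real platform
# paths differ and must be derived from the actual server logs.
ITEM_PATTERNS = [
    ("issue_list", re.compile(r"/loi/")),    # list of issues for a journal
    ("toc",        re.compile(r"/toc/")),    # table of contents
    ("abstract",   re.compile(r"/abs/")),
    ("pdf",        re.compile(r"\.pdf$")),   # full text, PDF
    ("html",       re.compile(r"/full/")),   # full text, HTML
]

def classify(url: str) -> str:
    """Map a requested URL to a type of item viewed."""
    for label, pattern in ITEM_PATTERNS:
        if pattern.search(url):
            return label
    return "other"

def tally(urls):
    """Return the percentage of views per item type, as in Table 2."""
    counts = Counter(classify(u) for u in urls)
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}

views = ["/toc/jnl/12/3", "/abs/jnl.12.3.45",
         "/article/jnl.12.3.45.pdf", "/abs/jnl.12.3.46"]
print(tally(views))   # {'toc': 25.0, 'abstract': 50.0, 'pdf': 25.0}
```

In a real analysis the tally would be run over millions of log lines, and the "other" bucket inspected to refine the patterns.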
Fig. 1. Type of item viewed by type of subscriber.
during which they can download five full-text articles. Fig. 1 gives the distribution of type of document viewed by type of subscriber. Surprisingly perhaps, it was not the user group with the most generous downloading rights – the Big Deal users – who viewed the most full-text articles, but the trialists. For trialists, articles made up 29% of their views, whereas for Big Deal subscribers the figure was 24%. This can be put down to a kind of digital sales mentality. Non-subscribers (45% of views to abstracts) were plainly using abstracts as a substitute for the real thing (the article). Non-subscribers obviously used the digital library to check and identify material: 56% of their views were to lists of various sorts, 36% being tables of contents.

5.1.2. Users defined by referrer link used (Blackwell)

The referrer link details the site the user visited before arriving at the Synergy site. Many sites block this information, and it is additionally difficult to categorise sites, as there is no standard convention for doing so. For example, picking out academic library sites involves searching through the dataset and selecting all referrer links with the word 'library' in the link reference name; however, many libraries will not necessarily include this in their name. Referrer links were crudely classified into six categories: other, library portals, journal links, via Blackwell Publishing (the parent site), via Blackwell Synergy (believed to be internal links) and Google. The category journal links, for example, was based only on users coming to the site via the Journal of Nursing and the Journal of Addiction, as these two were easily identifiable from the logs. However, not all links were so easy to identify, and consequently we have not identified all sessions coming in via journal links.
Hence the following does not give a true estimated distribution over referrer categories but is offered for illustration only, as the kind of analysis that can be done with the assistance of further fieldwork. The route by which a person reaches the Synergy site possibly says something about them, and we investigated this possibility. Fig. 2 examines the distribution of types of item viewed by referrer link. Those coming in via Blackwell Synergy (the internal link) were, unsurprisingly, the most likely to view articles: 36% did so. Those arriving via a library link or Blackwell Publishing (the host site) were more likely to view content pages/issue lists (66% and 64%, respectively, did so) and less likely to view abstracts (between 8% and 9% viewed them).

Fig. 2. Type of item viewed by referrer link.

5.2. Site penetration

A more powerful and illustrative way of examining the number of items viewed is to categorise search sessions by the number of items viewed. We call such an analysis 'site penetration'. Research we have conducted
elsewhere in health and media (Nicholas et al., 2004) showed that many web users do not dwell; they examine just a few items/pages before they leave – sometimes satisfied or, if not, going on to search for information elsewhere. In some cases only a home page or introductory page is visited, and in these cases no substantial content is consumed, although knowledge might have been gained. We call these people 'bouncers' or 'end-user checkers'. The question we sought to answer was whether there was anything about a digital journal library that would make it different, in site penetration terms, from other consumer websites. You might, in a way, expect a high level of penetration as a result of: (a) the bibliographic and full-text mix, which gives a natural movement as a result of the toing and froing; (b) the massive choice of data on offer (hundreds of full-text journals); (c) the investigative nature of some information seeking; (d) the presence of an embedded search engine and other retrieval aids. But as our data show, this does not seem to have made much of a difference; what we see instead is the classic web consumer searching (shopping) behaviour that results from massive choice. Thus Table 3 shows that well over two-thirds of Blackwell and Emerald users viewed between 1 and 3 items, and in the case of Emerald 42% of users viewed just one item. The similar figures for the two Blackwell datasets suggest that the metric is quite stable. How deeply a person penetrates or investigates a site is clearly an interesting metric, showing variously interest, satisfaction and 'busyness'. It might also tell us something about searching style, digital visibility, and the structure and nature of the website. A number of hypotheses may be postulated to explain this distribution. Users might access the site just to see what is there but return later to pick up their material.
Alternatively, users (students, more likely) may be given the exact Internet reference of an item in a bibliography, or a link to an A&I service like PubMed, and thus go directly to view the item without investigating other pages (and indeed, as we shall see later, this happens quite frequently). A further possibility relates to the nature of the Internet itself. In many cases users will use a search engine to find the site, and these engines return a number of clickable links that the user will cycle through: clicking on the first link, viewing maybe a page or two to see what is there, and then going on, if their search has not been satisfied or only partially satisfied, to the next link – hence the term end-user checkers.

Table 3
User classification by number of items viewed in a session

Type of user/session   Number of items viewed   Emerald (January–December 2002)   Blackwell (17th September 2004)   Blackwell (February 2004)
Bouncer/checker        1–3                      70                                68                                67
Moderately engaged     4–10                     20                                24                                26
Engaged                11–20                    6                                 5                                 5
Seriously engaged      Over 21                  4                                 3                                 2
Total                                           100                               100                               100

There are other possible explanations as well. For example, users might access a site, look at one article, determine that it meets their information need and end the session. Users may be diverted to a more urgent task and prematurely terminate a session. If a fee is required, users may terminate the session in favour of using a 'free' resource until they have narrowed down or better understand the scope of the search topic, before returning to a fee-based resource. However, these are all hypotheses, and follow-up qualitative research is needed to find the reasons and rationale behind these kinds of behaviour. The number of views made in a session provides an idea of the degree of penetration of a site, but the metric says little about the quality or substance of the content retrieved. For example, a session featuring 1 to 3 views suggests limited or checking use; however, this would be truer if these pages were what we might term menu pages (issue lists and ToCs) rather than article (or abstract) views. Clearly, what the user views in a session has an impact on the site penetration metric.

5.2.1. Users defined by occupation (Blackwell)

Postgraduates turned out to penetrate the site least, with well over one-third viewing three items or fewer in a session (Fig. 3). Undergraduates, perhaps contrary to expectation, penetrated the site most, with 19% viewing 11 or more items in a session. This may be due to their unfamiliarity with a research topic, which requires them to view more items than postgraduate students, who can better specify their information needs, are more knowledgeable about a topic, and so are able to disregard (i.e., filter out) irrelevant content and view fewer items.

Fig. 3. Number of views in a session by occupation. Adopted from Nicholas et al. (2005, p. 274) (with permission from Emerald).
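The site-penetration analysis amounts to bucketing each session by its item-view count into the bands of Table 3. A minimal sketch (band labels are the paper's; the exact band boundaries are an assumption based on the table):

```python
from collections import Counter

# Session-penetration bands as in Table 3; each band's upper bound is inclusive.
BANDS = [(3, "bouncer/checker"), (10, "moderately engaged"),
         (20, "engaged"), (float("inf"), "seriously engaged")]

def band(items_viewed: int) -> str:
    """Assign a session to a penetration band by the number of items viewed."""
    for upper, label in BANDS:
        if items_viewed <= upper:
            return label

def penetration_profile(session_sizes):
    """Percentage of sessions per band; session_sizes holds one
    item-view count per session."""
    counts = Counter(band(n) for n in session_sizes)
    total = len(session_sizes)
    return {label: round(100 * counts[label] / total)
            for _, label in BANDS if counts[label]}

sizes = [1, 2, 1, 3, 5, 12, 1, 2, 25, 4]   # illustrative session sizes
print(penetration_profile(sizes))
```

Run over a full year of sessionised log data, this yields the column percentages shown in Table 3.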
These were registered users, and the fact that the bouncer/checker proportions were about half those of the total population of users (as shown in Table 3) probably reflects the commitment and loyalty shown by people who had bothered to register.

5.2.2. Users defined by place of work (Blackwell)

Interestingly, the user's place of work is not a statistically significant variable (Fig. 4) and there are no real differences in the number of requests in a session by place of work. There was insufficient evidence to reject the null hypothesis (chi-squared) at the 5% significance level; hence, for example, the difference of 29% versus 24% is due to sampling and does not reflect an actual difference between place of work and requests in a session.
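The significance check described here is a standard Pearson chi-squared test of independence on a contingency table of sessions. A self-contained sketch, with illustrative counts (not the paper's data):

```python
# Illustrative counts (not the paper's data): sessions cross-tabulated
# by place of work (rows) and items-viewed band (columns).
observed = [
    [700, 200, 60, 40],   # university
    [690, 210, 55, 45],   # hospital / medical school
    [710, 195, 58, 37],   # other
]

def chi2_statistic(table):
    """Pearson chi-squared statistic for a contingency table."""
    row_tot = [sum(r) for r in table]
    col_tot = [sum(c) for c in zip(*table)]
    total = sum(row_tot)
    return sum((obs - rt * ct / total) ** 2 / (rt * ct / total)
               for row, rt in zip(table, row_tot)
               for obs, ct in zip(row, col_tot))

chi2 = chi2_statistic(observed)
df = (len(observed) - 1) * (len(observed[0]) - 1)   # (rows-1)*(cols-1) = 6
CRITICAL_5PC = {6: 12.592}                          # chi-squared critical value, df=6
significant = chi2 > CRITICAL_5PC[df]
print(f"chi2={chi2:.2f}, df={df}, significant at 5%: {significant}")
```

With rows as similar as these, the statistic falls well below the critical value, mirroring the "not statistically significant" finding for place of work.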
Fig. 4. Place of work by requests in a session.
5.2.3. Users defined by type of subscriber (Emerald)

For this analysis the subscriber groups have been divided further: trialists have been divided into two types – those joining online or offline – and new groups have been identified: users who searched via Ingenta, and those who took advantage of Journal of the Week promotions (Fig. 5). Non-subscribers recorded the highest percentage
Fig. 5. Items viewed in a session by type of subscriber (χ² = 862,931, df = 24, p < .001).
of sessions where only one item was viewed: three-quarters of non-subscribers (75%) viewed just one page, just under six times the proportion of Big Deal subscribers. Clearly these users see what is there and leave without fully exploring the site, perhaps with a view to returning at a later date, or simply going somewhere better. We call these people bouncers, and they are a big feature of most websites, even academic ones. Off-line trialists were the least likely to view only one item in a session; these people were plainly giving the site a real test – 55% of them viewed 4 or more items in a session. Journal of the Week users also made good use of the site, as nearly two-thirds (64%) viewed 4 or more items in a session. On-line trialists were likewise unlikely to conduct a session in which only a single item was viewed, which suggests that this metric is one that measures interest.

5.2.4. Users defined by geographical location of the user (Emerald)

Fig. 6 shows that UK users were the most active when online – 53% viewed 4 or more items in a session – and Western European users the least active, with 35% viewing 4 or more items in a session. These data were obtained from an analysis of IP addresses and are therefore less robust (UK users may register with a USA service provider).

5.2.5. Users defined by type of university (Emerald)

Universities were classified according to whether they were one of the 'new' or 'old' UK universities. Old universities tend to be the most research active, and we wanted to see whether this had an impact on digital information seeking. We also subdivided them according to whether they subscribed to Emerald's Big Deal, as this was clearly an important variable (Fig. 7). Old universities penetrated the site more deeply, and having a Big Deal did not really make much difference. Big Deals did make a big difference in the case of new universities, and it was non-deal new universities that were more likely to have 'bouncer' sessions: 15% did, as compared to an expected value of about 9%.
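Breakdowns like these depend on the 'deep' enrichment step: joining raw log sessions to subscriber records, for example by matching a session's IP address to institutional IP ranges. A minimal sketch, with hypothetical IP ranges and attribute names:

```python
import ipaddress

# Hypothetical subscriber records keyed by institutional IP range; the
# attribute names (university type, Big Deal status) are assumptions
# modelled on the groupings used in this section.
subscribers = {
    "144.82.0.0/16": {"university": "old", "big_deal": True},
    "161.74.0.0/16": {"university": "new", "big_deal": False},
}

def enrich(session_ip: str) -> dict:
    """Attach subscriber attributes to a session by IP-range lookup."""
    addr = ipaddress.ip_address(session_ip)
    for cidr, attrs in subscribers.items():
        if addr in ipaddress.ip_network(cidr):
            return attrs
    return {"university": "unknown", "big_deal": False}

print(enrich("144.82.100.9"))   # {'university': 'old', 'big_deal': True}
```

Once every session carries these attributes, the penetration and return-visit metrics can be cross-tabulated by any of them.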
Fig. 6. Items viewed in a session by geographical location of user.
Fig. 7. Items requested in a session: old vs new universities (χ² = 867.7, df = 12, p < .001).
5.3. Return visitors

The number of times someone returns to a site to search is plainly a key metric, which tells us something about site loyalty and satisfaction. Coming back to a site constitutes conscious and directed use. The industry calls it site stickiness, and everyone wants their site to be sticky. However, in our previous research we have found that not only do people view very little of a site's contents, but they also do not come back very often. We put this down to an information promiscuity that has arisen out of massive digital choice. In theory, how frequently users return should depend on the nature of the site – a newspaper site, for instance, might be expected to obtain more return visits. It is not clear what would constitute a natural frequency for a journal site. However, almost by definition, in the case of academics one would have thought that subscribers would naturally develop a repeat behaviour in order to fulfil their current awareness needs. Table of contents email alert services, which both Blackwell Synergy and EmeraldInsight offer for their journals, are one factor that can trigger returnees to revisit the sites. Table 4 (Column 2), which shows the number of times Emerald users returned to the site during 2002, however, tells us otherwise. It shows that the large majority of people (69%) visited the site once during the 12 month period. Just under a quarter of the users visited the site between 2 and 5 times, about 5% of users visited between 6 and 15 times, and just one and a half percent visited over 15 times. Given that in some cases the user is in fact a multi-user, the number of individuals returning is probably an overestimate. Interestingly, the Blackwell data (Column 3), despite being collected over a much shorter period (2 months), show higher levels of return visits, although even here just under two-thirds of users did not revisit within the survey period.

Table 4
Users grouped by number of visits made during survey period

Number of visits   Emerald (January–December 2002)   Blackwell (February–March 2003)
1                  69%                               63%
2 to 5             24%                               28%
6 to 15            5%                                6%
Over 15            2%                                3%
Total              100                               100

5.3.1. Users defined by occupation (Blackwell)

Fig. 8 should go some way towards removing any worries that educational policymakers might have regarding the current awareness activities of academics – current awareness is, after all, an important performance metric. Professors and teachers were the most likely to return to the site over the one month period, 48% doing so, while undergraduates were the least likely; only 32% returned.

5.3.2. Users defined by place of work (Blackwell)

Interestingly, the user's place of work was not a statistically significant variable (Fig. 9) and there are no differences in the number of visits by place of work.

5.3.3. Users defined by type of subscriber (Emerald)

Unsurprisingly, non-subscribers were the most likely to visit once in the survey period: 87% of them did so (Fig. 10). The real question here for publishers and librarians is why that should be so. Was it because they: (a) accidentally arrived at the site; (b) were not pleased with what they saw; (c) saw something better elsewhere? On-line trialists were the most likely to return to the site. Just under three-quarters (71%) visited two or more times; this is really fascinating consumer behaviour, because all that these people get over and above ordinary non-subscribers is the right to download 5 full-text articles during the one month trial period. Sufficient bait, it would seem, for the on-line trialists.
Nearly half (49%) of these users visited between 2 and 5 times, 18% visited 6–15 times and 4% visited over 15 times. With the notable exception of trialists, Big Deal subscribers were the most likely to return to the site: 56% of deal users visited just once, one-third (33%) visited 2–5 times, 9% visited 6–15 times and 3% visited over 15 times. It would appear that the Big Deal does engender loyalty or repeat behaviour – choice and a greater opportunity to download proved to be an attraction.
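The return-visit metric is computed by grouping sessions by user and banding users by their visit counts, as in Table 4. A minimal sketch with hypothetical session records (in the real logs, identification relied on registration details or IP addresses):

```python
from collections import Counter

# Hypothetical session records keyed by a registered user id.
sessions = [
    {"user": "u1"}, {"user": "u1"}, {"user": "u2"},
    {"user": "u3"}, {"user": "u3"}, {"user": "u3"},
    {"user": "u3"}, {"user": "u3"}, {"user": "u3"}, {"user": "u4"},
]

def visit_bands(sessions):
    """Band users by number of visits, using the Table 4 groupings."""
    visits = Counter(s["user"] for s in sessions)
    bands = Counter()
    for n in visits.values():
        if n == 1:
            bands["1"] += 1
        elif n <= 5:
            bands["2 to 5"] += 1
        elif n <= 15:
            bands["6 to 15"] += 1
        else:
            bands["over 15"] += 1
    return dict(bands)

print(visit_bands(sessions))   # u2 and u4 visit once, u1 twice, u3 six times
```

Note that, as the text observes, multi-user IP addresses inflate the apparent visit counts of single "users", so banding by registered id is the more robust option where it is available.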
Fig. 8. Number of visits by occupation.
Fig. 9. Place of work by number of visits.
Fig. 10. Number of visits in a year by type of subscriber (χ² = 298,268, df = 18, p < .001).
5.3.4. Users by geographical location (Emerald) Fig. 11 examines repeat behaviour by country in which the user was resident (based upon subscriber details). UK residents come back more often to the site and Western Europeans least frequently. This may reflect a nationalistic information trait among users. The web may be a wholly international environment
Fig. 11. Returnees (grouped) by country.
but users do not always share this trait. Alternatively, a language problem (in the case of Western Europe) might also offer an explanation.

5.3.5. Users defined by type of university (Emerald)

There were differences between old and new universities in the UK. Old universities were more likely to visit frequently: the proportion visiting more than once was 42% for old non-deal universities compared to 31% for new universities (Fig. 12). In both cases Deal universities visited more frequently, and old universities with Big Deals visited the most frequently – 53% visited more than once.

5.3.6. Users defined by level of site penetration (Emerald)

Fig. 13 shows a strong link between the number of visits made and the number of items viewed: the people making the most visits were also the people who viewed the most items. Well over half (56%) of those who made more than fifteen visits a year viewed more than 4 items in a session, whereas the same figure for people who visited once was 26%.

6. Limitations

Standard transaction log analysis has a number of limitations, such as caching, which under-reports use; problems with user identification, which is normally based on IP authentication; and problems with differentiating user performance from system performance (Jamali, Nicholas, & Huntington, 2005). Deep log analysis (DLA) methods try to minimise these limitations by enriching the log data to obtain more robust data. This enrichment procedure can include linking demographic data to log data and categorising users into smaller groups rather than looking at a broad picture of the usage. However, even DLA provides little in the way of explanation, satisfaction and impact; what it really does is raise the questions that need to be asked in interviews and questionnaires. DLA is clearly useful for certain kinds of analyses, like shedding light
Fig. 12. Number of visits in a year – UK old and new universities (χ² = 1669.0, df = 9, p < .001).
Fig. 13. Number of visits by number of requests in a session (χ² = 266,556, df = 12, p < .001).
on the format of the articles scientists read (PDF or HTML), the age of the articles (obsolescence), and the way scientists navigate to the required material (searching and browsing behaviour). But log analysis is not very helpful in discovering the value and use of the articles retrieved, or the rationales behind the information seeking behaviour expressed.

7. Conclusion

We have reported on a large scale deep log analysis that has provided usage data for two digital journal libraries, in order to demonstrate the types of analyses that are possible using such techniques. In so doing we have also provided comprehensive and detailed insights into the nature of information seeking behaviour in the digital scholarly journal environment. In regard to the type of items viewed, the picture for the two digital libraries was quite different, largely a function of the heavy use of abstracts by Emerald users. This might be a result of the site structure, as users have to view an abstract if they want to view the full text, or it might be due to greater use of the Emerald site by non-subscribers (itself a function of easier access), people for whom the abstract was a substitute for the full-text article. In terms of individual user groups, it was particularly noteworthy that there was a digital sales mentality in the case of Emerald, where trialists made greater use of articles (29% of views) than paid-up subscribers (24%), who had much greater download choice. The results for the two digital libraries in regard to the number of items viewed in a session were very similar, with well over two-thirds of both Blackwell and Emerald users viewing between 1 and 3 items in a session. This supports our previously argued proposition (Nicholas et al., 2004) that web users do not dwell; they examine just a few items/pages before they leave. The key user features were:

• Non-subscribers were more likely to view a single item in a session than subscribers.
• Old university users penetrated the site more deeply, and whether they were part of a Big Deal did not make much difference. Big Deals made a difference in the case of new universities, and it was non-deal new universities that were more likely to have 'bouncer' sessions (viewing 1–3 items); 15% did, as compared to an expected value of about 9%. However, a follow-up study is needed to explain this trend, especially in terms of user satisfaction. For example, users who penetrate the site more may do so because they cannot find exactly what they want or need, while users who view just a few items and leave might do so because they find exactly what they are looking for and then leave the site.

For both digital libraries around two-thirds of visitors did not return within the survey period, and we largely put this down to the information promiscuity that has arisen out of massive digital choice (Nicholas et al., 2004). The higher percentage returning to the Blackwell site is thought to be due to the more pressing current awareness needs of scientists. Other features of the returnee analysis were:

• Professors and teachers were the most likely to return, 48% doing so, while undergraduates were the least likely; only 32% returned.
• Non-subscribers were more likely to visit once than subscribers.
• UK users were more likely to return.
• Big Deal universities visited more frequently than non-deal universities; old universities with Big Deals visited the most frequently – 53% visited more than once.

By using the kind of analysis we have outlined here we can profile key user groups, as the following example of the occupational user group shows. Professors/lecturers proportionally conducted more sessions in which 4–10 items were viewed, and were the most likely to revisit the site. Undergraduate students conducted the highest proportion of sessions viewing 11 or more items and were the least likely group to revisit.
Postgraduates' search sessions were characterised by a low number of items viewed. This may be because undergraduate students are less familiar with the topics for which they are searching compared to postgraduates; hence they
need to check more items to find what they want. This is of course just a hypothesis, and the issue is yet to be explained by a qualitative follow-up study. Based on these findings, the direction of our future research will be twofold:

1. to investigate the possibility of relating use data to user demographic and perception data by means of a questionnaire filled in by subscribers to the site. This would provide us with the means to explain use and attribute it to various behaviours. This is in fact now taking place in a study of ScienceDirect (2005–2006).
2. to conduct follow-up survey work with users to obtain answers to the questions raised by the logs. This is being undertaken with OhioLINK users (2005–2008).

Acknowledgements

The authors acknowledge the following organizations who helped fund the research reported in the article: The Ingenta Institute, Blackwell, and Emerald.

References

Bishop, A. P. et al. (2000). Digital libraries: Situating use in changing information infrastructure. Journal of the American Society for Information Science, 51(4), 394–413.
Bonthron, K. et al. (2003). Trends in use of electronic journals in higher education in the UK – Views of academic staff and students. D-Lib Magazine, 9(6). Available from http://www.dlib.org/dlib/june03/urquhart/06urquhart.html.
Borghuis et al. (1996). As cited in Bishop, A. P. et al. (2000). Digital libraries: Situating use in changing information infrastructure. Journal of the American Society for Information Science, 51(4), 394–413.
Boyce, P., King, D. W., Montgomery, C., & Tenopir, C. (2004). How electronic journals are changing patterns of use. The Serials Librarian, 46(1–2), 121–141.
Davis, P. M. (2002). Patterns in electronic journal usage: Challenging the composition of geographic consortia. College and Research Libraries, 63(6), 484–497 (and E-mail to the author, 01/05/2004).
Davis, P. M. (2004a). Information-seeking behaviour of chemists: A transaction log analysis of referral URLs.
Journal of the American Society for Information Science and Technology, 55(4), 326–332 (and E-mail to the author, 02/06/2004).
Davis, P. M. (2004b). For electronic journals, total download can predict number of users. Portal: Libraries and the Academy, 4(3), 379–392.
Davis, P., & Solla, L. (2003). An IP-level analysis of usage statistics for electronic journals in chemistry: Making inferences about user behaviour. Journal of the American Society for Information Science and Technology, 54(11), 1062–1068.
Eason, K., Richardson, S., & Yu, L. (2000). Patterns of use of electronic journals. Journal of Documentation, 56(4), 477–504.
Eason, K., Yu, L., & Harker, S. (2000). The use and usefulness of functions in electronic journals: The experience of the SuperJournal Project. Program, 34(1), 1–28.
Entlich, R. et al. (1996). As cited in Bishop, A. P. et al. (2000). Digital libraries: Situating use in changing information infrastructure. Journal of the American Society for Information Science, 51(4), 394–413.
Finholt, T. A., & Brooks, J. (1999). Analysis of JSTOR: The impact on scholarly practice of access to on-line journal archives. In R. Ekman & R. E. Quandt (Eds.), Technology and scholarly communication (pp. 177–194). Berkeley: University of California Press.
Gargiulo, P. (2003). Electronic journals and users: The CIBER experience in Italy. Serials, 16(3), 293–298 (and E-mail to the author, 10/05/2004).
Institute for the Future (2002a). E-Journal user: Report of Web Log data mining. Accessed 24.04.2000.
Institute for the Future (2002b). E-Journal user study: Research findings. Accessed 24.04.2000.
Jamali, H. R., Nicholas, D., & Huntington, P. (2005). The use and users of scholarly e-journals: A review of log analysis studies. Aslib Proceedings, 57(6), 554–571.
Ke, H.-R., Kwakkelaar, R., Tai, Y., & Chen, L. (2002). Exploring behaviour of e-journal users in science and technology: Transaction log analysis of Elsevier's ScienceDirect OnSite in Taiwan.
Library and Information Science Research, 24(3), 265–291 (and E-mail to the author, 25/05/2004).
Liew, C. L., Foo, S., & Chennupati, K. R. (2000). A study of graduate student and end-users' use and perception of electronic journals. Online Information Review, 24(4), 302–315.
Monopoli, M., & Nicholas, D. (2001). A user evaluation of subject based information gateways: Case study ADAM. Aslib Proceedings, 53(1), 39–52.
Monopoli, M., Nicholas, D., Georgiou, P., & Korfiati, M. (2002). A user-oriented evaluation of digital libraries: Case study the 'electronic journals' service of the library and information service of the University of Patras, Greece. Aslib Proceedings, 54(2), 103–117.
Nelson, D. (2001). The uptake of electronic journals by academics in the UK, their attitudes towards them and their potential impact on scholarly communication. Information Services & Use, 21(3–4), 205–214.
Nicholas, D., Huntington, P., Lievesley, N., & Wasti, A. (2000). Evaluating consumer Web site logs: Case study The Times/Sunday Times Web site. Journal of Information Science, 26(6), 399–411.
Nicholas, D., Huntington, P., Rowlands, I., Russell, B., & Cousins, J. (2004). Opening the digital box: What deep log analysis can tell us about our digital journal users. In Charleston 2003 conference proceedings, Charleston, SC.
Nicholas, D., Huntington, P., & Watkinson, A. (2003). Digital journals, big deals and online searching behaviour: A pilot study. Aslib Proceedings, 55(1–2), 84–109.
Nicholas, D., Huntington, P., & Watkinson, A. (2005). Scholarly journal usage: The results of deep log analysis. Journal of Documentation, 60(2), 248–280.
Nicholas, D., Huntington, P., & Williams, P. (2004). Digital consumer health information and advisory services in the UK: A user evaluation and sourcebook. London: City University/DoH. Available from http://ciber.soi.city.ac.uk/dhrgreports.php.
Nicholas, D., Huntington, P., Williams, P., & Dobrowolski, T. (2004). Re-appraising information seeking behaviour in a digital environment: Bouncers, checkers, returnees and the like. Journal of Documentation, 60(1), 24–39.
Obst, O. (2003). Patterns and costs of printed and online journal usage. Health Information and Libraries Journal, 20(1), 22–32.
Pullinger, D., & Baldwin, C. (2002). Electronic journals and user behaviour: Learning for the future from the SuperJournal Project. Cambridge: Deedot Press.
Rusch-Feja, D., & Siebeky, U. (1999). Evaluation of usage and acceptance of electronic journals. D-Lib Magazine, 5(10). Available from http://www.dlib.org/dlib/october99/rusch-feja/10rusch-feja-full-report.html.
Salisbury, L., & Noguera, E. (2003).
Usability of e-journals and preference for the virtual periodicals room: A survey of mathematics faculty and graduate students. Electronic Journal of Academic and Special Librarianship, 4(2–3). Available from http://southernlibrarianship.icaap.org/content/v04n03/Salisbury_l01.htm.
Sathe, N. A., Grady, J. L., & Giuse, N. B. (2002). Print versus electronic journals: A preliminary investigation into the effect of journal format on research processes. Journal of the Medical Library Association, 90(2), 235–243.
Smith, E. T. (2003). Changes in faculty reading behaviours: The impact of electronic journals on the University of Georgia. The Journal of Academic Librarianship, 29(3), 162–168.
Talja, S., & Maula, H. (2003). Reasons for the use and non-use of electronic journals and databases: A domain analytical study in four scholarly disciplines. Journal of Documentation, 59(6), 673–691.
Tenner, E., & Ye, Z. (1999). End-user acceptance of electronic journals: A case study from a major academic research library. Technical Services Quarterly, 17(2), 1–14.
Tenopir, C. (2002). Online serials heat up. Library Journal, 127(October), 37–38.
Tenopir, C. (2003). Use and users of electronic library resources: An overview and analysis of recent research studies. Report for the Council on Library and Information Resources, August 2003. Available from http://www.clir.org/pubs/reports/pub120/pub120.pdf.
Tenopir, C., & King, D. (2001). Electronic journals: How user behaviour is changing. In Online information 2001. Proceedings of the international online information meeting, London, 4–6 December 2001 (pp. 175–181). Oxford: Learned Information Europe Ltd.
Teskey, P., & Urquhart, E. (2001). The acceptance of electronic journals in UK higher education. Information Services & Use, 21(3–4), 243–248.
Tomney, H., & Burton, P. F. (1998). Electronic journals: A study of usage and attitudes among academics. Journal of Information Science, 24(6), 419–429.
TULIP Final Report (1996).
Elsevier Science, Amsterdam. Available from http://www.elsevier.com/wps/find/librarians.librarians/tulipfr.
Yu, L., & Apps, A. (2000). Studying e-journal user behaviour using log files: The experience of SuperJournal. Library and Information Science Research, 22(3), 311–338.
Zhang, Z. (1999). Evaluating electronic journals services and monitoring their usage by means of WWW server log file analysis. Vine, 111, 37–42.