Language barriers and bibliographic retrieval effectiveness: use of MEDLINE by French-speaking end users

By Evelyne Mouillet

Institut de Santé Publique, d'Épidémiologie et de Développement
Université Victor Segalen Bordeaux 2
146, rue Léo Saignat
33076 Bordeaux Cedex, France

[email protected]

Objective: A study was conducted to determine whether bibliographic retrieval performed by French-speaking end users is impaired by English-language interfaces. The American database MEDLINE on CD-ROM was used as a model.

Methods: A survey using self-administered questionnaires was conducted at two libraries of Victor Segalen Bordeaux 2 University during a two-month period in 1997. Three study groups were constituted: MEDLINE/Ovid end users, MEDLINE/Ovid librarian-mediated users, and end users of Pascal, a French bibliographic database.

Results: Among 191 respondents, only 22% thought English was an obstacle to their bibliographic retrieval. However, the search software was generally underused and the quality of the retrieval was weak. The differences between users trained by librarians and the self-trained group were statistically significant, the former performing better.

Conclusion: Special efforts need to be made to develop curriculum training programs for computerized bibliographic retrieval in medical schools, regardless of the native language of the student.

INTRODUCTION

When biomedical researchers collect bibliographic data, they must contend with the fact that the most popular biomedical bibliographic databases use English for the search engine and interface. MEDLINE, produced by the National Library of Medicine (NLM), is one of the best-known and most widely used databases in the biomedical field. Several studies have documented the performance of this database, but most have focused on the quality of search results retrieved by English-speaking users [1-4]. This study attempted to determine whether the English language impaired the quality of bibliographic searches performed by French-speaking end users using the MEDLINE database and the Ovid search interface. The following questions were considered in the design and conduct of the study: Is there any obstacle due to the use of written English and of the specific terminology of bibliographic retrieval? If English is a problem, when do the main difficulties occur: in the choice of keywords, in the understanding of the software responses, or in the understanding of the citations? And how can this process be improved?

Bull Med Libr Assoc 87(4) October 1999

MATERIAL AND METHODS

A comparative study was carried out using two bibliographic databases on CD-ROM: the American MEDLINE/Ovid Technologies and the French Pascal. MEDLINE indexes over 3,200 journals from seventy different countries. All citations are presented in English. The presentation of this database and its users' manual have been reported elsewhere [5]. MEDLINE users can generally learn to use it through the training courses offered by a medical library, by self-training, or by using the reference manual [6].

Pascal is a multidisciplinary and multilingual database produced by the French Institut National de l'Information Scientifique et Technique (INIST). Over 4,500 journals are indexed in Pascal [7], covering the major French and international research in physics, chemistry, applied sciences, biology, agronomy, and medical sciences. Forty percent of the records are from European and French scientific literature, which is indexed as a priority. The languages of the documents are ranked as follows: English 74%, French 10%, Russian 7%, German 6%, and others 3%. The Pascal CD-ROM is updated quarterly. The search screens are bilingual (French and English). The software, GTI, produced by Jouve SI, offers two types of interfaces: an assisted mode allowing retrieval mainly by choices from menus, and an expert mode in which commands are keyed in directly, much like the Boolean method used by most online systems. The search can be carried out by keywords. Pascal's controlled vocabulary includes 80,000 terms in English, French, Spanish, or German. The principles of retrieval are similar to those of MEDLINE/Ovid and, although Pascal looks easier to a French user, its software interface is less friendly. A few training courses are available through university libraries in France.

Table 1
Socio-behavioral characteristics of the bibliographic databases users, Bordeaux, France, 1997

                                        Group #1    Group #2    Group #3    Total
                                        N = 85      N = 85      N = 21      N = 191

Professional profile (%)
  Residents (M.D.)                      25.9        45.9        33.3        35.6
  Assistant professors                  14.1        8.2         9.5         11
  Students (master, Ph.D.)              11.8        23.5        33.3        19.4
  Documentation professionals           9.4         1.2         -           4.7
  Others                                38.8        21.2        23.9        29.3
  (# of missing data)                               (1)                     (1)
Age (years)
  Mean                                  34.1        30.32       35.2        32.6
  Standard deviation                    11.52       6.8         14.6        10.3
  Median                                31          28          29          29
  Range                                 21-70       22-60       23-71       21-71
  (# of missing data)                   (1)         (1)                     (2)
Bibliographic needs (%)*
  Specific topic                        82.4        82.4        71.4        81.2
  Prospective bibliography              36.5        29.4        47.6        34.6
  Others                                5.9         4.7         14.3        6.3
Origin of English skills (%)
  High school                           68.2        74.1        61.9        70.15
  University                            27.0        21.1        33.3        25.12
  Stay in an English-speaking country   -           1.2         -           0.52
  Self-teaching                         -           3.5         -           1.04
  Others                                4.7         1.2         4.8         3.14
English understanding (%)
  Very good/good                        7.1/57.6    1.2/44.7    -/38.1      3.7/49.7
  Weak/bad                              27.1/5.9    45.9/8.2    52.4/9.5    38.2/7.3
  (# of missing data)                   (2)                                 (2)
Level of satisfaction (%)
  Very satisfied                        27.1        28.2        19.0        26.7
  Satisfied                             62.3        61.2        57.1        61.3
  Not satisfied                         7.1         5.9         19.0        7.9
  Not satisfied at all                  1.2         -           -           0.5
  (# of missing data)                   (2)         (4)         (1)         (7)

* Several responses possible.
Group #1 = MEDLINE end users, Group #2 = MEDLINE librarian-mediated users, Group #3 = Pascal end users.

To reach the study objectives, a survey of database users was conducted at the Medical Library of the University Victor Segalen Bordeaux 2 and at the Public Health Library of the Institute of Public Health, Epidemiology, and Development (ISPED) of the same university. The main study group (group #1) consisted of the MEDLINE end users, a representative group of physicians, pharmacists, professors, and researchers from the university and from the adjacent Bordeaux University Hospital (Table 1). These health professionals, who worked in connection with the academic institution, were assumed to be familiar with computers and with MEDLINE/Ovid because they had chosen to be end users. Two comparison groups were also constituted.

The MEDLINE/Ovid librarian-mediated users (group #2), despite the librarian's assistance, still had to choose the English keywords and read the screen results presented in English to determine whether the retrieved citations were appropriate. Pascal end users (group #3) usually used the multilingual search engine in French.

Three self-administered questionnaires, with a common core of questions and items specific to each study group, were developed to collect data about the knowledge, attitudes, and practices of the users concerning biomedical English and especially the English glossary of bibliographic retrieval. The common questions were selected to provide information on the user's professional activity, age, education, English training and skills (especially in written English), training in bibliographic retrieval, and level of satisfaction with the present retrieval. Specific questions were added for the MEDLINE/Ovid end users about the methodology used to perform the MEDLINE search and the use of the search interface. To determine whether English had been a barrier to bibliographic retrieval and to identify when the difficulties occurred, MEDLINE end users were asked multiple-choice questions requiring them to translate selected messages provided by the software (Figure 1) and to explain the meaning of the tools and options available in the button bar of the software.

Figure 1
English language and Ovid for MEDLINE French-speaking end users

The following four questions have to be answered in the questionnaire:

1. What does this message mean? "Mapping term to subject headings. [This will take about 15 seconds ...]"
   1. Localisation du mot-clé dans le thésaurus en 15" [Locating the keyword in the thesaurus in 15 seconds]
   2. La réponse à votre sujet de recherche apparaîtra dans 15" [The answer to your search topic will appear in 15 seconds]
   3. Vous avez 15" pour trouver le bon terme [You have 15 seconds to find the right term]
2. Translate this sentence: "Use the spacebar to select at least two sets to combine."
3. "AND" allows to:
   1. De croiser les équations de recherche [Intersect the search statements]
   2. D'élargir la recherche [Broaden the search]
   3. D'appeler la bibliothécaire [Call the librarian]
4. The following sentence means: "Select any term that is more appropriate or choose OK. If there are narrower terms that are all relevant choose the Explode option."
   1. Sélectionnez le terme le mieux approprié ou cliquez OK, si les termes plus spécifiques sont tous pertinents choisissez l'option Explode [Select the most appropriate term or click OK; if the more specific terms are all relevant, choose the Explode option]
   2. Sélectionnez le terme approprié en cliquant OK, si vous voulez des termes plus spécifiques, choisissez la fonction Explode [Select the appropriate term by clicking OK; if you want more specific terms, choose the Explode function]

The survey was conducted from January to March 1997 in the two libraries. A computer message appeared automatically when the search started to inform the MEDLINE or Pascal end user about the study. The printed questionnaires were available near the computer. For the MEDLINE mediated users (group #2), the librarian systematically asked each of them to complete the questionnaire. Once completed, the questionnaires were dropped in a special box in the reading room of each library. Coding, data entry, and statistical analysis were performed in French with Epi Info version 6 software.

RESULTS

During the study period, 191 questionnaires were collected: 85 in group #1, MEDLINE end users; 85 in group #2, MEDLINE mediated users; and 21 in group #3, Pascal end users. Users were mainly medical practitioners (47% of the study sample), and their mean age was thirty-two years, without differences between groups. The individuals surveyed were generally looking for a literature review on a precise subject in order to address a clinical case or to write a thesis or a manuscript. Seven subjects out of ten had learned English in high school; one out of four continued or started English training during a university curriculum (Table 1). Twenty-seven percent of the users were very satisfied with their bibliographic retrieval and 61% were satisfied, without difference between the three groups.

The self-assessed capacity to understand English differed among the three groups. About two-thirds of the MEDLINE end users assessed their understanding as good or very good, compared to less than half of the MEDLINE mediated users and more than one user out of three in the Pascal end users group (P < 0.02). There were also important differences between the three groups with regard to confidence in English skills. English was perceived as a problem by one subject out of ten in group #1, by almost one subject out of three in group #2, and by more than one subject out of three in group #3.

The direct and quantitative evaluation of the difficulties with the English language was based on an assessment of the translation of selected messages provided by the software and on the frequency of use of the tools and options offered by the software (Figure 1). This evaluation did not show statistically different results according to the users' level of confidence in English. However, it showed that the bibliographic retrieval methodology was poor, that the results were often not relevant, and that not all the capabilities offered by the search engine were used.
Bearing in mind that 62% of MEDLINE end users had attended a training course, the differences in users' performance between the MEDLINE end users trained by the librarians and the self-trained users were statistically significant (P = 0.001). No user declared using the three available tools. Trained subjects were about ten times more likely than the other subjects to use two of the three tools available in Ovid to search the thesaurus (41.2% versus 4.2%; Table 2) and 2.5 times more likely to use the two other Ovid options (41.5% versus 16.7%). These two options are Textword and Explode (Table 3). Overall, however, six end users out of ten underused the tools and options of the search engine.

Table 2
Software tools utilization and training, MEDLINE end users, Bordeaux, France, 1997

Number of tools used    Trained users    Self-trained users    Total     P value
                        N = 51           N = 24                N = 75

≤ 1 tool (%)            58.8             95.8                  70.7      0.001
2 tools (%)             41.2             4.2                   29.3      0.001

DISCUSSION

The following observations support the representativeness of the study samples: data collection was systematic during the study period; all MEDLINE mediated users participated (no refusal in group #2); and the number of questionnaires matched library statistics on the use of the databases. The size of the samples was adequate: thirty to thirty-five subjects per group was the minimum necessary to fulfill the main objective of the survey. Despite the smaller size of group #3 (N = 21), the observed differences were at least equal to those expected.

Written English should be viewed here as the specific language of bibliographic research. It is clearly one of the difficulties of bibliographic retrieval for a French-speaking user, despite the high level of satisfaction, the absence of statistical differences between the groups in the assessment of the understanding of written English, and the significant improvement of the retrieval process due to training. During the bibliographic retrieval process, three specialized languages are required: informatics, documentation, and biomedical sciences. The English interface of MEDLINE/Ovid and the messages provided by the software are quite specific. The user's manual for MEDLINE/Ovid is a thick document in English (a French edition does not exist) that uses the specific languages of the bibliographic and informatics fields. Each term has a specific meaning according to the field concerned. What do terms such as "File," "Window," "Tools," or "Records" mean in informatics? What is the "Ctrl-key"? What happens with "press Enter"? What are the "Broader" and "Narrower Subject Headings" or the "Subheadings"? What does "mapping" mean?

Table 3
Software options utilization and training, MEDLINE end users, Bordeaux, France, 1997

Number of options used    Trained users    Self-trained or librarian-mediated users    Total     P value
                          N = 41           N = 24                                      N = 65

≤ 1 option (%)            58.5             83.3                                        67.7      0.04
2 options (%)             41.5             16.7                                        32.3      0.04
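The text does not say which statistical test produced the P values reported in Tables 2 and 3. As a rough check only, the sketch below assumes an uncorrected Pearson chi-square test on 2 x 2 counts back-calculated from the reported percentages and group sizes (for example, 58.8% of 51 trained users is taken as 30 users). Both the choice of test and the rounded counts are assumptions, and the function names are illustrative.

```python
import math


def counts_from_percent(pct_first_row, n):
    """Back-calculate (first row, second row) counts from a percentage and a group size."""
    first = round(pct_first_row * n / 100)
    return first, n - first


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square and P value (1 degree of freedom).

    a, b = counts for group 1 (row 1, row 2 of the table)
    c, d = counts for group 2 (row 1, row 2 of the table)
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper tail of chi-square is erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Table 2: at most one tool vs. two tools (trained N = 51, self-trained N = 24).
chi2_tools, p_tools = chi_square_2x2(
    *counts_from_percent(58.8, 51), *counts_from_percent(95.8, 24)
)

# Table 3: at most one option vs. two options (trained N = 41, others N = 24).
chi2_options, p_options = chi_square_2x2(
    *counts_from_percent(58.5, 41), *counts_from_percent(83.3, 24)
)

print(f"Table 2: chi2 = {chi2_tools:.2f}, P = {p_tools:.3f}")
print(f"Table 3: chi2 = {chi2_options:.2f}, P = {p_options:.3f}")
```

Under these assumptions the sketch gives P values of roughly 0.001 for Table 2 and roughly 0.04 for Table 3, in line with the reported figures; a corrected test or an exact test would give somewhat different values.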

The etymology of the words used in the retrieval process is far from the Latin roots of the French language, for example, terms such as "Browse," "Keywords," or "Search Fields." In short system messages, such as those displayed during retrieval, some words are immediately understood, and from them the user deduces the meaning of the sentence. But while a term may be easily understood in common English, its exact meaning in the language of bibliographic retrieval is not always obvious. For instance, the term "Explode," which is a very important option of the search interface, is easily translated, but what does it actually mean in the context of a bibliographic search? MEDLINE users also face difficulties in using an English biomedical thesaurus and, despite their medical background, finding the proper MeSH term is often a problem. When English is a barrier, difficulties occur primarily during the record selection process, in interpreting titles and abstracts, despite the fact that the English terms used are very close to those learned in the medical curriculum.

These elements of discussion highlight the problem of the bibliographic training of end users, particularly those in the biomedical sciences. Most users are in fact self-trained, and this way of learning often remains superficial and incomplete. Another factor makes the situation worse: working with a computerized database almost always appears successful at first. Within a couple of minutes, results are obtained, and they are rarely null. Moreover, once some records have been retrieved, users find it very difficult to accept that their bibliographic strategy has been weak or even irrelevant.

The results show that, despite a high level of user satisfaction with retrieval, MEDLINE/Ovid is often not used effectively. Understanding written English is still difficult for French-speaking bibliographic database users, and statistical comparisons confirm better results for trained MEDLINE end users.
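The questions raised in this discussion about what "AND" and "Explode" actually do can be made concrete with a small sketch. It assumes a toy thesaurus and a toy index of records: every heading name and record number below is invented for illustration and is not taken from MeSH or from the Ovid software. "Explode" is shown as searching a heading together with all of its narrower headings, and "AND" as keeping only the records common to two result sets.

```python
# Toy thesaurus (hypothetical): each heading maps to its directly narrower headings.
NARROWER = {
    "Infection": ["Bacterial Infection", "Viral Infection"],
    "Viral Infection": ["Influenza"],
}

# Toy index (hypothetical): heading -> record numbers indexed with that heading.
INDEX = {
    "Infection": {1},
    "Bacterial Infection": {2, 3},
    "Viral Infection": {4},
    "Influenza": {5, 6},
    "Child": {3, 5, 7},
}


def search(heading):
    """Records indexed with exactly this heading."""
    return set(INDEX.get(heading, set()))


def explode(heading):
    """Records indexed with the heading or with any heading narrower than it."""
    records = search(heading)
    for child in NARROWER.get(heading, []):
        records |= explode(child)
    return records


plain = search("Infection")            # the heading alone
exploded = explode("Infection")        # the heading plus all narrower headings
combined = exploded & search("Child")  # "AND": records present in both sets

print(sorted(plain))     # [1]
print(sorted(exploded))  # [1, 2, 3, 4, 5, 6]
print(sorted(combined))  # [3, 5]
```

In this toy example, a user who does not explode the broad heading retrieves one record instead of six, which illustrates how a search can return results and still be incomplete.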
Because the most popular bibliographic databases in the biomedical sciences are in English (MEDLINE, Current Contents, Biological Abstracts, etc.), because access to scientific literature and medical information is now widely mediated by the Internet, and because the individual end user is a new type of researcher less likely to have had formal training, there is a need for formal training in the bibliographic process (database organization, citation entry, search engine knowledge, thesaurus organization) and for practical training in the design of search queries and the use of the tools available within the software interface [8]. Researchers in developing countries are aware of these large and growing opportunities to access international networks and use them more and more frequently [9]. The role of libraries and research units is now to teach their users bibliographic retrieval methods and techniques so that, as end users, they can perform documentation requests correctly and satisfy their bibliographic needs. It would be appropriate to develop educational programs and instructional materials that present search procedures specifically for end users of bibliographic databases.

ACKNOWLEDGMENTS

The author thanks Jacqueline Meynard for her statistical and informatics support, and the librarians of the Service Commun de la Documentation de l'Université Victor Segalen Bordeaux 2, who took care of the questionnaire circulation, especially Marie Françoise Vitrac and Maite Courbin. Special thanks to the faculty members of the DEA de Langue Anglaise de Spécialité Scientifique et Technique, especially Pr. Michel Perrin and Monique Memet, who supervised this study.

REFERENCES

1. McKIBBON KA, HAYNES RB, DILKS CJ, RAMSDEN MF, RYAN NC, BAKER L, FLEMMING T, FITZGERALD D. How good are clinical MEDLINE searches? a comparative study of clinical end-user and librarian searches. Comput Biomed Res 1990 Dec;23(6):583-93.
2. POISSON EH. End-user searching in medicine. Bull Med Libr Assoc 1986 Oct;74(4):293-9.
3. WALLINGFORD KT, HUMPHREYS BL, SELINGER NE, SIEGEL ER. Bibliographic retrieval: a survey of individual users of MEDLINE. MD Computing 1990 May-Jun;7(3):166-71.
4. MILLER N, KIRBY M, TEMPLETON E. MEDLINE on CD-ROM: end-user searching in a medical school library. Med Ref Serv Q 1988 Mar;7(3):1-12.
5. BLOCH-MOUILLET E. Foreign international bibliographic databases: Index Medicus: presentation and use [CD-ROM edition, French]. Sante 1997 Mar;7(3):135-42.
6. Ovid v3.0 search software. starter kit. New York, NY: CDPlus Technologies, 1994.
7. INIST-CNRS. Périodiques analysés dans la base de données Pascal. Vandoeuvre, France: INIST, 1995 May.
8. CHISNELL C, DUNN K, SITTIG DF. Determining educational needs for the biomedical library customer: an analysis of end-user searching in MEDLINE. In: Greenes RA, Peterson HE, Protti DJ, eds. Medinfo 95, vol 8 pt 2. Edmonton, AB: Healthcare Computing & Communications Canada, 1995: 1423-7.
9. HANMER L, IBRAHIM AS, KADIO A, KORPELA M, MOUILLET E, eds. Special issue on health informatics on the African continent, HELINA '96. Methods Inf Med 1997;36:61-162.

Received February 1999; accepted May 1999