Design and implementation of a library-based information service in molecular biology and genetics at the University of Pittsburgh

Ansuman Chattopadhyay, PhD, Information Specialist in Molecular Biology and Genetics
Nancy Hrinya Tannery, MLS, Associate Director for Information Services
Deborah A. L. Silverman, MLS, Associate Director for Resource Management
Phillip Bergen, MA, MSIS, Information Architecture Librarian
Barbara A. Epstein, MSLS, AHIP, Director

Health Sciences Library System, University of Pittsburgh, 200 Scaife Hall, 3550 Terrace Street, Pittsburgh, Pennsylvania 15261

Setting: In summer 2002, the Health Sciences Library System (HSLS) at the University of Pittsburgh initiated an information service in molecular biology and genetics to assist researchers with identifying and utilizing bioinformatics tools.

Evaluation Mechanisms: Researcher feedback gathered during the first three years of workshops and individual consultations indicates that the information service is meeting user needs.

Program Components: This novel information service comprises hands-on training workshops and consultation on the use of bioinformatics tools. The HSLS also provides an electronic portal and networked access to public and commercial molecular biology databases and software packages.

Next Steps/Future Directions: The service’s workshop offerings will expand to include emerging bioinformatics topics. A frequently asked questions database is also being developed to reuse advice on complex bioinformatics questions.

J Med Libr Assoc 94(3) July 2006

Highlights
● Formal educational and professional experience aids the program’s information specialist in designing services and selecting resources.
● User feedback indicates this expanding service is well received by researchers and graduate students.

Implications
● Subject expertise proved useful for designing relevant training and services for users in the basic sciences.
● A customized Web portal can be a valuable tool for providing both information resources and available library services to this user group.

INTRODUCTION

In the wake of the explosion in the volume of research, database, and discovery tools in the biomedical sciences [1], life scientists face a difficult task in maintaining awareness of advancements in their fields of interest. Researchers and developers in the field of bioinformatics, the branch of science connecting biology with information and computer science, are creating intuitive tools to assist biologists in browsing, analyzing, and deciphering sequence information. To maximize research success, however, biologists require both access to these information tools and proper training in their use. This paper describes the University of Pittsburgh Health Sciences Library System’s (HSLS’s) development and implementation of an information service in molecular biology and genetics to meet these needs.

THE INFORMATION SERVICE IN MOLECULAR BIOLOGY AND GENETICS

The HSLS at the University of Pittsburgh provides collections and services to meet the information needs of the educational, clinical, and research programs of the schools of medicine, dental medicine, pharmacy, health and rehabilitation sciences, and nursing and the graduate school of public health, as well as the seventeen hospitals of the University of Pittsburgh Medical Center (UPMC). The University of Pittsburgh has a strong biomedical research program and has ranked as one of the top ten universities in both National Institutes of Health (NIH) and National Science Foundation (NSF) funding.

After an extended planning process that included a survey of researcher needs and of the ideal skills for a dedicated support position, HSLS initiated the information service in molecular biology and genetics in May 2002 with the hiring of an information specialist in molecular biology and genetics to lead the program. The information specialist had extensive training in the basic sciences, including a bachelor’s degree in chemistry, master’s and doctoral degrees in biochemistry, and postdoctoral research in signal transduction, as well as experience with the development of commercial literature retrieval software. Shortly after joining the HSLS staff, the information specialist attended the “National Center for Biotechnology Information (NCBI) Advanced Workshops for Bioinformatics Information Specialists” (NAWBIS) course [2] to gain experience in public domain bioinformatics resources and strategies for developing related workshops.

To determine offerings for this novel service, the authors consulted articles describing the University of Washington’s seminal molecular biology/genetics-focused program [3]. The information specialist also drew on previous research experience and information needs. Based on this analysis, the information service in molecular biology and genetics includes four main components: (1) hands-on workshops in the use of bioinformatics databases and software, (2) bioinformatics consultations with researchers, (3) licensing of commercial bioinformatics products, and (4) a molecular biology Web portal comprising information about services, workshops, and available information resources and tools.

Training workshops

By October 2002, the information specialist had developed three workshops patterned on the “NAWBIS” course modules. Topics that appealed to a broad audience and covered basic bioinformatics tools for molecular biology research were introduced first. “Information Hubs for Molecular Biology and Genetics” provided a general outline of Web-based molecular biology and genetics tools, with a focus on the NCBI’s Entrez system for searching nucleotide, protein, and structure databases.
The “Sequence Similarity Searching” workshop covered protein and nucleotide sequence similarity searching tools, such as the Basic Local Alignment Search Tool (BLAST) [4, 5]. The “DNA and Protein Analysis Tools” workshop covered major Internet-accessible tools for nucleic acid and protein sequence analysis, including functions such as restriction mapping, PCR primer design, and multiple sequence alignment.

Each workshop, held in the library’s fifteen-seat training room, was a three-hour session comprising two hours of instruction and one hour of hands-on exercises; American Medical Association (AMA) category 2 continuing medical education credits were offered for each course. The workshops were open to affiliates of the schools of the health sciences at the University of Pittsburgh and to medical staff and other employees of UPMC. Marketing strategies included advertisements on the HSLS and University Health Sciences Websites, notices in the HSLS and UPMC newsletters, and a mass email message to a select faculty list. Each workshop’s PowerPoint slides were also made available on the HSLS Website.

Workshop participants completed a brief seven-question feedback form (Appendix, available only online <http://www.pubmedcentral.nih.gov/tocrender.fcgi?action=archive&journal=93>), adapted from a Medical Library Association course evaluation form. The information specialist trained 150 participants in 15 sessions during the 2002 academic year. The heterogeneous group of attendees included faculty, postdoctoral trainees, research assistants, clinicians, and graduate students. These first workshops were refined based on participants’ feedback, with content shortened or split into separate sections.

Workshops on HSLS-licensed commercial software programs, such as CellSpace Knowledge Miner [6] and VectorNTI [7], were introduced in October 2003 and March 2004, respectively. During the 2003 academic year, 161 participants attended 19 sessions of 7 different workshops.

In 2004, in response to numerous requests for workshops on single nucleotide polymorphism (SNP) analysis and on genome map browsing, the information specialist added two new workshops: “Genetic Variation Resources,” focusing on variation databases and tools for predictive functional analysis of mutations, and “Introduction to Genome Browsers,” introducing prominent genome browsers, such as Map Viewer [8], Ensembl [9], and the University of California Santa Cruz Genome Browser [10], and their applications for identifying and localizing genes contributing to genetic disorders. The “Introduction to CellSpace Knowledge Miner” class was expanded to “Gene-Protein Based Literature Searching” and modified to introduce additional common literature mining software, such as PubGene [11] and Bibliosphere [12]. In the 2004 academic year, 9 workshops were offered in 29 sessions to 302 attendees, an increase of more than 80% over the previous year.
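The yearly attendance figures reported above can be cross-checked against the per-workshop totals in Table 1. The following is a minimal sketch in Python, not part of the original study: the variable and function names are illustrative assumptions, and the numbers are copied from the text and from Table 1.

```python
# Yearly workshop figures as reported in the text: (sessions, participants).
yearly = {2002: (15, 150), 2003: (19, 161), 2004: (29, 302)}

# Per-workshop figures as listed in Table 1: (times offered, total participants).
table1 = {
    "Information Hubs": (6, 71),
    "Sequence Similarity Searching": (11, 121),
    "DNA-Protein Analysis Tools": (5, 42),
    "Introduction to CellSpace Knowledge Miner": (2, 18),
    "Genetic Information Hubs": (5, 26),
    "Protein Information Hubs": (4, 18),
    "DNA Analysis Tools": (6, 51),
    "Protein Analysis Tools": (4, 38),
    "Introduction to VectorNTI": (12, 162),
    "Gene-Protein Based Literature Searching": (4, 26),
    "Genetic Variations": (2, 28),
    "Introduction to Genome Browsers": (2, 12),
}


def percent_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


# Totals over the three academic years, from the text.
total_sessions = sum(sessions for sessions, _ in yearly.values())
total_participants = sum(people for _, people in yearly.values())

# Totals over all workshops, from Table 1.
table_sessions = sum(times for times, _ in table1.values())
table_participants = sum(people for _, people in table1.values())

# Change in attendees from the 2003 to the 2004 academic year.
growth = percent_change(yearly[2003][1], yearly[2004][1])

print(total_sessions, table_sessions)          # 63 63
print(total_participants, table_participants)  # 613 613
print(round(growth, 1))                        # 87.6
```

Both sources agree on 63 sessions and 613 participants overall, and the change in attendees from 2003 to 2004 works out to about 88%, in the same range as the increase reported in the text.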
The growth in workshop attendance testifies to the increasing popularity and visibility of the information service in molecular biology and genetics. Table 1 lists the workshop topics offered, the year in which each was added, the number of times each was offered, and the total number of attendees.

Table 1
Molecular biology hands-on workshops statistics

Workshop                                     Year added   Number of times offered   Total number of participants
Information Hubs                             2002         6                         71
Sequence Similarity Searching                2002         11                        121
DNA-Protein Analysis Tools                   2002         5                         42
Introduction to CellSpace Knowledge Miner    2002         2                         18
Genetic Information Hubs                     2003         5                         26
Protein Information Hubs                     2003         4                         18
DNA Analysis Tools                           2003         6                         51
Protein Analysis Tools                       2003         4                         38
Introduction to VectorNTI                    2003         12                        162
Gene-Protein Based Literature Searching      2004         4                         26
Genetic Variations                           2004         2                         28
Introduction to Genome Browsers              2004         2                         12

Graduate course lectures

In addition to these workshops, the information specialist participates as a guest lecturer or co-instructor in several graduate courses. These include lectures on computational methods for protein structure-function analysis in a required course for first-year school of medicine graduate students and lectures on molecular databases and software tools, such as Entrez Gene [13] and BLAST [4], in several courses for graduate students in the departments of medicine and public health.

Bioinformatics consultation services

The information service in molecular biology and genetics also offers individualized consultation to researchers on the use of bioinformatics tools. To facilitate consultation for lecture and workshop attendees as well as other interested users, the information specialist’s email address is linked from the HSLS molecular biology Web portal. Since summer 2002, the library has offered consultations to 177 individuals, including 35 in 2002, 70 in 2003, and 72 in 2004 (data tallied by academic year). The clientele comprised 25% faculty, 35% research associates, 18% research assistants, and 22% students.

Basic queries, which are answered by directing users to the proper software or databases and require less than an hour of the information specialist’s time, represent 75% of questions received to date. Analytical queries, which require multiple iterative sessions between the information specialist and the biologist and call for in-depth support of data analysis, represent about 25% of questions asked. On occasion, these complex consultations have resulted in coauthor status for the information specialist [14]. Approximately half of the questions in each category pertain to the use of commercially licensed products. Sample queries for both types of question are included in Table 2.

Table 2
Sample user queries handled by the information service in molecular biology and genetics

Basic queries
■ “I took your course on SNPs and variation last week . . . Is there a way to search for SNPs in batch by gene name, symbol, or locus link ID? I haven’t figured out a way to do this yet.”
■ “I was using the NetPhos server to look for phosphorylation sites in a protein, and it gives predictions for serine, threonine, and tyrosine phosphorylation but doesn’t tell you what kinases are predicted to act on those sites. Do you know how to get that information?”
■ “I took your VectorNTI workshop . . . Would you please show me how to open ABI chromatogram file using Vector NTI?”

Analytical queries
■ “I want to design overlapping primers to amplify a 32 kb-long genomic sequence and then want to join individual PCR products into a large assembled sequence. Is there a software tool available to perform this job?”
■ “I am working with a novel protein and fortunately (or unfortunately) not much information is available for this protein. Using bioinformatics software would you be able to predict interacting partners of my protein of interest?”
■ “I would like to do an analysis of a set of promoter sequences and determine common transcriptional regulatory sites. I have done some preliminary work, but my adviser said that you might be interested in helping me further as part of a collaborative effort. If you would have time to meet with me and go over what I have so far and potentially improve on the methods I have used, I would appreciate it.”

Molecular biology Web portal

HSLS added a digital molecular biology and genetics guide in spring 2003 (Figure 1) to serve as an electronic gateway to the information service [15]. This guide offers a catalog of bioinformatics resources via a task-specific hierarchical menu of links to software tools and databases. Through the portal, affiliated researchers can send bioinformatics queries to the information specialist, browse the schedule of workshops, and access HSLS-licensed resources. Each link is tagged with an information icon providing a short resource description.

Figure 1
The Health Sciences Library System molecular biology Web portal [15]

In 2004, a major portal revision incorporated links from the Nucleic Acids Research resource listing [16]. Searching was also enhanced by integrating a search engine and clustering tool using software from Pittsburgh-based Vivisimo [17] (Figure 2).

Figure 2
Clustered search results for the query “transcription factor”

The WebTrends [18] software is used to calculate and analyze the number of visits to the site. In 2003, the HSLS molecular biology guide received an average of 1,756 visits per month from 1,258 one-time visitors (those who appear only once in the log file) and 247 visitors who accessed the site more than once. In 2004, the portal averaged 2,714 monthly visits from 1,134 one-time and 341 frequent visitors. In 2005, the portal received 5,171 visits per month from 1,834 one-time and 642 frequent visitors. The increasing volume of traffic to the site, as well as the growth in the number of repeat users, reflects the site’s utility for the institution’s researchers.

Commercial software programs and databases

Given the library’s positioning as a central resource provider, library leadership found it a natural progression to add key commercial software programs and database resources in molecular biology and genetics. In 2002, the library received a request for the Proteome Bioknowledge Library (PBKL) [19] (a suite of six protein-centric databases on human, mouse, rat, worm, yeast, and fungal pathogens) from researchers working in yeast genetics. After evaluation and a purchase recommendation by the information specialist, the HSLS negotiated an institution-wide site license for unlimited use of this resource. Similarly, CellSpace Knowledge Miner [6] (software for identifying literature associations between proteins, biological processes, cell types, and organisms) and Current Protocols Online [20], comprising laboratory manuals considered the standard for scientific research methods, were added to the HSLS collection in 2002.

In 2003, to fulfill demand for institution-wide access to a basic molecular biology commercial software package, HSLS purchased a fifteen-seat license for Vector NTI [7], which allows biologists to analyze, manipulate, construct, annotate, store, and manage DNA and protein sequences. In 2004, the library purchased a site license for Sequencher, version 4.5 [21], a DNA sequence analysis program popular among molecular biologists for its intuitive interface, ease of use, speed, and accuracy.

Outreach and communication

The information specialist promotes the information service in molecular biology and genetics by offering frequent presentations in school and departmental seminars and meetings. In 2004, the senior vice chancellor and dean of the school of medicine invited the information specialist to discuss the library’s bioinformatics initiatives at his regular meeting of department chairs. The meeting generated significant interest and resulted in requests for seminar presentations from additional departments.

For such presentations, the information specialist often uses a problem-based approach to demonstrate step-by-step use of bioinformatics tools for answering real-life biological questions. For example, a researcher with a short, twenty-amino acid human peptide sequence can gather information about its full-length protein sequence, three-dimensional (3D) structure, matching gene sequence, precise location in the human genome, and implication in any genetic malfunction before conducting any laboratory work. These examples are intended to educate and intrigue researchers by demonstrating how library-based bioinformatics tools could speed their research, and thereby to increase the information service’s user base.

The information specialist also presents formal papers at scientific symposia and meetings, including the Advancing Practice, Instruction and Innovation through Informatics (APIII) conference [22] and Medical Library Association (MLA) annual meetings [23, 24]. The HSLS bimonthly newsletter promotes outreach activities, with regular articles authored by the information specialist to highlight potential applications of bioinformatics resources licensed by the library.

In 2004, the information specialist participated in an NIH-funded project to develop a joint cancer biology curriculum between the University of Pittsburgh Cancer Institute and Hampton University in Virginia, a minority training institution. This collaboration’s goal was to increase cancer research training for both faculty and students at Hampton. The training included increasing the faculty members’ and students’ familiarity with bioinformatics. The specialist developed lectures and taught Hampton biology undergraduates as part of this project.

CHALLENGES

The bioinformatics technology and marketplace are evolving rapidly, presenting a moving target for information professionals as well as users. Software that was “cutting edge” a few years ago may be obsolete as new products emerge and scientists’ needs become more challenging. Locally loaded software may be upgraded to networked models requiring site licensing rather than software purchase. As smaller software producers are acquired by larger companies, licensing models might change or become moot as products become freely available in larger commercial endeavors. Both the information specialist and the library’s resource managers must keep pace with dynamic business models and gauge acquisitions strategies against the risks of a volatile marketplace.

In addition, the information specialist faces an uphill task in keeping up with emerging laboratory technologies and research topics in the biomedical sciences. Expertise in molecular biology laboratory techniques, as well as in-depth knowledge of the subject domain, has proved essential to completely understanding researchers’ questions and to identifying suitable tools and strategies to answer those questions.

FUTURE DIRECTIONS

HSLS has been successful thus far in providing bioinformatics support to the University of Pittsburgh research community, as evidenced by steady yearly growth in the use of its molecular biology Web portal, increasing attendance at bioinformatics workshops, and ongoing use of the bioinformatics consultation service. Future plans call for development of new workshops on emerging topics such as gene expression analysis, protein structure prediction, and RNA interference.
Commercial bioinformatics software packages often require advanced training; the popularity of HSLS’s commercially licensed programs is due in large part to the ready availability of in-house training and support. In addition to introductory workshops, the information specialist is developing advanced workshops to cover individual units of licensed software tools. The information specialist will also create and maintain a searchable repository of step-by-step guides to researchers’ frequently asked questions, helping researchers remember the sequence of steps across the disparate Web servers needed to answer complex bioinformatics queries. Finally, due to ongoing expansion of service and teaching, funding has been secured to add an assistant information specialist to share the workload and to continue to expand the offerings and services of the information service in molecular biology and genetics.
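The planned searchable repository of step-by-step guides could take many forms, and the paper does not describe its design. As one possible illustration only, the sketch below indexes guides by the words in their questions and ranks them by the number of query words matched. The class name `FaqRepository`, the stop-word list, and the placeholder steps are all hypothetical; the example questions paraphrase the sample queries in Table 2.

```python
import re
from collections import defaultdict

# Tiny stop-word list; an illustrative assumption, not a standard list.
STOPWORDS = {"a", "an", "the", "to", "of", "in", "for", "is", "by", "or", "with"}


def tokenize(text):
    """Lowercase the text and return its alphanumeric words, minus stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]


class FaqRepository:
    """Hypothetical keyword-searchable store of step-by-step guides."""

    def __init__(self):
        self.guides = []               # list of (question, steps) tuples
        self.index = defaultdict(set)  # word -> ids of guides whose question has it

    def add(self, question, steps):
        """Store a guide and index every word of its question."""
        guide_id = len(self.guides)
        self.guides.append((question, steps))
        for token in tokenize(question):
            self.index[token].add(guide_id)
        return guide_id

    def search(self, query):
        """Return guides ranked by how many distinct query words they match."""
        scores = defaultdict(int)
        for token in set(tokenize(query)):
            for guide_id in self.index.get(token, ()):
                scores[guide_id] += 1
        ranked = sorted(scores, key=lambda g: (-scores[g], g))
        return [self.guides[g] for g in ranked]


# Placeholder steps only; real guides would list the actual sequence of steps.
repo = FaqRepository()
repo.add("Search for SNPs in batch by gene name or symbol", ["<step 1>", "<step 2>"])
repo.add("Open an ABI chromatogram file in Vector NTI", ["<step 1>"])
repo.add("Find kinases predicted to act on NetPhos phosphorylation sites", ["<step 1>"])

hits = repo.search("batch SNPs")
print([question for question, _ in hits])
```

An inverted index keeps lookups fast as the number of guides grows, but a production repository would more likely rely on an existing search engine or database than on hand-written indexing.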

CONCLUSION

Traditionally, the library brings bibliographic support to the research community by offering services and expertise in resource acquisition and information distribution. It is logical for academic health sciences libraries to expand their services to include bioinformatics resources, with the goal of creating an environment that allows researchers to learn how to effectively search and organize information to facilitate discovery and analysis. Molecular databases for information retrieval and bioinformatics software for scientific data analysis are essential tools in biomedical research in the post-genome era. Responding to the need for integrated support and training to facilitate efficient resource use, the HSLS implemented a well-received information service in molecular biology and genetics, which offers a portfolio of resource training workshops and consultation services for researchers’ bioinformatics queries. The HSLS is in its third year of offering this service to the University of Pittsburgh’s research and academic community and plans to expand this service by adding resources and staff.

REFERENCES

1. BAXEVANIS AD. The molecular biology database collection: 2003 update. Nucleic Acids Res 2003 Jan 1;31(1):1–12.
2. GEER RC, MESSERSMITH DJ, ALPI K, BHAGWAT M, CHATTOPADHYAY A, GAEDEKE N, LYON J, MINIE ME, MORRIS RC, OHLES JA, OSTERBUR DL, TENNANT MR. Advanced workshop for bioinformatics information specialists. [Web document]. Bethesda, MD: National Center for Biotechnology Information. [rev. 19 Oct 2005; cited 15 Feb 2006]. <http://www.ncbi.nlm.nih.gov/Class/NAWBIS/>.
3. YARFITZ S, KETCHELL DS. A library-based bioinformatics services program. Bull Med Libr Assoc 2000 Jan;88(1):36–48.
4. ALTSCHUL SF, GISH W, MILLER W, MYERS EW, LIPMAN DJ. Basic local alignment search tool. J Mol Biol 1990 Oct 5;215(3):403–10.
5. ALTSCHUL SF, MADDEN TL, SCHAFFER AA, ZHANG J, ZHANG Z, MILLER W, LIPMAN DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997 Sep 1;25(17):3389–402.
6. CELLOMICS. CellSpace knowledge miner. [Web document]. Pittsburgh, PA: Cellomics, 2005. [rev. 23 Nov 2004; cited 15 Feb 2006]. <http://www.cellomics.com/content/menu/CellSpace_Knowledge_Miner/>.
7. LU G, MORIYAMA EN. Vector NTI, a balanced all-in-one sequence analysis suite. Brief Bioinform 2004 Dec;5(4):378–88.
8. WHEELER DL, CHURCH DM, LASH AE, MADDEN TL, PONTIUS JU, SCHULER GD, SCHRIML LM, SEQUEIRA E, TATUSOVA TA, WAGNER L. Database resources of the National Center for Biotechnology Information: 2002 update. Nucleic Acids Res 2002 Jan 1;30(1):13–6.
9. STALKER J, GIBBINS B, MEIDL P, SMITH J, SPOONER W, HOTZ HR, COX AV. The Ensembl Web site: mechanics of a genome browser. Genome Res 2004 May;14(5):951–5.
10. KAROLCHIK D, BAERTSCH R, DIEKHANS M, FUREY TS, HINRICHS A, LU YT, ROSKIN KM, SCHWARTZ M, SUGNET CW, THOMAS DJ, WEBER RJ, HAUSSLER D, KENT WJ; UNIVERSITY OF CALIFORNIA SANTA CRUZ. The UCSC genome browser database. Nucleic Acids Res 2003 Jan 1;31(1):51–4.


11. JENSSEN TK, LAEGREID A, KOMOROWSKI J, HOVIG E. A literature network of human genes for high-throughput analysis of gene expression. Nat Genet 2001 May;28(1):21–8.
12. TASHEVA ES, KLOCKE B, CONRAD GW. Analysis of transcriptional regulation of the small leucine rich proteoglycans. Mol Vis 2004 Oct 7;10:758–72.
13. MAGLOTT D, OSTELL J, PRUITT KD, TATUSOVA T. Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res 2005 Jan 1;33(database issue):D54–8.
14. MILES MC, JANKET ML, WHEELER ED, CHATTOPADHYAY A, MAJUMDER B, DERICCO J, AYYAVOO V. Molecular and functional characterization of a novel splice variant of ANKHD1 that lacks the KH domain and its role in cell survival and apoptosis. FEBS J 2005 Aug;272(16):4091–102.
15. THE HEALTH SCIENCES LIBRARY SYSTEM (HSLS), UNIVERSITY OF PITTSBURGH. The molecular biology Web portal. [Web document]. Pittsburgh, PA: The University, 2006. [rev. 19 Oct 2005; cited 15 Feb 2006]. <http://www.hsls.pitt.edu/guides/genetics/>.
16. GALPERIN MY. The molecular biology database collection: 2005 update. Nucleic Acids Res 2005 Jan 1;33(database issue):D5–24.
17. VIVISIMO. [Web document]. Pittsburgh, PA: Vivisimo, 2006. [rev. 19 Oct 2005; cited 15 Feb 2006]. <http://www.vivisimo.com>.
18. WEBTRENDS. [Web document]. Portland, OR: WebTrends, 2006. [rev. 19 Oct 2005; cited 15 Feb 2006]. <http://www.Webtrends.com>.


19. COSTANZO MC, CRAWFORD ME, HIRSCHMAN JE, KRANZ JE, OLSEN P, ROBERTSON LS, SKRZYPEK MS, BRAUN BR, HOPKINS KL, KONDU P, LENGIEZA C, LEW-SMITH JE, TILLBERG M, GARRELS JI. YPD, PombePD and WormPD: model organism volumes of the BioKnowledge library, an integrated resource for protein information. Nucleic Acids Res 2001 Jan 1;29(1):75–9.
20. WILEY INTERSCIENCE. Current protocols. [Web document]. Hoboken, NJ: John Wiley & Sons, 2006. [rev. 19 Oct 2005; cited 15 Feb 2006]. <http://www3.interscience.wiley.com/cgi-bin/browsebyproduct?type=5>.
21. TIPPMANN HF. Analysis for free: comparing programs for sequence analysis. Brief Bioinform 2004 Mar;5(1):82–7.
22. CHATTOPADHYAY A. Information hubs in molecular biology and genetics. Paper presented at: The Advancing Practice, Instruction and Innovation through Informatics (APIII) Meeting; October 2002; Pittsburgh, PA.
23. CHATTOPADHYAY A, EPSTEIN BA, MICKELSON PC, TANNERY NH. Development of information service program in molecular biology and genetics. Paper presented at: MLA ’03, Medical Library Association Annual Meeting; San Diego, CA; 2003.
24. CHATTOPADHYAY A. Selection of resources for the development of an information service program in molecular biology and genetics. Paper presented at: MLA ’04, Medical Library Association Annual Meeting; Washington, DC; 2004.

Received December 2005; accepted February 2006


APPENDIX

Evaluation survey for the Health Sciences Library System Molecular Biology Workshops*

Workshop topic:
Instructor’s name:
Date:

Please check the appropriate rating for each of the following aspects of this workshop.

                                    Agree   Somewhat agree   Somewhat disagree   Disagree   N/A
I acquired:
  knowledge and skills I can use    ▫       ▫                ▫                   ▫          ▫
Workshop objectives:
  met my expectations               ▫       ▫                ▫                   ▫          ▫
Workshop content:
  was well organized                ▫       ▫                ▫                   ▫          ▫
  was relevant to my needs          ▫       ▫                ▫                   ▫          ▫
Instructor was:
  knowledgeable                     ▫       ▫                ▫                   ▫          ▫
  well prepared and organized       ▫       ▫                ▫                   ▫          ▫

What part of this workshop was most helpful?

What part of this workshop was least helpful?

How did you learn about this workshop?

Comments:

* Adapted from the Medical Library Association’s continuing education evaluation form.
