
SYSTEMATIC LITERATURE REVIEW FOR CLINICAL PRACTICE GUIDELINE DEVELOPMENT*

BY Denis M. O'Day, MD, Earl P. Steinberg, MD, MPP (BY INVITATION), AND Kay Dickersin, PhD (BY INVITATION)

INTRODUCTION

IN 1989, CONGRESS ESTABLISHED THE AGENCY FOR HEALTH CARE POLICY and Research (AHCPR) to enhance the quality, appropriateness, and effectiveness of health care services. Within the Agency, an office known as The Forum for the Effectiveness and Quality of Health Care was further established to facilitate the development and periodic review of clinical practice guidelines. As envisaged by Congress, these guidelines were to be made available for use by physicians and other health care practitioners. Management of cataract was selected as one of the first topics for guideline development because of its prevalence in the Medicare population (1.3 million cataract operations per year) and its significant contribution to Medicare expenditures.1

In accordance with AHCPR policy, a practice guideline must be based primarily on knowledge derived from the published literature, and only secondarily on expert opinion. This policy reflected a desire to have AHCPR guidelines grounded in scientific evidence as much as possible. At the same time, it was recognized that professional judgment and group consensus may be used when insufficient empirical evidence is available. In this paper, we describe the process used to identify and analyze publications relevant to the guideline "Cataract in Adults: Management of Functional Impairment," and we discuss the quality and shortcomings of the current literature on cataract as identified in this project.

*From the Department of Ophthalmology and Visual Sciences, Vanderbilt University School of Medicine, Nashville, Tennessee; Division of Internal Medicine, Johns Hopkins University School of Medicine, Baltimore; and Clinical Trials-Epidemiology Unit, University of Maryland School of Medicine, Baltimore. Supported in part by an unrestricted grant and a Senior Scientific Investigator Award from Research to Prevent Blindness, Inc, New York.


O'Day et al

METHODS

At the outset, it was clear that the panel appointed by the AHCPR faced a mammoth task in assessing the literature relevant to cataract. Initially, Congress set a 6-month deadline for completion of the project. In July 1990, an Ad Hoc Committee, composed of ophthalmologists, experts in study design and interpretation (methodologists), and representatives from the AHCPR and the National Library of Medicine (NLM), was established to develop a strategy for identification and review of the relevant literature. This strategy was then presented to the guideline panel, which approved it after some modification. The approach to the literature review that was approved by the guideline panel was as follows (Fig 1):

Title. The panel selected "The Management of Functional Impairment Due to Cataract in Adults" as the focus for the review. (This was later changed in the final draft to "Cataract in Adults: Management of Functional Impairment".)

Categories to be covered by the literature review. Since the focus of the guideline was on the outcome of care of the patient with this condition, 16 major topics dealing with factors influencing the outcome were identified. Within each major topic, a set of core issues of interest was identified, resulting in a total of 93 separate issues related to the management and outcome of patients with cataract.2

Inclusion/exclusion criteria for selected literature. A set of comprehensive eligibility criteria determining which research articles would come under the Panel's review was specified (Table I).

Literature sources. A series of computerized databases was searched electronically (Table II). In addition, the bibliographies of pertinent articles were reviewed, and input was obtained from experts on the topics addressed in the review.

TABLE I: CRITERIA FOR INCLUSION OF AN ARTICLE IN THE LITERATURE REVIEW

Report focused on more than 10 individuals.
Research reported in articles must have focused on senile or presenile cataracts or cataracts related to diabetes.
Congenital and traumatic cataracts were excluded.
Reports on animal studies were excluded.
Postmortem studies were excluded unless other patient data were included.
Articles describing combined procedures for corneal disease or glaucoma were excluded.
Date of publication, 1975 through December 31, 1990.*
Article was published in the English language.

*Unpublished literature was excluded from the formal review process, but some manuscripts were reviewed informally. For literature on surgery and complications, the cutoff date was extended to April 1991. However, an informal surveillance was maintained through December 31, 1992, for all topics.

Clinical Practice Guideline


FIGURE 1

Outline of process used to identify and evaluate literature for Clinical Practice Guideline. The flowchart (LITERATURE REVIEW PROCESS) shows the following steps in order:

Ad hoc committee identified topics and research issues
Search strategy approved by Panel
NLM recovers 8,000 abstracts
44 reviewers identify potentially relevant articles
Teams of content and methodology reviewers evaluate 6,948 "masked" articles
Reviewers identify additional articles
303 articles provide evidence for reports to Panel
Panel writes guideline
Peer reviewers identify 7 additional articles
Evidence from additional articles incorporated in guideline


TABLE II: DATABASES INCLUDED IN NATIONAL LIBRARY OF MEDICINE COMPUTERIZED LITERATURE SEARCH

DATABASE                      START DATE
1. Catline                    1975
2. CINAHL                     1983
3. Dissertation Abstracts     1960
4. Excerpta Medica            1982
5. Health                     1975
6. Medline*                   1975
7. Psychological Abstracts    1975
8. Science Citation Index     1987

*Bimonthly updates were conducted on the Medline database from August 1990 through August 1992.

Outcome measures. Positive and negative outcomes of care from the patient's perspective were defined to assist in the literature review. They are summarized in the Guideline Report.2

Retrieval of the literature. The searches performed by the NLM identified citations (titles and abstracts) of potentially relevant articles utilizing the approach developed in collaboration with the Ad Hoc Committee and the Panel.2

Selection of potentially relevant articles. For each topic, a content review team leader and one or more methodologists were appointed. Since content relevance was of prime importance in defining the literature to be reviewed, printouts of abstracts were retrieved by the NLM and were sent to clinical experts in each topic area. These experts were asked to select articles that appeared to be potentially relevant. In this first stage, the reviewers were asked to be sensitive, as opposed to specific; that is, to avoid overlooking any potentially relevant articles. Complete copies of the selected articles were retrieved from periodicals, mailed to the content reviewers, and then reviewed by them to determine whether the articles satisfied eligibility criteria. These same articles were also reviewed by the methodologic evaluators for each team. The methodologists and team leaders then communicated with one another to agree on a common set of articles deemed relevant to the topic.

Assessment of content and methodology. A two-pronged approach to the literature analysis was devised in which both content and methodologic quality were evaluated. The final set of pertinent articles was masked so that authors, institutions, journal, and year of publication were not visible. Clinical experts first reviewed the ophthalmologic content of each article,



being guided by a set of clinical questions of interest to the panel. Methodologists then reviewed pertinent publications and came to a consensus regarding the quality and justifiable conclusions of each study using previously defined criteria. This latter step often required frequent interaction between methodologic and content experts. When there were only a few articles to be evaluated, all articles were reviewed by at least two content reviewers. To assess interrater reliability, a common set of articles was assigned to at least two content and methodologic reviewers. The literature on five topics (contrast sensitivity testing, glare testing, potential vision testing, rehabilitation following surgery, and use of YAG laser for posterior capsular opacification) underwent a complete second review after the panel identified flaws in the initial review.

The methodologic review was designed to cover the following for each article: study design, adequacy of the description of the patient population, specification of inclusion and exclusion criteria, population from which patients were drawn, adequacy of sample size, use of a comparison group, selection of cases or controls, standardization of exposure or intervention, standardization of definition and measurement of outcome, use of masking or randomization, specified duration of follow-up, handling of attrition, consideration of nonindependence of eyes for a single patient, appropriateness of analysis, and external validity of the study.

Although the methodologic reviewers assigned to each review team were asked to consider each of these issues, the methodologic reviewers assigned to different teams used different rating systems to assess quality. This resulted in a lack of uniformity in reporting for individual topics related to cataract. The majority of the methodologic assessors rated articles as good, fair, or poor. The system for one topic, setting and providers of care, rated quality on a scale of 1 to 5. For this report, we collapsed categories so that 1 was rated poor, 2 and 3 were fair, and 4 and 5 were good. For the literature on cataract surgery and complications, a complex analysis was performed and is reported separately.3 The literature for preoperative ophthalmic tests and for anesthesia was rated according to a series of specific quality measures, and the results were tabulated.

ADDITIONAL LITERATURE SOURCES

Content and methodologic reviewers and panel members were encouraged to search the bibliographies of the articles identified through the electronic searches performed by the NLM, as well as their own files, in an attempt to identify additional relevant publications. These additional articles were then reviewed according to the same criteria.


Finally, peer reviewers of the Clinical Practice Guideline and the Guideline Report furnished a number of additional articles, which were then reviewed using the same approach. Those meeting the inclusion criteria were included in the guideline report.

RESULTS

The NLM identified approximately 8,000 citations related to the topics designated by the panel. Forty-four clinical experts reviewed the titles and abstracts and selected 6,948 potentially relevant articles for further review. These were then masked and distributed to the content reviewers and subsequently to the methodologists for further evaluation. Of these, 303 met inclusion criteria and were used to provide evidence for the guideline panel. An additional seven previously unidentified articles were furnished by peer reviewers and were included in the final literature review. Among journals from which six or more articles were drawn, the Journal of Cataract and Refractive Surgery provided 87 articles, Ophthalmology 30, Ophthalmic Surgery 27, Archives of Ophthalmology 22, and the American Journal of Ophthalmology 18 (Fig 2).
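The yield of this screening process can be worked out directly from the counts in the preceding paragraph. The short Python sketch below is illustrative only: the variable names and the `percent` helper are our own, the 8,000 figure is the approximate count given in the text, and the percentages are rounded to one decimal place.

```python
# Counts as reported in the Results text; all names are illustrative.
citations_identified = 8000   # approximate number of citations retrieved by the NLM
selected_for_review = 6948    # articles selected by the 44 clinical experts
met_criteria = 303            # articles that met inclusion criteria
added_by_peer_review = 7      # articles later furnished by peer reviewers

# Articles drawn from the five journals named in the text.
top_journals = {
    "Journal of Cataract and Refractive Surgery": 87,
    "Ophthalmology": 30,
    "Ophthalmic Surgery": 27,
    "Archives of Ophthalmology": 22,
    "American Journal of Ophthalmology": 18,
}


def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


included_total = met_criteria + added_by_peer_review
overall_yield = percent(included_total, citations_identified)
full_text_yield = percent(met_criteria, selected_for_review)
top_share = percent(sum(top_journals.values()), included_total)

print(f"included articles: {included_total}")
print(f"share of identified citations included: {overall_yield}%")
print(f"share of selected articles meeting criteria: {full_text_yield}%")
print(f"share of included articles from the five named journals: {top_share}%")
```

Under these counts, the overall yield falls just below 4% of the citations initially identified, and the five named journals account for somewhat more than half of the included articles.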

FIGURE 2

Literature sources for articles included in Clinical Practice Guideline. The chart (SOURCES OF 310 INCLUDED ARTICLES) shows the number of articles by journal: Journal of Cataract & Refractive Surgery (Journal of American Intraocular Implant Society), Ophthalmology, Ophthalmic Surgery, Archives of Ophthalmology, American Journal of Ophthalmology, British Journal of Ophthalmology, Acta Ophthalmologica, Annals of Ophthalmology, (Transactions of Ophthalmological Societies of United Kingdom), and Others.* *Others: Cited 5 times or less.



The majority of the 310 articles included in the final literature review covered 3 of 14 topics: surgery and complications (100), posterior capsular opacification (77), and potential vision (40) (Table III). No literature that met inclusion criteria could be found for three of the topics (indications for cataract surgery, preoperative medical evaluation, and rehabilitation following surgery) (Table III). Two of the 16 initially selected topics (selection of intraocular implant and nonsurgical management of cataract) were not searched owing to lack of time. Of the 77 issues relating to the 14 topics that were searched, 43 yielded no publications (Table IV). There were 18 issues with one to three publications and 6 issues with four to nine. Only three issues had more than 50 relevant publications. The issues for which no cited literature was summarized are listed in Table V.

TABLE III: NUMBER OF PUBLICATIONS ON RESEARCH TOPICS AND ISSUES

RESEARCH TOPIC                      NO. OF RESEARCH   NO. OF RESEARCH ISSUES   TOTAL NO. OF
                                    ISSUES            WITH PUBLICATIONS        PUBLICATIONS ON TOPIC
Referral/access                      7                 5                          7
Setting of care/provider of care     9                 2                         18
Contrast sensitivity testing         7                 2                         10
Glare testing                        7                 2                         16
Potential vision testing             7                 4                         40
Specular microscopy                  3                 1                          8
Indications for surgery              3                 0                          0
Preoperative medical evaluation      3                 0                          0
Anesthesia                           9                 4                          9
Surgery/complications                5                 3                        100
Second eye surgery                   3                 3                         10
Postoperative care                   3                 3                         15
Rehabilitation                       6                 0                          0
Posterior capsular opacification     5                 5                         77
Total                               77                34                        310

TABLE IV: NO. OF RELEVANT REPORTS FOR 77 RESEARCH ISSUES

NO. OF REPORTS    NO. OF RESEARCH ISSUES
0                 43
1-3               18
4-9                6
10-19              4
20-49              3
50+                3
Total             77

TABLE V: ISSUES FOR WHICH NO ELIGIBLE LITERATURE WAS IDENTIFIED

Association between outcome following cataract surgery and: the identity and/or qualifications of the individual performing the initial work-up; the length of association between the patient and physician postoperatively; the location of the surgery; the availability of emergency care; the context and identity of the person providing postoperative home care.
Degree of functional impairment detected by glare testing, contrast sensitivity testing, or potential acuity testing that is not detected by history or routine physical examination alone.
Value of contrast sensitivity testing, glare testing, or potential acuity testing in detecting cases that are not suitable for surgery.
Relationship between the use of contrast sensitivity testing, glare testing, or potential acuity testing and the timing of surgery.
Relationship between the use of potential acuity, glare, and contrast sensitivity testing and the volume of surgery performed annually by the ophthalmologist ordering the test.
Relationship between preoperative contrast and potential acuity testing and the postoperative test results in patients in whom no intraoperative or postoperative complication occurred.
Association between outcome with and without surgery.
Outcome of surgery among patients whose other eye is legally blind.
Association between acute medical outcome in the postoperative period and the preoperative medical evaluation of the patient.
Association between nonmedical evaluation and preparation of the patient and the outcome.
Association between visual outcome and the following: the choice of general versus local anesthesia; the choice of local anesthesia (retrobulbar versus peribulbar); the identity and credentials of the individual monitoring the patient; the identity and credentials of the individual administering the retrobulbar injection.
Association between anesthesia morbidity and the identity and credentials of the individual administering retrobulbar or peribulbar injection.
Impact of surgery on the overall function of the patient.
The association between outcome of cataract surgery and the following: delivery of postoperative care at home; preoperative and postoperative patient and family education and counseling; financial status; cultural, religious, and ethnic factors; the identity of the person involved in providing postoperative rehabilitation.
Cost/benefit issues across the spectrum of care for the patient with cataract.

A summary of the overall quality of articles related to each of the 14 topic areas is provided in Table VI. The overwhelming majority of studies lacked a control group and consisted of case series of one type or another (Table VII). A direct comparison of the quality assessment performed by the individual groups is not possible because of the different approaches used in scoring by the various teams. However, serious deficiencies in the quality of the research were evident in all topic areas that were reviewed.

TABLE VI: LITERATURE REVIEW OF INCLUDED ARTICLES

                                    QUALITY RATING OF REPORTS
TOPIC                               GOOD    FAIR    POOR    TOTAL
Referral/access to care              2       3       2        7
Setting of care/provider of care     0       6       9       15
Contrast sensitivity testing        (See Table VIII)         10
Glare testing                       (See Table VIII)         16
Potential vision testing            (See Table VIII)         40
Specular microscopy                 (See Table VIII)          8
Indications for surgery                                       0
Preoperative medical evaluation                               0
Anesthesia                          (See Table IX)            9
Surgery/complications               Separate analysis*      100
Second eye surgery                   0       6       7       13
Postoperative care                   4       4       6       14
Rehabilitation                                                0
Posterior capsular opacification     6      46      25       77

*Powe et al.3

TABLE VII: STUDY DESIGN OF INCLUDED PUBLICATIONS

TOPIC                               RCT   NRCT   CC   COS    CS   OTHER   TOTAL
Referral/access                      0     0     0     0      4     3        7
Provider/setting                     0     3     0     6      8     1       18
Contrast                             0     0     0     0     10     0       10
Glare                                0     0     0     0     16     0       16
Potential acuity                     0     0     0     0     40     0       40
Specular microscopy                  2     1     0     0      5     0        8
Indications                          0     0     0     0      0     0        0
Preoperative medical evaluation      0     0     0     0      0     0        0
Anesthesia                           3     1     0     0      5     0        9
Surgery/complications                0     0     0     7     93     0      100
Second eye                           0     0     0     0      9     1       10
Postoperative care                   3     3     0     0      3     6       15
Rehabilitation                       0     0     0     0      0     0        0
Posterior capsular opacification     9     2     0     0     63     4       77
Total                               17    10     0    13    256    14      310

RCT, randomized controlled trial; NRCT, nonrandomized controlled trial; CC, case control; COS, cohort study; CS, case series.

The 328 potentially relevant articles accessed as a result of the NLM search of topics related to preoperative testing eventually yielded 10 on the subject of contrast sensitivity testing, 16 on glare testing, 40 on potential vision, and 8 on specular microscopy (Table VI). The quality review of these articles revealed serious widespread deficiencies (Table VIII). Few articles met the criteria pertinent to studies that assess a preoperative test. Only 14 papers used the appropriate terminology associated with assessment of a clinical test. Twelve presented such data appropriately, and in 10 the appropriate analysis (ie, data presented that allowed calculation of sensitivity and specificity) was performed. The problem of observer bias was addressed in only three papers.

TABLE VIII

                                                     CONTRAST       GLARE          POTENTIAL      SPECULAR
                                                     SENSITIVITY    TESTING        VISION         MICROSCOPY
QUALITY MEASURE                                      (10 ARTICLES)  (16 ARTICLES)  (40 ARTICLES)  (8 ARTICLES)
Clear description of study design                     7              5             32              2
Presentation of a "gold standard"                     2             15             37              7
Masking interpreter with regard to clinical
  history or other test results                       0              1              2              0
Appropriate use of terms: sensitivity,
  specificity, positive or negative predictive
  value, false positive, false negative, accuracy     1              3             10              0
Appropriate presentation of data described
  by each term                                        2              3              7              0
Appropriate calculation of values of each term        2              2              6              0
Random order of diagnostic tests comparing
  test with other methods                             0              2              3              0
Measurement of interobserver variability              1              2              1              2
Appropriate statistical analysis: data only,
  or partial statistical analysis                     5              8             21              7

The comparatively large number of articles related to YAG capsulotomy also allowed a more comprehensive evaluation of quality. Six of these 77 publications (from 1983 through 1992) were considered to be of good quality, 46 fair, and 25 poor. The following deficiencies in this section of literature were noted in 50% or more of the publications:

* Biased outcome assessment
* Lack of a comparison group
* Inappropriate handling of patient attrition
* Lack of appropriate statistical analysis

Other problems included inadequate sample size, lack of a denominator, use of noncomparable comparison groups, failure to define outcome measurements, and failure to provide indications for Nd:YAG capsulotomy. An analysis of these 77 publications revealed no discernible improvement in quality in recent years (Fig 3). The quality of publications across journals was also similar (Fig 4). A brief overview of the quality of the nine articles related to anesthesia is provided in Table IX.
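The terms used as quality measures for the preoperative tests (sensitivity, specificity, positive and negative predictive value, accuracy) all derive from a two-by-two comparison of a test result against a "gold standard." The following Python sketch shows these standard definitions. It is not drawn from any of the reviewed articles: the `TwoByTwo` class is a hypothetical helper, and the counts in the example are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from comparing a test result with a gold standard.

    Assumes every denominator below is nonzero.
    """
    true_pos: int    # test positive, gold standard positive
    false_pos: int   # test positive, gold standard negative
    false_neg: int   # test negative, gold standard positive
    true_neg: int    # test negative, gold standard negative

    def sensitivity(self) -> float:
        # Proportion of gold-standard positives that the test detects.
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        # Proportion of gold-standard negatives that the test rules out.
        return self.true_neg / (self.true_neg + self.false_pos)

    def positive_predictive_value(self) -> float:
        # Proportion of positive test results that are truly positive.
        return self.true_pos / (self.true_pos + self.false_pos)

    def negative_predictive_value(self) -> float:
        # Proportion of negative test results that are truly negative.
        return self.true_neg / (self.true_neg + self.false_neg)

    def accuracy(self) -> float:
        # Proportion of all results that agree with the gold standard.
        total = self.true_pos + self.false_pos + self.false_neg + self.true_neg
        return (self.true_pos + self.true_neg) / total


# Hypothetical counts, invented for illustration only.
example = TwoByTwo(true_pos=45, false_pos=15, false_neg=5, true_neg=35)
print("sensitivity:", example.sensitivity())
print("specificity:", example.specificity())
print("positive predictive value:", example.positive_predictive_value())
print("negative predictive value:", example.negative_predictive_value())
print("accuracy:", example.accuracy())
```

An article that reports all four cell counts allows a reader to derive every one of these measures, which is the kind of reconstruction the quality criteria in Table VIII are concerned with.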


FIGURE 3

Quality of publications on posterior capsular opacification by year of publication. (Plot of the 77 publications: quality rating, good, fair, or poor, against year of publication, 1975-82 through 1992.)

FIGURE 4

Quality of publications on posterior capsular opacification by journal. (Plot of the 77 publications: quality score, good, fair, or poor, against journal of publication.)

KEY:
JCRS = Journal of Cataract and Refractive Surgery (Journal of American Intraocular Implant Society)
OPHTHAL = Ophthalmology
ARCH OPHTHAL = Archives of Ophthalmology
OPHTHAL SURG = Ophthalmic Surgery
AJO = American Journal of Ophthalmology
EYE (TOSUK) = Transactions of the Ophthalmological Societies of the United Kingdom
ANN OPHTHAL = Annals of Ophthalmology
TAOS = Transactions of the American Ophthalmological Society


TABLE IX: QUALITY ASSESSMENT OF ARTICLES ON ANESTHESIA

QUALITY MEASURES (9 ARTICLES)                      NO. OF ARTICLES
Randomized controlled trial                         3
Inclusion/exclusion criteria                        0
Appropriate baseline data                           6
Appropriate methodology for patient attrition       6
Appropriate comparison groups                       5
Appropriate sample size                             5
Masking of observers                                1
Appropriate outcome assessment to avoid bias        1
Appropriate statistical analyses                    3

DISCUSSION

The task faced by the Cataract Panel as it prepared to develop the practice guideline was formidable. Congress, while mandating a rigorous literature review as a prerequisite, had imposed an extremely short period of time for completion of the review. In May 1990, the AHCPR called a meeting of the newly appointed panel chairs and methodologists to discuss the general literature review procedures. It was left to the panels to develop their own specific approach.

Of the 310 articles ultimately included as evidence for the guideline, 303 were identified by the computerized literature search. Despite intense public interest in the guideline over the 2 years in which it was in preparation, peer reviewers and others were able to provide only seven additional articles not previously identified. None of these materially altered the guideline conclusions.

The division of the literature review task among teams of methodologists and content experts allowed a more meaningful critical appraisal to be performed. By design, communication between methodologists and content experts was a requirement so that the issue of relevance was at the forefront. However, this increased the amount of time and work involved in the literature review. As we noted, the panel, after receiving preliminary reports on a number of topics, including the preoperative tests, indications for surgery, postoperative rehabilitation, and posterior capsular opacification, asked for a complete reassessment of these sections of the literature. For some topics, such as rehabilitation and indications for surgery, this necessitated both an expanded search by the NLM and a re-review of the previously identified literature to ensure comparability of the articles reviewed.

Given the large number of articles (approximately 8,000) initially accessed by title and abstract, it is important to note that less than 4% met the eligibility requirements for inclusion in the literature review. From a literature base of more than 6,000 publications on cataract surgery and complications, only 100 relevant articles met eligibility criteria.3 The low specificity of a literature identification approach that deliberately "threw a broad net" to ensure that no relevant articles were missed proved to be very costly in terms of time and effort consumed by the review. We believe that in the future, more targeted, efficient, and specific approaches for identification of pertinent literature should be employed. Perhaps, as a starting point, panels of clinical experts should identify the most important published articles on a particular topic.

As the review progressed, it became apparent that a majority of issues relevant to a guideline on the management of functional impairment due to cataract, such as indications for surgery and postoperative rehabilitation of the patient, had not been systematically studied. Another sizeable number had received only brief attention, as indicated by the fact that they were addressed in five or fewer publications. Even when there was adequate literature to consider for a given issue, it was often necessary for the reviewers to reconstruct the data from tables or other sources, since the authors did not directly address the question posed by the panel.2 For example, many authors of studies evaluating preoperative tests did not examine sensitivity or specificity of the test. Data could sometimes be extracted, however, to allow the methodologists to perform these calculations.

The criteria for inclusion in the review focused, by necessity, on the effects of various interventions on the outcome for patients with cataract. This approach required certain methodologic designs.1 Disappointingly, our review of the cataract literature unearthed little evidence of understanding of the requirements of proper experimental design to answer questions related to evaluation of interventions. Even among the literature included in the final analysis, the preponderance of studies lacked such basic elements as a comparison or control group, efforts to minimize bias, rigorous attempts to define patient selection criteria, and appropriate measures of outcome.

The literature on the use of the YAG laser for posterior capsular opacification, because of its relatively large volume and homogeneity, provided an opportunity to examine the quality of the studies in somewhat greater depth. The analysis suggests that the variable quality of these publications is not related to the journal of publication, nor is there evidence of any improvement in more recent years. Powe and associates,3 in a separate analysis of the 100 papers related to cataract surgery and complications, also found no improvement in study quality in recent years and did not find an association between study quality and journal of publication.

Nowhere is the lack of understanding of the needs of evidence-based research more apparent than among the publications dealing with preoperative ophthalmic testing. This relatively voluminous literature was narrowed down to 74 relevant articles, of which 40 dealt with potential vision testing. The quality of the preoperative testing literature was wanting from the perspective of providing evidence that would justify the acceptance of these tests as clinically appropriate. Few studies were available that evaluate sensitivity, specificity, reliability, or predictive value. Instead, articles tended to examine one aspect of a particular test (ie, correlation of preoperative test results and preoperative visual acuity) without explicitly considering the context in which the test is used, that is, providing information useful in improving outcome for the patient.1

The state of current ophthalmic literature, as revealed in this review, raises a number of important questions, particularly as the movement toward evidence-based medicine advances.4 Eight peer-reviewed ophthalmologic periodicals, all with well-respected scientific editorial boards, are substantially represented among the literature included in this review. Each journal published indifferent or poor-quality literature. While this may indicate a lack of appreciation of basic scientific method among authors, editorial boards, and reviewers, it may instead indicate that editorial policy, driven by the need to conserve journal space, has discouraged an adequate description of experimental design in the final version of articles that are ultimately published. This analysis suggests that an educational effort is needed to acquaint editors, editorial boards, and their reviewers with the basic requirements for sound study design if it is not already being done. Serious consideration should also be given to the idea of including epidemiologists, biostatisticians, and other methodologists as part of the review process.

The approach to the literature review by the cataract guideline panel was apparently successful in that it met its goal of providing evidence for development of the guideline. However, there were many shortcomings. The task took 2 1/2 years to complete and was highly labor-intensive. Part of the reason for this, as already noted, is the state of the literature and the difficulties encountered in sorting out its many ambiguities. On the other hand, the lack of adequate experience with critical appraisal of the literature at the management level within the Agency, combined with the unrealistic deadline imposed by Congress, necessitated a "learn-as-you-go" and "revise and regroup" approach. Insufficient resources were available to perform and efficiently manage the literature review, so that supervision and coordination of the groups, once they began their task, was very difficult. Preliminary training of reviewers and team leaders was not available. For these reasons, as is evident in this analysis, individual review teams followed different approaches in their assessment of the literature. Because of an incomplete understanding of the overall goals, and perhaps because some reviewers and team leaders were not well versed in this type of critical appraisal of the literature, it was necessary for some substantial parts of the review to be repeated.

The work accomplished in conjunction with development of the cataract guideline indicates that, despite these difficulties, a thorough analysis of the literature is feasible and desirable for any major topic involving the care of patients. It is likely that other challenging topics in ophthalmology will be subject to serious review in the near future. It is sobering to recognize that although the ophthalmic literature is voluminous on the subject of cataract, there is a lack of well-designed studies that explicitly address important management issues. We anticipate similar problems in other topics related to eye care.

Identification and retrieval of the relevant literature is an expensive and time-consuming problem that is made more difficult by inaccurate titling and uninformative abstracts. We, therefore, recommend that editors and editorial boards of peer-reviewed journals place increased emphasis on the need for accurate descriptive titling of manuscripts and the use of succinct structured abstracts containing precise descriptions of the methodology and the results of the research.

SUMMARY

The purpose of this paper was to evaluate the quality and scope of the published literature on functional impairment due to cataract in adults as reviewed for the Agency for Health Care Policy and Research Clinical Practice Guideline. We examined the methods of literature retrieval and analysis used in the course of development of literature-based recommendations for the guideline panel. To collect data, we reviewed the process of literature acquisition and identification and the quality assessments made by reviewers of 14 individual topics composed of 77 issues related to the guideline. We collated this information to provide an assessment of the quality and scope of the relevant literature.

Fewer than 4% (310) of the approximately 8,000 articles initially identified as potentially relevant to the guideline were ultimately used. The majority covered three topics (surgery and complications, 100; Nd:YAG capsulotomy, 77; and potential vision testing, 40). Three other topics (indications for surgery, preoperative medical evaluation, and rehabilitation) were devoid of articles meeting inclusion criteria. For 43 issues, there was no identifiable relevant literature. With few exceptions, the quality of the literature was rated fair to poor owing to major flaws in experimental design. Case series (256 reports) of one type or another accounted for the majority of the included literature. There were 17 randomized controlled trials. This review revealed a sparse and generally low-quality literature relevant to the management of functional impairment due to cataract, despite a relatively large database in reputable peer-reviewed journals.

ACKNOWLEDGMENTS

The authors wish to acknowledge the assistance of the following in the analysis of the literature review: Ione Auston, MLS; William Bourne, MD; Gordon DeFriese, PhD; John V. Donlon, MD; Donald Doughman, MD; Dagmar Friedman, MPH, LICSW; Robert Hayward, MD; Rajiv Luthra, MD, MPH; Jonathan Javitt, MD, MPH; Ernest Mazzaferri, MD; Martha Gerrity, MD, MPH; Stephen Gieser, MD, MPH; Catherine Glynn-Milley, RN, CRNO; Harry Knoph, MD; David Musch, PhD; David F. Partlett, LLB; Neil R. Powe, MD, MPH, MBA; Dinah Reitman, MPS; Henry Sacks, MD; Oliver D. Schein, MD, MPH; Phoebe Sharkey, PhD; Eva N. Skinner, RN; Walter Stark, MD; Alan Sugar, MD; Arlo Terry, MD; James Tielsch, PhD; Linda A. Vader, BS, RN; James Weber, MD; Von Best Whitaker, RN, PhD; and Ira G. Wong, MD.

REFERENCES

1. Cataract Management Guideline Panel: Cataract in Adults: Management of Functional Impairment. Clinical Practice Guideline. Number 4, Rockville, MD. US Department of Health and Human Services, Public Health Service, Agency for Health Care Policy and Research. AHCPR Pub. No. 93-0542. Feb, 1993.
2. Cataract Management Guideline Panel: Management of functional impairment: Cataract in Adults. Guideline Report. Ophthalmology (Suppl) 1993; 100:1S-350S.
3. Powe NR, Tielsch JM, Schein OD, et al: Quality assessment of studies of the effectiveness and safety of cataract extraction with intraocular lens implantation. In press.
4. Evidence-Based Medicine Working Group: Evidence-based medicine: A new approach to teaching the practice of medicine. JAMA 1992; 268(17):2420-2425.


DISCUSSION

DR BRUCE E. SPIVEY. The science of cataract evaluation and therapy has been advanced significantly by the AHCPR guideline on "Cataract in Adults: Management of Functional Impairment." The guideline reflects ophthalmologic care and the way most ophthalmologists manage patients with cataracts in a quality context. An unexpected outcome of the AHCPR guideline, and one with serious implications that cannot be ignored, is its revelation about the state of the science in cataract evaluation and care. As described by Dr O'Day and colleagues, the scientific literature on cataracts, while enormous in numbers, is equally vast in deficiencies. The literature thus yielded far less scientifically valid information on outcomes than would be expected given the numbers of studies published in peer-reviewed journals. An immediate, visceral reaction to the authors' exposition of the scientific literature's dearth of useful information is one of disbelief. The AHCPR Panel's literature review strategy must have been seriously flawed in some major way. The questions posed were wrong. The right articles were not included, or too stringent criteria for evaluating the quality of the articles were applied. These are legitimate concerns, and they require critical examination of the methods used in the literature review. Both the AHCPR guideline and this paper attest to the thoroughness of the literature review and the soundness of its approach. The 14 topics and their 77 related issues focused the guideline and literature review on the outcomes of care for patients with cataracts, an appropriate focus given the purpose of clinical practice guidelines and the Panel's charge. The inclusion/exclusion criteria appear reasonable, and the methodologic evaluation of the quality of articles was conducted according to guidelines that reflect basic requirements of experimental design.
The overwhelming predominance of studies based on a case series design (256 of 310) illustrates clearly the deficiencies in the literature. To illustrate the inadequacies more specifically, the authors analyzed the quality of the literature on preoperative tests (contrast sensitivity, glare testing, potential vision, and specular microscopy) according to nine quality measures (listed in Table VI). For example, specular microscopy failed to meet five of the nine quality measures entirely. The literature on the other tests fared only slightly better. The authors do identify several weaknesses of the literature review: the different approaches and rating systems used by the various literature review teams; the need to redo the analyses of some of the major topic areas due to poor quality; and the 2½ years it took to complete the review. While these shortcomings are important to consider, they should not detract from this paper's main message: There are serious deficiencies in the scientific literature on cataract evaluation and therapy stemming from a lack of understanding about what constitutes proper study design. This is a message we must hear, given today's environment. Expert opinion and "clinical impression" will no longer be sufficient to support our convictions about the effectiveness of a given test or procedure. We will need to have scientific proof that any given intervention will improve clinical outcome. This requirement for evidence-based medicine and the clear documentation of "quality" will only intensify in the future. These realities make it painfully clear that the profession, the research community, and the editors, editorial boards, and reviewers must join together to consistently apply the scientific method to clinical research and outcomes evaluation in all areas of medicine. There must be a renewed commitment to scientific rigor and a better understanding of proper study design. Dr O'Day is due a tremendous expression of gratitude from the ethical ophthalmologic community, all of medicine, and the citizens of this country. He led a heterogeneous group through a maze of information and misinformation to a consensus report. He then was vilified by many whose loose ethics and income were threatened by this study. Throughout, Dr O'Day's dignity and commitment to the best interests of our patients did not waver. We cannot express our appreciation adequately, but he holds my highest respect.

DR DENIS O'DAY. Dr Spivey, I very much appreciate your comments. For me, and for most of us on the panel, the process of developing this guideline was a revelation. I hope the analysis we performed has been helpful. An important finding was the tendency in the published literature to compress scientific reports because of the cost of space. To achieve this, it appears to be widespread editorial policy to reduce the amount of space devoted to experimental methods and the reporting of results. As a result, when we attempted to analyze the literature, we often found that we were at a loss to determine exactly what investigators had done. Editorial policy, reacting to concerns about the cost of space, is partly responsible for this, but it also appears that many investigators lack an understanding of what constitutes good scientific method.
We spent a considerable amount of time endeavoring to reconstruct data in the various papers and in trying to extract information from the data that might be useful to our analyses. These analyses sometimes went beyond the intent of the original authors. Interestingly, we were not able to perform meta-analyses because of the lack of fundamental experimental information in most of the papers. For an operation so prevalent in our society, it is distressing that we continue to lack a basic understanding of factors that affect the outcome. One positive result of our study is the identification of a large number of research issues which we hope will stimulate future research. It is important to note that the cataract literature is the first in ophthalmology to come under this microscope. However, it seems obvious that the published literature on glaucoma, diabetic retinopathy, and other more prevalent conditions will soon suffer a similar fate. We have to be concerned that this body of literature will also be found wanting in quality. There is, perhaps, some consolation in the fact that literature analyses in other areas of medicine have come to similar conclusions. Thank you.