Anatomic Pathology / CLINICAL HISTORY AND DIAGNOSTIC ACCURACY
Effect of Clinical History on Diagnostic Accuracy in the Cytologic Interpretation of Bronchial Brush Specimens

Stephen S. Raab, MD,1-3 Thaira Oweity, MD,4 Jonathan H. Hughes, MD,4 Diva R. Salomao, MD,4 Carolyn M. Kelley, MD,4 Christopher M. Flynn, MD,4 Joyce A. D’Antonio, PhD,1,2 and Michael B. Cohen, MD4

Key Words: Cytology; Probability; Diagnosis; Clinical history; Cancer
Abstract
There has been little study of the effect of clinical history on pathologic diagnostic accuracy. Five pathologists retrospectively examined 97 bronchial brush specimens with and without clinical historic information. Forty-eight patients had a biopsy-proven malignant lesion, and 49 had a benign lesion. Diagnostic accuracy with and without history for each pathologist was determined with likelihood ratios and receiver operating characteristic curves. The overall diagnostic accuracy with and without history was 0.84 and 0.76, respectively. The average negative predictive value of a benign diagnosis decreased from 89.2% (with history) to 74.0% (without history). Overall, the cytopathologists were more reluctant to make a definitive malignant diagnosis without history than with history. The average positive predictive value of a malignant diagnosis with and without history was almost identical. The absence of history leads to lower diagnostic accuracy in the cytologic interpretation of bronchial brush specimens partly because pathologists underdiagnose malignant lesions.
Am J Clin Pathol 2000;114:78-83
Pathologists, like radiologists,1-5 frequently are frustrated by the absence of patient history on specimen requisition forms. At some institutions, cytologists who interpret respiratory cytology are lucky to even get the information “r/o dz [sic].” There has been little study of how this history affects pathologic diagnostic accuracy.6,7 The importance of history partly depends on how clinicians use the pathologic diagnosis.8-10 Some practitioners argue that because the pathologic diagnosis is a laboratory test, it should be used in a Bayesian manner to calculate the posttest probability of disease from the pretest probability of disease.8 Even nondefinitive diagnoses, such as atypical or suspicious, may be converted into likelihood ratios that may be used to calculate disease probabilities.9,11,12 These practitioners argue that clinical history should not be provided because it produces a bias favoring the pretest probability of disease.8,9 For example, if the clinician thinks that there is a high pretest probability that a lung lesion is cancer, informing the pathologist of this clinical information will bias the pathologist toward making a malignant diagnosis. It is argued that pathologists should just “read the slide,” express diagnoses probabilistically, and leave it to the clinicians to interpret the diagnosis. Others argue that most clinicians and pathologists do not use the pathologic diagnosis in a Bayesian manner.10 To some clinicians evaluating a patient with a lung mass, a cytologic diagnosis other than benign or malignant may not be meaningful. Withholding clinical information may lead to less definitive diagnoses, which in turn is detrimental to patient care. Many experienced pathologists know of anecdotal instances when the clinical history was a critical factor in changing a diagnostic interpretation from benign to malignant or vice versa (eg, marked atypia in a brush specimen that would be diagnosed as reactive in an 18-year-old patient with a diffuse pneumonitic process and malignant in a 69-year-old patient with a large central lung mass). In the present study, the effect of the presence or absence of clinical history on the diagnostic accuracy of bronchial brush specimen interpretation was determined.
Materials and Methods Patients Ninety-seven bronchial brush specimens were selected retrospectively from the 1991-1993 University of Iowa Hospitals and Clinics cytology files. These cases had been used previously in a study examining interobserver diagnostic variability.10 Each of the specimens consisted of 2 slides. All cases had histologic follow-up and between 6 and 18 months (mean, 13 months) of clinical follow-up. Histologic follow-up consisted of biopsy tissue. Clinical follow-up was obtained by review of the medical charts. Forty-nine cases had benign follow-up results, and 48 had malignant follow-up results. The original cytologic diagnoses had been placed in 1 of 4 categories: benign (34 cases), atypical (22), suspicious (14), and malignant (27). The malignant cases consisted of 39 non–small cell carcinomas, 7 small cell carcinomas, 1 carcinoid, and 1 malignant lymphoma. The benign cases exhibited a spectrum of changes from no diagnostic abnormality to marked reactive changes.
Slide Set Preparation
Each case was relabeled randomly with a number from 1 to 97. The slides were screened by an experienced cytotechnologist who was unaware of the previous cytologic diagnoses and clinical histories. The cytotechnologist placed 5 dots on each slide, marking the most worrisome areas. In the benign cases, these areas often exhibited reactive or degenerative changes. The cases were divided into 3 groups and twice circulated among the study participants. On the first circulation, no clinical history was provided. On the second circulation, clinical history was provided. The second circulation was performed approximately 2 to 3 months after the first circulation. Clinical histories were obtained by abstracting clinical information from the original cytology reports and the current medical records. Included in the clinical history for each case was patient sex, age, clinical findings (if any), and clinical suspicion of disease (for example, “56-year-old man with a 3-cm lung mass, suspicious of carcinoma”). When reviewing the cases, the observers were instructed to first concentrate on the dotted areas and then, if they desired, to look elsewhere on the smears. Each observer scored each case in 1 of the following categories: definitely benign, probably benign, possibly malignant, probably malignant, or definitely malignant.10,13 These categories spanned the spectrum of certainty from benign to malignant. No further subclassification of each case was required. The observers were provided standardized answer forms with instructions on how to complete them. The observers did not consult each other.

Pathologists
The observers had different levels of experience in interpreting bronchial brush specimens. Observers 3 and 4 were the most experienced and were staff cytopathologists. Observer 2 was a cytopathology fellow who examined the slides at the end of the fellowship year. Observers 1 and 5 were residents on their 3-month block of cytology training.

Statistical Analysis
The likelihood ratio for a malignant diagnosis was calculated for each observer and for each diagnostic category using previously described methods.10-12,14 The likelihood ratio of a diagnostic category is the quotient of the proportion of individuals with disease who have a particular diagnosis to the proportion of individuals without disease who have that particular diagnosis. Given the pretest probability of disease, the likelihood ratio can be used to calculate the posttest probability of disease. The likelihood ratio is related to the odds of disease by the following equation:

Posttest Odds = Pretest Odds × Likelihood Ratio
The odds of disease and probability of disease are related by the following equations:

Odds = Probability / (1 – Probability)
Probability = Odds / (1 + Odds)

Likelihood ratios may range from 0 to infinity. A likelihood ratio less than 1.0 lowers the posttest probability of disease below the pretest probability; a likelihood ratio equal to 1.0 leaves it unchanged; and a likelihood ratio greater than 1.0 raises it above the pretest probability. The receiver operating characteristic curves were constructed using previously described methods.10,14-17 The software package used was Statistical Package for the Social Sciences (SPSS, Chicago, IL). A receiver operating characteristic curve is a plot fitted to pairs of true-positive rates (sensitivity) and false-positive rates (100% – specificity) for a given observer or group of observers as the criteria for making a diagnosis are varied. Each criterion gives rise to 1 point on the curve. Sensitivity is the proportion of patients with disease who have a positive test result; specificity is the proportion of patients without disease who have a negative test result. The diagnostic accuracy of each observer was determined by measuring the area under the receiver operating characteristic curve. The area under the curve generally ranges from 0.5 to 1.0. An area of 0.5 corresponds to the area under a straight 45-degree line (random guessing), whereas an area of 1.0 corresponds to the area under the curve of an optimal observer. The standard error and 95% confidence interval were calculated for each area measurement. For each observer, one receiver operating characteristic curve was calculated for the diagnoses with history and another for the diagnoses without history. By pooling the pathologists’ diagnoses with and without history, curves were calculated to determine the mean accuracy of diagnoses with and without history.
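The likelihood-ratio and posttest-probability calculations described above can be sketched in a few lines of Python. This is an illustrative sketch only: the diagnostic counts below are invented for the example, while the likelihood ratio of 8.54 is the study's mean value for the malignant category without history.

```python
def likelihood_ratio(n_dx_diseased, n_diseased, n_dx_nondiseased, n_nondiseased):
    """LR of a diagnostic category: P(diagnosis | disease) / P(diagnosis | no disease)."""
    return (n_dx_diseased / n_diseased) / (n_dx_nondiseased / n_nondiseased)

def posttest_probability(pretest_prob, lr):
    """Bayesian update: posttest odds = pretest odds x likelihood ratio."""
    pretest_odds = pretest_prob / (1 - pretest_prob)
    posttest_odds = pretest_odds * lr
    return posttest_odds / (1 + posttest_odds)

# Hypothetical counts (not from the study): 40 of 48 malignant lesions and
# 2 of 49 benign lesions received a "malignant" cytologic diagnosis.
lr_malignant = likelihood_ratio(40, 48, 2, 49)   # ~20.4

# With a 30% pretest probability of cancer, a diagnosis carrying the
# study's mean "malignant" likelihood ratio of 8.54 yields:
p = posttest_probability(0.30, 8.54)             # ~0.79
```

A likelihood ratio near 1.0 leaves the posttest probability essentially at the pretest value, which is why the nondefinitive categories in Tables 2 and 3 (ratios between roughly 0.5 and 2) shift disease probability only modestly.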
Results ❚Table 1❚ shows the total number of diagnoses placed
in each diagnostic category for the 5 observers. For all observers, there was a shift in the use of diagnostic categories depending on the presence of clinical history. If clinical history was provided, there was an increase in the number of malignant diagnoses. The use of the benign category varied by observer, with only some observers using this category more frequently if history was not present. ❚Table 2❚ shows the likelihood ratios for the observers for the 5 diagnostic categories when clinical history was not provided, and ❚Table 3❚ shows the likelihood ratios
when it was provided. For every observer, the likelihood ratio for the benign category was lower with clinical history than without it; that is, a benign diagnosis was more likely to indicate a truly benign lesion when clinical history was provided. For the other diagnostic categories, the effect of clinical history varied by observer. For example, for the malignant category, the likelihood ratio increased with history for 2 observers and decreased for 3, although the mean likelihood ratio for the malignant category increased with clinical history. ❚Table 4❚ shows the positive predictive value of a malignant diagnosis, the negative predictive value of a benign diagnosis, and the percentage of definitive diagnoses (ie, benign or malignant) for each observer. For each observer, the positive predictive value of a malignant diagnosis was similar whether or not history was provided. For each observer, the negative predictive value was always higher if clinical history was provided. This means that when history was provided, observers were more accurate with the benign diagnostic category and were able to shift malignant cases out of this category. Providing clinical history had less of an effect on observers' use of the malignant category: if an observer thought that a case was malignant without history, that observer also thought the diagnosis was malignant with history (and vice versa). There was variability in the use of definitive diagnoses when history was provided. Observer 5 was much more definitive if history was provided, whereas observer 3 was more definitive if history was not provided. ❚Table 5❚ shows the diagnostic accuracy (area under the receiver operating characteristic curve) for the pathologists making diagnoses with and without history.
The mean diagnostic accuracy with and without history was calculated from the pooled data. ❚Figure 1❚ shows the receiver operating characteristic curves for the pooled diagnoses
❚Table 1❚ Total Number of Diagnoses Placed in Each Diagnostic Category by 5 Observers

Observer  Clinical History Known  Benign  Probably Benign  Possibly Malignant  Probably Malignant  Malignant
1         Yes                     32      13               12                  13                  27
1         No                      36      23               8                   10                  20
2         Yes                     48      10               2                   4                   33
2         No                      51      13               9                   8                   16
3         Yes                     34      15               7                   6                   35
3         No                      41      11               3                   7                   34
4         Yes                     50      6                14                  9                   17
4         No                      49      17               8                   14                  9
5         Yes                     31      13               13                  15                  25
5         No                      23      30               21                  18                  5
❚Table 2❚ Likelihood Ratios of Cytologic Diagnoses by Five Observers Without Knowledge of Clinical History

Observer  Benign  Probably Benign  Possibly Malignant  Probably Malignant  Malignant
1         0.58    0.54             0.15                4.08                9.19
2         0.39    1.63             2.04                3.06                7.15
3         0.25    0.60             0.52                1.39                7.82
4         0.28    1.46             ∞                   2.55                8.17
5         0.28    0.68             1.12                5.10                ∞
Mean      0.36    0.83             1.26                3.14                8.54
❚Table 3❚ Likelihood Ratios of Cytologic Diagnoses by Five Observers With Knowledge of Clinical History

Observer  Benign  Probably Benign  Possibly Malignant  Probably Malignant  Malignant
1         0.19    1.19             1.02                1.19                5.87
2         0.27    0.44             ∞                   3.06                12.9
3         0.10    1.17             0                   5.10                10.9
4         0.25    1.0              6.0                 8.0                 7.5
5         0.15    0.64             0.6                 2.81                11.74
Mean      0.20    0.85             1.11                2.66                8.93
❚Table 4❚ Accuracy of Five Observers With and Without Knowledge of Clinical History

Observer  Clinical History  Positive Predictive Value (%)  Negative Predictive Value (%)  Definitive Diagnoses (%)
1         Yes               85                             84                             61
1         No                90                             64                             58
2         Yes               91                             79                             84
2         No                88                             73                             69
3         Yes               91                             91                             71
3         No                88                             80                             78
4         Yes               88                             80                             60
4         No                89                             76                             60
5         Yes               92                             87                             57
5         No                100                            78                             29
Mean      Yes               90                             89                             69
Mean      No                84                             74                             59
❚Table 5❚ Diagnostic Accuracy of Five Pathologists With and Without Knowledge of History

             With History                    Without History
Pathologist  Accuracy  SE    95% CI          Accuracy  SE    95% CI
1            0.79      0.05  0.71-0.88       0.70      0.05  0.60-0.81
2            0.84      0.04  0.75-0.92       0.76      0.05  0.66-0.86
3            0.88      0.04  0.81-0.95       0.83      0.04  0.74-0.92
4            0.82      0.05  0.73-0.91       0.77      0.05  0.68-0.87
5            0.85      0.04  0.77-0.93       0.76      0.05  0.66-0.85
Mean         0.84      0.02  0.80-0.87       0.76      0.02  0.72-0.81

Accuracy, area under receiver operating characteristic curve; CI, confidence interval (lower bound-upper bound).
with and without history. The diagnostic accuracy of all pathologists increased if history was provided. For the pooled data across all pathologists, there was a statistically significant difference (P < .05, 1-tailed t test) between the accuracy of the diagnoses with history and without history.
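The area under an empirical ROC curve built from ordinal certainty ratings like those in this study can be computed directly through its equivalence to the normalized Mann-Whitney statistic. The sketch below is an illustration with invented ratings (1 = definitely benign through 5 = definitely malignant), not the study's SPSS-fitted curves:

```python
def empirical_auc(scores_diseased, scores_nondiseased):
    """Area under the empirical ROC curve.

    Equals P(rating_diseased > rating_nondiseased) + 0.5 * P(tie)
    over all diseased/nondiseased case pairs (Mann-Whitney relationship).
    """
    wins = ties = 0
    for d in scores_diseased:
        for b in scores_nondiseased:
            if d > b:
                wins += 1
            elif d == b:
                ties += 1
    n_pairs = len(scores_diseased) * len(scores_nondiseased)
    return (wins + 0.5 * ties) / n_pairs

# Invented 1-5 certainty ratings for five malignant and five benign cases.
auc = empirical_auc([5, 5, 4, 3, 2], [1, 1, 2, 3, 2])   # 0.90
```

An observer who rated every malignant case above every benign case would score 1.0; an observer whose ratings carried no information would score about 0.5, matching the 45-degree reference line described in the Methods.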
Discussion
In the interpretation of bronchial brush specimens, pathologists exhibited greater diagnostic accuracy if history was provided. The increased accuracy resulted from the
❚Figure 1❚ Receiver operating characteristic curves (sensitivity vs 1 – specificity) of mean diagnoses when history was provided (diamonds) and when history was not provided (squares).
ability to make more definitive malignant diagnoses and to move cases with malignant follow-up results out of the benign category and into less definitive categories. If history was not provided, there was more diagnostic hedging, with fewer malignant diagnoses and more false-negative diagnoses. Absence of history did not seem to alter the false-positive rate. In clinical medicine, the biasing effect of history is well recognized.5,18,19 Practitioners in some specialties, such as radiology,5 would rather trade off bias for increased diagnostic accuracy. If clinicians acted as Bayesian decision makers, creating this bias would not be beneficial,8 but most clinicians, at best, act in only a quasi-Bayesian sense. This is particularly true regarding the use of pathologic diagnoses; nondefinitive diagnoses often are treated as equivocal,8,9 rather than used probabilistically to calculate the posttest likelihood of disease. For example, in some organs, even though diagnoses such as suspicious and malignant have almost identical likelihood ratios, these diagnoses evoke completely different clinical responses. Thus, if clinicians want to receive the most accurate pathologic diagnosis possible, they should provide clinical history.
In the present study, a probabilistic diagnostic scale was used,10 which is not how cytologic diagnoses are reported in the real world. In practice, the diagnoses of atypical and suspicious replace the diagnoses of probably benign, possibly malignant, and probably malignant.20 In contrast with not providing any history, as was done in the present study, pathologists usually are given at least a minimal history (eg, age and sex) and may be able to garner additional information from the hospital information system or directly from the clinician. Although pathologists rarely are blinded completely, obtaining additional information often is pathologist- and institution-dependent. For many reference laboratories, obtaining additional information is extremely difficult. For the majority of pathologists, the number of benign diagnoses that were made was greater than the number of malignant diagnoses, regardless of whether history was provided. Pathologists may be more reluctant to make definitive malignant diagnoses because of the potential clinical effect that malignant diagnoses may have. Many pathologists think that, given a malignant diagnosis, clinicians may disregard clinical information supportive of a benign lesion and act aggressively. On the other hand, many pathologists think that if a nonmalignant diagnosis is made and the clinical suspicion of malignant disease is high, additional testing will be performed and the patient will not undergo a procedure with higher risks of morbidity and mortality. 
Regardless of the fallacy of this thinking, false-positive diagnoses are more likely considered misinterpretations that may lead to inappropriate treatment, whereas false-negative diagnoses are more likely considered sampling errors rather than interpretive errors.21 Whether the decrease in diagnostic accuracy secondary to the absence of clinical history directly correlates with serious diagnostic errors is uncertain.22,23 For each pathologist, depending on the diagnostic category, the likelihood ratio exhibited only a slight increase or decrease if history was or was not provided. Thus, in most cases, there were only slight shifts in diagnostic probability if history was provided (eg, a diagnosis was changed from probably benign to benign or from probably malignant to malignant). However, in some cases, there were more profound diagnostic shifts and even a few cases in which a benign diagnosis was changed to malignant and vice versa. If these specific cases had been examined in a true clinical setting, the pathologist might have attempted to obtain additional information, thereby decreasing the possibility of error. There has been little study of how pathologists operate in obtaining and using history6,7 and of how clinicians actually use nondefinitive diagnoses. In the present study, the more definitive diagnoses were made by the most experienced pathologist (no. 3). This pathologist was particularly adept at making malignant
diagnoses when no history was provided, as opposed to the other pathologists, who became more diagnostically cautious. This may indicate that clinicians should take particular care to communicate clinical information to more junior pathologists to decrease potential errors or increase diagnostic certainty. There also seems to be a large individual component to how different diagnostic categories are used.10 For example, even with history, pathologist 4 made fewer malignant diagnoses than the residents and fellow. Individual diagnostic tendencies have not been well characterized in the medical literature and may have a greater effect on the clinical use of diagnoses than any diagnostic alteration secondary to the presence or absence of clinical history.13 For example, even though pathologist 4 may provide relatively accurate diagnoses, to some clinicians these diagnoses may have little value given the reluctance to be diagnostically definitive in malignant cases. When history was not provided, pathologist 3 was diagnostically definitive in 78% of cases, whereas pathologist 5 was diagnostically definitive in only 29%. For all pathologists, there is a trade-off between being diagnostically definitive and the number of false-negative and false-positive diagnoses that are made. Some pathologists are better than others at being diagnostically definitive while making fewer incorrect definitive diagnoses. Providing clinical history seems to improve any pathologist’s ability in both areas.

From the 1Department of Pathology and Laboratory Medicine and the 2Center for Clinical Effectiveness and Outcomes Research, Allegheny General Hospital, Pittsburgh, PA; 3Duquesne University, Pittsburgh, PA; and the 4Department of Pathology, University of Iowa Hospitals and Clinics, Iowa City.

Address reprint requests to Dr Raab: Dept of Pathology and Laboratory Medicine, Allegheny General Hospital, 320 East North Ave, Pittsburgh, PA 15212-4772.
References
1. Good BC, Cooperstein LA, DeMarino GB, et al. Does knowledge of the clinical history affect the accuracy of chest radiograph interpretation? AJR Am J Roentgenol. 1990;154:709-712.
2. Schreiber MH. The clinical history as a factor in roentgenogram interpretation. JAMA. 1963;185:137-139.
3. Eldevik OP, Dugstad G, Orrison WW, et al. The effect of clinical bias on the interpretation of myelography and spinal computed tomography. Radiology. 1982;145:85-89.
4. Doubilet P, Herman PG. Interpretation of radiographs: effect of clinical history. AJR Am J Roentgenol. 1981;137:1055-1058.
5. Elmore JG, Wells CK, Howard DH, et al. The impact of clinical history on mammographic interpretations. JAMA. 1997;277:49-52.
6. Layfield LJ, Lenel JC, Crim JR, et al. Bone tumor radiograph review by pathologists prior to pathologic diagnosis: a receiver operator curve analysis of diagnostic utility. Oncol Rep. 1998;5:949-953.
7. Vives Jordan M. The value of clinical history in gastrointestinal pathology: probable amebic pancreatic granuloma [in Spanish]. Rev Esp Enferm Apar Dig. 1975;45:209-218.
8. Schwartz WB, Wolfe HJ, Pauker SG. Pathology and probabilities: a new approach to interpreting and reporting biopsies. N Engl J Med. 1981;305:917-923.
9. Bryant GD, Norman GR. Expressions of probability: words and numbers [letter]. N Engl J Med. 1980;302:411.
10. Raab SS, Thomas PA, Lenel JC, et al. Pathology and probability: likelihood ratios and receiver operating characteristic curves in the interpretation of bronchial brush specimens. Am J Clin Pathol. 1995;103:588-593.
11. Giard RW, Hermans J. Interpretation of diagnostic cytology with likelihood ratios. Arch Pathol Lab Med. 1990;114:852-854.
12. Radack KL, Rouan G, Hedges J. The likelihood ratio: an improved measure for reporting and evaluating diagnostic test results. Arch Pathol Lab Med. 1986;110:689-693.
13. Cohen MB, Rodgers RP, Hales MS, et al. Influence of training and experience in fine-needle aspiration biopsy of breast. Arch Pathol Lab Med. 1987;111:518-520.
14. Sackett DL, Haynes RB, Guyatt GH, et al, eds. Clinical Epidemiology: A Basic Science for Clinical Medicine. 2nd ed. Boston, MA: Little Brown; 1991.
15. Beck JR, Shultz EK. The use of relative operating characteristic (ROC) curves in test performance evaluation. Arch Pathol Lab Med. 1986;110:13-20.
16. Langley FA, Buckley CH, Taster M. The use of ROC curves in histopathologic decision making. Anal Quant Cytol. 1985;7:167-173.
17. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology. 1982;143:29-36.
18. Feinstein AR. Clinical Epidemiology: The Architecture of Clinical Research. Philadelphia, PA: Saunders; 1985.
19. Begg CB. Biases in the assessment of diagnostic tests. Stat Med. 1987;6:411-423.
20. Raab SS. Diagnostic accuracy in cytopathology. Diagn Cytopathol. 1994;10:68-75.
21. Horn RC. What can be expected of the surgical pathologist from frozen section examination. Surg Clin North Am. 1962;42:443-454.
22. Ramsay AD, Gallagher PJ. Local audit of surgical pathology. Am J Surg Pathol. 1992;16:476-482.
23. Whitehead ME, Fitzwater JE, Lindsay SK, et al. Quality assurance of histopathologic diagnoses: a prospective audit of three thousand cases. Am J Clin Pathol. 1984;81:487-491.