Radiology
Breast Imaging Stamatia V. Destounis, MD Patricia DiNitto, MD Wende Logan-Young, MD Ermelinda Bonaccio, MD Margarita L. Zuley, MD Kathleen M. Willison, RT(R)(M)
Index terms: Breast neoplasms, radiography, 08.11; Breast radiography, quality assurance; Computers, diagnostic aid. Published online before print 10.1148/radiol.2322030034. Radiology 2004; 232:578–584. Abbreviations: ACR = American College of Radiology, CAD = computer-aided detection 1
From The Elizabeth Wende Breast Clinic, 170 Sawgrass Dr, Rochester, NY 14620. From the 2001 RSNA scientific assembly. Received January 15, 2003; revision requested March 26; final revision received November 12; accepted January 5, 2004. Address correspondence to K.M.W. (e-mail:
[email protected]) or S.V.D. (e-mail:
[email protected]).
Can Computer-aided Detection with Double Reading of Screening Mammograms Help Decrease the False-Negative Rate? Initial Experience1 PURPOSE: To retrospectively evaluate the role of computer-aided detection (CAD) in reducing the rate of false-negative (FN) findings on screening mammograms considered normal at initial double reading. MATERIALS AND METHODS: At the authors’ institution, independent prospective double readings in which the second reader is not blinded to results of the first reading are performed routinely for all mammograms. When cancer is diagnosed, prior mammograms also are reviewed with double reading to determine cancer visibility. Findings are categorized as (a) no evidence of cancer on any prior screening mammogram and patient presents more than 1 year after prior screening, (b) no evidence of cancer on any prior screening mammogram and patient presents with symptoms within 1 year after prior screening (year-interval occult false-negative), or (c) cancer visible. The clinical director separately evaluates each case in the same way. In 2000, 519 histologically proved breast cancers were diagnosed, including 132 for which patients sought a second opinion and FN findings were not tracked. Prior screening mammograms were available in 318 of the other 387 cases. Five radiologists in two reading sessions independently reviewed current and prior mammograms to categorize visible cancers as either threshold or actionable FN findings. Visible cancers deemed actionable by at least three of five readers were analyzed with a commercially available CAD system. FN rates were calculated prior to and after CAD analysis. RESULTS: Twenty-seven occult and 71 visible cancers were found (total FN findings, 98). Three of five readers considered 52 (73%) of 71 visible cancers actionable. 
The CAD system correctly marked 37 (71%) of these 52 on prior screening mammograms (19 [65%] of 29 masses, seven [88%] of eight microcalcifications, seven [78%] of nine architectural distortions, and four [67%] of six masses with microcalcifications). The FN rate was 98 (31%) of 318 before CAD and 61 (19%) of 318 after CAD.
Author contributions: Guarantor of integrity of entire study, S.V.D.; study concepts and design, all authors; literature research, S.V.D., K.M.W.; clinical studies, S.V.D., P.D.N., W.L.Y., E.B., M.L.Z.; data acquisition and data analysis/interpretation, all authors; manuscript preparation and definition of intellectual content, all authors; manuscript editing, S.V.D., W.L.Y., K.M.W.; manuscript revision/ review and final version approval, all authors ©
RSNA, 2004
CONCLUSION: In this retrospective review of this small subset of cancers, it appears that CAD has the potential to decrease the FN rate at double reading by more than one-third (from 31% to 19%). The CAD system correctly marked 37 (71%) of 52 actionable findings read as negative in previous screening years.
A reduction in mortality from breast cancer is evident in published reports; each year, however, an estimated 40,200 deaths will occur and 239,300 new breast cancer cases will be diagnosed (1,2). The cause of breast cancer is unknown. Attempts have been made to control the disease through identification of risk factors and effective prevention methods. Early detection is widely recognized as a means to affect mortality and morbidity. A three-pronged approach that includes clinical breast examination, mammography, and
breast self-examination is the prevailing method for early detection, with mammography being accepted as the standard test for the earliest detection of breast cancer (3–7). However, the limitations of mammography are well documented (8 – 13). Technical constraints, false-positive and false-negative rates, and cumulative x-ray dose are just some of the reported shortcomings. The most troublesome limitation may be the false-negative rate. The wide range in reported false-negative rates (10%–25%) (10 –12,14 –18) reflects the complexity of determining what is or is not a false-negative finding, as well as how these data are recorded. While mammographically occult cancers represent a portion of the false-negative rate, cancers visible in retrospect represent the larger portion (10,11,12,15). The latter category engenders the greatest debate about what constitutes a false-negative finding (16,19). Visible cancers are missed for many reasons at routine mammography, and their number probably could be reduced. Certainly, the interpretive acuity of the radiologist affects the false-negative rate (20), but even the most experienced of mammographers might miss cancers that are visible in retrospect, because the radiologist failed to perceive a worrisome lesion, subtle or not (15,18). Perception plays a large role in the optimization of mammography. Overcoming perceptual obstacles can allow the achievement of higher detection rates. Interpretation of the mammogram by two radiologists (double reading) is one method to compensate for perceptual inconsistencies; results from several studies indicate a 4.6%–15% increase in sensitivity with use of double as opposed to single reading (21–24). Furthermore, cancers identified by a second reader are detected at an earlier stage (22). The additional cost of double reading, however, prohibits its wide use. Even with use of double reading, false-negative findings on mammograms persist (15,17). 
Another method to reconcile perceptual problems is computer-aided detection (CAD) (25–28). Previous investigations of CAD in the interpretation of screening mammograms demonstrated as much as a 19.5% increase in cancer detection with a single reading (17,29). To our knowledge, no study to date has determined the value of CAD in the setting of double reading. The purpose of this retrospective pilot study was to evaluate the role of a CAD system in reducing the rate of false-negative findings on screening mammograms that have been double read as normal.
MATERIALS AND METHODS Patients and Double Reading Technique In our community-based practice, we examined 64,442 women in 2000 (30). Of these patients, 60,575 (94%) had undergone one or more prior mammographic examinations at our facility. Each of our radiologists—all of whom are specialists in breast disease diagnosis with experience ranging from 6 to 30 years (mean, 13 years)—reads an average of 30,000 mammograms annually. Each mammogram is interpreted online by two radiologists. Results of the double reading are reported to the patient at the time of visit. All screening mammograms are double read in the following manner: Two radiologists review and interpret studies online and independently. The second reader is not blinded to the first reader’s interpretation. Each reader places an identifying mark on an interpreted mammogram at the conclusion of reading. Cases recalled by the first reader are removed to a separate viewing area for subsequent work-up. The second reader interprets the mammograms along with the results of any work-up. This may result in additional work-up, because our clinic’s policy permits any physician to initiate work-up of any patient. Any remaining cases on the viewer after the first reading are an indication to the second reader that the first reader interpreted the mammograms as normal. The second reader may recall any other cases, which are then handled in the same manner as first-reading recalls. The first reader will also reevaluate any second-reading recall after work-up is complete. No radiologist may supersede another’s decision to work up a patient; however, any radiologist may consult with a colleague at any time. Double reading of diagnostic mammograms is completed as follows: The assigned radiologist evaluates the patient and, if the radiologist discovers an abnormality, directs any subsequent tests toward deciding the course of action. All testing is completed online. 
The second reader independently reviews the mammogram and consults with the assigned first reader. The primary goal of the second reader in the diagnostic setting is to screen for problems overlooked by the first reader, not to reevaluate the problematic area, although opinions may be sought.
False-Negative Rates We have tracked outcomes rigorously since our clinic began operation in 1976. To determine false-negative rates, we include not only cancers diagnosed within 1 year of a mammogram interpreted as negative but also cancers that are visible on any prior mammogram. We routinely perform a retrospective review of prior mammograms (diagnostic or screening) for all patients in whom cancer is diagnosed. When cancer is diagnosed, the two initial readers review both the current mammogram and prior screening mammograms to determine whether cancer is visible on any prior study. Cancers that are visible are documented. If the cancer is not visible but less than 1 year has elapsed since the most recent prior screening, we consider the cancer a year-interval missed lesion and document it as an occult false-negative finding. Separately and independently, the director of the clinic (W.L.Y.) also evaluates all cancers in the same manner. Any cancer deemed visible on mammograms obtained in prior years is documented, regardless of whether one, two, or all three radiologists agree. Twice a year, in a group conference, five radiologists reevaluate current and prior mammograms of interval occult cancers and visible cancers, all of which are considered false-negative findings, to increase each radiologist’s experience and knowledge base. For purposes of this study, these group conferences were replaced by two reading sessions performed by each of five radiologists (S.V.D., P.D.N., W.L.Y., E.B., M.L.Z.) to subcategorize the visible cancers as threshold false-negative or actionable false-negative findings. Threshold cancers are defined as pathologically proved cancers that were found at diagnostic or screening mammography in the current year and that were visible on a previous screening mammogram but were below the threshold for concern.
Either these cancers were not acted on, or they were judged benign or probably benign on the basis of work-up (which may have included the acquisition of additional mammographic views, as well as physical examination, ultrasonography, fine-needle aspiration biopsy, and/or needle core biopsy). Threshold false-negative findings are considered to result from errors in judgment, biopsy technique, histopathologic evaluation, or technical limitations of the mammographic examination. Actionable findings include any pathologically proved cancer that, after independent review, was considered by at least three of five radiologists to be both evident and actionable on any prior screening study. We defined as actionable those cases in which mammographic evidence was strong enough to have warranted further work-up but for which no work-up was completed in any prior year. Actionable false-negative findings are considered to result from errors in perception.

TABLE 1 Categorization of Breast Cancer Cases according to Basis for Review of False-Negative Findings
Means of Cancer Detection | Prior Mammogram: No | Prior Mammogram: Yes | Second Opinion* | Total
Diagnostic mammography | 11 (6) | 105 (60) | 59 (34) | 175 (34)
Screening mammography | 58 (17) | 213 (62) | 73 (21) | 344 (66)
Total† | 69 (13) | 318 (61) | 132 (25) | 519 (100)
Note.—Data are numbers of cases diagnosed in 2000. Numbers in parentheses are percentages of the total or the subcategory (cancers detected at diagnostic or screening mammography). * In cases in which a second opinion was obtained for diagnosis, false-negative findings were not tracked. † Percentages for the first three columns do not total 100 because of rounding.

TABLE 2 Categorization of Cancers according to Visibility of False-Negative Findings on Prior Screening Mammograms
Means of Cancer Detection | Not Visible on Prior Mammogram: No Evidence* | Not Visible on Prior Mammogram: Occult False-Negative Finding | Visible on Prior Mammogram: Threshold False-Negative Finding | Visible on Prior Mammogram: Actionable False-Negative Finding | Total
Clinical symptoms | 64 (20) | 27 (9)† | 5 (2)† | 9 (3)† | 105 (33)
Findings at screening mammography | 156 (49) | 0 (0) | 14 (4) | 43 (14) | 213 (67)
Total | 220 (69) | 27 (9) | 19 (6) | 52 (16) | 318 (100)
Note.—Data are numbers of cases with at least one prior mammogram. Numbers in parentheses are percentages. * Prior mammogram obtained >1 year before diagnosis. † Prior mammogram obtained ≤1 year before diagnosis.
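The reader-agreement rule used to separate actionable from threshold findings can be sketched in a few lines. This is only an illustration of the rule as described in the text (at least three of five readers judging the prior finding evident and actionable); the function name and the boolean vote encoding are assumptions made for this example, not part of the study's procedure.

```python
from typing import Sequence


def classify_visible_cancer(reader_votes: Sequence[bool], min_agree: int = 3) -> str:
    """Label a retrospectively visible cancer.

    reader_votes holds one boolean per reader: True if that reader judged the
    prior finding both evident and actionable (hypothetical encoding).
    """
    if len(reader_votes) != 5:
        raise ValueError("expected one vote from each of five readers")
    # Actionable when at least `min_agree` of the five readers agree.
    return "actionable" if sum(reader_votes) >= min_agree else "threshold"


if __name__ == "__main__":
    print(classify_visible_cancer([True, True, True, False, False]))
    print(classify_visible_cancer([True, False, True, False, False]))
```

Keeping the cutoff as a parameter makes it easy to see how the actionable count would shift under a stricter or looser agreement requirement.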
Selection of Cases for CAD System Analysis Institutional review board approval was not required for this study. Informed consent is not required by our institution for retrospective review of data. In 2000, a total of 519 histologically proved cancers (Table 1) were diagnosed at our facility (175 at diagnostic and 344 at screening mammography). In 132 of these 519 cases, the patient sought a second opinion, and false-negative findings in these 132 cases were not tracked. Prior screening mammograms had been obtained at our facility for 318 (82%) of the other 387 cases. These 318 cancers were detected either at diagnostic imaging performed because of symptoms that occurred within 1 year after prior screening (n = 105) or at follow-up routine screening mammography (n = 213). Prior screening mammograms in some cases predated the current mammograms by more than 1 year. The initial evaluation process, with the double reading and with review by the director of the clinic, yielded findings of visible disease on prior screening mammograms for 71 cancers (Table 2). Five radiologists (S.V.D., P.D.N., W.L.Y., E.B., M.L.Z.) independently reviewed the current and prior mammograms of these 71 cancers. Three of the five radiologists determined independently that 52 (73%) of the 71 mammograms (mean patient age, 65 years; age range, 42–89 years) showed abnormalities that could have been acted on (actionable false-negative findings). Our study was focused on the 52 cancers that were actionable on any prior screening study, regardless of whether they had been detected because of symptoms or at routine screening.
Lead time to cancer diagnosis from the most recent prior screening mammographic examination and lead time to cancer diagnosis from the oldest actionable prior screening mammographic examination were documented for the 52 actionable false-negative findings. Total time intervals were calculated for actionable findings with lead time of 1 year or less and actionable findings with lead time of more than 1 year. Patient age, breast composition (as defined by the American College of Radiology [ACR] Breast Imaging and Reporting Data System lexicon [21]), cancer type (ie, invasive carcinoma versus ductal carcinoma in situ), morphology (ie, mass or calcifications), and means of detection (ie, diagnostic mammography performed because of clinical symptoms, or screening mammography) were recorded for the 52 cancer patients. Lymph node status was recorded. Lesion size was measured by hand on the current mammograms and the oldest actionable prior mammograms for all invasive cancers in which such measurement was feasible. The number and percentage of minimal cancers (ie, ductal carcinomas in situ, or invasive cancers with a diameter of 10 mm or less) depicted on current mammograms were determined.
CAD Analysis Current mammograms and prior screening mammograms of the 52 actionable cancers were analyzed by using a CAD device (Image Checker V2.2; R2 Technology, Sunnyvale, Calif). The current and previous mammograms were reviewed by using a viewer (RAD-x 604; R2 Technology) equipped with two 9-inch gray-scale monitors, which displays the scanned mammograms with CAD analysis results. The display reveals the location of CAD marks and indicates morphologic type. It is not intended for soft-copy interpretation. Overall, a cancer was counted as detected if the CAD system marked the lesion site with the appropriate mark on either view of the current or any prior mammogram. The CAD system sometimes marked the cancer on one but not all prior mammograms, on a prior mammogram but not the current mammogram, or on the current mammogram but no prior mammogram. The following parameters were calculated and documented: number of cancers marked with appropriate marks on the current mammogram, number of cancers marked with appropriate marks on mammograms obtained 1 year or less before the current mammogram, and the number of cancers marked with appropriate marks on mammograms obtained more than 1 year prior to the current mammogram. Distribution of morphologic presentation among marked cancers also was recorded. The total number of marks per case and the number of marks that correctly indicated cancer per case were documented.
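The counting rule for CAD detection described above can be illustrated with a small sketch. The `Exam` record, its field names, and the use of months to separate "1 year or less" from "more than 1 year" are hypothetical choices made for this example; the study itself reports only the resulting counts.

```python
from dataclasses import dataclass


@dataclass
class Exam:
    # Hypothetical record for one mammographic examination of one cancer case.
    months_before_diagnosis: int  # 0 denotes the current mammogram
    marked_views: set             # views with an appropriate CAD mark at the lesion site


def cad_summary(exams):
    """Summarize where CAD marked the lesion for a single cancer case."""
    def any_marked(group):
        # A group counts as marked if any exam in it has a mark on either view.
        return any(e.marked_views for e in group)

    current = [e for e in exams if e.months_before_diagnosis == 0]
    prior = [e for e in exams if e.months_before_diagnosis > 0]
    return {
        "current": any_marked(current),
        "prior_le_1y": any_marked([e for e in prior if e.months_before_diagnosis <= 12]),
        "prior_gt_1y": any_marked([e for e in prior if e.months_before_diagnosis > 12]),
        # Overall rule from the text: marked on either view of the current
        # or any prior mammogram.
        "detected": any_marked(exams),
    }


if __name__ == "__main__":
    case = [Exam(0, set()), Exam(11, {"MLO"}), Exam(24, set())]
    print(cad_summary(case))
```

The example case is marked on a prior study only, which corresponds to the "prior mammogram but not the current mammogram" situation mentioned in the text.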
Determination of False-Negative Rates False-negative rates were determined, prior to CAD analysis of the 52 actionable false-negative findings, by using the total number of breast cancers diagnosed in 2000 (excluding those diagnosed through a second opinion) for which prior mammograms obtained at our clinic were available. False-negative rates were calculated as follows: The number of all (occult, threshold, and actionable) false-negative findings was divided by the total number of cancers, and the contribution of each category (occult, threshold, or actionable) to the false-negative rate was divided by the total number of cancers. In addition, we determined the false-negative rate by using the guidelines issued by the ACR (29) for a practice audit, which includes all occult false-negative findings and diagnostic (year-interval) threshold or actionable false-negative findings only, but excludes actionable and threshold false-negative findings diagnosed more than 1 year after screening. After the CAD system analysis of the 52 actionable findings was complete, the false-negative rates were recalculated with the assumption that all CAD-marked cancers could have been detected on a prior mammogram.
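The three rate calculations described in this subsection reduce to simple ratios. The sketch below applies them to the counts reported in the abstract and Results of this paper; the variable and function names are illustrative only, and the "after CAD" figure rests on the same assumption the authors state, namely that every CAD-marked cancer would have been acted on.

```python
def rate(count: int, total: int) -> float:
    """False-negative rate: findings in a category divided by all cancers."""
    return count / total


# Counts reported in the abstract and Results of this paper.
TOTAL_CANCERS = 318            # cancers with a prior mammogram at the clinic
OCCULT, THRESHOLD, ACTIONABLE = 27, 19, 52
CAD_MARKED = 37                # actionable findings marked by CAD on a prior study
YEAR_INTERVAL_THRESHOLD = 5    # threshold findings diagnosed within 1 year
YEAR_INTERVAL_ACTIONABLE = 9   # actionable findings diagnosed within 1 year

fn_before = OCCULT + THRESHOLD + ACTIONABLE
fn_after = fn_before - CAD_MARKED  # assumes every CAD-marked cancer is acted on
fn_acr = OCCULT + YEAR_INTERVAL_THRESHOLD + YEAR_INTERVAL_ACTIONABLE

for label, n in [("before CAD", fn_before), ("after CAD", fn_after), ("ACR audit", fn_acr)]:
    print(f"{label}: {n}/{TOTAL_CANCERS} = {rate(n, TOTAL_CANCERS):.0%}")
```

Run as written, this reproduces the reported figures of 98 (31%), 61 (19%), and 41 (13%) of 318.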
RESULTS For the 52 actionable false-negative findings, a mean interval of 12 months (range, 2–28 months) had elapsed between the most recent prior mammogram and cancer detection. Mean lead time to cancer diagnosis from the oldest prior actionable screening mammogram was 1.6 years (range, 2 months to 4 years). Twenty-nine (56%) of 52 cancers were deemed actionable up to 1 year prior, and 23 (44%) were deemed actionable more than 1 year prior. Table 3 illustrates the analysis of the 52 cancers. Thirty (58%) were considered minimal cancers. Most of these false-negative findings (35 [67%] of 52 cancers) were based on mammograms acquired in breasts with dense tissue.

TABLE 3 Characteristics of 52 Cancers Rated as Actionable False-Negative Findings
Characteristic | Diagnostic Mammography | Screening Mammography | Total
Breast composition: Fatty | 0 (0) | 7 (16) | 7 (13)
Breast composition: Fibrous and fatty | 0 (0) | 10 (23) | 10 (19)
Breast composition: Heterogeneously dense | 5 (56) | 20 (47) | 25 (48)
Breast composition: Dense | 4 (44) | 6 (14) | 10 (19)
Cancer type: Ductal carcinoma in situ | 1 (11) | 9 (21) | 10 (19)
Cancer type: Invasive carcinoma | 8 (89) | 34 (79) | 42 (81)
Morphology: Mass | 6 (67) | 23 (53) | 29 (56)
Morphology: Calcification | 1 (11) | 7 (16) | 8 (15)
Morphology: Mass with calcification | 1 (11) | 5 (12) | 6 (12)
Morphology: Architectural distortion | 1 (11) | 8 (19) | 9 (17)
Size*: Prior mammogram | 8–17 (10) | 2–15 (6) |
Size*: Current mammogram | 10–25 (14) | 5–20 (9) |
Lymph node status: Positive | 2† | 1‡ | 3 (6)
Lymph node status: Negative | 38 | 0 | 38 (73)
Lymph node status: Not sampled | 11 | 0 | 11 (21)
Note.—Of the 52 cancers, nine were detected at diagnostic mammography and 43 were detected at screening mammography. Unless otherwise noted, data are numbers of cases, and numbers in parentheses are percentages. * Size is given in millimeters. Numbers are ranges, with means for the 42 measurable cancers in parentheses. † Mean size, 22.5 mm. ‡ Size, 6 mm.

CAD analysis resulted in a total of 218 marks, an average of 4.5 marks per case, with 75 (34%) of those 218 marks indicating cancers. On mammograms from prior screening years, the CAD system correctly marked 37 (71%) of the 52 cancers (Tables 4, 5): 19 (65%) of 29 masses, seven (88%) of eight calcifications, seven (78%) of nine architectural distortions, and four (67%) of six masses with calcifications. The CAD system marked the cancer on one but not all prior mammograms in seven cases, on a prior mammogram but not the current mammogram in five cases, and on the current mammogram but no prior mammogram in eleven cases. In the overall assessment, these cancers were regarded as having been detected with CAD.

TABLE 4 CAD Results for Actionable False-Negative Findings according to Interval between Prior Mammography and Detection
Interval | No. of Cases | Prior Mammogram Correctly Marked by CAD System
≤1 year | 29 | 26
>1 year | 23 | 11
Total | 52 | 37

Overall, the false-negative rate before CAD analysis of the 52 actionable cancers was 98 (31%) of 318. The false-negative rate after CAD analysis was 61 (19%) of 318. Calculation of the false-negative rate according to the ACR guidelines (all occult false-negative findings [n = 27] plus year-interval actionable false-negative findings [n = 9] and threshold false-negative findings [n = 5] only) revealed a rate of 41 (13%) of 318 (Table 6). The estimates of the false-negative rates before and after CAD have a standard error of two percentage points. The estimates of the CAD stand-alone sensitivity for specific subgroups are considerably less precise, with standard errors ranging from six to 19 percentage points for different false-negative finding subgroups. Three (7%) of the 40 patients who underwent lymph node dissection had lymph node metastases (Table 3). Mammographically, those three cancers manifested as masses (n = 2) and as a mass with calcifications (n = 1). The CAD system correctly marked these three cancers on prior mammograms.
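The paper does not say how the standard errors quoted in the Results were computed. As a rough check, the sketch below assumes the usual binomial standard error of a proportion, sqrt(p(1 − p)/n); that formula is an assumption of this example, not a method stated by the authors. Under it, the overall rates come out at roughly two to three percentage points, and the CAD sensitivities at roughly 6 points for 37 of 52 and roughly 19 points for 4 of 6, which is in line with the ranges reported.

```python
import math


def binomial_se(successes: int, n: int) -> float:
    """Standard error of a proportion under a simple binomial model (assumed)."""
    p = successes / n
    return math.sqrt(p * (1 - p) / n)


if __name__ == "__main__":
    # (label, numerator, denominator) taken from the Results text.
    estimates = [
        ("FN rate before CAD", 98, 318),
        ("FN rate after CAD", 61, 318),
        ("CAD sensitivity, all actionable", 37, 52),
        ("CAD sensitivity, mass with calcifications", 4, 6),
    ]
    for label, k, n in estimates:
        print(f"{label}: {k}/{n}, SE = {100 * binomial_se(k, n):.1f} percentage points")
```

The small subgroup denominators (six to 29 cases) are what drive the wide standard errors, which is why the subgroup sensitivities should be read as rough estimates.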
TABLE 5 Results of CAD System Analysis of 52 Actionable False-Negative Findings
Original Means of Cancer Detection* | Total | Marked by CAD System: Prior Mammogram | Marked by CAD System: Current Mammogram | Marked by CAD System: Current or Prior Mammogram
Clinical symptoms | 9 (17) | 3 (33) | 7 (78) | 8 (89)
Findings at screening mammography | 43 (83) | 34 (79) | 35 (81) | 39 (91)
Total | 52 (100) | 37 (71) | 42 (81) | 47 (90)
Note.—Data are numbers of cases with actionable false-negative findings identified at retrospective CAD system analysis. Numbers in parentheses are percentages.

DISCUSSION Because mortality and morbidity due to breast cancer are lower among patients who have minimal cancers (ductal carcinoma in situ, or invasive cancer <10 mm), which are most often discovered through early detection techniques (7), it is natural to want to achieve earlier detection so as to improve prognosis. Routine screening mammography is predicated on this goal. The complexities of mammographic interpretation, as well as the technical limitations of mammography, place the interpreter at a disadvantage. When radiologists sit down to read, they know that some cancers will not be visible. Cancer visibility is dependent on the capacity of the imaging system, adequacy of technical factors, and the technologist’s ability to optimally position the patient. Although such factors are regularly monitored through quality assurance testing mandated by the Mammography Quality Standards Act, mammograms are rarely technically perfect. It would be ideal if the technical limitations of an imaging system were the only source of false-negative findings. However, a number of cancers, although visible, will be judged benign. The radiologist’s ability to identify suspicious abnormalities can be improved with experience, training, and constant attention to individual threshold. Yet, another group of visible cancers will be missed simply because of failure to perceive an abnormality. The perception of visible and actionable disease, subtle or not, may not necessarily improve with training, as evidenced by the results of studies about the reasons for false-negative findings. Bird et al (11) reported that more than 40% of false-negative findings were “overlooked” and another 17% were “present” on previous mammograms. Birdwell et al (18) concluded that images incorrectly evaluated at first as negative for cancer have common features that are “usually considered to be suspicious for breast cancer.” Although most radiologists employ, and recognize the benefits of, reading strategies to ensure systematic geographic review and morphologic submapping, even very experienced mammographers do not detect all visible cancers. Assessment of false-negative findings, especially when used as a learning tool, can help expand the boundaries of what is detectable.
Accurate determination of false-negative rates is complicated. Bias inherent in retrospective review can result in rates that are under- or overestimated. Variation in audit practices and discrepancy in what constitutes a false-negative finding also result in reporting inaccuracies. Further discrepancies in false-negative rates may occur because previously screened patients with a later diagnosis of breast cancer may obtain subsequent screening and/or diagnostic work-up at different facilities or may proceed directly to the surgeon; in such cases, radiologists are predominantly not informed of false-negative findings. Because Rochester, NY, has a stable population, our clinic has multiple previous mammograms for most patients. This permits us to track false-negative findings more closely than most clinics can. For 2000, we have at least one prior mammogram for 318 (82%) of 387 cancers, a fact that allowed us to perform a more accurate assessment of the false-negative rate. In addition, we have established long-standing mechanisms in our community that more effectively enable us to obtain the results of pathologic analysis even when we have not initiated the biopsy. The ACR (31) defines a false-negative finding as “diagnosis of cancer within 1 year of a mammographic examination with normal or probably benign findings (BI-RADS [Breast Imaging and Reporting Data System] category 1, 2, and 3).” The Final Rules of the Mammography Quality Standards Act (32) require a systematic practice audit. Although the Mammography Quality Standards Act does not define how a facility should perform a practice audit, it provides clear instruction about the reporting of false-negative
mammographic findings: “Facilities must also include in their audit any patients that they become aware of who were subsequently found to have cancer that was not detected through their mammogram.” Herein lies the greatest debate over false-negative values. If the ACR definition of false-negative findings is applied, our clinic’s false-negative rate for 2000 is 41 (13%) of 318 cancers (Table 6), including 27 (9%) that were mammographically occult and 14 (4%) that were year-interval missed cancers (five threshold and nine actionable false-negative findings) visible on prior mammograms. However, our clinic reviews the prior mammograms of all patients with cancer, whether the mammogram was obtained less than or more than 1 year prior to cancer diagnosis. Using our method of review for false-negative findings, we identified another 57 (18%) of 318 cancers (14 threshold and 43 actionable false-negative findings) that were retrospectively visible on previous screening mammograms. We believe that this is the best method to determine the true false-negative rate. Admittedly, there is some subjectivity in deciding whether the earlier appearance of tumors is actionable. If the goal is to enhance the interpretive proficiency of the radiologist and to decrease perceptual deficiencies, then the information obtained by reviewing these additional cancer cases must also be considered. Since 1995, the staff at our clinic have performed a double reading of all mammograms. An internal audit for 1996–2000 revealed a 9% increase in the number of detected cancers as a result of the second reading (86 of 938 cancers detected in those years). This result is similar to previously reported statistics of 4.6%–15% (21–24). Because our office and staff are located at a single centralized facility, two radiologists are always available for double reading.
Our recall rate is higher than those at facilities at which only one physician interprets the mammogram, because either of our radiologists can recall the patient. However, the current lack of trained mammographers is stretching our staff to its limits. Whether CAD can replace the second reader remains unclear, and the question is in need of further research. It appears from our retrospective review that CAD has the potential to decrease the false-negative rate. Fifty-two cancers were considered by our facility to be false-negative findings, although 43 of the 52 were detected at follow-up screenDestounis et al
Radiology
TABLE 6 False-Negative Rates before and after CAD System Analysis of 52 Actionable False-Negative Findings
Occult*
Threshold
Actionable
Diagnostic mammography Screening mammography
27 (9) 0 (0)
5 (2) 14 (4)
9 (3) 43 (14)
27 (9)
19 (6)
52 (16)
Total
Total
False-Negative Findings with CAD†
False-Negative Findings according to ACR Guidelines
41 (13) 57 (18)
39 (12) 22 (7)
41 (13) 0 (0)
98 (31)
61 (19)
41 (13)
False-Negative Findings without CAD
Means of Cancer Detection
Note.—Data are numbers of cases. Numbers in parentheses are percentages (false-negative rates) based on a total of 318 cases in which prior mammograms were available. * Diagnosed 1 year or less after prior screening mammography. † Based on the assumption that all CAD-marked cancers would be acted on.
ing. Thirty (58%) of these 52 cancers were considered minimal cancers (ductal carcinomas in situ or ≤10-mm invasive carcinomas); however, three (7%) of 40 patients who underwent lymph node dissection had lymph node metastases.

The CAD system correctly marked 37 (71%) of 52 cancers on prior mammograms, with 11 (21%) of the 52 cancers marked on mammograms acquired more than 1 year prior to diagnosis. The system also correctly marked all three cancers with lymph node involvement on prior mammograms. In one case of lymph node metastasis, the CAD system marked the cancer on a mammogram acquired more than 2 years prior. Twenty-eight (76%) of the 37 cancers detected by the CAD system were invasive and had a mean size of 12 mm (range, 5–25 mm) at detection, compared with a mean size of 8 mm (range, 2–17 mm) on the oldest CAD system–marked prior mammogram.

It is impossible to determine retrospectively whether the CAD system would have prompted either radiologist in the prospective double reading to detect a cancer. It is also unrealistic to expect that increases in detection rates with CAD would come without increased cost. Of 218 CAD system marks, only 75 designated cancers. It is likely that false marks would lead to an increase in recall rates, radiologists’ workload, and overall operating expenses. However, Warren-Burhenne et al (17), in their retrospective review, found an overall decrease in recall rates (<1%), and Freer and Ulissey (29), in their prospective study, found an increase in recalls of a little more than 1%, no change in positive predictive value, and an overall increase in sensitivity of 19.5%. These results may indicate the quality of CAD system marks and/or the ability of the radiologist to distinguish false-positive marks from high-quality actionable marks.
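The percentages reported in this study can be checked with simple arithmetic. The following Python sketch is illustrative only and was not part of the study's analysis. The variable names are ours, and the post-CAD false-negative rate rests on the assumption, stated in the table footnote, that every CAD-marked cancer would have been acted on. All counts are taken from the text: 318 cases with prior mammograms, 98 total false-negative findings, 52 actionable cancers, 37 of them marked by CAD, and 75 of 218 CAD marks designating cancers.

```python
def pct(numerator, denominator):
    """Return numerator/denominator as a whole-number percentage."""
    return round(100 * numerator / denominator)


# Counts reported in the article
CASES_WITH_PRIORS = 318     # cancers with prior screening mammograms available
TOTAL_FN = 98               # 27 occult + 71 visible cancers
ACTIONABLE = 52             # visible cancers deemed actionable by >= 3 of 5 readers
CAD_MARKED = 37             # actionable cancers correctly marked by CAD
CAD_MARKS_TOTAL = 218       # all CAD marks placed
CAD_MARKS_ON_CANCER = 75    # CAD marks that designated cancers

# Share of actionable cancers that CAD marked on prior mammograms
cad_marked_share = pct(CAD_MARKED, ACTIONABLE)

# False-negative rate before CAD
fn_rate_before = pct(TOTAL_FN, CASES_WITH_PRIORS)

# False-negative rate after CAD, ASSUMING every CAD-marked cancer is acted on
fn_rate_after = pct(TOTAL_FN - CAD_MARKED, CASES_WITH_PRIORS)

# Relative reduction in false-negative findings under the same assumption
relative_reduction = pct(CAD_MARKED, TOTAL_FN)

# Share of CAD marks that designated cancers
true_mark_share = pct(CAD_MARKS_ON_CANCER, CAD_MARKS_TOTAL)

print(cad_marked_share, fn_rate_before, fn_rate_after,
      relative_reduction, true_mark_share)
```

Under these assumptions, the relative reduction (37 of 98 false-negative findings) is what supports describing the potential decrease as more than one-third. The share of marks that designated cancers is a per-mark figure and says nothing about recall rates, which cannot be derived from these counts.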
It is also possible that CAD may decrease the number of findings in the threshold false-negative cohort. Because of subjective interpretation and wide variation in recall rates, CAD may influence recall of borderline cases. This can only be determined in a prospective manner.

In this retrospective review of this small subset of cancers, it appears that CAD has the potential to decrease the false-negative rate at double reading by more than one-third (from 31% to 19%). The CAD system correctly marked 37 (71%) of 52 actionable findings read as negative in previous screening years. What could not be determined retrospectively were the associated changes in recall rates and the effect on the positive predictive value. A prospective study is warranted and is now under way at our institution to measure and evaluate the effect of the addition of CAD to independent double interpretation of screening mammograms.

References
1. Howe HL, Wingo PA, Thun MJ, et al. The annual report to the nation on the status of cancer (1973–1998), featuring cancers with recent increasing trends. J Natl Cancer Inst 2001; 93:824–842.
2. Greenlee RT, Hill-Harmon MB, Murray T, Thun M. Cancer statistics 2001. CA Cancer J Clin 2001; 51:15–36.
3. Leitch AM, Dodd GD, Costanza M, et al. American Cancer Society guidelines for the early detection of breast cancer: update 1997. CA Cancer J Clin 1997; 47:150–153.
4. Kerlikowske K, Barclay J. Outcomes of modern screening mammography. J Natl Cancer Inst Monogr 1997; 22:105–111.
5. National Institutes of Health Consensus Conference on Breast Cancer Screening for Women Ages 40–49: proceedings. Bethesda, Maryland, USA. January 21–23, 1997. J Natl Cancer Inst Monogr 1997; 22:vii–xviii, 1–156.
6. Shapiro S. The status of breast cancer screening: a quarter of a century of research. World J Surg 1989; 13:9–18.
7. Tabar L, Fagerberg G, Chen HH, et al. Efficacy of breast cancer screening by age: new results from the Swedish two-county trial. Cancer 1995; 75:2507–2517.
8. Sala E, Warren R, McCann J, Duffy S, Luben R, Day N. High-risk mammographic parenchymal patterns, hormone replacement therapy and other risk factors: a case-control study. Int J Epidemiol 2000; 29:629–636.
9. Mandelson MT, Oestreicher N, Porter PL, et al. Breast density as a predictor of mammographic detection: comparison of interval- and screen-detected cancers. J Natl Cancer Inst 2000; 92:1081–1087.
10. Ikeda DM, Andersson I, Wattsgard C, Janzon L, Linell F. Interval carcinomas in the Malmo Mammographic Screening Trial: radiographic appearance and prognostic considerations. AJR Am J Roentgenol 1992; 159:287–294.
11. Bird RE, Wallace TW, Yankaskas BC. Analysis of cancers missed at screening mammography. Radiology 1992; 184:613–617.
12. van Dijck JA, Verbeek AL, Hendriks JH, Holland R. The current detectability of breast cancer in a mammographic screening program: a review of the previous mammograms of interval and screen detected cancers. Cancer 1993; 72:1933–1938.
13. Baines CJ, Dayan R. A tangled web: factors likely to affect the efficacy of screening mammography. J Natl Cancer Inst 1999; 91:833–838.
14. Baker LH. Breast cancer detection demonstration project: 5-year summary report. CA Cancer J Clin 1982; 32:194–225.
15. Harvey JA, Fajardo LL, Innis CA. Previous mammograms on patients with impalpable breast carcinoma: retrospective vs. blind interpretation. AJR Am J Roentgenol 1993; 161:1167–1172.
16. Duncan AA, Wallis MG. Classifying interval cancers. Clin Radiol 1995; 50:774–777.
17. Warren-Burhenne LJ, Wood SA, D’Orsi CJ, et al. Potential contribution of computer-aided detection to the sensitivity of screening mammography. Radiology 2000; 215:554–562.
18. Birdwell RL, Ikeda DM, O’Shaughnessy KF, Sickles EA. Mammographic characteristics of 115 missed cancers later detected with screening mammography and the potential utility of computer-aided detection. Radiology 2001; 219:192–202.
19. Murphy WA, Destouet JM, Monsees BS. Professional quality assurance for mammography screening programs. Radiology 1990; 175:319–320.
20. Kan L, Olivotto IA, Warren-Burhenne LJ, et al. Standardized abnormal interpretation and cancer detection ratios to assess reading volume and reader performance in a breast screening program. Radiology 2000; 215:563–567.
21. Ciatto S, Del Turco MR, Morrone D, et al. Independent double reading of screening mammograms. J Med Screen 1995; 2:99–101.
22. Thurfjell EL, Lernevall K, Taube AA. Benefit of independent double reading in a population-based mammography screening program. Radiology 1994; 191:241–244.
23. Beam C, Sullivan D, Layde P. Effect of human variability on independent double reading in screening mammography. Acad Radiol 1996; 3:891–897.
24. Anttinen I, Pamilo M, Soiva M, Roiha M. Double reading of mammography screening films: one radiologist or two? Clin Radiol 1993; 48:414–421.
25. Zheng B, Ganott MA, Britton CA, et al. Soft-copy mammographic reading with different computer-assisted detection cuing environments: preliminary findings. Radiology 2001; 221:633–640.
26. Jiang Y, Nishikawa RM, Schmidt RA, Toledano AY, Doi K. Potential of computer-aided diagnosis to reduce variability in radiologists’ interpretations of mammograms depicting microcalcifications. Radiology 2001; 220:787–794.
27. Kegelmeyer WP, Pruneda JM, Bourland PD, Hillis A, Riggs MW, Nipper ML. Computer-aided mammographic screening for spiculated lesions. Radiology 1994; 191:331–337.
28. Chan HP, Sahiner B, Helvie MA, et al. Improvement of radiologists’ characterization of mammographic masses by using computer-aided diagnosis: an ROC study. Radiology 1999; 212:817–827.
29. Freer TW, Ulissey MJ. Screening mammography with computer-aided detection: prospective study of 12,860 patients in a community breast center. Radiology 2001; 220:781–786.
30. Logan-Young W. The breast imaging center: successful management in today’s environment. Radiol Clin North Am 2000; 38:853–860.
31. American College of Radiology. Breast imaging reporting and data system (BI-RADS). Part III: Follow-up and outcome monitoring. Available at: www.acr.org. Accessed November 16, 2002.
32. Mammography Quality Standards Act Regulations, February 6, 2002 (67 Federal Register 5446), part 900: mammography, subpart A: accreditation. Available at: www.fda.gov/cdrh/mammography/frmamcom2.html#s9005. Accessed June 7, 2004.