EARLY ONLINE RELEASE Note: This article was posted on the Archives Web site as an Early Online Release. Early Online Release articles have been peer reviewed, copyedited, and reviewed by the authors. Additional changes or corrections may appear in these articles when they appear in a future print issue of the Archives. Early Online Release articles are citable by using the Digital Object Identifier (DOI), a unique number given to every article. The DOI will typically appear at the end of the abstract.

The DOI for this manuscript is doi: 10.5858/arpa.2018-0072-OA

The final published version of this manuscript will replace the Early Online Release version at the above DOI once it is available.

© 2018 College of American Pathologists

Original Article

An Automatable Method for Determining Adequacy of Thyroid Fine-Needle Aspiration Samples

Daniel B. Schmolze, MD; Andrew H. Fischer, MD

Context.—Thyroid nodules are a common clinical problem. Cytologic evaluation via fine-needle aspiration is often employed in the diagnostic workup, and rapid on-site assessment of adequacy can help ensure an adequate sample is obtained. However, rapid on-site assessment of adequacy only examines part of the sample, a part that may not then be available for ancillary testing. Moreover, the procedure is time-consuming and poorly reimbursed.

Objective.—To develop an automatable fluorescence-based image analysis system for assessing the adequacy of thyroid fine-needle aspirations that uses the entire aspirated sample.

Design.—Twelve previously diagnosed cases served as a training set, and 11 cases were used for validation of an image analysis algorithm. The samples were fluorescently stained and imaged using a fluorescence microscope. The images were assessed for adequacy by an image analysis algorithm. Following image analysis, a ThinPrep slide was prepared and blindly scored by a cytopathologist. The standard and computer-derived results were then compared.

Results.—The algorithm was optimized using the 12 cases in the training set, and then applied to the 11 test cases. All 8 samples scored as adequate by the algorithm were also adequate on cytopathologist review, and 2 of the 3 samples scored as inadequate by the algorithm were confirmed as inadequate. One adequate case was erroneously designated as inadequate by the algorithm.

Conclusions.—Our results demonstrate the feasibility of automating thyroid adequacy assessment using a fluorescent labeling technique followed by computer image analysis.

(Arch Pathol Lab Med. doi: 10.5858/arpa.2018-0072-OA)

Accepted for publication July 30, 2018.

From the Department of Pathology, City of Hope National Medical Center, Duarte, California (Dr Schmolze); and the Department of Pathology, University of Massachusetts Medical School, Worcester (Dr Fischer).

The authors have no relevant financial interest in the products or companies described in this article.

Corresponding author: Daniel B. Schmolze, MD, Department of Pathology, City of Hope National Medical Center, 1500 East Duarte Road, Duarte, CA 91010 (email: [email protected]).

Arch Pathol Lab Med

Thyroid nodules are a very common clinical problem. The reported prevalence depends on the method of detection and the population being studied; in iodine-sufficient countries, the prevalence of palpated nodules is approximately 5% in women and 1% in men.1–3 When detected via high-resolution ultrasonography or autopsy series, prevalence is 50% or higher in most studies.3,4 During the past several decades, the incidence of thyroid nodules has increased sharply in developed countries, likely as a result of increased use of sensitive imaging techniques.5

Most thyroid nodules are benign.6 Imaging and clinical features can suggest malignancy, but cytologic evaluation via fine-needle aspiration (FNA) is usually required for definitive diagnosis.7 Fine-needle aspiration is a safe and relatively effective technique8; however, up to 15% of samples are classified as nondiagnostic because of inadequate cellularity.9,10 The inadequacy rate is affected by characteristics of the nodule itself, the technique used (eg, ultrasound guidance), and the experience of the aspirator.

Numerous studies have concluded that rapid on-site assessment of adequacy (ROSE) can significantly reduce the inadequacy rate.9–12 ROSE is performed by a cytopathologist or a cytotechnician, who is present during the FNA procedure and who renders an immediate assessment of adequacy on the aspirated material. The aspirator can then make additional passes as needed, to ensure the final sample is suitable for diagnostic interpretation. ROSE also offers an opportunity to appropriately triage the specimen based on initial diagnostic impressions. For example, if lymphoma is suspected, flow cytometry can be performed; if atypical cells are identified, additional passes can be requested to ensure adequate material for possible molecular studies.

Although ROSE is helpful, it has several important limitations. Typically, only a portion of the aspirated sample is expressed directly on a glass slide for microscopic examination; the needle must be rinsed to recover the entire sample.
Because part of the sample is not examined, it is possible to falsely assess a sample as inadequate, with subsequent unnecessary additional passes.13 Alternatively, and more troublingly, the final sample may prove inadequate despite appearing adequate at the time of ROSE.14 A study of ROSE for thyroid FNA concluded that an adequate on-site assessment "does not guarantee" a diagnostic final interpretation.15 Up to 20% of thyroid FNAs with an "adequate" ROSE showed suboptimal cellularity in the final slide.15 Even if there is no discordance, the material examined for adequacy is usually not available for ancillary testing, and therefore the effective sample size is reduced by ROSE. The desire for ever less invasive procedures, combined with an expanding menu of prognostic and predictive tests, will only make this problem more acute.

Finally, ROSE is time-consuming and poorly reimbursed.16 The procedure is most accurate when performed by a cytopathologist, but it often incurs a net loss of revenue when time is considered. With these limitations in mind, we describe the key components of an automatable fluorescence-based image analysis system for assessing the adequacy of thyroid FNAs.

DESIGN

Thyroid FNA specimens with residual material following diagnosis were retrospectively identified. The samples had all been previously collected in CytoRich Red Collection Fluid (ThermoFisher Scientific, Waltham, Massachusetts). Following centrifugation, the CytoRich Red needle rinse was resuspended in ThinPrep PreservCyt vials (Hologic Inc, Marlborough, Massachusetts). ThinPrep monolayer slides were prepared, and if more than 5 mL remained in the specimen vial after the ThinPrep, a Cellient (Hologic) cell block was prepared. For some samples, residual material remained after the cell block was prepared; after diagnosis, the ThinPrep vials were grossly examined, and if more than 5 mL still remained, these vials were used for the study. A total of 12 cases served as a training set for the development and optimization of an image analysis algorithm, whereas 11 cases were used as a test set.

Samples were centrifuged at 1000 × g for 5 minutes, and the pellet was resuspended in 1 mL of a 100 ng/mL solution of the fluorescent nuclear stain DAPI (4′,6-diamidino-2-phenylindole; Sigma) in water. Following incubation for 1 minute, the sample was again centrifuged at 1000 × g for 5 minutes, and all but 100 µL of the supernatant was decanted.
The remaining 100 µL of sample in DAPI solution was placed on a standard-sized microscope slide bearing a 4-mm circular well defined by a bordering hydrophobic barrier in order to contain the sample (SPI Supplies, West Chester, Pennsylvania). Next, the samples were imaged at ×10 magnification using a noninverted Nikon Eclipse E800 epifluorescence microscope equipped with a Spot RT3 camera (Diagnostic Instruments Inc, Sterling Heights, Michigan). To capture the entire 4-mm sample, 4 overlapping images were acquired and then digitally stitched to yield a single image.

An image analysis algorithm was developed using the R programming language17 and the EBImage package.18 The algorithm consisted of a series of binary thresholding operations (Figures 1 and 2). We observed that contaminating debris, such as fibrin, exhibited a much lower fluorescence intensity than DAPI-stained nuclei. Groups of closely clustered nuclei exhibited a moderate intensity (higher than that of surrounding debris) due to diffusion of light from their constituent nuclei. Thus, a first threshold set a minimum intensity high enough to exclude dim pixels likely to represent contaminating debris, such as fibrin. Next, groups of cells were defined and enumerated by again applying an intensity threshold designed to include groups of nuclei (moderate intensity) along with their constituent nuclei. This process ensured that only nuclei within closely clustered groups were retained. Cell types other than thyroid follicular cells (eg, macrophages, lymphocytes) were relatively dispersed in our samples, and were thus excluded by this step. Thyroid follicular cells entrapped in fibrin were also sufficiently dispersed to be excluded. For each group, nuclei comprising the group were then identified and enumerated by applying a final threshold designed to retain only very high-intensity pixels.

Finally, images were scored for adequacy. Standard Bethesda system criteria were applied,19 such that a sample was deemed adequate if there were at least 6 groups, each containing at least 10 cells. Following imaging and automated computer scoring, the entire sample was returned to a ThinPrep vial, and a single ThinPrep slide was prepared and scored for adequacy by a cytopathologist, who was blinded to the automated assessment. The algorithm was optimized to achieve concordance with the cytopathologist on the 12 training cases, and subsequently validated blindly on the 11 test cases.

RESULTS AND CONCLUSIONS

The Table compares the adequacy assessments rendered on the 11 test cases by the cytopathologist and the automated algorithm. One case deemed adequate by the cytopathologist was erroneously scored as inadequate by the algorithm. Otherwise, there was complete agreement. Of note, for the discordant case, the cytopathologist noted that adequacy criteria were barely satisfied, in that there were exactly 6 groups with at least 10 cells. In the digital image analyzed by the algorithm, several nuclei in 1 group were in close proximity and were counted as a single nucleus, yielding a count of 7 cells and resulting in an inadequate designation for that group and an overall inadequate designation for that sample.

To our knowledge, this study is the first to propose a fluorescence-based system for the automated assessment of thyroid FNA adequacy. As presented in this pilot study, the system requires manual intervention at several stages, and thus cannot be described as fully automated.
However, the design was conceived with automation in mind, and we envision the entire process being automated and incorporated into a stand-alone device.

Although our study focused on adequacy assessment, ROSE also provides a preliminary diagnostic impression and the opportunity to appropriately triage specimens, facilitating more rapid clinical decision-making.9 For example, if lymphoma is suspected, flow cytometry can be ordered before a definitive diagnosis is rendered. Our system could be adapted to assist with this aspect of ROSE as well. In particular, the image analysis algorithm constitutes a modular component that could be trained to detect unusual scenarios, such as metastatic carcinoma, lymphoma, or variants of thyroid carcinoma, such as medullary carcinoma. Additional fluorophores could be useful for this task; for example, one could imagine incorporating fluorescently labeled antibodies to lymphoid cells, or to cytokeratins. The digital image acquired by the system could also easily be viewed remotely, facilitating diagnosis via telecytology.20 Likewise, the method could in principle be applied to other specimen types following appropriate training of the image analysis algorithm; we focused on thyroid FNA specimens because these are common specimens with well-established adequacy criteria.

One advantage of the method is that the entire sample can be assessed for adequacy, precluding erroneous adequacy assessments due to incomplete evaluation. On the other hand, direct smears (eg, prepared during

Figure 1. Image analysis algorithm. The algorithm consists of 4 steps. First, the entire field of view is imaged at ×10 magnification (A), yielding a single combined image (B). An intensity threshold is applied to the resulting image, eliminating low-intensity fluorescence likely to represent fibrin or other debris. C, A small area with several cellular groups (4′,6-diamidino-2-phenylindole [DAPI]–stained nuclei are visible as small bright dots within the groups). These groups are retained by the initial thresholding. By comparison, D shows a sample consisting largely of fibrin and debris with a low fluorescence intensity, which is eliminated by the initial intensity thresholding. In the next step, thresholding is again applied, but with a higher minimum intensity threshold. Groups are defined as contiguous regions of retained pixels (E and F). In step 3, a final round of thresholding is applied within each group region, using a high minimum intensity. Retained contiguous pixels are defined as nuclei. Groups containing at least 10 nuclei are colored green; those containing fewer are colored white (F). In step 4, the sample is assessed for adequacy using standard Bethesda criteria (G).
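The thresholding steps in Figure 1 can be sketched in a few lines of code. The authors implemented their algorithm in R with EBImage and did not publish it here, so the following is only an illustrative Python sketch using the standard library. The function names, the toy image, and all three threshold values (`t_debris`, `t_group`, `t_nucleus`) are assumptions made for the example and are not taken from the study.

```python
from collections import deque


def label_components(mask):
    """Return the 4-connected regions of True pixels as a list of pixel sets."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited True pixel.
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), set()
            while queue:
                y, x = queue.popleft()
                pixels.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


def nuclei_per_group(image, t_debris, t_group, t_nucleus):
    """Count nuclei in each group of a grayscale image given as a list of rows."""
    rows, cols = len(image), len(image[0])
    # Step 1: zero out dim pixels likely to be fibrin or other debris.
    cleaned = [[v if v >= t_debris else 0 for v in row] for row in image]
    # Step 2: groups are contiguous regions at or above the group threshold.
    groups = label_components([[v >= t_group for v in row] for row in cleaned])
    counts = []
    for pixels in groups:
        # Step 3: within one group, nuclei are contiguous very bright regions.
        nucleus_mask = [[(y, x) in pixels and cleaned[y][x] >= t_nucleus
                         for x in range(cols)] for y in range(rows)]
        counts.append(len(label_components(nucleus_mask)))
    return counts


# Hypothetical toy image: 30 = debris, 120 = group glow, 250 = nucleus.
toy = [
    [30,  30,   0,   0,   0,   0, 0,   0,  0],
    [0,  120, 250, 120, 250, 120, 0,   0,  0],
    [0,  120, 120, 120, 120, 120, 0,  30, 30],
    [0,  250, 120, 250, 120, 250, 0,   0,  0],
    [0,    0,   0,   0,   0,   0, 0, 250,  0],
]
print(nuclei_per_group(toy, t_debris=50, t_group=100, t_nucleus=200))  # prints [5, 1]
```

With hard thresholds like these, the debris step is subsumed by the group step whenever `t_debris` is not above `t_group`; it is kept here only to mirror the sequence described in the caption. The sketch also reproduces the failure mode reported for the discordant case: if two bright nuclei touch, they form one contiguous region and are counted as a single nucleus.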

Automated Thyroid FNA Adequacy—Schmolze & Fischer

Figure 2. Representative adequate and inadequate images. A, An image containing many groups of moderate intensity, composed of individual nuclei with high intensity. The thresholding algorithm first retains and enumerates the groups, then the nuclei within each group. The image is ultimately scored as adequate if there are at least 6 groups, each with at least 10 nuclei. B, An image consisting largely of fibrin, which exhibits a low fluorescence intensity. The initial thresholding step excludes most of these dim pixels, and the final image is ultimately scored as inadequate.
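The scoring rule stated in the caption (at least 6 groups, each with at least 10 nuclei) reduces to a short predicate. The sketch below is an illustration in Python, not the authors' R code; the function name `is_adequate` and the example counts are hypothetical, while the default cutoffs of 6 and 10 come from the Bethesda criteria cited in the text.

```python
def is_adequate(nuclei_per_group, min_groups=6, min_nuclei=10):
    """Apply the adequacy rule to a list of per-group nucleus counts."""
    # Only groups that reach the nucleus cutoff count toward the group total;
    # smaller groups are ignored rather than disqualifying the sample.
    qualifying = sum(1 for n in nuclei_per_group if n >= min_nuclei)
    return qualifying >= min_groups


# Hypothetical counts: six qualifying groups plus one small group.
print(is_adequate([12, 10, 15, 11, 30, 10, 3]))  # prints True
# Same sample with one borderline group undercounted as 7 nuclei.
print(is_adequate([12, 10, 15, 11, 30, 7, 3]))   # prints False
```

The second call shows why a borderline sample is sensitive to counting errors: when exactly 6 groups qualify, undercounting a single group flips the overall designation.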

conventional ROSE) may improve diagnosis,21 especially for diagnosis of thyroid malignancies.22 One would not be precluded from assessing the needle rinse with our automated method, after making direct smears. The automatable method could be especially useful for individuals performing thyroid FNA who are not adept at making direct smears. The method may also be attractive for large centers performing many thyroid FNAs, in which conventional ROSE is not currently available, potentially increasing the productivity of cytopathologists who would be spared the time-consuming and poorly reimbursed task of assessing adequacy.

The original goal of ROSE was to assure that a diagnosis could be made. However, it is no longer enough to just make a diagnosis: a growing number of ancillary tests are becoming essential for patient management. Many (or most) of the ancillary tests cannot be performed on the material used for ROSE.23 Thus, the material that is rinsed into the specimen vial (and not examined at ROSE) determines the adequacy of the sample for ancillary testing.24,25 Although it may be safe to assume that the part

Table. Algorithm Versus Cytopathologist^a

                            Algorithm
Cytopathologist      Adequate      Inadequate
Adequate                8              1
Inadequate              0              2

^a Results of the image analysis algorithm and of subsequent blinded scoring by a cytopathologist are shown for the test set.
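As a worked illustration of the test-set counts, the sketch below computes sensitivity, specificity, overall agreement, and Cohen's kappa, treating the cytopathologist as the reference and "adequate" as the positive class. These summary statistics are not reported by the authors; they are simple arithmetic on the 11 test cases, and with so few cases they carry wide uncertainty. The function and argument names are hypothetical.

```python
from fractions import Fraction


def agreement_metrics(both_adequate, missed_adequate, false_adequate, both_inadequate):
    """Summarize a 2x2 table with the cytopathologist as the reference.

    both_adequate:   adequate by cytopathologist and by algorithm
    missed_adequate: adequate by cytopathologist, inadequate by algorithm
    false_adequate:  inadequate by cytopathologist, adequate by algorithm
    both_inadequate: inadequate by both
    """
    total = both_adequate + missed_adequate + false_adequate + both_inadequate
    ref_adequate = both_adequate + missed_adequate
    alg_adequate = both_adequate + false_adequate
    observed = Fraction(both_adequate + both_inadequate, total)
    # Chance agreement from the marginal totals, as used in Cohen's kappa.
    expected = (Fraction(ref_adequate * alg_adequate, total * total)
                + Fraction((total - ref_adequate) * (total - alg_adequate),
                           total * total))
    return {
        "sensitivity": Fraction(both_adequate, ref_adequate),
        "specificity": Fraction(both_inadequate, total - ref_adequate),
        "agreement": observed,
        "kappa": (observed - expected) / (1 - expected),
    }


# Counts taken from the test set described in the Results (11 cases).
metrics = agreement_metrics(8, 1, 0, 2)
for name, value in metrics.items():
    print(f"{name}: {value} ({float(value):.2f})")
```

Exact fractions are used so that the small denominators stay visible: a sensitivity of 8/9 and a specificity of 2/2 rest on 9 and 2 reference cases, respectively.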


of the pass that is assessed at ROSE will be identical in relative composition to the material that is put into the needle rinse (eg, 50% tumor cells, with >1000 cells per smear), there are no data on the ability of the ROSE procedure to estimate the proportion of the sample that is on the smear compared with the amount that went into the needle rinse. We are proposing to overcome these important and underappreciated limitations of ROSE by examining the sample that is in the needle rinse container, potentially nondestructively.

This pilot study is limited by the number of cases evaluated. Although the results appear promising in terms of sensitivity and specificity, the small numbers do not permit a robust assessment of accuracy. A larger sample size might expose weaknesses in the image analysis algorithm that were not appreciated in our samples. For example, we only observed relatively dispersed inflammatory cells in our samples, but clusters of such cells might be erroneously counted as groups of thyroid follicular cells. Likewise, fibrin exhibited a lower fluorescence intensity in our samples than groups of follicular cells, and was appropriately discarded along with any entrapped thyroid follicular cells. However, other potential contaminants may interfere with our algorithm. Therefore, our results should be validated with a larger sample size.

References

1. Tunbridge WM, Evered DC, Hall R, et al. The spectrum of thyroid disease in a community: the Whickham survey. Clin Endocrinol (Oxf). 1977;7(6):481–493.
2. Vanderpump MP, Tunbridge WM, French JM, et al. The incidence of thyroid disorders in the community: a twenty-year follow-up of the Whickham Survey. Clin Endocrinol (Oxf). 1995;43(1):55–68.


3. Dean DS, Gharib H. Epidemiology of thyroid nodules. Best Pract Res Clin Endocrinol Metab. 2008;22(6):901–911.
4. Mortensen JD, Woolner LB, Bennett WA. Gross and microscopic findings in clinically normal thyroid glands. J Clin Endocrinol Metab. 1955;15(10):1270–1280.
5. Davies L, Welch HG. Current thyroid cancer trends in the United States. JAMA Otolaryngol Neck Surg. 2014;140(4):317–322.
6. Frates MC, Benson CB, Doubilet PM, et al. Prevalence and distribution of carcinoma in patients with solitary and multiple thyroid nodules on sonography. J Clin Endocrinol Metab. 2006;91(9):3411–3417.
7. American Thyroid Association (ATA) Guidelines Taskforce on Thyroid Nodules and Differentiated Thyroid Cancer, Cooper DS, Doherty GM, et al. Revised American Thyroid Association management guidelines for patients with thyroid nodules and differentiated thyroid cancer. Thyroid Off J Am Thyroid Assoc. 2009;19(11):1167–1214.
8. Nguyen GK, Lee MW, Ginsburg J, Wragg T, Bilodeau D. Fine-needle aspiration of the thyroid: an overview. CytoJournal. 2005;2(1):12.
9. Baloch ZW, Tam D, Langer J, Mandel S, LiVolsi VA, Gupta PK. Ultrasound-guided fine-needle aspiration biopsy of the thyroid: role of on-site assessment and multiple cytologic preparations. Diagn Cytopathol. 2000;23(6):425–429.
10. Choi SH, Han KH, Yoon JH, et al. Factors affecting inadequate sampling of ultrasound-guided fine-needle aspiration biopsy of thyroid nodules. Clin Endocrinol (Oxf). 2011;74(6):776–782.
11. Redman R, Zalaznick H, Mazzaferri EL, Massoll NA. The impact of assessing specimen adequacy and number of needle passes for fine-needle aspiration biopsy of thyroid nodules. Thyroid. 2006;16(1):55–60.
12. Schmidt RL, Witt BL, Lopez-Calderon LE, Layfield LJ. The influence of rapid onsite evaluation on the adequacy rate of fine-needle aspiration cytology. Am J Clin Pathol. 2013;139(3):300–308.
13. Nasuti JF, Gupta PK, Baloch ZW. Diagnostic value and cost-effectiveness of on-site evaluation of fine-needle aspiration specimens: review of 5,688 cases. Diagn Cytopathol. 2002;27(1):1–4.
14. Olson MT, Tatsas AD, Ali SZ. Cytotechnologist-attended on-site adequacy evaluation of thyroid fine-needle aspiration: comparison with cytopathologists and correlation with the final interpretation. Am J Clin Pathol. 2012;138(1):90–95.
15. Eedes CR, Wang HH. Cost-effectiveness of immediate specimen adequacy assessment of thyroid fine-needle aspirations. Am J Clin Pathol. 2004;121(1):64–69.
16. Layfield LJ, Bentz JS, Gopez EV. Immediate on-site interpretation of fine-needle aspiration smears: a cost and compensation analysis. Cancer. 2001;93(5):319–322.
17. R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. 2008. http://www.R-project.org. Accessed July 16, 2018.
18. Pau G, Fuchs F, Sklyar O, Boutros M, Huber W. EBImage–an R package for image processing with applications to cellular phenotypes. Bioinformatics. 2010;26(7):979–981.
19. Cibas ES, Ali SZ. The Bethesda system for reporting thyroid cytopathology. Am J Clin Pathol. 2009;132(5):658–665.
20. Khurana KK. Telecytology and its evolving role in cytopathology. Diagn Cytopathol. 2012;40(6):498–502.
21. Ljung BM. Thyroid fine-needle aspiration: smears versus liquid-based preparations. Cancer. 2008;114(3):144–148.
22. Fischer AH, Clayton AC, Bentz JS, et al. Performance differences between conventional smears and liquid-based preparations of thyroid fine-needle aspiration samples: analysis of 47076 responses in the College of American Pathologists Interlaboratory Comparison Program in Non-Gynecologic Cytology. Arch Pathol Lab Med. 2013;137(1):26–31.
23. Fischer AH, Hutchinson LM. Technical and US regulatory issues in triaging material for the molecular laboratory: commentary. Cancer Cytopathol. 2017;125(2):83–90.
24. Sung S, Crapanzano JP, DiBardino D, Swinarski D, Bulman WA, Saqi A. Molecular testing on endobronchial ultrasound (EBUS) fine needle aspirates (FNA): impact of triage. Diagn Cytopathol. 2018;46(2):122–130.
25. Collins BT, Garcia TC, Hudson JB. Rapid on-site evaluation improves fine-needle aspiration biopsy cell block quality. J Am Soc Cytopathol. 2016;5(1):37–42.
