© 2005 International Society for Analytical Cytology

Cytometry Part A 67A:137–143 (2005)

Whole Cell Segmentation in Solid Tissue Sections

Daniel Baggett,1 Masa-aki Nakaya,2 Matthew McAuliffe,3 Terry P. Yamaguchi,2 and Stephen Lockett4*

1 Worcester Polytechnic Institute, Worcester, Massachusetts
2 Cancer and Developmental Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, Maryland
3 Center for Image Analysis, National Institutes of Health, Bethesda, Maryland
4 Image Analysis Laboratory, SAIC-Frederick, National Cancer Institute, Frederick, Maryland

Received 3 March 2005; Revision Received 2 May 2005; Accepted 3 May 2005

Background: Understanding the cellular and molecular basis of tissue development and function requires analysis of individual cells while in their tissue context.
Methods: We developed software to find the optimum border around each cell (segmentation) from two-dimensional microscopic images of intact tissue. Samples were labeled with a fluorescent cell surface marker so that cell borders were brighter than elsewhere. The optimum border around each cell was defined as the border with an average intensity per unit length greater than that of any other possible border around that cell, and was calculated using the gray-weighted distance transform. Algorithm initiation, requiring the user to mark two points per cell, one approximately in the center and the other on the border, ensured virtually 100% correct segmentation. Thereafter segmentation was automatic.

Results: The method was highly robust, because intermittent labeling of the cell borders, diffuse borders, and spurious signals away from the border do not significantly affect the optimum path. Computer-generated cells with increasing levels of added noise showed that the approach was accurate provided the cell could be detected visually.
Conclusions: We have developed a highly robust algorithm for segmenting images of surface-labeled cells, enabling accurate and quantitative analysis of individual cells in tissue. © 2005 International Society for Analytical Cytology

Key terms: image segmentation; dynamic programming; image analysis; tissue analysis

Tissue development and function and disease-related processes such as tumorigenesis are driven in large part by communication between neighboring cells. Therefore, to understand the actions of tissue, it is necessary to quantitatively analyze individual cells at the molecular level while the cells remain in their natural tissue environment. Such analyses necessitate fluorescence labeling of tissue samples for specific molecules of interest, followed by digitized microscopy to record the labeled samples, and then application of computational algorithms that first segment (delineate) the individual structures of interest (usually cells) and then quantify the molecular properties of each structure. A major bottleneck preventing widespread use of such analyses is the lack of an efficient method to segment individual, whole cells from images of intact tissue. We report an algorithm based on dynamic programming that reliably and efficiently segments individual cells from two-dimensional (2D) microscopic images of tissue. Previous work on analyzing microscope images of fluorescence-labeled tissue samples has predominantly focused on segmenting cell nuclei (1–4), and there are only a few reports of methods for segmenting whole cells (i.e., the cell nucleus plus the surrounding cytoplasm) (5–7). Segmentation of whole cells in intact tissue introduces additional problems compared with isolated cells in culture, because the cells are in a close packing arrangement; in other words, they are in complete contact with their neighbors. This precludes the use of volume labeling of the cells (as is generally done for cell nuclei), and instead requires the use of explicit cell surface labeling. Further, most closely packed cells are not convex, thus morphologic operators cannot be used to separate touching cells as is often done for touching cell nuclei. Instead, methods for segmenting whole cells have searched directly for cell

The content of this publication does not necessarily reflect the views of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. government.
Contract grant sponsor: National Cancer Institute, National Institutes of Health; Contract grant number: NO1-CO56000.
*Correspondence to: Dr. Stephen Lockett, Room 104A, B538, National Cancer Institute at Frederick, P.O. Box B, Frederick, MD 21702. E-mail: [email protected]
Published online 12 September 2005 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/cyto.a.20162


FIG. 1. Segmentation of whole cells. A: Subregion of an image of a mouse embryo labeled with Oregon Green phalloidin for detection of the cell surface. The red dots, C and B, show the point approximately in the center and the point on the surface, respectively, which were indicated by the user. The automatic algorithm uses these user-defined points to convert the images from Cartesian to polar coordinates. B: Polar transform of the image in A. The origin of this image ([R, θ] = [0, 0] = [0, 2π]) is point C in A. Points B are the same point and correspond to point B in A. C: The red line is the globally optimal cell boundary found by the gray-weighted distance transform and is constrained to go through point B. D: Reversal of the Cartesian to polar transform results in the final border around the cell.

borders using algorithms based on the watershed algorithm (6), or they have been initiated with markers inside each cell and searched outward from the markers to the cell borders. For example, Dow et al. (5) used the segmented cell nuclei as initial markers, followed by searching for the optimal boundary around each cell using the "active snake" method (8). In contrast, Ortiz de Solorzano et al. (7) used interactive marking of points inside each cell, followed by gradient-curvature driven flow (9) for finding the cell boundaries. Currently we are not aware of any method that correctly segments every cell when compared with visual examination. This is mainly due to the inherent combination of noise, the finite spatial resolution of the image acquisition system, and non-ideal labeling. However, a possible further limitation of current methods is that they are not explicitly designed to find a globally optimal border surrounding each cell. To ensure that every cell was correctly segmented when compared with the gold standard of visual inspection, and to avoid objects that were visually indistinct, we deliberately included a small amount of proactive user interaction to guide the segmentation procedure. Further, we hypothesize that a dynamic programming approach that ensures that the identified border around each cell is globally optimal is more accurate and reliable than a local optimization method, such as the method described by Ortiz de Solorzano et al. (7). We define the globally optimal border as the border with an average intensity per unit length greater than that of any other possible border around that cell.

One of the earliest applications of dynamic programming to image analysis was by Montanari in 1971 (10), who presented a recursive algorithm that found curves of low curvature across an image, which maximized a figure of merit. That method forms the basis of our method.
Since then, dynamic programming has been used on several occasions for biological image segmentation. For example, Geusebroek et al. (11) found networks of lines in images and demonstrated their method for segmenting cells in heart tissue and for tracing neurites. Kampe and Kober (12) used dynamic programming to find optimal intensity thresholds for segmenting color and gray images, whereas Zeng et al. (13) combined dynamic programming with anisotropic diffusion for finding three-dimensional (3D) sulcal ribbons in magnetic resonance images.

In this study we developed dynamic programming-based software to find the optimal border around each cell from 2D microscopic images of intact tissue labeled with a fluorescent cell surface marker, a process known as "segmentation." To guarantee correct segmentation of every cell as judged by visual inspection, algorithm initiation required the user to mark two points per cell, one approximately in the center and the other on the border, after which segmentation was automatic. Immediately after segmentation of each cell, the user could interactively correct any perceived errors in the detected border or reject the segmentation of that cell.

MATERIALS AND METHODS

Segmentation Method

The segmentation method is illustrated in Figure 1 and is presented as a flow diagram in Figure 2. Input data are 2D, epifluorescence microscopic images of tissue samples where the cell surfaces (Fig. 5) have been specifically fluorescence labeled. For each cell the user indicates a point approximately in the center of the cell and a point on the border (red dots C and B in Fig. 1A). The algorithm performs a Cartesian to polar coordinate remapping on the region of the image centered on point C so that the looped border around the cell becomes a horizontal, but not straight, path (Fig. 1B). Using the gray-weighted distance transform (14) (see Appendix for an explanation), the globally optimal path is determined from point B on the left edge to point B on the right edge in Figure 1B. The optimal path is defined as the path that has the highest average intensity along its length compared with all other possible paths. This averaging property makes the method highly resistant to image noise and gaps in the path. The result is shown in Figure 1C. The final step reverses the initial transformation (Fig. 1D). The method was implemented in MATLAB (version 6.5 or later, The MathWorks Inc., Natick, MA) and DIPimage (image processing toolbox for MATLAB, Delft University of Technology, The Netherlands).
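The Cartesian-to-polar remapping step can be sketched in a few lines. The published implementation used MATLAB and DIPimage; the following is a minimal NumPy/SciPy illustration only, and the function name and the sampling-grid sizes are choices of this sketch, not of the original software:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def cartesian_to_polar(image, center, n_radii=64, n_angles=180):
    """Resample `image` onto an (angle, radius) grid centered on `center`.

    A closed border around `center` becomes a roughly horizontal path in
    the output, running from theta = 0 to theta = 2*pi (cf. Fig. 1B).
    """
    cy, cx = center
    radii = np.arange(n_radii)[np.newaxis, :]            # columns = radii
    angles = np.linspace(0.0, 2.0 * np.pi, n_angles,
                         endpoint=False)[:, np.newaxis]  # rows = angles
    ys = cy + radii * np.sin(angles)
    xs = cx + radii * np.cos(angles)
    # Bilinear interpolation at the polar sample positions.
    return map_coordinates(image, [ys, xs], order=1, mode='constant')
```

In practice `n_radii` would be chosen larger than the largest expected cell radius in pixels, so that the whole border falls inside the remapped region.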


FIG. 2. Flow diagram of the segmentation procedure.

Interactive Correction for Errors

On occasion the algorithm finds a globally optimal border that encompasses two or more cells. The user may correct these errors by marking an additional border point on the edge of the cell of interest (Fig. 3). The algorithm then finds the border in two stages, the first stage being from the initial border point to the additional border point and the second stage from the additional border point back to the initial border point. Interactive correction of the initial border was more often required when a long side of an elongated cell was adjacent to another cell (Fig. 3).

Measurement of Segmentation Accuracy Using Computer-Generated Objects

Circles, squares, triangles, and four-pointed stars were generated with an intensity of 200 along their one-pixel-wide borders and zero intensity inside and outside. The circles were of radius 25 pixels and the other shapes were of similar size. Gaussian-distributed noise with zero mean and standard deviations ranging from 50 to 225 was added to the images, followed by clipping of intensities less than zero and greater than 255 to zero and 255, respectively. Figure 4A shows the circles with noise added and Figure 4C shows the star and square with noise added. The shapes were segmented with the user-indicated center point deliberately placed off the true center by several pixels. Accuracy of the segmentation

FIG. 3. Interactive correction for errors. A: Border around two cells. B: Correction point inserted by the user (green dot). C: Corrected border around one cell.

was determined by calculating the shortest distance from every point along the true border to the detected border using the Euclidean distance transform (15). The average of these distances served as the measure of accuracy.

Preparation, Imaging, and Analysis of Tissue Samples of Whole Cells

Early somite stage (E8.0–5) mouse embryos were obtained from NIH Swiss female mice. Embryos were fixed with 2% paraformaldehyde/phosphate-buffered saline for 30 min at room temperature and stained with Oregon Green 488 Phalloidin (O-7466, Molecular Probes, Eugene, OR, USA) after treatment with 0.1 M glycine, 0.2% Triton X-100/phosphate-buffered saline for 10 min. Phalloidin predominantly labels the filamentous actin found abundantly underlying the cell membrane. The posterior primitive streak region of the embryos was manually dissected and mounted with the SlowFade Light Antifade kit (S-7461, Molecular Probes) on a glass slide. Images were acquired using a 40×, 1.3 numerical aperture oil objective lens and a pinhole of 1 Airy unit on an LSM 510 confocal microscope (Carl Zeiss Inc., Thornwood, NY, USA). Excitation was with 488-nm laser light, and emitted light between 500 and 550 nm was acquired. Thus the effective thickness of the optical sections was 425 nm at the coverslip but increased farther away from the coverslip. Pixel size was 0.14 μm.
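The accuracy measure described above, the mean shortest distance from the true border to the detected border computed via the Euclidean distance transform (15), can be sketched as follows. This is an illustrative NumPy/SciPy version (the function name is hypothetical; the original analysis was done in MATLAB/DIPimage):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def mean_border_deviation(true_border, detected_border):
    """Average, over all pixels of the true border, of the Euclidean
    distance to the nearest pixel of the detected border.

    Both inputs are boolean masks of the same shape in which border
    pixels are True. The Euclidean distance transform of the background
    of the detected border gives, at every pixel, the distance to the
    nearest detected-border pixel; reading it out at the true-border
    pixels and averaging yields the accuracy measure.
    """
    dist_to_detected = distance_transform_edt(~detected_border)
    return dist_to_detected[true_border].mean()
```

Note that the measure is directional: it penalizes true-border points far from the detected border, matching the description in the text.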


FIG. 4. Computer-generated circles, a square, and a four-pointed star with an intensity of 200 along their one-pixel-wide borders and zero intensity inside and outside. Gaussian-distributed noise with zero mean and standard deviations ranging from 25 to 225 in 25 intensity unit increments was added to the images. A: The set of nine circles. B: The segmentation results for the circles in A. C: The segmentation results for a star with Gaussian noise of standard deviation 75 and a square with Gaussian noise of standard deviation 175. D: The average deviation of the shortest distance from every point along the true border to the detected border. The figure is divided into three columns corresponding to noise levels expected in microscopic images of fluorescence-labeled samples, higher noise levels where objects are still visible, and very high noise levels where objects are not visible.
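Test objects like those of Figure 4 can be reproduced approximately as follows. This NumPy sketch (the function name, image size, and seeding are assumptions of the sketch) builds a one-pixel-wide circle border of intensity 200 on a zero background, adds zero-mean Gaussian noise of a chosen standard deviation, and clips the result to the 0-255 range:

```python
import numpy as np

def noisy_circle(size=64, radius=25, border_value=200, noise_sd=75, seed=0):
    """One-pixel-wide circle border of the stated intensity on a zero
    background, plus clipped Gaussian noise, as in Figure 4."""
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[0:size, 0:size]
    d = np.hypot(y - size / 2, x - size / 2)
    # Pixels within half a pixel of the nominal radius form the border.
    img = np.where(np.abs(d - radius) < 0.5, float(border_value), 0.0)
    img += rng.normal(0.0, noise_sd, img.shape)
    return np.clip(img, 0, 255)
```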


FIG. 5. Segmentation of whole cells. A: Left: 2D image of whole cells labeled with Oregon Green phalloidin, which predominantly labels the cell membranes in this sample. Middle: Image overlaid with the borders of the segmented cells. Note that the user did not choose to segment all cells and avoided those cells whose borders were not clearly discernible by visual inspection. Right: Alternative display of the segmented cells. B: Other images of segmented cells.

RESULTS

Segmentation of Computer-Generated Objects

Figure 4B shows the segmentation results for the circles, Figure 4C shows examples of segmentation results for the star and square, and Figure 4D reports the segmentation accuracy for all shapes. An average deviation of less than 1 pixel from the detected to the true border was considered accurate. We observed that the algorithm reliably segmented the shapes, even when the shapes were not visually discernible in the images, although in places there was significant deviation of the detected border from the true border (bottom row of Fig. 4B). The deviation was most significant at the sharp corners of the star and triangle (Fig. 4C, left), where the algorithm is liable to "short cut" through noise inside the object. For the noise levels expected in microscopic images (e.g., top row of Fig. 4A), segmentation was considered accurate.

Segmentation of Whole Cells

Figure 5 shows the results for segmentation of whole cells. Segmentation took several seconds per segmented object, but this could be sped up to real time by compiling the source code. Approximately half of the cells

required interactive correction, frequently because the globally optimal border encompassed more than one cell. Cells were correctly segmented based on visual verification. Although the user needed to indicate the approximate center and a border point of each cell, this level of interaction was not considered a significant burden. Further, this prospective interaction facilitated verification and correction (if any) of each cell immediately after its automatic segmentation. This negated the need for retrospective inspection and interactive correction across the entire image, which is usually the case after application of completely automatic algorithms.

DISCUSSION

We have developed a method to segment objects (such as whole cells) from 2D microscopic images of biological samples. We anticipate that this method will prove valuable for analyzing the spatial organization of heterogeneous cells in tissue or cell culture samples. However, the current 2D method cannot accurately quantify individual cells in tissues, because the 2D image represents only an arbitrary thin slice through the tissue sample. To quantify the features of individual cells, it is necessary to segment the entire volume of each cell from 3D images and, consequently, a future development of this method is its extension to 3D. However, several difficulties are anticipated in this extension, namely the much less efficient dynamic programming algorithms for finding the globally optimal surface in a 3D image (16) and the significantly poorer spatial resolution in the axial dimension compared with the lateral dimensions in 3D microscopy. To overcome these difficulties, it may be necessary to combine dynamic programming with other image processing and analytic approaches. For example, Park and Keller (17) combined four image analysis approaches (the watershed algorithm, snakes, multiresolution analysis, and dynamic programming) to segment 2D images of cells. Other promising approaches applied to the segmentation of cells have included statistical model fitting using a Markov chain Monte Carlo method (18), a gradient velocity flow method to handle the low-contrast boundaries of cells (19), and statistical methods using models of the intensity distribution inside and outside objects (20–22). Further, there are inherent difficulties in visually inspecting 3D images, because it is impossible to present a complete 3D image to the human visual system at one time. This makes verification of the segmentation of all cells, after attempted segmentation of all cells in the image, extremely tedious and error prone. However, directing the user to each cell immediately after its segmentation should remain simple for 3D images. We believe that extension of our method to 3D will be most useful in applications where tens to hundreds of cells per sample require segmentation, with the confidence that the segmentation of each cell chosen for segmentation is accurate based on visual judgment. We believe that this capability will have major applications for understanding cell-to-cell communication during tissue development and carcinogenesis using actual tissue samples or 3D cell culture models.

FIG. 6. Explanation of the gray-weighted distance transform (GWDT) for finding the optimal path across an image. A: A 4 × 4 pixel image with arbitrary intensities assigned to each pixel. B: Result from application of the "city block" distance transform for calculating the distance from the starting pixel (assigned distance 0) to all other pixels. C: Image values in the original image (A) for pixels of distance 3 from the starting pixel are copied into a new image. D: Partial result after application of the GWDT to pixels of distance 2 from the starting pixel. E: Result after completion of the GWDT. F: Optimal paths from all pixels on the right-hand edge to the starting pixel.
Examples include understanding the genetic variation from cell to cell that arises in solid tumors (23), interactions between drug-resistant and drug-sensitive cells (24), and cell polarity in tissue development and homeostasis (25). However, we do not believe this 3D method will be easily adapted to complete automation for analyzing samples where thousands or more cells per sample require segmentation.

The method we present for segmenting whole cells has virtually 100% reliability based on visual inspection. A prerequisite of this segmentation method was that the surfaces of the objects were fluorescence labeled to identify individual, touching cells in intact tissue. Very high reliability in terms of correct segmentation was achieved by introducing two attributes to the segmentation task that are not normally considered when analyzing microscopic images of biological samples. The first attribute was the use of a dynamic programming-based method to obtain a globally optimal border around each selected object. The second attribute was the requirement that the user minimally interact with each object (by marking two points) before automatic segmentation by the algorithm. This proactive involvement by the user eliminated the need for retroactive correction of images after automatic segmentation. We consider this to be a major advantage in terms of minimizing user interaction while achieving virtually 100% correct segmentation.

LITERATURE CITED

1. Ortiz de Solorzano C, Rodriguez EG, Jones A, Sudar D, Pinkel D, Gray JW, Lockett SJ. Automatic nuclear segmentation for 3D thick tissue confocal microscopy. J Microsc 1999;193:212–226.
2. Lockett SJ, Herman B. The automatic detection of clustered, fluorescent-stained nuclei by digital image-based cytometry. Cytometry 1994;17:1–12.
3. Lin G, Chawla MK, Olson K, Guzowski JF, Barnes CA, Roysam B. Hierarchical, model-based merging of multiple fragments for improved three-dimensional segmentation of nuclei. Cytometry 2005;63A:20–33.
4. Wählby C, Sintorn IM, Erlandsson F, Borgefors G, Bengtsson E. Combining intensity, edge and shape information for 2D and 3D segmentation of cell nuclei in tissue sections. J Microsc 2004;215:67–76.
5. Dow AI, Shafer SA, Kirkwood JM, Mascari RA, Waggoner AS. Automatic multiparameter fluorescence imaging for determining lymphocyte phenotype and activation status in melanoma tissue sections. Cytometry 1996;25:71–81.


6. Wählby C, Lindblad J, Vondrus M, Bengtsson E, Bjorkesten L. Algorithms for cytoplasm segmentation of fluorescence labelled cells. Anal Cell Pathol 2002;24:101–111.
7. Ortiz de Solorzano C, Malladi R, Lelièvre SA, Lockett SJ. Segmentation of nuclei and cells using membrane related protein markers. J Microsc 2001;201:404–415.
8. Kass M, Witkin A, Terzopoulos D. Snakes: active contour models. Int J Comput Vis 1988;1:321–331.
9. Sethian J. Level set methods: evolving interfaces in geometry, fluid mechanics, computer vision, and material sciences. Cambridge monographs on applied and computational mathematics. Cambridge: Cambridge University Press; 1997.
10. Montanari U. On the optimal detection of curves in noisy pictures. Commun ACM 1971;14:335–345.
11. Geusebroek J-M, Smeulders AWM, Geerts H. A minimum cost approach for segmenting networks of lines. Int J Comput Vis 2001;43:99–111.
12. Kampe T, Kober R. Nonparametric image segmentation. Pattern Anal Appl 1998;1:145–154.
13. Zeng A, Staib LH, Schultz RT, Tagare H, Win L, Duncan JS. A new approach to 3D sulcal ribbon finding from MR images. Medical image computing and computer-assisted intervention, MICCAI'99 Proc. Lect Notes Comput Sci 1999;1679:148–157.
14. Verwer BJH, Verbeek PW, Dekker ST. An efficient uniform cost algorithm applied to distance transforms. IEEE Trans Pattern Anal Mach Intell 1989;11:425–429.
15. Danielsson PE. Euclidean distance mapping. Comput Graph Image Process 1980;14:227–248.
16. Starink J-PP, Gerbrands JJ. Three-dimensional object delineation by dynamic programming. Bioimaging 1994;2:204–211.
17. Park J, Keller JM. Snakes on the watershed. IEEE Trans Pattern Anal Mach Intell 2001;23:1201–1205.
18. Al-Awadhi F, Jennison C, Hurn M. Statistical image analysis for a confocal microscopy two-dimensional section of cartilage growth. Appl Stat 2004;53:31–49.
19. Zimmer C, Labruyere E, Meas-Yedid V, Guillen N, Olivo-Marin J-C. Segmentation and tracking of migrating cells in videomicroscopy with parametric active contours: a tool for cell-based drug testing. IEEE Trans Med Imaging 2002;21:1212–1221.
20. Hu J, Razdan A, Nielson GM, Farin GE, Baluch P, Capco DG. Volumetric segmentation using Weibull E-SD fields. IEEE Trans Vis Comput Graph 2003;9:320–328.
21. Wang J, Trubuil A, Graffigne C, Kaeffer B. 3-D aggregated object detection and labeling from multivariate confocal microscopy images: a model validation approach. IEEE Trans Syst Man Cybern B 2003;33:572–581.
22. Strasters KC, Gerbrands JJ. Three-dimensional image segmentation using a split, merge and group approach. Pattern Recogn Lett 1991;12:307–325.
23. Chin K, Ortiz de Solorzano C, Knowles D, Jones A, Chou W, Rodriguez EG, Kuo WL, Ljung BM, Chew K, Myambo K, Miranda M, Krig S, Garbe J, Stampfer M, Yaswen P, Gray JW, Lockett SJ. 3D image analysis of thick breast cancer specimens show high cell-to-cell genetic heterogeneity. Nat Genet 2004;36:984–988.
24. Starzec A, Briane D, Kraemer N, Kouyoumdjian J-C, Moretti J-L, Beaupain R, Oudar O. Spatial organization of three-dimensional cocultures of Adriamycin-sensitive and -resistant human breast cancer MCF-7 cells. Biol Cell 2003;95:257–264.
25. Amonlirdviman K, Khare NA, Tree DRP, Chen W-S, Axelrod JD, Tomlin CJ. Mathematical modeling of planar cell polarity to understand domineering nonautonomy. Science 2005;307:423–426.


APPENDIX

Explanation of the Gray-Weighted Distance Transform for Finding the Optimal Path Across an Image

We explain how the gray-weighted distance transform (GWDT) is used to find the globally optimal path from any pixel on the right edge to a specific pixel on the left edge of an image. (This method is also known as "maximum cost path" or "live wire.") The globally optimal path is defined as the path with an average intensity per unit length greater than that of any other possible path between the starting and ending points. For clarity, the method is illustrated on a very small (4 × 4 pixel) image, although the method extends to any sized 2D image. Figure 6A shows the 4 × 4 pixel image with arbitrary intensities assigned to the pixels, which serve as "weights" in the GWDT.

There are two initial steps before commencement of the GWDT. First, the distance from the starting pixel (assigned distance 0) to all the other pixels is calculated using the distance transform. Figure 6B shows these distances using the "city block" version of the transform. (In practice, Chamfer distance metrics are used, which more accurately account for the diagonal distances between pixels.) Second, the intensity values in the original image (Fig. 6A) are copied into a new image for only those pixels with the longest distances from the starting pixel (Fig. 6C).

The GWDT is a recursive algorithm that goes through all the pixels in the image in decreasing order of distance from the starting pixel. At each pixel, the values in the new image of the nearest-neighbor pixels that are farther away from the starting pixel are inspected. The maximum of these values is added to the intensity of the pixel under consideration, taken from the original image. Pixels that do not have nearest-neighbor pixels farther away from the starting pixel are left unassigned. (These pixels occur along the top and bottom edges of the image.) Figure 6D shows the partial result of the GWDT after analyzing all pixels with a distance of 2 from the starting point. The same process is repeated for pixels with distances of 1 and for the pixel with distance 0 (Fig. 6E). The optimal path from any pixel on the right edge to the starting pixel is readily determined by incrementally moving from the current pixel to the nearest-neighbor pixel closer to the starting point that has the highest value in the GWDT image (Fig. 6F).
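The recursion just described can be made concrete with a short sketch. This Python version is an illustration only (the original was MATLAB/DIPimage, and the names here are hypothetical): it computes city-block distances from the starting pixel, seeds the farthest pixels with their own intensities, sweeps the remaining pixels in decreasing distance order, and finally traces the maximum-intensity path outward from the starting pixel. In practice, Chamfer distance metrics and the polar-transformed cell image would replace the plain city-block metric and a toy array:

```python
import numpy as np

def gwdt_path(image, start):
    """GWDT per the appendix, plus a forward trace of the optimal path."""
    h, w = image.shape
    sy, sx = start
    yy, xx = np.mgrid[0:h, 0:w]
    dist = np.abs(yy - sy) + np.abs(xx - sx)   # city-block distances (Fig. 6B)
    gwdt = np.full((h, w), -np.inf)            # -inf marks "unassigned"
    far = dist == dist.max()
    gwdt[far] = image[far]                     # farthest pixels seed the recursion (Fig. 6C)

    def neighbors(y, x):
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w:
                yield ny, nx

    # Sweep pixels in decreasing distance order (Figs. 6D, 6E).
    for d in range(dist.max() - 1, -1, -1):
        for y, x in zip(*np.where(dist == d)):
            vals = [gwdt[n] for n in neighbors(y, x) if dist[n] > d]
            if vals and max(vals) > -np.inf:
                gwdt[y, x] = image[y, x] + max(vals)
            # otherwise the pixel stays unassigned, as in the appendix

    # Trace the optimal path outward from the start pixel by always moving
    # to the farther neighbor with the largest accumulated value (Fig. 6F,
    # traversed in the opposite direction).
    path, cur = [start], start
    while dist[cur] < dist.max():
        cur = max((n for n in neighbors(*cur) if dist[n] > dist[cur]),
                  key=lambda n: gwdt[n])
        path.append(cur)
    return gwdt, path
```

On a toy 4 × 4 image whose second row is much brighter than the rest, the accumulated value at the starting pixel equals the total intensity along the brightest path, and the trace follows that row before stepping to the farthest corner.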