COMBINING REGION AND EDGE INFORMATION TO EXTRACT FISH OOCYTES IN HISTOLOGICAL IMAGES

P. Anta, P. Carrión and A. Formella
Dpto. de Informática, ESEI, Universidade de Vigo
Campus Universitario As Lagoas, 32004 Ourense (Spain)
[email protected]

E. Cernadas
Dpto. de Electrónica e Computación, Universidade de Santiago de Compostela
15782 Santiago de Compostela (Spain)
[email protected]

R. Domínguez and F. Saborido-Rey
Instituto de Investigacións Mariñas, Consejo Superior de Investigaciones Científicas (CSIC), Vigo (Spain)
[email protected]

ABSTRACT
The study of the biology and population dynamics of fish species requires the routine estimation of fecundity in individual fish in many fisheries laboratories. The traditional procedure used in fisheries research is to count the oocytes manually in a subsample of known weight of the ovary, and to measure a few oocytes under a binocular microscope. This process can be done on a computer using an interactive tool to count and measure oocytes. In both cases, the task is very time consuming, which implies that fecundity studies are rarely conducted routinely. We attempt to design an algorithm able to extract the oocytes in a histological image. In a previous work [1], a statistical comparison of the performance of region and edge segmentation approaches was presented. The results were encouraging, but the edge-based approach required manually marking the centers of the cells of interest. This paper proposes a non-guided method to extract the cells of interest based on edge information. In a post-processing stage, the segmentation results of the region and edge approaches are combined to improve the performance. The rate of oocyte detection has been increased to 82% when the demanded overlap between machine detection and true oocyte area is greater than 75%.

KEY WORDS
Segmentation, Image analysis, Fish oocytes, Fecundity, Histological images

1 Introduction
The description of the reproductive strategies and the assessment of fecundity are fundamental topics in the study of the biology and population dynamics of fish species [2]. Studies on reproduction, including fecundity, the assessment of size at maturity, the duration of the reproductive season, daily spawning behavior and spawning fraction permit a quantification of the reproductive capacity of individual

fish [3]. This information increases the knowledge that fisheries researchers need to improve the standard assessments of many commercially valuable fish species. Moreover, the establishment of extensive databases on stock reproduction parameters, in combination with corresponding data on abiotic factors, enables the study of causal relationships between reproductive potential and the effects of fishing pressure and environmental variation [4]. This requires the estimation of parameters such as fecundity in a routine way in many fisheries laboratories. To estimate female fish fecundity, researchers need to count the number of oocytes (i.e., the ovarian cells that are precursors of the ovules) that are developing within the breeding season. There is a threshold diameter at which developing oocytes can be separated from non-developing ones, which varies from species to species. Thus, it is critical to know at what developmental stage the ovary is analyzed. This is normally achieved by measuring the oocyte diameter and obtaining a size distribution of the developing oocytes. Currently, researchers in this field use two basic approaches to achieve this goal [4]: i) counting the oocytes directly in a subsample of known weight of the ovary and measuring the oocytes under a binocular microscope, and ii) using a stereological approach on histological sections. Method i) is the traditional methodology and does not require expensive equipment. Nevertheless, it is so time consuming that this has prevented its routine implementation in most laboratories, and fecundity studies are rarely conducted routinely. In recent years, this process can be done on a computer using commercial image analysis software. Because counting and measuring the oocytes are performed in the same step aided by image analysis, this results in a faster procedure. One of the strongest limitations of this method is that it is still necessary to screen the sample histologically.
Because histological sections have to be prepared anyway, many researchers take advantage of them, counting and measuring the normal oocytes in the section directly,

Figure 1. Typical histological image of fish ovary (left) and its corresponding set of annotated oocytes (right).

using stereological techniques (method ii). Figure 1 shows an example of a histological section of fish ovary and its corresponding set of oocytes of interest (annotated oocytes). The method most often used to measure and count the oocytes is Weibel stereometry, which counts the number of points of a fixed grid that lie upon sections of the object in question (oocytes). Weibel stereometry needs a large number of sample points for reliable estimation, so that it can be a tedious technique to apply in practice. To automate this process, each oocyte in the histological section must be identified. Additionally, the diameter has to be measured only in those oocytes where the nucleus is visible; otherwise, the measured diameter underestimates the real one. Our aim is the design of a computer vision system which automatically recognizes oocytes in histological images and can run on-line. In the preliminary work [1], we only focused on extracting the silhouettes of the oocytes; the centroids of the interesting cells were known (because they had been manually marked by an expert). In this work, we attempt to extract the cells of interest without the marks. The paper is organized as follows: Section 2 describes the technical details of image acquisition, Section 3 provides a brief review of segmentation techniques and summarizes the region-growing segmentation approach proposed in [1], and Section 4 describes the non-guided edge-based segmentation approach to extract the contours of the oocytes in the histological image; the combination of the results provided by the region and edge approaches is also described there. Section 5 presents and discusses the obtained statistical results and, finally, Section 6 draws the conclusions.
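The Weibel-style point counting described above can be sketched in a few lines. This is a toy illustration of the counting principle only (the mask, grid spacing, and function name are invented for the example), not part of the system proposed in this paper:

```python
# Toy sketch of Weibel-style point counting: overlay a regular grid on a
# binary mask (True = pixel belongs to an oocyte section) and count how
# many grid points fall on objects. The hit fraction estimates the area
# fraction occupied by the objects.
def point_count(mask, spacing):
    hits = total = 0
    for y in range(0, len(mask), spacing):
        for x in range(0, len(mask[0]), spacing):
            total += 1
            if mask[y][x]:
                hits += 1
    return hits, total

# 8x8 mask with a 4x4 "oocyte" in the top-left corner.
mask = [[y < 4 and x < 4 for x in range(8)] for y in range(8)]
hits, total = point_count(mask, 2)
print(hits, total, hits / total)  # 4 16 0.25: a quarter of the grid hits
```

The estimate only becomes reliable with many sample points, which is exactly the tedium that motivates automating the process.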

2 Image acquisition
Ovaries were embedded and sectioned using standard histological procedures. Sections were stained with Haematoxylin-Eosin, which produces a wide range of stained structures with good contrast. Images were obtained from the histological sections using a LEICA DRE research microscope with a total magnification of 6.3x and a calibration of 0.547619 microns per pixel. A LEICA DF C320 digital camera directly connected to the microscope, combined with the Leica IM 50 software, was used to digitize the images. The camera resolution was set to 3.3 Mpixels (2088 × 1550 pixels). The camera has square pixels of 3.45 µm. Exposure time and color balance were set automatically.
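The stated calibration is consistent with the camera pixel size and the total magnification: the specimen area covered by one image pixel is the physical pixel size divided by the magnification. A quick check (assuming no further scaling in the optical path):

```python
pixel_size_um = 3.45   # physical size of one (square) camera pixel, in microns
magnification = 6.3    # total magnification of the microscope setup

calibration = pixel_size_um / magnification  # microns per image pixel
print(round(calibration, 6))  # 0.547619, matching the reported calibration
```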

3 Overview of previous approaches
In order to determine the diameter and area of the oocytes, the first task is to segment the cells in a histological image. Segmentation consists of subdividing an image into its constituent parts and extracting those parts of interest (in our case, the oocytes). A broad literature on image segmentation [5] is available. Threshold methods often fail to produce accurate segmentations on images containing shadings, highlights, non-uniform illumination or texture. Many other segmentation methods are based on two basic properties of pixels in relation to their local neighborhood: discontinuity and similarity. Pixel discontinuity gives rise to boundary-based methods [6] and pixel similarity gives rise to region-based methods [7, 8]. Classical boundary-based methods include first- and second-order derivative edge detectors. Other alternative techniques include active contours or snakes [9] and post-processing methods to link broken edges previously detected by classical boundary methods [5]. Region-based segmentation includes three kinds of methods: seeded region growing [7, 8], split and merge techniques [10, 11], and watersheds [12]. Recently, Muñoz et al. [13] reviewed several strategies to integrate region and boundary information for segmentation, and many algorithms have been proposed in the literature [14, 15]. All segmentation techniques mentioned so far use only local information to segment the objects of interest in the image. Model-based segmentation methods require a priori specific knowledge about the geometrical shape of the objects. Segmentation techniques included in this group are the Hough transform [16], fuzzy approaches [17], and split and merge of regions based on shape models [10].

Standard seeded region-growing algorithms [7] iteratively merge seed regions with adjacent pixels if their dissimilarity is below a fixed threshold. This method suffers from the fact that the resulting segmentation depends on the choice of the seeds and the threshold. In a previous work [1], we proposed a fully automatic segmentation algorithm which determines both parameters from the original image. Initially, the original RGB image is converted to a grey-level image. Our approach consists of four steps: seed generation, threshold calculation, region growing and post-processing. In the segmentation process, we consider that the foreground is the cytoplasm and the background are the oocytes, so we choose the seed point in the cytoplasm as follows. We experimentally observed that oocytes are normally textured and darker than the cytoplasm, which often contains broad, bright regions. Seeds should be placed within sufficiently large uniform regions of bright pixels. Sweeping over the image, the first square of 24 × 24 pixels in which 95% of the values are higher than 200 is selected as the seed point. The threshold T to segment the image is automatically determined from the statistical characteristics of the image using Otsu's method [18]. The region-growing algorithm implemented by the ITK library (an open-source C++ library for computer vision research, www.itk.org) is used. The resulting binary image of the region-growing process does not exhibit perfect and closed regions, and some cells are even incorrectly joined. So, a strong closing morphological operator is applied to break bridges between cells while preserving the sizes of the objects. We use a filter of 10 × 10 pixels. Finally, since the cells have a minimum size, the binary image is filtered to remove all regions with fewer than 2500 pixels.

4 Methods
We first describe the non-guided edge-based segmentation method and then the method to integrate the results of the edge and region segmentation approaches.

4.1 Non-guided edge-based segmentation approach
The method is applied to the green band of the RGB color image due to the fact that the cell edges are more prominent there than in the grey-level version. Let I be a histological image (green band), z_k = (x_k, y_k) the coordinates of the k-th point in the image, and C_j the j-th edge for j = 1, ..., N_C (C_j contains the set of points z_k that compose the j-th edge). Initially, first- and second-order derivatives are applied to I in order to detect the edges contained in the image. In both cases, the implementations provided by the modules osl and gevd of the VXL library (a collection of open-source C++ libraries designed for computer vision research, vxl.sourceforge.net) are used. The first-order derivative filter is the Canny filter. In the second-order filter, only the peaks are considered. In both cases, the implementations incorporate an edge-linking step. The sets of edges given by both detectors are joined. Let S_C = {C_j, j = 1, ..., N_C} be the set of edges detected in the image, and let S_P be the set of polygons representing all detected oocytes, P_j being the polygon defined by the points z_k ∈ C_j. Many undesired edges are detected in the images. They may be classified into two types: edges due to the textured appearance of the inner parts of the oocytes, and the outlines of undesired cells (cells which are not of interest for fecundity estimation). Some heuristic rules are applied to remove the undesired edges. This process is critical, because some interesting edges are broken and may be confused with noise edges. Experts frequently do not describe or quantify the way in which they select the oocytes of interest to estimate the fecundity of fish. The only parameter that they currently use is the minimum size (diameter) of the cells: all cells smaller than 110 pixels are discarded. Taking these ideas into account, we first apply a filter to the set of edges S_C in order to remove the noisy edges. This filter removes the edges which are contained in a box of size d × d. We adopt a conservative criterion to preserve true cell edges that are broken; a d of 35 pixels is used. This step removes edges which enclose small areas. Edges due to the cytoplasm may be preserved after this filter; they are normally large edges enclosing small areas. In order to remove this kind of edge, an area filter is applied to the set S_P (an area of 2500 pixels is used).

Once the noise edges are removed, broken edges should be merged in order to extract the silhouettes of the interesting cells. For this purpose, we propose to join polygons when their intersection is significant, i.e., let S_P = {P_j, j = 1, ..., N_C} be a set of polygons and A(P_j) the area enclosed by the polygon P_j. We define a new set of polygons S'_P = {P_ij, i, j = 1, ..., N_C}, where

    P_ij = P_i ∪ P_j   if A(P_i ∩ P_j) > a × A(P_i) or A(P_i ∩ P_j) > a × A(P_j),
    P_ij = P_i         otherwise,                                              (1)

and a is the overlapping rate; a value of a = 0.05 is used. The value of this parameter is also critical: lower values of a do not merge broken edges of the same cell, and higher values of a may merge edges belonging to different cells. Afterwards, the convex hull of each polygon of the resulting set is calculated to fill holes inside the cells. Finally, a post-processing step is applied to remove the cells whose diameter is smaller than 110 pixels.

Figure 2. A region of interest of a histological image and the oocytes detected (outlined) by the region-growing algorithm (a), the edge detection algorithm (b), and the integration of edge and region information (c).

4.2 Integrating edge and region information
The segmentation results provided by the edge and region approaches are integrated to combine region and edge information. Let S_e and S_r be the sets of polygons extracted by the edge and region segmentation approaches. The final set of polygons S_f is calculated taking into account the better behaviour of the region-based approach in outlining the cells. The final set is defined by joining the polygons of the set S_r and the polygons of S_e that do not intersect with any polygon in S_r, i.e., S_f = {P_i ∈ S_r} ∪ {P_j ∈ S_e : P_j ∩ P_i = ∅ for all P_i ∈ S_r}, with i = 1, ..., C_r and j = 1, ..., C_e. An example of the visual performance of the algorithms is shown in Figure 2. The black points inside the oocytes indicate which cells are interesting for fecundity estimation, and thus they form the set of cells that our system tries to detect.
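The two set operations above, the overlap-driven merge of Eq. (1) and the construction of S_f, can be sketched with axis-aligned rectangles (x0, y0, x1, y1) standing in for the detected polygons. This is a simplified illustration under that assumption, not the paper's implementation; real polygon intersection and union would need a geometry library, and all names here are invented for the sketch:

```python
# area / intersection / union for axis-aligned rectangles (x0, y0, x1, y1).
def area(r):
    return max(0, r[2] - r[0]) * max(0, r[3] - r[1])

def intersection(r, s):
    return (max(r[0], s[0]), max(r[1], s[1]), min(r[2], s[2]), min(r[3], s[3]))

def union_box(r, s):  # bounding box of the union (stands in for P_i U P_j)
    return (min(r[0], s[0]), min(r[1], s[1]), max(r[2], s[2]), max(r[3], s[3]))

def merge_overlapping(shapes, a=0.05):
    # Eq. (1): join two shapes when their intersection exceeds a fraction
    # `a` of EITHER area; repeat until no pair can be merged.
    shapes = list(shapes)
    merged = True
    while merged:
        merged = False
        for i in range(len(shapes)):
            for j in range(i + 1, len(shapes)):
                inter = area(intersection(shapes[i], shapes[j]))
                if inter > a * area(shapes[i]) or inter > a * area(shapes[j]):
                    shapes[i] = union_box(shapes[i], shapes[j])
                    del shapes[j]
                    merged = True
                    break
            if merged:
                break
    return shapes

def integrate(region_polys, edge_polys):
    # S_f: keep every region polygon, plus the edge polygons that
    # intersect none of the region polygons.
    keep = [e for e in edge_polys
            if all(area(intersection(e, r)) == 0 for r in region_polys)]
    return list(region_polys) + keep

# Two heavily overlapping edge fragments merge into one cell outline;
# an isolated edge detection survives integration with the region set.
edges = merge_overlapping([(0, 0, 10, 10), (8, 0, 18, 10), (30, 30, 40, 40)])
final = integrate([(0, 0, 18, 10)], edges)
print(edges)  # [(0, 0, 18, 10), (30, 30, 40, 40)]
print(final)  # [(0, 0, 18, 10), (30, 30, 40, 40)]
```

With a = 0.05 as in the paper, two fragments of the same cell outline that overlap by more than 5% of either area collapse into one shape, while the isolated detection at (30, 30, 40, 40) is kept by the integration step because it touches no region polygon.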

5 Results and discussion

5.1 Data set
The data set contains 8 images. The position and the true contour of each cell have been marked by an expert in order to evaluate the performance of the segmentation algorithms. The total number of oocytes identified is 451, ranging from 46 to 69 per image. Figure 1 shows an image of the fish oocytes and its corresponding set of annotated oocytes.

5.2 Results
The performance of the automatic segmentation is evaluated using statistical measures. As in previous works, we use the measures proposed by Hoover et al. [19] to compare segmentation algorithms, i.e., correct detection, over-segmentation, under-segmentation, missed and noise regions (see the appendix for definitions). We calculate a rate of instances for each image in relation to each tolerance T and average the results over all the images. The results of applying each of the algorithms described in Sections 4.1 and 4.2, as well as the region-growing algorithm of our previous work [1], to the data set of images are shown in Figure 3. The left graph shows the percentage of oocytes correctly detected for every approach and different tolerances, and the right one shows the rate of instances of noise regions for each algorithm. In our previous work [1], the rate of correct instances was 74% when an area overlap of 75% was demanded. Though the percentage of oocytes detected by our new non-guided edge approach is lower (66%) on its own, it extracts useful information from the images. By combining the region information provided by the region-growing algorithm with our new edge-based algorithm, we have been able to segment a larger number of cells, reaching 82% correct detection (0.82 ± 0.13, mean ± standard deviation among images). The integration of edge and region information has also improved the performance for higher values of the tolerance: 80% of the oocytes are localized for a tolerance of 0.80, and 75% for a tolerance of 0.85. The percentage of correct detections decreases for higher values of the tolerance. The over- and under-segmentation rates are always lower than 1% and 5%, respectively. In Figure 4, we analyse the system performance in relation to the expected oocyte size in order to study the behaviour of the algorithms with this parameter. The oocytes are grouped into eight intervals of areas in pixels. The left graph shows the percentage of annotated oocytes in each interval (e.g., 25% of the oocytes have an area in the interval [10000, 15000) pixels) and the percentage of detected oocytes of the integration strategy for each interval of area at a tolerance of 0.75.
In the right graph, the correct detection rate of the integration of edge and region information for


Figure 3. Correct detection (left) and noise region (right) rates for the region- and edge-based segmentation approaches and the region-edge combination, for different levels of tolerance.

each interval is shown. It is important to emphasize that the system behaviour is relatively uniform over the size of the oocytes, ranging from 75% to 100%.

6 Conclusion
The initial aim of this research was to prove the feasibility of identifying the true boundaries of fish cells using region- and edge-based segmentation techniques. With our first work [1], we obtained a significant recognition rate (74-75%) with an acceptable level of accuracy (overlap between detected and annotated regions greater than 75%). However, it had the disadvantage that an expert's guidance was needed to mark the centers of the oocytes. This new proposal exploits the information extracted independently by our region- and edge-based methods, and integrates them in a post-processing step. This strategy has significantly improved the previous results; our system is able to localize 82% of the oocytes of interest contained in a typical histological image. It has the additional advantage that it does not need any kind of a priori knowledge about the image: the system does not need an expert to mark the cells. The results of the segmentation could be optimized using local segmentation in addition to the global information used up to now. Another thread of research is the incorporation of a classification stage that takes into account the inner cell textures, in order to reduce the detection of undesired cells and to analyse the developing stage of the oocytes.

Appendix: Statistical measures
Let D^i be the number of recognized cells and A^i the number of annotated cells in image I_i. Let the number of pixels of each recognized cell R^i_d, d = 1, ..., D^i, be called P^i_d. Similarly, let the number of pixels of each annotated cell R^i_a, a = 1, ..., A^i, be called P^i_a. Let O^i_da = |R^i_d ∩ R^i_a| be the number of overlapping pixels between R^i_d and R^i_a. Thus, if there is no overlap between two regions, O^i_da = 0, while if the overlap is complete, O^i_da = P^i_a = P^i_d. Let a threshold T be a measure of the strictness of the overlap (0.5 ≤ T ≤ 1). A region can be classified into the following types:

• A pair of regions R^i_d and R^i_a is classified as an instance of correct detection if O^i_da ≥ P^i_d × T (at least a fraction T of the pixels of region R^i_d in the detected image overlap with R^i_a) and O^i_da ≥ P^i_a × T.

• A region R^i_a and a set of regions R^i_d1, ..., R^i_dx, 2 ≤ x ≤ D^i, are classified as an instance of over-segmentation if, for all q in {1, ..., x}, O^i_dqa ≥ P^i_dq × T, and Σ_{q=1}^{x} O^i_dqa ≥ P^i_a × T.

• A set of regions R^i_a1, ..., R^i_ay, 2 ≤ y ≤ A^i, and a region R^i_d are classified as an instance of under-segmentation if Σ_{r=1}^{y} O^i_dar ≥ P^i_d × T and, for all r in {1, ..., y}, O^i_dar ≥ P^i_ar × T.

• A region R^i_a that does not participate in any instance of correct detection, over-segmentation, or under-segmentation is classified as missed.

• A region R^i_d that does not participate in any instance of correct detection, over-segmentation, or under-segmentation is classified as noise.
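The correct-detection criterion can be made concrete with pixel sets. A minimal sketch (the region contents and names are invented for the example; over- and under-segmentation would extend the same test to groups of regions):

```python
# A detected/annotated pair is a correct detection when their overlap
# covers at least a fraction T of BOTH regions (Hoover et al. [19]).
# Regions are represented as sets of (x, y) pixel coordinates.
def is_correct_detection(detected, annotated, T=0.75):
    overlap = len(detected & annotated)
    return overlap >= T * len(detected) and overlap >= T * len(annotated)

annotated = {(x, y) for x in range(10) for y in range(10)}    # 100 pixels
detected = {(x, y) for x in range(1, 10) for y in range(10)}  # 90 pixels
print(is_correct_detection(detected, annotated))  # True: 90 covers 75% of both
```

A detection covering only half of the annotated cell would fail the second test and, if it matched nothing else, end up counted as a noise region.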

Acknowledgment This investigation was supported by the Xunta de Galicia (regional government) projects PGIDIT04RMA305002PR and PGIDIT04RMA402003PR.

Figure 4. Size distribution of detected oocytes by the edge-region combination and annotated oocytes (left); and correct detection rates for every size interval for the edge-region integration, with tolerance = 0.75 (right).

References
[1] S. Alén, E. Cernadas, A. Formella, R. Domínguez, and F. Saborido-Rey. Comparison of region and edge segmentation approaches to recognize fish oocytes in

histological images. Lecture Notes in Computer Science, 4142:853–864, 2006.

[2] J. R. Hunter, J. Macewicz, N. C. H. Lo, and C. A. Kimbrell. Fecundity, spawning, and maturity of female Dover Sole, Microstomus pacificus, with an evaluation of assumptions and precision. Fishery Bulletin, 90:101–128, 1992.

[3] H. Murua and F. Saborido-Rey. Female reproductive strategies of marine fish species of the North Atlantic. Journal of Northwest Atlantic Fishery Science, 33:23–31, 2003.

[4] H. Murua, G. Kraus, F. Saborido-Rey, P. Witthames, A. Thorsen, and S. Junquera. Procedures to estimate fecundity of marine fish species in relation to their reproductive strategy. Journal of Northwest Atlantic Fishery Science, 33:33–54, 2003.

[5] W. K. Pratt. Digital Image Processing. Wiley-InterScience, 2001.

[6] J. F. Canny. A computational approach to edge detection. IEEE Trans. on Pattern Analysis and Machine Intelligence, 8(6):679–698, 1986.

[7] R. Adams and L. Bischof. Seeded region growing. IEEE Trans. on Pattern Analysis and Machine Intelligence, 16(6):641–647, 1994.

[8] J. Fan, G. Zeng, M. Body, and M.-S. Hacid. Seeded region growing: an extensive and comparative study. Pattern Recognition Letters, 26:1139–1156, 2005.

[9] M. Kass, A. Witkin, and D. Terzopoulos. Snakes: Active contour models. International Journal of Computer Vision, 1(4):321–331, 1987.

[10] L. Liu and S. Sclaroff. Deformable model-guided region split and merge of image regions. Image and Vision Computing, 22:343–354, 2004.

[11] R. Nock and F. Nielsen. Statistical region merging. IEEE Trans. on Pattern Analysis and Machine Intelligence, 26(11):1452–1458, 2004.

[12] L. Vincent and P. Soille. Watersheds in digital spaces: An efficient algorithm based on immersion simulations. IEEE Trans. on Pattern Analysis and Machine Intelligence, 13(6):583–598, 1991.

[13] X. Muñoz, J. Freixenet, X. Cufí, and J. Martí. Strategies for image segmentation combining region and boundary information. Pattern Recognition Letters, 24:375–392, 2003.

[14] J. Gambotto. A new approach to combining region growing and edge detection. Pattern Recognition Letters, 14:869–875, 1993.

[15] C. G. Zhao and T. G. Zhuang. A hybrid boundary detection algorithm based on watershed and snake. Pattern Recognition Letters, 26:1256–1265, 2005.

[16] C.-P. Chau and W.-C. Siu. Adaptive dual-point Hough transform for object recognition. Computer Vision and Image Understanding, 96:1–16, 2004.

[17] S.-H. Leung, S.-L. Wang, and W.-H. Lau. Lip image segmentation using fuzzy clustering incorporating an elliptic shape function. IEEE Trans. on Image Processing, 13(1):51–62, 2004.

[18] N. Otsu. A threshold selection method from gray-level histograms. IEEE Trans. on Systems, Man and Cybernetics, 9(1), 1979.

[19] A. Hoover, G. Jean-Baptiste, X. Jiang, P. J. Flynn, H. Bunke, D. B. Goldgof, K. Bowyer, D. W. Eggert, A. Fitzgibbon, and R. B. Fisher. An experimental comparison of range image segmentation algorithms. IEEE Trans. on Pattern Analysis and Machine Intelligence, 18(7):673–689, 1996.
