A novel automatic seed point selection algorithm for breast ultrasound images

Juan Shan, H. D. Cheng, Yuxuan Wang
Dept. of Computer Science, Utah State University, Logan, UT 84322
[email protected],
[email protected],
[email protected]

Abstract

Region growing is a frequently used segmentation method for medical ultrasound image processing. The first step of region growing is selecting a seed point inside the breast lesion. Most region growing methods require the seed point to be selected manually, which needs human interaction. To make the segmentation completely automatic, we propose a new automatic seed point selection method for region growing algorithms. The method is validated on our database of 105 ultrasound images with breast masses and compared with another automatic seed point selection method on the same database. Quantitative experimental results show that the proposed method finds proper seed points for 95.2% of the US images in the database, which is much more robust than other automatic seed point selection methods.
1. Introduction

Ultrasonography has been one of the most powerful techniques for imaging organs and soft tissue structures in the human body [1]. It has been used for breast cancer detection in recent years because of the advantages of ultrasound (US) imaging, such as no radiation, sensitivity to dense breasts, a low false positive rate, portability and low cost. Therefore, US imaging has become one of the most important diagnostic tools for breast cancer detection. However, due to the nature of US imaging, the images always suffer from poor quality caused by speckle noise, low contrast, blurred edges and shadow effects. These make the segmentation of the lesions of interest quite difficult.

978-1-4244-2175-6/08/$25.00 ©2008 IEEE

One of the frequently used segmentation methods is region growing [2, 3]. A seed point is the starting point for region growing, and its selection is very important for the segmentation result. If a seed point is selected outside the region of interest (ROI), the final segmentation result will certainly be incorrect. Due to the low quality of US images, most of these region growing methods require the seed point to be selected manually in advance. In order to make region growing segmentation fully automatic, it is necessary to develop an automatic and accurate seed point selection method for US images. However, little work has been done in this area, and the relevant methods are rare and immature. [4] proposed an automatic method to select seed points for masses using both co-occurrence features and run length features. The run length features are calculated around the points selected by the co-occurrence features; if all the run length features of a selected point and its neighborhood points are equal, the point is considered a seed point [4]. In [5], after several preprocessing steps, a seed point score formula is used to evaluate a set of randomly selected points, and the point with the highest score is taken as the seed point. In another method [6], after preprocessing and morphological operations, a binary image is obtained and the sums of the pixels on each row and column are computed; the indexes of the seed point are the row and column numbers with the maximum sums, respectively. All the existing methods take into account only the statistical texture features of a mass region (i.e., the mass is darker than the surrounding tissues and more homogeneous than other regions), and seldom consider the spatial features of a US mass (e.g., a mass frequently appears in the upper part of the image and rarely touches the image boundary). Therefore, the probability of a selected
seed point falling outside the lesion is high, especially in noisy and low-contrast images. In this paper, we develop a new automatic seed point selection method for US images. The method not only considers the texture features of a lesion, but also incorporates its spatial characteristics. We describe our method in detail in Section 2. The new method is compared with Madabhushi's automatic seed point selection method [5] on the same database in Section 3. Finally, Section 4 draws conclusions.
2. Method

The automatic seed point selection method is composed of five steps, described below.
2.1. Speckle reduction

We employ speckle reducing anisotropic diffusion (SRAD) [7] as the de-speckling method. SRAD iteratively processes the noisy image with adaptive weighted filters, reducing noise while preserving edges. The diffusion coefficient is determined by

c(q) = \frac{1}{1 + [q^2(x, y; t) - q_0^2(t)] / [q_0^2(t)(1 + q_0^2(t))]}

and the instantaneous coefficient of variation is

q(x, y; t) = \sqrt{ \frac{(1/2)(|\nabla I| / I)^2 - (1/16)(\nabla^2 I / I)^2}{[1 + (1/4)(\nabla^2 I / I)]^2} }.

The scale function q_0(t) is given by

q_0(t) = \frac{\sqrt{\mathrm{var}[z(t)]}}{\overline{z(t)}},

where z(t) is the most homogeneous area at time t. In our experiments, we set the number of iterations to 10.
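For illustration, one SRAD iteration can be sketched as follows. This is a simplified sketch, not the implementation used in the experiments: the divergence term is approximated by c·∇²I, wrap-around boundaries are used, and the step size dt and the clipping of c to [0, 1] are stabilizing assumptions.

```python
import numpy as np

def estimate_q0(patch):
    """q0(t) = sqrt(var[z(t)]) / mean[z(t)] over a homogeneous area z(t)."""
    z = patch.astype(np.float64)
    return np.sqrt(z.var()) / (z.mean() + 1e-8)

def srad_step(img, q0, dt=0.05, eps=1e-8):
    """One simplified SRAD iteration with diffusion coefficient c(q)."""
    I = img.astype(np.float64) + eps
    gy, gx = np.gradient(I)                        # gradient components
    grad2 = gx ** 2 + gy ** 2                      # |grad I|^2
    lap = (np.roll(I, 1, 0) + np.roll(I, -1, 0) +  # discrete Laplacian
           np.roll(I, 1, 1) + np.roll(I, -1, 1) - 4.0 * I)
    # instantaneous coefficient of variation, squared: q^2(x, y; t)
    num = 0.5 * grad2 / I ** 2 - (1.0 / 16.0) * (lap / I) ** 2
    den = (1.0 + 0.25 * lap / I) ** 2
    q2 = np.clip(num / (den + eps), 0.0, None)
    # diffusion coefficient c(q), clipped to [0, 1] for numerical stability
    c = 1.0 / (1.0 + (q2 - q0 ** 2) / (q0 ** 2 * (1.0 + q0 ** 2) + eps))
    c = np.clip(c, 0.0, 1.0)
    # explicit update, approximating div(c grad I) by c * Laplacian
    return I + dt * c * lap
```

Running a few such steps on a noisy flat region lowers the noise variance while the clipped c keeps strong edges from diffusing.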
2.2. Iterative threshold selection

Most segmentation methods need to threshold the image into background and foreground, and the threshold selection greatly affects the final segmentation result. Here, we iteratively select thresholds based on the histogram and the breast lesion's spatial characteristics. No training or empirically based threshold value is needed. The advantage of this iterative method is that it can be applied to a US image without any requirement on the consistency of the image source, and without human interaction to tune a reasonable threshold value; only the information in the current US image is needed to determine the proper threshold.
We first find all the local minima of the image histogram. A good threshold, one that properly separates the lesion from the background, should be one of these local minima. Starting from the smallest and moving toward the largest, we evaluate each local minimum until we find a proper one. The iteration is described below:
1. Let t equal the current local minimum of the histogram. Binarize and invert the de-speckled image using threshold t (the lesion becomes white and the background black) to get Ib. If the ratio of the number of foreground pixels to the number of background pixels is less than 0.1, let t equal the next local minimum. Continue until the ratio is no less than 0.1.
2. Perform dilation and erosion on Ib to remove noise.
3. Find all the connected components in Ib. If none of the connected components intersects the image center region (a window about half the size of the whole image, centered at the image center), let t equal the next local minimum. Continue until a connected component intersects the center window.
After applying the above three steps, a proper threshold t is chosen to binarize the image into background and foreground. Because the iterative threshold selection process starts from the smallest local minimum and increases gradually based on the possible lesion-to-image ratio, it avoids a foreground that is too large (the lesion is connected with other tissues) or too small (the lesion is not included in the foreground).
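The three steps above can be sketched as follows. This is a minimal sketch under assumptions not stated in the text: 8-bit images with a dark lesion, SciPy's `binary_closing` standing in for the dilation-erosion step, and `ndimage.label` for connected components; the function names are illustrative.

```python
import numpy as np
from scipy import ndimage

def histogram_local_minima(img, bins=256):
    """Candidate thresholds: the local minima of the gray-level histogram."""
    hist, edges = np.histogram(img, bins=bins, range=(0, bins))
    return [edges[i] for i in range(1, bins - 1)
            if hist[i] <= hist[i - 1] and hist[i] <= hist[i + 1]]

def select_threshold(img, min_ratio=0.1):
    h, w = img.shape
    center_win = np.zeros((h, w), dtype=bool)      # window ~1/2 the image size
    center_win[h // 4: 3 * h // 4, w // 4: 3 * w // 4] = True
    for t in histogram_local_minima(img):          # smallest to largest
        fg = img < t                               # step 1: lesion is darker
        if fg.sum() / max((~fg).sum(), 1) < min_ratio:
            continue                               # foreground still too small
        fg = ndimage.binary_closing(fg)            # step 2: dilation + erosion
        labels, n = ndimage.label(fg)              # step 3: components
        for i in range(1, n + 1):
            if np.any(center_win & (labels == i)):
                return t                           # a component meets the window
    return None
```

On a synthetic image with a dark square lesion on a bright background, the loop skips the empty-foreground minima and stops at the first threshold whose foreground both passes the 0.1 ratio test and reaches the center window.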
2.3. Delete the boundary-connected regions

After image binarization, we find all the connected components again. Each connected component represents a possible lesion region. Besides the real lesion region, there are some regions connected with the image boundary, and such boundary-connected regions usually have large areas. We cannot simply delete all regions connected with the image boundary, because sometimes the lesion region is also connected with the boundary. Therefore, we use the center window from Section 2.2 to evaluate every boundary region: if a region has no intersection with the center window and is connected with any of the 4 image boundaries, we delete it from the lesion candidate list.
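As a sketch (hypothetical helper names; SciPy's `ndimage.label` assumed for connected components), the pruning rule reads:

```python
import numpy as np
from scipy import ndimage

def prune_boundary_regions(binary, center_win):
    """Keep a component unless it touches an image border *and* misses the
    center window (the lesion itself is allowed to touch a border)."""
    labels, n = ndimage.label(binary)
    border = np.zeros_like(binary, dtype=bool)
    border[0, :] = border[-1, :] = border[:, 0] = border[:, -1] = True
    keep = np.zeros_like(binary, dtype=bool)
    for i in range(1, n + 1):
        comp = labels == i
        if np.any(comp & border) and not np.any(comp & center_win):
            continue                       # boundary-connected, not central
        keep |= comp
    return keep
```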
2.4. Rank the regions

Now the remaining regions either are not connected with the boundary or intersect the image center window. We use the following score formula to rank each remaining region; the one with the highest score is considered the lesion region.
S_n = \frac{Area}{\mathrm{dis}(C_n, C_0) \cdot \mathrm{var}(C_n)}, \quad n = 1, \ldots, k
where k is the number of regions, Area is the number of pixels in the region, Cn is the center of the region, C0 is the center of the image, and var(Cn) is the variance of a small circular region centered at Cn. In the implementation, we slightly moved the image center C0 to the upper part of the image (around row/4) based on our observation that a lesion frequently appears in the upper part of an image and shadow frequently appears in the lower part of an image.
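The ranking can be sketched as below. The names are illustrative, and the patch radius used for var(Cn) and the small epsilons guarding against division by zero are assumptions; C0 is shifted up toward row/4 as described above.

```python
import numpy as np

def region_score(mask, img, c0, radius=5):
    """S_n = Area / (dis(C_n, C_0) * var(C_n)) for one candidate region."""
    ys, xs = np.nonzero(mask)
    area = ys.size
    cy, cx = ys.mean(), xs.mean()                  # region center C_n
    dist = np.hypot(cy - c0[0], cx - c0[1]) + 1e-8
    yy, xx = np.ogrid[:img.shape[0], :img.shape[1]]
    patch = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
    var = img[patch].var() + 1e-8                  # variance around C_n
    return area / (dist * var)

def pick_lesion(region_masks, img):
    h, w = img.shape
    c0 = (h / 4.0, w / 2.0)    # image center shifted up to ~row/4 (Sec. 2.4)
    return max(region_masks, key=lambda m: region_score(m, img, c0))
```

A large, homogeneous region near the shifted center wins over a smaller, noisier region far from it, which is exactly the behavior the score formula encodes.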
2.5. Determine the seed point

Suppose the minimum bounding rectangle of the winning region is [x_min, x_max; y_min, y_max]. In most cases, the center of the winning region, ((x_min + x_max)/2, (y_min + y_max)/2), can be taken as the seed point. However, there are cases where the lesion shape is irregular and the center point might fall outside the lesion. For these special cases, we choose the seed point by the following rule:

x_seed = (x_min + x_max) / 2,
y_seed ∈ {y | (x_seed, y) ∈ lesion region},

which guarantees that the seed point lies on a lesion pixel.
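This rule can be sketched as follows. Picking the middle element of the column's lesion pixels is our reading of the set in the rule above (an assumption, labeled as such in the code), and it guarantees the seed lands on a lesion pixel even for irregular shapes.

```python
import numpy as np

def seed_point(mask):
    """x_seed: middle of the bounding box; y_seed: a lesion pixel in that
    column (here the middle element of the column's lesion pixels)."""
    ys, xs = np.nonzero(mask)
    x_seed = (xs.min() + xs.max()) // 2
    col_ys = np.sort(ys[xs == x_seed])
    y_seed = int(col_ys[col_ys.size // 2])   # assumption: middle of the set
    return int(x_seed), y_seed
```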
3. Experiment

We tested the proposed seed point selection method on a breast US image database. The database consists of 105 images with various breast lesions, and each lesion's boundary was manually outlined by radiologists. As long as the detected seed point is inside the lesion, we count the case as a true positive (TP). Conversely, if the seed point is outside the manually outlined lesion or on its boundary, we count it as a false positive (FP). Furthermore, we separate the TP cases into two categories: the seed point is in the center part of the lesion, and the seed point is near the boundary. Although these two categories are not strictly distinguished, we can use these statistics to evaluate the quality of a seed point selection method, under the assumption that the nearer the seed point is to the center of the lesion, the better the seed point. After running our method on the database of 105 images, only 5 images failed to get the seed point inside the lesion. The results are summarized in Table 1.

Table 1. Performance of our method

       Description                                                 # of images   Rate
  TP   Detected seed point is near the center of the lesion        85            95.24% (TP total)
  TP   Detected point is inside the lesion but near the boundary   15
  FP   Detected seed point is outside the lesion                   5             4.76%

Next, we compared our method with Madabhushi's automatic seed point selection method [5]; seed point selection is one step of the low-level processes in Madabhushi's segmentation system. We exactly followed every step described in his paper. His method needs to calculate the pdfs for intensity and texture on a training set before seed point selection. To maximize his method's performance, we used all 105 images in the database to train the pdfs, and then used the same 105 images for testing. The performance comparison is given in Table 2. Based on the experimental results, our method outperformed Madabhushi's not only in accuracy (our TPR = 95.2% versus Madabhushi's TPR = 73.4%), but also in the quality of the seed points. As discussed above, the nearer a seed point is to the center of the lesion, the better it is for further region growing; the percentage of good seed points found by our method is much higher than Madabhushi's (81.1% versus 58.1%). Figure 1 gives two example results of our method and Madabhushi's.

Table 2. Comparison of the two methods

                        TP: seed near center   TP: seed near boundary   FP
  Madabhushi's method   61 (58.10%)            15 (14.29%)              29 (27.62%)
  Our method            85 (80.95%)            15 (14.29%)              5 (4.76%)

Figure 1. Results of two cases: (a) and (c) are our method's results; (b) and (d) are Madabhushi's method's results.

Besides, we analyzed the failed cases and the seed-point-near-boundary cases of our method. We found that many of them have a shadow right below the lesion whose intensity is quite similar to the lesion's, so the algorithm cannot separate the lesion from the shadow. That is why the seed point is chosen outside the lesion or near the boundary. Figure 2 shows one such case.

Figure 2. A seed-point-near-boundary case with a shadow effect
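A minimal sketch of this TP/FP criterion, assuming the radiologist outline is given as a binary mask; taking the boundary to be mask pixels with a 4-neighbor outside the mask, and the simplified handling of array edges, are our assumptions.

```python
import numpy as np

def classify_seed(seed_xy, lesion_mask):
    """TP if the seed is strictly inside the outlined lesion, FP if it is
    outside the lesion or on its boundary."""
    x, y = seed_xy
    m = lesion_mask.astype(bool)
    if not m[y, x]:
        return "FP"
    interior = m.copy()                  # erode by the 4-neighborhood
    interior[1:, :] &= m[:-1, :]
    interior[:-1, :] &= m[1:, :]
    interior[:, 1:] &= m[:, :-1]
    interior[:, :-1] &= m[:, 1:]
    return "TP" if interior[y, x] else "FP"
```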
4. Conclusion

In order to implement a completely automatic region growing segmentation method for breast cancer US images, we propose a new, robust, fast and automatic seed point selection algorithm. The algorithm needs no prior information or training process. By taking into account both the homogeneous texture features and the spatial features of breast lesions, we successfully find seed points for 100 (95.2%) of the 105 images in the database and fail on only 5 images. These quantitative results demonstrate the robustness of our seed selection method. By studying the failed cases and the seed-point-near-boundary cases, we found that most of them are caused by a shadow right below the lesion with intensity similar to the lesion's. Thus, future work will focus on eliminating the shadow effect on automatic seed point selection.
References

[1] A. Achim and A. Bezerianos. Novel Bayesian multiscale method for speckle removal in medical ultrasound images. IEEE Transactions on Medical Imaging, 20(8): 772-783, 2001.
[2] S. Hojjatoleslami and J. Kittler. Region growing: a new approach. IEEE Transactions on Image Processing, 7(7): 1079-1084, July 1998.
[3] R. Adams and L. Bischof. Seeded region growing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16(6): 641-647, June 1994.
[4] S. Poonguzhali and G. Ravindran. A complete automatic region growing method for segmentation of masses on ultrasound images. International Conference on Biomedical and Pharmaceutical Engineering, pp. 88-92, December 2006.
[5] A. Madabhushi and D. N. Metaxas. Combining low-, high-level and empirical domain knowledge for automated segmentation of ultrasonic breast lesions. IEEE Transactions on Medical Imaging, 22(2): 155-169, February 2003.
[6] I. S. Jung, D. Thapa and G. N. Wang. Automatic segmentation and diagnosis of breast lesions using morphology method based on ultrasound. Fuzzy Systems and Knowledge Discovery, 3614: 1079-1088, 2005.
[7] Y. Yu and S. T. Acton. Speckle reducing anisotropic diffusion. IEEE Transactions on Image Processing, 11(11): 1260-1270, 2002.