IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 21, NO. 1, JANUARY 2002
Segmentation of Ultrasound B-Mode Images With Intensity Inhomogeneity Correction
Guofang Xiao, Michael Brady, J. Alison Noble*, Member, IEEE, and Yongyue Zhang
Abstract—Displayed ultrasound (US) B-mode images often exhibit tissue intensity inhomogeneities dominated by nonuniform beam attenuation within the body. This is a major problem for intensity-based, automatic segmentation of video-intensity images because conventional threshold-based or intensity-statistic-based approaches do not work well in the presence of such image distortions. Time gain compensation (TGC) is typically used in standard US machines in an attempt to overcome this. However, this compensation method is position-dependent, which means that different tissues in the same TGC time-range (or corresponding depth range) will be, incorrectly, compensated by the same amount. Compensation should really be tissue-type dependent, but automating this step is difficult. The main contribution of this paper is to develop a method for simultaneous estimation of video-intensity inhomogeneities and segmentation of US image tissue regions. The method uses a combination of the maximum a posteriori (MAP) and Markov random field (MRF) methods to estimate the US image distortion field, assuming it follows a multiplicative model, while at the same time labeling image regions based on the corrected intensity statistics. The MAP step is used to estimate the intensity model parameters while the MRF step provides a novel way of incorporating the distributions of image tissue classes as a spatial smoothness constraint. We explain how this multiplicative model can be related to the ultrasonic physics of image formation to justify our approach. Experiments are presented on synthetic images and a gelatin phantom to evaluate quantitatively the accuracy of the method. We also discuss qualitatively the application of the method to clinical breast and cardiac US images. Limitations of the method and potential clinical applications are outlined in the conclusion.

Index Terms—Contrast enhancement, intensity inhomogeneity, Markov random field, segmentation, tissue classification, ultrasound.
I. INTRODUCTION
B-MODE ultrasound (US) imaging is one of the most frequently used diagnostic tools for a range of clinical applications, because images are available in real-time, there is a low health risk to the patient and the cost of a scan is low relative to the cost of other imaging modalities.
Manuscript received November 23, 1999; revised November 8, 2001. The work of G. Xiao was supported by a U.K. Government Overseas Research Student award. The Associate Editor responsible for coordinating the review of this paper and recommending its publication was M. Insana. Asterisk indicates corresponding author. G. Xiao, M. Brady, and Y. Zhang are with the Medical Vision Laboratory, Department of Engineering Science, University of Oxford, Oxford OX1 3PJ, U.K. *J. A. Noble is with the Medical Vision Laboratory, Department of Engineering Science, University of Oxford, Oxford OX1 3PJ, U.K. (e-mail:
[email protected]). Publisher Item Identifier S 0278-0062(02)01045-5.
Segmentation is often an important step in US B-mode image analysis. For example, segmentation can be used as the precursor image processing step to quantitative measurement of lesion size. This is clinically valuable, for example, for monitoring lesion growth (or shrinkage) and for providing a clinical indicator that can be used in surgery planning and treatment. Three-dimensional (3-D) US imaging provides another example where segmentation is useful. Often the reason to use a 3-D acquisition is to provide a 3-D visualization of a complex object or anatomy. In this case, segmentation is often required to identify the object prior to display by surface rendering. Three-dimensional US image analysis is more challenging than the two-dimensional (2-D) case because acquisition parameters are fixed at the beginning of an acquisition. Although these parameters may give a good quality image for the first frame, the settings may not be "optimal" for images taken at other probe orientations during the 3-D scan. B-mode imaging artifacts include speckle noise, attenuation (absorption and scattering), etc. The statistical analysis and reduction of speckle noise has been studied extensively in the literature [1]–[7]. Other artifacts, particularly those caused by nonuniform beam attenuation within the body that are not accounted for by time gain compensation (TGC), also decrease the image signal-to-noise ratio (SNR). This problem has also been well studied, although mainly at the radio frequency (RF) signal level [8], [9]. The function of TGC is to amplify the amplitude of echoes in order to compensate for signal attenuation along the travel path. Usually, it corrects the gain equally for echoes in the same depth range. It follows that it does not work well if regions with different attenuation properties appear at the same depth.
As a result, in the displayed image, the image intensities within regions of the same tissue type often appear inhomogeneous and the intensity distributions of different tissue classes often overlap significantly. Other authors have recognized that tissue image intensity variation causes difficulties for automated intensity-based segmentation and have proposed image processing solutions that work on video-intensity images, because these images are most commonly available on commercial US machines [10], [11]. The contribution of this paper is to develop an original method to address this problem. In this paper, we consider the problem of correcting for attenuation-related intensity inhomogeneities, i.e., those that cause a slowly changing (low-frequency) intensity contrast and are not due to speckle. Note that this problem is similar to that of correcting for bias field distortion in magnetic resonance imaging (MRI), for which, recently, a number of image-processing solutions have been proposed [12], [13] as well as extended to simultaneously correct for the bias-field and segment
magnetic resonance (MR) brain images in [14]. In this paper, we adapt the statistical method of [14] to work on B-mode US images. This method is outlined in Section II. However, in order to use this method, we have to explain how the multiplicative model it assumes relates to ultrasonic image formation. This is done in Section III. Section IV describes a series of experiments that have been done on synthetic and phantom images to test the accuracy of the approach and on clinical breast and cardiac data sets to show potential clinical utility. Section V concludes the paper and discusses future work.

II. A STATISTICAL MODEL FOR INTENSITY INHOMOGENEITY CORRECTION AND SEGMENTATION

In this section, we review the method proposed by Zhang et al. for estimating the bias field distortion and simultaneously segmenting an MR image, and provide implementation details on how it has been adapted to work with US images. This method essentially estimates the low (spatial)-frequency multiplicative degradation field while at the same time identifying regions of similar intensity inhomogeneity using an MRF-MAP framework. As we will explain in Section III, although developed for another imaging modality, under simplified assumptions, we can justify using the same approach on displayed US images.

A. Model Specification

Let $S = \{1, 2, \ldots, N\}$ be a lattice indexing the pixels in the given image. Further, let $Y = \{Y_1, \ldots, Y_N\}$ and $X = \{X_1, \ldots, X_N\}$ be the observed and the ideal (that is, without intensity inhomogeneity distortion) intensities of the given image, respectively, $N$ being the number of pixels in the image. We assume that the distortion at pixel $i$ can be expressed by a multiplicative model of the form

$$Y_i = X_i G_i \qquad (1)$$

where $G_i$ represents the gain of the intensity due to the intensity inhomogeneity at pixel $i$. A logarithmic transformation of this equation yields an addition. Let $y_i$ and $x_i$ denote, respectively, the observed and the ideal log-transformed intensities; then

$$y_i = x_i + g_i \qquad (2)$$

where $g_i$ denotes the log-transformed intensity distortion field. Segmentation can be considered as a problem of statistical classification, which is to assign every pixel a class label from a label set. Let $L$ denote the label set. A labeling of $S$ will be denoted by $Z = \{Z_1, \ldots, Z_N\}$, in which $Z_i \in L$ is the corresponding class label of pixel $i$. Given the class label $Z_i = j$, it is assumed that the intensity value $x_i$ at pixel $i$ follows a Gaussian distribution (this assumption will be justified in Section III) with parameters $\theta_j = (\mu_j, \sigma_j)$, being the mean and the standard deviation of class $j$, respectively

$$p(x_i \mid Z_i = j) = g(x_i; \theta_j) = \frac{1}{\sqrt{2\pi}\,\sigma_j} \exp\!\left(-\frac{(x_i - \mu_j)^2}{2\sigma_j^2}\right) \qquad (3)$$

With the distortion field taken into account, the above distribution can be written in terms of the observed intensity as

$$p(y_i \mid Z_i = j, g_i) = g(y_i - g_i; \theta_j) \qquad (4)$$

and, hence, a class-independent intensity distribution

$$p(y_i \mid g_i) = \sum_{j \in L} g(y_i - g_i; \theta_j)\, p(Z_i = j) \qquad (5)$$

Thus, the intensity distribution at pixel $i$ is modeled as a Gaussian mixture, given the distortion field. Assuming that the pixel intensities are statistically independent, the probability density for the entire image, given the distortion field, is

$$p(y \mid g) = \prod_{i \in S} p(y_i \mid g_i) \qquad (6)$$

Bayes' rule can be used to obtain the posterior probability of the distortion field, given the observed intensity values

$$p(g \mid y) = \frac{p(y \mid g)\, p(g)}{p(y)} \qquad (7)$$

where $p(y)$ is a normalization constant. The prior probability of the distortion field $p(g)$ is modeled as a Gaussian density with zero mean to capture its smoothness property. The maximum a posteriori (MAP) principle can be employed to obtain the optimal estimate of the distortion field $\hat{g}$, given the observed intensity values

$$\hat{g} = \arg\max_{g}\; p(g \mid y) \qquad (8)$$

The optimum solution satisfies the following condition:

$$\left.\frac{\partial}{\partial g} \ln p(g \mid y)\right|_{g = \hat{g}} = 0 \qquad (9)$$

Solving this equation leads to the update equations (see [12] for detail)

$$W_{ij} = \frac{p(Z_i = j)\, g(y_i - g_i; \theta_j)}{\sum_{k \in L} p(Z_i = k)\, g(y_i - g_i; \theta_k)} \qquad (10)$$

with

$$\hat{g} = H \bar{R} \qquad (11)$$

Here, $W_{ij}$ is the posterior probability that pixel $i$ belongs to class $j$ given the distortion field estimate, $H$ is a low-pass filter (which also involves the mean inverse covariance; see [12]) and $\bar{R}$ is the mean residual, in which for pixel $i$

$$\bar{R}_i = \sum_{j \in L} \frac{W_{ij}\,(y_i - \mu_j)}{\sigma_j^2} \qquad (12)$$

and $\bar{\psi}^{-1}$ is the mean inverse covariance, in which

$$\bar{\psi}^{-1}_{ik} = \begin{cases} \sum_{j \in L} W_{ij}/\sigma_j^2 & \text{if } i = k \\ 0 & \text{otherwise.} \end{cases} \qquad (13)$$
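For illustration, the E and M steps implied by (10)-(13) can be sketched in a few lines of Python. This is an illustrative reimplementation, not the authors' code: the class parameters are arbitrary, a simple box filter stands in for the low-pass filter $H$, and the filtered residual is normalized by the filtered mean inverse covariance, a common practical form of (11).

```python
import numpy as np

def gaussian(y, mu, sigma):
    # Gaussian density g(y; mu, sigma), as in (3)
    return np.exp(-(y - mu) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

def box_filter(img, r=5):
    # Crude separable box low-pass filter standing in for H
    k = 2 * r + 1
    kern = np.ones(k) / k
    out = np.apply_along_axis(lambda v: np.convolve(v, kern, mode="same"), 0, img)
    out = np.apply_along_axis(lambda v: np.convolve(v, kern, mode="same"), 1, out)
    return out

def em_update(y, g, mus, sigmas, priors):
    """One iteration of the update equations on a log-intensity image y
    with current distortion-field estimate g."""
    # E step, (10): posterior class probabilities W[..., j]
    W = np.stack([p * gaussian(y - g, m, s)
                  for m, s, p in zip(mus, sigmas, priors)], axis=-1)
    W /= W.sum(axis=-1, keepdims=True)
    # M step, (11)-(13): mean residual, mean inverse covariance, low-pass filter
    R = sum(W[..., j] * (y - mus[j]) / sigmas[j] ** 2 for j in range(len(mus)))
    psi_inv = sum(W[..., j] / sigmas[j] ** 2 for j in range(len(mus)))
    g_new = box_filter(R) / np.maximum(box_filter(psi_inv), 1e-12)
    return W, g_new
```

On an undistorted two-class image the posteriors immediately recover the regions, and the estimated field stays near zero.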
B. Estimation With Global Prior

If the prior probability $p(Z_i = j)$ in (5) and (10) is set to be equal for all $j \in L$ at every pixel $i$, as in [12], $W_{ij}$ is essentially the normalized conditional probability and the estimation is consequently a maximum-likelihood (ML) approach. A
two-step expectation-maximization (EM) algorithm can be applied. In this case, the E step (10) calculates the posterior tissue class probability $W_{ij}$ when the distortion field is known. In the M step (11), the distortion field is estimated when $W_{ij}$ is known. The algorithm runs iteratively. Initially, the distortion field is assumed to be zero everywhere in the image. Once the distortion field is obtained, the ideal intensity $X_i$ can be restored by dividing $Y_i$ by $G_i$. A class labeling of the pixels is obtained by assigning to each pixel $i$ the label $j$ that has the largest value $W_{ij}$.

C. Estimation Using an MRF Prior Model

The ML estimation is known to be sensitive to noise and thus on its own one would predict that it would not be suitable for US images. Zhang et al.'s contribution was to show that a full MAP estimation can be achieved by incorporating an MRF prior model for the image tissue classes [14]. MRF theory provides a convenient way to model the contextual information in an image, which is crucial in many cases for interpretation of the image content. Let $z$ indicate the true but unknown labeling of the given image and $\hat{z}$ represent an estimate of $z$. We assume that both variables are interpreted as particular realizations of a random field $Z$. The log-transformed observed image $y$ can be interpreted as a realization of a random variable $Y$. The problem of classification then becomes one of estimating $z$, given $y$. Such an estimate can be obtained using MAP estimation

$$\hat{z} = \arg\max_{z}\; p(z \mid y) \qquad (14)$$

According to Bayes' rule

$$p(z \mid y) = \frac{p(y \mid z)\, p(z)}{p(y)} \qquad (15)$$

where

$$p(y \mid z) = \prod_{i \in S} g(y_i - g_i; \theta_{z_i}) \qquad (16)$$

Here, $p(y)$ is a normalization constant and $p(z)$ is the prior distribution of $Z$ that we model as an MRF. The form of $p(z)$ is discussed next.

1) MRF Prior Distribution: In an MRF, only neighboring sites have direct interactions with each other and they tend to have the same class labels.
According to the Hammersley–Clifford theorem [15], the probability density of an MRF is given by the Gibbs distribution

$$p(z) = \frac{1}{Z'} \exp\left(-\beta\, U(z)\right) \qquad (17)$$

where

$$U(z) = \sum_{c \in C} V_c(z) \qquad (18)$$

is the energy function, which is a sum of clique potentials $V_c(z)$ over all possible cliques $C$. A clique $c$ is defined as a subset of sites in $S$ in which every pair of distinct sites are neighbors, except for single-site cliques. $\beta$ is a positive constant which controls the size of clustering and $Z'$ is a normalization term. In this paper, for the 2-D case, only cliques of size two within an eight-neighborhood system are considered. The clique potential of pixel $i$ with respect to its clique neighbor $i'$ is of the form

$$V_c(z_i, z_{i'}) = \frac{1}{2}\left(1 - \delta(z_i, z_{i'})\right) \qquad (19)$$

where

$$\delta(z_i, z_{i'}) = \begin{cases} 1 & \text{if } z_i = z_{i'} \\ 0 & \text{otherwise.} \end{cases}$$

2) MRF-MAP Classification: Taking the logarithm of the posterior probability and making use of (3) and (15)–(17), the MAP estimate of class labels is formulated as

$$\hat{z} = \arg\min_{z} \left\{ U(y \mid z) + \beta\, U(z) \right\} \qquad (20)$$

where the likelihood energy $U(y \mid z)$ is given by

$$U(y \mid z) = \sum_{i \in S} \left[ \frac{(y_i - g_i - \mu_{z_i})^2}{2\sigma_{z_i}^2} + \ln \sigma_{z_i} \right]$$

and the prior energy $U(z)$ is

$$U(z) = \sum_{i \in S} \sum_{i' \in \mathcal{N}_i} V_c(z_i, z_{i'})$$

where $\mathcal{N}_i$ denotes the set of pixels neighboring $i$. Finding the global minimum in (20) is nontrivial, because the number of possible configurations for pixel labels is enormous and there are typically many local minima where the optimization process can be trapped. Several methods, such as simulated annealing [16], [17] and genetic algorithms [18], guarantee, at least theoretically, convergence to the global minimum. However, these methods may take a long time to run. Therefore, local methods that converge to suboptimal solutions but are more practical in terms of time are preferable. In this paper, the iterated conditional modes (ICM) algorithm [19] is employed. The ICM algorithm uses a "greedy" strategy in the iterative local minimization: in the $k$th iteration, at each pixel $i$, given the observed image $y$ and the labeling of the neighbors $z^{(k)}_{\mathcal{N}_i}$, the algorithm sequentially updates $z^{(k)}_i$ to $z^{(k+1)}_i$ by minimizing the posterior energy with respect to $z_i$. Such an updating process converges rapidly in a few iterations.

3) The Complete Estimation Framework: The complete estimation framework is thus as follows. An EM algorithm is applied as described in Section II-B, except that now $W_{ij}$ is built as the posterior probability based on the MRF prior model rather than the simple normalized conditional probability, that is

$$W_{ij} = \frac{p(Z_i = j \mid z_{\mathcal{N}_i})\, g(y_i - g_i; \theta_j)}{\sum_{k \in L} p(Z_i = k \mid z_{\mathcal{N}_i})\, g(y_i - g_i; \theta_k)} \qquad (21)$$

where $p(Z_i = j \mid z_{\mathcal{N}_i})$ has the form of (17). In the E step, the MRF-MAP classification is carried out to give the posterior probability $W_{ij}$ and the class labeling $\hat{z}$, provided that the distortion field is known. In the M step, the distortion field is estimated knowing $W_{ij}$. The iteration can start with either step. In most cases, the distortion field is not known a priori; therefore, the classification often begins with the assumption that the distortion field is zero everywhere in the image. Note that the pdf and update equations of the method perform local averaging in a way reminiscent of anisotropic diffusion [20], i.e., the average
is only taken over neighboring pixels that do not seem (at this iteration) to cross a boundary. As with anisotropic diffusion, this helps to preserve the localization of region boundaries. 4) Multiresolution Implementation: The algorithm is implemented in a multiresolution manner, using a Gaussian pyramid. The classification and the distortion field estimation are performed at each resolution and both results become the initial conditions at the next finer resolution. Such a scheme is known to be less likely to be trapped in local minima and also faster than a single-resolution implementation. The importance of using a multiresolution implementation for US image analysis is expanded on in Section III.
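As a concrete illustration of the ICM step described above, the following sketch performs one greedy sweep, choosing at each pixel the label that minimizes the likelihood energy of (20) plus the pairwise clique penalty of (19) over an eight-neighborhood. This is an illustrative reimplementation, not the authors' code; the class parameters and $\beta$ below are arbitrary.

```python
import numpy as np

def icm_sweep(y, g, labels, mus, sigmas, beta):
    """One ICM sweep: at each pixel, pick the label minimizing the likelihood
    energy plus beta * (1 - delta)/2 summed over the eight neighbors."""
    H, W = y.shape
    offsets = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
               if (di, dj) != (0, 0)]
    new = labels.copy()
    for i in range(H):
        for j in range(W):
            best, best_e = new[i, j], np.inf
            for lab, (mu, sig) in enumerate(zip(mus, sigmas)):
                # likelihood energy term of (20)
                e = (y[i, j] - g[i, j] - mu) ** 2 / (2 * sig ** 2) + np.log(sig)
                # prior energy: penalize disagreement with each neighbor
                for di, dj in offsets:
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W and new[ni, nj] != lab:
                        e += beta * 0.5
                if e < best_e:
                    best, best_e = lab, e
            new[i, j] = best
    return new
```

Starting from an ML labeling with one isolated misclassified pixel, a single sweep restores it to the label of its neighborhood, which is exactly the smoothing role the MRF prior plays.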
D. The 3-D Case

The algorithm can be applied to 3-D volumes reconstructed from a sequence of parallel, closely spaced 2-D images. We assume that in such a sequence neighboring slices resemble each other, that is, overlapping pixels in neighboring slices tend to have the same class labels. Intensity inhomogeneity field estimation is performed within each 2-D image, while the energy function in the MRF prior model involves a 3-D neighborhood system, which includes, for each pixel in a 2-D scan, the eight nearest neighbors in the same scan and the two direct neighbors in the previous and the next scan. This 3-D constraint helps to strengthen ambiguous boundaries that are easily mislocated in 2-D processing.

III. INTENSITY INHOMOGENEITIES IN ULTRASOUND B-MODE IMAGES

The method outlined in Section II was specifically designed for segmenting brain MR images, where there is significant bias field distortion which derives from the differential attenuation of signals, the nonlinearity of the sensitivities of the receiver coils and the variation of the interaction between the human body and the magnetic field [21]. In this section, we explain why the same approach is suitable for processing displayed US B-mode images. Although a few other researchers have implicitly assumed related models [10], [11], we have not seen this explanation presented before in this context.

The MRF-MAP method makes two key assumptions: the degradation field is low frequency and multiplicative (1) and the image is described by a Gaussian pdf mixture model [leading to priors of the form in (3)]. US image noise (speckle) is definitely not Gaussian distributed and speckle correlates pixels. Further, the assumption that attenuation-related artefacts can be modeled as a multiplicative field is valid only under some assumptions. We discuss these points in depth below.

We begin with the second point by showing how (1) can be related to a classic reflection imaging equation of ultrasonic physics. Following Macovski's notation [22], we define the direction of wave propagation (axial direction) to be $z$ and the lateral and elevation directions to be $x$ and $y$, respectively. We ignore diffraction effects and other effects such as the frequency dependency of attenuation for the time being. We consider a basic pulse-echo reflection imaging system in which a pulse waveform $p(t)$ excites the transmitter, which sends waves into an isotropically scattering body which has a reflectivity $R(x, y, z)$ and attenuation $\alpha(x, y, z)$. Note that we have deliberately left $\alpha$ as spatially varying here, rather than make the normal assumption that it is a global constant, which is only true if you consider a one-tissue object. We assume the lateral distribution of the propagating wave (determined by the transducer geometry in practice) is $s(x, y)$. The signal received at the receiver will have travelled a round trip of $2z$, where $c$, the velocity of sound in the medium, is also assumed to be uniform. In other words, at the receiver the waveform is delayed by this amount, i.e., $p(t - 2z/c)$. The received signal will then pass through a signal processing unit where the pulses are summed and scaled to give an output signal $e(t)$ of the form

$$e(t) = K \iiint R(x, y, z)\, e^{-2\int_0^z \alpha(x, y, \zeta)\, d\zeta}\, s(x, y)\, p\!\left(t - \frac{2z}{c}\right) dx\, dy\, dz \qquad (22)$$

where $K$ is a normalizing constant. It is normally justified to assume that the attenuation functions vary slowly with $z$. The return pulse function $p(t - 2z/c)$ is, however, generally narrowband and, therefore, acts as a delta function with regard to the attenuation functions. If we further assume that, at least locally, $\alpha$ is constant, that is, has a value $\alpha_0$, then (22) simplifies to

$$e(t) = K\, e^{-\alpha_0 c t} \iiint R(x, y, z)\, s(x, y)\, p\!\left(t - \frac{2z}{c}\right) dx\, dy\, dz \qquad (23)$$

This can be seen to be a 3-D convolution with a time-varying attenuation $e^{-\alpha_0 c t}$ that is a multiplicative degradation of the form of (1). In commercial US machines, the goal in time-gain compensation (TGC) is to multiply $e(t)$ by the inverse of the attenuation, i.e., $e^{\alpha_0 c t}$, to compensate tissue in the same depth range by the same amount. However, in practice $\alpha_0$ is not known, so a linear manual image adjustment is done. In this paper, we aim to do a more "intelligent" automatic correction of $e(t)$ than position-dependent TGC, one that is tissue-type dependent and that automatically segments tissue regions (through the MAP estimation). Note that we actually do this on video-intensity images for which the RF signal described by (23) has undergone logarithmic compression and scaling. From an implementation point of view, this is not a problem, as it only means we do not need to do the initial image logarithmic step. Of course, (23) is a simplified model that assumes a weakly reflecting medium, ignores diffraction and assumes a uniform attenuation. The first assumption is valid in practice for most biological structures; the second is commonly employed in US image physics and is needed to enable us to assume that $s(x, y)$ is constant over the transducer face and zero elsewhere. In general, the third assumption is untrue, as the attenuation coefficient is a function of both the particular tissue and the frequency of the propagating wave. In this paper, we do not address frequency
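A toy numerical illustration of this model may help: a depth-dependent exponential gain applied to a reflectivity map is a multiplicative degradation as in (1), becomes additive in the log domain as in (2), and is exactly undone by ideal TGC only when the attenuation coefficient is known. The attenuation value and image contents below are arbitrary choices for the sketch.

```python
import numpy as np

# Two-tissue reflectivity map: rows index depth z, columns the lateral direction
R = np.full((100, 60), 40.0)
R[30:70, 20:40] = 80.0                  # an inclusion of different reflectivity

alpha0 = 0.02                           # assumed locally constant attenuation
z = np.arange(100)[:, None]             # depth samples
gain = np.exp(-2 * alpha0 * z)          # round-trip attenuation e^{-2*alpha0*z}

observed = R * gain                     # multiplicative degradation, as in (1)

# In the log domain the degradation is additive, as in (2)
log_obs = np.log(observed)
assert np.allclose(log_obs, np.log(R) + np.log(gain))

# Ideal TGC: multiply by the inverse gain; exact only because alpha0 is known
corrected = observed * np.exp(2 * alpha0 * z)
```

Here `corrected` recovers `R` exactly; with an unknown or spatially varying attenuation coefficient this fixed depth-dependent correction fails, which is the motivation for the tissue-dependent estimation of Section II.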
Fig. 1. Synthetic image.
Fig. 2. Plots of the histograms of two regions in the synthetic image.
dependency, or try to differentiate between the absorption and scattering components of attenuation. In a given application, the amount of correction depends on the degree of low spatial frequency attenuation. If this changes with transducer frequency, then for a longitudinal study it will be necessary to use a transducer of fixed frequency (or frequency blend). We do not see this as a limitation of our method since in many applications, such as breast imaging, a transducer of fixed frequency is used. Concerning the algorithm's assumption of a Gaussian image model (pdf), this is clearly not true for US images in general. The main noise in US images is speckle noise, which has very different characteristics [2], [6]. However, the hidden MRF model algorithm that we use requires a suitable pdf to update intensities locally (10) and (17). Ideally, we would derive the pdf from the physics of image formation, including a model of speckle; however, we have not done this in the current work. Pending a more accurate model of image formation, we use a Gaussian, which we justify by the following reasoning. The algorithm is run in multiresolution mode from coarse to fine. In fact, the lower resolution ("blurred") solutions do locally satisfy the Gaussianity assumption, and we use the parameter estimates at a low scale to initiate a search for the solution at the next higher scale. At the finer scales, although in theory the local-Gaussian assumption is no longer valid, the fact that the parameters are only being adjusted, rather than solved for from scratch, means the solution is less sensitive to error. Indeed, the error is likely to be no worse than that which would arise even if the Gaussian model were valid at fine scales, since the estimation of the distribution parameters from a small number of samples is itself inaccurate.
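The coarse-scale Gaussianity argument can be checked numerically: block-averaging Rayleigh-distributed samples (which is roughly what a coarse Gaussian pyramid level does) drives the skewness of the distribution toward zero. This is a sketch with an arbitrary scale parameter and block size, not part of the method itself.

```python
import numpy as np

rng = np.random.default_rng(0)
raw = rng.rayleigh(scale=1.0, size=(512, 512))   # speckle-like, skewed noise

# Average 8x8 blocks, mimicking one coarse level of a Gaussian pyramid
coarse = raw.reshape(64, 8, 64, 8).mean(axis=(1, 3))

def skewness(a):
    a = a.ravel()
    return np.mean((a - a.mean()) ** 3) / a.std() ** 3

# The Rayleigh distribution has skewness about 0.63; averaging many
# samples pulls the distribution toward a Gaussian (skewness 0)
print(skewness(raw), skewness(coarse))
```

The averaged field is markedly less skewed than the raw samples, consistent with the central limit theorem and with the claim that the Gaussian assumption is approximately valid at coarse resolutions.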
Note that the above discussion implies that the Gaussian pdf assumption does not hold if the method is applied at a single resolution. In Section IV-A, we
provide results on a synthetic example to show experimentally that the method works well for non-Gaussian (Rayleigh) noise.

IV. EXPERIMENTS AND RESULTS

The algorithm of Section II has been applied to synthetic images, in vitro B-mode images of a phantom and in vivo breast and cardiac B-mode US images acquired with different types of US transducers. In all the experiments described below, the following fixed algorithm parameters were used: a four-level Gaussian pyramid, with $\beta = 0.6$ at the lowest resolution and an increment of 0.3 for every finer resolution.

A. Synthetic Data

The purpose of this experiment was to show that the method works successfully in the presence of Rayleigh-distributed noise. Success in this case is measured by the misclassification rate (the ratio of the number of misclassified pixels to the number of pixels of the original region-of-interest). Recall from Section III that we use a multiresolution implementation to ensure that the Gaussian assumption made by the algorithm is approximately valid. A synthetic image was generated consisting of a disc of intensity value 40 at the centre of a background of intensity value 80 [Fig. 1(a)]. We then added to it an intensity field with the intensity varying linearly from 80 to 0 from top to bottom [Fig. 1(b)], to simulate tissue image intensity inhomogeneities caused by increasing beam attenuation with increasing depth. The resulting image was then blurred with a Gaussian filter of zero mean and standard deviation two greylevels, to simulate the blurring effect under finite imaging resolution. Signal-dependent Rayleigh noise was randomly added afterwards and the image intensity range
Fig. 3. Results of segmenting the synthetic image with intensity inhomogeneity removal.
Fig. 4. Plots of intensity profiles along the central vertical line from the top to the bottom of (a) the original synthetic image and (b) the corrected image.
was mapped to 0–255 to give the final image [Fig. 1(c)]. The intensity histograms of two regions in the synthetic image, one rectangular region in the background and one square region inside the disc [Fig. 2(a)], are shown in Fig. 2(b) and (c), respectively. The algorithm was run with two classes: mean 13, standard deviation 10 for the central disc and mean 35, standard deviation 10 for the background. Fig. 3 shows the results. From Fig. 3(a) and (b), it can be seen that the added linearly varying intensity field has been successfully removed. The disc was well segmented [Fig. 3(c)], with a misclassification rate (the ratio of the number of misclassified pixels to the number of pixels of the original disc) of 3.2%. Fig. 3(d) shows the misclassified pixels colored white. The intensity profiles along the central vertical line from the top to the bottom of the original synthetic image and the restored image are plotted (from left to right) in Fig. 4. It can be seen that before intensity inhomogeneity removal, the intensities in the profile are low below the disc [Fig. 4(a)], but higher after intensity inhomogeneity removal [Fig. 4(b)]. The spiky nature of the intensity profile is retained in the corrected image, since our method only removes a low-frequency component from the original image.

B. In Vitro Data

A phantom study was performed to show how the new method removes intensity variations under different TGC settings. A phantom was constructed, which was a box of gelatin (with talcum powder added to increase scattering) containing an egg-shaped latex balloon filled with gelatin of a different concentration. The two B-mode images in Fig. 5(a) were acquired with a Hewlett-Packard (HP) 10-MHz linear-array transducer. They are of the same cross section of the phantom, each with its TGC curve on the right. Corresponding intensity histograms are shown in Fig. 5(b). The algorithm was run with two classes: mean 10, standard deviation 15 for the inclusion and mean 100, standard deviation 20 for the background, for both images. The corrected images are shown in Fig. 5(c); both exhibit significantly improved background intensity homogeneity, with the contrast between the inclusion and the background improved in both. Under the assumption that pixels of the same tissue class should have similar image intensities, intraclass intensity variations that we are not interested in from an object segmentation point of view are also removed. The speckle patterns are retained in the corrected images, since our method only removes a low-frequency component from the original image. The intensity histograms of the corrected images are shown in Fig. 5(d), each displaying two distinct peaks. Images of the distortion field removed are shown in Fig. 5(e). The segmentation results are both good and show significant resemblance to each other [Fig. 5(f)].

C. In Vivo Data

The in vivo results in this section aim to illustrate the application of the method on clinical images. First, we illustrate the application of the algorithm to breast lesion segmentation. Fig. 6(a) shows an image, acquired with an Esaote 7.5-MHz linear-array transducer, of a large, hypoechoic breast lesion that was close to the skin surface. The upper half of the lesion has low contrast because the subcutaneous fat is also hypoechoic. The lesion is also associated with edge shadows and posterior enhancement, which provide useful clinical cues for visual interpretation. However, they result in an increase of background intensity variance, which is not desirable for intensity-based segmentation. The algorithm was run with two classes: mean 12, standard deviation 20 for the lesion and mean 40, standard deviation 30 for the background tissue. The corrected image, the distortion field and the segmentation result are shown in Fig. 6(b)–(d), respectively.
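The misclassification rate used to score these experiments is a simple ratio. For completeness, it can be computed as follows; the arrays below are a tiny hypothetical example, not data from the paper.

```python
import numpy as np

def misclassification_rate(labels, truth, region):
    """Ratio of misclassified pixels to the number of pixels in the
    region-of-interest (e.g., the original disc), as in Section IV-A."""
    wrong = (labels != truth) & region
    return wrong.sum() / region.sum()

# Tiny example: a 4-pixel region with one wrongly labeled pixel
truth = np.array([[0, 0], [1, 1]])
labels = np.array([[0, 1], [1, 1]])
region = np.ones_like(truth, dtype=bool)
print(misclassification_rate(labels, truth, region))  # 0.25
```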
Fig. 5. Segmentation with intensity inhomogeneity removal—an experiment in vitro.
Fig. 6. Breast lesion segmentation with intensity inhomogeneity removal.
Fig. 7. Segmentation of an image of the left ventricle.
Fig. 8. Segmentation using 3-D neighborhood in the MRF prior model.
Next, we show a typical result of applying the algorithm to echocardiographic images. Fig. 7(a) is an image of the left ventricle of a healthy volunteer. It was acquired with an HP transthoracic OmniPlane 3–5-MHz transducer. In this case, it can be seen that attenuation artifacts alter the intensity of equally significant cardiac structures depending on their orientation with respect to the US beam. The algorithm was run with two classes: mean 3, standard deviation 4 for the left ventricle and mean 40, standard deviation 30 for the tissue. The corrected image, the removed intensity inhomogeneity field and the segmentation result are shown in Fig. 7(b)–(d), respectively. Here, we see that the blood pool has been correctly identified. Finally, we show an example of applying the algorithm to a 3-D volume of a breast lesion reconstructed from a sequence of 2-D B-mode images acquired with a 3-D free-hand US imaging system [23]. The transducer used is an Esaote 7.5-MHz linear-array transducer. These scans are closely spaced and have similar orientation, so we can treat them as parallel. The algorithm was run with two classes: mean 26, standard deviation 5 for the lesion and mean 90, standard deviation 30 for the background tissue, for all the scans. Fig. 8(a) shows three successive scans in the volume. When the scans are processed independently, the segmentation results do not appear consistent between neighboring scans, although fairly good quality segmentations seem to be obtained in the individual scans [Fig. 8(b)]. When a 3-D neighborhood is considered in the MRF prior model, the segmentation results appear more spatially coherent [Fig. 8(c)].
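The 3-D neighborhood used here (Section II-D: eight in-plane neighbors plus the directly overlapping pixel in the previous and next slice) can be made concrete by computing the prior energy of a label volume. This is an illustrative sketch, not the authors' implementation; following the double-sum form of the prior energy, each unordered neighbor pair is counted twice.

```python
import numpy as np

def shifted_pairs(arr, off):
    # Return the two aligned views (x, x + off) over all valid positions
    sl_a, sl_b = [], []
    for d, n in zip(off, arr.shape):
        sl_a.append(slice(max(0, -d), n - max(0, d)))
        sl_b.append(slice(max(0, d), n - max(0, -d)))
    return arr[tuple(sl_a)], arr[tuple(sl_b)]

def prior_energy_3d(labels):
    """Prior energy of a (slices, rows, cols) label volume under the
    10-neighbor system: 8 in-plane neighbors plus the overlapping pixel
    in the adjacent slices, pairwise potential (1 - delta)/2."""
    offsets = [(0, di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
               if (di, dj) != (0, 0)]
    offsets += [(-1, 0, 0), (1, 0, 0)]        # out-of-plane neighbors
    e = 0.0
    for off in offsets:
        a, b = shifted_pairs(labels, off)
        e += 0.5 * np.count_nonzero(a != b)   # V = (1 - delta)/2 per pair
    return e
```

A single voxel that disagrees with all ten of its neighbors contributes an energy of 10.0 under this counting, so minimizing the posterior energy discourages exactly the slice-to-slice inconsistencies visible in Fig. 8(b).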
V. CONCLUSION AND FUTURE WORK

We have described and evaluated a method for simultaneous attenuation distortion correction and image region segmentation of video-intensity US images. The method is novel in that it uses a combination of MRF and MAP estimation techniques to estimate the multiplicative distortion field that is the dominant attenuation artefact in US images and label image regions based on their corrected intensity statistics. We related the assumed degradation model to ultrasonic image formation to explain how and why the method works and justified some of the assumptions made in the algorithm. One current limitation of the algorithm is that it assumes a Gaussian pdf image model. In this paper, we showed that this is justifiable if a multiresolution implementation is used. Future work will consider extending the method to work with more realistic models of ultrasonic image formation, including a model of speckle noise. The method also cannot accommodate severe shadowing artefacts, since in this case the imaging model is invalid (the attenuation does not change gradually over time). Experimental results on synthetic, phantom, and in vivo breast and cardiac images were presented to test the accuracy of the method and provide some indication of potential clinical utility. We assumed that the number of classes in an image is known. We do not see this as a particular limitation of the method since in most cases the number of tissue classes is known, typically foreground/background, i.e., two classes. In theory, one could run the method using a different number
of labels to find the segmentation that best fits the data. However, this still would not guarantee that the final segmentation correctly delineated the object(s) of interest. The other model parameters used in the experiments were determined heuristically. This step might be automated, say, by using a simple threshold-based method to crudely estimate the starting values for each region statistic. To judge the success of the approach in clinical practice requires considering the use of the segmentation in a higher-level application such as echocardiographic (endocardial) border tracking. An initial comparison of this paper's method with another method developed in our laboratory is described in [24] and further work is planned in this area.
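The threshold-based initialization suggested above can be sketched in a few lines: split the histogram at a threshold and take the empirical mean and standard deviation of each side as starting values for the two region statistics. The threshold here is a free parameter (it could itself come from a simple method such as Otsu's); function and variable names are illustrative assumptions.

```python
import numpy as np

def init_class_stats(image, threshold):
    """Crude threshold-based initialization of per-class statistics.

    Pixels below `threshold` form the dark class, the rest the bright
    class; each class contributes its empirical (mean, std) as the
    starting values for the corresponding region statistic.
    """
    img = np.asarray(image, dtype=float).ravel()
    dark, bright = img[img < threshold], img[img >= threshold]
    return [(dark.mean(), dark.std()), (bright.mean(), bright.std())]

# Toy example: two well-separated intensity groups.
stats = init_class_stats([1.0, 2.0, 3.0, 40.0, 41.0, 42.0], threshold=20.0)
```

Such starting values would then be refined by the MAP parameter-estimation step, so the initialization only needs to be roughly right.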
ACKNOWLEDGMENT
The authors would like to thank Dr. R. English of the Breast Care Unit, Churchill Hospital, Oxford, for acquiring the clinical breast data used in this paper, and Dr. D. Boukerroui for discussions on a draft of this paper and on ultrasonic image segmentation in general.
REFERENCES
[1] P. N. T. Wells and M. Halliwell, "Speckle in ultrasonic imaging," Ultrasonics, vol. 19, pp. 225–229, 1981.
[2] R. F. Wagner, S. W. Smith, J. M. Sandrik, and H. Lopez, "Statistics of speckle in ultrasound B-scans," IEEE Trans. Sonics Ultrasonics, vol. SU-30, pp. 156–163, Mar. 1983.
[3] J. C. Bamber and C. Daft, "Adaptive filtering for reduction of speckle in ultrasonic pulse-echo images," Ultrasonic Imag., pp. 41–44, Jan. 1986.
[4] G. Castellini, D. Lamate, L. Mascotti, E. Monnini, and S. Rocchi, "An adaptive Kalman filter for speckle reduction in ultrasound images," J. Nucl. Med. Appl. Sci., vol. 32, no. 3, p. 213, 1988.
[5] D. Kaplan and Q. Ma, "On the statistical characteristics of log-compressed Rayleigh signals: Theoretical formulation and experimental results," J. Acoust. Soc. Amer., vol. 95, no. 3, pp. 1396–1400, 1994.
[6] V. Dutt and J. F. Greenleaf, "Adaptive speckle reduction filter for log-compressed B-scan images," IEEE Trans. Med. Imag., vol. 15, pp. 802–813, Dec. 1996.
[7] A. N. Evans and M. S. Nixon, "Biased motion-adaptive temporal filtering for speckle-reduction in echocardiography," IEEE Trans. Med. Imag., vol. 15, pp. 39–50, Feb. 1996.
[8] S. D. Pye, S. R. Wild, and W. N. McDicken, "Adaptive time gain compensation for ultrasound imaging," Ultrasound Med. Biol., vol. 18, no. 2, pp. 205–212, 1992.
[9] D. I. Hughes and F. A. Duck, "Automatic attenuation compensation for ultrasound imaging," Ultrasound Med. Biol., vol. 23, no. 5, pp. 651–664, 1997.
[10] E. A. Ashton and K. J. Parker, "Multiple resolution Bayesian segmentation of ultrasound images," Ultrasound Imag., vol. 17, pp. 291–304, 1995.
[11] D. Boukerroui, O. Basset, A. Baskurt, and G. Gimenez, "A multiparametric and multiresolution segmentation algorithm of 3-D ultrasound data," IEEE Trans. Ultrason., Ferroelect. Freq. Contr., vol. 48, no. 1, pp. 64–77, Jan. 2001.
[12] W. M. Wells, E. L. Grimson, R. Kikinis, and F. A. Jolesz, "Adaptive segmentation of MRI data," IEEE Trans. Med. Imag., vol. 15, pp. 429–442, Aug. 1996.
[13] R. Guillemaud and J. M. Brady, "Estimating the bias field of MR images," IEEE Trans. Med. Imag., vol. 16, pp. 238–251, June 1997.
[14] Y. Zhang, M. Brady, and S. Smith, "Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm," IEEE Trans. Med. Imag., vol. 20, pp. 45–57, Jan. 2001.
[15] J. Besag, "Spatial interaction and the statistical analysis of lattice systems (with discussion)," J. Roy. Statist. Soc., Series B, vol. 36, no. 2, pp. 192–326, 1974.
[16] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by simulated annealing," Science, vol. 220, pp. 671–680, 1983.
[17] S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images," IEEE Trans. Pattern Anal. Machine Intell., vol. PAMI-6, pp. 721–741, June 1984.
[18] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning. Reading, MA: Addison-Wesley, 1989.
[19] J. Besag, "On the statistical analysis of dirty pictures (with discussion)," J. Roy. Statist. Soc., Series B, vol. 48, no. 3, pp. 259–302, 1986.
[20] P. Perona and J. Malik, "Scale-space and edge detection using anisotropic diffusion," IEEE Trans. Pattern Anal. Machine Intell., vol. 12, pp. 629–639, July 1990.
[21] M. Tincher, C. R. Meyer, R. Gupta, and D. M. Williams, "Polynomial modeling and reduction of RF body coil spatial inhomogeneity in MRI," IEEE Trans. Med. Imag., vol. 12, pp. 361–365, Apr. 1993.
[22] A. Macovski, Medical Imaging Systems. Englewood Cliffs, NJ: Prentice-Hall, 1983.
[23] G. Xiao, "3-D free-hand ultrasound imaging of the breast," Ph.D. dissertation, Dept. Eng. Sci., Oxford Univ., Oxford, U.K., 2001.
[24] D. Boukerroui, A. Baskurt, J. A. Noble, and O. Basset, "Segmentation of ultrasound images: Multiresolution 2-D and 3-D algorithm based on local and global statistics," Pattern Recogn. Lett., 2002, to be published.