Color to grayscale staining pattern representation in IIF
Ermanno Cordelli, Paolo Soda
Integrated Research Centre, Università Campus Bio-Medico di Roma, Roma, Italy
Via Alvaro del Portillo 21, 00128 Roma, Italy
[email protected], [email protected]

Abstract
Indirect immunofluorescence (IIF) is the recommended technique to detect rheumatic diseases through the analysis of images exhibiting fluorescence in the green band. Since the demand for such tests has recently increased, research efforts have been directed towards the development of computer-aided diagnosis (CAD) tools supporting the specialists and improving the standardization of the method. Technological advances have made color cameras available for IIF image acquisition, but their use requires determining which color-to-greyscale conversion method provides the most useful information for the needs of CAD development. In this respect, we experimentally compare four different methods converting a color image into a greyscale one, analyzing a wide feature set for each conversion method and applying four classification paradigms. Experiments have been carried out on an annotated dataset of HEp-2 cells, identifying a subset of features which is independent of the color model used and showing that a greyscale representation based on the HSI model better exploits the information available in IIF images.

1. Introduction
Rheumatic diseases are characterized by antinuclear autoantibodies (ANAs), which are detected by indirect immunofluorescence (IIF) on HEp-2 slides [22, 27]. The HEp-2 substrate contains a specific antigen that can bond with serum antibodies, forming a molecular complex; this complex then reacts with human immunoglobulin conjugated with a fluorochrome, and the resulting complex is observed at the fluorescence microscope. In ANA tests, the recommended fluorochrome is the fluorescein isothiocyanate mixed isomer (FITC), which is excited by an incident blue light (λi = 490 nm). Due to the Stokes shift, its fluorescence is observed in the green band (λe = 518 nm).

The diagnostic procedure for IIF images consists of fluorescence intensity classification and staining pattern recognition. Fluorescence intensity classification makes it possible to distinguish between positive, intermediate and negative samples. Then, on positive samples, medical doctors classify the staining patterns, since they are specific to different autoantibodies [27]. Staining patterns are typically grouped in the following classes, which are specific to the most relevant and recurrent ANAs [22, 23]:

• Homogeneous: characterized by a diffuse staining of the interphase nuclei and staining of the chromatin of mitotic cells;
• Peripheral nuclear or Rim: characterized by a solid staining, primarily around the outer region of the nucleus, with weaker staining toward the center of the nucleus;
• Speckled: characterized by a fine or coarse granular nuclear staining of the interphase cell nuclei;
• Nucleolar: characterized by large coarse speckled staining within the nucleus, less than six in number per cell;
• Cytoplasmic: characterized by fine fluorescent fibers running the length of the cell; it is frequently associated with other autoantibodies to give a mixed pattern;
• Centromere: characterized by several discrete speckles (∼40-60) distributed throughout the interphase nuclei and characteristically found in the condensed nuclear chromatin during mitosis as a bar of closely associated speckles.

Panel A of Figure 1 depicts examples of cells belonging to these staining patterns.

Although ANA detection in routine practice has recently increased, IIF is far from being standardized and suffers from relevant inter- and intra-laboratory variability [3]. To overcome such limitations, efforts have been directed towards the automation of the method, resulting in the development of robotic devices performing dilution, dispensation and washing operations to support the slide preparation process [2, 5]. Furthermore, recent research has presented computer-aided diagnosis (CAD) systems aimed at supporting the IIF diagnostic procedure. The investigated topics cover the areas of image acquisition [11, 25], image segmentation [12, 20] and fluorescence intensity classification [7, 19], as well as staining pattern recognition [10, 7, 18, 20, 23]. While preliminary attempts acquired IIF images using greyscale cooled cameras [18, 19, 25], recent research [7, 8, 12, 23, 24] as well as commercial CAD solutions [6, 13, 16] have moved to color cameras. The motivations for this choice are: (i) the performance of color cameras has improved considerably in recent years, thanks to their use in many fields; (ii) greyscale cooled cameras and color cameras have similar performance; (iii) color cameras have better performance than non-cooled greyscale cameras; (iv) color cameras are cheaper than greyscale cameras; (v) the performance of color cameras is adequate for the needs of IIF image acquisition.

The introduction of color devices has made available color information which can be exploited for IIF image analysis. Previous works reported in the literature adopt a heuristically chosen color-to-greyscale conversion method and then apply image analysis algorithms to the resulting greyscale image. The rationale lies in observing that the physical phenomenon behind image formation involves the green band only. In order to develop a CAD system for IIF image analysis, we recently discussed the application of color-to-greyscale conversion methods in the case of fluorescence intensity classification [4], i.e. the first step of the diagnostic procedure. We experimentally found that the use of the green channel of the color image provides the best performance for fluorescence intensity recognition. In this paper, we extend our previous work, focusing now on staining pattern recognition, i.e. the second step of the diagnostic procedure. To this aim, we first present four methods converting a color image to its corresponding greyscale representation. Second, we analyze a wide set of texture features and, third, we apply different popular classification paradigms to an annotated dataset of HEp-2 cells.

2. Methods

In order to investigate how to extract useful information from HEp-2 color images, we compare four methods converting a color image into the corresponding greyscale representation. To this aim, we apply the following steps:

1. Color image conversion: RGB images acquired at the fluorescence microscope are converted to their greyscale representation according to one of the methods under investigation;
2. Features extraction and selection: we compute several texture features from the greyscale images and then search for the most discriminant subset; this step is performed independently for each of the four greyscale models;
3. Classification tests: we perform cross validation experiments on the image dataset, using the features selected in the previous step and testing different classification architectures.

The next paragraphs present these steps in turn.

Color image conversion. Image acquisition devices typically use the RGB color model, where each color appears in its primary spectral components of red, green and blue. The rationale comes from experimental evidence showing that the millions of cones in the human eye can be divided into three principal sensing categories, corresponding roughly to red, green, and blue: approximately 65% of all cones are sensitive to red light, 33% to green light, and only about 2% to blue (although the blue cones are the most sensitive). Hence, the RGB color model is well suited for hardware implementations. Unfortunately, RGB and similar color models are not well suited for describing colors in terms that are practical for human interpretation: on the one hand, the RGB space is not perceptually uniform, that is, distances between colors in the RGB space do not correspond to the color differences perceived by humans; on the other hand, its dimensions are highly correlated.

In addition to color models suited for hardware implementation, the HSI and $L^*a^*b^*$ models were proposed to describe colors in a way that is more useful for human interpretation. The HSI color model was developed observing that humans describe a color object by its hue, saturation, and brightness. This model decouples the intensity component from the color-carrying information, i.e. hue and saturation, and is considered an ideal tool for developing image processing algorithms based on color descriptions that are natural and intuitive to humans. The $L^*a^*b^*$ model was developed to provide a perceptually uniform color space, in contrast to the above limitation of the RGB space. It consists of a luminosity (or brightness) layer $L^*$, a chromaticity layer $a^*$ indicating where the color falls along the red-green axis, and a chromaticity layer $b^*$ indicating where the color falls along the blue-yellow axis. However, it has been noticed that achromatic and chromatic color stimuli with the same luminance (or luminance factor) generally differ in brightness (or perceived lightness), the brightness of the latter being larger than that of the former. This effect is known as the Helmholtz-Kohlrausch effect (HK hereafter), defined as the change in the brightness of a perceived color produced by increasing the purity of a color stimulus while keeping its luminance constant within the range of photopic vision.

Figure 1. Color (panel A) and corresponding grayscale representations - weighted conversion (B), green channel (C), intensity channel (D), HK conversion (E) - of HEp-2 cells exhibiting the homogeneous, rim, speckled, nucleolar, cytoplasmic and centromere staining patterns (left to right).

Taking this phenomenon into consideration, and defining the chrominance $C^*$ as the sum of the moduli of the chromaticity layers $a^*$ and $b^*$, the perceived luminosity $L^{**}$ is given by $L^{**} = L^* + 0.143 \cdot C^*$ [17].

In the following, we consider four methods converting IIF color images into the corresponding greyscale representation. They are based on the four color models presented so far, which derive from both hardware-oriented and human-oriented color description: two methods are based on the RGB primary spectral components, one on the HSI space, and the last on the $L^*a^*b^*$ model, taking into consideration the HK effect. Denoting as $I_{grey}(x, y)$ the greyscale intensity of pixel $(x, y)$, and as $I_i^j(x, y)$ the $i$-th component of the $j$-th color model, the four methods are:

1. $I_{grey}(x, y) = \sum_{i=1}^{3} \alpha(i) \cdot I_i^{RGB}(x, y)$, where $\alpha$ is a vector of weights taking into account the different color sensitivities of the eye. According to general practice, $\alpha = (0.2989, 0.5870, 0.1140)$ [21]. This method is referred to as weighted conversion in the following;

2. $I_{grey}(x, y) = I_2^{RGB}(x, y)$, where $I_2^{RGB}$ is the green channel. With reference to the previous conversion scheme, this choice corresponds to setting $\alpha = (0, 1, 0)$. This method is referred to as green channel in the following;

3. $I_{grey}(x, y) = I_1^{HSI}(x, y)$, where $I_1^{HSI}$ is the intensity component of the color image represented in the HSI model. This method is referred to as intensity channel in the following;

4. $I_{grey}(x, y) = I_1^{L^*a^*b^*}(x, y)$, where $I_1^{L^*a^*b^*}$ is the luminosity layer of the $L^*a^*b^*$ model corrected for the HK effect. This method is referred to as HK conversion in the following.

Panels B-E of Figure 1 show the results of applying these greyscale conversion methods to the cells shown in panel A of the same figure.
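As an illustration, the four conversions can be sketched in a few lines of Python. This is a minimal sketch of ours, not the original implementation: it assumes a floating-point RGB image with values in [0, 1], takes the HSI intensity as the mean of the three channels, and relies on scikit-image for the RGB to $L^*a^*b^*$ conversion.

```python
import numpy as np
from skimage import color  # scikit-image, assumed available for the RGB -> L*a*b* step

def weighted_conversion(rgb):
    """Method 1: weighted sum of the RGB channels, alpha = (0.2989, 0.5870, 0.1140)."""
    return rgb @ np.array([0.2989, 0.5870, 0.1140])

def green_channel(rgb):
    """Method 2: green channel only, i.e. alpha = (0, 1, 0)."""
    return rgb[..., 1]

def intensity_channel(rgb):
    """Method 3: intensity component of the HSI model, I = (R + G + B) / 3."""
    return rgb.mean(axis=-1)

def hk_conversion(rgb):
    """Method 4: L* layer of L*a*b* corrected for the HK effect,
    L** = L* + 0.143 * C*, with C* = |a*| + |b*|."""
    lab = color.rgb2lab(rgb)  # expects RGB floats in [0, 1]
    return lab[..., 0] + 0.143 * (np.abs(lab[..., 1]) + np.abs(lab[..., 2]))
```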

Table 1. Selected features for each greyscale conversion method (weighted conversion, green channel, intensity channel, HK conversion). An entry is ticked when the feature in the corresponding row is discriminant for the conversion in the corresponding column; round parentheses report the source of the feature computation (original image, LBP matrix, or LBP^u2 matrix) and, where applicable, the moment orders. The candidate features are, by type: first order histogram (position of absolute maximum, energy around relative maxima, entropy, number of relative maxima, range, skewness, standard deviation); second order histogram (absolute value, contrast, correlation, covariance, energy, entropy, homogeneity, maximum, moments, inverse moments); morphological (bdip); Zernike (moment based on the r·cos(θ) polynomial).

Features extraction and selection. In order to discriminate the six staining patterns, we consider a heterogeneous set of texture features belonging to different categories, e.g. statistical descriptors, spectral measures, local binary patterns (LBP) and morphological descriptors. Statistical measures have been extracted from both first and second order histograms, computing their statistical moments, e.g. skewness, kurtosis, energy and entropy, to name a few. The former histogram describes the grey-level distribution, whereas the latter provides a good representation of the overall nature of the texture. Spectral measures have been computed from the Fourier and Wavelet transforms as well as from Zernike moments. We compute the energy of the Fourier transform spectrum divided into angular and radial bins [15]. For the Wavelet transform, we considered discontinuous (Haar) and continuous (Daubechies) wavelets, since they have proven their usefulness for the discrimination of medical imaging textures [9]. Zernike moments are a set of rotation-invariant features, equal to the magnitudes of a set of orthogonal complex moments of the image, which have demonstrated their superiority over regular moments [14].

LBP operators are grey-scale invariant statistical primitives reporting good performance in texture classification [26]. Considering a greyscale image, a texture T is defined as the joint distribution of the grey values of a central pixel ($g_c$) and its $P$ neighbors ($g_p$, with $p = 0, 1, \ldots, P-1$) located on a circle of radius $R$. Subtracting the value of the central pixel from the values of its neighborhood, and assuming that the differences $g_p - g_c$ are independent of $g_c$, we get a highly discriminative texture operator, invariant against grey scale shifts. Considering only the sign of each difference $g_p - g_c$, we represent the spatial structure of the local texture as a unique number $LBP_{P,R}$:

$$LBP_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c) \, 2^p \qquad (1)$$

where $s(g_p - g_c)$ is 1 if $g_p - g_c \geq 0$, and 0 otherwise. A more compact way to describe the texture information of an image is to use $LBP^{u2}$ operators. Indeed, about 90% of all 3×3 patterns obtained when $R = 1$ are captured by a limited number of patterns, and $LBP^{u2}$ operators work as templates for microstructures such as bright and dark spots, flat areas, and edges. Formally, they are defined through a uniformity measure $U$ corresponding to the number of spatial transitions in the pattern, i.e. bitwise 0/1 changes in the binary code. Considering only patterns having a $U$ value of at most 2, the following descriptor may be introduced:

$$LBP_{P,R}^{u2} = \begin{cases} \sum_{p=0}^{P-1} s(g_p - g_c) & \text{if } U(LBP_{P,R}) \leq 2 \\ P + 1 & \text{otherwise} \end{cases} \qquad (2)$$

In our implementation, we use $P = 8$ and $R = 1$. Both the $LBP$ and $LBP_{P,R}^{u2}$ operators return matrices on which we compute first and second order histograms.

We also compute a morphological descriptor, namely the block difference of inverse probabilities (bdip), which is a subregion-based version of the difference of inverse probabilities. Given a squared subregion $w$ of side $N$ belonging to an image $I$ and centered at location $(i, j)$, it is defined as:

$$bdip(i, j) = N^2 - \frac{\sum_{(i,j) \in w} I(i, j)}{\max_{(i,j) \in w} I(i, j)} \qquad (3)$$
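To make the descriptors concrete, the following Python sketch (a didactic reimplementation of ours under the stated setting $P = 8$, $R = 1$, not the original code) computes the LBP and LBP^u2 maps of equations (1)-(2) and the bdip of equation (3); optimized alternatives exist, e.g. skimage.feature.local_binary_pattern.

```python
import numpy as np

def lbp_maps(img):
    """LBP (eq. 1) and uniform LBP^u2 (eq. 2) maps for a 2D greyscale array."""
    P = 8  # neighbours on a circle of radius R = 1
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # circular 8-neighbourhood, enumerated counter-clockwise
    offsets = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]
    # s(g_p - g_c): 1 if the neighbour is >= the centre, 0 otherwise
    bits = np.stack([(img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx] >= center).astype(np.int64)
                     for dy, dx in offsets])
    lbp = sum(bits[p] << p for p in range(P))            # eq. (1)
    u = np.sum(bits != np.roll(bits, 1, axis=0), axis=0)  # number of 0/1 transitions
    lbp_u2 = np.where(u <= 2, bits.sum(axis=0), P + 1)    # eq. (2)
    return lbp, lbp_u2

def bdip(block):
    """bdip of an N x N subregion w (eq. 3); assumes a non-zero maximum in the block."""
    n = block.shape[0]
    return n * n - block.sum() / block.max()
```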

The search for the most discriminant feature subset for each color-to-greyscale conversion method has been performed running greedy stepwise and best first searches, both forward and backward. Then, the selection process has been refined by an exhaustive search, taking into account the dimensionality of both the dataset and the feature set. At the end of this process, we identify, for each color-to-greyscale conversion method, the set of descriptors with the strongest capability to separate the samples in the feature space. The corresponding results are shown in Table 1, where each entry is ticked if the feature reported in the corresponding row is discriminant with respect to the greyscale conversion reported in the column. Furthermore, in round parentheses we report the source of the feature computation, being either the original image, the LBP matrix or the LBP^u2 matrix. It is worth observing that some descriptors are discriminant whatever the color-to-greyscale conversion method used, thus being robust to color space variation. On the contrary, other features are specific to the greyscale representation used. Furthermore, most of the features rely on the LBP or LBP^u2 image description, showing that relevant texture information on staining patterns is embedded in a circular neighborhood of the pixels.

Classification tests. We test four popular classifiers belonging to different paradigms: a Multi-Layer Perceptron (MLP) as a neural network, a k-Nearest Neighbour (kNN) as a statistical classifier, a Support Vector Machine (SVM) as a kernel machine, and AdaBoost as an ensemble of classifiers. For each classifier, the experimental setup is: (i) the MLP has one or two hidden layers, and the number of neurons in the input layer is equal to the number of features; we perform a grid search to determine the best configuration in terms of number of neurons in each hidden layer, testing configurations from 1 up to 40 neurons per layer; (ii) we explore the kNN classifier with different values of k in the interval [1; 13]; (iii) for the SVM classifier we use the Pearson VII function based kernel [1], performing a grid search to determine the best setup in terms of σ and w, which control the half-width and the tailing factor of the peak, respectively; the searches of σ and w have been carried out in the intervals [0.01; 1] and [0.1; 10], respectively; (iv) for AdaBoost we use both J48 and random forest as base hypotheses, exploring different numbers of iterations in the interval [1; 100].
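As a sketch of how such a pipeline could be reproduced, the fragment below combines a greedy forward feature search with a 10-fold cross-validated SVM using the Pearson VII (PUK) kernel of [1]. It is an illustration under our own assumptions, not the original setup: the data are synthetic stand-ins for the real texture features, and the function names are ours.

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def puk_kernel(sigma, omega):
    """Pearson VII function based kernel [1]; sigma and omega control the
    half-width and the tailing factor of the peak, respectively."""
    def gram(X, Y):
        d = np.sqrt(cdist(X, Y, "sqeuclidean"))
        return 1.0 / (1.0 + (2.0 * d * np.sqrt(2.0 ** (1.0 / omega) - 1.0) / sigma) ** 2) ** omega
    return gram

def greedy_forward_selection(X, y, clf, max_features=10):
    """Greedy stepwise (forward) search: repeatedly add the feature that
    maximizes 10-fold cross-validated accuracy, stopping when none improves it."""
    selected, remaining, best_overall = [], list(range(X.shape[1])), 0.0
    while remaining and len(selected) < max_features:
        score, f = max((cross_val_score(clf, X[:, selected + [f]], y, cv=10).mean(), f)
                       for f in remaining)
        if score <= best_overall:
            break
        best_overall = score
        selected.append(f)
        remaining.remove(f)
    return selected, best_overall

# synthetic stand-in for the texture features of one greyscale conversion
X, y = make_classification(n_samples=300, n_features=20, n_informative=10,
                           n_classes=6, random_state=0)
features, acc = greedy_forward_selection(X, y, SVC(kernel=puk_kernel(0.5, 1.0)))
print(f"selected features: {features}, CV accuracy: {100 * acc:.2f}%")
```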

3. Results

Experimental tests were performed on the IIF image dataset presented in [8]. It consists of 1457 images of single cells of the HEp-2 substrate exhibiting the main staining patterns presented in section 1. Such cells are distributed over two classes of image intensity, namely positive and intermediate: cells of the former class are bright, whereas cells of the latter are dark. Notice that such a variation of fluorescence intensity between cells of the dataset increases the complexity of the recognition task, since detecting the texture may be difficult on samples belonging to the intermediate class.

Table 2 reports the average rates of recognition accuracy measured performing 10-fold cross validation on the six classes of staining patterns.

Table 2. Rates of classification accuracy (%).

Classifier   Weighted conversion   Green channel   Intensity channel   HK conversion   Average
MLP                65.20               63.28             71.38             67.06         66.73
kNN                66.71               63.00             69.46             67.19         66.59
SVM                70.08               66.37             78.87             73.30         72.15
AdaBoost           78.65               78.45             79.34             78.79         78.81
Average            70.16               67.78             74.76             71.59           —

On the one side, with reference to the color-to-greyscale conversion method, we observe that the intensity channel always outperforms the other methods, whatever the classification architecture used. It achieves an average classification accuracy equal to 74.76%, and its best value of 79.34% is attained applying the AdaBoost classifier in conjunction with the corresponding feature set reported in Table 1. On the other side, with reference to the classification architecture used, the best average performance (78.81%) is provided by the AdaBoost classifier. Furthermore, the same classifier achieves the largest accuracy for all color-to-greyscale conversion methods. These results show that the best performance in staining pattern recognition can be attained computing features from the intensity layer of the HSI representation of HEp-2 color images and then applying the AdaBoost classifier. Notice that such results differ from those reported in [4], where we focused on fluorescence intensity classification and obtained the best performance using the green channel and an SVM classifier; the differences are mainly due to the large difference between the two recognition tasks.

Let us now analyze the recognition results achieved here in comparison with those reported in the literature [18, 20, 23]. The hit rates on staining pattern recognition presented in [18], [20] and [23] are equal to 75.9%, 79.2% and 75.0%, respectively. Although a direct comparison is not possible because the image datasets are different, we can make the following considerations. First, the image dataset we use extends the panel of staining patterns considered: while we consider here six classes (section 1), in [18, 20, 23] the authors considered only four patterns, namely homogeneous, rim, speckled and nucleolar. Second, the dataset we use here is larger than the others: we have 1457 samples, whereas in [18, 20, 23] the authors used 573, 321, and 1000 samples, respectively. Third, similarly to [18, 19], we use here samples diluted at the 1:80 titer, as recommended by the guidelines. On the contrary, in [20, 23] samples diluted at a titer higher than 1:160 are used: in this case, only strongly positive sera exhibit a detectable fluorescence intensity, making the staining patterns clearly observable, also because intermediate samples are missing at such dilution ratios. Fourth, while in [18, 20, 23] the authors developed classification architectures specifically tailored to the recognition task at hand, e.g. a multi-expert system [18] or decision trees [20], to cite a few, here we test a simple scheme based on a single classifier. Given these observations, we deem that the best recognition rate achieved here (79.34%) would permit the development of a CAD system outperforming existing solutions.
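For reference, the evaluation protocol can be sketched as follows. The features and labels are synthetic placeholders, and scikit-learn's default AdaBoost (decision stumps) stands in for the J48/random forest base hypotheses used in our experiments, so the printed accuracy will not match Table 2.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score

# placeholder data; in the paper these are intensity-channel texture features of [8]
X, y = make_classification(n_samples=1457, n_features=20, n_informative=12,
                           n_classes=6, random_state=0)
clf = AdaBoostClassifier(n_estimators=50)  # iterations explored in [1; 100]
acc = 100 * cross_val_score(clf, X, y, cv=10).mean()  # 10-fold cross validation
print(f"10-fold CV accuracy: {acc:.2f}%")
```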

4. Conclusions

In this paper we have experimentally compared four methods converting a color image into the corresponding greyscale representation, measuring the accuracy achieved in staining pattern recognition on a set of HEp-2 cells. The results show that a greyscale representation based on the HSI model better exploits the information available in IIF images. Furthermore, for each greyscale representation we have investigated several features belonging to different categories, finding a subset which is independent of the color model used. Such results, integrated with those presented in [4], will permit us to develop a CAD system meeting the needs of real applications.

Acknowledgments
This work has been carried out in the framework of the ITINERIS2 project, Codice CUP F87G10000080009, under the financial support of Regione Lazio (Programme “Sviluppo dell'Innovazione Tecnologica nel Territorio Regionale”, Art. 182, comma 4, lettera c), L.R. n. 4, 28 Aprile 2006).

References
[1] B. Üstün, W. J. Melssen, and L. M. C. Buydens. Facilitating the application of support vector regression by using a universal Pearson VII function based kernel. Chemometrics and Intelligent Laboratory Systems, 81:29–40, 2006.
[2] Bio-Rad Laboratories Inc. PhD System. http://www.bio-rad.com, 2004.
[3] N. Bizzaro, R. Tozzoli, et al. Variability between methods to determine ANA, anti-dsDNA and anti-ENA autoantibodies: a collaborative study with the biomedical industry. Journal of Immunological Methods, 219:99–107, 1998.
[4] E. Cordelli and P. Soda. Methods for greyscale representation of HEp-2 colour images. In Computer-Based Medical Systems, 23rd IEEE Int. Symp. on, pages 383–388, 12-15 October 2010.
[5] Das s.r.l. Service manual AP22 Speedy IF. Palombara Sabina (RI), 5 edition, March 2010.
[6] Das s.r.l. SL-IM. Palombara Sabina (RI), 2 edition, 2010.
[7] K. Egerer, D. Roggenbuck, et al. Automated evaluation of autoantibodies on human epithelial-2 cells as an approach to standardize cell-based immunofluorescence tests. Arthritis Research & Therapy, 12(40), February 2010.
[8] P. Foggia, G. Percannella, et al. Early experiences in mitotic cells recognition on HEp-2 slides. In Computer-Based Medical Systems, 23rd IEEE Int. Symp. on, pages 38–43, 2010.
[9] P. Heinlein, J. Drexl, et al. Integrated wavelets for enhancement of microcalcifications in digital mammography. IEEE Trans. on Medical Imaging, 22(3):402–413, 2003.
[10] R. Hiemann et al. Challenges of automated screening and differentiation of non-organ specific autoantibodies on HEp-2 cells. Autoimmunity Reviews, 9(1):17–22, 2009.
[11] R. Hiemann, N. Hilger, et al. Objective quality evaluation of fluorescence images to optimize automatic image acquisition. Cytometry Part A, 69:182–184, 2006.
[12] Y. L. Huang, Y. L. Jao, et al. Adaptive automatic segmentation of HEp-2 cells in indirect immunofluorescence images. In IEEE Int. Conf. on Sensor Networks, Ubiquitous and Trustworthy Computing, pages 418–422, 2008.
[13] ImageInterpret Ltd. HEp-2 PAD. Leipzig, Germany, 2009.
[14] A. Khotanzad and Y. H. Hong. Invariant image recognition by Zernike moments. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(5):489–497, May 1990.
[15] A. D. Kulkarni and P. Byars. Artificial neural network models for texture classification via the Radon transform. In Proceedings of the 1992 ACM/SIGAPP Symposium on Applied Computing, pages 659–664, New York, USA, 1992.
[16] Medipan GmbH. Aklides. Dahlewitz, Germany, 2009.
[17] Y. Nayatani. Simple estimation methods for the Helmholtz-Kohlrausch effect. Color Research & Application, 22(6):385–401, 1997.
[18] P. Soda and G. Iannello. Aggregation of classifiers for staining pattern recognition in antinuclear autoantibodies analysis. IEEE Transactions on Information Technology in Biomedicine, 13(3):322–329, 2009.
[19] P. Soda, G. Iannello, and M. Vento. A multiple experts system for classifying fluorescence intensity in antinuclear autoantibodies analysis. Pattern Analysis & Applications, 12(3):215–226, September 2009.
[20] P. Perner, H. Perner, and B. Müller. Mining knowledge for HEp-2 cell image classification. Artificial Intelligence in Medicine, 26:161–173, 2002.
[21] R. C. Gonzalez and R. E. Woods. Digital Image Processing. Prentice Hall, Upper Saddle River, NJ, 2004.
[22] A. Rigon et al. Indirect immunofluorescence in autoimmune diseases: assessment of digital images for diagnostic purpose. Cytometry B (Clinical Cytometry), 72:472–477, 2007.
[23] U. Sack, S. Knoechner, et al. Computer-assisted classification of HEp-2 immunofluorescence patterns in autoimmune diagnostics. Autoimmunity Reviews, 2:298–304, 2003.
[24] P. Soda, L. Onofri, et al. A decision support system for Crithidia luciliae image classification. Artificial Intelligence in Medicine, 51(1):67–74, 2011.
[25] P. Soda, A. Rigon, et al. Automatic acquisition of immunofluorescence images: algorithms and evaluation. In Computer-Based Medical Systems, 19th IEEE Int. Symp. on, pages 386–390, 2006.
[26] T. Ojala, M. Pietikäinen, and T. Mäenpää. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(7):971–987, 2002.
[27] R. Tozzoli, N. Bizzaro, et al. Guidelines for the laboratory use of autoantibody tests in the diagnosis and monitoring of autoimmune rheumatic diseases. American Journal of Clinical Pathology, 117(2):316–324, 2002.
