Multi-class Stain Separation using Independent Component Analysis

Nicholas Trahearn (a), David Snead (c), Ian Cree (b,c), Nasir Rajpoot (a,d)

(a) Department of Computer Science, University of Warwick, United Kingdom
(b) Warwick Medical School, University of Warwick, United Kingdom
(c) Department of Pathology, University Hospitals Coventry and Warwickshire, United Kingdom
(d) Department of Computer Science and Engineering, Qatar University, Qatar

ABSTRACT

Stain separation is the process whereby a full colour histology section image is transformed into a series of single channel images, each corresponding to a given stain's expression. Many algorithms in the field of digital pathology are concerned with the expression of a single stain, so stain separation is a key preprocessing step in these situations. We present a new versatile method of stain separation. The method uses Independent Component Analysis (ICA) to determine a set of statistically independent vectors, corresponding to the individual stain expressions. In comparison to other popular approaches, such as PCA and NNMF, we found that ICA gives a superior projection of the data with respect to each stain. In addition, we introduce a correction step to improve the initial results provided by the ICA coefficients. Many existing approaches only consider separation of two stains, with primary emphasis on Haematoxylin and Eosin. We show that our method is capable of making a good separation when there are more than two stains present. We also demonstrate our method's ability to achieve good separation on a variety of different stain types.

Keywords: Stain Separation, Colour Deconvolution, Independent Component Analysis

1. INTRODUCTION

The term Stain Separation refers to a class of algorithms with a common objective: to transform the input 3-channel RGB image of a multi-stained tissue section into a series of grayscale stain images. A stain image represents the intensity of a particular stain's expression across the section. This allows us to process a single stain's expression, which is often more biologically meaningful than the mixture of stains in the original image. For this reason stain separation has an important role as a preprocessing step for many tasks in histopathological image analysis. For instance, if a user would like to focus solely upon the nuclei within the section, the task could be aided by first applying stain separation to isolate just the regions that have reacted to the nuclear stain. Another example is its use in stain normalisation, where an image is adjusted so that the colours of its stains better match those of another image in which they appear differently.

Typically the stain separation problem is posed as one of finding the optimal stain matrix, a matrix which, when multiplied with the original colour channels, outputs the desired stain channels. A stain matrix can be viewed as a concatenation of a set of model stain vectors, each of which attempts to describe the ideal colour composition of a particular stain. Early attempts at stain separation [3] established a set of fixed stain matrices for many of the possible stain groups, based upon empirical study of the colour distributions present in their sample data set. The drawback of using a single stain matrix for all possible images of one stain group becomes clear when one considers the amount of variability that exists across even images of the same stain.

The variability can generally be attributed to one of two stages of the histological process. First, there are differences in the staining process. Different centres may use chemical stains produced by different manufacturers, and may choose to use different quantities of the stain itself when preparing the slide. This may be in part due to pathologist preference; those who prefer a more saturated slide image will likely request that the section be more deeply stained. Variations in the staining process can subsequently result in a stain appearing as one of a variety of different intensities and shades of the same colour. The second influence occurs in the image capture phase. The components and capture mechanisms vary across scanners, and as a result it is likely that the digitized image of the slide will differ depending on the scanner used. The variability in images of the same slide produced by different scanners has been noted in the literature [1]. Naturally this problem of variability will also extend to the colour distribution of the stains within an image. As a result of these two factors it is generally not possible to produce a single transformation that will adequately separate any image of a given stain group. Therefore there is a clear need to be able to separate stains on a per-image basis, based upon the colour distribution within a given image.

We present a new method of stain separation, capable of separating two or more stains from a bright-field tissue section image. Our method uses Independent Component Analysis as the key component in determining the stain separation vectors. The independent components are then improved by a novel correction step to produce the final stain vectors. Section 2 reviews some of the existing approaches to stain separation, highlighting their main contributions and limitations. In section 3 we outline the proposed stain separation method. Results of the algorithm are shown in section 4, demonstrating its ability to perform well on a variety of stain types and providing comparison with other popular separation algorithms. Finally, section 5 summarises our findings and offers some potential areas of further development.

2. EXISTING WORK

Perhaps one of the earliest attempts at digitally separating stains based on mixed signals was presented by Zhou et al. [2], which focused on the problem in the context of multispectral imaging. Ruifrok and Johnston [3] followed with an approach targeted at standard RGB images, known as colour deconvolution. Colour deconvolution specified fixed stain matrices for a number of popular stain combinations; the matrix for Haematoxylin and Eosin is presented below.

\[ M = \begin{pmatrix} 1.88 & -0.07 & 0.60 \\ -1.02 & 1.13 & -0.48 \\ -0.55 & -0.13 & 1.57 \end{pmatrix} \]

Both of the aforementioned papers propose first transforming the image into Optical Density space, in accordance with the Beer-Lambert law. Consequently we can, in theory, represent the stained image as a linear combination of stain vectors, making the approach of stain separation matrices possible. As noted previously, there are a number of pitfalls to using a single stain matrix for all images: such an approach is unlikely to be able to capture the amount of image variability for even a single stain pair. In addition, a different stain matrix is required for each combination of stains, and each must be precalculated.

As a result a number of methods of dynamic stain separation have been developed. While the methods vary, the underlying approach is largely similar. Methods generally attempt to find correlations in the colour distributions of the input image, and subsequently use these to extract a model colour of each stain for this image; the model colours are then used to generate the stain matrix.

Rabinovich et al. [4] consider the use of ICA and Non-Negative Matrix Factorisation (NNMF) in the separation of multispectral images; however, no consideration was given to either method's use in RGB imaging. RGB images have fewer colour channels and thus cannot always be processed in exactly the same way. In addition, as noted by the authors, ICA or NNMF alone does not result in an optimal separation. Therefore post-processing of the ICA or NNMF data is required for this to be improved upon.

Macenko [5] posed the problem as a component in the larger task of stain normalisation, with particular focus upon slides stained with H&E. A Singular Value Decomposition (SVD) of the image data is calculated and the image data is projected onto a plane corresponding to the two largest singular values. The extreme angles of the data are used to produce two stain vectors, one for Haematoxylin and one for Eosin. This method is limited to the separation of two stains. In addition, the presence of noise and non-stain features can affect the singular vector estimation and thus may result in stain vectors that do not describe the stains. Niethammer [6] extended this approach by including prior information to improve the initial stain vector estimation in the presence of noise.

In a method proposed by Gavrilovic [7], RGB data is projected into the Maxwellian chromaticity plane, a triangular projection which places perceptually similar colours closer together. In this space the stains are expected to be grouped together with a clear division between them. The algorithm models each stain in this space as a 2D Gaussian distribution, and the resulting Gaussian mixture is solved using Expectation Maximisation. The stain vectors are then derived from the means of the Gaussian mixture.

A supervised stain separation method was outlined by Khan [8]. Much like Macenko's, the method described was part of a larger stain normalisation algorithm. The authors introduced the Stain Colour Descriptor (SCD), which attempts to quantify the colour distribution of an image as a single value. This SCD and the image's RGB values were used as input to a pre-trained RVM classifier, which would determine whether each pixel belonged primarily to a particular stain or to the background. The results of the classification are used to determine the ideal stain colours, and thus the stain matrix. While this method does account for intra-stain variability, the problem of inter-stain variability remains. Each combination of stains requires its own pre-trained classifier, which results in a similar limitation to the static stain matrices of Ruifrok and Johnston.

3. THE PROPOSED ALGORITHM

Figure 1 shows an outline of our proposed stain separation method.

Figure 1: The flow of the ICA-based stain separation.

3.1 Preprocessing

3.1.1 Conversion to Optical Density Space

Our method follows the standard convention of converting the image from RGB colour space into Optical Density (OD) space. This conversion has become standard practice for stain separation algorithms [3, 5, 7, 8] because within OD space the intensities of the red, green, and blue channels can generally be represented as a linear combination of the quantities of stains present on the section. The approach follows from the observation that the absorbance of many stains, notably including Haematoxylin and Eosin, follows the Beer-Lambert law [9]. Equation 1 is used to make the conversion.

\[ I_{OD} = -\ln\left( \frac{I_{RGB}}{I_{RGB_0}} \right) \tag{1} \]
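As a minimal sketch of Equation 1, the conversion can be written with NumPy. The function name `rgb_to_od`, the choice of 255 as the reference intensity $I_{RGB_0}$, and the clipping of very dark pixels are illustrative assumptions rather than details given in the text.

```python
import numpy as np

def rgb_to_od(rgb, background=255.0, min_intensity=1.0):
    """Convert RGB intensities to optical density: I_OD = -ln(I_RGB / I_RGB0).

    `background` stands in for I_RGB0 (assumed here to be pure white) and
    `min_intensity` clips dark pixels so the logarithm stays finite; both
    are assumptions of this sketch, not values taken from the paper.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    rgb = np.clip(rgb, min_intensity, None)
    return -np.log(rgb / background)

# A white pixel has zero optical density; darker pixels have larger values.
pixels = np.array([[255, 255, 255], [128, 64, 200]])
od = rgb_to_od(pixels)
```

Under these assumptions an unstained (white) pixel maps to the origin of OD space, which is what allows stain contributions to add linearly.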

3.1.2 Dimensionality Reduction

Principal Component Analysis (PCA) is applied to the OD image data and the n largest principal components are selected for further processing, where n is the number of stains present in the section image. Each principal component is assembled as a linear combination of the OD stain channels, and as such our reduced representation of the image data, $I_R$, can be constructed by a matrix multiplication on our OD pixels. We therefore define the reduction matrix, H, such that

\[ I_R = I_{OD} H. \tag{2} \]

It is important to note that the use of PCA does not affect the results of the ICA projection in later stages of the algorithm. This is because ICA algorithms typically include a whitening preprocessing step, which transforms the colour channels into a set of uncorrelated, unit-variance components. PCA can be viewed as a means of whitening the data, so such a transformation would occur at this point anyway; in fact, in many ICA implementations PCA itself is used as the whitening method.
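The reduction step can be sketched as follows, assuming the OD pixels are stored as an m × 3 array. The helper name `pca_reduction_matrix` and the synthetic stain colours are hypothetical and serve only to exercise Equation 2; the principal axes are taken from an eigendecomposition of the channel covariance, which is one common way to compute PCA.

```python
import numpy as np

def pca_reduction_matrix(od_pixels, n_stains):
    """Return H (3 x n_stains), whose columns are the leading principal axes."""
    cov = np.cov(od_pixels, rowvar=False)          # 3 x 3 channel covariance
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_stains]   # indices of the n largest
    return eigvecs[:, order]

# Synthetic OD data: two underlying stain quantities mixed into three channels.
# The colour vectors below are arbitrary illustrative values.
rng = np.random.default_rng(0)
quantities = rng.uniform(0.0, 2.0, size=(1000, 2))
stain_colours = np.array([[0.65, 0.70, 0.29],
                          [0.07, 0.99, 0.11]])
od_pixels = quantities @ stain_colours

H = pca_reduction_matrix(od_pixels, n_stains=2)
reduced = od_pixels @ H                            # I_R = I_OD H (Equation 2)
```

Because the synthetic data has only two underlying sources, the two retained components capture all of its variance; real images will lose a small amount of information in this step.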

3.2 Independent Component Analysis

Independent Component Analysis is a method of blind source separation that attempts to recover statistically independent, non-Gaussian source signals from observed mixtures of them. It is closely related to the similarly named PCA; however, a key difference is that PCA distributes the data in the way that best separates variance, while ICA does the same for kurtosis. In our observations the calculated independent components correspond quite closely to the desired stain vectors, which is not the case for some related methods, such as PCA alone. This property of ICA is very desirable because it simplifies the process of estimating stain vectors from our initial components. For our implementation we adopt the FastICA algorithm [10].

In applying ICA we arrive at a matrix, $\Pi$, that projects our image data from the dimensionality-reduced PCA space to ICA space. Application of the matrix as follows

\[ P = I_R \Pi \tag{3} \]

results in a set of ICA data points $P = \{p_1, \ldots, p_m\}$, which coarsely correspond to the relative stain quantities. For convenience we define a matrix, W, that projects directly from OD space to ICA space. As both intermediate transformations are linear we can arrive at this matrix by a multiplication of the two matrices, thus

\[ W = H \Pi. \tag{4} \]
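To make the role of $\Pi$ concrete, the following is a simplified NumPy re-implementation of symmetric FastICA, not the authors' code. The tanh contrast function, the centring of the data before projection, the random initialisation, and all function names are assumptions of this sketch.

```python
import numpy as np

def whiten(X):
    """Centre X (m x n) and rescale it to identity covariance."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    K = eigvecs / np.sqrt(eigvals)            # whitening matrix, n x n
    return (X - mean) @ K, K, mean

def sym_decorrelate(B):
    """Return (B B^T)^(-1/2) B, which makes the rows of B orthonormal."""
    s, u = np.linalg.eigh(B @ B.T)
    return (u / np.sqrt(s)) @ u.T @ B

def fast_ica(X, max_iter=200, tol=1e-6, seed=0):
    """Estimate Pi such that (X - mean) @ Pi has independent columns."""
    Z, K, mean = whiten(X)
    m, n = Z.shape
    rng = np.random.default_rng(seed)
    B = sym_decorrelate(rng.standard_normal((n, n)))
    for _ in range(max_iter):
        G = np.tanh(Z @ B.T)                  # contrast applied to current estimates
        # Fixed-point update: E[z g(w.z)] - E[g'(w.z)] w for every row w of B.
        B_new = (G.T @ Z) / m - (1.0 - G**2).mean(axis=0)[:, None] * B
        B_new = sym_decorrelate(B_new)
        change = np.max(np.abs(np.abs(np.sum(B_new * B, axis=1)) - 1.0))
        B = B_new
        if change < tol:
            break
    return K @ B.T, mean

# Two independent non-Gaussian sources, linearly mixed.
rng = np.random.default_rng(1)
S = np.column_stack([rng.uniform(-1.0, 1.0, 5000), rng.laplace(size=5000)])
A = np.array([[1.0, 0.5], [0.3, 1.0]])
X = S @ A.T

Pi, mean = fast_ica(X)
P = (X - mean) @ Pi                           # analogue of Equation 3
```

If `X` were the PCA-reduced data $I_R$ and `H` the reduction matrix, the composite projection of Equation 4 would simply be `W = H @ Pi`. The recovered components match the sources only up to order, sign, and scale, which is an inherent ambiguity of ICA.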

3.3 ICA Stain Vector Correction

In many cases ICA distributes the data along the axes, such that each independent component directly corresponds to a given stain. In more challenging cases, such as those where two stains have similar colours, the distribution of data may not be perfectly orthogonal. As a result the independent components will not correspond directly to the stains. We therefore introduce a correction step to estimate the true stain vectors from within the ICA space. The correction algorithm is based on two main conditions of the data in ICA space.

1. The true stain vectors are a minor adjustment from the unit vectors in ICA space.
2. Most of the data in ICA space is close to one of the true stain vectors.

The second condition could also be phrased as follows: we would like to arrive at a set of stain vectors $V = \{v_1, \ldots, v_n\}$ that best fit the ICA projected data, P. We define $d_{i,j}$ as the squared distance between $p_j$ and the closest point to it along $v_i$, which we will henceforth refer to as $v_i(t) = t\,v_i$. We can calculate $d_{i,j}$ by first defining an intermediate vector

\[ L_{i,j} = v_i(t) - p_j, \tag{5} \]

representing the path from $p_j$ to $v_i(t)$. It should be clear that the squared magnitude of $L_{i,j}$ is equivalent to our desired distance measure, or rather

\[ d_{i,j} = \|L_{i,j}\|^2. \tag{6} \]

By substitution of equation 5 this can also be represented as

\[ d_{i,j} = \|v_i\|^2 t^2 - 2 (v_i \cdot p_j) t + \|p_j\|^2. \tag{7} \]

The unknown t can be determined by differentiating $d_{i,j}$ with respect to t and locating the stationary point. For equation 7 this is

\[ t = \frac{v_i \cdot p_j}{\|v_i\|^2}. \tag{8} \]

Substituting for t in equation 7 ultimately results in a distance equation of

\[ d_{i,j} = \|p_j\|^2 - \frac{(v_i \cdot p_j)^2}{\|v_i\|^2}. \tag{9} \]
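The derivation can be checked numerically with a short sketch. The function names are illustrative; each function is a direct transcription of the equation named in its docstring.

```python
import numpy as np

def squared_distance_at(v, p, t):
    """Equation 7: squared distance from p to the point t * v."""
    return (v @ v) * t**2 - 2.0 * (v @ p) * t + (p @ p)

def closest_t(v, p):
    """Equation 8: the value of t at the stationary point of Equation 7."""
    return (v @ p) / (v @ v)

def squared_distance_to_vector(v, p):
    """Equation 9: squared distance from p to the closest point along v."""
    return (p @ p) - (v @ p) ** 2 / (v @ v)

v = np.array([1.0, 0.0])
p = np.array([3.0, 4.0])
t_star = closest_t(v, p)                       # 3.0
d_closed = squared_distance_to_vector(v, p)    # 16.0
```

For this example the closest point along v is (3, 0), which lies 4 units from p, so the squared distance of 16 agrees with Equation 9.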

In accordance with our second condition, for each point we are only interested in the distance to its closest vector. Consequently we define

\[ D_j = \min(d_{1,j}, \ldots, d_{n,j}), \tag{10} \]

the minimal distance of a point $p_j$ to its closest vector in V. Finally we define the objective function

\[ C(V, P) = \frac{1}{m} \sum_{j=1}^{m} D_j \tag{11} \]

which calculates the mean distance from a point to its nearest stain vector. For a given set of data in ICA space, P, Equation 11 will be minimised by the true stain vectors $V_{min}$. We therefore seek the global minimum of this objective function. Under our first condition the unit vectors we begin with are already quite close to the true stain vectors. As a result we can adopt an iterative local search to improve our initial vectors until we reach a local minimum. Algorithm 1 details the method in pseudo-code. It can be seen that the method cycles through the provisional vectors continually, updating them until convergence of C(V, P) is reached, which is normally within 100 iterations. While this approach may not necessarily produce a global minimum, it is highly unlikely to result in vectors that are vastly different from our true stain vectors if our first condition holds. Typically the adjustment is minor, but in certain cases it has been shown to give greatly improved separation when compared to ICA without the correction step.

The true stain vectors are linear combinations of the independent components, which in turn are linear combinations of the OD stain channels. Therefore to generate the final stain matrix we simply multiply our ICA demixing matrix, W, by an n × n matrix produced by concatenating the members of $V_{min}$,

\[ M = W V_{min}. \tag{12} \]

From this we derive the following equation to produce our separated stain channels

\[ S = e^{-I_{OD} M} \tag{13} \]

or in terms of the original pixel values

\[ S = \exp\left( \ln\left( \frac{I_{RGB}}{I_{RGB_0}} \right) H \Pi V_{min} \right). \tag{14} \]
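Equations 1, 4, 12 and 13 can be chained in a few lines, as in the sketch below. It assumes that $V_{min}$ is stored as an n × n array with one stain vector per column, that the reference intensity $I_{RGB_0}$ is 255, and that dark pixels are clipped to keep the logarithm finite; the function name and the toy matrices are hypothetical.

```python
import numpy as np

def separate_stains(rgb, H, Pi, V_min, background=255.0):
    """Map an (..., 3) RGB array to (..., n) separated stain channels."""
    rgb = np.clip(np.asarray(rgb, dtype=np.float64), 1.0, None)
    od = -np.log(rgb / background)        # Equation 1
    W = H @ Pi                            # Equation 4: OD space to ICA space
    M = W @ V_min                         # Equation 12: final stain matrix
    flat = od.reshape(-1, 3)              # one row per pixel
    stains = np.exp(-flat @ M)            # Equation 13
    return stains.reshape(rgb.shape[:-1] + (M.shape[1],))

# Toy matrices chosen so the result is easy to predict: H keeps the first
# two OD channels and the other two projections are identities.
H = np.eye(3)[:, :2]
Pi = np.eye(2)
V_min = np.eye(2)
image = np.array([[[200, 100, 50], [255, 255, 255]],
                  [[30, 60, 90], [128, 128, 128]]])
S = separate_stains(image, H, Pi, V_min)
```

With these toy matrices each output channel reduces to the corresponding input channel divided by 255, which gives a quick sanity check that the exponential undoes the logarithm as Equation 14 implies.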

Input : P, the ICA projected image data
Output: V_min, the true stain vectors

 1  Initialise: V_min as unit vectors in ICA space;
 2  for i = 1 to n do
 3      Initialise: δ_i = 0.1;
 4  end
 5  Set: Cost = C(V_min, P);
 6  Set: MinDelta = 10^-6;
 7  foreach stain vector: v_i ∈ V_min do
 8      Generate U = {u_0, u_1, ...}, the set of update directions;
 9      foreach update vector: u_j ∈ U do
10          Set: v'_j = v_i + δ_i u_j;
11          Set: V'_j = {x : (x ∈ V_min AND x ≠ v_i) OR x = v'_j};
12          Set: K_j = C(V'_j, P);
13      end
14      Set: r = argmin(K);
15      if K_r < Cost then
16          Set: Cost = K_r;
17          Set: V_min = V'_r;
18      else
19          Set: δ_i = max(δ_i / 10, MinDelta);
20      end
21  end
22  if ∃i : δ_i > MinDelta then
23      Repeat from 7;
24  else
25      Return: V_min;
26  end

Algorithm 1: The ICA vector correction algorithm
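A possible Python rendering of Algorithm 1 is sketched below. The text does not specify the set of update directions U, so this sketch assumes the positive and negative coordinate axes of ICA space; renormalising each candidate vector and capping the number of sweeps with `max_sweeps` are further assumptions added for numerical safety. The cost function averages the squared distances of Equation 9 over all points.

```python
import numpy as np

def cost(V, P):
    """Objective C(V, P): mean squared distance from each point to its nearest vector.

    V holds one stain vector per row (n x n); P holds one point per row (m x n).
    """
    proj = P @ V.T                                    # entry (j, i) is v_i . p_j
    d = (P**2).sum(axis=1, keepdims=True) - proj**2 / (V**2).sum(axis=1)
    return d.min(axis=1).mean()

def correct_vectors(P, step=0.1, min_delta=1e-6, max_sweeps=1000):
    """Local search that starts from the unit vectors of ICA space."""
    n = P.shape[1]
    V = np.eye(n)
    delta = np.full(n, step)
    best = cost(V, P)
    directions = np.vstack([np.eye(n), -np.eye(n)])  # assumed update directions U
    for _ in range(max_sweeps):
        for i in range(n):
            candidates = []
            for u in directions:
                V_try = V.copy()
                V_try[i] = V[i] + delta[i] * u
                V_try[i] /= np.linalg.norm(V_try[i])  # keep unit length
                candidates.append((cost(V_try, P), V_try))
            k_r, V_r = min(candidates, key=lambda c: c[0])
            if k_r < best:
                best, V = k_r, V_r                    # accept the best candidate
            else:
                delta[i] = max(delta[i] / 10.0, min_delta)
        if not np.any(delta > min_delta):
            break
    return V

# Synthetic ICA-space data scattered around two slightly off-axis directions.
rng = np.random.default_rng(0)
a = np.array([1.0, 0.2])
b = np.array([0.15, 1.0])
t = rng.uniform(0.5, 2.0, size=(500, 1))
P = np.vstack([t * a, t * b]) + rng.normal(scale=0.01, size=(1000, 2))
V = correct_vectors(P)
```

On this synthetic data the corrected vectors rotate away from the axes towards the directions `a` and `b`, which mirrors the minor adjustment described by the first condition.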

4. RESULTS AND DISCUSSION

Figure 2 demonstrates the separation quality of our method, as compared to a number of alternative stain separation algorithms. The separated stain images have been pseudo-coloured to show the estimated dominant colour for the stain. It is clear that this image cannot be separated by any but the final two stain separation methods, because the others only consider the problem of dividing two stains. The presence of the melanin regions, appearing as a dark brown colour in the image, means that there are more than two dominant colours in the image. It can be seen that Macenko's method has failed to give good separation in this case, distributing the Eosin stain onto the channels for both Haematoxylin and the melanin. This is because Macenko's method only considers stain separation as a two-stain problem, and the introduction of a third dominant colour violates this assumption. Our ICA-based method, by contrast, is able to identify the three dominant colours and extract them into separate channels. Of the methods tested, Gavrilovic's is the only other method that is able to properly treat this as a three-stain problem and as such divide the three dominant colours, but despite this there are flaws in the separation. On the Eosin channel we can see that in regions where melanin is present the intensity is much deeper than in surrounding areas, implying that the stain channel has inadvertently caught some of the melanin. This is likely because the presence of melanin has influenced the location of the mean of the Eosin Gaussian distribution, and thus the stain vector as a whole.

Figure 3 shows our method's performance on a number of different stain types, demonstrating its flexibility to provide good separation for a variety of stain colour combinations. Observe that for the separation of the CK56 image some of the DAB stain can be seen in the Haematoxylin channel. This appears to correspond to the most heavily stained region, which is almost black in colour. An accurate separation is very difficult in these cases because such colours exhibit a very low luminance value and as such may not correspond with a particular dominant colour as strongly as lighter pixels. As a result, association with a particular dominant colour may not be possible considering local information alone, perhaps suggesting that stain separation methods would benefit from also considering the colour of neighbouring pixels.

5. CONCLUSIONS AND FUTURE WORK

We demonstrate a method of stain quantification and separation using Independent Component Analysis, which to the best of our knowledge has not been used previously for RGB images. Additionally we present a novel stain vector correction method for the initial unit ICA vectors, resulting in a separation which better describes the stains. Previous stain separation methods have frequently been limited to sections with only two stains, with a major emphasis on Haematoxylin and Eosin. We have shown how our method can perform adequate separation on sections with more than two dominant stain colours. In addition we have demonstrated high quality separation on sections from a variety of different stain groupings.

As previously acknowledged, there are some cases where the colour of a given pixel is not sufficient to determine which stain is predominant at that point. These cases typically correspond to pixels that are either very bright or very dark, for which the colour of the pixel is much less clear. A stain separation method that operates on a pixel neighbourhood, rather than on a pixel-by-pixel basis, is one possible solution to this issue. The current method requires the number of dominant colours to be specified as input; however, it is hoped that in future this method can be combined with a method to estimate the optimal number of dominant colours, to achieve fully automated separation. Methods of determining the optimal number of independent components [11] are a possible way this could be realised.

ACKNOWLEDGMENTS

Whole-slide images used during the course of this project have been acquired using Omnyx software and were provided courtesy of University Hospitals Coventry and Warwickshire. Research has been funded jointly by EPSRC and Omnyx LLC.

REFERENCES

[1] Yagi, Y., "Color standardization and optimization in whole slide imaging," Diagn Pathol 6(Suppl 1), S15 (2011).
[2] Zhou, R., Hammond, E. H., and Parker, D. L., "A multiple wavelength algorithm in color image analysis and its applications in stain decomposition in microscopy images," Medical Physics 23(12), 1977–1986 (1996).
[3] Ruifrok, A. and Johnston, D., "Quantification of histochemical staining by color deconvolution," Analytical and quantitative cytology and histology / the International Academy of Cytology [and] American Society of Cytology 23, 291–299 (August 2001).
[4] Rabinovich, A., Agarwal, S., Laris, C., Price, J. H., and Belongie, S. J., "Unsupervised color decomposition of histologically stained tissue samples," in [Advances in Neural Information Processing Systems], (2003).
[5] Macenko, M., Niethammer, M., Marron, J., Borland, D., Woosley, J. T., Guan, X., Schmitt, C., and Thomas, N. E., "A method for normalizing histology slides for quantitative analysis," in [ISBI], 9, 1107–1110 (2009).
[6] Niethammer, M., Borland, D., Marron, J., Woosley, J., and Thomas, N., "Appearance normalization of histology slides," in [Machine Learning in Medical Imaging], Wang, F., Yan, P., Suzuki, K., and Shen, D., eds., Lecture Notes in Computer Science 6357, 58–66, Springer Berlin Heidelberg (2010).
[7] Gavrilovic, M., Azar, J. C., Lindblad, J., Wahlby, C., Bengtsson, E., Busch, C., and Carlbom, I. B., "Blind color decomposition of histological images," IEEE Transactions on Medical Imaging 32(6), 983–994 (2013).
[8] Khan, A., Rajpoot, N., Treanor, D., and Magee, D., "A nonlinear mapping approach to stain normalization in digital histopathology images using image-specific color deconvolution," Biomedical Engineering, IEEE Transactions on 61, 1729–1738 (June 2014).
[9] van der Loos, C. M., "Multiple immunoenzyme staining: methods and visualizations for the observation with spectral imaging," Journal of Histochemistry & Cytochemistry 56(4), 313–328 (2008).
[10] Hyvärinen, A. and Oja, E., "Independent component analysis: algorithms and applications," Neural Networks 13(4), 411–430 (2000).
[11] Rasmussen, P. M., Mørup, M., Hansen, L. K., and Arnfred, S. M., "Model order estimation for independent component analysis of epoched EEG signals," in [Biosignals 2008 International Conference on Bio-inspired Systems and Signal Processing], (2008).

Figure 2: Comparison of our approach with popular stain separation methods. The H&E image shown, which contains brown melanin regions, is an example of a case that existing stain separation methods cannot process correctly. Panels: Original Image; Ruifrok & Johnston [3]; Macenko [5]; Khan [8]; Gavrilovic [7]; ICA Stain Separation.

Figure 3: Comparison of the ICA-based method across a variety of stain types. Columns: Stain Type; Original Image; Stain 1; Stain 2. Rows: ALK; CK56; ER; AB; H&E.
