Local Binary Pattern for Automatic Detection of Acute ... - IEEE Xplore

Local Binary Pattern for Automatic Detection of Acute Lymphoblastic Leukemia

Vanika Singhal
Department of Computer Science and Engineering, The LNM Institute of Information Technology, Jaipur, Rajasthan. Email: [email protected]

Preety Singh
Department of Computer Science and Engineering, The LNM Institute of Information Technology, Jaipur, Rajasthan. Email: [email protected]

Abstract—Acute Lymphoblastic Leukemia (ALL) is caused by an increase in the number of abnormal lymphocyte cells in the blood or bone marrow. This paper presents a methodology for automatic detection of abnormal lymphocytes in a given image of a blood sample. We use Local Binary Pattern (LBP) texture features of the nucleus to classify a lymphocyte cell as blast or normal. We also use shape features for classification and perform a comparative analysis of the two feature sets. The LBP features are seen to provide reasonably good classification accuracy.

978-1-4799-2361-8/14/$31.00 © 2014 IEEE

I. INTRODUCTION

Leukemia is a cancer of the blood or bone marrow, characterised by an abnormal increase in immature white blood cells called blasts. Acute Lymphoblastic Leukemia (ALL) is a type of leukemia that affects lymphocytes and progresses rapidly. ALL accounts for about 80% of childhood leukemia and mostly occurs in the age group of 2-5 years [1]. ALL is characterised by excess lymphocytes in the blood. Lymphocytes are a type of white blood cell that fights infection. In ALL, the bone marrow produces large numbers of immature cells, called blasts, that would otherwise have developed into lymphocytes. These blasts are abnormal and cannot fight infection. The increased number of blasts crowds out normal cells in the blood, and the blasts spread into peripheral blood and other body organs.

Early and fast diagnosis of the disease is essential for the recovery of the patient. The symptoms of leukemia, such as fever, anemia, weakness, bone pain and joint pain, are similar to those of other disorders, and for this reason diagnosis is difficult [1]. Manual detection of the disease is subjective, as it depends on the skill and fatigue of the pathologist [11]. Image processing techniques can be employed for automatic detection of cancer using images of a blood sample. Automation can speed up detection and is also economical. Digital analysis is also beneficial for remote diagnosis, where only the image of the blood sample needs to be sent.

In automatic detection of ALL, images of blood or bone marrow are processed using image processing techniques. The images contain three major components: white blood cells, red blood cells and platelets. The initial step in the detection of ALL is preprocessing, which is used to remove noise from the image. Then, segmentation is performed to extract white blood cells from all the cells present in the image. As ALL is concerned with lymphocytes, the next step is to extract the lymphocytes from the white blood cells [13]. To detect ALL, features are extracted from images of lymphocytes and classification is performed. A trained classifier marks each input test image as a normal or blast cell.

In this paper, we propose using Local Binary Pattern (LBP), a texture feature descriptor, as a feature for automatic detection of ALL. To the best of our knowledge, LBP has not been used for ALL detection before. We also extract geometric features from the lymphocyte and compare classification results for both types of features. We find that LBP features perform almost as well as the geometrical features.

The rest of the paper is organised as follows. Section II presents the related work. Section III describes the methods used for segmentation of the lymphocyte cell, feature extraction and classification. Section IV presents a brief description of our experimental setup and analyses the obtained results. Section V presents the concluding remarks.

II. RELATED WORK

In the literature, many methods have been proposed for automated detection of ALL. Rezatofighi et al. [13] propose a color image segmentation method that applies Gram-Schmidt orthogonalization to segment the nucleus of a white blood cell. Halim et al. [15] and Lin and Hong [16] use a region growing technique to segment the nucleus from the color image of blood. Mohapatra et al. [2] use boundary irregularity, obtained using the Hausdorff dimension and contour signature techniques, together with shape, color, texture and densitometric features. A Support Vector Machine (SVM) is used for classification and 95% classification accuracy is reported. Asadi et al. [3] use Zernike moments as features for classification.
Two classification methods, k-nearest neighbor (k-NN) and minimum distance mean, are used for classification. k-NN gives the best reported accuracy of 93%. In [4], FAB classification is used for the selection of shape features. Three classifiers, linear, k-nearest neighbor and feedforward neural network (FF-NN), are used to compute the classification accuracy. FF-NN shows the minimum error rate, and lymphocytes are classified with a mean error rate of 0.02%. Madhloom et al. [10] use a combination of shape features extracted from the nucleus and the cell, and texture features of the nucleus. A k-nearest neighbor classifier is used and 92.5% accuracy is reported. Mohapatra et al. [6] use the Shadowed C-Means (SCM) clustering technique for segmentation. A combination of shape, color and three types of texture feature descriptors (wavelet, Haralick and Fourier) is used to train an ensemble of classifiers. A classification accuracy of 99% is reported.

III. PROPOSED METHODOLOGY

A. Segmentation

The basic steps towards automatic detection of ALL in a blood sample image are: image preprocessing, lymphocyte segmentation, separation of nucleus and cytoplasm, feature extraction and classification. Since our database contains images of lymphocytes only, we skip the step of separating lymphocytes from other white blood cells. The process is shown in Figure 1.

Fig. 1. Detection of ALL

The task of segmentation is to remove the background and extract the lymphocyte. From the image of the lymphocyte, the nucleus and cytoplasm are extracted. We use HSI color-based segmentation as it provides better performance than RGB color segmentation [7].

Fig. 2. Segmentation of nucleus, cell and cytoplasm of a lymphocyte: (a) Original Image, (b) Nucleus, (c) Cell, (d) Cytoplasm

The process of nucleus segmentation consists of the following steps:

• Convert the input RGB image of the lymphocyte into HSI image format.
• Select the saturation (S) component from the HSI color space.
• Apply thresholding on the S component of the image to extract the nucleus. The threshold is selected manually.
• Apply the morphological operation dilation on the image obtained from the previous step. A disk-shaped structuring element of size 3 × 3 is used for dilation.
• Apply a median filter of size 5 × 5 on the resultant image.
• Based on the area, remove small pixel groups of noise from the image.

Thus, the nucleus is segmented from the image. The gray scale image of the nucleus is then obtained by mapping the resulting binary image onto the gray scale version of the input RGB image.

A similar method is used to extract the cell from the input image, but thresholding is applied on the hue component of the HSI color image. Using the cell and nucleus images, the cytoplasm can be extracted by subtracting the nucleus from the cell.
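The thresholding, dilation and median filtering steps listed above can be illustrated with a small pure-Python sketch. It is not the authors' implementation: the threshold value 0.4, the synthetic colour values and the helper names are assumptions made for illustration, a full 3 × 3 square stands in for the disk-shaped structuring element, and the area-based noise removal and gray scale mapping are omitted.

```python
# Sketch of saturation-based nucleus segmentation on a tiny synthetic image.
# Assumptions (not from the paper): the threshold 0.4, the colour values, and
# a full 3x3 square used in place of the disk-shaped structuring element.

def saturation(r, g, b):
    """S component of HSI: S = 1 - 3 * min(R, G, B) / (R + G + B)."""
    total = r + g + b
    if total == 0:
        return 0.0  # black pixel: define saturation as 0
    return 1.0 - 3.0 * min(r, g, b) / total


def threshold_saturation(image, thresh):
    """Binary mask with 1 where the saturation exceeds the threshold."""
    return [[1 if saturation(*px) > thresh else 0 for px in row]
            for row in image]


def window(mask, y, x, radius):
    """Values in the square window around (y, x), clipped at the borders."""
    h, w = len(mask), len(mask[0])
    return [mask[yy][xx]
            for yy in range(max(0, y - radius), min(h, y + radius + 1))
            for xx in range(max(0, x - radius), min(w, x + radius + 1))]


def dilate(mask, radius=1):
    """Binary dilation: a pixel is set if any pixel in its window is set."""
    return [[max(window(mask, y, x, radius)) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def median_filter(mask, radius=2):
    """Median filter (5x5 for radius=2): middle value of the sorted window."""
    out = []
    for y in range(len(mask)):
        row = []
        for x in range(len(mask[0])):
            values = sorted(window(mask, y, x, radius))
            row.append(values[len(values) // 2])
        out.append(row)
    return out


def segment_nucleus(image, thresh):
    """Threshold the S channel, dilate, then median-filter the mask."""
    return median_filter(dilate(threshold_saturation(image, thresh)))


# Synthetic 7x7 image: a saturated 3x3 "nucleus" on a pale background.
BACKGROUND = (230, 225, 220)
NUCLEUS = (90, 40, 140)
image = [[NUCLEUS if 2 <= y <= 4 and 2 <= x <= 4 else BACKGROUND
          for x in range(7)] for y in range(7)]

mask = segment_nucleus(image, thresh=0.4)
for row in mask:
    print("".join(str(v) for v in row))
```

On real images the threshold would be chosen manually, as the paper states, and an image processing library would normally replace these hand-written loops.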

Figure 2(a) shows the original blood image consisting of a single lymphocyte with a cluttered background. Figures 2(b), 2(c) and 2(d) show the nucleus, cell and cytoplasm respectively, obtained from HSI-based segmentation.

B. Feature Extraction

In this paper, we extract two types of features: shape features and texture features using Local Binary Pattern. The performance of ALL detection is compared using these two sets of features.

1) Shape Features: According to haematologists, shape features play an important role in the diagnosis of ALL. Blast cells are immature forms of lymphocyte cells and have different shape characteristics from normal lymphocytes [10]. The nucleus of a blast cell is relatively larger than the nucleus of a normal lymphocyte cell. Besides the size of the nucleus, other shape features are also able to differentiate between normal and blast cells. The shape features used are as follows:

• Area - number of pixels in the nucleus, cytoplasm and cell.
• Perimeter - total distance between adjoining pairs of pixels around the border of the nucleus.
• Convex Area - area of the smallest convex polygon that can contain the nucleus.
• Eccentricity - measure of the deviation of the object from being circular. This feature is important as normal lymphocytes are more circular than blasts [5].
• Major axis length - length in pixels of the major axis of the ellipse containing the nucleus.
• Minor axis length - length in pixels of the minor axis of the ellipse containing the nucleus.
• Solidity - ratio of the number of pixels in the nucleus to the area of its convex hull. This is computed as:

$$\text{Solidity} = \frac{\text{Area}}{\text{ConvexArea}} \tag{1}$$

• Compactness - measure of roundness of the nucleus [5]:

$$\text{Compactness} = \frac{\text{Perimeter}^2}{\text{Area}} \tag{2}$$

• Orientation - angle between the major axis and the x-axis. As shown in Figure 3(c), angle a is the orientation.
• Ratio of area of cytoplasm to nucleus.
• Ratio of area of nucleus to cell.

This gives a total of thirteen geometrical features, with Area contributing three values (nucleus, cytoplasm and cell). Some of the features are shown in Figure 3.

Fig. 4. Computation of Local Binary Pattern: (a) Sample, (b) Difference with center pixel value, (c) Weights, (d) Thresholding result. Binary pattern = 0×1 + 0×2 + 0×4 + 1×8 + 1×16 + 1×32 + 1×64 + 0×128 = 120
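As a hedged illustration of the derived shape features, the sketch below evaluates Equations (1) and (2) and the two area ratios from already-measured quantities. The function names and the sample measurements are hypothetical. The eccentricity formula is the standard ellipse eccentricity, which is an assumption, since the paper does not state how eccentricity is computed.

```python
import math


def solidity(area, convex_area):
    """Eq. (1): nucleus area divided by the area of its convex hull."""
    return area / convex_area


def compactness(perimeter, area):
    """Eq. (2): squared perimeter divided by area."""
    return perimeter ** 2 / area


def eccentricity(major_axis, minor_axis):
    """Eccentricity of an ellipse with these axis lengths (assumed definition)."""
    return math.sqrt(1.0 - (minor_axis / major_axis) ** 2)


def area_ratios(nucleus_area, cell_area):
    """Return (cytoplasm / nucleus, nucleus / cell) area ratios."""
    cytoplasm_area = cell_area - nucleus_area  # cytoplasm = cell minus nucleus
    return cytoplasm_area / nucleus_area, nucleus_area / cell_area


# Hypothetical measurements in pixels, for illustration only.
nucleus_area, convex_area, perimeter = 1200.0, 1250.0, 130.0
print("solidity:", solidity(nucleus_area, convex_area))
print("compactness:", compactness(perimeter, nucleus_area))
print("eccentricity:", eccentricity(40.0, 24.0))
print("area ratios:", area_ratios(nucleus_area, 1500.0))
```

For an ideal disc of radius r, Equation (2) gives (2πr)² / (πr²) = 4π, the smallest value compactness can take, so larger values indicate a less round nucleus.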

Fig. 3. Shape Features: (a) Perimeter, (b) Convex Area, (c) Orientation, (d) Major Axis, (e) Minor Axis

2) Local Binary Pattern: Texture gives information about the spatial arrangement of the intensities in an image. Local Binary Pattern (LBP) [9] is a texture feature which is computationally simple and robust to changes in intensity due to illumination variations. ALL causes major changes in the chromatin distribution of the lymphocyte nucleus, which can be visualised in the form of texture [6]. Thus, we compute LBP to capture texture variations in the nucleus image.

To compute the LBP, the nucleus image is first divided into multiple subimages of equal size. LBP is computed by thresholding the 3 × 3 neighborhood of each pixel by the center pixel value. A bit code is generated by comparing the eight neighborhood pixels with the center pixel, resulting in a binary pattern. Let d = p − c be the difference between a neighborhood pixel p and the center pixel c. Whenever d is greater than zero, a 1 is generated; otherwise, a 0 is generated, as given below:

$$s = \begin{cases} 1, & \text{if } d > 0 \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

A label for the center pixel is generated by multiplying each thresholded value s by the weight assigned to the corresponding neighborhood pixel and summing the products. The weights are assigned in a snail direction starting from the top-left corner. The label is computed for each pixel in the block. Using these labels, a histogram for the subimage is formed. An illustration of LBP computation is shown in Figure 4. Histograms are computed for each subimage. The LBP for the complete image is obtained by concatenating the histograms of all the subimages. The LBP pattern can now be used as a feature vector.

Intensity variations in blast cells differ from those in normal lymphocyte cells. Thus, texture features are computed over the nucleus image only. The gray scale nucleus image is shown in Figure 5(a). The histogram of a subimage of the nucleus is shown in Figure 5(b). The LBP obtained by concatenating the histograms of all the subimages is shown in Figure 5(c).

Fig. 5. Example of LBP computation: (a) Gray scale image of nucleus, (b) Histogram of a subimage, (c) LBP obtained by concatenating histograms of all subimages
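The LBP procedure described above can be sketched in a few lines of pure Python. This is an illustrative reading, not the authors' code. It assumes that the "snail direction" means clockwise from the top-left neighbour with weights 1, 2, 4, ..., 128, that border pixels of each subimage are skipped, and that subimages are non-overlapping squares. The sample patch is made up, and is chosen so that its thresholded bits reproduce the pattern 0, 0, 0, 1, 1, 1, 1, 0.

```python
# Offsets (dy, dx) of the eight neighbours, clockwise from the top-left
# corner (assumed interpretation of the "snail direction").
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]
WEIGHTS = [1, 2, 4, 8, 16, 32, 64, 128]


def lbp_label(image, y, x):
    """LBP label of pixel (y, x): sum of weights where neighbour - centre > 0."""
    centre = image[y][x]
    label = 0
    for (dy, dx), weight in zip(NEIGHBOUR_OFFSETS, WEIGHTS):
        d = image[y + dy][x + dx] - centre
        if d > 0:  # Eq. (3): s = 1 if d > 0, else 0
            label += weight
    return label


def lbp_histogram(block):
    """256-bin histogram of LBP labels over the interior pixels of a block."""
    hist = [0] * 256
    for y in range(1, len(block) - 1):
        for x in range(1, len(block[0]) - 1):
            hist[lbp_label(block, y, x)] += 1
    return hist


def lbp_feature_vector(image, block_size):
    """Concatenate the histograms of non-overlapping square subimages."""
    features = []
    for top in range(0, len(image) - block_size + 1, block_size):
        for left in range(0, len(image[0]) - block_size + 1, block_size):
            block = [row[left:left + block_size]
                     for row in image[top:top + block_size]]
            features.extend(lbp_histogram(block))
    return features


# Made-up 3x3 patch: only the right, bottom-right, bottom and bottom-left
# neighbours exceed the centre value 5, so the label is 8 + 16 + 32 + 64.
patch = [[1, 2, 3],
         [4, 5, 6],
         [9, 8, 7]]
print(lbp_label(patch, 1, 1))  # 120

# Tile the patch into a 6x6 image and use 3x3 subimages: four histograms.
image = [row + row for row in patch] * 2
print(len(lbp_feature_vector(image, block_size=3)))  # 1024
```

The length of the feature vector grows with the number of subimages (256 bins per subimage in this sketch), so the subimage size trades spatial detail against vector length.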

C. Classification

For training and classification, we use a Support Vector Machine (SVM) [12]. The training algorithm takes a set of input data to train the classifier. The trained model is then used to classify new data into one of two classes: normal or blast. A confusion matrix can be used to analyse the classification results. This matrix is shown in Table I. It contains information about the actual and predicted values of the test set images obtained from a classifier.

TABLE I
CONFUSION MATRIX

Actual class (columns) / Predicted class (rows)    Blast                  Normal
Blast                                              True Positive (TP)     False Positive (FP)
Normal                                             False Negative (FN)    True Negative (TN)

True Positive (TP) is the number of blasts correctly classified as blasts. False Negative (FN) is the number of blasts classified as normal. False Positive (FP) is the number of normal lymphocytes classified as blasts, and True Negative (TN) is the number of normal lymphocytes correctly classified as normal. The results are analysed using four evaluation measures: Sensitivity, Specificity, Misclassification and Accuracy.

• Sensitivity is the probability of correctly identifying a blast cell. It is given by:

$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{4}$$

• Specificity is the probability of correctly classifying a normal cell, as given below:

$$\text{Specificity} = \frac{TN}{TN + FP} \tag{5}$$

• Misclassification is the total number of cells classified incorrectly, as defined below:

$$\text{Misclassification} = FP + FN \tag{6}$$

• Accuracy is defined as:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{7}$$

We have computed the evaluation metrics for both types of features: geometrical and LBP.

IV. EXPERIMENTS AND RESULT ANALYSIS

We have used the ALL-IDB2 database obtained from Università degli Studi di Milano, Italy [8]. We have used 75 blast lymphocyte images and 65 normal lymphocyte images for training. The test set contains 46 blast and 61 normal lymphocyte images. Figure 6 shows images of normal and blast lymphocytes. For each image in the data set, we have computed the 13 geometrical features and the LBP texture features as described in Section III. These features are used for training the SVM classifier. The results obtained from the two types of features are discussed in the following subsections.

Fig. 6. Lymphocytes: (a) Normal Lymphocyte, (b) Blast Lymphocyte

A. Shape features

From each normal and blast image, we extract the 13 geometrical features. These are used as a feature vector for classification. After training the SVM classifier with the training set, we give the test images as input to the classifier. The confusion matrix for the shape features is shown in Table II.

TABLE II
CONFUSION MATRIX FOR SHAPE FEATURES

Actual class (columns) / Predicted class (rows)    Blast (46)    Normal (61)
Blast                                              46            11
Normal                                             0             50

As can be seen, all the blast cells have been correctly classified and there are no False Negatives, so the system did not miss any blast cell in the test set. However, out of the 61 normal cells, 11 have been classified as blasts. Such false alarms may prompt the physician to subject the patient to further tests. The overall accuracy of the system is 89.72%.

B. Local Binary Pattern features

LBP texture features are extracted from each image of normal and blast cells. The confusion matrix for LBP features is given in Table III. Compared to geometric features, the number of False Positives using LBP features is lower (3 against 11). However, the number of False Negatives has increased (9 against 0). Some ALL cases may be missed using this technique, but the possibility of a normal cell being classified as a blast decreases.

TABLE III
CONFUSION MATRIX FOR LOCAL BINARY PATTERN FEATURES

Actual class (columns) / Predicted class (rows)    Blast (46)    Normal (61)
Blast                                              37            3
Normal                                             9             58

TABLE IV
CLASSIFICATION RESULTS

Evaluation metrics    Misclassification (FP+FN)    Sensitivity %    Specificity %    Accuracy %
Shape Features        11                           100              81.96            89.72
LBP                   12                           80.43            95.08            88.79
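The four evaluation measures of Equations (4) to (7) can be recomputed directly from the confusion-matrix counts in Tables II and III. The short sketch below does this. The function name `evaluate` and the dictionary layout are illustrative choices, not part of the paper.

```python
def evaluate(tp, fn, fp, tn):
    """Evaluation measures of Eqs. (4)-(7); percentages except misclassification."""
    return {
        "misclassification": fp + fn,
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "accuracy": 100.0 * (tp + tn) / (tp + tn + fp + fn),
    }


# Counts read from Table II (shape features) and Table III (LBP features).
shape = evaluate(tp=46, fn=0, fp=11, tn=50)
lbp = evaluate(tp=37, fn=9, fp=3, tn=58)

for name, result in (("Shape", shape), ("LBP", lbp)):
    print(name, {key: round(value, 2) for key, value in result.items()})
```

With 107 test images, accuracy is simply (107 − Misclassification) / 107, so the feature set with fewer misclassified cells necessarily has the higher accuracy.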

C. Comparative analysis

The evaluation metrics of the shape features and LBP features are shown in Table IV. The misclassification for the two feature sets is almost the same (11 for shape features and 12 for LBP). The shape features show higher sensitivity than the LBP features (100% against 80.43%), correctly classifying all the blast cells. Specificity is higher for LBP features than for shape features (95.08% against 81.96%), which signifies that more normal cells are correctly classified as normal. Both feature sets provide almost the same classification accuracy. It can be seen that LBP features perform reasonably well compared to shape features.

V. CONCLUSION

In this paper, we explored the automatic detection of Acute Lymphoblastic Leukemia. We used two types of features, geometric features and LBP texture features, for the detection of blast cells in blood images of lymphocytes. Both types of features are computed for each image in the training and test sets. An SVM classifier is used to classify images as blast or normal. The results show that the LBP texture features perform reasonably well compared to shape features. As future work, we would like to explore other texture variants to improve the efficiency of the proposed system.

ACKNOWLEDGMENT

The authors would like to thank R.D. Labati, V. Piuri and F. Scotti, Università degli Studi di Milano, for providing the ALL-IDB database.

REFERENCES

[1] Mohapatra, S.; Samanta, S.S.; Patra, D.; Satpathi, S., "Fuzzy Based Blood Image Segmentation for Automated Leukemia Detection," Devices and Communications (ICDeCom), 2011 International Conference on, pp. 1-5, 2011.
[2] Mohapatra, S.; Patra, D.; Satpathi, S., "Image analysis of blood microscopic images for acute leukemia detection," Industrial Electronics, Control & Robotics (IECR), 2010 International Conference on, pp. 215-219, 2010.
[3] Asadi, M.R.; Vahedi, A.; Amindavar, H., "Leukemia Cell Recognition with Zernike Moments of Holographic Images," Signal Processing Symposium, 2006. NORSIG 2006. Proceedings of the 7th Nordic, pp. 214-217, 2006.
[4] Scotti, F., "Automatic morphological analysis for acute leukemia identification in peripheral blood microscope images," Computational Intelligence for Measurement Systems and Applications, 2005. CIMSA. 2005 IEEE International Conference on, pp. 96-101, 2005.
[5] Mohapatra, S.; Patra, D.; Satpathy, S., "Automated leukemia detection in blood microscopic images using statistical texture analysis," Communication, Computing & Security, 2011. ICCCS. 2011 ACM International Conference on, pp. 184-187, 2011.
[6] Mohapatra, S.; Patra, D.; Satpathy, S., "An ensemble classifier system for early diagnosis of acute lymphoblastic leukemia in blood microscopic images," Neural Computing and Application, 2013. Springer-Verlag, pp. 1-18, June 2013.
[7] Nor Hazlyna, H.; Mashor, M.Y.; Mokhtar, N.R.; Aimi Salihah, A.N.; Hassan, R.; Raof, R.A.A.; Osman, M.K., "Comparison of acute leukemia Image segmentation using HSI and RGB color space," Information Sciences Signal Processing and their Applications (ISSPA), 2010 10th International Conference on, pp. 749-752, 2010.
[8] Labati, R.D.; Piuri, V.; Scotti, F., "All-IDB: The acute lymphoblastic leukemia image database for image processing," Image Processing (ICIP), 2011 18th IEEE International Conference on, pp. 2045-2048, 2011.
[9] Nanni, L.; Lumini, A.; Brahnam, S., "Local binary patterns variants as texture descriptors for medical image analysis," US National Library of Medicine National Institutes of Health, pp. 117-125, 2010.
[10] Madhloom, H.T.; Kareem, S.A.; Ariffin, H., "A Robust Feature Extraction and Selection Method for the Recognition of Lymphocytes versus Acute Lymphoblastic Leukemia," Advanced Computer Science Applications and Technologies (ACSAT), 2012 International Conference on, pp. 330-335, 2012.
[11] Mohamed, M.; Far, B.; Guaily, A., "An efficient technique for white blood cells nuclei automatic segmentation," Systems, Man, and Cybernetics (SMC), 2012 IEEE International Conference on, pp. 220-225, 2012.
[12] Cortes, C.; Vapnik, V., "Support-vector network," Machine Learning, vol. 20, pp. 273-297, 1995.
[13] Rezatofighi, S.H.; Soltanian-Zadeh, H.; Sharifian, R.; Zoroofi, R.A., "A New Approach to White Blood Cell Nucleus Segmentation Based on Gram-Schmidt Orthogonalization," Digital Image Processing, 2009 International Conference on, pp. 107-111, 7-9 March 2009.
[14] Fatichah, C.; Tangel, M.L.; Widyanto, M.R.; Dong, F.; Hirota, K., "Parameter optimization of local fuzzy patterns based on fuzzy contrast measure for white blood cell texture feature extraction," J. Adv. Comput. Intell. Intell. Inform., 16(3), pp. 412-419, 2012.
[15] Halim, N.H.A.; Mashor, M.Y.; Abdul Nasir, A.S.; Mokhtar, N.R.; Rosline, H., "Nucleus segmentation technique for acute Leukemia," Signal Processing and its Applications (CSPA), 2011 IEEE 7th International Colloquium on, pp. 192-197, 4-6 March 2011.
[16] Sheng-Fuu Lin; Yu-Bi Hong, "Differential count of white blood cell in noisy normal blood smear," Industrial Electronics and Applications (ICIEA), 2012 7th IEEE Conference on, pp. 1784-1789, 18-20 July 2012.