Automatic Skin Lesion Boundary Segmentation Using Deep Learning Convolutional Networks with Weighted Cross Entropy

M. A. Al-masni1,⚚, M. A. Al-antari1,⚚, P. Rivera1, E. Valarezo1, G. Gi1, T.-Y. Kim1, H. M. Park1 and T.-S. Kim1,*
Abstract: Automatic segmentation of skin lesions from dermoscopy images is a key procedure for accurately diagnosing different skin diseases. In this study, we adapt recent state-of-the-art deep learning segmentation models, namely U-Net, SegNet, FCN, and FrCN, trained with a weighted cross entropy loss function. All models are evaluated on the ISIC 2018 lesion boundary segmentation challenge dataset. The U-Net, SegNet, FCN, and FrCN methods achieve average threshold Jaccard indices of 54.4%, 69.5%, 74.7%, and 74.6%, respectively, on the online validation dataset. These segmentation methods generate fine boundaries of the segmented lesions.

Keywords: Deep learning, dermoscopy, melanoma, skin image analysis, skin lesion segmentation

1. INTRODUCTION
Skin cancer is one of the most common cancers in the world. According to the American Cancer Society report for the United States in 2018, about 99,550 new cases of skin cancer were estimated to be diagnosed, and the expected deaths from this disease accounted for around 13.52% of the new cases. Moreover, skin cancer has a mortality rate of 2.21% among all cancer types [1]. However, it has been shown that the survival rate from skin cancer can be increased if the disease is detected early and diagnosed correctly [2]. Dermoscopy is the gold-standard imaging technology for skin cancer screening. It is a non-invasive tool that uses polarized light to acquire magnified images of skin lesions [3], allowing deeper details of the skin lesion structure to be visualized. Compared to visual inspection with the naked eye, dermoscopy images help dermatologists improve the diagnostic accuracy of skin cancer. In clinical practice, however, dermatologists face many challenges in correctly detecting and recognizing skin cancer from dermoscopy images due to the high similarity among different types of cancers, and because visual assessment is subjective, time consuming, and error prone [4]. Therefore, an automated computer-aided diagnosis (CAD) system is in high demand to provide a second opinion that helps and supports dermatologists' decisions. Automatic segmentation of melanoma from the surrounding skin tissue is a prerequisite step toward better skin disease classification [5, 6, 7]. However, segmentation is not an easy task because skin lesions exhibit large variations in size, shape, color, and location in the images. Figure 1 shows some challenging examples that pose extra difficulties for the segmentation task. Recently, deep learning methods have attracted considerable attention in different medical applications, including object detection, segmentation, and recognition [8, 9, 10, 11, 12].

⚚ Authors contributed equally to this work. * Corresponding author. 1 Department of Biomedical Engineering, College of Electronics and Information, Kyung Hee University, Republic of Korea. Email addresses:
[email protected] (M. A. Al-masni),
[email protected] (M. A. Al-antari),
[email protected] (P. Rivera),
[email protected] (E. Valarezo),
[email protected] (G. Gi),
[email protected] (T.-Y. Kim),
[email protected] (H. M. Park), and
[email protected] (T.-S. Kim).
Fig. 1. Challenging cases for the skin lesion segmentation task: (a) low contrast, (b) irregular boundaries, (c) hair artifacts, and (d) color illumination. Superimposed red lines indicate the ground truth contours of the skin lesions annotated by expert dermatologists.

In 2017, Yuan et al. adopted a deep convolutional-deconvolutional neural network (CDNN) with a Jaccard distance loss function for skin lesion segmentation [13]. The CDNN was trained on augmented data with different color representations using the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 dataset, and the method was ranked first in the ISBI 2017 challenge with a Jaccard index of 76.5%. Also in 2017, Yu et al. proposed very deep residual networks for both segmentation and classification tasks using the ISBI 2016 dataset [14]; their network was ranked second in the segmentation challenge with a Jaccard index of 82.9%.

2. MATERIALS AND METHODS

2.1 Dataset of Lesion Boundary Segmentation Task
Our data were extracted from the "ISIC 2018: Skin Lesion Analysis Towards Melanoma Detection" grand challenge datasets [15, 16]. The dermoscopy images in this challenge are 8-bit RGB images with different sizes and various cancer types. The training dataset consists of 2,594 images with their corresponding ground truth masks, which were annotated by expert dermatologists. The validation and testing datasets contain 100 and 1,000 dermoscopy images, respectively; however, both are provided without ground truth masks, and performance on the validation dataset can only be checked through online submission. For this reason, we re-split the given training dataset, which contains ground truth masks, into 80% for training and 20% for validation in order to optimize the parameters of all segmentation models.

2.2 Data Augmentation and Transfer Learning
Proper training of deep learning methods requires a large dataset. Therefore, in this study we applied data augmentation only to the training data (i.e., 80% of the 2,594 images) as follows. First, we included the three Hue-Saturation-Value (HSV) channels alongside the original RGB images in order to learn features from different color spaces. Second, we augmented the RGB and HSV images four times using rotations of 0°, 90°, 180°, and 270°. In addition, horizontal and vertical flipping were applied. Thus, a total of 33,200 skin lesion images with their corresponding ground truth masks were utilized to train the segmentation methods. Data augmentation also helps to reduce the overfitting that may occur during training. Furthermore, we adopted transfer learning to enhance training efficiency: we first utilized the pre-trained weights of the VGG-16 network learned on the large-scale ImageNet dataset [17] and then fine-tuned the proposed networks using the augmented training data. A separate validation set containing 519 dermoscopy images (i.e., 20% of the original training dataset) was used to optimize the proposed segmentation models. All the dermoscopy images were resized to 192×256 pixels using bi-linear interpolation [13].
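To make the augmentation scheme concrete, the following is a minimal sketch of the preprocessing and augmentation steps described above (resizing to 192×256, adding an HSV version of each image, and combining rotations with flips), written with OpenCV and NumPy. The function name, the interpolation choices, and the exact ordering of the transforms are our own reading of the description, chosen so that each training image yields the 16 augmented samples implied by the reported total of 33,200; they are not taken from the original implementation.

```python
import cv2
import numpy as np

def augment_sample(rgb_image, mask):
    """Sketch of the Section 2.2 augmentation for one training image and its mask."""
    rgb = cv2.resize(rgb_image, (256, 192), interpolation=cv2.INTER_LINEAR)  # 192x256 pixels
    msk = cv2.resize(mask, (256, 192), interpolation=cv2.INTER_NEAREST)      # keep mask binary
    hsv = cv2.cvtColor(rgb, cv2.COLOR_RGB2HSV)                               # second color space
    samples = []
    for img in (rgb, hsv):
        for k in range(4):                                                    # 0, 90, 180, 270 degrees
            samples.append((np.rot90(img, k), np.rot90(msk, k)))
            # mirrored orientations: flips combined with the rotations
            samples.append((np.rot90(np.fliplr(img), k), np.rot90(np.fliplr(msk), k)))
    return samples  # 2 color spaces x 4 rotations x 2 reflections = 16 (image, mask) pairs
```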
2.3 Segmentation Models
In this study, we present the full resolution convolutional network (FrCN) method [11] to segment the skin lesions of the ISIC 2018 challenge. FrCN is a resolution-preserving model: all subsampling layers are removed from the network, which allows it to learn high-level features at full spatial resolution and improves the segmentation performance. We use a sigmoid activation function in the last layer of all the segmentation networks to classify each pixel of the dermoscopy image into two classes (i.e., lesion and non-lesion). More details of the U-Net, SegNet, FCN, and FrCN methods are given in [18], [19], [20], and [11], respectively.

All the segmentation models were trained using the Adadelta optimization method with a batch size of 20. The learning rate was initially set to 0.2 and then reduced automatically over the 200 training epochs. We utilized weighted cross entropy as the loss function, in which the overall loss H is minimized throughout the training stage as follows:

H = −[ω ⋅ y ⋅ log(ŷ) + (1 − ω) ⋅ (1 − y) ⋅ log(1 − ŷ)],    (1)

where y and ŷ are the ground truth annotation and the predicted segmentation map, respectively, and ω is the weight assigned to the lesion pixels in the dermoscopy image; we chose ω = 0.9. The implementation was carried out in Python on Ubuntu 16.04 using the Keras and Theano libraries.
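For concreteness, a minimal sketch of Eq. (1) as a Keras loss function is shown below. The function name and the clipping constant are our additions and are not part of the original implementation; the formula itself follows Eq. (1) with the loss averaged over pixels and batch.

```python
import keras.backend as K

def weighted_cross_entropy(omega=0.9, eps=1e-7):
    """Pixel-wise weighted cross entropy, following Eq. (1).

    omega is the weight given to lesion (foreground) pixels."""
    def loss(y_true, y_pred):
        y_pred = K.clip(y_pred, eps, 1.0 - eps)                 # guard against log(0)
        h = -(omega * y_true * K.log(y_pred)
              + (1.0 - omega) * (1.0 - y_true) * K.log(1.0 - y_pred))
        return K.mean(h)                                        # average over pixels and batch
    return loss

# Example usage with a compiled Keras model (hypothetical `model` object):
# model.compile(optimizer='adadelta', loss=weighted_cross_entropy(omega=0.9))
```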
2.4 Evaluation Metrics
The predicted lesion masks in this challenge are scored using a threshold Jaccard index. First, the Jaccard index of each image is calculated through a pixel-wise comparison of the predicted segmentation with the corresponding ground truth mask. Then, the final score of each image is obtained by thresholding the Jaccard index as follows:

Score = { 0,         if Jaccard < 0.65,
        { Jaccard,   if Jaccard ≥ 0.65.    (2)

The average of the per-image scores is used as the final metric to rank the participants.
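A minimal NumPy sketch of this per-image score, assuming binary masks and the 0.65 threshold of Eq. (2), is given below; the function name and the handling of empty masks are our own choices for illustration.

```python
import numpy as np

def threshold_jaccard(y_true, y_pred, thr=0.65):
    """Per-image threshold Jaccard score, following Eq. (2).

    y_true and y_pred are binary lesion masks of the same shape."""
    y_true = y_true.astype(bool)
    y_pred = y_pred.astype(bool)
    union = np.logical_or(y_true, y_pred).sum()
    if union == 0:
        return 1.0                                   # both masks empty: treat as perfect overlap
    jaccard = np.logical_and(y_true, y_pred).sum() / union
    return jaccard if jaccard >= thr else 0.0

# The final challenge metric is the mean of the per-image scores, e.g.:
# final_score = np.mean([threshold_jaccard(gt, pred) for gt, pred in zip(gt_masks, pred_masks)])
```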
3. RESULTS
This section presents the performance of the U-Net, SegNet, FCN, and FrCN approaches on the validation data (i.e., 100 images) of the ISIC 2018 challenge. Table 1 reports the segmentation performance, in terms of the average threshold Jaccard index over all validation images, as obtained from the online submission. Some examples of the predicted segmentation contours produced by the different methods are shown in Figure 2, together with their Jaccard index values.

Table 1: Segmentation performance of different methods on the validation dataset
Method    Trainable Parameters    Threshold Jaccard Index (%)
U-Net       12.3 M                  54.4
SegNet      11.5 M                  69.5
FCN        134.3 M                  74.7
FrCN        16.3 M                  74.6

Fig. 2. Examples of predicted lesion contours with their Jaccard index values.
4. CONCLUSION
This work has presented different deep learning segmentation methods evaluated on the validation dataset of the ISIC 2018 challenge (Task 1: Lesion Boundary Segmentation). In future work, once the ground truth masks of the testing dataset are released, we will report the segmentation performance of these deep learning methods on the 1,000 testing images.

REFERENCES
[1] R. L. Siegel, K. D. Miller and A. Jemal, "Cancer Statistics, 2018," CA: A Cancer Journal for Clinicians, vol. 68, pp. 7-30, 2018.
[2] C. M. Balch, J. E. Gershenwald, S.-J. Soong, J. F. Thompson, M. B. Atkins, D. R. Byrd, A. C. Buzaid et al., "Final version of 2009 AJCC melanoma staging and classification," Journal of Clinical Oncology, vol. 27, no. 36, pp. 6199-6206, 2009.
[3] M. Binder, M. Schwarz, A. Winkler, A. Steiner, A. Kaider, K. Wolff and H. Pehamberger, "Epiluminescence microscopy: a useful tool for the diagnosis of pigmented skin lesions for formally trained dermatologists," Archives of Dermatology, vol. 131, no. 3, pp. 286-291, 1995.
[4] M. E. Vestergaard, P. Macaskill, P. E. Holt and S. W. Menzies, "Dermoscopy compared with naked eye examination for the diagnosis of primary melanoma: a meta-analysis of studies performed in a clinical setting," British Journal of Dermatology, vol. 159, no. 3, pp. 669-676, 2008.
[5] M. E. Celebi, H. Iyatomi, G. Schaefer and W. V. Stoecker, "Lesion border detection in dermoscopy images," Computerized Medical Imaging and Graphics, vol. 33, no. 2, pp. 148-153, 2009.
[6] M. E. Celebi, Q. Wen, H. Iyatomi, K. Shimizu, H. Zhou and G. Schaefer, "A state-of-the-art survey on lesion border detection in dermoscopy images," Dermoscopy Image Analysis, pp. 97-129, 2015.
[7] H. Ganster, P. Pinz, R. Rohrer, E. Wildling, M. Binder and H. Kittler, "Automated melanoma recognition," IEEE Transactions on Medical Imaging, vol. 20, no. 3, pp. 233-239, 2001.
[8] M. A. Al-masni, M. A. Al-antari, J.-M. Park, G. Gi, T.-Y. Kim, P. Rivera, E. Valarezo, M.-T. Choi, S.-M. Han and T.-S. Kim, "Simultaneous detection and classification of breast masses in digital mammograms via a deep learning YOLO-based CAD system," Computer Methods and Programs in Biomedicine, vol. 157, pp. 85-94, 2018.
[9] G. Carneiro, J. Nascimento and A. P. Bradley, "Automated analysis of unregistered multi-view mammograms with deep learning," IEEE Transactions on Medical Imaging, vol. 36, no. 11, pp. 2355-2365, 2017.
[10] M. A. Al-antari, M. A. Al-masni, S. U. Park, J. H. Park, M. K. Metwally, Y. M. Kadah, S. M. Han and T.-S. Kim, "An automatic computer-aided diagnosis system for breast cancer in digital mammograms via deep belief network," Journal of Medical and Biological Engineering, pp. 1-14, 2017.
[11] M. A. Al-masni, M. A. Al-antari, M.-T. Choi, S.-M. Han and T.-S. Kim, "Skin lesion segmentation in dermoscopy images via deep full resolution convolutional networks," Computer Methods and Programs in Biomedicine, vol. 162, pp. 221-231, 2018.
[12] M. A. Al-antari, M. A. Al-masni, M.-T. Choi, S.-M. Han and T.-S. Kim, "A fully integrated computer-aided diagnosis system for digital X-ray mammograms via deep learning detection, segmentation, and classification," International Journal of Medical Informatics, vol. 117, pp. 44-54, 2018.
[13] Y. Yuan, M. Chao and Y.-C. Lo, "Automatic skin lesion segmentation with fully convolutional-deconvolutional networks," arXiv preprint arXiv:1703.05165, 2017.
[14] L. Yu, H. Chen, Q. Dou, J. Qin and P.-A. Heng, "Automated melanoma recognition in dermoscopy images via very deep residual networks," IEEE Transactions on Medical Imaging, vol. 36, no. 4, pp. 994-1004, 2017.
[15] N. C. F. Codella, D. Gutman, M. E. Celebi, B. Helba, M. A. Marchetti, S. W. Dusza, A. Kalloo, K. Liopyris, N. Mishra, H. Kittler and A. Halpern, "Skin lesion analysis toward melanoma detection: A challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin Imaging Collaboration (ISIC)," arXiv preprint arXiv:1710.05006, 2017.
[16] P. Tschandl, C. Rosendahl and H. Kittler, "The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions," Scientific Data, vol. 5, p. 180161, 2018. doi: 10.1038/sdata.2018.161.
[17] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg and L. Fei-Fei, "ImageNet Large Scale Visual Recognition Challenge," International Journal of Computer Vision, vol. 115, no. 3, pp. 211-252, 2015.
[18] O. Ronneberger, P. Fischer and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention, 2015, pp. 234-241.
[19] V. Badrinarayanan, A. Kendall and R. Cipolla, "SegNet: A deep convolutional encoder-decoder architecture for image segmentation," arXiv preprint arXiv:1505.07293, 2015.
[20] E. Shelhamer, J. Long and T. Darrell, "Fully convolutional networks for semantic segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 4, pp. 640-651, 2017.