Automatic Particle Detection and Counting By One-Class SVM From Microscope Image

Hinata KUBA, Kazuhiro HOTTA, and Haruhisa TAKAHASHI
The University of Electro-Communications, 1-5-1 Chofugaoka, Chofu, Tokyo 182-8585, Japan
{kuba, hotta, Takahasi}@ice.uec.ac.jp

Abstract. Asbestos-related illnesses have become a nationwide problem in Japan. At present, human inspectors check whether building materials contain asbestos. To judge whether a specimen contains asbestos, 3,000 particles must be counted from microscope images, which is a major labor-intensive bottleneck. In this paper, we propose an automatic particle counting method as part of a system that automatically judges whether a specimen is hazardous. The size, shape and color of particles are not constant, so it is difficult to model the particle class. On the other hand, the non-particle class does not vary much, and the area of non-particles is larger than that of particles. We therefore use the One-Class Support Vector Machine (OCSVM), which identifies outliers among input samples: we model the non-particle class and detect particles as outliers. In experiments, the proposed method gives higher accuracy and fewer false positives than a preliminary method of our project.

1 Introduction

Asbestos-related illnesses have become a nationwide problem in Japan. Asbestos was widely used in building materials in Japan after the high-growth period of the 1970s. However, its use has been banned or limited worldwide since the late 1980s, because it was discovered to cause cancer. Therefore, when a building is demolished or renovated, we need to check whether asbestos was used in it. Several automatic methods exist for detecting airborne asbestos [1–3]. In this paper, we address asbestos detection in building materials rather than in the atmosphere. Asbestos detection in building materials is more difficult than in the atmosphere, because a specimen taken from building materials includes various kinds of particles. In the near future, asbestos detection in building materials will become increasingly important.

The analysis carried out by human inspectors is called the disperse dyeing method. In this method, three specimens are prepared from one sample, and 1,000 particles are counted from each specimen. When more than four asbestos particles are found among the 3,000 particles, the sample is judged to be dangerous. The detection of individual particles in microscope images is a major labor-intensive bottleneck for human inspectors. Thus, we propose an automatic particle detection and counting method to realize the disperse dyeing method by computer.

Particle detection is a binary classification problem between the particle class and the non-particle class. However, the size, shape and color of particles are not constant, so it is difficult to model the particle class. On the other hand, the non-particle class does not vary much. In addition, the area of non-particles is larger than that of particles. Thus, we model the non-particle class and detect the particles as outliers. For this purpose, we use the One-Class Support Vector Machine (OCSVM) [4–6] for outlier detection.

In the experiments, we use a background image for preprocessing. The background image is captured without the specimen. We compute the difference between an input image and the background image, expecting that regions with a large difference contain particles. The proposed method is then applied to the difference image, and particles are detected and counted.

The proposed method has two steps. The first step is the detection of particle regions in the grayscale difference image by OCSVM. The grayscale image is divided into small non-overlapping regions, and we compute a gray-level histogram for each region. The histograms are used as input features for OCSVM, which determines whether each small region contains particles. The second step is counting the particles. The non-particle regions are expected to contain only background. In contrast, the regions classified into the particle class may contain both particles and background, because OCSVM assigns a label (particle or non-particle) to whole square regions. Thus, we must eliminate the background from the regions with the particle label. We binarize the difference image using the higher and lower intensity values of the regions with the non-particle label as thresholds. This binarization eliminates the background from the regions with the particle label. Then the number of particles is counted.

Twenty microscope images provided by RIKEN, which include asbestos and other particles, are used. We compared the proposed method with a preliminary particle detection method of our project, which is based on binarizing the difference image between an input image and the background image. The proposed method gives higher accuracy and fewer false positives than the preliminary method.

The remainder of this paper is organized as follows. Section 2 covers particle detection and counting by OCSVM. Section 3 gives the experimental results, followed by conclusions in the last section.
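As a small illustration of the counting rule of the disperse dyeing method stated in this section (three specimens, 1,000 particles each, dangerous when more than four asbestos particles are found among the 3,000), the following is a minimal sketch. The function name `is_hazardous` and its parameters are hypothetical and are not part of the original system.

```python
def is_hazardous(asbestos_counts, particles_per_specimen=1000, max_asbestos=4):
    """Apply the counting rule of the disperse dyeing method to one sample.

    asbestos_counts: number of asbestos particles found among the counted
    particles of each specimen (three specimens per sample).
    """
    if len(asbestos_counts) != 3:
        raise ValueError("three specimens are prepared from one sample")
    if any(c < 0 or c > particles_per_specimen for c in asbestos_counts):
        raise ValueError("each count must lie between 0 and particles_per_specimen")
    # "More than four" asbestos particles among the 3,000 counted is judged dangerous.
    return sum(asbestos_counts) > max_asbestos
```

For example, `is_hazardous([2, 2, 1])` is true because five asbestos particles exceed the limit of four, while `is_hazardous([1, 2, 1])` is false.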

2 Particle detection and counting by One-Class SVM

In this paper, we propose an automatic particle detection and counting method for microscope images based on OCSVM. Figure 1 shows examples of particles in microscope images of building materials. The size, shape and color of the particles are clearly not constant, so it is difficult to model the particle class. On the other hand, the non-particle class does not vary much. In addition, the area of non-particles is larger than that of particles, as shown in Figure 4. Thus, we detect particles as outliers by OCSVM. OCSVM is an unsupervised learning method for outlier detection. Whereas an ordinary SVM classifier must be trained in advance, OCSVM needs no such training, which makes particle detection by OCSVM appropriate for a practical system. The only requirement is to capture a background image in the current environment. This is an advantage of our method. Section 2.1 explains OCSVM, and Section 2.2 describes the proposed particle detection and counting method.

Fig. 1. Example of particles in building materials

2.1 One-Class SVM

OCSVM is a method for detecting outliers among input samples. It maps each input sample into a high-dimensional feature space F via a non-linear mapping ϕ and finds the maximum-margin hyperplane that separates the input samples from the origin. Figure 2 illustrates outlier detection by OCSVM. In this example, the three samples that lie near the origin are the outliers.

Fig. 2. Outlier detection by OCSVM.

Let $x_1, \ldots, x_n \in X$ be the input vectors. To separate the data from the origin, we solve the following optimization problem:

$$\text{minimize:}\quad \frac{1}{2}\|w\|^2 + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - \rho,$$

$$\text{subject to:}\quad \langle w, \phi(x_i)\rangle \ge \rho - \xi_i,\quad \xi_i \ge 0,\quad i \in [n],$$

where $\phi$ is the non-linear transform, and $w$ and $\rho$ denote a weight vector and a threshold, respectively. $\xi_i$ are slack variables that penalize the objective function while allowing some of the feature vectors to lie between the origin and the desired hyperplane. $\nu \in (0, 1)$ is the trade-off parameter that balances the margin against the penalty; the limit $\nu \to 0$ corresponds to OCSVM with a hard margin. The decision function of OCSVM for a sample $z$ is defined as

$$f(z) = \operatorname{sgn}(\langle w, \phi(z)\rangle - \rho).$$

The dual of this optimization problem is

$$\text{minimize:}\quad \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i \alpha_j K(x_i, x_j),$$

$$\text{subject to:}\quad 0 \le \alpha_i \le \frac{1}{\nu n},\quad \sum_{i=1}^{n}\alpha_i = 1,$$

where $\alpha_i$ are Lagrange multipliers and $K$ is the kernel function. In OCSVM, the Gaussian kernel is frequently used to map outliers near the origin. We also use the Gaussian kernel, defined as

$$K(x, y) = \exp\left(-\frac{\|x - y\|^2}{\sigma^2}\right),$$

where $\sigma$ denotes the standard deviation. In this paper, we used LIBSVM [7]. The parameters were set to ν = 0.9, γ = 1 and c = 1 on the basis of a preliminary experiment (in LIBSVM's notation, γ plays the role of 1/σ²).

2.2 Particle detection and counting method

As preprocessing, we compute the difference between an input image and a background image. The background image does not include any particles; it was captured without a specimen. Although both the input and background images are color images, the difference image is a grayscale image obtained by computing the squared distance between the RGB values of each pixel. The size of the difference image is equal to that of the input image.

The proposed method has two steps. First, the particle regions are roughly detected by OCSVM. Second, the particles are counted. Figure 3 shows how OCSVM is applied. In the first step, the difference image is divided into small non-overlapping regions of 10 × 10 pixels. Therefore, when the size of an input image is 100 × 100 pixels, 100 local regions are obtained. We use a histogram as the feature of each region because the position of particles within a small region is not constant. The histograms of all regions are fed into OCSVM, which selects the outliers (particles) from the 100 regions.

Figures 4-9 show the flow of our approach. Figure 4 is an input image and Figure 5 is a background image. Only the single background image shown in Figure 5 is used in the experiments. Figure 6 shows the difference image.
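The preprocessing and feature extraction described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: images are assumed to be nested lists of (r, g, b) tuples with 8-bit channels, the histogram is assumed to use equal-width bins over the full range of squared distances, and the histograms are normalized. The function names and the reading of "30" as the number of bins are likewise assumptions.

```python
def difference_image(image, background):
    """Per-pixel squared RGB distance between two equally sized images.

    Images are lists of rows; each pixel is an (r, g, b) tuple.
    """
    return [
        [sum((a - b) ** 2 for a, b in zip(p, q)) for p, q in zip(row_i, row_b)]
        for row_i, row_b in zip(image, background)
    ]


def block_histograms(gray, block=10, bins=30, max_value=3 * 255 ** 2):
    """Split a gray image into non-overlapping block x block regions and
    return one normalized histogram per region, in row-major order.

    Assumption: equal-width bins spanning 0..max_value (8-bit RGB input).
    """
    height, width = len(gray), len(gray[0])
    features = []
    for top in range(0, height - block + 1, block):
        for left in range(0, width - block + 1, block):
            hist = [0] * bins
            for y in range(top, top + block):
                for x in range(left, left + block):
                    # Map the value to a bin index; clamp into the last bin.
                    idx = min(gray[y][x] * bins // (max_value + 1), bins - 1)
                    hist[idx] += 1
            features.append([h / (block * block) for h in hist])
    return features
```

With these defaults, a 100 × 100 difference image yields 100 feature vectors of length 30, matching the region count given in the text.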

Fig. 3. How to apply OCSVM. When the size of an input image is 100×100 pixels and the size of a small region is set to 10×10 pixels (I = 10, J = 10), 100 local regions are obtained (L = I × J = 100).
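Once the dual problem of Section 2.1 has been solved, labeling a region's histogram reduces to evaluating the decision function in its kernel form, using the standard expansion of w as a weighted sum of the mapped training samples. The sketch below assumes the multipliers `alphas`, the samples with non-zero multipliers `support_vectors`, and the threshold `rho` have already been obtained from a solver such as LIBSVM; these names and the function `ocsvm_label` are hypothetical.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / sigma^2), as defined in Section 2.1."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / sigma ** 2)


def ocsvm_label(z, support_vectors, alphas, rho, sigma=1.0):
    """Return +1 (inlier, non-particle) or -1 (outlier, particle) for feature z.

    Evaluates sgn(sum_i alpha_i * K(x_i, z) - rho); a score of exactly
    zero is treated as an inlier here, which is a convention choice.
    """
    score = sum(a * gaussian_kernel(sv, z, sigma)
                for sv, a in zip(support_vectors, alphas))
    return 1 if score - rho >= 0 else -1
```

In the setting of this paper, regions that receive the label -1 would be the ones treated as particle regions.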

Fig. 4. Original image

Fig. 5. Background image

Fig. 6. Difference image

Fig. 7. Result by OCSVM

Fig. 8. Result by OCSVM on original image

Fig. 9. Final result

Each pixel of the grayscale difference image is the squared distance between the RGB colors of the input and background images, so particles have large difference values. We compute the histograms of the small regions of Figure 6 and feed them into OCSVM, which yields the result shown in Figure 7. White regions have the particle label and black regions have the non-particle label. Because square local regions are used as input samples for OCSVM, the result is blocky. In Figure 8, the regions with the particle label are shown in pink, overlaid on Figure 4. Since OCSVM assigns labels to square regions, the regions with the particle label in Figure 8 contain both particles and background. This background must be eliminated in order to count particles correctly. To eliminate it, we use the information from the regions classified as background. We expect the regions with the non-particle label to contain only background. Thus, we use the higher and lower intensity values of the regions with the non-particle label as thresholds. The result is shown in Figure 9, which is more precise than that of Figure 7. We count the particles by labeling the result in Figure 9.
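The second step can be sketched as follows. Two points are assumptions rather than details taken from the paper: the thresholds are interpreted as the minimum and maximum difference values observed in the non-particle regions, and the labeling uses 4-connected components. The function name `count_particles` is hypothetical, and the image dimensions are assumed to be multiples of the block size.

```python
from collections import deque


def count_particles(gray, region_labels, block=10):
    """Count connected foreground components inside particle-labeled regions.

    gray: 2-D list of difference values.
    region_labels: 2-D list with one entry per block; True means the block
    was labeled as particle, False as non-particle.
    """
    height, width = len(gray), len(gray[0])
    # Intensity range observed in the blocks labeled non-particle (background).
    background = [gray[y][x] for y in range(height) for x in range(width)
                  if not region_labels[y // block][x // block]]
    if not background:
        raise ValueError("at least one non-particle region is required")
    low, high = min(background), max(background)

    # Foreground: pixels in particle blocks whose value falls outside that range.
    mask = [[region_labels[y // block][x // block]
             and not (low <= gray[y][x] <= high)
             for x in range(width)] for y in range(height)]

    # Label 4-connected components with a breadth-first flood fill.
    seen = [[False] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if mask[y][x] and not seen[y][x]:
                count += 1
                seen[y][x] = True
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < height and 0 <= nx < width
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count
```

Using 8-connectivity instead would merge diagonally touching pixels into one component, which changes the count for thin or fragmented particles.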

3 Experiments

Twenty microscope images provided by RIKEN [11], which include asbestos and other particles, are used. The 20 images contain 1,051 particles in total. The microscope images are color images of 640×480 pixels. As preprocessing, the grayscale difference image is computed. To obtain the input features for OCSVM, the difference image is divided into small non-overlapping regions of 10×10 pixels, and a histogram is computed for each region. The histogram bin size is set to 30.

To show the effectiveness of the proposed method, we compared it with a preliminary method of our project. The preliminary method has three steps. First, the difference image between the input and background images is computed. Second, it is binarized by Otsu's binarization method [8]. Third, particles are counted by labeling the binarized image.

We evaluate the methods using the true positive rate and the number of false positives. A true positive is a particle that is classified correctly. A false positive is a non-particle that is misclassified into the particle class. Table 1 shows the results of our method and the preliminary method. The true positive rate of our method is 88%, whereas that of the comparison method is only 58%. In addition, our method produces fewer false positives than the comparison method (74 versus 126). These results show the effectiveness of particle detection by OCSVM.

Figure 10 shows examples of particle detection by our method. Figure 10 (a) and (d) show the original input images. Figure 10 (b) and (e) show the results by OCSVM for (a) and (d). The background is already eliminated by the binarization. The regions with the particle label are shown in pink. Figure 10 (c) and (f) show the final results of our method. Each green rectangle marks one particle. The proposed method can correctly detect particles of various sizes.
This is a significant step toward realizing the automatic disperse dyeing method by computer, because there has been little research on computer-based asbestos detection in building materials.

Since the proposed method is based on the difference image, it fails to detect particles whose values are close to those of the background. Almost all false negatives are errors of this kind. The false positives fall into several types, and typical examples are shown in Figure 11. Figure 11 (a) is an air bubble, which is not a particle. In our method, regions with a large difference from the background are detected as particles, so air bubbles are also detected. We strictly count them as false positives. Figure 11 (b) is an example of several overlapping particles, which our method detects as one particle. A stereoscopic system may be needed to handle this case. Figure 11 (c) shows an out-of-focus example. Because the microscope has a shallow depth of field, parts of the captured image are in focus and others are out of focus, depending on location. Out-of-focus particles are blurred and have values close to those of the background, so the method fails to detect them. Figure 11 (d) is an example of a fiber with broken pieces. A human inspector counts this as one particle, but the proposed method counts it as several particles. We strictly count these as false positives. In this paper, a simple labeling algorithm is used; this could be improved by using a conditional random field [9].
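For reference, the second step of the preliminary method, Otsu's threshold selection [8], picks the gray level that maximizes the between-class variance of the histogram. The following is a generic textbook sketch of that technique, not the code used in the experiments; the function name `otsu_threshold` and the histogram-as-list input are assumptions.

```python
def otsu_threshold(hist):
    """Return the threshold t maximizing the between-class variance.

    hist[i] is the number of pixels with gray level i; pixels with level
    <= t form one class and pixels with level > t form the other.
    """
    total = sum(hist)
    weighted_total = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # pixel count of the lower class
    sum0 = 0    # sum of gray levels in the lower class
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the variance is undefined
        mean0 = sum0 / w0
        mean1 = (weighted_total - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t
```

Because a single global threshold is chosen for the whole difference image, this step has no notion of local regions, in contrast to the region-wise labeling of the proposed method.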

Table 1. Evaluation results

                     Number of particles   True positive rate   Number of false positives
Proposed method      1,051                 88%                  74
Preliminary method   1,051                 58%                  126

Fig. 10. Examples of particle detection by the proposed method. (a) and (d) Original images. (b) and (e) Results by OCSVM: the regions with the particle label, shown in pink, are overlaid on the original images. (c) and (f) Final results.

Fig. 11. Examples of false positives. (a) An air bubble. (b) An overlap of particles. (c) A decoupling caused by being out of focus. (d) A fiber with broken pieces.

4 Conclusion

In this paper, we proposed an automatic particle counting method using OCSVM, aimed at realizing the disperse dyeing method for building materials by computer. We detect particles in the microscope image as outliers by OCSVM. Experimental results show the effectiveness of the proposed particle detection method. Although we use histograms as features, OCSVM is used with the general-purpose Gaussian kernel. A kernel specialized for histograms may improve the accuracy further. The pyramid match kernel [10] may be used for this purpose. This is a subject for future work.

Acknowledgements

This research is supported by the research program (No.K1920 and No.K2061) of Ministry of the Environment of Japan.

References

1. P.A.Baron and S.A.Shulman: Evaluation of the Magiscan Image Analyzer for Asbestos Fiber Counting: American Industrial Hygiene Association Journal, Vol.48, pp.39-46, 1987.
2. L.C.Kenny: Asbestos Fiber Counting by Image Analysis - The Performance of The Manchester Asbestos Program on Magiscan: Annals of Occupational Hygiene, Vol.28, No.4, pp.401-415, 1984.
3. Y.Inoue, A.Kaga, and K.Yamaguchi: Development of An Automatic System for Counting Asbestos Fibers Using Image Processing: Particle Science and Technology, Vol.16, pp.263-279, 1989.
4. B.Schölkopf, J.C.Platt, J.Shawe-Taylor, A.J.Smola and R.C.Williamson: Estimating the Support of a High-Dimensional Distribution: Neural Computation, Vol.13, No.7, pp.1443-1471, 2001.
5. V.N.Vapnik: Statistical Learning Theory: John Wiley & Sons, 1998.
6. J.Shawe-Taylor and N.Cristianini: Kernel Methods for Pattern Analysis: Cambridge University Press, 2004.
7. C.C.Chang and C.J.Lin: LIBSVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm, 2001.
8. N.Otsu: A threshold selection method from gray-level histograms: IEEE Trans. on System, Man, and Cybernetics, Vol.SMC-9, No.1, pp.62-66, 1979.
9. J.Lafferty, A.McCallum and F.Pereira: Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data: In Proceedings of the Eighteenth International Conference on Machine Learning (ICML), 2001.
10. K.Grauman and T.Darrell: The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features: In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Beijing, China, October 2005.
11. http://www.riken.jp/engn/index.html
