
An Image-Processing Based Automated Bacteria Colony Counter

Hüseyin Ateş

Ömer Nezih Gerek

e-mail: [email protected]

e-mail: [email protected]

Department of Electrical and Electronics Engineering, Anadolu University, 26555 Eskisehir, Turkey.

Abstract—This paper presents an image-processing-based automated counting system that determines the number of bacteria colonies that develop in Petri dishes in microbiology laboratories. The visible colonies represent the initial number of bacteria present in the aqueous environment. The counting system contains shape-based segmentation and classification algorithms. Colonies are modeled as discs (possibly overlapping, with some amorphous deviation), and each detected shape is classified as a single colony or as a cluster of colonies according to its compactness ratio. The system is implemented in Matlab and tested using ground truth data provided by the microbiology laboratory of the Department of Environmental Engineering, Anadolu University. Results are presented.

Figure 1: Colony types according to their form.

Naturally, the most frequently encountered type (in experiments with E. coli) is the circular colony. Consequently, circular colonies are examined in this study. This selection reduces the analysis to a worst-case circle detection problem. However, the problem is formulated without assuming any prior information about the Petri dish image, such as image degradation and noise, colony size, or color. Some automated counters detect colonies very sensitively by using a special growth medium that contains fluorogenic substrates [9]. This method is very useful for detecting colonies, but fluorogenic substrates are costly. In recent studies on image-processing-based colony counters (in fact, circular object counters), transform-based methods such as the Hough transform are mostly used to extract circular shapes and curves [2-4]. In this paper, we introduce a robust, efficient, automated, software-centered, and low-cost counting system. Unlike the previous methods, Hough-transform-based methods were not adopted, because colony size is indistinct and variable, which makes an initial estimate of the colony radius hard to determine. The size of a colony is directly proportional to the amount of food in the medium and depends on the distribution of the initial bacteria. Typically, colony size can vary over a wide range (see Figure 6-b). As a result of deciding against shape-based transforms, a new “two-pass iterative” method is developed. Initially, a simple binarization is applied using histogram-based threshold determination (here, this step is denoted “Pattern Detector”). Following the

1. Introduction

A colony counter is a tool used for counting colonies of bacteria or other microorganisms growing on an agar plate (also called a Petri dish). Classically, the non-automated counting process consists of marking each colony with a felt-tipped pen on the outer surface of the plate and keeping a tally of the marks [1]. This manual process is long and laborious, and it depends on the visual ability and care of the person who counts the colonies. In addition, especially for very high numbers of colonies, manual counting results can differ among biologists. This is because counting methods for highly populated plates mostly rely on estimation from a small portion of the plate. A bacteria colony is a group of bacteria cells that develop from a single bacteria cell. Different types of bacteria produce differently shaped colonies with specific color, structure, margin, and elevation. In general, colony shapes are classified into four fundamental groups according to their form: circular, irregular, filamentous, and rhizoid.

978-1-4244-5023-7/09/$25.00 ©2009 IEEE


September 14-16, 2009 METU Northern Cyprus Campus

Figure 2: Colony counter flow graph.

Engineering Department of Anadolu University. Results are compared to the tags generated by an expert microbiologist and to the results of the freely available software Clono-Counter [8].

binarization process, isolated binary objects are analyzed according to their “size” and “compactness ratio” to determine how close the shapes are to ideal circular colonies (this operation is denoted “Pattern Classifier” in this work). If a binary shape fails the compactness test, the watershed algorithm is applied to split the corresponding original (gray) shape into possible colonies (this operation is denoted “Cluster Splitter”). The result of the watershed splitting may still not correspond to circular colonies, so the compactness test is applied again. If the test fails in this second pass, the watershed parameters are refined and splitting is applied once more. The proposed algorithm stops applying watershed splitting at this point. If the compactness ratio test still fails, an alternative analysis tool is invoked in which the final (non-circular) shapes are considered to consist of combinations of colony shapes with a known area $A_C$. Consequently, the number of colonies inside a non-circular shape with area $A$ is assumed to be $N = A / A_C$, computed in the module denoted “Resulter”. The average colony area is assumed to be the arithmetic mean of the areas of all colony shapes that were successfully recognized in the first pass. Applying the watershed algorithm twice, guided by the compactness ratio test, is found to be more accurate than a single watershed segmentation. The accuracy is examined in more detail in Section 3. Over 300 sample Petri images were tagged to form the ground truth data in the microbiology laboratory of the Environmental

2. Proposed System

The proposed system is illustrated in Figure 2. The system has two main parts: the pattern detector and the counter. The counter is kept as a separate module to obtain a scalable and reusable system. Modifications can be applied to the pattern detector part.

2.1 Pattern Detector

The pattern detector is the first step of the system. It has three tasks: Petri dish boundary detection, noise removal, and colony candidate pattern detection. Petri dish boundary detection is not fully automated. The program interface asks the user for three points on the boundary of the dish. The parameters of the circle that encloses the dish are found by solving the circle equation for the three points $P_1 = (x_1, y_1)$, $P_2 = (x_2, y_2)$, and $P_3 = (x_3, y_3)$, which yields the circle center $(x_0, y_0)$ and radius $r$:

E1: $(x_1 - x_0)^2 + (y_1 - y_0)^2 = r^2$
E2: $(x_2 - x_0)^2 + (y_2 - y_0)^2 = r^2$
E3: $(x_3 - x_0)^2 + (y_3 - y_0)^2 = r^2$

Our data set consists of photos of about 2-3 megapixels taken by an amateur-grade digital camera with a fair amount of mis-focus and grain noise. Median filtering is used for noise removal as a pre-processing step. Colony candidate detection is a thresholding process, for which Otsu’s threshold algorithm [6] is used.
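The three-point boundary fit (equations E1 to E3) can be sketched as follows. Subtracting E1 from E2 and from E3 cancels the quadratic terms and leaves a 2x2 linear system in the center coordinates. This is a minimal Python sketch rather than the Matlab implementation used in the paper; the function name `circle_from_three_points` and the collinearity tolerance are assumptions made for illustration.

```python
import math

def circle_from_three_points(p1, p2, p3):
    """Return (x0, y0, r) of the circle through three non-collinear points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # E2 - E1 and E3 - E1 are linear in the unknown center (x0, y0):
    #   a1*x0 + b1*y0 = c1
    #   a2*x0 + b2*y0 = c2
    a1, b1 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    a2, b2 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    c1 = x2**2 + y2**2 - x1**2 - y1**2
    c2 = x3**2 + y3**2 - x1**2 - y1**2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:  # illustrative tolerance, not from the paper
        raise ValueError("points are collinear; no unique circle")
    # Cramer's rule for the 2x2 system
    x0 = (c1 * b2 - c2 * b1) / det
    y0 = (a1 * c2 - a2 * c1) / det
    r = math.hypot(x1 - x0, y1 - y0)
    return x0, y0, r
```

For example, the points (8, 4), (3, 9), and (-2, 4) lie on the circle with center (3, 4) and radius 5, which is what the function recovers.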


final compactness test is applied, but this time its output is not processed further by the watershed algorithm. Instead, if the compactness test still fails, indicating non-circular clusters, the shapes are marked as non-separable and sent to the “Resulter” module, which estimates the number of colonies by dividing the cluster area by a “normal” colony area. According to personal communications with biologists, this kind of estimation is common practice in real-life applications.
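The area-based estimate performed by the Resulter can be illustrated with a short sketch. It assumes that the “normal” colony area is the arithmetic mean of the areas of the shapes already accepted as single colonies, and that every non-separable cluster contributes at least one colony. The second point is an assumption of this sketch and is not stated in the paper, as is the function name `estimate_colony_count`.

```python
def estimate_colony_count(single_areas, cluster_areas):
    """Count single colonies plus an area-based estimate for each cluster.

    single_areas:  areas (in pixels) of shapes accepted as single colonies
    cluster_areas: areas (in pixels) of non-separable clusters
    """
    if not single_areas:
        raise ValueError("need at least one recognized colony to set the reference area")
    mean_area = sum(single_areas) / len(single_areas)
    total = len(single_areas)
    for area in cluster_areas:
        # Integer division, applied to each cluster separately.
        # The floor of 1 is an assumption of this sketch.
        total += max(1, int(area // mean_area))
    return total
```

With single-colony areas of 100, 110, and 90 pixels (mean 100) and clusters of 250 and 430 pixels, the clusters are estimated as 2 and 4 colonies, for a total of 9.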

Figure 3: Pattern detector steps
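The thresholding step of the pattern detector relies on Otsu’s method [6], which picks the gray level that maximizes the between-class variance of the histogram. The following is a minimal pure-Python sketch of that idea for 8-bit gray values. The function names are hypothetical, and treating pixels brighter than the threshold as colony candidates is an assumption, since the polarity depends on the medium and the lighting.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Class 0 holds pixels with value <= t, class 1 holds pixels with value > t.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    w0 = 0        # number of pixels in class 0 so far
    sum0 = 0.0    # sum of gray values in class 0 so far
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break             # class 1 empty from here on
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

def binarize(pixels, threshold):
    """Mark pixels above the threshold as foreground (assumed polarity)."""
    return [1 if p > threshold else 0 for p in pixels]
```

The sketch works on a flat list of pixel values; a real image would be flattened first, and median filtering would be applied before thresholding as described in Section 2.1.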

2.2 Counter

The counter is the encapsulating part of the system, which performs the iterative splitting and counting operations. The pattern detector produces raw candidates, and its output is far from accurate. In fact, the main “detection” process is embedded in the counter for better accuracy. The critical and difficult problem is that the initial outputs (clusters) may not always correspond to a single colony (see Figure 5). The outputs of the pattern detector (thresholding) are observed to be good starting points for the watershed algorithm, which is used to segment clusters more finely. Despite this improved accuracy, many of the relatively “large” clusters (with more than two actual colonies) may not be topologically split into the “real” colonies by the watershed algorithm. Moreover, the watershed algorithm does not guarantee that a single colony will not be split into two or more fake colonies. As seen in Figure 2, a three-state method is developed as a remedy for such circumstances. Each block in the figure is presented in the following subsections.
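The control flow of the counter can be sketched as below. The classifier and the two splitters are passed in as callables, so the sketch only shows how the compactness test and the two watershed passes interleave. All names are hypothetical, and applying the refined split to each failing sub-shape (rather than to the whole original cluster) is an assumption of this sketch, since the text does not settle that detail.

```python
def two_pass_split(shapes, is_compact, split, split_refined):
    """Route shapes through classify -> split -> classify -> refined split.

    shapes:        candidate shapes from the pattern detector
    is_compact:    callable(shape) -> bool, the compactness test
    split:         callable(shape) -> list of sub-shapes (first watershed pass)
    split_refined: callable(shape) -> list of sub-shapes (refined parameters)
    Returns (colonies, unresolved); unresolved shapes go to the Resulter.
    """
    colonies, unresolved = [], []
    for shape in shapes:
        if is_compact(shape):              # first test: already a single colony
            colonies.append(shape)
            continue
        for part in split(shape):          # first watershed pass
            if is_compact(part):           # second test
                colonies.append(part)
                continue
            for sub in split_refined(part):    # second watershed pass
                if is_compact(sub):            # final test, no further splitting
                    colonies.append(sub)
                else:
                    unresolved.append(sub)
    return colonies, unresolved
```

Keeping the splitters as parameters mirrors the paper's stated aim of a modular counter: a different segmentation method could be swapped in without touching the control flow.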

Figure 4: 1-3: colony, 4-5: cluster

2.2.2 Cluster Splitter

Clusters are split into sub-regions by the cluster splitter module, and each sub-region is assumed to be a colony. Here, the watershed segmentation algorithm is adopted. A case of watershed segmentation is illustrated in Figure 5. If the cluster consists of only “a few” colonies, this algorithm usually produces plausible results.
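To make the splitting idea concrete, the sketch below implements a simplified marker-based watershed by priority flooding on a small grid. It is not the Matlab routine used in the paper: it draws no watershed lines, uses 4-connectivity, and expects the caller to supply both the height map and the seed markers. The function name and the convention of marking background pixels with `None` are assumptions of this sketch.

```python
import heapq

def watershed_flood(height, markers):
    """Grow labeled seeds over a height map, lowest pixels first.

    height:  2-D list of numbers; None marks background that is never labeled
    markers: 2-D list of ints; 0 = unlabeled, positive = seed label
    Returns a 2-D list of labels with the same shape.
    """
    rows, cols = len(height), len(height[0])
    labels = [row[:] for row in markers]
    heap, order = [], 0   # 'order' breaks ties so earlier pushes pop first
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] > 0:
                heapq.heappush(heap, (height[r][c], order, r, c))
                order += 1
    while heap:
        _, _, r, c = heapq.heappop(heap)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if labels[nr][nc] != 0 or height[nr][nc] is None:
                continue
            labels[nr][nc] = labels[r][c]   # flood the neighbor from this basin
            heapq.heappush(heap, (height[nr][nc], order, nr, nc))
            order += 1
    return labels
```

For two merged colonies, a typical choice of height map is the negated distance to the cluster boundary, with one seed near each colony center, so that the two basins meet along the neck between the colonies. That choice is a common convention and is not taken from the paper.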

2.2.1 Pattern Classifier

The examined bacteria colonies have a circular shape. The pattern classifier tests whether a binary shape corresponds to a colony or to a cluster according to its so-called compactness ratio, also known as the circularity ratio. The circularity ratio is defined as the ratio of the area of the closed shape to the area of a circle (the most compact shape) having the same perimeter, given mathematically as $CR = 4\pi \cdot \text{area} / \text{perimeter}^2$. As a shape deviates from a circle toward non-convex shapes, the value decreases [7]. By thresholding this ratio for each shape, the pattern classifier module separates patterns into two groups: colonies and clusters of colonies. In order to improve efficiency, the pattern classifier test is applied a second time to the output of the watershed algorithm, which splits the non-circular cluster into finer colony shapes. Following the second pass, a

Figure 5: Watershed segmentation for two merged colonies
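The compactness ratio of Section 2.2.1, $CR = 4\pi \cdot \text{area} / \text{perimeter}^2$, can be checked numerically on simple outlines. The sketch below computes it for a polygonal boundary using the shoelace formula. The function names are hypothetical, and the paper does not state which threshold separates colonies from clusters, so no threshold is hard-coded here.

```python
import math

def polygon_area_perimeter(points):
    """Area (shoelace formula) and perimeter of a closed polygon."""
    twice_area, perimeter = 0.0, 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]   # wrap around to close the outline
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter

def compactness_ratio(area, perimeter):
    """CR = 4*pi*area / perimeter**2; equals 1 for a perfect circle."""
    return 4.0 * math.pi * area / perimeter ** 2

# A near-circle, a square, and an elongated 4x1 rectangle
circle = [(10 * math.cos(2 * math.pi * k / 360), 10 * math.sin(2 * math.pi * k / 360))
          for k in range(360)]
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
bar = [(0, 0), (4, 0), (4, 1), (0, 1)]
cr_circle = compactness_ratio(*polygon_area_perimeter(circle))
cr_square = compactness_ratio(*polygon_area_perimeter(square))
cr_bar = compactness_ratio(*polygon_area_perimeter(bar))
```

The ratio is close to 1 for the near-circle, drops to $\pi/4$ for the square, and falls further for the elongated rectangle, which is the behavior the classifier exploits to flag merged colonies.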

2.2.3 Resulter

The resulter is the last module of the counter. It receives shapes from the previous modules and finds the number of colonies for


two cases: If the input shape is “small” and circular, the shape is counted as a single colony. For larger, non-circular shapes (non-separable clusters), the number of actual colonies is estimated as the ratio of the cluster area to an average colony area. The average colony area is determined automatically as the size corresponding to the “most populated” colony sizes, as suggested by the microbiologists. Integer division is applied one by one to all inseparable clusters to minimize error.

3. Experimental Results

In our experiments, we used over 300 images taken at different times with completely different consumer-grade compact digital cameras at resolutions of 3264x2448 and 2048x1536. The data set is provided by the Microbiology Laboratory of the Department of Environmental Engineering, Anadolu University, Eskisehir. All Petri dishes contain colonies grown from E. coli bacteria (samples are shown in Figure 6). The results are compared with (i) ground truth data, (ii) a freely available colony counter (Clono-Counter), and (iii) a one-shot watershed-segmentation-based counting method. To better visualize the performance of the methods under different circumstances, the data set is split into six subcategories with different colony shape characteristics. The categories are exemplified by the six sample images in Figure 6, labeled (a) to (f). The colony counts are presented in Table I. A quick inspection of this table indicates that the proposed two-pass method gives plausible results compared to the single-pass output. Surprisingly, the popular Clono-Counter gives unacceptable results in several of the cases, which makes the search for alternative image-processing-based methods necessary.

Figure 6: Some sample colony images (panels a to f) corresponding to six different colony formations.

Table I: Colony numbers found by various methods over the colony formations indicated in Figure 6.

Sample   Clono-Counter   One-shot watershed   Proposed method   Ground truth
a        172             29                   32                37
b        222             194                  282               225
c        211             119                  149               141
d        347             1018                 1375              1825
e        932             527                  626               660
f        191             423                  431               513
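The counts in Table I can be turned into relative errors against the ground truth to make the comparison easier to read. The sketch below is only arithmetic on the published numbers; the dictionary keys and function names are chosen here for illustration.

```python
# Counts copied from Table I
TABLE_I = {
    "a": {"clono": 172, "one_shot": 29,   "proposed": 32,   "truth": 37},
    "b": {"clono": 222, "one_shot": 194,  "proposed": 282,  "truth": 225},
    "c": {"clono": 211, "one_shot": 119,  "proposed": 149,  "truth": 141},
    "d": {"clono": 347, "one_shot": 1018, "proposed": 1375, "truth": 1825},
    "e": {"clono": 932, "one_shot": 527,  "proposed": 626,  "truth": 660},
    "f": {"clono": 191, "one_shot": 423,  "proposed": 431,  "truth": 513},
}

def relative_errors(table):
    """Absolute relative error |count - truth| / truth per sample and method."""
    errors = {}
    for sample, row in table.items():
        truth = row["truth"]
        errors[sample] = {method: abs(count - truth) / truth
                          for method, count in row.items() if method != "truth"}
    return errors

def mean_error(errors, method):
    """Average relative error of one method over all samples."""
    return sum(e[method] for e in errors.values()) / len(errors)

errors = relative_errors(TABLE_I)
for method in ("clono", "one_shot", "proposed"):
    print(method, round(mean_error(errors, method), 3))
```

Averaged over the six samples, the relative error comes out at roughly 15% for the proposed method, about 22% for the one-shot watershed, and about 100% for Clono-Counter, the last figure being dominated by sample (a). On sample (b) alone, Clono-Counter is the closest of the three to the ground truth.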

Despite the difference in the running environment, a detailed comparison is made of image processing and/or machine vision libraries that are distributed under a free license such as GPL, LGPL, or MIT. The platform capability, efficiency, usability, and documentation of each library are listed as follows:

4. Packaging of the Algorithm

In this study, the project is implemented and tested in the Matlab environment because of its flexibility and rather extensive library. Obviously, such an all-purpose platform is not suitable for developing a distributable, fast, and user-friendly program. Therefore, an executable version of the program will eventually be developed.


work has started using .NET-based and Qt-based application structures with the OpenCV libraries.

AForge:
Compatible Language: CLR-capable languages (Visual C++, C#, VB.NET, etc.)
Platform: Windows (Common Language Runtime)
Documentation: Complete API documentation with a few example projects and articles.
Usability: Easy to use and learn.
Efficiency: Too slow.

5. Conclusions and Future Work

In this paper, a new image-processing-based method is proposed for robust, efficient, and automated counting of colonies in Petri dishes. The system is purely software-centered and relies on digital images produced by consumer-grade (cheap) digital cameras. The proposed system can handle colonies in fairly noisy images with occasional focus problems. The main problem of concern was the existence of clusters of colonies that cannot be visually split into individual colonies. Applying more than one method to the same image produced reasonable performance in such circumstances. As future work, full automation of the overall process (including detection and extraction of the plate region) will be developed. It is also desired to generalize the approach and develop methods that are not specific to circular colony shapes, in order to handle more than one type of bacteria colony.

IPL98:
Compatible Language: C, C++
Platform: Cross-platform
Documentation: Complete API documentation only.
Usability: Hard to use and learn.
Efficiency: Not tested.

CVIPTools:
Compatible Language: C, C++
Platform: Cross-platform
Documentation: Complete API documentation with example projects and a book.
Usability: Hard to use and learn.
Efficiency: Not tested.

OpenCV:
Compatible Language: C, C++
Platform: Cross-platform
Documentation: Complete API documentation with several projects and pages on the internet.
Usability: Hard to use.
Efficiency: Very efficient and extensible with available modules.

6. References

[1] M. Goyal, “Machine Vision Based Bacteria-Colony Counter”, Thesis, Electrical and Instrumentation Department, Thapar University.
[2] J. M. Bewes, N. Suchowerska, D. R. Mckenzie, “Automated cell colony counting and analysis using the circular Hough image transform algorithm (CHiTA)”, Physics in Medicine and Biology, 2008, vol. 53, no. 21, pp. 5991-6008.

EmguCV:
EmguCV is a .NET wrapper of OpenCV, so it has nearly all the advantages of OpenCV.
Compatible Language: CLR-capable languages (Visual C++, C#, VB.NET, etc.)
Platform: Windows (Common Language Runtime); cross-platform capable using “mono-project”.
Documentation: Complete API documentation.
Usability: Easy to use and learn.
Efficiency: Similar to OpenCV.

[3] P. R. Barber, B. Vojnovic, J. Kelly, C. R. Mayes, P. Boulton, M. Woodcock, M. C. Joiner, “Automated counting of mammalian cell colonies”, Physics in Medicine and Biology, 2001, vol. 46, no. 1, pp. 63-76.
[4] Shih-Hsuan Chiu, Jiun-Jian Liaw, “An effective voting method for circle detection”, Pattern Recognition Letters, vol. 26, no. 2, pp. 121-133.

It can be noticed that the algorithms used in our method (such as Otsu’s thresholding, watershed segmentation, median filtering, edge detection, etc.) are also available in several freely licensed libraries. Therefore, converting our method to an executable form is clearly possible. Initial compilation

[5] F. Meyer, “Topographic distance and watershed lines”, Signal Processing, vol. 38, July 1994, pp. 113-125.


[6] N. Otsu, “A Threshold Selection Method from Gray-Level Histograms”, IEEE Transactions on Systems, Man, and Cybernetics, vol. 9, no. 1, 1979, pp. 62-66.
[7] “Compactness measure of a shape”, online at: http://en.wikipedia.org/wiki/Compactness_measure_of_a_shape.
[8] M. Niyazi, I. Niyazi, C. Belka, “Counting colonies of clonogenic assays by using densitometric software”, Radiation Oncology, 2007, 2:4, doi:10.1186/1748-717X-2-4.
[9] “Colifast”, online at: http://www.colifast.no/
[10] “AForge.Net Framework”, online at: http://www.aforgenet.com.
[11] Image Processing Library 98, online at: www.mip.sdu.dk/ipl98/.
[12] CVIPTools, online at: www.ee.siue.edu/CVIPtools/
[13] OpenCV Wiki, online at: http://opencv.willowgarage.com/.
[14] Emgu CV, online at: http://www.emgu.com/.
[15] Qt software, online at: http://www.qtsoftware.com/.
