Int. J. Advanced Operations Management, Vol. 5, No. 1, 2013
Automatic tissue classification by integrating medical expert anatomic ontologies

H. Kang, A. Pinti* and Abdelmalik Taleb-Ahmed
Université de Valenciennes, LAMIH UMR CNRS 8201 UVHC, Le Mont Houy, Valenciennes, France
and I3 MTO, Université d’Orléans, 45032 Orléans, France
*Corresponding author

Abstract: Image segmentation techniques are widely used in medical image analysis. However, existing methods cannot attach an exact physical meaning to the segmented image regions, because they rely mainly on basic image features such as grey level and texture and ignore specialised medical knowledge. Yet such knowledge is indispensable when doctors analyse medical images for diagnosis. To address this problem, many tissue classification systems have been developed that incorporate specific medical knowledge. These systems, however, depend strongly on their particular applications and therefore cannot provide a general structure for integrating medical knowledge in a larger context. To treat medical image recognition in a systematic way, we propose in this paper a general intelligent tissue classification system which combines a clustering algorithm based on fuzzy C-means (FCM) with qualitative medical knowledge on the geometric properties of different tissues. In this system, a general model is proposed for formalising unstructured and non-normalised medical knowledge from various medical images. This model uses the DOGMA approach (a natural language-based ontology system) for the formal representation of these geometric features.

Keywords: image segmentation; tissue classification; formal representation; DOGMA approach.

Reference to this paper should be made as follows: Kang, H., Pinti, A. and Taleb-Ahmed, A. (2013) ‘Automatic tissue classification by integrating medical expert anatomic ontologies’, Int. J. Advanced Operations Management, Vol. 5, No. 1, pp.3–13.

Copyright © 2013 Inderscience Enterprises Ltd.

Biographical notes: H. Kang is currently a researcher in the development of 3D televisions in China. He received his Master of Computer Sciences and Technology from the University of Technology of Beijing (China) in 2004 and his PhD in Computer Sciences from the University of Valenciennes (France) in 2009. He developed methods for integrating expert knowledge in data analysis and medical image analysis.
A. Pinti is currently an Assistant Professor at the University of Valenciennes (France), and is also assigned to the I3 MTO laboratory at the University of Orléans. He received his PhD in Computer Sciences from the University of Mulhouse in 1993. His research interests include modelling and image processing for biomechanics applied to sports and rehabilitation.

Abdelmalik Taleb-Ahmed received his Postgraduate degree and his PhD in Electronics and Microwaves from the Université de Lille 1 in 1988 and 1992, respectively. From 1992 to 2004, he was an Associate Professor at the Université du Littoral, Calais. Since 2004, he has been a Professor at the Université de Valenciennes in the Department GE2I, and conducts his research at the LAMIH UMR CNRS 8201 UVHC. His research interests include signal and image processing, image segmentation, prior knowledge integration in image analysis, partial differential equations and variational methods in image analysis, multimodal signal processing, and medical image analysis, including multimodal image registration.

This paper is a revised and expanded version of a paper entitled ‘Integration of human knowledge for automatic tissue classification on medical images’ presented at The 8th International Conference on Computational Intelligence in Decision and Control (FLINS), Madrid, Spain, 2008.
1 Introduction
With the rapid development of computer technology, medical imaging plays an increasingly important role in our lives. In diagnosis using magnetic resonance imaging, image segmentation is an indispensable step. It partitions an image into a number of non-overlapping regions that are homogeneous with respect to some characteristic such as grey level or texture (Zhang and Chen, 2004). Many clustering-based methods such as fuzzy C-means (FCM) (Ahmed et al., 2002; Chen and Zhang, 2004; Wang et al., 2008; Li et al., 2008) have been proposed for image segmentation. However, these methods cannot attach a physical meaning to the segmented image objects, because the general features they use, such as grey level and texture, do not take specialised medical knowledge into account. This knowledge is crucial when doctors study medical images manually using their own vision and experience. Therefore, it is necessary to incorporate a priori knowledge on medical image analysis in order to annotate the segmented objects or classes.

In this context, various knowledge-based tissue classification systems have been proposed, including the analysis of brain tissues (Li et al., 1993; Li et al., 1994; Li et al., 1995), the analysis of bone composition (Liu et al., 1999), and chest diagnosis (Brown et al., 1997). Nevertheless, all of these systems focus on specific applications and are neither normalised nor structured, so their results become uncertain and imprecise when they are applied in other contexts. Therefore, a more general automatic tissue classification system is needed, one that integrates medical knowledge in a flexible way and can be adapted to different applications. In this paper, we propose an intelligent generalised tissue classification system which combines the FCM algorithm with qualitative medical knowledge on the geometric properties of different tissues in order to annotate the segmented objects.
For the purpose of knowledge reusability, we propose to use DOGMA (Jarrar and Meersman, 2002), a natural language-inspired ontology approach (Spyns et al., 2002), in place of data modelling (ter Bekke, 1992) for the formal representation of these geometric features. Highly specialised data-based models are difficult to understand for readers who have not mastered the related background knowledge, and they are often too specialised to be reused and shared in other applications. In contrast to data modelling, the essential property of DOGMA is its independence of specific applications, i.e., the ontology consists of relatively generic knowledge that can be reused in different applications. In DOGMA, an ontology is composed of a lexon base, which holds sets of plausible conceptual relations derived by agreement from linguistic means, and a layer of ontological commitments, which annotates the application with a subset of those lexons and specifies the constraints under which the application may use these conceptual relations. Separating the lexon base from the domain rules implements a separation of concerns and thus achieves reuse and design scalability.
2 The proposed system
In this section, we will first introduce the basic paradigm of DOGMA ontology engineering. Then we will present the proposed tissue classification system based on the DOGMA ontology.
2.1 Foundations of DOGMA approach

A DOGMA ontology has two components: a lexon base and ontological commitments. The lexon base, also called a ‘light ontology base’, aims at performing domain axiomatisations. It is composed of a set of binary conceptual relations (Meersman, 1999). The lexical rendering of a binary conceptual relation is called a lexon. A lexon is formally described as ⟨γ, term a, relation, contrary-relation, term b⟩, in which term a and term b are linguistic terms, and relation and contrary-relation are the two relations of the lexon: term a has the relation with term b, and term b has the contrary-relation with term a. In most cases, the relation is genuinely contrary to the contrary-relation, e.g., relation = ‘inside’ and contrary-relation = ‘outside’. In some special cases, the relation is symmetric and therefore identical to the contrary-relation, e.g., relation = contrary-relation = ‘neighbour’. γ is a context identifier which is used to ‘interpret’ a linguistic term as a concept. A DOGMA context axiomatically performs disambiguation, viz. for each context γ and each term i, the pair (γ, term i) is assumed to refer to a unique (abstract) concept. An ontology is based on many interpretation-independent plausible fact types about a universe of discourse (Spyns et al., 2008). In DOGMA, lexons are formal, axiomatic yet linguistically determined, and agreed representations of propositions about a domain to be used for annotation. Thus, lexons have no further formal interpretation or semantics (Meersman, 2001); they have only an intuitive linguistic interpretation.

The commitment layer consists of a set of application axiomatisations. Specific applications map their symbols and relationships (expressed in terms of those application symbols) to the lexon base through such an application axiomatisation. Such a mapping is called an application ontological commitment. Each application axiomatisation is composed of a set of lexons from a lexon base and a set of rules to control the intended use of application concepts annotated by these lexons.
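As an illustration of the lexon structure described above, the following minimal Python sketch represents a lexon as a five-field record and shows how the same fact can be read from either term. The class name `Lexon`, its field names and the two helper methods are hypothetical choices made for this example; they are not part of DOGMA or of the system described in this paper.

```python
from typing import NamedTuple


class Lexon(NamedTuple):
    """One binary conceptual relation, readable in both directions."""
    context: str            # context identifier (gamma)
    term_a: str
    relation: str           # role of term_a towards term_b
    contrary_relation: str  # role of term_b towards term_a
    term_b: str

    def inverse(self) -> "Lexon":
        """Return the same fact read from term_b's side."""
        return Lexon(self.context, self.term_b,
                     self.contrary_relation, self.relation, self.term_a)

    def is_symmetric(self) -> bool:
        """True when relation and contrary-relation coincide, e.g. 'neighbour'."""
        return self.relation == self.contrary_relation


# Two example lexons in the context 'geometric feature'.
inside = Lexon("geometric feature", "cortical bone", "inside", "outside", "muscle")
neighbour = Lexon("geometric feature", "muscle", "neighbour", "neighbour",
                  "adipose tissue")
```

Here `inside.inverse()` reads as "muscle is outside cortical bone", while `neighbour` is its own inverse up to the order of the two terms.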
2.2 The proposed ontology-based system

In this study, we build a DOGMA ontology system according to the previous principles. This system makes it possible to formalise a priori knowledge on medical image analysis concerning the geometric features of tissues, i.e.,

a relative positions between tissues
b neighbouring relation of each tissue
c ranking order of all tissues according to their areas
d number of connected components of certain tissues
e tissues having similar grey level
f shape of certain tissues.
Assuming that there exist n tissues on a specific medical image, denoted T1, …, Tn, the proposed DOGMA system creates lexons for the knowledge on geometric properties of tissues as follows.

1 Relative positions between tissues. ⟨geometric feature, tissue i, relative position, contrary relative position, tissue j⟩, where i, j ∈ {1, …, n} and ‘relative position’ is an attribute between tissue i and tissue j that belongs to the set {above, below, left, right, above left, below left, above right, below right, inside, outside}. For example, ⟨geometric feature, cortical bone, inside, outside, muscle⟩ means that cortical bone is included inside muscle in the context of ‘geometric feature’, and that muscle is outside cortical bone.

2 Neighbouring positions between tissues. For i, j ∈ {1, …, n}, the lexon states that tissue i and tissue j are neighbours.

3 Descending order of areas between tissues with one connected component. For i, j ∈ {1, …, n}, the lexon states that the area of tissue i is larger than that of tissue j and, conversely, that the area of tissue j is smaller than that of tissue i.

4 Numbers of connected components in tissues. For i ∈ {1, …, n}, the lexon states that tissue i has a specific number of connected components and, conversely, that this number of connected components belongs to the features of tissue i.

5 Tissues having similar grey level. For i, j ∈ {1, …, n}, the lexon states that tissue i and tissue j have similar grey levels.

6 Shape of certain tissue. For i ∈ {1, …, n}, the lexon states that tissue i has a specific shape, taken from the set {triangle, circle, rectangle, …}.
We store the previous lexons in a predefined data structure, i.e., a two-dimensional array. We now take the lexon ‘relative positions between tissues’ as an example to show how they are stored.

Name of array: RelativePosition. The data structure is an n × 5 array of the following form.

Context    Tissue 1    RelativePosition    Contrary-RelativePosition    Tissue 2

RelativePosition ∈ {above, below, left, right, above left, below left, above right, below right, inside, outside}. For example, the row ⟨geometric feature, cortical bone, inside, outside, muscle⟩ means that cortical bone is included inside muscle in the context of ‘geometric feature’.

Apart from the previous typical lexons, there also exist other similar ones characterising the geometric properties of tissues.

After the construction of the lexon base, we focus on the commitment layer. We first need an application axiomatisation of tissue classification. Then we define rules specifying and restricting the use of application terms interpreted by lexons from the lexon base, in particular rules which make it possible to disambiguate objects and identify tissues in an image.

Rule 1 For any labelled tissue Ti, if there is only one unlabelled tissue Tj such that there is a relative position between them, and if Tj does not have a similar grey level to any other tissue, then we can find the class corresponding to Tj by comparing positions of pixels and label it.

Rule 2 For each unlabelled tissue Ti, if all its neighbouring tissues have been labelled, then we can find and label Ti using connectivity analysis related to these neighbouring tissues.

Rule 3 If each pair of tissues with similar grey levels contains only one unlabelled tissue Ti, then we can rank all the unlabelled classes with one connected component in descending order of area and match them with the correspondingly ranked unlabelled tissues with one connected component.

Rule 4 Given a specific number of connected components, if the number of unlabelled tissues is 1, and if the number of unlabelled classes is 1, then this class is equivalent to the related tissue.

Rule 5 For each group of tissues that have similar grey levels and therefore correspond to one class obtained by the FCM-based image segmentation algorithm, if only one tissue in the group has not been labelled, then we can separate and label it by subtracting the pixels of the labelled tissues from this class.

Rule 6 For each shape, if the number of unlabelled tissues which have this shape is 1, and if the number of corresponding unlabelled classes with one connected component is 1, then this class is equivalent to the related tissue.
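To make the mechanics of such a rule concrete, the following Python sketch shows one possible reading of Rule 5 on pixel sets: when all but one tissue of a similar-grey-level group are labelled, the remaining pixels of the class are attributed to the unlabelled tissue. The function name `apply_rule_5`, the representation of pixels as (row, column) tuples and the tiny example data are assumptions made for illustration, not the implementation used in this work.

```python
from typing import Dict, FrozenSet, Optional, Set, Tuple

Pixel = Tuple[int, int]  # (row, column), an assumed representation


def apply_rule_5(class_pixels: Set[Pixel],
                 group: FrozenSet[str],
                 labelled: Dict[str, Set[Pixel]]
                 ) -> Optional[Tuple[str, Set[Pixel]]]:
    """Label the single unlabelled tissue of a similar-grey-level group.

    Returns (tissue, pixels) when exactly one tissue of `group` is still
    unlabelled; returns None otherwise, meaning the rule does not fire.
    """
    unlabelled = [tissue for tissue in group if tissue not in labelled]
    if len(unlabelled) != 1:
        return None
    remaining = set(class_pixels)
    for tissue in group:
        if tissue in labelled:
            # Subtract the pixels already attributed to labelled tissues.
            remaining -= labelled[tissue]
    return unlabelled[0], remaining


# Hypothetical toy class C1 containing background and cortical bone pixels.
c1 = {(0, 0), (0, 1), (0, 2), (1, 1)}
labelled = {"background": {(0, 0), (0, 1), (0, 2)}}
result = apply_rule_5(c1, frozenset({"background", "cortical bone"}), labelled)
```

In this toy case the background has already been extracted, so the rule attributes the single remaining pixel of the class to cortical bone.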
To optimise the application of these rules, we assign them different priorities. Rules with higher priorities are applied before those with lower priorities. For example, we define the priority of a rule by its number of premises, so that rules with fewer premises have higher priority than those with more premises. We then have:

1 Rules 2, 3 and 5 have the higher priority, because each of them has one premise
2 Rules 1, 4 and 6 have the lower priority, because each of them has two premises.
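The priority scheme above amounts to sorting the rules by their number of premises. A minimal Python sketch is given below; the premise counts are those stated in the text, whereas breaking ties by rule number is an assumption of this example, since the text does not specify an order among rules of equal priority.

```python
from typing import Dict, List

# Premise counts as stated in the text: Rules 2, 3 and 5 have one premise,
# Rules 1, 4 and 6 have two.
PREMISES: Dict[int, int] = {1: 2, 2: 1, 3: 1, 4: 2, 5: 1, 6: 2}


def application_order(premises: Dict[int, int]) -> List[int]:
    """Order rules by premise count; ties keep numeric order (an assumption)."""
    return sorted(premises, key=lambda rule: (premises[rule], rule))


order = application_order(PREMISES)  # [2, 3, 5, 1, 4, 6]
```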
3 Validation of the proposed system
The proposed system has been successfully applied to tissue classification of the human arm, forearm, thigh, crus and brain. One example (classification of the thigh) is given below to show its effectiveness.

Figure 1 One MRI image of human thigh

Figure 1 shows one MRI image of the thigh. It was acquired in the axial plane, T1-weighted, using a 1.5 Tesla Philips Intera system. The voxel size is 0.93 × 0.93 × 1.0 mm³, and the image is coded on 12 bits. In addition, the signal was acquired with the spin echo technique in order to minimise artefacts. We can see that there are four tissues and one background. Cortical bone and background have similar grey levels, as do bone marrow and adipose tissue. When applying the proposed system, we run the FCM algorithm with three predefined classes (p = 3) because only three grey levels can be observed. The clustering algorithm assigns cortical bone and background to one class C1, adipose tissue and bone marrow to another class C2, and muscle to C3. We first extract the background from C1 because it is an object with only one connected component touching all four borders of the image. Then we recursively apply the six rules to separate new tissues from the obtained classes. We use Rule 5 (groups of tissues having similar grey levels) to label cortical bone as the pixels remaining in C1 once the background has been removed. Next, we apply Rule 2 (neighbours of tissues) with respect to the cortical bone to determine the pixels of bone marrow in C2. The remaining pixels of C2 can then be identified as adipose tissue using Rule 5 with respect to the bone marrow. Using Rule 2 with respect to cortical bone and adipose tissue, we can label C3 as muscle.
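The clustering step with p = 3 can be illustrated by a small sketch of the standard fuzzy C-means updates applied to scalar grey levels. This is a textbook formulation in plain Python, not the FCM-based algorithm actually used in this work: the function name `fcm_1d`, the fuzzifier m = 2, the fixed iteration count, the initial centres and the sample grey levels are all assumptions chosen for the example.

```python
from typing import List, Sequence, Tuple


def fcm_1d(values: Sequence[float], centres: Sequence[float],
           m: float = 2.0, iterations: int = 50
           ) -> Tuple[List[float], List[int]]:
    """Standard fuzzy C-means on scalar grey levels (requires m > 1).

    Returns the final class centres and, for each value, the index of the
    class with the highest membership.
    """
    centres = list(centres)
    exponent = 2.0 / (m - 1.0)
    memberships: List[List[float]] = []
    for _ in range(iterations):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)).
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centres]
            if 0.0 in dists:
                # A value lying exactly on a centre belongs fully to it.
                hit = dists.index(0.0)
                row = [1.0 if i == hit else 0.0 for i in range(len(centres))]
            else:
                row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                       for d_i in dists]
            memberships.append(row)
        # Centre update: weighted mean of the values with weights u_ik^m.
        for i in range(len(centres)):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            if total > 0.0:
                centres[i] = sum(w * x for w, x in zip(weights, values)) / total
    hard = [row.index(max(row)) for row in memberships]
    return centres, hard


# Hypothetical grey levels forming three visible groups (dark, medium, bright).
grey = [12, 15, 18, 95, 100, 105, 240, 245, 250]
centres, hard = fcm_1d(grey, [0.0, 128.0, 255.0])
```

With these illustrative values, the hard assignment groups the nine grey levels into three classes of three values each. Each class may still mix several tissues with similar grey levels, which is why the rules are needed afterwards.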
Having classified all tissues, we assign the following grey levels to these classes in order to generate a synthetic image that shows the classification result intuitively: background: 20; adipose tissue: 250; muscle: 60; cortical bone: 200; bone marrow: 150.

Figure 2 Intuitive comparison of different tissue classification methods for human thigh: (a) original image (b) classification reference given by one expert (c) the synthetic image obtained by using the specific system A (d) the synthetic image obtained by using the specific system B (e) the synthetic image obtained by using the proposed system
Figure 2 intuitively compares the original image of the thigh, two synthetic images generated from the tissues identified by two specific tissue classification systems, A (Colin et al., 1995) and B (Han, 2009), the synthetic image generated from the tissues identified by the proposed method, and one reference classification provided by a medical expert. In system A, the original image of the thigh is first segmented into three classes by an FCM-based clustering algorithm. By using the knowledge on the ranking order of tissue volumes, only three tissues (muscle, adipose tissue and bone marrow) can be identified. The corresponding result is shown in Figure 2(c). System B is a tissue classification procedure in which medical knowledge controls the execution of the FCM clustering algorithm. In each step of this procedure, FCM is applied in order to obtain two segmented classes under the control of the knowledge. The synthetic image obtained by system B is shown in Figure 2(d).

We can see from Figure 2 that cortical bone and background are identified by system B and the proposed system but cannot be separated by system A. In addition, systems A and B are applicable to the classification of the thigh only and are difficult to adapt to other parts of the human body. In contrast, the proposed system is more general: it allows ontologies to be created for different kinds of medical knowledge and can be applied to different parts of the human body. Moreover, the integration of medical knowledge in system A is a manual procedure, whereas the proposed system is completely automatic. In practice, compared with the other methods for thigh classification, the proposed system leads to the most satisfying results.

Table 1 and Table 2 give the quantitative comparison between systems A and B and the proposed system with respect to the reference classification provided by a medical expert. The performance criterion for the tissue Tj related to the ith method (PCTij) is given as follows.

$$PCT_{ij} = \frac{\left|A_{ij} \cap A_{\mathrm{ref}j}\right|}{\left|A_{ij} \cup A_{\mathrm{ref}j}\right|} \qquad (1)$$

where Aij is the set of pixels belonging to the jth class obtained by the ith method, Arefj is the set of pixels belonging to the jth class in the reference image, and |·| denotes the number of pixels in a set. For all identified tissues, the performance criterion for evaluating the ith method (PCi) is given by

$$PC_{i} = \frac{\left|\bigcup_{j=1}^{NuT} \left(A_{ij} \cap A_{\mathrm{ref}j}\right)\right|}{\left|\left(\bigcup_{j=1}^{NuT} A_{ij}\right) \cup \left(\bigcup_{j=1}^{NuT} A_{\mathrm{ref}j}\right)\right|} \qquad (2)$$
where NuT is the number of tissues in the image.

Table 1 Quantitative comparison of classification for thigh

                      Adipose tissue   Cortical bone   Bone marrow   Muscle
A                     78.31%           U               92.31%        89.39%
B                     79.12%           77.82%          93.08%        75.21%
The proposed method   79.56%           84.11%          93.69%        89.95%
Table 2 Quantitative comparison of overall classification performance for thigh

        A        B        The proposed method
PCi     90.84%   93.39%   94.12%
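The two performance criteria of equations (1) and (2) can be computed directly on pixel sets, as in the following Python sketch. The function names `pct` and `pc`, the dictionary-based representation of a classification and the toy data are assumptions for illustration. Returning 0.0 for an empty union and treating a tissue missing from a method's output as an empty set are also conventions of this example only.

```python
from typing import Dict, Set, Tuple

Pixel = Tuple[int, int]  # (row, column), an assumed representation


def pct(found: Set[Pixel], reference: Set[Pixel]) -> float:
    """Equation (1): |A ∩ A_ref| / |A ∪ A_ref| for one tissue."""
    union = found | reference
    return len(found & reference) / len(union) if union else 0.0


def pc(found: Dict[str, Set[Pixel]],
       reference: Dict[str, Set[Pixel]]) -> float:
    """Equation (2): pooled overlap over all tissues of the reference."""
    hits: Set[Pixel] = set()      # union of per-tissue intersections
    covered: Set[Pixel] = set()   # union of all found and reference pixels
    for tissue, ref_pixels in reference.items():
        got = found.get(tissue, set())
        hits |= got & ref_pixels
        covered |= got | ref_pixels
    return len(hits) / len(covered) if covered else 0.0


# Hypothetical four-pixel example with one misclassified pixel, (0, 2).
found = {"muscle": {(0, 0), (0, 1), (0, 2)}, "adipose tissue": {(1, 0)}}
reference = {"muscle": {(0, 0), (0, 1)}, "adipose tissue": {(0, 2), (1, 0)}}
muscle_score = pct(found["muscle"], reference["muscle"])   # 2/3
overall_score = pc(found, reference)                       # 3/4
```

In the toy data, the single misplaced pixel lowers both the per-tissue and the overall score, because it enlarges the union without contributing to any intersection.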
From Tables 1 and 2, we can see that the proposed system leads to the best classification results for all identified tissues and that its overall classification performance is also the best (94.12%). This result is significant in medical image analysis: in practice, a slight increase in the classification rate can lead to a quite different medical conclusion. The higher the classification rate, the closer the automatic tissue classification system is to professional medical experts. We also applied the proposed system to the classification of the human crus, arm, forearm and brain and obtained satisfying results, which are summarised in Table 3. The correct classification rates are high (above 92%) and remain within the interval [92%, 99%]. The accuracy and the capacity for generalisation of the proposed method are thus validated in a larger context.

Table 3 Overall classification performance of the proposed system for human crus, arm, forearm, and brain

        Crus     Arm      Forearm   Brain
PCi     98.06%   98.73%   95.94%    92.54%

4 Conclusions
Although automatic image segmentation has been successfully applied to medical analysis and diagnosis, the existing classical methods have drawbacks. They cannot effectively separate tissues having similar grey levels, nor can they give a meaningful physical interpretation of the segmented classes. Thus, it is necessary to integrate a priori knowledge on medical image analysis so as to interpret the segmented objects or classes. For this purpose, various knowledge-based tissue classification systems have been proposed. Owing to the incorporation of medical knowledge, the results obtained by these systems are more accurate than those obtained by image segmentation methods based on pixel grey levels alone. However, the existing tissue classification systems focus on the analysis of specific parts of the human body. They are therefore neither normalised nor structured, and cannot easily be adapted to other parts of the human body. In practice, key geometric features vary greatly from tissue to tissue, which makes the extraction of common general features very complex.

In this context, we have developed an ontology-based general system for tissue classification that may easily be shared by different agents and applications, as well as by professional experts. This system combines an image segmentation method with qualitative medical knowledge. Moreover, for the purpose of knowledge reusability, we applied the DOGMA paradigm to create a partial ontology for the formal representation of the medical and anatomical knowledge involved. In this way, professionals from different backgrounds can communicate and collaborate on the basis of this knowledge. In this approach, we first construct a lexon base. Then we create rules specifying and constraining the use of terms in the lexons, in particular those which make it possible to disambiguate and identify tissues in an image. The ontology-based classification system can be used for the formal representation of all kinds of human knowledge on medical image observation and can automatically incorporate this knowledge into the image segmentation procedure. We successfully validated this system on the human thigh, crus, arm, forearm and brain, and the tissue classification results obtained are satisfying.

Based on the proposed system, we will further work on the quantification of the muscle/fat ratio, the assessment of the temporal variation of the muscle/fat ratio, and the measurement of the volume of muscle, fat and bone in human legs. Furthermore, we will explore other common tissue features and add them to the lexon base. This will allow new applications to commit to the new lexons in the lexon base.
References

Ahmed, M.N., Yamany, S.M., Mohamed, N., Farag, A.A. and Moriarty, T. (2002) ‘A modified fuzzy C-means algorithm for bias field estimation and segmentation of MRI data’, IEEE Transactions on Medical Imaging, March, Vol. 21, No. 3, pp.193–200.
Brown, M.S., McNitt-Gray, M.F., Mankovich, N.J., Goldin, G.J., Hiller, J., Wilson, L.S. and Aberle, D.R. (1997) ‘Method for segmenting chest CT image data using an anatomical model: preliminary results’, IEEE Transactions on Medical Imaging, December, Vol. 16, No. 6, pp.828–839.
Chen, S. and Zhang, D. (2004) ‘Robust image segmentation using FCM with spatial constraints based on new kernel-induced distance measure’, IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, August, Vol. 34, No. 4, pp.1907–1916.
Colin, A., Erbland, E., Datin, C., Boire, J.Y., Veyre, A. and Zanca, M. (1995) ‘Automatic muscle/fat quantification on MR images’, Proceedings of the IEEE Engineering in Medicine and Biology 17th International Conference, Montreal, Vol. 1, pp.479–480.
Han, K. (2009) ‘Intelligent medical image annotation system based on expert knowledge – application to the medical analysis of musculo skeletal diseases and disabilities from MRI sequences’, PhD thesis, Université de Valenciennes, France.
Jarrar, M. and Meersman, R. (2002) ‘Formal ontology engineering in the DOGMA approach’, ODbase, LNCS, Vol. 2519, pp.1238–1254, Springer, Heidelberg.
Li, C., Goldgof, D.B. and Hall, L.O. (1993) ‘Knowledge-based classification and tissue labeling of MR images of human brain’, IEEE Transactions on Medical Imaging, December, Vol. 12, No. 4, pp.740–750.
Li, H., Deklerck, R. and Cornelis, J. (1994) ‘Integration of multiple knowledge sources in a system for brain CT-scan interpretation based on the blackboard model’, Proceedings of the 10th Conference on Artificial Intelligence for Applications, San Antonio, TX, USA, pp.336–343.
Li, H., Deklerck, R., Cuyper, B.D., Hermanus, A., Nysser, E. and Cornelis, J. (1995) ‘Object recognition in brain CT-scans: knowledge-based fusion of data from multiple feature extractors’, IEEE Transactions on Medical Imaging, June, Vol. 14, No. 2, pp.212–229.
Li, X., Zhang, T. and Qu, Z. (2008) ‘Image segmentation using fuzzy clustering with spatial constraints based on Markov random field via Bayesian theory’, Special section on Signal Processing for Audio and Visual Systems and its Implementations, IEICE Transactions Fundamentals, Japan, June, Vol. E91-A, No. 3, pp.723–728.
Liu, Z.Q., Liew, H.L., Clement, J.G. and Thomas, C.D.L. (1999) ‘Bone image segmentation’, IEEE Transactions on Biomedical Engineering, May, Vol. 46, No. 5, pp.565–570.
Meersman, R. (1999) ‘The use of lexicons and other computer-linguistic tools in semantics, design and cooperation of database systems’, in Zhang, Y., Rusinkiewicz, M. and Kambayashi, Y. (Eds.): Cooperative Database Systems and Applications, Springer, Heidelberg, pp.1–14.
Meersman, R. (2001) ‘Ontologies and databases – more than a fleeting resemblance’, OES/SEO Workshop, Luiss Publications.
Spyns, P., Meersman, R. and Jarrar, M. (2002) ‘Data modeling versus ontology engineering’, SIGMOD Record: Special Issue on Semantic Web and Data Management, Vol. 31, pp.12–17.
Spyns, P., Tang, Y. and Meersman, R. (2008) ‘An ontology engineering methodology for DOGMA’, Journal of Applied Ontology, Vol. 3, Nos. 1–2, pp.13–39.
ter Bekke, J.H. (1992) Semantic Data Modeling, Prentice Hall, Hemel Hempstead.
Wang, J., Kong, J., Lu, Y., Qi, M. and Zhang, B. (2008) ‘A modified FCM algorithm for MRI brain image segmentation using both local and non-local spatial constraints’, Computerized Medical Imaging and Graphics, Vol. 32, No. 8, pp.685–698.
Zhang, D. and Chen, S. (2004) ‘A novel kernelized fuzzy C-means algorithm with application in medical image segmentation’, Artificial Intelligence in Medicine, Vol. 32, No. 1, pp.37–50.