Detection and quantification of intracerebral and intraventricular hemorrhage from computed tomography images with adaptive thresholding and case-based reasoning Yuanxiu Zhang, Mingyang Chen, Qingmao Hu & Wenhua Huang
International Journal of Computer Assisted Radiology and Surgery A journal for interdisciplinary research, development and applications of image guided diagnosis and therapy ISSN 1861-6410 Volume 8 Number 6 Int J CARS (2013) 8:917-927 DOI 10.1007/s11548-013-0830-x
ORIGINAL ARTICLE
Detection and quantification of intracerebral and intraventricular hemorrhage from computed tomography images with adaptive thresholding and case-based reasoning Yuanxiu Zhang · Mingyang Chen · Qingmao Hu · Wenhua Huang
Received: 22 December 2012 / Accepted: 6 March 2013 / Published online: 23 August 2013 © CARS 2013
Y. Zhang · M. Chen · Q. Hu (B)
Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
e-mail: [email protected]

W. Huang (B)
Nanfang Medical University, Guangzhou, China
e-mail: [email protected]

Abstract
Purpose Hemorrhage within the brain space (HWBS) involves the brain parenchyma and ventricular system and is associated with high morbidity and mortality. Computed tomography (CT) head scans are the recommended modality for the diagnosis and treatment of HWBS. However, HWBS detection may be difficult when the hemorrhage is inconspicuous, and quantification is hard because the intensity of hemorrhage is highly variable and overlaps with that of normal brain tissue. An algorithm is proposed to detect and quantify HWBS.
Methods Adaptive thresholding and case-based reasoning (CBR) were applied to HWBS in four steps: preprocessing to extract the brain, adaptive thresholding based on local contrast with varied window sizes to derive candidate HWBS regions, case representation to describe each candidate HWBS region by parameters on context as well as intensity and geometrical characteristics, and classification of HWBS by taking each candidate HWBS region as a case and applying CBR. Additionally, case base indexing and weight optimization were used to increase retrieval speed and improve performance. Each recognized HWBS was then refined for quantification.
Results Validation on 426 clinical CT data sets indicates that the proposed algorithm achieved a detection rate of 94.4 % and a recall of 79.2 % for detecting HWBS regions. Visually, the HWBS regions calculated from adaptive thresholding plus refinement agreed well with expert delineation. For 10 representative data sets with small to large hemorrhage, the algorithm yielded a segmentation accuracy of 0.950 ± 0.015. Case base indexing increased the retrieval speed by 41.1 times at the expense of a 0.5 % decrease in detection rate and a 2.6 % decrease in recall. Genetic algorithm optimization raised the detection rate and recall to 94.9 and 83.5 %, respectively.
Conclusions We developed and tested an algorithm that combines adaptive thresholding and CBR for detecting and quantifying HWBS. Experiments showed that adaptive thresholding could provide suitable candidates, while CBR was able to identify HWBS regions. The proposed method has potential as a new tool for accurately detecting and quantifying HWBS.
Keywords Hemorrhage within the brain space · Computed tomography · Adaptive thresholding · Case-based reasoning
Introduction
Hemorrhage within the brain space (HWBS) involves the brain parenchyma and ventricular system and is associated with high morbidity and mortality. It is a multifactorial disorder with heterogeneous etiologies and potentially long-term debilitating outcomes. Prompt detection and treatment could affect the outcome [1]. The global incidence of HWBS ranges from 10 to 20 cases per 100,000 population. Non-contrast computed tomography (CT) is the technique of choice for screening HWBS. Spontaneous HWBS is one of the most devastating forms of stroke and has no proven treatment. However, with the recent promising results of the recombinant factor VIIa and STICH trials, HWBS is becoming an area of major research [2].
The delineation of HWBS requires correct interpretation of the demonstrable HWBS on CT [3]. This can become difficult when the lesion is inconspicuous, e.g., small or masked by normal structures, or when the reader is inexperienced. In most parts of the world, acute care physicians are the only persons to read the CT images at odd hours when radiologists' expertise may not be immediately available. However, the skill of acute care physicians in interpreting brain CT has been shown to be imperfect [4]. Another study has shown that radiology residents can, albeit infrequently, overlook hemorrhage on brain CT [5]. Studies on quantifying hemorrhage from non-contrast CT are scarce. Loncaric et al. [6] employed a semiautomatic method that grows intracerebral hemorrhage from manually set seed points and fixed intensity thresholds. Chan [3] developed an image analysis system to delineate small acute intracranial hemorrhage, which could not be extended to non-small HWBS due to possible deformation and mass effect. Hu et al. [7] proposed segmenting HWBS by combining global thresholding of intensity and asymmetry with respect to the midsagittal plane with ad hoc knowledge. Bhanu Prakash et al. [8] proposed segmenting HWBS using a modified level set, achieving an accuracy in the range of 0.858–0.917 for 200 CT scans within 5 min, which might need enhancement in both speed and accuracy. The major problem with existing methods for quantifying HWBS is their inability to adapt to the great intensity variability of HWBS, which leads to poor accuracy. As such, in clinics, the volume and anatomical location of HWBS are still approximated manually. This is tedious and laborious, requires readers to be experts in neuroradiology, and offers neither control over accuracy nor an accurate three-dimensional shape of the HWBS [9].
It is our objective to recognize HWBS based on a mechanism similar to human beings and to delineate HWBS with accuracy and robustness. The proposed method is a combination of producing HWBS candidate regions via adaptive local thresholding and recognition of HWBS from candidates via CBR, where each candidate HWBS region is considered a case.
Materials and methods
Materials
Altogether, 426 non-contrast head CT studies were retrospectively retrieved from 3 hospitals in China, comprising 357 patients with acute or subacute HWBS and 69 normal subjects. The ethics committees of the collaborating hospitals approved the project. All data were acquired with a single detector CT scanner (GE or Siemens Medical Systems). The images were axial, obtained parallel to the orbitomeatal line. The slice thickness was either 5 or 10 mm, with a matrix of 512×512 and a field of view of around 220 mm × 220 mm. Patient age (60 ± 20 years) ranged from 2 days to 90 years. Among these subjects, 340 were male (79.8 %) and 86 were female (20.2 %); all were suspected stroke patients needing immediate imaging and corresponding therapeutic actions. The volume of HWBS ranged from 0.35 to 152 ml. All image data were in DICOM (Digital Imaging and Communications in Medicine, 16-bit) format and anonymized.

Methods
The algorithm consists of four steps: preprocessing, adaptive thresholding, case representation, and CBR. Figure 1 is the corresponding flowchart, while Fig. 2 shows a sketch map of the processing flow.

Preprocessing
This step derives the brain from head CT images. It follows [10], which is based on fuzzy c-means clustering and connected component analysis, with minor modifications: the thresholds are more conservative and the extracted brain is visually checked. The derived brain contains white matter, gray matter, cerebrospinal fluid, HWBS, and edema within the skull. As we are only concerned with brain tissues and HWBS, and all these tissues have a CT value smaller than 90 Hounsfield units, the following windowing operation can be applied to the original DICOM volume orgDV(x, y, z) to yield an 8-bit volume orgVol8(x, y, z) for further processing:

$$\mathrm{orgVol8}(x, y, z) = \begin{cases} 0, & \mathrm{orgDV}(x, y, z) \le 0 \\ 255 \times \dfrac{\mathrm{orgDV}(x, y, z)}{90}, & 0 < \mathrm{orgDV}(x, y, z) < 90 \\ 255, & \mathrm{orgDV}(x, y, z) \ge 90 \end{cases}$$

For ease of notation, the brain image is denoted as Brain(x, y, z), which takes the intensity of orgVol8(x, y, z) when (x, y, z) is a brain voxel, or 0 otherwise.

Adaptive thresholding
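Adaptive thresholding operates on the 8-bit windowed brain volume defined under Preprocessing. As a minimal sketch of that windowing step, assuming the input array already holds Hounsfield units and that fractional values are truncated (the text does not state the rounding), one could write the following. The names `window_to_8bit`, `mask_brain`, and `upper_hu` are illustrative and do not come from the paper.

```python
import numpy as np


def window_to_8bit(org_dv, upper_hu=90.0):
    """Window CT values to 0..255: <= 0 maps to 0, >= upper_hu maps to 255."""
    scaled = 255.0 * np.asarray(org_dv, dtype=np.float64) / upper_hu
    # Assumption: fractional values are truncated when cast to 8 bits.
    return np.clip(scaled, 0.0, 255.0).astype(np.uint8)


def mask_brain(org_vol8, brain_mask):
    """Brain(x, y, z): windowed intensity inside the brain mask, 0 elsewhere."""
    return np.where(brain_mask, org_vol8, 0).astype(np.uint8)
```

Here `brain_mask` stands for the boolean result of the brain extraction step, which is not reproduced in this sketch.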
HWBS is bright on non-contrast CT images, but its intensity varies substantially depending on factors such as hemoglobin concentration and position of the hemorrhage. Detecting HWBS purely from intensity will not work because of intensity overlap between HWBS and normal brain tissue. Due to the intensity overlap, substantial intensity variability within an HWBS region and between HWBS of different patients, as well as the unpredictable size of HWBS, global thresholding, and local thresholding with fixed window size will not work. Inspired by the fact that human beings differentiate objects according to their local contrast, we propose to
find candidate HWBS regions by adaptively determining local contrast.

Fig. 1 Flow chart of the proposed algorithm (input CT image, preprocessing, brain image, adaptive thresholding segmentation, binary image, case representation of connected regions, case-based reasoning with the case base, detection result)

Fig. 2 Sketch map of the processing flow. Top row from left to right: original axial slice, derived brain, HWBS candidates from adaptive thresholding; middle row: 4 cases enclosed by rectangles; bottom row from right to left: brain with the rectangles of 4 cases, classified HWBS in red rectangles and non-HWBS in green rectangles, and the segmented HWBS in red

In the simplest scenario, consider a region with two different intensities B and G, where the proportion of voxels with intensity B is p; the intensity standard deviation is then $\sqrt{p(1 - p)}\,|G - B|$. The local standard deviation is thus proportional to the local contrast |G − B|. When p is 0.5, i.e., the local window contains the same proportion of background and foreground, the local standard deviation reaches its maximum of 0.5|G − B|. On the other hand, if the proportion p or (1 − p) is decreased to 0.1 (which may be considered the minimum proportion for a local window with enough background and foreground), the local standard deviation will be 0.3|G − B|, which is 0.6 times the maximum local standard deviation. Based on these observations, the following local thresholding with varied window size is proposed: for each voxel (x, y, z) within the brain, the corresponding local window is the window centered at (x, y, z) with a window width W(x, y, z) and height H(x, y, z); for each axial slice (z being a constant), find the maximum intensity standard deviation for all possible window widths and heights of all the brain voxels and denote it as sdmax(z); for each voxel (x, y, z), find the minimum W(x, y, z) = H(x, y, z) such that the intensity standard deviation within the local window sd(x, y, z)
is not smaller than 0.6 ∗ sdmax(z). With an adaptive window size that contains balanced background and foreground, we now seek an appropriate way to determine the local threshold based on local contrast. According to [11], the local thresholding by Sauvola and Pietikainen [12] has been shown to yield the best binarization for document images with fixed window size. We thus propose to determine the local threshold T(x, y, z) based on an enhancement of Sauvola and Pietikainen's formula,

$$T(x, y, z) = m(x, y, z)\,(1 - k) + k\, m(x, y, z)\, \frac{sd(x, y, z)}{1.2\, sd_{max}(z)}$$

where m(x, y, z) is the local intensity mean and k is a constant to be determined. As T(x, y, z) is for binarizing dark foreground, HWBS (bright foreground) candidate voxels can be binarized by applying the above formula to 255 − orgVol8(x, y, z). All the foreground voxels are then grouped together on each axial slice according to their spatial connectivity by connected component labeling to form foreground regions or HWBS candidates. To speed up the adaptive local thresholding, the local intensity mean and standard deviation are calculated using integral images [13]. To summarize, the proposed adaptive thresholding tries to find an intensity threshold at every voxel (x, y, z); the thresholding is based on adaptive calculation of local contrast, which is represented by the local intensity standard deviation sd(x, y, z); the local contrast is approximated with an adaptively variable window size such that there is a balanced proportion of background and foreground.

Case-based reasoning
CBR is a problem-solving paradigm that differs fundamentally from other major artificial intelligence approaches. It classifies a new case by retrieving and comparing similar cases and has several advantages [14]: (1) it has a relatively low computational cost of incremental learning, as it can learn nonlinearly separable categories and continuous functions; (2) it does not rely on statistical assumptions, and its justifications are human-understandable because they are based on the principle of analogy-based reasoning that human beings frequently use during problem-solving; and (3) it does not require an explicit domain model. A comprehensive survey on CBR can be found in [15]. Figure 3 illustrates the CBR process. Given a description of a problem, CBR relies on indices to find potentially useful cases. A stored case consists of a problem describing the state of the world when the case occurred and the solution to that problem.

Fig. 3 CBR cycle (RETRIEVE, REUSE, REVISE, RETAIN)

Case indexing assigns labels to cases upon storing and organizes cases so that a similar case or cases can be found efficiently. Upon retrieving a matched case or cases, CBR looks for prominent differences and applies formulae
or rules to account for these differences when suggesting a solution [16]. Storing and indexing a new case in the case base is the final step in the CBR cycle. Case representation The first step of CBR is case representation. In this paper, each HWBS candidate region from adaptive thresholding within an axial slice is considered as a case, detecting HWBS is converted to “whether a case is an HWBS.” Sixteen parameters are explored for case representation through experiments. These parameters could be classified into 2 groups. The first group contains 5 parameters to represent the context of a case within the same axial slice, while the second group includes 11 parameters to describe the intensity and shape properties of a case. The 5 parameters of the first group are: (1) the maximum area of all the cases within axial slice z (max_area(z)); (2) ratio of the area of all the cases to the brain area of axial slice z (total_area(z)); (3) the minimum area below which a case could not be considered as an HWBS region (ltd_area); (4) number of cases whose area is greater than ltd_area within axial slice z (case_num(z)); and (5) relative position of axial slice z in the volumetric data (slice(z)). The slice with the largest brain area (called maximum brain slice and its brain area is denoted as maxBrain) will be assigned slice(z) = 0.5. Suppose the axial slices are arranged from superior to inferior direction, then axial slice z will have slice(z) assigned as follows: an axial slice z superior to the maximum brain slice will have slice(z) calculated as
0.5 ∗ brainArea(z)/maxBrain, while axial slices inferior to the maximum brain slice will have slice(z) calculated as 1 − 0.5 ∗ brainArea(z)/maxBrain, where brainArea(z) is the brain area of axial slice z. The 11 parameters of the second group are: (1) number of voxels of the case (area); (2) ratio of the number of voxels to the number of boundary voxels of a case (form_factor); (3) approximation of the shape of the case with a fixed size (8×8) by proportionally scaling the original object and determining the proportion value (from 0 to 1, with 0 for a complete background block and 1 for a complete foreground block) through averaging (icon); (4) relative x position of the case (center_x); (5) relative y position of the case (center_y); (6) width of the rectangular bounding box of the case (width); (7) height of the rectangular bounding box of the case (height); (8) ratio of width to height of the case (asp_ratio); (9) ratio of the minimum of width and height to the maximum of width and height of the case (norm_ratio); (10) ratio of the average intensity of the case within axial slice z to the average intensity of the brain of axial slice z (brightness); and (11) the minimum Euclidean distance between the boundary voxels of the case within axial slice z and all the brain boundary voxels of axial slice z (distance). Only those candidates whose volumes are greater than 0.1 ml are considered valid candidates and used for quantification.

Case similarity
Case similarity is based on the similarity among parameters of cases. Euclidean distance is employed for defining similarities of parameters. Assume Goal and Source are two cases; the Euclidean distance of parameter i between them is calculated as:

$$dist_i = \left( \frac{Source_i - Goal_i}{Span_i} \right)^2$$

where Span_i is the maximum value of |Source_i − Goal_i| for all training cases. The similarity between two cases is a function of the weighted sum of these distances, with weights w_i being specified in Table 1:

$$sim_{goal,source} = 1 - \frac{\sum_i w_i \times dist_i}{\sum_i w_i}$$
Table 1 Weights of the parameters

Parameter    Weight   Parameter   Weight   Parameter     Weight
Slice        1.0      Area        1.0      Form_factor   1.0
Max_area     0.1      Center_x    0.1      Asp_ratio     0.1
Total_area   0.2      Center_y    0.1      Norm_ratio    0.1
Ltd_area     0.2      Width       0.2      Brightness    0.2
Case_num     0.1      Height      0.2      Distance      0.1
Icon         0.2
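The case similarity defined above can be sketched as follows. This is only an illustration under stated assumptions: every parameter is treated as a scalar (how the 8×8 icon is compared is not specified in the text, so a parameter missing from either case is simply skipped), and retrieval returns the single most similar case, which is one possible reading of the retrieval step. The names `WEIGHTS`, `similarity`, and `most_similar` are hypothetical.

```python
# Subjective weights as listed in Table 1.
WEIGHTS = {
    "slice": 1.0, "area": 1.0, "form_factor": 1.0,
    "max_area": 0.1, "center_x": 0.1, "asp_ratio": 0.1,
    "total_area": 0.2, "center_y": 0.1, "norm_ratio": 0.1,
    "ltd_area": 0.2, "width": 0.2, "brightness": 0.2,
    "case_num": 0.1, "height": 0.2, "distance": 0.1,
    "icon": 0.2,
}


def similarity(goal, source, span, weights=WEIGHTS):
    """1 - (sum_i w_i * dist_i) / (sum_i w_i), dist_i = ((source_i - goal_i) / span_i)^2."""
    weighted, total = 0.0, 0.0
    for name, w in weights.items():
        # Assumption: parameters absent from a case (e.g. a non-scalar icon) are skipped.
        if name not in goal or name not in source or name not in span:
            continue
        s = span[name]
        dist = ((source[name] - goal[name]) / s) ** 2 if s > 0 else 0.0
        weighted += w * dist
        total += w
    if total == 0.0:
        raise ValueError("no shared parameters to compare")
    return 1.0 - weighted / total


def most_similar(goal, case_base, span, weights=WEIGHTS):
    """Assumed retrieval rule: return the stored case with the highest similarity."""
    return max(case_base, key=lambda case: similarity(goal, case, span, weights))
```

In this sketch `span` maps each parameter name to its Span value, i.e., the largest absolute difference observed over the training cases.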
Assessment criteria
The proposed algorithm is assessed using common performance measures, i.e., detection rate and recall for CBR, and the κ index for segmentation. The detection rate is the ratio of correctly classified cases to all cases of a patient, while the recall is the percentage of correctly identified HWBS cases. In order to have high accuracy, the system needs both a high detection rate and a high recall. The κ index measures the match between the segmented and reference HWBS voxels. Mathematically, detection rate, recall, and κ index are defined, respectively, as:

$$DetectionRate = \frac{\#(TP + TN)}{\#(TP + FP + TN + FN)}$$

$$Recall = \frac{\#TP}{\#(TP + FN)}$$

$$\kappa = \frac{\#(S_1 \cap S_2)}{(\#S_1 + \#S_2)/2}$$

where TP stands for true positive (case being HWBS judged as HWBS), FP for false positive (case being non-HWBS judged as HWBS), TN for true negative, FN for false negative, S1 and S2 for the sets of voxels judged as HWBS by the adaptive thresholding algorithm plus refinement and by experts, respectively, and # for the number of elements of the set, or cardinality. Another measure is the F-measure [17], defined as:

$$F\_Measure = \frac{(\beta^2 + 1) \times DetectionRate \times Recall}{\beta^2 \times DetectionRate + Recall}$$
where β represents the relative importance between detection rate and recall.
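The four measures can be written down directly from the definitions above. The following is a small sketch, with voxel sets represented as Python sets of coordinates; the function names are illustrative only.

```python
def detection_rate(tp, fp, tn, fn):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp, fn):
    """Fraction of true HWBS cases that were identified as HWBS."""
    return tp / (tp + fn)


def kappa_index(s1, s2):
    """Overlap between two voxel sets relative to their mean size."""
    return len(s1 & s2) / ((len(s1) + len(s2)) / 2)


def f_measure(det_rate, rec, beta=1.0):
    """F-measure combining detection rate and recall with relative weight beta."""
    b2 = beta * beta
    return (b2 + 1) * det_rate * rec / (b2 * det_rate + rec)
```

As a consistency check, with β = 1 the reported detection rate of 0.944 and recall of 0.792 give an F-measure of about 0.861, which agrees with the fitness value listed for the subjective weights in Table 4.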
Experiments Experiments have been carried out to validate the algorithm regarding the performance of adaptive thresholding and CBR. The statistics of the experimental data are listed in Table 2. Each case is identified by radiological experts as either HWBS or non-HWBS, which is then used as the reference for evaluating the performance of the CBR.
Table 2 Statistics of the experimental data

Items    Number   Including            Number   Percentage
Data     426      With hemorrhage      357      83.8
                  Without hemorrhage   69       16.2
Slices   6,411
Cases    19,760   Hemorrhage cases     2,474    12.5
                  Non-hemorrhage       17,286   87.5
As manually delineating HWBS regions is time-consuming and requires considerable expertise, 10 typical HWBS data with large, medium, and small HWBS sizes were selected for expert delineation in order to assess the performance of segmentation. The proposed algorithm was implemented in C++, and each data could be automatically processed within 30 s on a Pentium 4 PC (2.4 GHz and 2 GB RAM). Among the 357 HWBS data, 355 were classified as HWBS, while 2 HWBS patients with small HWBS volumes around the basal ganglia were misclassified as non-HWBS (HWBS volumes being, respectively, 0.35 and 1.04 ml). All 69 normal subjects were classified as normal.
Fig. 4 Similarity test of the case base: (a) performance (detection rate and recall) versus similarity; (b) similarity distribution (probability versus similarity)

Performance of case-based reasoning
The performance of the CBR method can be assessed by the detection rate and recall measures. Our validation is based on two observations: (1) whether one or several similar cases in the case base could always be found for a given new case; (2) whether one or several similar cases in the case base could always classify the new case correctly. Figure 4a shows that detection rate and recall increase with growing similarity when the similarity is above 0.1, while Fig. 4b is the frequency or probability distribution of similarity between any two cases. Figure 5b shows that, in a statistical sense, a random new case could find a case in the case base with the most probable similarity of 0.85. Then, from Fig. 5a, we can find the corresponding detection rate and recall to be, respectively, 0.94 and 0.79.
Performance using case base indexing With the increasing scale of the case base, finding the most similar case from the case base will take more and more time. Indexing of case base using critical parameters provides a way to decrease the retrieving time. We adopted a method similar to [14] and [18] for indexing. The original case base is divided into 10 sets according to the parameter slice, and each set is further divided into 100 subsets according to area and form_factor.
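Fig. 5 Nearest neighbor test: (a) performance (detection rate and recall) versus similarity; (b) similarity distribution (probability versus similarity)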
After the indexing, the original case base is divided into 1,000 subsets. To reduce the risk of missing the most similar cases, we propose that when a parameter falls into set i, the neighboring sets i − 1 and i + 1 should also be checked. The retrieval speed is enhanced remarkably after the indexing (41.1 times). However, the indexing scheme may still miss the most similar case. Consequently, the detection rate and recall decrease slightly, as shown in Table 3. Figure 6 shows the comparison of the similarity distribution for any two cases with or without indexing.

Table 3 Performance of case-based reasoning with or without case base indexing

Case base       Computational cost (retrieving cases)   Detection rate   Recall
Original        195,218,920                              0.944            0.792
With indexing   4,234,683                                0.939            0.766

Fig. 7 Procedure of optimization (fitness versus generation for the training set and the testing set)
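The indexing scheme with neighbor checking can be illustrated with a small sketch. It rests on assumptions that are not stated in the text: the three indexed parameters are taken to be normalized to [0, 1], and area and form_factor are each given 10 bins so that the total is 10 × 10 × 10 = 1,000 subsets. The names `bin_of`, `build_index`, and `retrieve_candidates` are hypothetical.

```python
from collections import defaultdict
from itertools import product

INDEX_KEYS = ("slice", "area", "form_factor")
N_BINS = 10  # assumption: 10 bins per indexed parameter, 1,000 subsets in total


def bin_of(value, n_bins=N_BINS):
    """Bin index of a value assumed to be normalized to [0, 1]."""
    return max(0, min(int(value * n_bins), n_bins - 1))


def build_index(cases):
    """Group cases into subsets keyed by their (slice, area, form_factor) bins."""
    index = defaultdict(list)
    for case in cases:
        index[tuple(bin_of(case[k]) for k in INDEX_KEYS)].append(case)
    return index


def retrieve_candidates(index, goal):
    """Collect cases from the goal's own bin and the bins i - 1 and i + 1 on each key."""
    centre = [bin_of(goal[k]) for k in INDEX_KEYS]
    ranges = [range(max(c - 1, 0), min(c + 1, N_BINS - 1) + 1) for c in centre]
    found = []
    for key in product(*ranges):
        found.extend(index.get(key, []))
    return found
```

Only the returned candidates would then be compared with the new case, which is where the reduction in retrieved cases comes from; a most similar case lying outside the neighboring bins is missed, consistent with the slight drop in detection rate and recall.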
Fig. 6 The variance of similarity distribution (similarity distribution before and after indexing; probability versus similarity)

Performance using optimized weights
The previous performance is achieved using weights specified in Table 1 according to subjective judgment. Alternatively, some weights can be assigned using a more objective method. A genetic algorithm (GA) is adopted to optimize weights as it can achieve global optimization and is effective and robust in searching very large spaces in a wide range of applications [19]. Recent developments of GA can be found in [20]. GA performs the search process in four stages: initialization, selection, crossover, and mutation. In the initialization stage, a population of genetic structures (known as chromosomes) that are randomly distributed in the solution space is selected as the starting point of the search. After the initialization stage, each chromosome is evaluated using a user-defined fitness function. Our fitness function is the F-measure defined in the "Materials and methods" section with β = 1. The mutation probability of an individual is calculated by $P = \frac{r}{n} P_{std}$, where r and n are, respectively, the rank of the current individual in the population and the scale of the population, and Pstd is a constant for the standard mutation probability. As optimizing the weights of 16 parameters is very complicated, only the 3 most important parameters (slice, area, and form_factor) are optimized. The 426 CT data are randomly divided into a training set and a testing set, each containing 213 data. The controlling parameters are specified as: population size 1,000, crossover rate 0.5, mutation rate 0.01 (Pstd), and the ending criterion being to reach the specified number of iterations. Figure 7 shows the fitness versus number of iterations. There are no marked differences in fitness between the training set and the testing set. If all 16 parameters are optimized, we may have the so-called overfitting problem, i.e., the fitness measure could be close to 1 for the training cases but quite low for the testing cases. Figure 8 and Table 4 confirm that after optimization of the 3 important parameters, the performance of CBR is enhanced, especially for the recall and fitness.

Fig. 8 Similarity test after optimization (detection rate and recall after optimization versus similarity)

Table 4 Comparison of optimal and subjective weights

                    Weight of parameters           Performance
                    Slice   Area    Form factor    Detection rate   Recall   Fitness
Subjective weight   1.0     1.0     1.0            0.944            0.792    0.861
Optimized weight    0.591   0.514   0.477          0.949            0.835    0.887

Performance of segmentation
As manual delineation of HWBS boundaries is tedious and laborious, with uncertainty especially when HWBS is not
conspicuous, careful attention has been paid to delineating the HWBS boundaries of 10 representative HWBS data for quantification. Because hemorrhage is mixed with normal brain tissue and has varying hemoglobin concentrations, the candidate regions from the proposed adaptive thresholding only capture the core part of the HWBS (k = 0.25), and the HWBS boundary is recovered by a simple refinement model: assume the average intensity of the HWBS candidate within axial slice z is avg_f, while the average intensity of the brain in axial slice z excluding all HWBS candidates is avg_b; then any voxel neighboring the HWBS candidate is added to the HWBS candidate when its intensity is greater than (avg_b + avg_f)/2. Figure 9 shows, respectively, an axial slice, the corresponding HWBS regions delineated by the algorithm and by the expert, and their difference. The HWBS volumes of the 10 representative data are, respectively, 0.69, 14.5, 23.4, 24.5, 24.8, 33.9, 52.5, 76.7, 91.7, and 142.5 ml. The 10 representative data are used to quantify the segmentation accuracy by comparing the hemorrhage manually drawn by the expert with the hemorrhage automatically segmented by the proposed algorithm, yielding a κ index of 0.950 ± 0.016.

Fig. 9 An axial slice of HWBS with substantial intensity change (a), manually delineated HWBS by an expert in yellow (b), segmented HWBS with the proposed method in red (c), and the difference between b and c (red for intersection, yellow for false negative and blue for false positive)
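The refinement model described above can be sketched for a single axial slice as follows. Several details are assumptions rather than statements from the paper: neighbors are taken as 4-connected, the two averages are computed once from the initial candidate, the mask passed in is treated as the union of all candidates on the slice, and the number of growth passes is exposed as the hypothetical parameter `n_passes`.

```python
import numpy as np


def refine_candidate(brain, candidate, brain_mask, n_passes=1):
    """Grow an HWBS candidate into neighboring voxels brighter than (avg_b + avg_f) / 2."""
    cand = np.asarray(candidate, dtype=bool).copy()
    mask = np.asarray(brain_mask, dtype=bool)
    avg_f = brain[cand].mean()           # mean intensity of the candidate
    avg_b = brain[mask & ~cand].mean()   # mean intensity of the remaining brain
    threshold = (avg_b + avg_f) / 2.0
    for _ in range(n_passes):
        # Voxels that touch the current candidate (4-neighborhood, an assumption).
        near = np.zeros_like(cand)
        near[1:, :] |= cand[:-1, :]
        near[:-1, :] |= cand[1:, :]
        near[:, 1:] |= cand[:, :-1]
        near[:, :-1] |= cand[:, 1:]
        grow = near & ~cand & mask & (brain > threshold)
        if not grow.any():
            break
        cand |= grow
    return cand
```

With `n_passes=1` only the immediate rim of the core is recovered; a larger value lets the region keep growing while neighboring voxels stay above the threshold.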
Discussion We have proposed a method which combines reasoning and segmentation to discern HWBS regions and delineate HWBS. Adaptive thresholding based on local contrast with variable window size has been proposed to find HWBS candidates, CBR is employed to discern HWBS, while adaptive thresholding plus refinement is explored to accurately segment HWBS. Extensive experiments have been carried out to validate the performance of CBR on 426 clinical CT data to yield a detection rate of 94.9 % and recall of 83.5 % after employing GA to optimize weights. Figure 10 shows typical misclassified cases (those marked in yellow) and correctly classified cases (red for HWBS cases and green for non-HWBS cases). The characteristics of those misclassified cases are: small area with medium intensity, likely to be misclassified as high signal around longitudinal fissure or sagittal sinus. As the ground truth of HWBS is difficult to attain, 10 clinical CT data with small, medium, and large HWBS are
Fig. 10 A typical image of classification of cases of the proposed method (see text for explanation)
123
Author's personal copy Int J CARS (2013) 8:917–927
925
used to quantify segmentation, yielding an accuracy of κ index 0.950 ± 0.016.

Experiments were carried out to compare the proposed adaptive thresholding (Fig. 11c) with optimum global thresholding, in which the threshold was adjusted manually to balance background and foreground (Fig. 11b). Global thresholding either misses some HWBS when the threshold is too high or includes non-HWBS voxels in the HWBS region, because the intensities of HWBS and non-HWBS voxels overlap. The proposed adaptive thresholding overcomes the overlap problem because it is based on local contrast instead of absolute intensity. Owing to the intensity overlap and high-signal artifacts, accurate delineation of HWBS is a challenge, and the proposed adaptive thresholding plus compensation of partial volume effects provides a possible solution.

The usual way to incorporate knowledge into a segmentation procedure is to employ ad hoc knowledge of the application, such as the intensity asymmetry of HWBS when HWBS lies within one hemisphere. The major problem with ad hoc knowledge is that it may not be generally applicable: intensity asymmetry, for example, does not apply when HWBS occurs in both hemispheres. As a case-matching paradigm, CBR has the prominent advantage of reasoning by analogy or similarity between cases instead of between voxels. In our opinion, CBR is ideal for classifying HWBS from candidates (cases), because many cases identified by experts are available, whereas describing explicit rules is difficult owing to the exceptions to specific rules.

Table 5 Comparison of detection performance with 3 parameters and 16 parameters

| Weights (slice, area, form factor) | Parameters | Detection rate | Recall |
|---|---|---|---|
| 1.0, 1.0, 1.0 | Only 3 parameters | 0.902 | 0.724 |
| 1.0, 1.0, 1.0 | +13 parameters | 0.944 | 0.792 |
| 0.591, 0.514, 0.477 | Only 3 parameters | 0.916 | 0.767 |
| 0.591, 0.514, 0.477 | +13 parameters | 0.949 | 0.835 |

Fig. 11 Comparative study of the proposed adaptive thresholding (c) and optimum global thresholding (b); (a) is the original image and (d) is the eventual HWBS obtained by combining adaptive thresholding (without refinement) with CBR
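The contrast between global and local thresholding shown in Fig. 11 can be illustrated on synthetic data. The following is a minimal sketch, not the algorithm of this paper: the box-shaped window, the `offset` margin, the synthetic image, and all function names are assumptions made for illustration, and the intensities are arbitrary units. A pixel is kept when it exceeds the mean of its neighbourhood by `offset`, with the neighbourhood mean computed from an integral image.

```python
import numpy as np

def box_mean(img, half):
    """Mean over a (2*half+1) x (2*half+1) window, clipped at the image border.

    Uses an integral image (summed-area table), so the cost per pixel does
    not depend on the window size.
    """
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    ii = np.zeros((h + 1, w + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    r0 = np.clip(np.arange(h) - half, 0, h)[:, None]
    r1 = np.clip(np.arange(h) + half + 1, 0, h)[:, None]
    c0 = np.clip(np.arange(w) - half, 0, w)[None, :]
    c1 = np.clip(np.arange(w) + half + 1, 0, w)[None, :]
    window_sum = ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]
    return window_sum / ((r1 - r0) * (c1 - c0))

def global_threshold(img, t):
    """Keep pixels whose absolute intensity exceeds t."""
    return np.asarray(img) > t

def local_contrast_threshold(img, half=7, offset=8.0):
    """Keep pixels that exceed their local mean by `offset` (illustrative rule)."""
    return np.asarray(img) > box_mean(img, half) + offset

# Synthetic slice: background ramps from 20 to 80, plus two faint blobs (+15).
img = np.tile(np.linspace(20.0, 80.0, 80), (40, 1))
truth = np.zeros(img.shape, dtype=bool)
truth[18:22, 10:14] = True   # blob on the dark side
truth[18:22, 66:70] = True   # blob on the bright side
img[truth] += 15.0

local_mask = local_contrast_threshold(img, half=7, offset=8.0)
high_global = global_threshold(img, 82.0)  # misses the blob on the dark side
low_global = global_threshold(img, 40.0)   # floods the bright background
```

On this toy image no single global threshold separates both blobs from the ramped background, whereas the local rule recovers them. Real CT data would additionally need the varied window sizes and the refinement step described in the Methods.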
For optimizing weights, we carried out experiments to compare the performance with 3 parameters and with 16 parameters, with and without weight optimization by GA. The results are given in Table 5. Performance improves when the weights of the 3 parameters (slice, area, and form factor) are optimized. With only the 3 parameters, the CBR yields a detection rate of 0.916 and a recall of 0.767 using the optimized weights (0.591, 0.514, and 0.477), compared with 0.902 and 0.724 for unit weights. Adding the other 13 parameters raises the detection rate to 0.949 and the recall to 0.835. This improvement shows that the other 13 parameters, with the subjective weights shown in Table 1, do play a role in recognizing hemorrhage from candidates. We do not optimize the weights of the other 13 parameters because of two major concerns: (1) optimizing the weights of 16 parameters via GA is very complicated and the time to train the weights grows exponentially, and (2) 16 weights optimized with respect to the training set would overfit it and generalize poorly to the testing set. We therefore choose the weights of the 3 most relevant parameters through GA optimization and assign the weights of the other 13 parameters subjectively, based on a judgment of their relative significance, to balance performance against overfitting (no overfitting according to Fig. 7). How to optimize multiple parameters to achieve good performance while avoiding overfitting is still an open issue for future exploration.

The main problem with CBR is case consistency. Figure 12 depicts the performance of the CBR, with the X axis for the slice (z) and the Y axis for the area of each case, while the Z axis is cut into different form_factor values. HWBS and non-HWBS cases can be very close when their area is small, slice (z) is close to 0.5, and form_factor is smaller than 5/64. These are the cases enclosed by the yellow ellipses in Fig. 12; they are hard to discern even for experts. Fortunately, these potentially misclassified cases have a small area (Fig. 12) and therefore have little impact on the accuracy of HWBS delineation.

The error of segmentation is mainly due to decreased HWBS intensity (Fig. 9b). It reflects the complexity of the problem itself: HWBS may have very variable intensity due to factors such as variation in hemoglobin concentration. Delineating HWBS with intensity similar to that of normal brain tissue is difficult for both the expert and any automatic algorithm. The proposed method yields an accuracy of κ index 0.950 ± 0.016 for the 10 representative data, which is close to that of radiology experts. To enhance accuracy further, a more sophisticated model of the refinement is necessary, which is yet to be explored.

Fig. 12 Three-dimensional parameter analysis of the performance of the CBR, red dots for HWBS classified by experts, green dots for non-HWBS. Axes: Slice (0 to 1) and Area (0 to 0.25); one panel per form factor value from 2/64 to 9/64
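The closeness of small HWBS and non-HWBS cases in the slice, area, and form factor space of Fig. 12 can be made concrete with a weighted nearest-case lookup. This is a minimal sketch rather than the retrieval scheme of this paper: the weighted Euclidean distance, the `classify_with_margin` helper, and the four toy cases are assumptions for illustration, and only the three weights (0.591, 0.514, 0.477) are taken from Table 5.

```python
import math
from typing import List, Tuple

Features = Tuple[float, float, float]  # (slice, area, form_factor), normalized
Case = Tuple[Features, bool]           # features plus expert label (True = HWBS)

# GA-optimized weights for slice, area, and form factor (Table 5).
WEIGHTS = (0.591, 0.514, 0.477)

def weighted_distance(a: Features, b: Features, weights=WEIGHTS) -> float:
    """Weighted Euclidean distance (an assumed similarity measure)."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))

def classify_with_margin(query: Features, case_base: List[Case]):
    """Return the label of the nearest case and the distance margin to the
    nearest case carrying the other label. Assumes both labels are present."""
    nearest_feats, label = min(
        case_base, key=lambda case: weighted_distance(query, case[0])
    )
    d_same = weighted_distance(query, nearest_feats)
    d_other = min(
        weighted_distance(query, feats)
        for feats, lab in case_base
        if lab != label
    )
    return label, d_other - d_same

# Hypothetical case base: two small, nearly identical cases with opposite
# labels near slice 0.5, plus one large HWBS case and one remote non-HWBS case.
CASE_BASE: List[Case] = [
    ((0.50, 0.010, 4 / 64), True),
    ((0.52, 0.012, 4 / 64), False),
    ((0.30, 0.150, 8 / 64), True),
    ((0.90, 0.020, 3 / 64), False),
]

large_label, large_margin = classify_with_margin((0.32, 0.140, 8 / 64), CASE_BASE)
small_label, small_margin = classify_with_margin((0.508, 0.011, 4 / 64), CASE_BASE)
```

The large candidate is matched with a wide margin, while the small candidate near slice 0.5 sits almost equally close to cases of both labels. That narrow margin is the case-consistency problem in miniature.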
Conclusion We have proposed an algorithm that determines HWBS candidates by adaptive thresholding, classifies HWBS by CBR, and segments HWBS by combining the HWBS core from recognized HWBS with refinement using a simple model. The algorithm has been validated on 426 clinical head CT data, achieving a detection rate of 94.9 % and a recall of 83.5 % in discerning HWBS, and a segmentation accuracy of κ index 0.950 ± 0.016 for 10 representative HWBS data with HWBS sizes from 0.69 to 142.5 ml. The proposed method may be a potential tool for detecting and quantifying HWBS in both research and clinical applications.
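For readers who wish to reproduce the two segmentation figures quoted above, the κ index and the hemorrhage volume can be computed from binary masks. The sketch below assumes that κ is the overlap measure 2|A ∩ B| / (|A| + |B|), a form commonly used for segmentation agreement; the definition given earlier in the paper takes precedence if it differs. The function names, the toy masks, and the voxel spacing are hypothetical.

```python
import numpy as np

def kappa_index(auto_mask, expert_mask):
    """Overlap 2|A n B| / (|A| + |B|) between two binary masks (assumed form)."""
    a = np.asarray(auto_mask, dtype=bool)
    b = np.asarray(expert_mask, dtype=bool)
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return float(2.0 * np.logical_and(a, b).sum() / denom)

def volume_ml(mask, spacing_mm):
    """Volume in ml from the voxel count and voxel spacing (dx, dy, dz) in mm.

    One ml equals 1000 cubic millimetres.
    """
    dx, dy, dz = spacing_mm
    return float(np.asarray(mask, dtype=bool).sum() * dx * dy * dz / 1000.0)

# Toy masks: the expert region has 72 voxels; the automatic result covers 60
# of them and nothing outside the expert region.
expert = np.zeros((4, 10, 10), dtype=bool)
expert[1:3, 2:8, 2:8] = True
auto = np.zeros_like(expert)
auto[1:3, 2:8, 3:8] = True

kappa = kappa_index(auto, expert)               # 2 * 60 / (60 + 72)
volume = volume_ml(expert, (0.5, 0.5, 5.0))     # 72 voxels of 1.25 mm^3 each
```

Under this definition κ penalizes missed and spurious voxels equally, so a value of 0.950 leaves little room for either kind of error.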
Acknowledgments This work has been supported by the National Program on Key Basic Research Project (No. 2013CB733800, 2013CB733803), the Key Joint Program of National Natural Science Foundation and Guangdong Province (No. U1201257), the National Natural Science Foundation of China (No. 61272328), the Guangdong Natural Science Foundation (No. S2011010001820), and the Shenzhen Key Basic Research Project (No. JC201005270370A). The authors would like to thank the collaborating hospitals (Linyi People’s Hospital, Nanfang Hospital, and Tiantan Hospital) for providing the real clinical data.
Conflict of interest There is no conflict of interest with any financial organization regarding the materials discussed or described in the paper.
References
1. Bouz P, Zouros A, Taha A et al (2012) Neonatal intracerebral hemorrhage: mechanisms, managements, and the outcomes. Transl Stroke Res 3:S6–S9
2. Bardera A, Boada I, Feixas M et al (2009) Semi-automated method for brain hematoma and edema quantification using computed tomography. Comput Med Imaging Graph 33:304–311
3. Chan T (2007) Computer aided detection of small acute intracranial hemorrhage on computer tomography of brain. Comput Med Imaging Graph 31:285–298
4. Schriger DL, Kalafut M, Starkman S et al (1998) Cranial computed tomography interpretation in acute stroke: physician accuracy in determining eligibility for thrombolytic therapy. JAMA 279(16):1293–1297
5. Wysoki MG, Nassar CJ, Koenigsberg RA et al (1998) Head trauma: CT scan interpretation by radiology residents versus staff radiologists. Radiology 208(1):125–128
6. Loncaric S, Dhawan AP, Broderick J et al (1995) 3-D image analysis of intra-cerebral brain hemorrhage from digital CT films. Comput Methods Programs Biomed 46:207–216
7. Hu Q, Chen Z, Wu J et al (2010) Delineation of intracerebral hemorrhage from clinical non-enhanced computed tomography images. In: Bioinformatics and biomedical engineering 4th international conference
8. Bhanu Prakash KN, Zhou S, Morgan TC, Hanley DF, Nowinski WL (2012) Segmentation and quantification of intra-ventricular/cerebral hemorrhage in CT scans by modified distance regularized level set evolution technique. Int J Comput Assist Radiol Surg 7(5):785–798
9. Chalela JA, Kidwell CS, Nentwich LM et al (2007) Magnetic resonance imaging and computed tomography in emergency assessment of patients with suspected acute stroke: a prospective comparison. Lancet 369:293–298
10. Hu Q, Qian G, Aziz A et al (2005) Segmentation of brain from computed tomography head images. In: Proceedings of the 2005 IEEE engineering in medicine and biology 27th annual conference, pp 3375–3378
11. Sezgin M, Sankur B (2004) Survey over image thresholding techniques and quantitative performance evaluation. J Electron Imaging 13(1):146–165
12. Sauvola J, Pietikainen M (2000) Adaptive document image binarization. Pattern Recognit 33:225–236
13. Viola P, Jones MJ (2004) Robust real-time face detection. Int J Comput Vis 57(2):137–154
14. Liu CH, Chen HC (2012) A novel CBR system for numeric prediction. Inf Sci 185:178–190
15. Begum S, Ahmed MU, Funk P et al (2011) Case-based reasoning systems in the health sciences: a survey on recent trends and developments. IEEE Trans Syst Man Cybern Part C Appl Rev 41(4):421–434
16. Shin KS, Han I (2001) A case-based approach using inductive indexing for corporate bond rating. Decis Support Syst 32(1):41–45
17. Eikvil L (1999) Information extraction from world wide web: survey. Technical report 945, Norwegian computing center
18. Negny S, Riesco H et al (2010) Effective retrieval and new indexing method for case based reasoning: application in chemical process design. Eng Appl Artif Intell 23(6):880–894
19. Kim M, Lee S, Woo S et al (2011) Approximate cost estimating model for river facility construction based on case-based reasoning with genetic algorithms. KSCE J Civ Eng 16(3):283–292
20. Liu L, Mu H, Yang X et al (2012) An oriented spanning tree based genetic algorithm for multi-criteria shortest path problems. Appl Soft Comput 12:506–515