Learning to detect stent struts in Intravascular Ultrasound

Francesco Ciompi1,2, Rui Hua1,2, Simone Balocco1,2, Marina Alberti1,2, Oriol Pujol1, Carles Caus3, Josepa Mauri3 and Petia Radeva1,2

1 Dep. of Applied Mathematics and Analysis, University of Barcelona, Spain
2 Computer Vision Center, Campus UAB, Bellaterra, Barcelona, Spain
3 Hospital Universitari "Germans Trias i Pujol", Badalona, Spain
[email protected]

Abstract. In this paper we tackle the automatic detection of strut elements (the metallic braces of a stent device) in Intravascular Ultrasound (IVUS) sequences. The proposed method is based on context-aware classification of IVUS images using Multi-Class Multi-Scale Stacked Sequential Learning (M2SSL). Additionally, we introduce a novel technique to reduce the number of required contextual features. A comparison with binary and multi-class learning is also performed, using a dataset of IVUS images with struts manually annotated by an expert. The best performing configuration reaches an F-measure F = 63.97%.

Key words: Intravascular Ultrasound, Stent detection, Stacked Sequential Learning.

1 Introduction

An intraluminal coronary stent is a metal mesh tube deployed in a stenotic artery during Percutaneous Coronary Intervention (PCI), in order to prevent vessel narrowing after balloon angioplasty (see Figure 1(a)). After stent placement, cases of under-expansion (the stent is correctly placed but not completely expanded) or mala-positioning (the stent is only partially in contact with the luminal wall) may occur: both are recognized as important risk factors that might lead to restenosis, potentially harming the long-term outcome of the intervention [1].

Successful stent placement can be assessed by Intravascular Ultrasound (IVUS), a catheter-based imaging technique that allows the visualization of the internal vessel morphology. A typical IVUS image of a stent implanted in a coronary vessel is shown in Fig. 1(b), which represents a cross-sectional view of the artery. In the case of a well-posed, recently implanted and completely expanded stent, the struts are visible as bright spots in direct contact with the luminal interface. In case of a mala-posed stent, some of the struts are visible inside the luminal area, and their appearance is similar to the guide-wire artifact. Furthermore, other regions of the IVUS image may be confused with struts when their position in the vessel is not considered (see Figure 1(c-d)).

Few works on automatic or semi-automatic strut detection in IVUS have been presented so far. In [2], two deformable generalized cylinders, corresponding to

Fig. 1. Example of stent (a). Example of IVUS image in cartesian (b) and polar (c) coordinates. In (b) the visible struts are marked along with an elliptical approximation of the stent shape. In (d) some regions including a strut (A-C) and not including a strut (D-F), marked in (c), are depicted.

the vessel wall and the stent, were used. The cylinders were adapted to the image features, based on edges and ridges, in order to obtain a three-dimensional reconstruction of the boundaries. A semi-automatic stent contour detection based on a two-stage algorithm was presented in [3]. The method first performs a global modeling of struts by a minimum cost algorithm, followed by a refinement using local information on the stent shape and the image intensity gradient. User interaction is finally foreseen to correct the stent shape in 3D. In [4], the same authors proposed an improved version of this method, where the stent shape is accurately reconstructed in images with good quality, but the algorithm requires at least three clearly visible struts and manual correction. A method for automatic strut detection, limited to bio-absorbable stents, was proposed in [5], using Haar-like features and a cascade of classifiers. Recently, a work based on two-stage classification for fully automatic stent detection has been presented [6]. Two classifiers are trained using different sets of features, and a post-processing step for locating struts is applied, based on the classification confidence value.

In this paper we present a fully automatic approach to strut detection based on pixel-wise classification of IVUS images. The first contribution of the paper consists in defining a set of features for the description of IVUS tissues, suitable for strut detection. The second contribution consists in formulating strut detection as a context-based classification approach. For this purpose, we use Multi-Class Multi-Scale Stacked Sequential Learning (M2SSL) [7]. The third contribution is a novel technique, named ranked context, to reduce the number of contextual features in M2SSL. Finally, we demonstrate the advantages of the proposed approach, compared with binary and multi-class classification, in the presence and absence of contextual information.

2 Method

The architecture of the proposed method is depicted in Figure 2(a). In this section we describe the main steps of the method.

Fig. 2. Schematic of the proposed approach (a), where the M2SSL architecture is depicted; in (b) the ranked context is shown for a neighbor of a pixel (filled circle).

Gating. In the context of stent detection, information on the luminal area is necessary in order to infer the stent condition, given the set of struts. In [8] an accurate method for luminal border detection was presented, demonstrating that the cross-correlation between subsequent frames is useful information for identifying the luminal area. The method ensures an accurate segmentation when gated frames are considered. For this reason, as a design constraint, we choose to work with gated frames of the IVUS pullback. We apply the image gating method presented in [9] to the IVUS sequence, obtaining as a result the set of gated positions G = {g1, g2, ..., gN}.

Features. We introduce two operators to compute features specific to the strut detection problem. The first one is strictly related to strut appearance in ultrasonic images and is defined as I_BD = -I * L_sigma, where L_sigma is the Laplacian of Gaussian with parameter sigma and I is the IVUS image. We apply the filter for values sigma = {2, 3, 4, 5, 6} and, for each pixel of the IVUS image in position q, we consider the feature vector x_BD^q = [I_BD^q|_{sigma=2}, ..., I_BD^q|_{sigma=6}] in R^{1x5}. A visual example of this feature is depicted in Figure 3(b).

The second operator is related to lumen detection. In order to provide the classifier with useful information on the luminal area, following [8] we compute the cross-correlation between subsequent images of the sequence. In particular, given a gating position g_i, we consider three adjacent frames in the sequence F_{g_i} = {I_{g_i - 1}, I_{g_i}, I_{g_i + 1}}. We apply the cross-correlation between sliding windows of size (W, H) over the three pairs of frames (I_{g_i - 1}, I_{g_i}), (I_{g_i}, I_{g_i + 1}), (I_{g_i - 1}, I_{g_i + 1}). For each pair (I_A, I_B), we compute:

I_{CC}|_{A,B}(i,j) = \frac{\sum_{m,n} \big( I_A(i+m, j+n) - \tilde{I}_A \big) \big( I_B(i+m, j+n) - \tilde{I}_B \big)}{\sqrt{\sum_{m,n} \big( I_A(i+m, j+n) - \tilde{I}_A \big)^2 \, \sum_{m,n} \big( I_B(i+m, j+n) - \tilde{I}_B \big)^2}}

where \tilde{I} = \frac{1}{WH} \sum_{m,n} I(i+m, j+n) and (m, n) vary in the range of (H, W). The feature vector for a pixel in position q of the IVUS image is then computed as the average over the three cross-correlations: x_CC^q = (1/3)(I_CC^q|_{g_i, g_i - 1} + I_CC^q|_{g_i, g_i + 1} + I_CC^q|_{g_i - 1, g_i + 1}). A visual example of this feature is depicted in Figure 3(c). We vary the size H = W of the sliding window from 9 to 27 px with a step of 2 px, obtaining a feature vector x_CC in R^{1x10}.

Finally, the feature set is completed with descriptors previously used in IVUS image classification [7]. The final length of the feature vector is 42 elements.
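As a concrete illustration, the multi-scale blob operator I_BD = -I * L_sigma can be sketched in a few lines of Python using scipy's Laplacian-of-Gaussian filter. The scales follow the text; the toy input image is an assumption for demonstration only:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def blob_features(image, sigmas=(2, 3, 4, 5, 6)):
    """Per-pixel multi-scale blob response I_BD = -(I * L_sigma).

    Returns an (H, W, len(sigmas)) array: one 5-element feature
    vector x_BD per pixel, one channel per scale sigma.
    """
    return np.stack(
        [-gaussian_laplace(image.astype(float), sigma=s) for s in sigmas],
        axis=-1)

# Toy image (hypothetical): a single bright, strut-like spot.
img = np.zeros((64, 64))
img[32, 32] = 1.0
feats = blob_features(img)
# The negated LoG response at a bright spot is positive at every
# scale, which is what makes I_BD respond to strut-like blobs.
```

The sign flip matters: the LoG response at the center of a bright blob is negative, so negating it yields a positive score exactly where strut-like structures sit.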

Fig. 3. Example of IVUS image (a), the corresponding maps IBD (b) and ICC (c), and the labeling with five classes (d).

Context-based classification. The architecture used for classification is based on Multi-Class Multi-Scale Stacked Sequential Learning (M2SSL) [7], which has been shown to accurately encode the context of IVUS images. In this section we first introduce the M2SSL architecture and then present the proposed ranked context approach.

The basic Stacked Sequential Learning (SSL) architecture consists of a cascade of two classifiers, h1 and h2 (see Figure 2(a)). In order to train the two classifiers separately, the training dataset X is split into two non-overlapping parts, x1 and x2. The classifier h1 is trained with the i.i.d. labeled samples x1. Afterwards, the set x2 is classified by h1, and the labels y^ = h1(x2) are used to enrich the features x2: the concatenation of the data set and the (additional) label set represents the extended training dataset for h2, x2^ext = [x2 y^].

In [10] this architecture is transformed into a context-aware method by defining the additional set as a multi-scale sampling over the labels y^ of the neighbors of each pixel, according to a pre-defined support Omega. For this purpose, a functional J takes the label map y^ as input and applies a multi-scale decomposition, computing the contextual feature vector z. Finally, in [7, 11] the extension to the multi-class problem is presented by sampling the pseudo-probabilities P over all the possible classes at all the scales.

Ranked context. One of the main problems in M2SSL is that the number of contextual features grows with the number of scales and classes. Defining Nc as the number of classes, Ns as the number of scales and Nn as the number of neighbors, the cardinality of the extended feature vector in M2SSL is |z| = Ns Nc Nn. In this paper, we propose to remove the dependence of z on the number of classes. The idea is that the contextual information of a point can be expressed by the indices of the most probable classes over its neighbors.

For this purpose, given a point q of the IVUS image, at a certain scale we consider the vector of pseudo-probabilities p_i of the i-th neighbor of q (see Figure 2(b)). We then compute the vector of ranked probabilities R_i as R_i = rank_{N_r}(p_i), such that p_i(R_i(k)) > p_i(R_i(k+1)) and |R_i| = N_r. By defining z_R = {R_i} over all the neighbors, the length of the extended set is now independent of the number of classes. We refer to this approach as ranked context.
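The ranked context reduction is compact enough to sketch directly: for each neighbor, only the indices of the N_r most probable classes are kept, so the contextual vector length is Nn * N_r regardless of Nc. The pseudo-probability values below are hypothetical:

```python
import numpy as np

def ranked_context(P_neighbors, n_r=2):
    """Ranked context (a sketch of the paper's idea): replace each
    neighbor's full class-probability vector with the indices of its
    n_r most probable classes.

    P_neighbors: (n_neighbors, n_classes) pseudo-probabilities.
    Returns a flat vector of length n_neighbors * n_r.
    """
    # argsort in descending order, keep the top n_r class indices
    order = np.argsort(-P_neighbors, axis=1)[:, :n_r]
    return order.ravel()

# Hypothetical example: 3 neighbors, 5 classes.
P = np.array([[0.05, 0.35, 0.30, 0.11, 0.19],
              [0.30, 0.25, 0.11, 0.19, 0.15],
              [0.34, 0.05, 0.19, 0.30, 0.12]])
zR = ranked_context(P, n_r=2)
# zR holds class indices, not probabilities: 2 entries per neighbor,
# independent of the 5 classes.
```

With Nc = 5 classes the full context would carry 5 values per neighbor; the ranked version carries only n_r = 2, matching the |z_R| = Nn N_r cardinality discussed in the text.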

Fig. 4. Example of the criterion used to compute the strut detection performance (a); in (b) the averaged strut appearance (kernel) is depicted, along with the result of the cross-correlation between image and kernel (c).

Struts detection. The classification result is a map of labeled regions in the IVUS image. From this map we extract the regions labeled as strut. Inside each labeled region, we need to define a unique point (xs, ys) representative of the strut. For this purpose, we first compute the mean strut appearance (kernel) by averaging the area of a bounding box centered at each point (xm, ym) marked as strut in the ground truth (see Figure 4(b)). Afterwards, we compute the normalized cross-correlation between the kernel and the IVUS image: as a result, the likelihood of each pixel to look like a strut is obtained (see Figure 4(c)). Finally, the point (xs, ys) for each region is obtained as the position of the maximum likelihood value inside the strut region.
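The last step, picking one representative point per strut region as the likelihood maximum inside that region, can be sketched as follows. This is a sketch of the post-processing idea, not the paper's exact implementation; the mask and likelihood map below are toy assumptions:

```python
import numpy as np
from scipy.ndimage import label

def locate_struts(strut_mask, likelihood):
    """For each connected region classified as strut, return one
    representative point (xs, ys): the pixel of maximum likelihood
    (e.g. normalized cross-correlation with the mean strut kernel)
    inside that region.
    """
    labels, n_regions = label(strut_mask)
    points = []
    for r in range(1, n_regions + 1):
        # suppress likelihood outside the current region
        masked = np.where(labels == r, likelihood, -np.inf)
        idx = np.unravel_index(np.argmax(masked), masked.shape)
        points.append(tuple(int(v) for v in idx))
    return points

# Toy example: two strut regions with a likelihood peak in each.
mask = np.zeros((10, 10), dtype=bool)
mask[1:3, 1:3] = True
mask[6:9, 6:9] = True
lik = np.zeros((10, 10))
lik[2, 2] = 0.9   # peak inside the first region
lik[7, 8] = 0.8   # peak inside the second region
pts = locate_struts(mask, lik)
```

Masking with -inf before the argmax guarantees the maximum is taken strictly inside the region, even when a stronger likelihood peak exists elsewhere in the image.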

3 Experimental setup

Material. A set of 30 IVUS sequences of in-vivo human coronary arteries from different patients was used in this study. The sequences were acquired using iLab IVUS equipment (Boston Scientific) with a 40 MHz catheter, and were divided into two groups, for training and testing purposes. We randomly selected 10 pullbacks as training data. After applying the image-gating algorithm, we extract features as described in Section 2, and the training samples are divided into two sets, x1 (66 images) and x2 (72 images), to guarantee x1 ∩ x2 = ∅ while training M2SSL; 38 of these images contained a visible strut. We define the training database (TRdb) as consisting of these 138 images. The remaining 20 pullbacks are used as test data. In this case, once the set of gated frames was extracted, all the frames containing at least one visible strut were selected, resulting in 177 test images, which we define as the test database (TSdb).

Labeling. One expert manually labeled four structures in each image: (1) luminal interface, (2) media-adventitia border, (3) calcified plaques, (4) struts. For structures 1-3, the contour of the area of interest was traced, while for 4 the position (xm, ym) of the most representative pixel of each strut was marked. In order to extract training samples from struts, a bounding box 13 pixels wide, centered at the marked strut, was defined. This size has been chosen so that a strut is included inside the statistical kernel box (see Figure 4(b)).

Evaluation criteria. We evaluate the performance of the proposed method using two criteria. First, we assess the capability of the classification method

Fig. 5. Examples of strut detection in IVUS images, where the automatic detection and the bounding box used for evaluation are depicted; first two rows, accurate results of ranked context; last row, frames in which the ranked context is less effective.

      2w/o    2w   3w/o    3w   4w/o    4w   5w/o    5w
R    67.85 67.50  81.33 71.49  80.19 64.06  73.85 52.93
P    19.15 46.66  29.95 54.20  36.09 62.39  38.65 69.94
F    29.51 52.44  42.93 61.50  48.18 62.47  49.46 61.27

      NR=1   NR=2   NR=3
3w   54.76  56.58  59.33
4w   61.95  63.97  63.61
5w   61.67  61.46  61.78

Table 1. Quantitative results on TSdb when context is used (w) compared with no context (w/o), from 2 classes to 5 classes (left). F-measure for different ranking values (right).

to correctly detect regions containing struts. For this purpose, we construct a bounding box around each manually marked strut (xm, ym) and classify each detected point s = (xs, ys) as a True Positive (TP), False Positive (FP) or False Negative (FN). A TP is obtained when s lies inside the bounding box of a manually marked strut; an FP is obtained when s is not contained in the bounding box of any manually marked strut; and an FN is obtained when a strut present in the ground truth is not detected by the automatic method. Since we are evaluating a detection problem, the performance parameters considered in the evaluation are Precision = TP/(TP+FP), Recall = TP/(TP+FN) and F-measure = 2PR/(P+R). We do not consider any variable threshold in the computation of P and R, solely taking the maximum a-posteriori probability as the classification output. As a second evaluation criterion, we consider the minimal Euclidean distance between each point s and all the points m.
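The bounding-box matching criterion can be sketched as below. The half-box of 6 px (giving a 13-pixel-wide box, as used for labeling) and the toy point lists are assumptions for illustration:

```python
def detection_scores(manual, detected, half_box=6):
    """Precision/Recall/F-measure under the bounding-box criterion:
    a detection is a TP if it falls inside the box around some manual
    strut; unmatched detections are FPs, unmatched manual struts FNs.
    """
    def inside(p, m):
        return abs(p[0] - m[0]) <= half_box and abs(p[1] - m[1]) <= half_box

    matched = set()
    tp = fp = 0
    for p in detected:
        hits = [i for i, m in enumerate(manual) if inside(p, m)]
        if hits:
            tp += 1
            matched.add(hits[0])
        else:
            fp += 1
    fn = len(manual) - len(matched)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Toy check: two manual struts, one correct hit, one miss, one false alarm.
P, R, F = detection_scores([(10, 10), (50, 50)], [(12, 9), (100, 100)])
```

With one TP, one FP and one FN this yields P = R = F = 0.5, matching the formulas in the text.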


4 Experimental results

The basic classifier for M2SSL is Adaptive Boosting (AdaBoost) [12], trained with up to T = 150 decision stumps. A number of samples n = 10000 was randomly selected from TRdb for training purposes. The number of scales is Ns = 7 and the support Omega is the 8-neighborhood (8N). The extension to multi-class is obtained by using Error-Correcting Output Codes, as described in [7]. The classes defined by the manual labeling are: lumen, strut, plaque, calcification and external tissue. In our experiments, we incrementally introduce classes into the framework, starting with the binary problem strut vs. rest, where rest is the union of all the other classes. The criteria for selecting the classes used in the experiments are the following: (1) if we accept the complexity of three classes, it is desirable to include the lumen; (2) with four classes, we include the plaque, which also embeds calcification; (3) with five classes, we include the calcification, to separate it from plaque and to help solve doubtful strut classifications.

We test our method on the whole TSdb dataset. In order to evaluate the effect of using contextual information, we compare the classification performance with the results obtained without context. Finally, in order to evaluate the effect of the ranked context, in each multi-class experiment we vary NR from one to three. The quantitative evaluation is reported in Table 1. A qualitative result of IVUS image classification is depicted in Figure 3(d), while in Figure 5 we present some results on strut detection, where the automatic points (xs, ys) are marked along with the bounding box corresponding to the manual annotation. The mean distance error for the method achieving the best F-measure is davg = 0.51 (0.78) mm, while the median distance for the same case is dmed = 0.09 mm. This difference is due to struts detected as FP, which represent outliers in the stent shape.
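The base-classifier configuration above can be reproduced in spirit with scikit-learn, whose AdaBoostClassifier uses depth-1 decision trees (stumps) as its default weak learner. The data below is a random stand-in for the 42-dimensional IVUS feature vectors, not the actual dataset, and the labeling rule is hypothetical:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(0)

# Hypothetical stand-in for the 42-element IVUS feature vectors.
X = rng.normal(size=(400, 42))
# Hypothetical labels: a simple rule on two of the features.
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# T = 150 boosting rounds with decision stumps, as in the paper.
clf = AdaBoostClassifier(n_estimators=150).fit(X, y)
acc = clf.score(X, y)
```

In the full pipeline this binary learner would be wrapped in an Error-Correcting Output Codes scheme to handle the multi-class problem, as described in [7].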
The most indicative parameter in this problem is the F-measure, being a trade-off between Precision and Recall. Increasing the complexity of the problem from binary to multi-class also increases the F-measure, which demonstrates the effectiveness of the multi-class approach. The use of contextual information always improves the F-measure and the Precision, due to a reduction of FPs. The Recall with 5 classes decreases when compared with 4 classes, while the difference between 3 and 4 classes is not relevant. This effect is even more remarkable when context is applied, and is probably due to the fact that some struts may be interpreted as small calcifications: this increases the number of FNs while reducing the number of FPs.

In Table 1 (right), the effect of the ranked context on the F-measure is illustrated. When compared with the absence of ranking, the ranked context strategy does not substantially decrease performance. In particular, in the case of four classes and two ranked labels, the F-measure is better than with the full probability vector: this is the best result achieved with the proposed approach. We can then deduce that, at least for strut detection, the information necessary to define a robust context is embedded in the presence or absence of the two most probable classes in the neighborhood. As a practical effect of the ranked context, it is worth noting that using rank 2 avoids Ns Nn (Nc - 2) = 112 contextual features.


5 Conclusion

In this paper we have tackled the automatic detection of struts in IVUS. For this purpose, we have proposed a context-aware method based on pixel-wise classification; we have also introduced a novel technique to reduce the number of contextual features. We demonstrated the usefulness of contextual information and the effectiveness of the ranked context. The application of such an approach to other pattern recognition problems is an interesting line of research in context-aware classification. Finally, modeling the stent shape given the set of detected struts, as well as extending the model to the whole sequence, is a direct application of the presented method.

6 Acknowledgments

This work was partially funded by the projects CONSOLIDER-INGENIO CSD 2007-00018 and TIN2009-14404-C02.

References

1. Yoon, H.J., Hur, S.H.: Optimization of stent deployment by intravascular ultrasound. Korean J Intern Med 27(1) (2012) 30–38
2. Canero, C., Pujol, O., Radeva, P., Toledo, R., Saludes, J., Gil, D., Villanueva, J., Mauri, J., Garcia, B., Gomez, J.: Optimal stent implantation: three-dimensional evaluation of the mutual position of stent and vessel via intracoronary echocardiography. In: Computers in Cardiology. (1999) 261–264
3. Dijkstra, J., Koning, G., Tuinenburg, J., Reiber, P.O.J.: Automatic border detection in intravascular ultrasound images for quantitative measurements of the vessel, lumen and stent parameters. Computers in Cardiology 28 (2001) 25–28
4. Dijkstra, J., Koning, G., P.V., J.T., Reiber, O.J.: Automatic stent border detection in intravascular ultrasound images. In: CARS. (2003) 1111–1116
5. Rotger, D., Radeva, P., Bruining, N.: Automatic detection of bioabsorbable coronary stents in IVUS images using a cascade of classifiers. IEEE Transactions on Information Technology in Biomedicine 14(2) (2010) 535–537
6. Hua, R., Pujol, O., Ciompi, F., Balocco, S., Alberti, M., Mauri, F., Radeva, P.: Stent strut detection by classifying a wide set of IVUS features. In: MICCAI Workshop on Computer Assisted Stenting. (2012)
7. Ciompi, F., Pujol, O., Gatta, C., Alberti, M., Balocco, S., Carrillo, X., Mauri-Ferre, J., Radeva, P.: HoliMAb: A holistic approach for media-adventitia border detection in intravascular ultrasound. Medical Image Analysis 16 (2012) 1085–1100
8. Balocco, S., Gatta, C., Ciompi, F., Pujol, O., Carrillo, X., Mauri, J., Radeva, P.: Combining growcut and temporal correlation for IVUS lumen segmentation. In: IbPRIA, LNCS 6669. (2011) 556–563
9. Gatta, C., Balocco, S., Ciompi, F., Hemetsberger, R., Rodriguez-Leor, O., Radeva, P.: Real-time gating of IVUS sequences based on motion blur analysis: Method and quantitative validation. In: MICCAI 2010, LNCS 6362. (2010) 59–67
10. Gatta, C., Puertas, E., Pujol, O.: Multi-scale stacked sequential learning. Pattern Recognition 44(10-11) (2011) 2414–2426
11. Puertas, E., Escalera, S., Pujol, O.: Multi-class multi-scale stacked sequential learning. In: Multiple Classifier Systems. (2011) 197–206
12. Schapire, R.: The boosting approach to machine learning: An overview. MSRI Workshop on Nonlinear Estimation and Classification, Berkeley, CA, USA (2001)