Spatio-temporal Image Tracking Based on Optical Flow and Clustering: An Endoneurosonographic Application

Andrés F. Serna-Morales¹, Flavio Prieto²·*, and Eduardo Bayro-Corrochano³

¹ Department of Electrical, Electronic and Computer Engineering, Universidad Nacional de Colombia, Sede Manizales, Carrera 27 No. 64-60, Manizales (Caldas), Colombia. Tel.: +57 (6) 8879300, Ext.: 55798. [email protected]
² Department of Mechanical and Mechatronics Engineering, Universidad Nacional de Colombia, Sede Bogotá, Carrera 30 No. 45-03, Bogotá, Colombia. Tel.: +57 (1) 316 5000, Ext.: 14103. [email protected]
³ CINVESTAV, Unidad Guadalajara, Av. Científica 1145, El Bajío, Zapopan, Jalisco, México. Tel.: +52 (33) 37773600, Ext.: 1027. [email protected]
Abstract. In the process of rendering brain tumors from endoneurosonography, one of the most important steps is tracking the axis line of an ultrasound probe throughout successive endoscopic images. Recognizing this line is important because it allows computing its 3D coordinates using the projection matrices of the endoscopic cameras. In this paper we present a method to track an ultrasound probe in successive endoscopic images without relying on any external tracking system. The probe is tracked using a spatio-temporal technique based on optical flow and a clustering algorithm. First, we compute the optical flow using the Horn-Schunck algorithm. Second, a feature space combining the optical flow magnitude and the luminance is defined. Third, the feature space is partitioned into two regions using the k-means clustering algorithm. After this, we calculate the axis line of the ultrasound probe by applying Principal Component Analysis (PCA) to the segmented region. Finally, a motion restriction is imposed on consecutive frames in order to avoid tracking errors. We have used endoscopic images from brain phantoms to evaluate the performance of the proposed method, comparing our methodology against ground truth and a color-based particle filter; the results show that it is robust and accurate.

Keywords: Endoneurosonography (ENS), endoscopic images, tracking, ultrasound probe, optical flow, clustering, Principal Component Analysis (PCA).
* Corresponding author.
1 Introduction
Stereotactic neurosurgery involves the registration of both pre-operative and intra-operative medical images, typically from volumetric modalities such as MRI, CT and ultrasound, with a surgical coordinate system [2]. For most operations, the surgical coordinates are defined by a set of x, y, and z reticules on a surgical frame that is affixed to the patient's skull. In modern computer-aided surgery systems, the frame is replaced by a tracking system (usually optical, but possibly mechanical or magnetic) that the computer uses to track the surgical tools [3]. Virtual representations of the tools are rendered over an image volume to provide guidance for the surgeon. In order to obtain good results using these images, the brain must not shift prior to or during surgery. This is only possible with minimally invasive surgery, i.e. performing the operation through a small hole in the skull [12]. Recent trends in minimally invasive brain surgery aim at the joint acquisition of endoscopic and ultrasound images, a technique that has been called endoneurosonography (ENS) [8]. Endoscopic images are of great utility for minimally invasive techniques in neurosurgery. Ultrasound images are cheaper than other medical images such as CT and MRI; moreover, they are easier to obtain in an intra-operative scenario [14]. In the literature, some work has been done on extracting three-dimensional information about brain tumors using endoneurosonography [3]. Owing to these advantages, we are planning to use endoneurosonography for rendering internal structures of the brain, such as tumors. In this work, we address one of the most important steps in that process: tracking an ultrasound probe in a sequence of endoscopic images and computing its pose in 3D space without using any external tool (neither optical nor magnetic). The equipment setup is as follows: the ultrasound probe is introduced through a channel in an endoscope and is seen by two endoscopic cameras. With a visual tracking system (Polaris) we calculate the 3D position of the endoscope tip, and we want to know the pose of the ultrasound probe in order to have the exact location of the ultrasound sensor. This matters because the probe is flexible and rotates around its own axis; it can also move back and forth, since the channel is wide enough. Under these conditions, we need a robust method for tracking the ultrasound probe throughout the endoscopic images.

In general, tracking methods can be divided into two main classes: top-down and bottom-up approaches. A top-down approach generates object hypotheses and tries to verify them using the image. For example, the particle filter follows the top-down approach, in the sense that the image content is only evaluated at the sample positions [10,11]. In a bottom-up approach, on the other hand, the image is segmented into objects which are then used for the tracking. The spatio-temporal method proposed in this paper follows the bottom-up approach. In our approach, we take advantage of the probe movements to build a method based on optical flow, in which a clustering algorithm is applied to optical flow and luminance information with the aim of segmenting the ultrasound probe. Next, we apply Principal Component Analysis (PCA) to the segmented region to determine the axis line of the probe. Finally, a motion restriction is defined in order to avoid tracking errors.

This paper is organized as follows: Section 2 presents the methodologies used for spatio-temporal segmentation and axis-line determination; Section 3 reports experiments on an endoneurosonographic database and explains why motion restrictions are imposed on the probe tracking; finally, Section 4 summarizes and concludes the work.
2 Endoscopic Image Processing
At each time step, we obtain two color images, one from each camera of the endoscopic equipment, and one ultrasound image from the ultrasound probe (which we do not use in this work). The cameras are fully characterized, making it possible to perform a stereo reconstruction of the surgical scene. The endoscopic images have a dimension of 352 × 240 pixels. The object of interest is the ultrasound probe, which is visible to the endoscopic cameras. The probe is made of a metallic (specular) material, rotates on its own axis and moves randomly forward and backward through the endoscopic channel, so we use luminance and optical flow to perform its segmentation and tracking.

2.1 Segmenting the Ultrasound Probe
The goal is to determine the location of the ultrasound probe throughout the endoscopic images. This is done by applying a spatio-temporal method based on optical flow and a clustering algorithm [1].

Optical Flow. In this work the algorithm proposed by Horn and Schunck has been used [6]. The algorithm determines the optical flow as a solution of the following partial differential equation:

$$\frac{\partial L}{\partial x}\frac{dx}{dt} + \frac{\partial L}{\partial y}\frac{dy}{dt} + \frac{dL}{dt} = 0 \qquad (1)$$

The solution of Equation 1 is obtained by a numerical procedure for error-function minimization. The error function E is defined in terms of the spatial and temporal gradients of the optical flow vector field and consists of the two terms shown in Equation 2:

$$E = \iint \left( \alpha^2 L_c^2 + L_b^2 \right) dx\, dy \qquad (2)$$

where

$$L_b = \frac{\partial L}{\partial x}u + \frac{\partial L}{\partial y}v + \frac{dL}{dt}, \qquad L_c^2 = \left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2 + \left(\frac{\partial v}{\partial x}\right)^2 + \left(\frac{\partial v}{\partial y}\right)^2, \qquad u = \frac{dx}{dt}, \quad v = \frac{dy}{dt}.$$
L(x, y, t) represents the luminance of the image at point (x, y) at time t. To solve the minimization problem a steepest-descent method is used, which computes the gradient to determine the direction of search for the minimum. The optical flow algorithm has two main phases: in the first phase, the gradient coefficients ∂L/∂x, ∂L/∂y and dL/dt are computed from the input images; in the second phase, the optical flow vectors u and v, defined by Equation 2, are computed. Figure 1 shows the computation of optical flow for an endoscopic image. Note that the maximum values correspond to the image region in which the ultrasound probe is moving.
Fig. 1. Optical flow computation using the Horn and Schunck algorithm: (a) endoscopic image from the left camera; (b) optical flow of the endoscopic image
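To make the two phases above concrete, the following Python sketch implements a basic Horn-Schunck iteration with NumPy and SciPy. It is a minimal sketch, not our exact implementation: the gradient kernels, the regularization weight alpha and the iteration count are illustrative assumptions.

```python
# Minimal Horn-Schunck sketch, assuming grayscale frames given as
# equally-sized float numpy arrays; kernels, alpha and n_iter are
# illustrative choices.
import numpy as np
from scipy.ndimage import convolve

def horn_schunck(prev, curr, alpha=15.0, n_iter=100):
    """Estimate the optical flow field (u, v) between two frames."""
    kx = np.array([[-1.0, 1.0], [-1.0, 1.0]]) * 0.25
    ky = np.array([[-1.0, -1.0], [1.0, 1.0]]) * 0.25
    kt = np.full((2, 2), 0.25)
    # Phase 1: spatial and temporal gradient coefficients of Eq. 1.
    Lx = convolve(prev, kx) + convolve(curr, kx)
    Ly = convolve(prev, ky) + convolve(curr, ky)
    Lt = convolve(curr - prev, kt)
    # Phase 2: iterative minimization of the error functional of Eq. 2.
    u = np.zeros_like(prev)
    v = np.zeros_like(prev)
    avg = np.array([[1, 2, 1], [2, 0, 2], [1, 2, 1]]) / 12.0
    for _ in range(n_iter):
        u_bar, v_bar = convolve(u, avg), convolve(v, avg)
        common = (Lx * u_bar + Ly * v_bar + Lt) / (alpha**2 + Lx**2 + Ly**2)
        u, v = u_bar - Lx * common, v_bar - Ly * common
    return u, v
```

For each pair of consecutive frames, the magnitude of the returned flow field is the temporal feature used by the clustering stage described next.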
Clustering-Based Segmentation. Most classical image segmentation techniques rely on a single frame to segment the image [15]. However, motion is a very useful cue for image segmentation [1], particularly here because the ultrasound probe rotates around its own axis and is in continuous movement. In this approach, segmentation is not done on a simple frame-by-frame basis but uses multiple image frames to segment the ultrasound probe. For this purpose we extract features both from the current image that has to be segmented and from neighboring image frames in the sequence. The extracted feature vectors are then clustered to determine the probe region in the image. Currently, we use two features: the first is the image luminance, because the ultrasound probe is made of a metallic (specular) material and is therefore brighter than the other objects in the background; the second is the Euclidean norm of the optical flow. With these features we obtain both spatial and temporal information about the scene.

The k-means clustering algorithm has been used in this work [13]. K-means is a numerical, unsupervised, non-deterministic and iterative method; it is simple and very fast, and in many practical applications it has proved to be a very effective way of producing good clustering results [9]. K-means clustering partitions the feature space into clusters using an iterative algorithm that minimizes the sum, over all clusters, of the within-cluster sums of point-to-centroid distances. In our application, the feature space is divided into two characteristic areas corresponding to two image regions: the ultrasound probe and the background. After the clustering algorithm is applied, the image is morphologically opened in order to reduce noise and eliminate small regions [7]. Figure 2 shows the result of the segmentation using the procedure described above, and a sketch of the procedure follows the figure.
Fig. 2. Spatio-temporal segmentation of the ultrasound probe: (a) endoscopic image; (b) segmentation
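The following sketch shows how the segmentation stage can be assembled. It assumes scikit-learn's KMeans and SciPy's binary_opening as stand-ins for the clustering and morphological steps; the feature standardization, the brighter-cluster heuristic and the 3×3 structuring element are illustrative assumptions.

```python
# Sketch of the clustering-based segmentation on the two-feature space
# (luminance, optical flow magnitude); details are illustrative.
import numpy as np
from scipy.ndimage import binary_opening
from sklearn.cluster import KMeans

def segment_probe(luminance, u, v):
    """Return a binary mask of the probe from luminance and optical flow."""
    flow_mag = np.hypot(u, v)                      # Euclidean norm of flow
    feats = np.column_stack([luminance.ravel(), flow_mag.ravel()])
    feats = (feats - feats.mean(axis=0)) / feats.std(axis=0)  # standardize
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(feats)
    labels = labels.reshape(luminance.shape)
    # Pick the cluster that is brighter on average: the metallic probe.
    probe = int(luminance[labels == 1].mean() > luminance[labels == 0].mean())
    mask = labels == probe
    # Morphological opening reduces noise and removes small regions [7].
    return binary_opening(mask, structure=np.ones((3, 3)))
```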
2.2 Determining the Axis Line of the Probe
The goal is to determine the axis line of the ultrasound probe throughout the endoscopic images. After segmentation by clustering, we have a cloud of (x, y) points corresponding to the pixels inside the ultrasound probe region in the image. To obtain the axis line of the probe, we extract the first principal component of the segmented region using PCA [4]. Principal Component Analysis (PCA) can be used to align objects (regions or boundaries) with their eigenvectors [15]. In our case, the major axis line of the ultrasound probe is given by the first principal component. With this information, we can track the orientation of the probe in the different images of the sequence.

The ultrasound probe has an elongated, thin shape, so its longitudinal axis corresponds to the axis that we want to find. If we regard the probe region in the image as a set of bivariate data (x₁, x₂), the longitudinal axis is the one with the greater dispersion of the data. For this reason, computing the first principal component (PC) of the pixels in the segmented region amounts to determining the axis line of the probe.

Consider the variable x = [x₁, x₂]ᵀ, corresponding to the Cartesian coordinates of the pixels that make up the segmented probe in the image, with covariance matrix Σ and eigenvalues λ₁ ≥ λ₂. We can construct the linear combinations shown in Equation 3:

$$Y_1 = \mathbf{a}^T \mathbf{x} = a_1 x_1 + a_2 x_2, \qquad Y_2 = \mathbf{b}^T \mathbf{x} = b_1 x_1 + b_2 x_2 \qquad (3)$$

The variance Var(Yᵢ) and covariance Cov(Yᵢ, Yₖ) are given in Equations 4 and 5, respectively. The PCs are the uncorrelated linear combinations Y₁, Y₂ whose variances are as large as possible. Among the resulting PCs, the first has the largest variance and the second the second-largest [5]. In this work we only need to extract the first principal component, which corresponds to the major axis line of the ultrasound probe.

$$\mathrm{Var}(Y_i) = \mathbf{a}_i^T \Sigma\, \mathbf{a}_i, \quad i = 1, 2 \qquad (4)$$

$$\mathrm{Cov}(Y_i, Y_k) = \mathbf{a}_i^T \Sigma\, \mathbf{a}_k, \quad i, k = 1, 2 \qquad (5)$$
Figure 3 shows the axis line of the probe extracted using PCA in two endoscopic images.
Fig. 3. Axis line of the ultrasound probe in endoscopic images: (a) endoscopic left camera; (b) endoscopic right camera
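A minimal sketch of the axis-line extraction follows, assuming the segmented region is given as the binary mask produced above: it computes the covariance matrix Σ of the pixel coordinates (Eqs. 3-5) and takes the eigenvector with the largest eigenvalue as the first principal component.

```python
# Sketch: first principal component of the segmented pixels = probe axis.
import numpy as np

def probe_axis(mask):
    """Return the centroid and axis angle (radians) of the probe region."""
    ys, xs = np.nonzero(mask)
    pts = np.column_stack([xs, ys]).astype(float)
    centroid = pts.mean(axis=0)
    sigma = np.cov((pts - centroid).T)        # 2x2 covariance matrix Sigma
    eigvals, eigvecs = np.linalg.eigh(sigma)  # eigenvalues in ascending order
    first_pc = eigvecs[:, -1]                 # eigenvector of largest eigenvalue
    return centroid, np.arctan2(first_pc[1], first_pc[0])
```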
3 Results
For our experiments, we used a database of 2900 images from brain phantoms acquired with the endoneurosonographic equipment. We applied our methodology and the traditional color-based particle filter [10] to all of the images, and we obtained numerical results by comparison with 100 images tracked manually as ground truth. Figure 4 shows the manual probe tracking performed by ground-truth segmentation through a video sequence. Figures 5 and 6 show the results of the probe tracking using the methodology explained in Section 2 and a classical color-based particle filter [10], respectively. In both cases, a priori knowledge of the background is not required.

The axis line of the ultrasound probe is defined by two parameters: first, the centroid of the segmented region, which determines the point where the axis line must cross the probe; and second, the axis angle, which our approach computes from the first eigenvector obtained by Principal Component Analysis (PCA). As shown in Figures 4, 5 and 6, we compared the axis lines obtained by ground truth, our methodology and the particle filter. Errors are calculated using the Euclidean distance between the centroids and the angle difference between the axis orientations, as in the sketch below. Table 1 shows the error measurements between the axes calculated manually using ground truth, our procedure and the particle filter. The Euclidean distances between centroids (EBC) are given in pixels, and the differences between angles (DBA) in degrees. Recall that all the endoscopic images have a dimension of 352 × 240 pixels. We report the mean (μ_error), standard deviation (σ_error), minimum (min_error) and maximum (max_error) of these errors over two endoscopic video sequences of 100 images taken at a sampling frequency of 24 Hz (24 frames per second).
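A minimal sketch of the two error measures of Table 1; it assumes axis angles in radians and treats the axis as orientation-only (θ and θ + 180° describe the same line). The function and variable names are ours, for illustration.

```python
# Sketch of the EBC and DBA error measures against the ground truth.
import numpy as np

def tracking_errors(centroid, angle, centroid_gt, angle_gt):
    """Return (EBC in pixels, DBA in degrees)."""
    ebc = np.linalg.norm(np.asarray(centroid) - np.asarray(centroid_gt))
    dba = abs(np.degrees(angle - angle_gt)) % 180.0
    return ebc, min(dba, 180.0 - dba)
```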
Fig. 4. Probe tracking using ground-truth segmentation (frames (a)-(j))
3.1 Motion Restriction
The results in Figure 5 show that the tracking is correct when we have full visibility of the ultrasound probe. Unfortunately, as shown in Figure 7, in some cases the ultrasound probe can leave the field of view of the endoscopic cameras, which produces a wrong tracking and causes the high error values reported in Table 1. For this reason, we included a motion restriction on the axis line of the probe. This restriction consists in defining the maximum variations in angle and displacement allowed for the probe axis line between consecutive frames of the video sequence. To achieve this, we defined the state vector shown in Equation 6, which encodes the location of the ultrasound probe axis as a variation in position and rotation of the probe at any instant of time t.
Fig. 5. Probe tracking using the spatio-temporal methodology (frames (a)-(j))
Fig. 6. Probe tracking using a particle filter (frames (a)-(j))
In that equation, d_x(t), d_y(t) and d_θ(t) are the first derivatives of the probe position with respect to x, y and θ, respectively. The first step of the motion restriction consists in estimating the ultrasound probe axis location using the methodology proposed in Section 2. This estimate is accepted if two conditions are met: (a) the Euclidean distance between the centroids of the current and previous frames does not exceed the threshold u_d; (b) the difference between the axis angles in the current and previous frames does not exceed the threshold u_θ. Heuristically, the thresholds u_d and u_θ were set to 35 pixels and 30 degrees, respectively. If these conditions are not met, it is very likely that a tracking error has occurred because the ultrasound probe is leaving the field of view of the endoscopic cameras. To solve this problem, the variation of the state vector is governed by the evolution rule of Equation 7, where N(μ_n, σ_n) is white Gaussian noise. Analyzing the ultrasound probe displacements throughout the endoscopic images, the Gaussian parameters were set to μ_n = 0 and σ_n = 5. This evolution rule ensures that the axis position does not vary sharply from one image to the next, eliminating tracking errors caused by occlusion or by the probe leaving the field of view of the endoscopic cameras, as shown in Figure 8.
Fig. 7. Tracking error caused by the probe not being visible in the image (frames (a)-(d))
Fig. 8. Tracking correction using motion restrictions (frames (a)-(d))
In Table 1 we can observe a considerable reduction in tracking errors due to the inclusion of the motion restriction in the algorithm. Besides, the tracking errors of the particle filter are low for the axis angle, but high for the position of the centroid. This is because the particle filter implemented [10] uses an ellipsoidal approximation of the region of interest instead of recovering the true region shape, as our segmentation methodology does.

$$S(t) = \left[ d_x(t),\; d_y(t),\; d_\theta(t) \right]^T \qquad (6)$$

$$S(t) = S(t-1) + N(\mu_n = 0, \sigma_n = 5) \qquad (7)$$
Table 1. Tracking errors with respect to the ground-truth method

                                                   μ_error  σ_error  min_error  max_error
Spatio-temporal tracking
  EBC (pixels)                                      28.1     27.4      0.6       108.6
  DBA (degrees)                                     11.7     23.8      0.1        88.8
Spatio-temporal tracking with motion restriction
  EBC (pixels)                                      18.8     13.5      0.6        75.8
  DBA (degrees)                                      3.9      3.5      0.1        13.4
Color-based particle filter
  EBC (pixels)                                      49.6     29.5      3.2       109.8
  DBA (degrees)                                     14.1     10.3      0.0        31.0
4 Summary and Conclusion
We have shown a straightforward and efficient solution to the problem of ultrasound probe tracking throughout a sequence of endoscopic images. The method is simple and efficient, yet robust to reasonable occlusion, random probe displacements, and the probe leaving the cameras' field of view. Segmentation of the probe was performed using luminance and optical flow information. We chose these features because the ultrasound probe is made of a specular (metallic) material and its luminance is higher than that of other objects in the background; in addition, optical flow is an important cue for detecting the continuous and erratic movements of the probe.

After implementing the spatio-temporal tracking method, errors were noticed due to occlusion and lack of visibility of the probe. Therefore, we defined a state vector that encodes the position of the axis line of the probe at any instant of time, and introduced a motion restriction on the maximum allowable rotations and displacements between consecutive frames of the sequence. The comparison in Table 1 showed that this restriction was effective in reducing the average errors and standard deviations.

In order to evaluate the performance of our work, we compared it with one of the most popular tracking methods, the particle filter [11]. Section 3 shows that, while there are no large errors in determining the angle of the probe axis line, there are significant errors in determining its centroid. This error stems from the definition of the particle filter algorithm used [10], which uses an ellipse to define the region of interest. According to these results, our method provides a better tracking solution for this specific problem. We are currently working on a methodology for dynamic rendering of brain structures from endoneurosonography, for which the process proposed here is a critical stage to ensure an adequate 3D modeling.
Acknowledgment

We thank CONACYT (México), COLCIENCIAS (Colombia) and Universidad Nacional de Colombia (Manizales) for their financial support of this project. We also thank PhD student Rubén Machucho-Cadena from CINVESTAV Guadalajara for his help during the acquisition of the endoneurosonographic database used in this work.
References

1. Galic, S., Loncaric, S.: Spatio-temporal image segmentation using optical flow and clustering algorithm. In: Proceedings of the First International Workshop on Image and Signal Processing and Analysis, IWISPA 2000, pp. 63-68 (2000)
2. Gillams, A.: 3D imaging - a clinical perspective. In: IEE Colloquium on 3D Imaging Techniques for Medicine, pp. 111-112 (1991)
3. Gobbi, D.G., Comeau, R.M., Lee, B.K.H., Peters, T.M.: Integration of intra-operative 3D ultrasound with pre-operative MRI for neurosurgical guidance. In: Proceedings of the 22nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society, vol. 3, pp. 1738-1740 (2000)
4. Gonzalez, R.C., Woods, R.E., Eddins, S.L.: Digital Image Processing using MATLAB, 2nd edn. Gatesmark Publishing (2009)
5. Haibo, G., Wenxue, H., Jianxin, C., Yonghong, X.: Optimization of principal component analysis in feature extraction. In: International Conference on Mechatronics and Automation, ICMA 2007, pp. 3128-3132 (2007)
6. Horn, B.K.P., Schunck, B.G.: Determining optical flow. Artificial Intelligence 17, 185-203 (1981)
7. Jähne, B.: Digital Image Processing, 5th edn. Springer, Heidelberg (2002)
8. Machucho-Cadena, R., de la Cruz-Rodriguez, S., Bayro-Corrochano, E.: Rendering of brain tumors using endoneurosonography. In: 19th International Conference on Pattern Recognition, ICPR 2008, pp. 1-4 (2008)
9. Na, S., Xumin, L., Yong, G.: Research on k-means clustering algorithm: An improved k-means clustering algorithm. In: Third International Symposium on Intelligent Information Technology and Security Informatics, IITSI 2010, pp. 63-67 (2010)
10. Nummiaro, K., Koller-Meier, E., Van Gool, L.: An adaptive color-based particle filter. Image and Vision Computing 21(1), 99-110 (2003)
11. Ortegon-Aguilar, J., Bayro-Corrochano, E.: Omnidirectional vision tracking with particle filter. In: 18th International Conference on Pattern Recognition, ICPR 2006, vol. 3, pp. 1115-1118 (2006)
12. Roberts, D.W., Hartov, A., Kennedy, F.E., Miga, M.I., Paulsen, K.D.: Intraoperative brain shift and deformation: A quantitative analysis of cortical displacement in 28 cases (1998)
13. Seber, G.A.F.: Multivariate Observations. John Wiley & Sons, Inc., Hoboken (1984)
14. Tatar, F., Mollinger, J.R., Den Dulk, R.C., van Duyl, W.A., Goosen, J.F.L., Bossche, A.: Ultrasonic sensor system for measuring position and orientation of laproscopic instruments in minimal invasive surgery. In: 2nd Annual International IEEE-EMBS Special Topic Conference on Microtechnologies in Medicine and Biology, pp. 301-304 (2002)
15. Varshney, S.S., Rajpal, N., Purwar, R.: Comparative study of image segmentation techniques and object matching using segmentation. In: International Conference on Methods and Models in Computer Science, ICM2CS 2009, pp. 1-6 (2009)