Nonrigid Intraoperative Cortical Surface Tracking Using Game Theory

Christine DeLorenzo1*, Xenophon Papademetris1,2, Lawrence H. Staib1,2, Kenneth P. Vives3, Dennis D. Spencer3, and James S. Duncan1,2

Departments of 1Biomedical Engineering, 2Diagnostic Radiology and 3Neurosurgery, Yale University, New Haven, CT
[email protected]
Abstract

During neurosurgery, nonrigid brain deformation prevents preoperatively acquired images from accurately depicting the intraoperative brain. Stereo vision systems can be used to track cortical surface deformation and update preoperative brain images in conjunction with a biomechanical model. However, these stereo systems are often plagued with calibration error, which can corrupt the deformation estimation. In order to decouple the effects of camera calibration and surface deformation, a framework is needed which can solve for disparate and often competing variables. Game theory, which was developed specifically to handle decision making in this type of competitive environment, has been applied to various fields from economics to biology. In this paper, we apply game theory to cortical surface tracking and use it to infer information about the physical processes of brain deformation and image acquisition.
1. Introduction

The cortical surface can deform up to 1 cm or more during neurosurgery [10]. Because of this brain shift, visualization of the intraoperative brain is necessary for accurate surgical guidance, especially when operating near functional areas. If the cortical surface deformation is detected, this information can be used in conjunction with a biomechanical model to update preoperative brain images. Due to their portability and low cost, stereo camera imaging systems offer a viable option for intraoperative cortical surface detection [6, 11, 12].

Calibration is usually necessary to obtain precise quantitative information from imaging systems. In many real world situations, however, accurate camera calibrations are not possible [8]. This is especially true in the operating room, where extreme time and space constraints limit the calibration procedure possibilities. The resulting inaccurate camera calibrations compound the difficulty of image-derived deformation estimation.

Consider Figure 1, which shows a single image of the intraoperative cortical surface. The imaged sulci are highlighted in black. Projected onto the images in green, by means of the camera calibration parameters, are sulci extracted from a 3D brain surface. To generate the left side of Figure 1, the 3D preoperative sulci were correctly deformed to their intraoperative locations and then projected to image space using inaccurate calibration parameters. The mismatch between these projected sulci and the outlined sulci is purely due to the effects of the camera calibration parameters. On the right side of Figure 1, the calibration parameters are correct, but the 3D preoperative (rather than intraoperative) sulci locations were projected to the image. This figure shows that calibration inaccuracies are not easily distinguishable from surface deformation. Therefore, in order to track the deforming cortical surface, a framework with the ability to solve for competing variables (surface displacement field/camera calibration parameters) is needed.
Figure 1. Intraoperative image indicating the misalignment of projected sulci (green) with the intraoperative imaged sulci positions (black), due to either camera calibration errors (left) or cortical surface deformation (right).
Game theory is the study of multiperson decision making [1]. It has historically been of great interest in business and economics where decisions are made in a competitive environment [9]; however, it has been applied to disciplines ranging from philosophy to biology. Game theory has also made forays into the field of image analysis [3, 4], though,
to our knowledge, its use has been restricted to image segmentation. Aside from the application of game theory to image registration, another considerable distinction between this and previous works is that the variables being updated (surface displacement/camera calibration parameters) are not explicitly contained in the images. That is, the images (preoperative magnetic resonance image (MRI) and intraoperative stereo camera images) are used to infer information about two physical processes (brain deformation/image acquisition), not further information about the image content.
2. Method

In a game theoretic formulation, the players are computational modules. In noncooperative games, collusion between players is prevented and the players pursue their own interests, which are partly conflicting with the others' [1]. Though the players of noncooperative games have competing objectives, the game may still reach an equilibrium state. In game theory, this state is referred to as the Nash equilibrium, and it occurs when there is no incentive for any of the players to deviate from their current positions [4].

In the context of intraoperative neurosurgical guidance, the players would be 1) $U_{dense}$, the dense displacement field applied to the preoperative cortical surface, and 2) $A = [A_0, A_1]$, the camera calibration parameters for the left (0) and right (1) stereo cameras, which are used to transform points from 3D into stereo image space (see Section 4). Since changing the values of the camera calibration parameters can confound the search for the displacement field, the most natural formulation of the intraoperative surface tracking problem is as a noncooperative game. The game will terminate when the Nash equilibrium is reached, as this is the most rational compromise for each player. In this case, a particular instance of $U_{dense}$, $U_{dense_i}$, within the set of all possible dense displacement fields, $U_{dense}^*$, and a particular instance of $A$, $A_j$, within the set of all possible calibration parameters, $A^*$, are said to be at Nash equilibrium if:

$$C_1(U_{dense_i}, A_j) \le C_1(U_{dense}, A_j), \quad \forall\, U_{dense} \in U_{dense}^*$$
$$C_2(U_{dense_i}, A_j) \le C_2(U_{dense_i}, A), \quad \forall\, A \in A^* \tag{1}$$
where $C_1$, $C_2$ are the cost functions corresponding to a particular pair of decisions for the dense displacement field and camera calibration parameters, respectively. These functions are described in more detail below. It has been shown [4] that if the cost functions are of the form:

$$C_1(U_{dense}, A) = F_1(U_{dense}) + \alpha F_2(U_{dense}, A)$$
$$C_2(U_{dense}, A) = F_3(A) + \beta F_4(U_{dense}, A) \tag{2}$$
where the constants, $\alpha$ and $\beta$, are chosen such that the following expression is less than one:

$$\alpha \left[\frac{\partial^2 F_1(U)}{\partial U\,\partial U} + \frac{\partial^2 F_2(U, A)}{\partial U\,\partial U}\right]^{-1} \frac{\partial^2 F_2(U, A)}{\partial U\,\partial A} \;*\; \beta \left[\frac{\partial^2 F_3(A)}{\partial A\,\partial A} + \frac{\partial^2 F_4(U, A)}{\partial A\,\partial A}\right]^{-1} \frac{\partial^2 F_4(U, A)}{\partial A\,\partial U} \tag{3}$$
then the Nash equilibrium exists. (The subscript for Udense was dropped in equation (3) for simplicity.) Because equation (3) is an inequality, a range of values for α and β will satisfy these constraints. Though the range of acceptable values has been previously formally derived, it has also been shown that these values can be chosen empirically [4]. The inequality of equation (3) constrains the values of these constants to be small. Larger values may prevent the algorithm from converging. Therefore, in this work, small values of α and β, which allowed successful convergence of the algorithm, were empirically chosen. Preliminary tests revealed that when the algorithm converged, the results were not strongly sensitive to variations in α and β.
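Before developing the cost functions, the equilibrium condition of equation (1) can be made concrete with a small sketch. The function below (all names are illustrative; the cost functions and candidate sets are assumed to be supplied as Python callables and finite lists, a discretization the paper does not itself use) checks the Nash condition by exhaustive search:

```python
def is_nash_equilibrium(c1, c2, i, j, u_candidates, a_candidates):
    """Check equation (1) over finite candidate sets: (u_candidates[i],
    a_candidates[j]) is a Nash equilibrium if neither player can lower
    its own cost by deviating unilaterally while the other stays put."""
    u_i, a_j = u_candidates[i], a_candidates[j]
    no_better_u = all(c1(u_i, a_j) <= c1(u, a_j) for u in u_candidates)
    no_better_a = all(c2(u_i, a_j) <= c2(u_i, a) for a in a_candidates)
    return no_better_u and no_better_a
```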
3. Bayesian Analysis for Surface Tracking

With the framework of game theory in place, cost functions can be written for $U_{dense}$ and $A$. For this, a deformable model approach is used, which searches over all possible displacement fields applied to the preoperative cortical surface for the one which best matches the intraoperative information. This can be accomplished using Bayesian analysis to motivate the search for $U_{dense}$ or $A$ given all the information extracted from intraoperative stereo camera images and a preoperative MRI:

$$C_1(U_{dense}, A) = p(U_{dense} \mid I, K, C, S^U, A)$$
$$C_2(U_{dense}, A) = p(A \mid I, K, C, S^U, U_{dense}) \tag{4}$$

where $I = [I_0, I_1]$ are the stereo camera images and $K = [K_0, K_1]$ are the locations of the sulci in those images. At this point, the imaged sulci are manually extracted and stored as 2D curves; in the future, we will perform this extraction automatically. $S^U$ is the undeformed preoperative brain surface, extracted from the MRI, and $C$ are the sulci on surface $S^U$ [5]. The preoperative sulci are semiautomatically extracted and stored as 3D curves [5]. In a game theoretic framework, the objective is to minimize each player's cost function, which is the same as maximizing the negative of the cost function. Therefore, the expressions developed in this section can be written as maximizations in which each objective function is maximized with respect to one player while the other is kept fixed at each iteration.
Maximizing the expressions in equation (4) is equivalent to maximizing the log of the posterior probabilities:

$$\hat{U}_{dense} = \arg\max_{U_{dense}} C_1(U_{dense}, \hat{A}) = \arg\max_{U_{dense}} \log\left[p(U_{dense} \mid I, K, C, S^U, \hat{A})\right] \tag{5}$$

$$\hat{A} = \arg\max_{A} C_2(\hat{U}_{dense}, A) = \arg\max_{A} \log\left[p(A \mid I, K, C, S^U, \hat{U}_{dense})\right] \tag{6}$$

where $\hat{U}_{dense}$ and $\hat{A}$ are the updated values of the variables obtained at each iteration. For the first iteration, the value of $\hat{U}_{dense}$ is based on the distance between $C$ and the 3D reconstruction of $K$ (Section 3.2.1), and the value of $\hat{A}$ is set by the initial calibration (Section 4).

3.1. Displacement Field Determination

The cost functions were developed using a Bayesian framework. Unlike our previous approach [6], in this work, assumptions about the independence of left and right stereo image sulcal traces were not made, and the effect of the constant terms was specifically addressed. This is because, in order to apply game theory, a complete and consistent derivation based on Bayesian analysis is necessary. Additionally, the camera calibration parameters, which we previously defined as simple matrices [6], are treated more realistically in this work. This necessitated the use of different projection functions. For all these reasons, the cost functions had to be reevaluated, and their complete development, which incorporates all of the above changes, is shown below.

Applying Bayes' Rule to equation (5) yields:

$$\log\left[p(U_{dense} \mid I, K, C, S^U, \hat{A})\right] = \underbrace{\log\left[p(I, K, C, S^U, \hat{A} \mid U_{dense})\right]}_{\text{Displacement Term}} + \underbrace{\log\left[p(U_{dense})\right]}_{\text{Prior Term}} - \underbrace{\log\left[p(I, K, C, S^U, \hat{A})\right]}_{\text{Constants}} \tag{7}$$

The Displacement Term may be rewritten as:

$$\log\left[p(I, K, C, S^U, \hat{A} \mid U_{dense})\right] = \log\left[p(\hat{A} \mid U_{dense})\right] + \log\left[p(I, K, C, S^U \mid \hat{A}, U_{dense})\right] \tag{8}$$

Assuming the independence of the sulci from the surface and images, equation (8) can be further developed:

$$\log\left[p(I, K, C, S^U, \hat{A} \mid U_{dense})\right] = \log\left[p(I, S^U, \hat{A} \mid U_{dense})\right] + \log\left[p(K, C, \hat{A} \mid U_{dense})\right] - \log\left[p(\hat{A} \mid U_{dense})\right] \tag{9}$$

Combining equations (7)-(9) yields:

$$\log\left[p(U_{dense} \mid I, K, C, S^U, \hat{A})\right] = \underbrace{\log\left[p(I, S^U, \hat{A} \mid U_{dense})\right]}_{\text{Intensity Term}} + \underbrace{\log\left[p(K, C, \hat{A} \mid U_{dense})\right]}_{\text{Feature Term}} + \underbrace{\log\left[p(U_{dense})\right]}_{\text{Prior Term}} - \underbrace{\left(\log\left[p(I, K, C, S^U, \hat{A})\right] + \log\left[p(\hat{A} \mid U_{dense})\right]\right)}_{\text{Constants}} \tag{10}$$

The terms of equation (10) are developed below.

3.1.1 Intensity Term
The intensity matching is performed on backprojected surfaces:

$$p(I, S^U, A \mid U_{dense}) = p(I_0, \hat{A}_0, S^U, I_1, \hat{A}_1 \mid U_{dense}) = p\left(I_0(P(\hat{A}_0, S^U + U_{dense})),\; I_1(P(\hat{A}_1, S^U + U_{dense}))\right) = p(B_0^S, B_1^S) \tag{11}$$

where $S$ is the deformed surface, $S^U + U_{dense}$, and $B_0^S$, $B_1^S$ are the intensities backprojected onto the surface from cameras 0 and 1, respectively. (See the right side of Figure 4.) This formulation emphasizes that there is a single projection function, $P$, that more accurately projects every 3D point on the surface into stereo camera space. The need for this projection function is detailed in Section 4.

The use of normalized cross correlation (NCC) as a match metric follows naturally from this formulation. Assume the noise in each backprojected surface, $P_i$, is additive, Gaussian distributed and independent with zero mean; i.e., $P_i = P_e + n_i$, $i = 0, 1$, where $P_i$, the perceived backprojection, is the true projection, $P_e$, plus zero mean Gaussian noise, $n_i$. Then $P_0 = P_e + n_0$ and $P_1 = P_e + n_1$ imply $P_0 = P_1 + n_t$, where $n_t = n_0 + n_1$ is therefore also Gaussian distributed, leading to:

$$p(P_0, P_1) = \frac{1}{(\sigma\sqrt{2\pi})^n}\, e^{-\frac{\sum (P_0 - P_1)^2}{2\sigma^2}} \tag{12}$$

where $\sigma$ is the standard deviation of the backprojected image intensities and $n$ is the number of surface elements in the image overlap region, over which the sum is taken. Taking the log of both sides of equation (12) yields:
$$\log\left[p(P_0, P_1)\right] = \log\left[\frac{1}{(\sigma\sqrt{2\pi})^n}\right] - \frac{\sum (P_0 - P_1)^2}{2\sigma^2} = \log\left[\frac{1}{(\sigma\sqrt{2\pi})^n}\right] - \frac{\sum \left(P_0^2 - 2 P_0 P_1 + P_1^2\right)}{2\sigma^2} \tag{13}$$

This model only accounts for additive noise in each backprojected image. However, if the aperture of each camera is set differently, there may not be an identity transformation between the two images. Using the original variables, in this case, $B_0^S$ may equal $g B_1^S + n_t$, where $g$ is a constant gain factor. This indicates that one image is brighter than the other. To account for this possible affine transformation, the backprojections can be normalized by subtracting out the mean intensity, $B_i^S - \overline{B_i^S}$. Substituting $P_i = B_i^S - \overline{B_i^S}$ back into equation (13) yields:

$$\log\left[p(B_0^S, B_1^S)\right] = \log\left[\frac{1}{(\sigma\sqrt{2\pi})^n}\right] - \frac{\sum \left[(B_0^S - \overline{B_0^S})^2 - 2 (B_0^S - \overline{B_0^S})(B_1^S - \overline{B_1^S}) + (B_1^S - \overline{B_1^S})^2\right]}{2\sigma^2} \tag{14}$$

Since the image intensities in the scene will not change substantially throughout the entire surgery, for all $U_{dense}$, the square of the normalized images, $(B_i^S - \overline{B_i^S})^2$, should be close to constant. This is the same as saying that a single camera's intensity histogram will not vary greatly throughout the surgical procedure. This can also be extended to motivate the assumption that the image overlap size of the two backprojected images will remain fairly constant. Using the above assumptions and neglecting all the constant terms in equations (12)-(14) yields the probability between the two backprojected images:

$$\log\left[p(B_0^S, B_1^S)\right] \propto \frac{\sum (B_0^S - \overline{B_0^S})(B_1^S - \overline{B_1^S})}{\sqrt{\sum (B_0^S - \overline{B_0^S})^2}\; \sqrt{\sum (B_1^S - \overline{B_1^S})^2}} \tag{15}$$

In equation (15), the divisor, $\sqrt{\sum (B_i^S - \overline{B_i^S})^2}$, is a constant term that normalizes the output between $-1$ and $1$. The result is the familiar formulation for NCC. Equations (11) and (15) can be combined to yield the intensity term:

$$T_I(U_{dense}, A) = \log\left[p(I, S^U, A \mid U_{dense})\right] = \eta\, NCC(B_0^S, B_1^S) \tag{16}$$

where $\eta$ is a normalizing constant, ensuring that all terms have similar orders of magnitude.
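A minimal numpy sketch of equations (15) and (16) follows, assuming the intensities backprojected over the overlap region are available as flat arrays (the function and argument names are illustrative, not from the paper's implementation); the default $\eta = 25$ matches the value reported in Figure 3:

```python
import numpy as np

def intensity_term(b0, b1, eta=25.0):
    """T_I of equation (16): eta times the normalized cross correlation
    (equation (15)) of the intensities backprojected from cameras 0 and 1
    onto the same surface elements (overlap region only)."""
    p0 = b0 - b0.mean()  # subtract the mean to absorb a gain/offset change
    p1 = b1 - b1.mean()
    ncc = np.sum(p0 * p1) / (np.sqrt(np.sum(p0**2)) * np.sqrt(np.sum(p1**2)))
    return eta * ncc
```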
3.1.2 Constants and Prior

The stereo camera images, $I$, the sulci in those images, $K$, the preoperative surface, $S^U$, and the sulci on that surface, $C$, are all constants in this framework because they comprise the information supplied to the algorithm. Additionally, while solving for the dense displacement field, the camera calibration parameters from the previous iteration, $\hat{A}$, are held fixed. Though all of these variables are given, the probabilities of these variables as expressed in the constant terms of equation (10) may not be fixed. While it may be possible to find exact probabilistic definitions for these terms, the process is non-trivial and not easily validated. The added information supplied by the probability of the given constants could possibly enhance the surface tracking algorithm; however, it is not necessary to solve the model. Because of this, the probabilistic definitions above are used to motivate the algorithm development rather than exactly define it. To indicate this, matching terms can be used, which incorporate the goals of the conditional probability terms into an intuitive framework. The model can be reformulated by combining equations (5), (10) and (16) as:

$$\hat{U}_{dense} = \arg\max_{U_{dense}} \log\left[M(U_{dense} \mid I, K, C, S^U, \hat{A})\right] = \arg\max_{U_{dense}} \left\{ \underbrace{NCC(B_0^S, B_1^S)}_{\text{Intensity Term}} + \underbrace{\log\left[M(K, C, \hat{A} \mid U_{dense})\right]}_{\text{Feature Term}} + \underbrace{\rho\, e^{-\int |U''_{dense}|\, dS}}_{\text{Prior Term}} \right\} \tag{17}$$

where $M$ represents a matching procedure, rather than a formal probability. The prior, $T_U(U_{dense})$, based on the assumption that the surface deformation is smooth, is also formally defined in this equation. It assures that the second derivative of the displacement field, $U''_{dense}$, is small, and it contains the normalizing constant $\rho$.

3.1.3 Feature Term
The feature matching term from equation (17), $\log[M(K, C, \hat{A} \mid U_{dense})]$, can be simplified by considering the sulci outlined in the two camera images, $K_0$ and $K_1$, separately. The matching can then be performed in intraoperative stereo image space. Intuitively, the correct displacement field, when applied to the preoperative sulci, will deform those sulci until they are in the same location they were in when imaged with the stereo cameras. If these deformed sulci are projected to the images, by means of the camera calibration parameters, they should be exactly aligned with the imaged intraoperative sulci. The matching term, therefore, penalizes displacement fields which do not align the imaged sulci with the projected sulci. The term is weighted by the mean distance between the two sets of features. (See Figure 2 for a pictorial description.)
Figure 2. Pictorial description of the feature term. The top row shows a cortical surface patch extracted from a preoperative MRI. Using the projection function, the 3D sulci from that patch are projected onto the 2D stereo camera images. Assuming an accurate camera calibration, the mismatch between the projected features (cyan) and the imaged features (red) is due to the surface deformation. The bottom row shows various displacement fields applied to the surface patch. The distance between the projected sulci from these patches (yellow) and the imaged sulci (red) is measured for each applied displacement. In this case, the displacement field applied to the patch on the far right aligns the features best.
The feature matching term can therefore be written as:

$$T_F(U_{dense}, A) = -\left( \int d\left[K_0, P(\hat{A}_0, (C + U^C_{dense}))\right] dS + \int d\left[K_1, P(\hat{A}_1, (C + U^C_{dense}))\right] dS \right) \tag{18}$$

where $U^C_{dense}$ is the dense displacement field restricted to the sulci and $d$ is a mean Euclidean distance metric, defined as $d(W, V) = \frac{1}{N} \sum_{i=1}^{N} \|w_i - v_i\|$. In this case, $W$ and $V$ have the components $(w_{x_{1...N}}, w_{y_{1...N}}, w_{z_{1...N}})$ and $(v_{x_{1...N}}, v_{y_{1...N}}, v_{z_{1...N}})$, respectively.
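The following sketch evaluates a discrete analogue of equation (18), under the simplifying assumption that each sulcal curve is sampled at N corresponding points; the `project` callable stands in for the projection function P of Section 4, and all names are illustrative:

```python
import numpy as np

def mean_distance(w, v):
    """Mean Euclidean distance d(W, V) between corresponding points;
    w and v are (N, dim) arrays."""
    return np.mean(np.linalg.norm(w - v, axis=1))

def feature_term(k0, k1, project, a0, a1, sulci, u_sulci):
    """Discrete analogue of T_F (equation (18)): deform the preoperative
    sulci by the displacement field restricted to the sulci, project them
    into each camera with the current calibration, and penalize the mean
    distance to the imaged sulci K0 and K1."""
    deformed = sulci + u_sulci  # C + U^C_dense, an (N, 3) array
    return -(mean_distance(k0, project(a0, deformed)) +
             mean_distance(k1, project(a1, deformed)))
```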
3.2. Camera Calibration Calculation

The expansion of the camera calibration term can proceed in the same fashion as the dense displacement field. Applying Bayes' Rule to equation (6) yields:

$$\log\left[p(A \mid I, K, C, S^U, \hat{U}_{dense})\right] = \underbrace{\log\left[p(I, K, C, S^U, \hat{U}_{dense} \mid A)\right]}_{\text{Calibration Term}} + \underbrace{\log\left[p(A)\right]}_{\text{Prior Term}} - \underbrace{\log\left[p(I, K, C, S^U, \hat{U}_{dense})\right]}_{\text{Constants}} \tag{19}$$

Again, in this case, once the Bayesian analysis is performed, intuitive matching metrics can be used to develop each term. Equations (6) and (19) can thus be written as:

$$\hat{A} = \arg\max_{A} \log\left[M(A \mid I, K, C, S^U, \hat{U}_{dense})\right] = \arg\max_{A} \left\{ \underbrace{\log\left[M(I, K, C, S^U, \hat{U}_{dense} \mid A)\right]}_{\text{Calibration Term}} + \underbrace{\log\left[M(A)\right]}_{\text{Prior Term}} \right\} \tag{20}$$

3.2.1 Calibration Term

The calibration term of equation (19) refers to the ability of the camera parameters to transform any 3D point to its correct 2D image location. The sulci provide accurate feature information from which correspondence can be reliably calculated. Because of this, the sulci on the 3D surface, $C$, provide all the necessary information from the surface itself, $S^U$. Additionally, since the imaged sulci are the only features of interest, $K$ is a sufficient statistic for the rest of the image, $I$. Thus, the calibration term reduces to $\log[M(K, C, \hat{U}_{dense} \mid A)]$.

To further evaluate this term, a stereo reconstruction function, $RC$, is defined. It can be calculated from the initial camera calibration, $A$, by determining the rigid transformation between cameras 0 and 1. If this transformation is represented by a rotation, $R_{01}$, and translation, $T_{01}$, the 3D reconstruction of a set of corresponding points $m_{0_{i...n}}$ in image 0 and $m_{1_{i...n}}$ in image 1 can be found through the equation

$$M_{x_i} m_{0_i} - M_{y_i} R_{01}^T m_{1_i} + M_{z_i} (m_{0_i} \times R_{01}^T m_{1_i}) = T_{01},$$

where $M_i = [M_{x_i}, M_{y_i}, M_{z_i}]$ is the reconstructed 3D location of corresponding points $m_{0_i}$ from camera 0 and $m_{1_i}$ from camera 1 [13]. Using the above equations, the reconstruction of each of the outlined sulci, $K$, can be performed and the mean Euclidean distance between these and the 3D deformed sulci can be measured:

$$T_C(U_{dense}, A) = \int d\left[RC(K_0, K_1, A), (C + U^C_{dense})\right] dS \tag{21}$$
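A sketch of the reconstruction RC for a single correspondence, following the midpoint construction of [13] as quoted above; representing the image points as normalized homogeneous 3-vectors and returning the midpoint of the common perpendicular are assumptions of this sketch:

```python
import numpy as np

def reconstruct_point(m0, m1, r01, t01):
    """Solve  Mx*m0 - My*(R01^T m1) + Mz*(m0 x R01^T m1) = T01  for
    (Mx, My, Mz), then return the point midway along the segment
    orthogonal to both viewing rays. m0, m1: normalized homogeneous
    image points (3-vectors); r01, t01: rotation and translation
    between cameras 0 and 1, taken from the initial calibration A."""
    ray1 = r01.T @ m1
    w = np.cross(m0, ray1)  # direction orthogonal to both rays
    mx, my, mz = np.linalg.solve(np.column_stack((m0, -ray1, w)), t01)
    return mx * m0 + (mz / 2.0) * w
```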
3.2.2 Prior Term

Camera calibration is performed by locating fiducial points on the cortical surface and in the stereo images [13]. The prior on the camera calibration, $T_A(A)$, states that as the camera parameters are updated, the projections of the $n$ fiducial points, $L_{0...n}$, onto the stereo camera images should remain close to the imaged fiducial points in cameras 0 and 1, $m_{0_{i...n}}$ and $m_{1_{i...n}}$, respectively. This yields:

$$T_A(A) = \frac{1}{n} \sum_{i=1}^{n} \left( \|P(A_0, L_i) - m_{0_i}\| + \|P(A_1, L_i) - m_{1_i}\| \right)$$
Use of a prior term to update the camera calibration is more consistent with the Bayesian framework and creates a more accurate result.
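The prior transcribes directly as a mean reprojection error over the tracked fiducials; as before, the `project` callable and the point-to-point correspondence are assumptions of this sketch:

```python
import numpy as np

def calibration_prior(a0, a1, project, fiducials, m0, m1):
    """T_A(A): mean distance between the projected 3D fiducial points
    L_i and their imaged locations m0_i, m1_i in cameras 0 and 1.
    fiducials: (n, 3) array; m0, m1: (n, 2) arrays of imaged points."""
    e0 = np.linalg.norm(project(a0, fiducials) - m0, axis=1)
    e1 = np.linalg.norm(project(a1, fiducials) - m1, axis=1)
    return np.mean(e0 + e1)
```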
3.3. Model Reformulation

Putting the above equations together, the model for determining the displacement and camera calibration can be written in the form of equation (2):
$$C_1(U_{dense}, A) = T_U(U_{dense}) + \alpha\left[T_F(U_{dense}, A) + T_I(U_{dense}, A)\right]$$
$$C_2(U_{dense}, A) = T_A(A) + \beta\, T_C(U_{dense}, A) \tag{22}$$

These objective functions can be iteratively updated using gradient descent optimization until the Nash equilibrium is reached. Because there may not be a unique Nash equilibrium for each system of equations, the initialization of the displacement field and calibration parameters can be important. However, the results in Section 5.3 show that the method is robust to the range of possible initializations normally encountered for this procedure.
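The alternating optimization can be sketched as follows. The gradient callables, step sizes and stopping tolerance are assumptions of this sketch, not values from the paper (the paper reports $\alpha = 4$ and $\beta = 0.83$ for the in vivo cases; see Figure 3):

```python
import numpy as np

def game_theoretic_tracking(u0, a0, grad_c1_u, grad_c2_a,
                            step_u=1e-2, step_a=1e-3,
                            tol=1e-6, max_iter=1000):
    """Alternating gradient descent on equation (22): player 1 updates
    the dense displacement field U on its own cost C1 with A fixed;
    player 2 updates the calibration A on its own cost C2 with U fixed.
    The loop stops when neither player moves appreciably, i.e. at an
    approximate Nash equilibrium."""
    u = np.asarray(u0, dtype=float).copy()
    a = np.asarray(a0, dtype=float).copy()
    for _ in range(max_iter):
        u_new = u - step_u * grad_c1_u(u, a)      # minimize C1 over U_dense
        a_new = a - step_a * grad_c2_a(u_new, a)  # minimize C2 over A
        if (np.linalg.norm(u_new - u) < tol and
                np.linalg.norm(a_new - a) < tol):
            return u_new, a_new
        u, a = u_new, a_new
    return u, a
```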
4. Camera Calibration

We previously modeled the camera calibration parameters as projection matrices based on a pinhole camera model [6]. This model had simple projection functions; however, it neglected many image acquisition effects. While this may be suitable for some stereo applications, it is invalid for applications requiring high accuracy [7], such as tracking intraoperative brain shift. To account for this, a new model was developed which takes into account the camera calibration parameters of rotation, translation, focal length, principal point, image distortion and skew coefficient, based on the internal camera model of Heikkilä [7]. Using these parameters, the projection of any 3D point onto an image can easily be calculated using methods outlined in [7] or even from commercial software packages [2]. This projection process, adapted from [2], is outlined below.

First, a 3D point in the camera reference frame, $(X_c, Y_c, Z_c)$, is projected to an image using a pinhole camera projection model, where $x_n(1) = X_c/Z_c$ and $x_n(2) = Y_c/Z_c$. In this case, $x_n$ is the normalized pinhole projection. Next, the point coordinates which account for lens distortion, $x_d$, are calculated. This formulation models lens distortion using sixth order polynomial terms as a function of the distance from the optical axis. The distorted coordinates are

$$x_d = \left(1 + kc(1)\, r^2 + kc(2)\, r^4 + kc(5)\, r^6\right) x_n + dx,$$

where $r^2 = x_n(1)^2 + x_n(2)^2$ and $kc$ is the $5 \times 1$ vector of image distortion coefficients. The tangential distortion vector, $dx$, is equal to

$$dx = \begin{bmatrix} 2\, kc(3)\, x_n(1)\, x_n(2) + kc(4)\left(r^2 + 2 x_n(1)^2\right) \\ kc(3)\left(r^2 + 2 x_n(2)^2\right) + 2\, kc(4)\, x_n(1)\, x_n(2) \end{bmatrix}.$$

Once the distortion is applied, the final pixel coordinates can be found by setting $[x\; y\; 1]^T$ equal to $K_{camera} [x_d(1)\; x_d(2)\; 1]^T$, where $K_{camera}$ is referred to as the camera matrix and is defined as

$$K_{camera} = \begin{bmatrix} f_x & \alpha_c f_x & o_x \\ 0 & f_y & o_y \\ 0 & 0 & 1 \end{bmatrix}.$$

In this matrix, $(f_x, f_y)$ is the focal length in pixels, $(o_x, o_y)$ are the principal point coordinates and $\alpha_c$ is the skew coefficient.

The parameters in the expanded calibration model will more reliably perform the projections and backprojections required for the game theoretic algorithm. It is possible to calculate the expanded model parameters directly from images of a calibration target in different poses [2, 15]. However, this direct calculation process is not feasible in the OR due to time constraints. Because a transformation between the preoperative MRI space (registered patient space) and the stereo image space is sought, the coordinates of the calibration pattern at each pose would have to be known. This could be accomplished, for example, by locating the corners of the calibration object with a tracking tool each time it was moved. The result would be an unwieldy and time consuming calibration process. However, since the calibration parameters will be updated at every iteration, only a rough estimate of these values is needed to initialize the algorithm. This allows the use of a range of possible techniques (such as those which make simplifying assumptions [13, 14]) to obtain the initial parameter estimates. Any parameters not directly estimated from the chosen calibration technique can be initialized to zero and determined by the game theoretic algorithm. Using game theory, therefore, provides all of the advantages of an expanded calibration model without the necessity of an expanded calibration technique.
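The projection model above can be transcribed directly. This sketch assumes a simple parameter container (the field names are illustrative) and vectorizes over N points:

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class CameraParams:
    R: np.ndarray         # 3x3 rotation into the camera reference frame
    T: np.ndarray         # 3-vector translation
    kc: np.ndarray        # 5 image distortion coefficients
    fx: float             # focal length in pixels, x
    fy: float             # focal length in pixels, y
    ox: float             # principal point, x
    oy: float             # principal point, y
    alpha_c: float = 0.0  # skew coefficient

def project(p, pts):
    """Project (N, 3) points to pixel coordinates with the pinhole plus
    distortion model of Section 4 [2, 7]."""
    cam = (p.R @ pts.T).T + p.T            # (Xc, Yc, Zc) per point
    xn = cam[:, :2] / cam[:, 2:3]          # normalized pinhole projection
    x, y = xn[:, 0], xn[:, 1]
    r2 = x**2 + y**2
    radial = 1 + p.kc[0]*r2 + p.kc[1]*r2**2 + p.kc[4]*r2**3
    dx = np.stack([2*p.kc[2]*x*y + p.kc[3]*(r2 + 2*x**2),
                   p.kc[2]*(r2 + 2*y**2) + 2*p.kc[3]*x*y], axis=1)
    xd = radial[:, None] * xn + dx         # distorted coordinates
    u = p.fx * (xd[:, 0] + p.alpha_c * xd[:, 1]) + p.ox
    v = p.fy * xd[:, 1] + p.oy
    return np.column_stack([u, v])
```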
5. Results

5.1. In Vivo Surface Tracking with Game Theory

Game theoretic cortical surface tracking was used in five separate surgeries, for a total of eight data sets. The algorithm results for all cases are shown in Figure 3. On the left side of this figure, the blue bars represent the calculated mean displacement of the cortical surface as predicted by the game theoretic algorithm. The mean residual error of the algorithm (red bars) is calculated by averaging the closest distances between the predicted surface and sparse cortical surface points touched with a 3D locator intraoperatively. Five of the eight cases (62.5%) resulted in a mean algorithm error of less than 1.0 mm, and the mean error never exceeded 1.70 mm. These results indicate accurate cortical tracking and an 81% improvement over uncompensated error.

While the mean errors were quite low, it is also important to examine the maximum error of the algorithm in each case.
Figure 3. Mean and maximum displacement versus error for eight intraoperative data sets. The algorithm was run with α = 4, β = 0.83, ρ = 0.1, and η = 25 for all cases.
The right side of Figure 3 indicates the decrease in maximum error using the game theoretic algorithm. For half the cases, the maximum error was under 1.6 mm, and, overall, there was a 76% decrease in the maximum errors using the model guidance. This finding is important because it indicates that the algorithm not only decreases the mean error, but also never increases the maximum surface error. Thus, for all eight cases, every part of the surface was found more accurately using the game theoretic algorithm than by relying on the preoperative surface.
5.2. Sample Case

In order to further examine the game theoretic algorithm, images from a sample case, data set #2, are shown. As can be seen in the top row of the left side of Figure 4, sulci projected from the preoperative brain surface (green) are not aligned with the features extracted from the stereo images (black). Using the game theoretic surface tracking algorithm, updated camera calibration parameters and a displacement field were calculated. The displacement field was then applied to the preoperative brain surface, and the deformed positions of the sulci were projected to the stereo images using the updated camera parameters (yellow). As shown in the bottom row of Figure 4, with the calculated deformation and calibration, the features are better aligned. The intensities are also better matched, as can be seen on the right side of Figure 4. The intensities that are backprojected from the left camera better match the intensities that are backprojected from the right camera when the surface is in its predicted location and the updated calibration parameters are used. Additionally, the updated deformation and camera parameters better align the backprojected imaged sulci with those extracted from the surface.
5.3. Algorithm Robustness Comparison

The above results show that the game theoretic algorithm accurately tracks a deforming surface. In order to examine the advantages of this framework, it was compared to three other tracking algorithms for the in vivo case described above (data set #2).
Figure 4. Left: Intraoperative stereo camera images of the cortical surface. Misalignment of the projected preoperative sulci (green) with the intraoperative sulci positions (black) due to brain shift and camera calibration errors is shown on top. On bottom, the predicted sulci positions, projected with the updated calibration parameters (yellow), show better alignment. Right: Cortical surface positions with backprojected intensities. Each point on the surface is projected to the left (left column) or right (right column) stereo image and assigned the intensity value of that projected point. Red arrows indicate the misalignment between the sulci on the preoperative surface (green) and those seen in the backprojected image. The sulci on the algorithm-predicted surface are better aligned with the image intensities.
For comparison, the surface tracking was also performed using: (1) a single objective function comprised of all the terms in the objective functions $C_1$ and $C_2$, $SO(U_{dense}, A) = T_U(U_{dense}) + \alpha[T_F(U_{dense}, A) + T_I(U_{dense}, A)] + T_A(A) + \beta T_C(U_{dense}, A)$. This function is based on the same Bayesian analysis used to develop the game theoretic algorithm; however, the dense displacement field and camera calibration optimization are performed in a single step. (2) The cost function, $C_1$, solved without calibration compensation. Although the expression for $C_1$ contains calibration dependent terms, the optimization is only performed over the dense displacement field, using the initial calibration. And (3) an iterative scheme similar to [6], in which the single objective function, $SO(U_{dense}, A)$, is iteratively updated over $U_{dense}$ and $A$. Optimization by gradient descent was used in all three cases.

The use of (1) and (3) will allow one to determine the advantages, if any, of using a game theoretic framework to solve the tracking problem as a noncooperative game. This comparison will show whether the Nash equilibrium is a better solution than the minimum of a single objective function, either updated over $U_{dense}$ and $A$ jointly or iteratively. The comparison to (2) will show whether calibration compensation improves the solution.

Figure 5 shows the robustness of each algorithm to initialization. The plot on the left was generated by adding a range of offsets to the initial surface position. The single objective function, both solved jointly (red) and iteratively (green), is more sensitive to the offset because it cannot resolve whether the initial error is a result of calibration error or deformation. In these cases, the camera calibration could be distorted so much that it minimizes the single cost function, although at the expense of predicting the proper displacement, or vice versa.
Figure 5. Left: Sensitivity to initial surface position. Right: Sensitivity to camera calibration errors.
Accurate tracking could only be performed when the camera calibration parameters were forced to stay fixed (black) or when game theory was used to distinguish between the two types of errors. As expected, the main difference between using a game theoretic approach and a pure displacement algorithm occurs when camera calibration error exists. The right side of Figure 5 was generated by offsetting the translation component of the calibration by different percentages. This offset prevents 3D points from being accurately transformed into the camera coordinate system, and they will therefore be improperly projected into image space. The cortical intensities from the stereo images will also be improperly backprojected onto the 3D surface. Without proper compensation for the calibration parameters (black), the mean algorithm error increases dramatically as calibration error increases. As before, a single objective function (red/green) cannot resolve the source of the error, regardless of the way in which it was optimized (jointly/iteratively). The mean error of all algorithms, with the exception of game theory, rises above 4 mm once the calibration error reaches 5%. Even at 15% error, it is better to use the game theoretic framework than to rely on the uncompensated surface.
6. Conclusions

The game theoretic algorithm developed in this paper achieves accurate results in vivo. The quantitative results reveal that submillimeter accuracy can often be attained, and inspection of a sample case provided visual validation of the results. Further, the robustness comparison shows that, in the absence of calibration error, a game theoretic algorithm performs at least as well as a pure displacement cost function, and the addition of the camera calibration optimization did not decrease the surface tracking accuracy. With the other paradigms, the camera calibration optimization interfered with the displacement field detection and led to a suboptimal result. This does not mean that it is impossible to achieve accurate results in these paradigms. However, it does reveal the problems that may arise from solving for competing variables in a single objective framework.

Additionally, this work shows that game theory can be used in conjunction with image processing to infer information about physical processes (brain deformation/image acquisition). In related work [5], we use this information with a biomechanical model to calculate the full intraoperative volumetric brain deformation. The ultimate goal is to provide neurosurgeons with updated intraoperative brain images for improved surgical navigation, and thus more successful surgeries.
References

[1] T. Başar and G. J. Olsder. Dynamic Noncooperative Game Theory, 2nd Ed. Academic Press, New York, 1995.
[2] J.-Y. Bouguet. Camera calibration toolbox for Matlab. http://www.vision.caltech.edu/bouguetj/calib_doc/.
[3] H. I. Bozma and J. S. Duncan. A game-theoretic approach to integration of modules. IEEE Trans. Pattern Anal. & Mach. Intell., 16(11):1074-1086, 1994.
[4] A. Chakraborty and J. S. Duncan. Game-theoretic integration for image segmentation. IEEE Trans. Pattern Anal. & Mach. Intell., 21(1):12-30, 1999.
[5] C. DeLorenzo, X. Papademetris, K. Vives, D. Spencer, and J. Duncan. A comprehensive system for intraoperative 3D brain deformation recovery. In MICCAI, Brisbane, Australia, 2007.
[6] C. DeLorenzo, X. Papademetris, K. Wu, K. Vives, D. Spencer, and J. Duncan. Nonrigid 3D brain registration using intensity/feature information. In MICCAI, pages 932-939, Copenhagen, Denmark, 2006.
[7] J. Heikkilä and O. Silvén. A four-step camera calibration procedure with implicit image correction. In CVPR, pages 1106-1112, San Juan, Puerto Rico, 1997.
[8] M. Machacek, M. Sauter, and T. Rösgen. Two-step calibration of a stereo camera system for measurements in large volumes. Measurement Science & Technology, 14:1631-1639, 2003.
[9] E. Mendelson. Introducing Game Theory and Its Applications. Chapman & Hall/CRC, Boca Raton, 2004.
[10] I. Reinertsen, M. Descoteaux, S. Drouin, K. Siddiqi, and D. L. Collins. Vessel driven correction of brain shift. In MICCAI, volume 3217, pages 208-216, 2004.
[11] O. Škrinjar, A. Nabavi, and J. S. Duncan. Model-driven brain shift compensation. Medical Image Analysis, 6(4):361-373, 2002.
[12] H. Sun, K. Lunn, H. Farid, Z. Wu, D. W. Roberts, A. Hartov, and K. D. Paulsen. Stereopsis-guided brain shift compensation. IEEE Trans. Medical Imaging, 24(8):1039-1052, 2005.
[13] E. Trucco and A. Verri. Introductory Techniques for 3-D Computer Vision. Prentice-Hall, Inc., Upper Saddle River, New Jersey, 1998.
[14] R. Y. Tsai. A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses. IEEE J. Robotics & Automation, RA-3(4):323-344, August 1987.
[15] Z. Zhang. Flexible camera calibration by viewing a plane from unknown orientations. In ICCV, pages 666-673, Kerkyra, Greece, 1999.