Mapping of Endoscopic Images to Object Surfaces via Ray-traced Texture Mapping for Image Guidance in Neurosurgery

Damini Dey*a, David G. Gobbia, Kathleen J.M. Surrya, Piotr J. Slomkaa,b, Terry M. Petersa

aImaging Research Laboratory, John P. Robarts Research Institute, London, Canada
bDiagnostic Radiology and Nuclear Medicine, London Health Sciences Center, London, Canada

ABSTRACT

A major limitation of the use of endoscopes in minimally invasive surgery is the lack of relative context between the endoscope and its surroundings. The purpose of this work is to map endoscopic images to surfaces obtained from 3D pre-operative MR or CT data, for assistance in surgical planning and guidance. To test our methods, we acquired pre-operative CT images of a standard brain phantom from which object surfaces were extracted. Endoscopic images were acquired using a neuro-endoscope tracked with an optical tracking system, and the optical properties of the endoscope were characterized using a simple calibration procedure. Registration of the phantom (physical space) and CT images (pre-operative image space) was accomplished using markers that could be identified both on the physical object and in the pre-operative images. The endoscopic images were rectified for radial lens distortion, and then mapped onto the extracted surfaces via a ray-traced texture-mapping algorithm, which explicitly accounts for surface obliquity. The optical tracker has an accuracy of about 0.3 mm, which allows the endoscope tip to be localized to within (0.8 +/- 0.3) mm. The mapping operation allows the endoscopic images to be effectively "painted" onto the surfaces as they are acquired. Panoramic and stereoscopic visualization and navigation of the painted surfaces may then be performed from arbitrary orientations, not necessarily those from which the original endoscopic views were acquired.

Keywords: Endoscopy, image-guided neurosurgery, visual data fusion, painting algorithm, ray-traced texture mapping, panoramic and stereoscopic visualization, 2D-3D registration

1. INTRODUCTION

In minimally invasive surgery, endoscopes are introduced through a small entry port into a body cavity. Endoscopes provide high-resolution two-dimensional (2D) real-time video images of a limited view of the surgical site. They have previously been used in arthroscopic and abdominal surgery, and more recently in the ventricles of the brain for neurosurgical applications1,2. Since the field-of-view of the endoscope is small, a major limitation of endoscopes is the lack of relative context between the endoscope and its surroundings3. Relating endoscopic images to their surroundings, and to pre-operative image data, is a crucial but challenging task for the surgeon.

We have developed methods for real-time integration of endoscopic images with pre-operative MRI or CT images, for assistance in the planning and guidance of surgical procedures. The objective of this work was to develop methods for accurate, real-time intra-operative merging of 2D endoscopic images with 3D pre-operative images. Our specific objectives were: to intra-operatively track the endoscope, to develop and apply an accurate optical model of the endoscope to the image rectification task, and to accurately map the endoscopic images onto polygonal surfaces generated from the pre-operative images. In this work we describe our preliminary results. Following testing and validation on phantom data, our goal is to apply these methods to endoscopy-assisted surgery.

The ultimate goal of image-guided neurosurgery is to accurately track intra-operative tools and to display them on the 3D pre-operative image data. To achieve this, it is necessary to register the patient to the pre-operative image data. A video image is a 2D projection of the 3D scene. This can be modeled using the pinhole camera model4. If the camera can be accurately calibrated, the 3x4 perspective transformation matrix relating an arbitrary point in the 3D scene to a point in the 2D video image can be calculated. Given a point x in 3D, defined by the homogeneous co-ordinates x = (x, y, z, 1), the purpose of the camera calibration is to find the 3x4 transformation matrix T that relates x to a point u = (u, v, 1) on the 2D image, i.e.:

*Correspondence: Email: ddey@irus.rri.on.ca; WWW: http://www.irus.rri.on.ca/~ddey; Telephone: 519-685-8300 ext. 34004; Fax: 519-663-3403


In Medical Imaging 2000: Image Display and Visualization, Seong K. Mun, Editor, Proceedings of SPIE Vol. 3976 (2000), 1605-7422/00/$15.00

wuᵀ = Txᵀ     (1)

The matrix T represents a rigid body transformation from 3D scene (or world) co-ordinates to 3D camera co-ordinates, followed by a projective transformation onto the 2D imaging plane. w represents a scaling factor for homogeneous co-ordinates. If the camera characteristics are known, a 2D virtual camera view can be generated by placing a virtual camera at the appropriate position in the 3D scene.

The integration of video with virtual scenes has become common practice in the telerobotics field5. Several investigators have attempted to integrate intra-operative endoscopic video with anatomical images. Typically, the endoscopic video images are displayed alongside the corresponding virtual camera views generated from the pre-operative images6,7. Konen et al. have mapped anatomical landmarks derived from pre-operative MRI images to live video sequences acquired with a tracked endoscope8. In contrast to these approaches, Jannin et al. have mapped microscope images onto surfaces derived from anatomical images for enhanced visualization9, while Clarkson et al. have texture mapped 2D stereoscopic video images onto surfaces derived from anatomical images10. In some general image visualization applications, multiple photographic or video images have been "stitched" together to form a cylindrical or spherical panorama11,12,13. However, in all these methods, the camera acquiring the images is required to be under some kind of constraint, for example, be made to rotate about a fixed central axis12,13. To our knowledge, multiple 2D images have not previously been texture mapped to an arbitrary 3D surface from an arbitrary camera position and orientation.

In this work, we map multiple endoscopic images onto 3D surfaces to provide panoramic visualization, using a novel, fully automatic mapping technique. By mapping the 2D images, acquired monoscopically by a single endoscope, to the 3D surface, we can impart stereoscopic depth information to the endoscopic images.
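Equation 1 is the standard homogeneous pinhole projection. As a concrete illustration (not part of our system), the following minimal NumPy sketch applies an assumed 3x4 matrix T to a world point; the focal length and image-centre values are hypothetical.

```python
import numpy as np

def project_point(T, x_world):
    """Project a 3D world point to 2D image co-ordinates using a
    3x4 camera matrix T, as in Equation 1 (illustrative sketch only)."""
    x_h = np.append(x_world, 1.0)   # homogeneous co-ordinates (x, y, z, 1)
    wu = T @ x_h                    # w * (u, v, 1)
    return wu[:2] / wu[2]           # divide out the scale factor w

# Example: an assumed pinhole camera with focal length f, 640x480 image centre
f = 500.0
T = np.array([[f,   0.0, 320.0, 0.0],
              [0.0, f,   240.0, 0.0],
              [0.0, 0.0, 1.0,   0.0]])

print(project_point(T, np.array([10.0, 5.0, 1000.0])))
```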

2. METHODS

There are five principal steps required to map endoscopic images to object surfaces:

1. Acquire the anatomical data (CT, MRI) and process it to extract the relevant surfaces.
2. Build an optical model of each endoscope.
3. Register the patient to the image data.
4. Track the endoscope and localize its tip in world co-ordinates.
5. Identify the surface patch that is intersected by the modeled field-of-view of the endoscope and paint the clipped surface patch with the endoscopic image.

The first and second steps must be performed prior to the intra-operative procedure.

To test our methods, we acquired 3D CT images of a standard brain phantom from Kilgore International Inc. (Coldwater, Michigan). We acquired a 512 x 512 volume data set, with 0.488 mm x 0.488 mm in-plane voxel size and 1 mm slice thickness, on a GE HiSpeed CT scanner (General Electric, Milwaukee). The images were segmented using a 3D region-growing algorithm and the surface of the phantom was extracted using the Marching Cubes algorithm in the Visualization Toolkit (VTK)14 (Figure 1).
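A surface-extraction pipeline of this kind can be sketched in VTK as below. This is a minimal sketch assuming a current VTK Python build; the file names and iso-value are hypothetical, and the region-growing segmentation (performed with separate software) is not shown.

```python
import vtk

# Read the CT volume (file name and format are assumptions for illustration)
reader = vtk.vtkMetaImageReader()
reader.SetFileName("phantom_ct.mhd")

# Isosurface extraction with Marching Cubes at an assumed iso-value
marching = vtk.vtkMarchingCubes()
marching.SetInputConnection(reader.GetOutputPort())
marching.SetValue(0, 300)

# Optional smoothing to keep the triangle mesh well-behaved
smoother = vtk.vtkWindowedSincPolyDataFilter()
smoother.SetInputConnection(marching.GetOutputPort())
smoother.SetNumberOfIterations(20)

writer = vtk.vtkPolyDataWriter()
writer.SetInputConnection(smoother.GetOutputPort())
writer.SetFileName("phantom_surface.vtk")
writer.Write()
```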

2.1 Tracking the endoscope

Endoscopic images were acquired with a 2.7 mm straight AESCULAP neuro-endoscope, which was tracked by the POLARIS Optical Tracking System (Northern Digital Inc., Waterloo, Canada). The tracking system consists of a T-shaped assembly of infra-red light-emitting diodes (LEDs), and a position sensor consisting of two cylindrical camera lenses mounted on a bar (Figure 1 (c)). The LED assembly was attached to the shaft of the endoscope via a plexiglass mount. The endoscopic video images were digitized using a Matrox Corona framegrabber board in a Pentium II 450 MHz PC. The phantom was registered to its 3D image by identifying fiducial markers on both the phantom and its image, and finding the rigid body transformation which minimized the least-squares residual between the two sets of homologous points.
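The fiducial-based registration amounts to a least-squares rigid-body fit between two homologous point sets. The sketch below uses the standard SVD-based (Arun/Kabsch) solution to illustrate the computation; it is not necessarily our implementation, and the fiducial co-ordinates in the example are hypothetical.

```python
import numpy as np

def rigid_register(src, dst):
    """Least-squares rigid-body registration of homologous 3D point sets
    (Nx3 arrays). Returns R, t such that dst ~ R @ src + t."""
    src_c = src - src.mean(axis=0)
    dst_c = dst - dst.mean(axis=0)
    H = src_c.T @ dst_c                       # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                        # guard against reflections
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Example with hypothetical fiducial co-ordinates (mm)
phantom_pts = np.array([[0, 0, 0], [50, 0, 0], [0, 60, 0], [0, 0, 40]], float)
image_pts = phantom_pts @ np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]]).T + [10, 20, 5]
R, t = rigid_register(phantom_pts, image_pts)
residual = np.linalg.norm((phantom_pts @ R.T + t) - image_pts, axis=1)
print(R, t, residual.max())
```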


Figure 1. (a) Pre-operative CT images. (b) Segmented surface of phantom. (c) Endoscope tracked with POLARIS optical tracking system.

2.2 Optical Model

The LEDs, mounted at the end of the endoscope, are continuously tracked by the POLARIS. We developed a fast tip localization method that is part of the optical calibration, and which calculates the position and the six degrees of freedom of the tip from the measured LED positions (Figure 1 (c)). We also derived an optical model of the endoscope, similar to the well-known method of Tsai15, by imaging a simple calibration pattern at several known positions. Our calibration pattern is shown in Figure 2 (a). This approach consists of imaging the calibration pattern at several known distances from the endoscope tip, and determining from the resulting images a model of the endoscopic viewing cone15 (Figure 2 (b)). We adopted the method described by Haneishi et al.16 to extract a model of the radial (or barrel) distortion of the endoscope lens from the same set of images of the calibration pattern. The intersections of the lines of the grid pattern are identified, and the radial distance of each point from the center of distortion is calculated using Pythagoras' theorem (Figure 2 (c)). The corrected radial distance is given as a 4th-order polynomial function of the distorted radial distance. By iteratively optimizing the polynomial coefficients and estimating the distortion center, the coefficients that characterize the distortion can be determined16.
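As an illustration of the distortion-correction step, the sketch below applies a 4th-order polynomial radial model to move image points outward from an estimated distortion centre. The exact parameterization and the coefficient values here are illustrative assumptions rather than calibrated values.

```python
import numpy as np

def correct_radial_distortion(points, center, coeffs):
    """Undistort 2D image points with a 4th-order polynomial radial model.

    points : (N, 2) distorted pixel co-ordinates
    center : (2,)  estimated center of distortion
    coeffs : (a1, a2, a3, a4) so that r_corr = a1*r + a2*r**2 + a3*r**3 + a4*r**4
    """
    d = points - center
    r = np.linalg.norm(d, axis=1)                       # distorted radial distance
    r_corr = sum(a * r**(i + 1) for i, a in enumerate(coeffs))
    scale = np.divide(r_corr, r, out=np.ones_like(r), where=r > 0)
    return center + d * scale[:, None]                  # move each point radially

# Hypothetical example: grid intersections around an assumed image centre
pts = np.array([[100.0, 192.0], [192.0, 192.0], [300.0, 50.0]])
print(correct_radial_distortion(pts, center=np.array([192.0, 192.0]),
                                coeffs=(1.0, 0.0, 2e-6, 0.0)))
```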


Figure 2. (a) Calibration pattern — a set of intersecting grid lines. The colored squares are present to identify the x and y directions in the endoscopic image. (b) Our calibration procedure. (c) Endoscopic image of the calibration pattern showing barrel distortion.

2.3 Ray-traced texture mapping algorithm

Texture mapping is a computer graphics technique commonly used to add realism to virtual 3D scenes4. It is used to map 2D images (textures) to 3D surfaces, which are usually represented in the computer by a set of polygons in 3D. In our implementation, the surfaces are composed of triangles. In recent years, the mapping of textures and the navigation of textured surfaces have been accelerated by inexpensive, off-the-shelf graphics hardware developed to support the PC game industry17. In order to map a texture to a surface, the correct texture co-ordinates for each vertex of each surface polygon must be computed. While this is quite simple for mapping a texture to a regular surface such as a plane, cylinder or sphere, the projection of a texture onto an arbitrary 3D surface, from an arbitrary 3D position, is not straightforward.

Texture mapping of multiple views was implemented via a series of computer graphics operations. From our tip localization procedure, for each endoscopic view, we know the position and orientation of the endoscope tip in 3D. We position the modeled endoscopic viewing cone at the endoscope tip and clip the surface of the ventricle with this cone (Figure 3 (a)). Then we extract the "cutout" surface corresponding to the intersection of the viewing cone with the segmented surface, and intersect this surface with 5 rays. These ray intersections correspond to the 3D positions to which the center and the 4 edges of the texture are mapped (Figure 3 (b)). Finally, we calculate the texture co-ordinates by tracing virtual rays from the vertex of each triangle, through the texture, to the localized endoscope tip (Figure 3 (c)). The mapping is fully automatic, requiring no intra-operative manual intervention.
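As a simplified stand-in for the cone-clipping step, the following sketch tests which surface vertices fall inside the modeled viewing cone. The actual implementation clips the polygonal surface itself (producing new vertices along the cone boundary); the tip, direction and half-angle values would come from the tracked pose and the optical model.

```python
import numpy as np

def inside_viewing_cone(vertices, tip, direction, half_angle_deg):
    """Boolean mask of surface vertices lying inside the endoscopic viewing
    cone (illustrative simplification of the surface-clipping step).

    vertices       : (N, 3) surface vertex positions
    tip            : (3,) localized endoscope tip position
    direction      : (3,) unit vector along the central viewing ray
    half_angle_deg : modeled half-angle of the viewing cone
    """
    d = vertices - tip
    dist = np.linalg.norm(d, axis=1)
    cos_angle = (d @ direction) / np.maximum(dist, 1e-12)
    return cos_angle >= np.cos(np.radians(half_angle_deg))
```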


Figure 3. Ray-traced texture mapping algorithm. (a) Tracked endoscope with modeled viewing cone. (b) Endoscopic image showing (s,t) axes and 5 ray intersections. (c), (d) Calculation of texture co-ordinates from 3D geometry. (e) Endoscopic image mapped to 3D surface.

Each texture has two dimensions, commonly described as (s,t), that vary from 0 to 1. We define the texture origin to be at (0.5, 0.5). If V is a vertex of a triangle in the cutout surface, and the tip of the endoscope is at point P, then the distance h from the vertex to the central ray along the endoscope projection direction is given by

h = r sin θ     (2)

where θ is the azimuth angle (see Figure 3 (d)) and r is the distance between V and P. The distances a and b in Figure 3 (d) along the s and t directions are given by

a = h sin φ,   b = h cos φ     (3)

The angles θ and φ can be calculated from the 3D position and orientation of the endoscope tip and the co-ordinates of the vertex V. θ can be calculated from the scalar product of the central ray vector and the vector from the endoscope tip to the vertex (Equation 4, Figure 3 (d)):


cos θ = (PV · OP) / (|PV| |OP|)     (4)

φ can be calculated from the following:

cos φ = (QV · QR) / (|QV| |QR|)     (5)

where QR is the view-up vector that defines the t direction. For every endoscopic pose, our tip localization method gives us the central ray OP and the view-up vector QR. Finally, the texture co-ordinates (s,t) at V are given by

s = 0.5 + a/S1,   t = 0.5 + b/T1     (6)

where S1 and T1 are texture scaling parameters determined from the ray intersections with the cutout surface. The texture co-ordinates for each vertex are used to effectively "pin" the texture to the surface. Once the texture co-ordinates are calculated, the texture may be mapped to the surface.
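The texture co-ordinate computation of Equations 2-6 can be sketched as follows for a single vertex. The decomposition used to obtain φ and the handling of signs and degenerate cases are assumptions for illustration, not a transcription of our code; S1 and T1 are the scaling parameters obtained from the ray intersections as described above.

```python
import numpy as np

def texture_coords(vertex, tip, central_ray, view_up, s_scale, t_scale):
    """Compute (s, t) for one cutout-surface vertex following Equations 2-6.

    vertex      : 3D position V of a triangle vertex
    tip         : 3D position P of the localized endoscope tip
    central_ray : unit vector along the central viewing ray (OP direction)
    view_up     : unit view-up vector defining the t direction (QR direction)
    s_scale, t_scale : texture scaling parameters S1, T1
    """
    pv = vertex - tip
    r = np.linalg.norm(pv)                           # distance between V and P
    cos_theta = np.dot(pv, central_ray) / r          # Equation 4
    sin_theta = np.sqrt(max(0.0, 1.0 - cos_theta**2))
    h = r * sin_theta                                # Equation 2

    # Component of PV perpendicular to the central ray, used to obtain phi
    perp = pv - np.dot(pv, central_ray) * central_ray
    cos_phi = np.dot(perp, view_up) / max(np.linalg.norm(perp), 1e-12)  # Eq. 5 (assumed form)
    sin_phi = np.sqrt(max(0.0, 1.0 - cos_phi**2))

    a = h * sin_phi                                  # Equation 3
    b = h * cos_phi
    return 0.5 + a / s_scale, 0.5 + b / t_scale      # Equation 6
```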

2.4 Painting Multiple Endoscopic Views

As the endoscope sweeps across the object surface, a cutout surface corresponding to the intersection of the endoscopic viewing cone with the whole surface is extracted, the texture co-ordinates are computed, and the endoscopic image is texture-mapped to this surface. For each pose of the endoscope, a texture-mapped cutout surface is added to the rendered scene. The cutout surface corresponding to the most recent endoscopic pose is added last. This software is written primarily in C++ and the Python programming language18, and is interfaced with the graphics toolkit The Visualization Toolkit (VTK)14. This toolkit is layered on top of the OpenGL graphics libraries and can take advantage of any available hardware-based graphics acceleration. The same software can be run on computers with many operating systems, including Windows NT 4.0 and Linux.
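In VTK, the "painting" of one pose essentially reduces to adding a textured actor to the renderer. The sketch below assumes the (s,t) co-ordinates have already been attached to the cutout polydata; the class choices and names are illustrative, not a transcription of our software.

```python
import vtk

def add_painted_cutout(renderer, cutout_polydata, endo_image_reader):
    """Add one texture-mapped cutout surface to the rendered scene
    (minimal VTK sketch of the painting step)."""
    texture = vtk.vtkTexture()
    texture.SetInputConnection(endo_image_reader.GetOutputPort())
    texture.InterpolateOn()

    mapper = vtk.vtkPolyDataMapper()
    mapper.SetInputData(cutout_polydata)   # polydata already carries (s,t) tcoords

    actor = vtk.vtkActor()
    actor.SetMapper(mapper)
    actor.SetTexture(texture)

    renderer.AddActor(actor)               # most recent pose is added last
    return actor

# Usage sketch, one actor per endoscopic pose (hypothetical file name):
# reader = vtk.vtkPNGReader(); reader.SetFileName("endo_frame_000.png")
# add_painted_cutout(renderer, cutout, reader)
```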

2.5 Evaluation of error

The accuracy with which the texture mapping can be achieved is governed by the accuracy of several sub-components of the system. We evaluated the error in each sub-component as described below.


POLARIS Tracking error: The tracked endoscope was placed in a fixed position for 5 trials. Since the position was fixed, the position of the LED assembly for all the trials should theoretically be exactly the same. There were, however, small differences due to errors in the tracking system. The POLARIS tracking error was calculated from the differences in the 3D position of the LED assembly between successive trials. From these errors, the maximum, mean and standard deviation values were calculated.

Tip Localization Error: The tracked endoscope was placed at fixed distances from the plane defined by the calibration pattern for 5 trials. The distance from the endoscope tip to the center of the calibration pattern was determined by our tip localization method, and was also measured experimentally. For each trial, the tip localization error was defined as the difference between the measured distance and the distance calculated by our tip localization method. From all the trials, the maximum, mean and standard deviation values were calculated.



Texture mapping error: The 3D positions of 4 anatomical landmarks were manually determined on the pre-operative images. In the final texture-mapped rendered scene, a small red sphere was manually placed on each of these anatomical landmarks. The 3D position of the sphere origin, which corresponds to the co-ordinates where the landmark was mapped, was recorded. For each landmark, the texture mapping error, e, was defined as the 3D distance in mm between the measured and the texture-mapped co-ordinate positions:

e = ((xm − x)² + (ym − y)² + (zm − z)²)^(1/2)     (7)

where (xm, ym, zm) are the 3D co-ordinates of the landmarks, measured on the pre-operative images, and (x, y, z) are the 3D co-ordinates where the landmark is mapped, determined by positioning the sphere. If our tip localization and optical modelling were absolutely accurate, the texture-mapped co-ordinates would correspond to the actual co-ordinates and e would be zero. The maximum, mean and standard deviation values were calculated from the values of e for all the landmarks.
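For completeness, the landmark error statistics of Equation 7 can be computed directly as below; the co-ordinates shown are hypothetical values for illustration only.

```python
import numpy as np

# Hypothetical landmark co-ordinates (mm): measured on the pre-operative
# images (xm, ym, zm) and recovered from the texture-mapped scene (x, y, z).
measured = np.array([[12.0, 40.5, 7.2],
                     [25.3, 18.9, 11.0],
                     [33.1, 29.7, 4.8],
                     [8.6, 22.4, 15.3]])
mapped = measured + np.array([[0.9, -0.4, 0.3],
                              [-1.1, 0.6, 0.2],
                              [0.5, 0.8, -0.7],
                              [-0.2, 0.9, 1.0]])

e = np.linalg.norm(measured - mapped, axis=1)   # Equation 7, one value per landmark
print(e.max(), e.mean(), e.std())               # maximum, mean, standard deviation
```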

3. RESULTS

3.1 Visual assessment

Figure 3 (b) shows an endoscopic image of the brain phantom, while Figure 3 (a) demonstrates the localized endoscope tip with the viewing cone. In Figure 3 (e) this endoscopic view is texture-mapped to the surface of the phantom. This particular 2D view is an oblique projection of the 3D scene, as is shown clearly by the lighting in the endoscopic image itself, and by the obliquity of the modeled viewing cone. In the image, the light intensity decreases as the distance between the surface and the endoscope tip increases (Figure 3 (b)). Because the endoscope is tilted at an oblique angle to the surface, and the surface is non-planar, the edges of the viewing cone clipping the surface are jagged (Figure 3 (a)).

All 2D views acquired by an endoscope are compressed into a circular field-of-view. However, from Figure 3 it can be seen that when the endoscope is not oriented perpendicular to the surface, the circular endoscopic image is actually a compressed view of an ellipse. Therefore, when the endoscopic image is mapped back to the 3D surface, the image is stretched back to an ellipse that corresponds to the actual 2D projection of the 3D scene.

The mapping of endoscopic images to the 3D surface is done in real time on a 333 MHz Celeron PC with Elsa Erazor X graphics hardware. The total number of virtual rays traced is equal to the number of vertices in the cutout surface (in this case fewer than 100). In contrast, if we were to trace rays forward through every pixel in the 350x350 endoscopic texture, the number of virtual rays to be traced would be 122,500.

Figure 4 shows a stereoscopic rendering of an endoscopic image texture-mapped onto the surface derived from the CT images of the brain phantom. Figures 5 and 6 show multiple endoscopic views texture-mapped to the surface. The individual endoscopic images are shown at the sides. Visually, the positioning and scaling of the endoscopic views after mapping appear to be correct. Once the endoscopic images are mapped, they become an integral part of the patient dataset.

3.2 Errors

The measured errors in the various components of the system are shown in Table 1.

Table 1. Measured errors in the sub-components of the system.

                          Mean (mm)   Standard Deviation (mm)   Maximum (mm)
POLARIS Tracking Error      0.2                0.1                  0.3
Tip Localization Error      0.8                0.3                  1.0
Texture Mapping Error       1.2                0.6                  1.8

Visually, these errors correspond to a good tip localization and a good 2D-3D registration.


Figure 4. Texture mapped endoscopic view in stereo. Left pair (a) and (b) are for cross-eyed viewing, right pair (b) and (c) are for parallel-eye viewing.

4. DISCUSSION

In this work, we have demonstrated real-time merging of multiple tracked endoscopic views with the corresponding 3D surface via ray-traced texture mapping. This approach allows panoramic and stereoscopic visualization of the "painted" surface from arbitrary perspectives, as well as navigation within the volume containing these surfaces, after the endoscope has been removed.

There are many improvements to be made to our methods for tip localization and optical calibration, and to our texture mapping algorithm. From our experience, before every endoscopic procedure the endoscope must be optically calibrated, a procedure that needs to be made faster and more reproducible, and that should be extended to accommodate angled endoscopes. Further visual refinements could be made to the rendered surface with the mapped endoscopic images. In this work, our algorithm was applied to individual digitized video frames acquired with a tracked endoscope, but we plan to adapt the algorithm to digitized streaming video. Our texture mapping error is an estimate of our overall error due to inaccuracies in optical tracking, registration of physical space to image space, and our optical calibration, but a more robust error assessment protocol must be developed. Our current method is subject to errors associated with manual identification of anatomical landmarks.

Currently the endoscopes used have been rigid, with their tip position extrapolated from the position of the tracking LEDs external to the body. However, technologies exist that employ optical fibre properties (ShapeTape™ from Measurand Inc., Fredericton, Canada) and electromagnetic localization principles (from Biosense Inc., Setauket, NY) that could detect the position and orientation of the tip itself. This could enable us to employ the techniques described in this paper with flexible endoscopes.


Figure 5. Multiple endoscopic video frames texture-mapped to the 3D surface.

5. CONCLUSIONS

In conclusion, we have demonstrated the first steps towards multimodal integration of multiple endoscopic images with 3D pre-operative images. Our method allows panoramic and stereoscopic visualization of the merged 3D dataset from arbitrary perspectives, and navigation of the painted dataset after the procedure. We consider these results to be the first step towards routine multi-modality integration of endoscopic images with corresponding pre-operative images in neuro-, orthopedic, abdominal and cardio-thoracic surgery.

6. ACKNOWLEDGEMENTS

We would like to thank our colleagues Dr. Yves Starreveld and Dr. Andrew Parrent for many useful discussions; Trudell Medical and Smith and Nephew for the loan of endoscopic equipment; and Nuclear Diagnostics for the use of their Multimodality software for segmentation. We acknowledge the financial support of the Medical Research Council of Canada and the Institute for Robotics and Intelligent Systems.


Figure 6. Multiple endoscopic views texture-mapped to the 3D surface.

REFERENCES

1. A. Perneczky, G. Fries, "Endoscope assisted brain surgery: part 1 — evolution, basic concept, and current technique", Neurosurgery 42, pp. 219-224, 1998.
2. G. Fries, A. Perneczky, "Endoscope assisted brain surgery: part 2 — analysis of 380 procedures", Neurosurgery 42, pp. 226-231, 1998.
3. J.D. Stefansic, A.J. Herline, W.C. Chapman, R.L. Galloway, "Endoscopic tracking for use in interactive, image-guided surgery", SPIE Conference on Image Display, Proceedings 3335, pp. 208-129, 1998.
4. J. Foley, A. van Dam, S. Feiner, J. Hughes, Computer Graphics, 2nd Edition, Addison Wesley, 1990.
5. D. Drascic, P. Milgram, "Positioning Accuracy of a Virtual Stereographic Pointer in a Real Stereoscopic Video World", SPIE Conference on Stereoscopic Displays and Applications II, Proceedings 1457, pp. 58-69, 1991.
6. D.L.G. Hill, L.A. Langasaeter, P.N. Poynter-Smith, C.L. Emery, P.E. Summers, S.F. Keevil, J.P.M. Pracy, R. Walsh, D.J. Hawkes, M.J. Gleeson, "Feasibility study of Magnetic Resonance Imaging-guided intranasal flexible microendoscopy", Computer Aided Surgery 2, pp. 264-275, 1997.
7. L.M. Auer, D.P. Auer, "Virtual endoscopy for planning and simulation of minimally invasive neurosurgery", Neurosurgery 43, pp. 529-548, 1998.
8. W. Konen, M. Scholz, S. Tombrock, "The VN Project: Endoscopic image processing for neurosurgery", Computer Aided Surgery 3, pp. 144-148, 1998.
9. P. Jannin, A. Bouliou, J.M. Scarabin, C. Barillot, J. Luber, "Visual matching between real and virtual images in image guided neurosurgery", SPIE Proceedings 3031, pp. 518-526, 1998.
10. M.J. Clarkson, D. Rueckert, A.P. King, P.J. Edwards, D.L.G. Hill, D.J. Hawkes, "Registration of video images to tomographic images by optimising mutual information using texture mapping", Proceedings MICCAI, pp. 579-588, 1999.
11. R. Szeliski, H.Y. Shum, "Creating full view panoramic image mosaics and environment maps", Proceedings SIGGRAPH, pp. 251-258, 1997.
12. QuickTime VR, http://www.apple.com/quicktime/
13. Surround Video, http://www.bdiamond.com
14. W. Schroeder, K. Martin, B. Lorensen, The Visualization Toolkit: An Object-Oriented Approach To 3D Graphics, 2nd Edition, Prentice Hall, 1997.
15. R.Y. Tsai, "A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses", IEEE Journal of Robotics and Automation, RA-3 (4), pp. 323-345, 1987.
16. H. Haneishi, Y. Yagihashi, Y. Miyake, "A new method for distortion correction of electronic endoscope images", IEEE Transactions on Medical Imaging 14, pp. 548-555, 1995.
17. P. Haeberli, M. Segal, "Texture mapping as a fundamental drawing primitive", http://www.sgi.com/grafica/texmap/index.html, 1993.
18. M. Lutz, Programming Python, O'Reilly and Associates, 1996.
